By Alexandre Moureaux, Mobile Performance Expert @ BAM

Why would anyone need to build their own performance testing tool? Some organizations are mature enough that their performance needs become very specific. On the web, Google's Lighthouse analyzes many aspects of a page's performance and computes a performance score for desktop and mobile web. It is integrated into the Google Chrome DevTools and is a capable, versatile testing utility.

This approach works well for the web. Could anything similar exist for mobile apps?

Measuring performance

We define a “performant app” as a piece of mobile software that runs smoothly. Smoothness is measured in images displayed per second, better known as “frames per second” or fps. 60 fps is the usual target, even though many devices exceed this mark today.

Frames per second alone are not enough to measure performance. React Native runs its JavaScript on an engine bound to a single thread, which orchestrates native components. Whenever that thread hangs, the whole app hangs and consumes more battery. Latency and power consumption are therefore two more metrics to watch.

The operating system and the device matter as well: iPhones are generally more performant than most Android phones, and developers usually own high-end devices. What about the rest of the user base? Many cannot afford such devices and opt for low-end phones that they replace once they slow to a crawl. 80% of a device's CO2-eq emissions come from its manufacturing, not its usage. Performant apps that keep older devices usable therefore help slow the device renewal cycle.

Performance measurements are never fully deterministic. Worse, the measurement tooling itself can have a sizable impact on the performance it measures. Also mind how the application is run: debug mode is known to slow it down.

Flashlight

What do you call a lighthouse that fits in a pocket? A flashlight!

Lighthouse can be used on any website. Can we achieve the same on any Android app with an equivalent solution? First, you will need the Android Debug Bridge, or adb, to communicate with an Android phone over its debug interface. Android is based on Linux and has a shell in which we can execute commands such as top.

How can one estimate the CPU usage, and through it the power consumption, of the JavaScript thread of a React Native app? By using the -H option of top, which lists individual threads instead of whole processes. While this approach works, top is resource-heavy and even shows up in its own list when you increase the sampling rate. Its output is difficult to parse, and there is a limit to the sampling rate.

To get better results, we need to read the source code of top in the Android Open Source Project to understand where it gets its data. The code can be searched at https://cs.android.com/. It turns out top reads its data from /proc/${process_id}/task. Every app has one or more processes, each with its own process_id. To retrieve it, you need the app identifier, which can be called the package name, the bundle name, and so on.

There are commands to retrieve the identifier of the application currently running on the phone, but which command works depends on the Android version of the phone. With this identifier, run adb shell pidof ${app_id} to retrieve the process identifier of the application, which is also the identifier of its main thread. adb shell ls /proc/${process_id}/task returns the list of task (thread) identifiers. adb shell cat /proc/${process_id}/task/${task_id}/stat returns raw information on a thread, including its name.
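The three adb commands above can be chained from a script. The following is a minimal Python sketch, not Flashlight's actual implementation: the function names (`adb_shell`, `get_pid`, `list_thread_ids`, `read_thread_stat`) and the injectable `run` parameter are hypothetical, and it assumes `adb` is on the PATH with a single device connected.

```python
import subprocess
from typing import Callable, List

# A "runner" takes a device shell command and returns its standard output.
Runner = Callable[[str], str]


def adb_shell(command: str) -> str:
    """Run a command on the connected device through `adb shell`."""
    result = subprocess.run(
        ["adb", "shell", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def get_pid(app_id: str, run: Runner = adb_shell) -> int:
    """Return the process id of the app (the first one if several are listed)."""
    pids = run(f"pidof {app_id}").split()
    if not pids:
        raise RuntimeError(f"{app_id} does not appear to be running")
    return int(pids[0])


def list_thread_ids(pid: int, run: Runner = adb_shell) -> List[int]:
    """List the task (thread) ids found under /proc/<pid>/task."""
    return [int(entry) for entry in run(f"ls /proc/{pid}/task").split()]


def read_thread_stat(pid: int, tid: int, run: Runner = adb_shell) -> str:
    """Return the raw stat line of one thread of the process."""
    return run(f"cat /proc/{pid}/task/{tid}/stat").strip()
```

With a device attached, calling `get_pid("com.example.app")` (a placeholder package name), then `list_thread_ids` and `read_thread_stat` for each returned id, yields one raw stat line per thread. The `run` parameter only exists so that the functions can be exercised without a phone.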

By retrieving the information of all the threads in the app, we can find the one named mqt_v_js, which is the JavaScript thread of the React Native app, and focus on its statistics. The meaning of each column in the stat file is explained in the proc(5) manual page. The 14th and 15th columns are utime and stime: the time spent in user mode and in kernel mode, measured in clock ticks. Added together, they represent the CPU time spent on that thread.
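To make the column arithmetic concrete, here is a small Python sketch that extracts utime and stime from one stat line and turns two samples into a CPU usage percentage. The names `ThreadCpuTime`, `parse_stat` and `cpu_usage_percent` are hypothetical, and the default of 100 clock ticks per second is an assumption (a common Linux value; the real one should be read on the device, for example with `getconf CLK_TCK`).

```python
from typing import NamedTuple


class ThreadCpuTime(NamedTuple):
    name: str
    utime: int  # time spent in user mode, in clock ticks
    stime: int  # time spent in kernel mode, in clock ticks

    @property
    def total_ticks(self) -> int:
        return self.utime + self.stime


def parse_stat(line: str) -> ThreadCpuTime:
    """Extract the thread name, utime and stime from one stat line."""
    # Column 2 (the name) is wrapped in parentheses and may itself contain
    # spaces or parentheses, so cut around the first "(" and the last ")"
    # instead of splitting the whole line on whitespace.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    name = line[open_paren + 1 : close_paren]
    # What follows starts at column 3, so column 14 (utime) sits at
    # index 11 and column 15 (stime) at index 12.
    rest = line[close_paren + 1 :].split()
    return ThreadCpuTime(name=name, utime=int(rest[11]), stime=int(rest[12]))


def cpu_usage_percent(
    before: ThreadCpuTime,
    after: ThreadCpuTime,
    interval_seconds: float,
    ticks_per_second: int = 100,  # assumed value, check it on the device
) -> float:
    """CPU usage of a thread between two samples, as a percentage of one core."""
    delta_ticks = after.total_ticks - before.total_ticks
    return 100.0 * delta_ticks / (ticks_per_second * interval_seconds)
```

Sampling the same thread twice, a known interval apart, and passing both results to `cpu_usage_percent` gives the share of one CPU core the thread used during that interval.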

To measure fps, use atrace, which traces Android and app events. Its output can be used to build flame graphs, as it lists the functions that run while Android creates and draws views. One function stands out in particular: Choreographer#doFrame. It triggers regularly, and the interval between two invocations suggests that it drives the app's frame rate. As a reminder, to achieve a stable 60 fps, the app must produce 60 evenly spaced frames every second. The time allocated to one frame, also called the “frame budget”, is 1000 ms / 60 fps ≈ 16.67 ms per frame.
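The frame budget arithmetic can be sketched in Python as follows. This is an illustration only, not how Flashlight computes its figures: it assumes the start timestamps of successive Choreographer#doFrame calls have already been extracted from the atrace output as milliseconds (parsing the trace is left out), and the function names are hypothetical.

```python
from typing import List

TARGET_FPS = 60
FRAME_BUDGET_MS = 1000 / TARGET_FPS  # about 16.67 ms per frame


def frame_intervals_ms(timestamps_ms: List[float]) -> List[float]:
    """Time elapsed between each pair of consecutive frames."""
    return [
        later - earlier
        for earlier, later in zip(timestamps_ms, timestamps_ms[1:])
    ]


def average_fps(timestamps_ms: List[float]) -> float:
    """Average frames per second over the whole sample."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    elapsed_ms = timestamps_ms[-1] - timestamps_ms[0]
    return 1000 * (len(timestamps_ms) - 1) / elapsed_ms


def count_slow_frames(
    timestamps_ms: List[float], budget_ms: float = FRAME_BUDGET_MS
) -> int:
    """Number of intervals that exceeded the frame budget."""
    return sum(
        1 for interval in frame_intervals_ms(timestamps_ms) if interval > budget_ms
    )
```

An average close to 60 can still hide a few long frames, which is why the sketch also counts the intervals that blow the budget rather than reporting the average alone.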

Introducing Flashlight. Install the tool on your computer, plug in your phone, and enable USB debugging so that the phone can be reached through adb. Run flashlight measure to start the service and auto-detect the app, then play with the app and watch the stats. End-to-end tests can be run with flashlight test. Flashlight can run in a CI/CD pipeline on a device farm and has a dashboard. To script end-to-end tests so that measurements are as repeatable as possible, check out Maestro.

Soon, Flashlight will show more measurements, support iOS, and gain AI features.

Questions and Answers

Who is sponsoring Flashlight?

Alexandre Moureaux and BAM. The project was developed to objectively measure the performance of an app. BAM's CTO believed in the idea and invested resources in it.

Can Flashlight detect threads other than the JavaScript one?

Flashlight is technology-agnostic and lists the other threads as well.

How is the score computed?

The score is computed empirically from the fps count, CPU usage, and thread statistics, which depend on the technology. The formula may change over time.

Can’t the React Native profiler do that?

It cannot! That profiler only profiles React and native components, does not work in production builds, and only supports React Native apps.

Can profiles be extracted as proper flame graphs?

Yes: use Perfetto with an atrace export.