TL;DR
- Monitor memory inside an E2E test
- Try a fix
- Run the test again
- Compare memory usage between runs
- GOTO step 1.
Last week, we were notified by our colleague Cristiano that the Gravity application was facing a serious memory-leak problem:
Although we were not able to reproduce peaks as high as the ones Cristiano was seeing, a quick look at the “Memory” tab of the Chrome inspector was enough to confirm the issue was present.
One of the biggest changes we had released in the previous days was the use of a beta version of gravity-data-collector. This new version collected screenshots of the visited pages alongside the video recording. We swiftly deactivated the collection, checked again for memory leaks, and saw that the issue disappeared.
Now was the time for the real investigation.
Tracking down the issue
Limitations of the naive approach
The original way of handling the problem (navigating through the application while keeping an eye on the Memory tab of the Chrome inspector) was good enough to check that the quick fix made the problem go away, but not precise enough to pinpoint the issue in the code.
Chrome inspector is a nice tool to get an overall idea of the memory consumption of the web page, but it faces two limitations:
- the Memory tab gives the memory consumption in real time, but keeps no history
- when recording the memory consumption of an app that has a memory leak, the tab tends to crash, which is not really helpful.
Another issue we were facing was that the memory usage of the app tended to vary quite a lot (between 500 MB and 2 GB) between two exploratory tests.
To pinpoint the problem, we needed a solution that was:
- repeatable
- limiting discrepancies
- providing a history of the memory consumption during the navigation
Fortunately, we already had an automated E2E test runner set up on Gravity, so all we had to do was add some capabilities for monitoring memory consumption.
Reading memory usage inside Cypress
This article by Donald Le provided us with the basic code to read memory usage.
As with the memory tool of the Chrome inspector, this only provides real-time memory usage, whereas we need trends to check for memory leaks. To get those trends, three tasks were added to Cypress:
import fs from 'fs'
import path from 'path'

// Storage lives in the Node process that runs the Cypress tasks
let memoryUsageStorage: number[] = []

setupNodeEvents(on: Cypress.PluginEvents): void {
  on('task', {
    'memoryUsage:reset': () => {
      memoryUsageStorage = []
      return null
    },
    'memoryUsage:add': (value: number) => {
      memoryUsageStorage.push(value)
      return null
    },
    'memoryUsage:write': (args?: Partial<{ testName: string }>) => {
      // Append one line per run: all readings, separated by semicolons
      fs.appendFileSync(
        path.join(
          __dirname,
          'performanceMonitoring',
          `${args?.testName || 'memoryUsage'}.csv`
        ),
        `${memoryUsageStorage.join(';')}\n`
      )
      return null
    },
  })
}
The three tasks are pretty simple:
1. memoryUsage:reset resets the memory storage
2. memoryUsage:add(value) adds a value to the memory storage
3. memoryUsage:write appends a new line to a CSV file with all the memory usage recorded during the test
We also needed a simple Cypress command to add the memory reading:
Cypress.Commands.add('getMemoryUsage', () => {
  cy.window().then((window) => {
    // performance.memory is a non-standard, Chrome-only API, hence the ts-ignore
    // @ts-ignore
    cy.task('memoryUsage:add', window.performance.memory.usedJSHeapSize)
  })
})
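As a side note, when the Cypress support file is written in TypeScript, the custom command also needs a type declaration so that cy.getMemoryUsage() type-checks. A minimal sketch, following the usual Cypress declaration-merging pattern (the file location is an assumption, not part of our setup):

// e.g. in cypress/support/e2e.ts (hypothetical location)
declare global {
  namespace Cypress {
    interface Chainable {
      // Records the current JS heap size via the 'memoryUsage:add' task
      getMemoryUsage(): Chainable<void>
    }
  }
}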
Now, all we have to do is write (or reuse) an E2E test that navigates through the application and monitors memory usage. Our test looks something like this:
it('navigates in the application', () => {
  cy.visit('http://localhost:3002')
  cy.login()
  cy.task('memoryUsage:reset')
  navigateInUserJourneys()
  navigateInPages()
  navigateInSessions()
  navigateInUserJourneys()
  cy.task('memoryUsage:write', { testName: Cypress.spec.name }).then(() => {
    // Some assertions to ensure the test correctly finished
  })
})
function navigateInPages() {
  cy.getMemoryUsage()
  // Open the Pages exploration
  cy.get('[data-testid="Exploration.Pages"]').click()
  cy.getMemoryUsage()
  cy.get('.pages-page-list')
  cy.getMemoryUsage()
  // Switch to the tree map view
  cy.get('.page-trends-page__view-selector').click()
  cy.get('[data-value="treeMap"]').click()
  cy.get('.sitemap')
  cy.getMemoryUsage()
  // Switch to the quadrant view
  cy.get('.page-trends-page__view-selector').click()
  cy.get('[data-value="quadrant"]').click()
  cy.get('.gravity-quadrant')
  cy.getMemoryUsage()
  // Back to the list view, then open the first page
  cy.get('.page-trends-page__view-selector').click()
  cy.get('[data-value="list"]').click()
  cy.getMemoryUsage()
  cy.get('.pages-page-list tbody tr').first().click()
  cy.getMemoryUsage()
  cy.get('.page-heatmap')
  cy.getMemoryUsage()
  // Cycle through the page views: interactions, then elements
  cy.get('.page-page__view-selector').click()
  cy.get('[data-value="interactions"]').click()
  cy.get('.page-page__view-selector')
  cy.getMemoryUsage()
  cy.get('.page-page__view-selector').click()
  cy.get('[data-value="elements"]').click()
  cy.get('.action-summary__wrapper')
  cy.getMemoryUsage()
}
As you can see, the test will:
- reset the memory storage at the beginning
- save a memory reading after each action
- write the recordings to a CSV file at the end of the test
This way, we were able to launch Cypress, click a dozen times on the replay button, and get a CSV file that allowed us to gather some insights about the memory leaks:
In this graph, the horizontal axis represents the time and the vertical one represents the memory consumption. Each line is a test execution.
One problem we found with this approach was that the more we ran the tests, the more memory was consumed. To fix this, we wrote a simple bash script that uses a fresh Cypress instance for each run:
#!/bin/bash
filename=$1
echo "-- Running $filename"
exportName="performanceMonitoring/$(basename "$filename").csv"
touch "$exportName"
while [ "$(wc -l < "$exportName")" -lt 15 ]; do
  npm run cypress:run:perf -- --spec "$filename"
done
This script runs the same Cypress test until the CSV file reaches 15 lines (each run appends one line). The choice of 15 executions was a compromise between gathering enough data and not spending 3 hours waiting for it.
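The spec file is passed as the script's only argument, e.g. ./runPerfTest.sh cypress/perf/fullCollect.perf.cy.ts (runPerfTest.sh being a hypothetical name for the script).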
With this approach, we were finally able to get some more consistent readings:
Generalizing the performance test to pinpoint the issue
We already knew that our data collector was causing the issue, probably because of the new screenshot feature, but we wanted to make sure our assumption was true.
In order to do so, we wrote four versions of the same test but with different options for setting up the data collector:
// noDataCollection.perf.cy.ts
describe('Performance tests', () => {
  it('Navigate in Gravity - no collect', () => {
    perfE2E({
      sessionRecording: false,
      videoRecording: false,
    })
  })
})

// noRecording.perf.cy.ts
describe('Performance tests', () => {
  it('Navigate in Gravity - no recording', () => {
    perfE2E({
      sessionRecording: true,
      videoRecording: false,
    })
  })
})

// recordVideoOnly.perf.cy.ts
describe('Performance tests', () => {
  it('Navigate in Gravity - video only', () => {
    perfE2E({
      sessionRecording: true,
      videoRecording: true,
      snapshotRecording: false,
    })
  })
})

// fullCollect.perf.cy.ts
describe('Performance tests', () => {
  it('Navigate in Gravity - full collect', () => {
    perfE2E({
      sessionRecording: true,
      videoRecording: true,
    })
  })
})
We also updated our bash script so it would run each test 15 times and generate 4 different CSV files:
#!/bin/bash
echo "Running performance tests"
echo "-- Cleaning up existing monitoring"
rm -f performanceMonitoring/*.csv
for filename in cypress/perf/*.perf.cy.ts; do
  echo "-- Running $filename"
  exportName="performanceMonitoring/$(basename "$filename").csv"
  touch "$exportName"
  while [ "$(wc -l < "$exportName")" -lt 15 ]; do
    npm run cypress:run:perf -- --spec "$filename"
  done
done
Now, all we had to do was run this script, grab a coffee or two, throw our nice CSV files into our favorite spreadsheet editor (or, for lack of a favorite, one that was available), and validate our hypothesis:
Since we have a test that does not use the data collector at all, we could also use it as a reference to measure the extra memory used when the data collector is activated:
For each measure, we divide the value by the corresponding one in the reference test. This means that the closer to 1 we are, the less extra memory is consumed.
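As an illustration, here is a minimal sketch of this normalization as a small Node script. It assumes the CSV layout produced by the tasks above (one run per line, readings separated by semicolons) and the file names generated by the runner script:

import fs from 'fs'

// One run per line, readings separated by semicolons
function readRuns(file: string): number[][] {
  return fs
    .readFileSync(file, 'utf-8')
    .trim()
    .split('\n')
    .map((line) => line.split(';').map(Number))
}

// Average the readings step by step across all runs
function averagePerStep(runs: number[][]): number[] {
  const steps = Math.min(...runs.map((run) => run.length))
  return Array.from(
    { length: steps },
    (_, i) => runs.reduce((sum, run) => sum + run[i], 0) / runs.length
  )
}

const reference = averagePerStep(
  readRuns('performanceMonitoring/noDataCollection.perf.cy.ts.csv')
)
const fullCollect = averagePerStep(
  readRuns('performanceMonitoring/fullCollect.perf.cy.ts.csv')
)

// The closer each ratio is to 1, the less extra memory is consumed
const ratios = fullCollect.map((value, i) => value / reference[i])
console.log(ratios.map((ratio) => ratio.toFixed(2)).join(' '))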
Rinse & repeat – finding the faulty code
Now that we know the screenshot handling is responsible for our memory leak, it is time to dig deeper into the code and find the culprit. We will not detail that part much; what we basically did was disconnect parts of the feature (starting from the end, in our case sending the screenshot to Gravity), run the test, copy the data into the spreadsheet, and see if the memory usage got closer to normal.
This is quite a tedious task, and numerous coffees were drunk while waiting for the tests to execute, but we were finally able to get something closer to the memory usage measured without any collection:
We were able to find that our memory leak was caused by this quite innocent-looking piece of code:
// Rebuild a DOM tree from the snapshot, then serialize it to a string
const node = rebuild(snapshot, {
  doc: inDocument,
  cache: createCache(),
  mirror: createMirror(),
})
if (node) return new XMLSerializer().serializeToString(node)
The problem came neither from the XMLSerializer nor from the rebuild function. It was the assignment to the node variable that caused the issue: it was not picked up by the garbage collector after exiting the function, so after each screenshot a copy of the DOM stayed in memory. Once this was found, it was quite easy to fix: we destroyed all the children of the node after transforming the screenshot to a string, and then we “forced” its deletion (with quotes around the forcing, as there is no real way to force the garbage collector to do its job).
const data: { node?: Node | null } = {}
data.node = rebuild(snapshot, {
  doc: inDocument,
  cache: createCache(),
  mirror: createMirror(),
})
if (data.node) {
  const serialized = new XMLSerializer().serializeToString(data.node)
  // Detach the whole rebuilt subtree so it can be garbage collected
  removeAllChildren(data.node)
  // Drop the last reference to the node ("forcing" its deletion)
  delete data.node
  return serialized
}
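The removeAllChildren helper is not shown in the snippet above; a minimal sketch of what it might look like:

// Detach every child so that nothing keeps the subtree alive
// once the parent reference is dropped
function removeAllChildren(node: Node): void {
  while (node.firstChild) {
    node.removeChild(node.firstChild)
  }
}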
Enhancements and other usages
Thanks to this solution, we were able to quickly find and fix our memory leak. More importantly, we now have a code base that we will be able to reuse if we ever face this kind of issue again.
There are obviously a number of things that could be improved:
- use performance.measureUserAgentSpecificMemory instead of usedJSHeapSize: this should enable a finer read of the memory distribution. We did not use it because Gravity was missing some of the headers required to activate it (see the sketch after this list). That being said, although usedJSHeapSize might not be precise, it has the advantage of being consistent, which is what we are looking for when improving memory usage.
- automate the graph generation: importing the CSV file into the spreadsheet and updating the graph references does not take that long, but repetition is bad. It would be nice to have a simpler tool that just reads the CSV files and shows the graphs.
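For reference, a minimal sketch of what reading memory through this API could look like. It only works in cross-origin isolated contexts (enabled through COOP/COEP response headers, the ones we were missing), and the fallback mirrors the approach used throughout this article:

async function readMemoryUsage(): Promise<number> {
  if (window.crossOriginIsolated && 'measureUserAgentSpecificMemory' in performance) {
    // Resolves with a total in bytes plus a per-origin breakdown
    // @ts-ignore - not yet part of the standard TypeScript DOM lib
    const result = await performance.measureUserAgentSpecificMemory()
    return result.bytes
  }
  // Fallback to the Chrome-only value used in this article
  // @ts-ignore
  return performance.memory.usedJSHeapSize
}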
This solution could also be used in other ways to improve code quality. We could imagine a CI task that runs the tests against both the latest deployed version and the one about to be deployed, and stops the CD pipeline if the memory usage grows by more than a given threshold.
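A minimal sketch of such a gate, assuming two CSV files produced by the runner for the two versions (the file names and the 10% threshold are hypothetical):

import fs from 'fs'

// Average of all the readings in a CSV produced by 'memoryUsage:write'
function averageUsage(file: string): number {
  const values = fs.readFileSync(file, 'utf-8').trim().split(/[\n;]/).map(Number)
  return values.reduce((sum, value) => sum + value, 0) / values.length
}

const deployed = averageUsage('performanceMonitoring/deployed.csv') // hypothetical
const candidate = averageUsage('performanceMonitoring/candidate.csv') // hypothetical

// Fail the pipeline if memory usage grows by more than 10%
if (candidate > deployed * 1.1) {
  console.error(`Memory usage grew by ${((candidate / deployed - 1) * 100).toFixed(1)}%`)
  process.exit(1)
}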
We talked a lot about Gravity in this article; if you are interested in it and its features, you can learn more here: https://www.smartesting.com/en/gravity/ and you can also book a demo with Cristiano: