Closed Bug 1689974 Opened 4 years ago Closed 3 years ago

DAMP Perf regression in custom.debugger tests (20-80% on stepOver, stepOut, stepIn, pause, preview)

Categories

(DevTools :: Debugger, defect, P3)

defect

Tracking

(firefox-esr78 unaffected, firefox85 unaffected, firefox86 unaffected, firefox87 wontfix)

RESOLVED WONTFIX
Tracking Status
firefox-esr78 --- unaffected
firefox85 --- unaffected
firefox86 --- unaffected
firefox87 --- wontfix

People

(Reporter: jdescottes, Unassigned)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Set release status flags based on info from the regressing bug 1662129

Alexandre can you take a look?

Flags: needinfo?(poirot.alex)
Flags: needinfo?(odvarko)

Considering that this is linked to a huge refactor around devtools breakpoint handling I assume we are going to take the regression for now, but let's move this bug to our next fission bugs triage session.

Julien, let me know if I can provide more specific information about this bug.

Flags: needinfo?(poirot.alex)
Flags: needinfo?(odvarko)
Whiteboard: dt-fission-m3-triage

20-80% sounds pretty bad, and this bug could do with getting a severity assigned, but I guess we'll take the perf hit if we have to...

20-80% is bad but devtools performance tests are broken down into specific subtests, and we get alerts for subtest regressions, which is not the case for most (all?) the other performance tests.
So even though 20-80% looks bad, it's not an overall slowdown of the debugger, it only impacts some actions. Most importantly it doesn't impact the opening of the panel which is something we'd try to fix more urgently. I added the name of the subtests which regressed to the title.

I'll keep the severity field unset for now, because I prefer to discuss it during our weekly triage meeting (there was no triage last week because we had a reduced meetings week).

Severity: -- → S3
Summary: DAMP Perf regression in custom.debugger tests (20-80%) → DAMP Perf regression in custom.debugger tests (20-80% on stepOver, stepOut, stepIn, pause, preview)
Severity: S3 → --

No real offender when looking at a diff profile on stepOver, which is the one that regressed by 80%:
https://share.firefox.dev/3jTfCE9
When recording the profile locally, the regression wasn't obvious. 4,5s with the regressing patch, 4,6s without it...

Profile with the patch:
https://share.firefox.dev/3bbarLV
Without:
https://share.firefox.dev/2Ng9nhD

Interesting. Although stepOver takes 200/400ms on the DAMP test machines (a bit more on OSX) and your results are 10 to 20 times slower (4-5 seconds.
Any chance you had a flag/configuration enabled (debug mode?) that would explain the difference? A 200ms regression would be a lot less noticeable on a 5000ms test than on a 200ms test.

I imagine that a combination of profiler overhead + running from within a VM.
Aren't you getting similar overhead just because of the profiler?

It seems the markers are not matching the actual subtest duration, I'll check if something is wrong again around the addMarker usage (similar to Bug 1688169)

Oh the marker fix landed a little bit after your patch, that's why. We'll need to apply it locally to record the profiles

The regression is visible locally after fixing the markers issue:

Thanks for the fix and profiles!
I'm seeing fetchFrames/< only with the regressing patch, with a cost of 10samples.
If I'm correct, the profile interval is 10ms, so it would be an additional cost of 100ms, so half of the regression.
But no idea why I could have any impact on this:
https://searchfox.org/mozilla-central/source/devtools/client/debugger/src/actions/pause/paused.js#41
fetchFrames ultimately involves ensureSourceActor/waitForSourceActorToBeRegisteredInStore, but I only renamed this method.

Similarly, fetchScopes, also 10samples appear only with the regressing patch:
https://searchfox.org/mozilla-central/source/devtools/client/debugger/src/actions/pause/paused.js#55
And similarly, I don't see why these patches would have an impact on this.

Otherwise, I'm seeing the same number of paused and resumed actions.

Whiteboard: dt-fission-m3-triage
Severity: -- → S3

The regression on stepin was compensated by various fixes (mostly removing lodash).

Closing this as wontfix, we will not attempt to revert the performance issue as a whole.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.