Closed Bug 1008978 Opened 10 years ago Closed 10 years ago

5% TART regression for OSX 10.8 on May 10th (bug 996848) on inbound (fx32)

Categories

(Core :: Widget: Cocoa, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

The patch for bug 996848 fixed some cases (edge cases, admittedly) of nsIRunnable event starvation -- bug 959281 and the testcase for bug 606937.  So this regression *might* actually be the opposite -- a restoration of things to the way they should be.  If (before bug 996848 landed), nsIRunnable events were being starved in the context of the TART test, fixing the starvation would very likely cause resource consumption (e.g. cpu usage) to rise.

I know very little about the TART/CART tests, though.
Avi, can you comment on what TART is?
Flags: needinfo?(avihpit)
TART animates tabs (like when opening or closing a tab) on few cases, and measures several aspects of frame-intervals throughput.

Less events starvation is so important and meaningful that almost any numeric "regression" probably still translates to better behavior in practice for users.

Also, it should be noted that TART (and CART and tscrollx and tsvgx) deploy an "anti events starvation" system which is not present on users systems by default (but it makes the measurements more sensitive to real changes, so it improves the test).

This "system" is setting the pref docshell.event_starvation_delay_hint=1 (defaults at 2000 for users, the pref was added on bug 884955, following bug 822096 and bug 880036, with a proper attempt to fix it on bug 930793 which landed and backed out, and now your bug[s] finally seems to make real progress).

So even if for some reason it's a real (small) numeric regression compared to the "anti starvation" system which TART deploys, the new code which Steven works on is less hacky, and more importantly enabled for users, so users will almost certainly see (much) better behavior.

Considering that this is a relatively small numeric regression, on osx 10.8 only, compared to an artificial mode which isn't deployed for users ("anti starvation") and that the wins all over from this should be huge, IMO wontfix.

Unless Steven feels he wants to investigate this further. Steven?
Flags: needinfo?(avihpit) → needinfo?(smichaud)
> Unless Steven feels he wants to investigate this further. Steven?

No, I don't :-)

But there's one thing I'd like to try unless it isn't too difficult:  I'd like to run the regressing test locally (with and without my patch) with the Gecko Profiler loaded and running, and then immediately afterwards "analyze" the results, focusing on calls to nsThread::ProcessNextEvent (as I did at bug 996848 comment #22).

If this is easy to do, please let me know how.  If not I think we can just drop this bug, at least for the time being.
Flags: needinfo?(smichaud)
> unless it isn't too difficult

if it isn't too difficult
(In reply to Steven Michaud from comment #5)
> > unless it isn't too difficult
> 
> if it isn't too difficult

It shouldn't be:

1. Clone/update the talos repo: http://hg.mozilla.org/build/talos/
2. Update to revision 0ce52a5fcce0 (currently that's 1 before HEAD - TART is broken on head).
3. Zip the content of talos/page_load_test/tart/addon/ and rename to xpi.
4. Install the xpi and visit chrome://tart/content/tart.html

5. Follow the instructions on screen (set layout.frame_rate=0 and the "starvation" pref which TART sets when running on talos). The default set of tests is exactly the set which talos runs for TART.

Use the profiler like you normally do.

Ping me on irc if you need help.

(FWIW, steps 1-4 will soon be "go to that page, click install").
Thanks!

The results of my tests were as I expected.  Without my patch, calls to nsAppShell::OnProcessNextEvent() consist of about 20% of everything called from nsThread::ProcessNextEvent(), but with my patch they amount to less than 1%.

This shows that the TART tests trigger nsIRunnable event starvation without my patch, which my patch fixes.

So if (as I understand it) the TART tests take longer with my patch, that's presumably because (without the event starvation) the tests are actually doing more with my patch.
(In reply to Steven Michaud from comment #7)
> So if (as I understand it) the TART tests take longer with my patch, that's
> presumably because (without the event starvation) the tests are actually
> doing more with my patch.

TART tests 10 different tab animation cases, and for each one it produces 3 values in ms units: 2 of average intervals (.half/.all) and one proportional to the overall duration of the animation (.error). Overall - 30 sub results, where the "final" result on graph server is the average of those.

Theoretically, even if the frame rate changes from a patch, it should still take the same amount of time to complete (the .error value). The .error value usually increases when the code which launches the animation does more "preparations" or "finalization", such that it sets the animation to NN ms, but in practice takes more time from the trigger until everything is done (and .error is the absolute difference between this NN and the actual overall time).

I'm looking at datazilla (it's slowwww to load): https://datazilla.mozilla.org/?start=1399343801&stop=1399948601&product=Firefox&repository=Mozilla-Inbound&os=mac&os_version=OS%20X%2010.8&test=tart&project=talos

Looks like most of the .error values increased slightly (one decreased) and became less noisy. The .all/.half values (1/FPS) are completely unaffected.

I don't think I can explain this behavior, as even "more work" shouldn't result in longer duration, just in lower fps at the worst case, but the fps wasn't affected by this patch.

I am intrigued by it and would have loved to know why the .error values changed, but since it's only on one platform, the magnitude is small and the overall wins weight much more, I wouldn't like to spend/waste time on this, again, unless Steven thinks it's important enough to understand. Personally, even if I want to, I can live fine without understanding.

Your call when to stop.
Thanks, Avi, for your detailed analysis.

I remain convinced that the TART tests triggered nsIRunnable event starvation (without my patch), and that my patch fixed it.  But you've now convinced me that there's no direct connection between this and the changed TART test results.

> The .all/.half values (1/FPS) are completely unaffected.

And if this is true, it doesn't seem like we have any reason to worry about these test result changes (the so-called regression).  For what really matters, the test results haven't changed.

So I'm happy to drop this now, and will do so :-)
Thanks for the investigation, Steven :)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.