Closed
Bug 1458447
Opened 7 years ago
Closed 7 years ago
7.88% installer size (linux32) regression on push cdd70d25704ec2791fe02cd26104b8fd7b45df34 (Fri Apr 27 2018)
Categories
(Core :: Gecko Profiler, defect)
Tracking
RESOLVED
INVALID
 | Tracking | Status
---|---|---
firefox-esr52 | --- | unaffected
firefox-esr60 | --- | unaffected
firefox59 | --- | unaffected
firefox60 | --- | unaffected
firefox61 | - | wontfix
firefox62 | - | wontfix
People
(Reporter: igoldan, Unassigned)
References
Details
(Keywords: regression)
We have detected a build metrics regression from push:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=cdd70d25704ec2791fe02cd26104b8fd7b45df34
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
8% installer size linux32 pgo 58,437,611.17 -> 63,040,222.42
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=12961
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics
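For reference, the before/after sizes in the alert reproduce the 7.88% figure in the bug summary; a quick check:

# Quick check: the alert's before/after installer sizes (bytes) give ~7.88%.
before = 58_437_611.17
after = 63_040_222.42
print(f"{(after - before) / before * 100:.2f}%")  # prints 7.88%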
Reporter
Updated•7 years ago
Component: Untriaged → Gecko Profiler
Product: Firefox → Core
Reporter
Comment 1•7 years ago
:m_kato The installer size grew a lot, so can you look over this?
Flags: needinfo?(m_kato)
Comment 2•7 years ago
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #1)
> :m_kato The installer size grew a lot, so can you look over this?
Hmm, according to the performance graphs, my landing only brought the binary size back to its April 16 level.
Has anyone investigated the reason for [*1]? I suspect an infra issue...
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=00d978a04775bd28ddeedad7c33fd232868ae659&tochange=02df9dc8a300c705cdaabebf52d9e96a6fa906a3
down ~7%
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=02df9dc8a300c705cdaabebf52d9e96a6fa906a3&tochange=057c58bcb1b19b282e6a9afe20cd90cee87244a4
up ~7% [*1] (no C++ changes)
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ae7c124af9f03b88cf5efd53d4d2f9383fa959f2&tochange=a0a108e7e7b34e2f90a9298d225ae811024615dd
down ~7% (only JS changed!)
Flags: needinfo?(m_kato) → needinfo?(igoldan)
Reporter
Comment 3•7 years ago
(In reply to Makoto Kato [:m_kato] (offline or slow 5/3-5/6) from comment #2)
> Hmm, according to the performance graphs, my landing only brought the binary size back to its April 16 level.
>
> Has anyone investigated the reason for [*1]? I suspect an infra issue...
:bholley m_kato made a good point. There's something strange about this regression, because the improvements from April 16th don't seem to make sense. Can you validate this regression alert?
Flags: needinfo?(igoldan) → needinfo?(sledru)
Reporter
Comment 4•7 years ago
Sorry, instead of :bholley I should have written :sledru.
Comment 5•7 years ago
Interestingly, there's a bump and a drop in the number of compiler warnings at the same times, on April 17 and April 23, matching the size changes.
Comment 6•7 years ago
For some reason the code wasn't treated as hot, so the binary size was small as of April 16.
With my landing, SharedLibraryInfo::GetInfoForSelf might be treated as hot code, so the size is bigger. I am still looking at the generated code...
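A minimal sketch of one way to look at where the generated code grew, e.g. whether SharedLibraryInfo::GetInfoForSelf and nearby functions got bigger; the objdir paths are hypothetical and this is not part of the bug:

# Sketch: compare per-symbol code sizes between two libxul.so builds using
# GNU nm, to see which functions grew. The objdir paths are hypothetical.
import subprocess

def symbol_sizes(libxul_path):
    """Map demangled symbol name -> size in bytes."""
    out = subprocess.run(
        ["nm", "-C", "--print-size", "--size-sort", libxul_path],
        capture_output=True, text=True, check=True,
    ).stdout
    sizes = {}
    for line in out.splitlines():
        parts = line.split(maxsplit=3)  # address, size, type, demangled name
        if len(parts) != 4:
            continue
        try:
            sizes[parts[3]] = int(parts[1], 16)
        except ValueError:
            continue
    return sizes

before = symbol_sizes("objdir-before/dist/bin/libxul.so")
after = symbol_sizes("objdir-after/dist/bin/libxul.so")

# Symbols that grew (or newly appeared) the most after the change.
deltas = sorted(
    ((size - before.get(name, 0), name) for name, size in after.items()),
    reverse=True,
)
for delta, name in deltas[:20]:
    print(f"{delta:+10d}  {name}")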
Comment 7•7 years ago
Correction to comment 6: s/the code wasn't treated as hot/everything became cold code for PGO/
Comment 8•7 years ago
There's apparently a change in the number of -Wmaybe-uninitialized warnings.
Overall, this all smells like PGO not really working properly, e.g. Firefox crashing during profiling, which is actually a known issue: see https://bugzilla.mozilla.org/show_bug.cgi?id=1252556#c8
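As an aside, the warning-count signal mentioned above is easy to quantify from two build logs; a small sketch with hypothetical log file names:

# Count -Wmaybe-uninitialized warnings in two build logs (hypothetical names)
# to compare the warning bump mentioned in comment 5.
def count_warnings(log_path, flag="-Wmaybe-uninitialized"):
    with open(log_path, encoding="utf-8", errors="replace") as f:
        return sum(1 for line in f if flag in line)

print("before:", count_warnings("build-before.log"))
print("after:", count_warnings("build-after.log"))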
Comment 9•7 years ago
I see.
Before my landing:
https://taskcluster-artifacts.net/fx46e6hNRaOcVfRQckpalw/0/public/logs/live_backing.log
[task 2018-04-27T11:57:52.450Z] 11:57:52 INFO - MOZ_PGO_INSTRUMENTED=1 JARLOG_FILE=jarlog/en-US.log /builds/worker/workspace/build/src/obj-firefox/_virtualenv/bin/python /builds/worker/workspace/build/src/build/pgo/profileserver.py 10
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::GenerateDump cloned child 32581
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - jarlog: /builds/worker/workspace/build/src/obj-firefox/jarlog/en-US.log
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::GenerateDump cloned child 32614
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2018-04-27T11:57:52.451Z] 11:57:52 INFO - ExceptionHandler::WaitForContinueSignal waiting for continue signal...
The ExceptionHandler::GenerateDump lines above show the instrumented browser crashing during the profiling run. After my landing:
https://taskcluster-artifacts.net/cnX-KOQKSlKpLGml_IMIVg/0/public/logs/live_backing.log
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - MOZ_PGO_INSTRUMENTED=1 JARLOG_FILE=jarlog/en-US.log /builds/worker/workspace/build/src/obj-firefox/_virtualenv/bin/python /builds/worker/workspace/build/src/build/pgo/profileserver.py 10
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - ###!!! [Parent][DispatchAsyncMessage] Error: PClientManager::Msg_Teardown Route error: message sent to unknown actor ID
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - JavaScript error: resource://gre/modules/osfile/osfile_async_front.jsm, line 408: Error: OS.File has been shut down. Rejecting post to DirectoryIterator_prototype_close
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - jarlog: /builds/worker/workspace/build/src/obj-firefox/jarlog/en-US.log
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - *** WIFI GEO: shutdown called
[task 2018-04-27T12:37:45.709Z] 12:37:45 INFO - make[1]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox'
So PGO works correctly after my landing.
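A minimal sketch of the kind of automated check bug 1252556 asks for (hypothetical tooling, not something that exists in the tree): scan the profiling run's live_backing.log for crash-handler markers like the ExceptionHandler::GenerateDump lines quoted above.

# Sketch of a PGO profiling-run sanity check (hypothetical tooling): flag the
# run if the Breakpad crash handler fired while the instrumented browser ran.
import sys
import urllib.request

CRASH_MARKERS = (
    "ExceptionHandler::GenerateDump",
    "ExceptionHandler::WaitForContinueSignal",
)

def profiling_run_crashed(log_url):
    """Return True if the live_backing.log contains crash-handler output."""
    with urllib.request.urlopen(log_url) as resp:
        log_text = resp.read().decode("utf-8", errors="replace")
    return any(marker in log_text for marker in CRASH_MARKERS)

if __name__ == "__main__":
    url = sys.argv[1]  # e.g. a taskcluster-artifacts.net live_backing.log URL
    if profiling_run_crashed(url):
        print("PGO profiling run crashed; profile data is suspect")
        sys.exit(1)
    print("no crash markers found in the profiling log")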
Updated•7 years ago
status-firefox59: --- → unaffected
status-firefox60: --- → unaffected
status-firefox-esr52: --- → unaffected
status-firefox-esr60: --- → unaffected
tracking-firefox61: --- → +
Comment 10•7 years ago
Ionuț, sorry but I am not sure what you expect from me here?
Flags: needinfo?(sledru) → needinfo?(igoldan)
Comment 11•7 years ago
Before bug 1350500 landed, the PGO process didn't work correctly, and our infra cannot detect whether the profiling stage of PGO succeeds (bug 1252556). So until bug 1252556 is fixed, the binary size of the PGO package is unreliable.
Ionuț, should we keep this bug open?
Comment 12•7 years ago
Also, since we don't run Talos on Linux 32-bit builds, what we don't see is that performance was degraded while PGO was broken (i.e. before *this* regression). Meaning, if Talos were running on Linux 32-bit builds, we would have seen performance improvements at the same time as this installer size regression.
Reporter
Comment 13•7 years ago
(In reply to Sylvestre Ledru [:sylvestre] from comment #10)
> Ionuț, sorry but I am not sure what you expect from me here?
I wanted to know what happened around April 17: was there a real installer size reduction, or did the test break and report erroneous values?
Flags: needinfo?(igoldan)
Reporter
Comment 14•7 years ago
(In reply to Makoto Kato [:m_kato] from comment #11)
> So until bug 1252556 is fixed, the binary size of the PGO package is unreliable.
>
> Ionuț, should we keep this bug open?
My conclusion here is that PGO went wrong. This made the test report fake improvements. Then things returned to normal.
I'd say we resolve this as worksforme, as it was a temporary glitch with the infra.
Comment 15•7 years ago
Doesn't sound like there's a real issue here, other than the whole "we managed to break Linux32 PGO without knowing it" thing. We could definitely use a follow-up bug about being better able to detect when that happens, but this isn't the place for that discussion.
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox62: --- → wontfix
tracking-firefox62: --- → -
Resolution: --- → INVALID