Consider optimizing gecko android builds for speed (-O2) rather than size (-Oz)
Categories
(Firefox Build System :: Android Studio and Gradle Integration, enhancement)
Tracking
(firefox74 disabled, firefox75 disabled, firefox76 fixed)
People
(Reporter: acreskey, Assigned: acreskey)
Attachments
(3 files)
Currently the android builds are heavily optimized for size, "-Oz"
https://searchfox.org/mozilla-central/rev/7536d7f480a7f18c941a590a2d4c5119d9f52770/old-configure.in#602
I did a quick test where I changed the build flag from "-Oz" to "-O3" and fixed or hacked around the resulting link errors.
This looks to be very beneficial performance-wise:
12% improvement in raptor-speedometer-geckoview on all three pgo builds:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=6beb413001270be7f351d4cc0579cb21e882161f&newProject=try&newRevision=874e37247c7823ede2e693945d1492635493cd67&framework=10
Numerous double-digit improvements in raptor-tp6 cold and warm loads
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=86d75e90c1e046b69a800362c8444033904f38f0&newProject=try&newRevision=d457d2e7df09a05b7b7c42a8973877b509d61d51&framework=10
As expected, the resulting binary is now larger.
geckoview_example, aarch64, pgo goes from 50.7MB to 62.6MB
geckoview_example, aarch32, pgo goes from 44.0MB to 54.9MB
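To put the size increase in relative terms, here is a small sketch that computes the growth from the four APK sizes quoted above. The helper name `growth` is only for illustration.

```python
# APK sizes in MB, copied from the description above.
sizes_mb = {
    "aarch64": {"-Oz": 50.7, "-O3": 62.6},
    "aarch32": {"-Oz": 44.0, "-O3": 54.9},
}


def growth(before, after):
    """Return (absolute growth in MB, relative growth in percent)."""
    delta = after - before
    return round(delta, 1), round(100.0 * delta / before, 1)


for arch, sizes in sizes_mb.items():
    delta_mb, pct = growth(sizes["-Oz"], sizes["-O3"])
    print(f"{arch}: +{delta_mb} MB ({pct}% larger)")
```

Both architectures grow by a little under a quarter of their -Oz size.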
Assignee | ||
Comment 1•5 years ago
|
||
James, is this a tradeoff we've looked at before?
To me, this looks well worth the additional binary size.
Comment 2•5 years ago
|
||
What are the results with -O2? -O3 usually just brings bloat along for marginal performance benefit.
Comment 3•5 years ago
Yeah I guess I would like to see what -O2 does. That might be a good compromise between size and speed. At one point, though, I think -Os was faster than -O2 because it gave better cache performance. Maybe we could look at -Os again too?
Comment 4•5 years ago
Looks like Chrome may use -O3 unless you specifically request to optimize for size.
https://chromium.googlesource.com/chromium/src/build/config/+/master/compiler/BUILD.gn
Assignee | ||
Comment 5•5 years ago
|
||
Good points, thanks.
I've kicked off -O2 builds.
I did try -Os and it showed gains almost as big on speedometer for the opt builds. I haven't figured out why yet, but the PGO builds of -Os are not there.
Assignee | ||
Comment 6•5 years ago
|
||
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) (he/him) from comment #4)
Looks like Chrome may use -O3 unless you specifically request to optimize for size.
https://chromium.googlesource.com/chromium/src/build/config/+/master/compiler/BUILD.gn
That's a good data point.
Assignee | ||
Comment 7•5 years ago
|
||
The speedometer results for O2 look very similar to the O3 results and the binaries are indeed a bit smaller:
Baseline, left (-Oz) vs -O2 on speedometer
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=6beb413001270be7f351d4cc0579cb21e882161f&newProject=try&newRevision=fd97efb7e1590f1976f51a0edd057f72383d7170&framework=10
-O3 vs -O2 on speedometer
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=874e37247c7823ede2e693945d1492635493cd67&newProject=try&newRevision=fd97efb7e1590f1976f51a0edd057f72383d7170&framework=10
And the geckoview_example sizes:
Optimization  Arch (bits)  Size (MB)
-Oz           32           44.0
-O3           32           54.9
-O2           32           53.7
-Oz           64           50.7
-O3           64           62.6
-O2           64           60.9
Comment 8•5 years ago
|
||
(In reply to Andrew Creskey from comment #6)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) (he/him) from comment #4)
Looks like Chrome may use -O3 unless you specifically request to optimize for size.
https://chromium.googlesource.com/chromium/src/build/config/+/master/compiler/BUILD.gn
That's a good data point.
FYI it looks like on Android chromium does optimize for size. By default chromium uses default_optimization and optimize_max and I don't see anything that turns on optimize_speed on Android. Also optimize_for_size is true on Android (and weirdly on MacOS too): so the full flags should be -Oz -O2 for Android.
Assignee | ||
Comment 9•5 years ago
|
||
So they are using both flags on Android, -Oz and -O2? It seems to me that some options would get overwritten that way.
I did build it and the binaries are the same size as the O2 build. Performance looks good.
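The "-Oz -O2" build matching the plain -O2 build in size is what one would expect if the compiler driver lets the last -O flag on the command line win, which is how GCC documents repeated -O options. The sketch below models only that rule; it assumes the same behaviour for the toolchain used here, and `effective_opt_level` is a hypothetical helper, not part of any build system.

```python
import re

# Matches -O, -O0..-O3, -Os, -Oz, -Og and -Ofast.
_OPT_FLAG = re.compile(r"^-O(0|1|2|3|s|z|g|fast)?$")


def effective_opt_level(flags):
    """Return the last -O flag in `flags`, or None if there is none."""
    effective = None
    for flag in flags:
        if _OPT_FLAG.match(flag):
            effective = flag  # a later -O flag replaces an earlier one
    return effective


print(effective_opt_level(["-fPIC", "-Oz", "-O2"]))
```

Under that assumption the -Oz in "-Oz -O2" has no effect on the optimization level, so any remaining size-oriented behaviour would have to come from other flags.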
Assignee | ||
Comment 10•5 years ago
|
||
I made a 2nd attempt at building -Os, but while the OPT build succeeds the PGO profiling fails:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb24f139285bc6d9081a54eb033bd278d31feb22&selectedJob=273279539
Error:
INFO - Failed to install /builds/worker/fetches/geckoview-androidTest.apk on None: ADBError install failed for /builds/worker/fetches/geckoview-androidTest.apk. Got: Performing Push Install
The geckoview-androidTest.apk artifact is built and when I build this locally it installs correctly.
Michael - would you have any ideas on this?
Assignee | ||
Comment 11•5 years ago
|
||
On raptor page load tests, this is baseline, left, vs -O2
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=86d75e90c1e046b69a800362c8444033904f38f0&newProject=try&newRevision=5ee690875fd717c96239d301217804c223c489b0&framework=10
Looks great.
-O3 vs -O2 on pageload
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=d457d2e7df09a05b7b7c42a8973877b509d61d51&newProject=try&newRevision=5ee690875fd717c96239d301217804c223c489b0&framework=10
-O2 looks roughly as good -- maybe better in some cases, maybe worse in a couple.
I'll add more jobs in selected cases.
Assignee | ||
Comment 12•5 years ago
|
||
Sharing an idea of :Agi's from slack:
I think the size limit on android is 100mb so having GeckoView be 62mb would be a big ask for non-browser apps.
Maybe we can provide both? (I would expect e.g. Fenix to want more speed)
It does look like the APK limit is 100MB, increased from 50MB in 2015.
Not an area that I know a lot about, but it looks like if your APK is generated from an App Bundle then the limit is 150MB:
https://android-developers.googleblog.com/2019/03/google-mobile-developer-day-at-game.html
But either way, it's a big footprint increase for non-browser apps.
So while the additional build configuration adds a lot of overhead and maintenance, maybe it's the best choice.
Comment 13•5 years ago
|
||
There's also some discussion of this in bug 1507636.
Comment 14•5 years ago
|
||
dmajor pointed out this code in Chromium:
which might help them control code growth a little better. I can't recall offhand whether our automation builds use lld for Android (I don't think they do), but maybe we could translate those bits into something that would work better?
Comment 15•5 years ago
|
||
Thanks for the cc, I was unaware of this bug.
FWIW I've been investigating using the above flag for all platforms, not just Android. On Windows, we can remove over 9MB from xul.dll with no change in Speedometer. (More+broader testing still needed.)
Comment 16•5 years ago
|
||
I was unaware of this bug.
(Perhaps this bug should be under build system in case it might notify others who are interested in this kind of thing?)
Assignee | ||
Comment 17•5 years ago
|
||
(In reply to :dmajor from comment #16)
I was unaware of this bug.
(Perhaps this bug should be under build system in case it might notify others who are interested in this kind of thing?)
Makes sense - I moved the bug but feel free to adjust it.
(In reply to :dmajor from comment #15)
Thanks for the cc, I was unaware of this bug.
FWIW I've been investigating using the above flag for all platforms, not just Android. On Windows, we can remove over 9MB from xul.dll with no change in Speedometer. (More+broader testing still needed.)
That's quite interesting.
If you can help me with a patch for Android I would be very happy to see how it performs and the resulting binary size.
Comment 18•5 years ago
|
||
(In reply to Andrew Creskey from comment #10)
I made a 2nd attempt at building -Os, but while the OPT build succeeds the PGO profiling fails
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb24f139285bc6d9081a54eb033bd278d31feb22&selectedJob=273279539
Error:
INFO - Failed to install /builds/worker/fetches/geckoview-androidTest.apk on None: ADBError install failed for /builds/worker/fetches/geckoview-androidTest.apk. Got: Performing Push Install
The geckoview-androidTest.apk artifact is built and when I build this locally it installs correctly.
Michael - would you have any ideas on this?
I haven't seen an error like that before. Does it happen again if you do a fresh push (so a new instr build in addition to a new run task)?
I diffed the geckoview-androidTest.apk from that push with your -O2 push, and the only differences are in the compiled libraries and incidental files (sha manifests and files containing the hg revision). So it doesn't look like the package was built incorrectly.
The line "Failed to install /builds/worker/fetches/geckoview-androidTest.apk on None" looked suspicious at first, but on further investigation the "None" just comes from the fact that we don't set self.device_name in android_emulator_pgo.py. The device_name is only used in error messages, which explains why things work fine without it.
If it happens again on a re-push, maybe check with gbrown to see if he has any ideas? I'm not sure what else to check here.
Comment 19•5 years ago
|
||
(In reply to Andrew Creskey from comment #17)
If you can help me with a patch for Android I would be very happy to see how it performs and the resulting binary size.
I believe this patch ought to do it: https://hg.mozilla.org/try/rev/c37802d5c0ac94a41de9fb3116ce1aa403c27d5d
However, although that patch got impressive wins on Windows and Linux, it only saved a few hundred KB on Android, and only a few hundred KB more when I further lowered the limit to 5. I'm puzzled by why Android behaves so differently.
Assignee | ||
Comment 20•5 years ago
|
||
(In reply to Michael Shal [:mshal] from comment #18)
I haven't seen an error like that before. Does it happen again if you do a fresh push (so a new instr build in addition to a new run task)?
I diffed the geckoview-androidTest.apk from that push with your -O2 push, and the only differences are in the compiled libraries and incidental files (sha manifests and files containing the hg revision). So it doesn't look like the package was built incorrectly.
The line "Failed to install /builds/worker/fetches/geckoview-androidTest.apk on None" looked suspicious at first, but on further investigation the "None" just comes from the fact that we don't set self.device_name in android_emulator_pgo.py. The device_name is only used in error messages, which explains why things work fine without it.
If it happens again on a re-push, maybe check with gbrown to see if he has any ideas? I'm not sure what else to check here.
Thank you for looking into that Michael - I'm still seeing a mysterious failure on a fresh push so I'll follow up and see what I can find.
-Os is an interesting option.
Assignee | ||
Comment 21•5 years ago
|
||
(In reply to :dmajor from comment #19)
(In reply to Andrew Creskey from comment #17)
If you can help me with a patch for Android I would be very happy to see how it performs and the resulting binary size.
I believe this patch ought to do it: https://hg.mozilla.org/try/rev/c37802d5c0ac94a41de9fb3116ce1aa403c27d5d
However, although that patch got impressive wins on Windows and Linux, it only saved a few hundred KB on Android, and only a few hundred KB more when I further lowered the limit to 5. I'm puzzled by why Android behaves so differently.
dmajor, when I add your -import-instr-limit=10 option to the -O2 build I'm seeing very significant size savings.
Perhaps without the higher level optimizations there weren't that many long functions being imported?
libxul.so for arm32 goes from 84.9MB to 79.1MB
libxul.so for aarch64 goes from 123.1MB to 112.9MB
Your patch with -O2
-O2
So updated APK sizes are:
Optimization        Arch (bits)  Size (MB)
-Oz                 32           44.0
-O2,instr-limit=10  32           50.9
-O2                 32           53.7
-O3                 32           54.9
-Oz                 64           50.7
-O2,instr-limit=10  64           57.0
-O2                 64           60.9
-O3                 64           62.6
-O2,instr-limit=10 looks to have roughly half the size penalty of -O3, so I'll retitle this bug.
Performance of O2, instr-limit=10 against mozilla-central still looks great.
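As a quick check of how the size penalties compare, this sketch uses the APK sizes from the table in this comment and expresses the penalty of one configuration over -Oz as a fraction of the -O3 penalty. The function name `penalty_ratio` is made up for the example.

```python
# APK sizes in MB from the table in this comment, keyed by architecture bits.
sizes_mb = {
    32: {"-Oz": 44.0, "-O2,instr-limit=10": 50.9, "-O2": 53.7, "-O3": 54.9},
    64: {"-Oz": 50.7, "-O2,instr-limit=10": 57.0, "-O2": 60.9, "-O3": 62.6},
}


def penalty_ratio(arch_sizes, config, reference="-O3", baseline="-Oz"):
    """Size penalty of `config` over the baseline, relative to the reference's penalty."""
    config_penalty = arch_sizes[config] - arch_sizes[baseline]
    reference_penalty = arch_sizes[reference] - arch_sizes[baseline]
    return round(config_penalty / reference_penalty, 2)


for bits, arch_sizes in sizes_mb.items():
    print(bits, penalty_ratio(arch_sizes, "-O2,instr-limit=10"))
```

The ratio comes out between about one half and two thirds, depending on the architecture.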
Comment 22•5 years ago
|
||
(In reply to Andrew Creskey from comment #21)
Perhaps without the higher level optimizations there weren't that many long functions being imported?
Yes, I had just written up a comment speculating that, and we mid-aired. Glad to hear it helped!
Assignee | ||
Comment 23•5 years ago
|
||
I'm trying it with -import-instr-limit=5 now :)
Assignee | ||
Comment 24•5 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #13)
There's also some discussion of this in bug 1507636.
Not that this will be my decision, but the conclusions from bug 1507636 make sense to me -- Fennec is scoring ~10.5 on Speedometer while Chrome is at ~18.
So why increase the binary size just to score a bit higher, 11.2?
But now that we can measure Android page load performance, I think we are actually very close to Chrome.
From these results, Fenix with strict tracking protection is comparable to Chrome on load event timing, and ~10% slower on most visual metrics.
The raptor pageload tests show -O2 being a big win.
So increasing the binary size may make us faster than competing browsers.
I'm running visual metrics tests now, so we'll get a better idea of the impact on SpeedIndex, etc.
Assignee | ||
Comment 25•5 years ago
|
||
This is how a very tight -import-instr-limit impacts binary size (geckoview_example.apk):
Optimization        Arch (bits)  Size (MB)
-Oz                 32           44.0
-O2,instr-limit=1   32           49.2
-O2,instr-limit=3   32           49.5
-O2,instr-limit=5   32           50.2
-O2,instr-limit=10  32           50.9
-O2                 32           53.7
-O3                 32           54.9
-Oz                 64           50.7
-O2,instr-limit=1   64           55.5
-O2,instr-limit=3   64           55.7
-O2,instr-limit=5   64           56.2
-O2,instr-limit=10  64           57.0
-O2                 64           60.9
-O3                 64           62.6
So far the speedometer results for all of these combinations are within noise of the plain -O2 build.
I'll run more pageload tests on the weekend when the device farm is in less demand.
Comment 26•5 years ago
Pardon my ignorance, but wouldn't something like -import-instr-limit=1 effectively disable inlining? Is that what "import" means here? Surely that has to affect performance, right?
Comment 27•5 years ago
|
||
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) (he/him) from comment #26)
Pardon my ignorance, but wouldn't something like -import-instr-limit=1 effectively disable inlining? Is that what "import" means here? Surely that has to affect performance, right?
I believe this is cross-translation-unit inlining, so old-school inlining would still happen. Additionally, PGO puts a 10-100x multiplier on the limit for hot functions.
I agree that very small numbers somehow feel wrong though, I'm not sure we should try to go chasing every single possible byte. 5 seems like a pretty strict limit already.
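To make the multiplier point concrete, here is a deliberately simplified model of the limit that applies to a candidate function. It assumes flat 10x and 100x factors, taken from the "10-100x" range mentioned above; the tier names "hot" and "critical" are hypothetical labels, and the real importer is more involved than this.

```python
# Hypothetical hotness tiers and the multipliers assumed for each.
MULTIPLIERS = {"normal": 1, "hot": 10, "critical": 100}


def effective_import_limit(base_limit, hotness="normal"):
    """Instruction-count limit after applying the assumed PGO multiplier."""
    return base_limit * MULTIPLIERS[hotness]


for base in (1, 3, 5, 10):
    row = {tier: effective_import_limit(base, tier) for tier in MULTIPLIERS}
    print(base, row)
```

Under these assumptions, even a base limit of 5 would still let functions of up to 50 instructions be imported on hot paths, which may help explain why small limits do not immediately show up in Speedometer.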
Assignee | ||
Comment 28•5 years ago
|
||
I was mostly curious about the degree to which the -import-instr-limit could reduce the binary size.
-import-instr-limit=1 is as low as it gets (-import-instr-limit=0 binaries are the same size, I guess not a lot of single-instructions being imported...).
Looking mostly at speedometer because reproducibility, I think that performance starts to degrade slightly at -import-instr-limit=3, particularly on Pixel 2 pgo.
-O2 (left) compare against -O2, limit=5
-O2 (left) compare against -O2, limit=3
-O2 (left) compare against -O2, limit=1
Assignee | ||
Comment 29•5 years ago
|
||
These are the first visual metric results (Moto G5), cold loads.
They compare the baseline configuration (-Oz) to the -O2 build.
SpeedIndex and ContentfulSpeedIndex both look to be improved by 8-10%. pageLoadTime is onload event timing, and is also improved, perhaps a bit more.
https://docs.google.com/spreadsheets/d/1g9idJimqLgvwK5QK2HOtOVQro2bshZDD5cYPwxAOKwU/edit#gid=1852601574
These tests were run locally with Browsertime and WebPageReplay recordings.
Comment 30•5 years ago
|
||
Looking mostly at speedometer because reproducibility, I think that performance starts to degrade slightly at -import-instr-limit=3, particularly on Pixel 2 pgo.
I've had too many bad experiences with surprise perf regressions from suites that I didn't test, or didn't test to high enough confidence. Even if you have try runs, I highly recommend initially committing a value that is greater than what you want, let it settle for several days, make sure the sheriffs have gone through all their alerts, and only then reduce it further.
Assignee | ||
Comment 31•5 years ago
|
||
(In reply to :dmajor from comment #30)
Looking mostly at speedometer because reproducibility, I think that performance starts to degrade slightly at -import-instr-limit=3, particularly on Pixel 2 pgo.
I've had too many bad experiences with surprise perf regressions from suites that I didn't test, or didn't test to high enough confidence. Even if you have try runs, I highly recommend initially committing a value that is greater than what you want, let it settle for several days, make sure the sheriffs have gone through all their alerts, and only then reduce it further.
That makes sense.
In this bug I would like to simply collect the performance characteristics of each optimization option so that folks can compare them.
The one I'm missing is -Os: I logged Bug 1593785 as I'm attempting to track down the problems with its PGO runs.
Size-wise it's quite promising, closer to -Oz, at least in the opt build. The -Os opt performance wasn't quite as good as -O2's (~8-9% speedometer improvement vs ~10-11%), but it's still interesting.
geckoview_example.apk sizes:
Optimization  Arch (bits)  Size (MB)
-Oz, opt      32           43.6
-Os, opt      32           46.7
-Oz, opt      64           50.2
-Os, opt      64           52.5
Comment 32•5 years ago
|
||
As an anecdote I'll note that in bug 1592981 I got some regressions even at limit=10 on linux/win with pgo and limit=40 on mac without pgo. All it took was for one important function (nsStringBuffer::Release) to be now considered too big for inlining, and some of the more tight-C++-loops benchmarks noticed the change, even though Speedometer alone didn't turn up anything.
Comment 33•5 years ago
|
||
Just a thought: if we do modify the optimizations, we may wish to document this or provide a configuration flag so that external GeckoView consumers also have a choice in how large their APKs will be (I'd guess they'd prefer to optimize for APK size over speed).
NI Snorp for awareness: no action needed.
Assignee | ||
Comment 34•5 years ago
|
||
Since Bug 1592981 has landed, builds of -O2 are now smaller relative to the current -Oz.
I've updated the sizes based on a recent push to try (PGO builds).
geckoview_example.apk sizes:
Optimization          Arch (bits)  Size (MB)
-Oz                   32           43.8
-O2 (instr-limit=10)  32           50.4 (+6.6)
-Oz                   64           50.0
-O2 (instr-limit=10)  64           56.3 (+6.3)
Performance looks to be in the same ballpark, although I have yet to do a visual metric comparison and there are pending jobs:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c9d0e32140705667e1384d73362216549b65c763&newProject=try&newRevision=dbcf100a4314589debe141c52ec0767abb1fb458&framework=10
I also noticed that 'official' nightlies of geckoview_example are a lot larger than my try pushes.
i.e. these:
https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.nightly.latest.mobile/android-api-16-opt
https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.nightly.latest.mobile/android-aarch64-opt
geckoview_example.apk sizes:
Optimization  Arch (bits)  Size (MB)
-Oz official  32           48.3
-Oz official  64           54.5
Digging into the apks, it looks like this is due to the localization resources in the omni.ja (assets/omni/chrome, etc.)
Assignee | ||
Comment 35•5 years ago
|
||
We've cleared the hurdles in building -Os PGO (Bug 1593785).
This is an updated view of the binary sizes and speedometer improvements (geckoview_example.apk):
I included a slightly tightened variant on -Os where the import-instr-limit is set to 5 instead of the default 10.
Optimization       Arch (bits)  Size (MB)  Delta (MB)  Speedometer improvement
-Oz                32           44.1       -           --
-Os instr-limit=5  32           47.9       +3.8        10.9%
-Os                32           48.5       +4.4        11.0%
-O2                32           50.7       +6.6        11.7%
-Oz                64           50.3       -           --
-Os instr-limit=5  64           54.1       +3.8        10.1%
-Os                64           54.7       +4.4        10.8%
-O2                64           56.6       +6.3        12.9%
Raptor pageload comparisons:
Oz compared to Os (instr-limit=5) here
Oz compared to Os here
Oz compared to O2 here
Similar to the speedometer results, -O2 looks to be a bit faster than the -Os variants, both of which are significantly faster than -Oz.
Visual metrics tests are running now.
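One way to read the size and Speedometer table in this comment is as a cost per point of improvement. The sketch below divides the size delta by the Speedometer gain for each row; this metric is only an illustration and is not one used elsewhere in this bug.

```python
# (configuration, arch bits, size delta in MB, Speedometer improvement in %)
# Values copied from the table in this comment.
rows = [
    ("-Os instr-limit=5", 32, 3.8, 10.9),
    ("-Os", 32, 4.4, 11.0),
    ("-O2", 32, 6.6, 11.7),
    ("-Os instr-limit=5", 64, 3.8, 10.1),
    ("-Os", 64, 4.4, 10.8),
    ("-O2", 64, 6.3, 12.9),
]


def mb_per_point(delta_mb, improvement_pct):
    """APK growth in MB for each percentage point of Speedometer improvement."""
    return round(delta_mb / improvement_pct, 2)


for config, bits, delta_mb, improvement in rows:
    print(f"{config:<18} {bits}  {mb_per_point(delta_mb, improvement)} MB per point")
```

By this measure the -Os variants buy their Speedometer gains with less APK growth than -O2, while -O2 delivers the largest gain overall.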
Assignee | ||
Comment 36•5 years ago
|
||
Visual metric results from cold loads on the Moto G5 are here:
https://docs.google.com/spreadsheets/d/1g9idJimqLgvwK5QK2HOtOVQro2bshZDD5cYPwxAOKwU/edit#gid=592319917
Overall these builds look like a 5-7% improvement (even though the same -O2 build looked like an 8-10% improvement in the last run).
There is a lot of noise in these tests and locally it's not possible for me to get the high repeat counts that I can get on try.
Assignee | ||
Comment 37•5 years ago
|
||
I think this plot (speedIndex) gives a good view of the performance on different sites.
Assignee | ||
Comment 38•5 years ago
|
||
The Pixel 3 cold page load visual metrics results can be found here:
https://docs.google.com/spreadsheets/d/1g9idJimqLgvwK5QK2HOtOVQro2bshZDD5cYPwxAOKwU/edit#gid=584528033
And I attached a plot of the speedIndex.
The relative noise is quite a bit higher here.
As before, the change looks to be less impactful on Pixel 3 compared to G5.
There are numerous 5-10% improvements on SpeedIndex and also many sites that are not affected within the noise.
Sites like jianshu.com with a ~30% rel std deviation end up significantly lowering or raising the geomean based on how the 25 loads played out.
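The effect of one noisy site on the geomean can be illustrated with made-up numbers. In the sketch below all ratios are hypothetical: nine sites improve by a steady 5%, and a tenth, noisy site lands at three different values within a wide band.

```python
from statistics import geometric_mean


def geomean_change(ratios):
    """Geometric mean of per-site ratios (new / old), as a percent change."""
    return round((geometric_mean(ratios) - 1.0) * 100.0, 1)


# Hypothetical per-site SpeedIndex ratios: nine sites each improve by 5%.
stable_sites = [0.95] * 9

# The tenth site is noisy; try a good run, a neutral run and a bad run.
for noisy_ratio in (0.70, 1.00, 1.30):
    change = geomean_change(stable_sites + [noisy_ratio])
    print(f"noisy site ratio {noisy_ratio:.2f} -> geomean change {change}%")
```

With only ten sites, the single noisy entry moves the reported geomean by several percentage points, which is the same order of magnitude as the improvement being measured.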
Assignee | ||
Comment 39•5 years ago
|
||
I've noted the PGO build times for these options. O2 may take a bit longer to build but save time in the Instrument/Run stages.
(Caveat: I'm not sure of the variance in these)
Option  Arch (bits)  Build  Instrument  Run
Oz      32           37     36          28
Os      32           38     46          28
O2      32           40     31          23
Oz      64           37     -           -
Os      64           38     -           -
O2      64           37     -           -
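Summing the three stages for the 32-bit rows gives a rough view of the end-to-end PGO cost. The table does not state its units, so the totals below are in whatever unit the table uses (presumably minutes), and the variance caveat above still applies.

```python
# Stage times for the 32-bit PGO builds, copied from the table above.
pgo_times_arm32 = {
    "Oz": {"build": 37, "instrument": 36, "run": 28},
    "Os": {"build": 38, "instrument": 46, "run": 28},
    "O2": {"build": 40, "instrument": 31, "run": 23},
}


def total_time(stages):
    """Sum of the build, instrument and run stages."""
    return sum(stages.values())


for option, stages in pgo_times_arm32.items():
    print(option, total_time(stages))
```

On these numbers O2 has the lowest total despite the slightly longer build stage, and Os has the highest because of its instrument stage.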
As an experiment, I compared Oz vs O1 (O1 is ~10-12% faster on speedometer than Oz, and a few points slower than O2)
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=480dd25f9948aed73ecd9683e536b9f47f543cf1&newProject=try&newRevision=5d24e13545108cc8bad1191dba10c20b27552665&framework=10
Assignee | ||
Comment 40•5 years ago
|
||
For performance improvements to page load and speedometer, optimize at -O2 instead of -Oz.
The previous disabling of the outliner, "-mno-outline", was removed because the outliner is not enabled by default with -O2.
(See Bug 1508547 and https://developer.arm.com/docs/101754/latest/armclang-reference/armclang-command-line-options/-moutline-mno-outline)
Assignee | ||
Comment 41•5 years ago
|
||
This is going to land in m-c so that stability can be assessed in a staged rollout to Fenix nightly, org.mozilla.fenix.nightly.
If successful, further experiments will be run to collect data on user engagement, retention, activation rates, and reported performance.
At that point, stakeholders will determine if the performance improvements are worth the increased binary size from the -O2 build (~6.5MB increase).
Comment 42•5 years ago
|
||
bugherder |
Assignee | ||
Comment 43•5 years ago
|
||
FYI, this change will trigger a series of performance improvement sheriffing alerts on android:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e599f431034ecca4d749fda872a2de60c3c1d721&newProject=try&newRevision=d9491acbc0d5195d09e80c36c988e740a9a3cb93&framework=10
Comment 46•5 years ago
|
||
== Change summary for alert #24721 (as of Tue, 21 Jan 2020 08:40:55 GMT) ==
Improvements:
16% build times android-4-0-armv7-api16 pgo instrumented taskcluster-c5.4xlarge 1,945.87 -> 1,636.80
16% build times android-4-0-armv7-api16 pgo instrumented taskcluster-m5.4xlarge 2,009.47 -> 1,693.59
15% build times android-4-0-armv7-api16 pgo instrumented taskcluster-c5d.4xlarge 1,980.25 -> 1,686.91
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=24721
Comment 47•5 years ago
|
||
== Change summary for alert #24774 (as of Thu, 23 Jan 2020 06:43:35 GMT) ==
Improvements:
10% raptor-tp6m-allrecipes-geckoview-cold loadtime android-hw-g5-7-0-arm7-api-16 pgo 7,101.17 -> 6,398.00
9% raptor-tp6m-allrecipes-geckoview-cold android-hw-g5-7-0-arm7-api-16 pgo 2,449.88 -> 2,222.98
8% raptor-tp6m-allrecipes-geckoview-cold android-hw-g5-7-0-arm7-api-16 pgo 2,439.74 -> 2,240.65
7% raptor-tp6m-booking-geckoview-cold fcp android-hw-g5-7-0-arm7-api-16 pgo 808.00 -> 754.00
6% raptor-speedometer-geckoview android-hw-g5-7-0-arm7-api-16 pgo 9.45 -> 10.02
6% raptor-speedometer-geckoview android-hw-p2-8-0-android-aarch64 pgo 24.44 -> 25.82
5% raptor-tp6m-wikipedia-geckoview-cold loadtime android-hw-g5-7-0-arm7-api-16 pgo 1,087.38 -> 1,029.50
5% raptor-tp6m-allrecipes-geckoview-cold fcp android-hw-g5-7-0-arm7-api-16 pgo 1,741.96 -> 1,655.12
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=24774
Assignee | ||
Comment 48•5 years ago
|
||
The Fenix experiment on the stability and pageload vs APK size tradeoff has just started:
https://github.com/mozilla-mobile/fenix/issues/7795
Comment 49•5 years ago
|
||
Landed on Beta to trigger the GV builds needed for the experiment.
https://hg.mozilla.org/releases/mozilla-beta/rev/c270bc80557a52d64796167aa11798fb0961cfaf
Then backed out after the builds were triggered.
https://hg.mozilla.org/releases/mozilla-beta/rev/adaa66ecae24973f1a75d2d955024b64ba192f7b
Comment 50•5 years ago
|
||
backout |
Backed out from Beta74 to avoid confusion with the Beta migration builds and the experiments running around this change, per Slack discussion. It remains landed on mozilla-central for GV75+.
https://hg.mozilla.org/releases/mozilla-beta/rev/607417212e2592dc01b2dd55b6a88c9c15509450
Comment 51•5 years ago
|
||
Beta backout:
== Change summary for alert #25097 (as of Mon, 24 Feb 2020 07:33:38 GMT) ==
Regressions:
20% build times android-4-0-armv7-api16 pgo instrumented taskcluster-c5.4xlarge 1,524.47 -> 1,831.05
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=25097
Comment 52•5 years ago
|
||
backout |
Backed out from Beta75, same story as comment 50.
https://hg.mozilla.org/releases/mozilla-beta/rev/f87e8a58c39a8964bbb0bf7aa1e87b96c408ca63
Assignee | ||
Comment 53•2 years ago
|
||
With the focus on Speedometer 3, I've re-run the "-O2" performance comparison.
This is how the binary size changes, measured in bytes, looking at the geckoview example from a try push:
               Arm v7      AArch 64
-Oz (current)  79,681,645  86,908,174
-Os            81,498,693  89,618,699
-O2            82,953,919  90,891,453
Independent of these optimizations, from comment 35 it looks like geckoview example has grown by about 40 megs over the last three years.
In terms of performance, we are no longer seeing the large 10-12% improvements in speedometer/speedometer 3.
With "-O2" it looks like only 2-3% (although some subtests may show greater improvements).
If I add in pageload tests, we see some other small 2-3% gains with "-O2":
-O2
From a quick look, "-Os" doesn't seem very promising anymore:
-Os
Assignee | ||
Comment 54•2 years ago
|
||
One idea is to first trim what we would want to add.
So if -O2 adds between 3 and 4 megabytes to the final binary in Fenix, we would first invest in finding those savings elsewhere before landing the change.
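The 3 to 4 megabyte figure can be checked against the byte counts in comment 53. Note that those counts are for the geckoview example from a try push, not Fenix itself, and that the sketch assumes 1 MB = 1,000,000 bytes.

```python
# Binary sizes in bytes, copied from the table in comment 53.
sizes_bytes = {
    "armv7": {"-Oz": 79_681_645, "-Os": 81_498_693, "-O2": 82_953_919},
    "aarch64": {"-Oz": 86_908_174, "-Os": 89_618_699, "-O2": 90_891_453},
}


def delta_mb(arch, config, baseline="-Oz"):
    """Size increase of `config` over the baseline, in MB (10**6 bytes)."""
    delta = sizes_bytes[arch][config] - sizes_bytes[arch][baseline]
    return round(delta / 1_000_000, 2)


for arch in sizes_bytes:
    print(arch, "-Os:", delta_mb(arch, "-Os"), "MB,", "-O2:", delta_mb(arch, "-O2"), "MB")
```

Both -O2 deltas fall inside the 3 to 4 MB range, and the -Os deltas are roughly 1 MB smaller on each architecture.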
Assignee | ||
Comment 55•2 years ago
|
||
An experiment is being done in Bug 1831935