Closed
Bug 1101182
Opened 10 years ago
Closed 10 years ago
Firefox is unusably janky & pegs the CPU, with Web of Trust & large Facebook timeline, due to ~3x perf regression from e10s shim functions (in non-e10s session)
Categories
(Core :: XPConnect, defect)
Core
XPConnect
Tracking
()
RESOLVED
FIXED
mozilla40
People
(Reporter: dholbert, Assigned: gkrizsanits)
References
(Blocks 1 open bug, )
Details
(Keywords: perf, power, regression)
Attachments
(3 files, 1 obsolete file)
(deleted),
image/png
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
I've been living with constant 100% CPU usage in Firefox Nightly (even when viewing static content), with often slow responsiveness. I finally got around to investigating. I've disabled e10s, if that matters.
Anyway, with the Gecko Profiler, I discovered that this background CPU usage is from the Web of Trust add-on, constantly running JS in the background. It spends most of its time in "FindReplaceLength" (a string-replacement function in the JS engine). See attached screenshot of profile. (I'll post a link to an interactive profile shortly; the one I recorded is too large & got rejected by the "share..." option.)
Add-on URL:
https://addons.mozilla.org/en-US/firefox/addon/wot-safe-browsing-tool/
This add-on is version "20131118", i.e. from a year ago. It's possible that it interacts poorly with some Firefox/Gecko change from that time-period -- I don't think it's always been this bad. Regardless, it shouldn't be running JS in the background constantly.
Reporter | ||
Updated•10 years ago
|
Reporter | ||
Comment 1•10 years ago
|
||
(Note that this is a "Featured" add-on on AMO, with nearly 1.3 million users.)
Comment 2•10 years ago
|
||
Daniel said this showed up in about the last week, and the profile shows that most of the time is being spent under some addon interposition thing.
Flags: needinfo?(wmccloskey)
Comment 3•10 years ago
|
||
I'm going to blame addon interposition until proven otherwise. ;)
Component: Add-ons → XPConnect
Product: Tech Evangelism → Core
Comment 4•10 years ago
|
||
Also of note, I think dholbert is running with e10s disabled.
Reporter | ||
Comment 5•10 years ago
|
||
(Correct, this is with e10 disabled.)
Reporter | ||
Comment 6•10 years ago
|
||
Also, 100% cpu usage might've been an overstatement; it looks like I'm actually seeing a baseline of 30%-60% usage (viewing about:blank, with a bunch of mostly-static background tabs). Though I've definitely had points where I'm viewing static content and "top" shows that Firefox is at 100% CPU (like right before I captured the profile shown in the screenshot).
Summary: Firefox has constant 100% CPU usage, and slow response-times, with "Web of Trust" add-on enabled (in their 'search.js') → Firefox has constant 30-100% CPU usage, and slow response-times, with "Web of Trust" add-on enabled (in their 'search.js')
Reporter | ||
Comment 7•10 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> Daniel said this showed up in about the last week, and the profile shows
> that most of the time is being spent under some addon interposition thing.
I wonder if it's really just that string-replacement is expensive, though... (and string-replacement is what's happening inside of the addon interposition thing. So it *looks* like add-on interposition is being a CPU hog, but really it's just dutifully performing some expensive string-manipulation as requested by the add-on's JS.)
I tried to capture a smaller profile which I could upload & share the URL, but I'm actually now having a harder time reproducing this, after disabling WOT & restarting Firefox, and then re-enabling WOT & restarting Firefox. And unfortunately, the profile that I screenshotted (too-big-for-cleopatra) has now been deleted, because cleopatra (or the profiler add-on) only keeps the last 5 profile-analyses.
I'll try to get another profile when I notice my UI getting unresponsive again, though.
Comment hidden (obsolete) |
Reporter | ||
Comment 9•10 years ago
|
||
Better profile:
http://people.mozilla.org/~bgirard/cleopatra/?customProfile=http://people.mozilla.org/~dholbert/profiles/rAs9dbaG.bin
My laptop is currently suffering from this, with CPU going between 70%-100%, with my foreground tab being Bugzilla or other static content. It janks every couple of seconds, usually losing keys that I typed (if I'm typing), making it pretty unusable.
I recorded this profile while viewing data:text/html,<textarea> in my foreground tab and typing the alphabet (with dropped characters) into the textarea. This shows 50% of our time spent in WoT JS code.
Summary: Firefox has constant 30-100% CPU usage, and slow response-times, with "Web of Trust" add-on enabled (in their 'search.js') → Firefox has constant 70-100% CPU usage, and slow response-times, with "Web of Trust" add-on enabled (in their 'search.js')
Comment 10•10 years ago
|
||
Can you hack your local copy of wot to log when wot_search.watch is called, so we can see whether it's a few slow calls or a huge number of calls going on?
Reporter | ||
Comment 11•10 years ago
|
||
Good idea.
For reference, here's a patch showing my diffs to hack my local copy of WOT & add logging (with lots of patch-context, for reference about what JS code is running here).
Specifically, I did this:
# Install WOT from
# https://addons.mozilla.org/en-US/firefox/addon/wot-safe-browsing-tool/
# and restart firefox to complete installation. Then, quit firefox.
cd $FIREFOX_PROFILE_DIR
cd extensions/\{a0d7ccb3-214d-498b-b4aa-0e8fda9a7bf7\}/chrome
unzip wot.jar
cd ..
patch -p1 < $PATCHFILE
The next time I start Firefox, I have lines like this in the output:
> ***dholbert 10:37:22 AM wot_search.watch('https://twitter.com/')
> ***dholbert 10:37:22 AM wot_search.process doing rule-matching for 'https://twitter.com/'
> ***dholbert 10:37:22 AM wot_search.process() completed in 100.9264560000156ms
Reporter | ||
Comment 12•10 years ago
|
||
In a recently-started, not-yet-janky Firefox session, I get two back-to-back (redundant?) rounds of logging, on the following approximate time-periods:
- every 2 seconds for a Facebook app-tab
- every 4 seconds for a Gmail app-tab
- every 10-25 seconds for a Twitter app-tab
As shown in the patch context, WOT has 'watch' chained off of a mutation listener. I suspect it might be re-scanning the full DOM on each of these mutations, which explains why it becomes slow after a while (because background facebook/twitter tabs will have accumulated a lot of content)
I was able to just make my session unusable by paging down my facebook timeline continuously for ~30 seconds (loading lots of content). This makes my ~every-two-second Facebook-mutation-triggered WOT-scans look like the following:
***dholbert 10:55:25 AM wot_search.watch('https://www.facebook.com/')
***dholbert 10:55:25 AM wot_search.process doing rule-matching for 'https://www.facebook.com/'
***dholbert 10:55:26 AM wot_search.process() completed in 926.6299670001026ms
***dholbert 10:55:26 AM wot_search.watch('https://www.facebook.com/')
***dholbert 10:55:26 AM wot_search.process doing rule-matching for 'https://www.facebook.com/'
***dholbert 10:55:27 AM wot_search.process() completed in 834.4735779999755ms
***dholbert 10:55:27 AM wot_search.watch('https://www.facebook.com/')
***dholbert 10:55:27 AM wot_search.process doing rule-matching for 'https://www.facebook.com/'
***dholbert 10:55:28 AM wot_search.process() completed in 960.4412390002981ms
Yes, that's 960ms = approximately 1 second spent in wot_search.process(). And then facebook dynamically updates 1 second later, and we spend another second processing rules.
Reporter | ||
Comment 13•10 years ago
|
||
Also: this bug does not seem to reproduce in e10s mode, because the mutation observer fails to get hooked up there (so Facebook doesn't prompt WOT every 2 seconds). I filed bug 1102407 on that.
Reporter | ||
Comment 14•10 years ago
|
||
Two more data points:
- I can reproduce the 1-second process()-times with a fresh profile, with WOT installed. (w/ logging patch)
- I can reproduce a much-reduced version of this bug in Firefox release (33) -- that is, the process() times do get longer (and block user input) as my timeline grows -- but the process times are still much shorter than they are in Nightly. (put another way: it doesn't get so bad, so quickly). For example: I held down my "end" key for ~30 seconds, and my wot_search.process() times weren't getting much higher than ~200ms (bad, but not crazy-bad). Whereas with Nightly, they get up to 600ms after I hold down "end" for maybe 10 seconds.
But
Reporter | ||
Comment 15•10 years ago
|
||
(Sorry, ignore the "But" at the end of previous comment)
Reporter | ||
Comment 16•10 years ago
|
||
Tracking down a regression range, w/ following experiment/definitions per-build (using the same profile, with a facebook app-tab and the WOT add-on w/ logging patch):
- Start Firefox
- Shift-reload facebook app-tab (to get a consistent baseline & make sure there's no extra content from session-restore).
- Hold down "end" key for 20 seconds
- See what the most recent "WOT logging process() completed in" values are.
At the end of the experiment, in "good" builds, the process-times are all ~150-200ms. In bad builds, they're 650-950ms.
Moreover, the "good" builds are better before I've held down "end" to bloat my Facebook timeline, too. They've got process()-times in the ~30-50ms range for a normal-size Facebook page, whereas bad builds are in the 100-200ms range.
So generally, it looks in post-regression builds, WOT's "process()" calls take around 3x as long.
Anyway, I got this m-c regression range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0e631971b841&tochange=0b81c10a9074
This contains mozilla-central changeset c6d53703d449, with commit message:
> Bug 1063156 - Enable e10s shims by default, even when browser.tabs.remote.autostart pref is false.
I believe this is the addon interposition thing mentioned in comment 2. Nothing else in the changelog looks particularly relevant to WOT pattern-matching runtime. Hence, flagging as regression from bug 1063156.
(I'll probably file a separate bug on asking WOT to perhaps clamp the run-time of their rule-matching scans; but for now, it seems their scans got a lot more expensive as a result of bug 1063156, and we should address that.
Blocks: 1063156
Reporter | ||
Comment 17•10 years ago
|
||
(I spun off bug 1102500 with some suggestions on how to improve things on the WoT side. Let's keep this bug here focused on the perf regression caused by the e10s shims (addon interposition stuff).)
Summary: Firefox has constant 70-100% CPU usage, and slow response-times, with "Web of Trust" add-on enabled (in their 'search.js') → Firefox is unusably janky & pegs the CPU, with Web of Trust & large Facebook timeline, due to ~3x perf regression from e10s shim functions (in non-e10s session)
Version: unspecified → Trunk
Reporter | ||
Comment 18•10 years ago
|
||
[Tracking Requested - why for this release]:
Huge perf regression, at least for users who use both Facebook & Web of Trust. (both very popular, so lots of users; also likely affects other sites/add-ons) Also, while this is in e10s-related functionality (the shims), it affects *e10s-disabled* sessions; so if nothing changes, this regression will ride the trains and hit our release users.
tracking-firefox36:
--- → ?
Comment 19•10 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #18)
> [Tracking Requested - why for this release]:
> Huge perf regression, at least for users who use both Facebook & Web of
> Trust. (both very popular, so lots of users; also likely affects other
> sites/add-ons) Also, while this is in e10s-related functionality (the
> shims), it affects *e10s-disabled* sessions; so if nothing changes, this
> regression will ride the trains and hit our release users.
I'd recommend not tracking since shims are ifdef's to nightly
Blocks: e10s-perf, e10s-addons
tracking-e10s:
--- → +
Reporter | ||
Comment 20•10 years ago
|
||
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #19)
> I'd recommend not tracking since shims are ifdef's to nightly
Oh, good point -- I hadn't noticed that. So, this regression won't ride the trains to release (yet), after all. (though it still is painful on Nightly, if you have WoT)
Reporter | ||
Updated•10 years ago
|
tracking-firefox36:
? → ---
This will be a good testcase for speeding up the shims. I don't think I'll be able to get to it right now, though. In the meantime, hopefully bug 1102500 will take care of the problem.
Flags: needinfo?(wmccloskey)
Reporter | ||
Comment 22•10 years ago
|
||
Bill: is there a bug filed to cover the removal of the #ifdef-nightly guards around the shims (and letting them ride the trains)? If so, I'd suggest that this bug should block that bug -- we probably don't want to accidentally ship a ~3-6x perf hit (based on measurements in comment 16).
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(wmccloskey)
Blocks: 1116854
Marked dependent on bug 1116854.
Flags: needinfo?(wmccloskey)
Updated•10 years ago
|
Updated•10 years ago
|
I'll take this but evilpie mentioned some interest in it. I'll hold off for a bit.
Assignee: nobody → wmccloskey
Handing this over to Gabor.
Assignee: wmccloskey → gkrizsanits
Assignee | ||
Comment 27•10 years ago
|
||
I had some talk with Bill about this in Bug 1116854, but I think this really belongs here.
It seems like if the e10s flag is off, the addon wrapper iterates over the proto chain and do an interposition at each level. This patch cut off the unnecessary interpose calls there.
I think this is a non-issue for the e10s turned on case (see: Bug 1116854 Comment 5)
Another thing I see is that we do a lot of interpose for the various global props like wot_search, wot_prefs, wot_core, wot_modules, etc. I was wondering if we could just skip interpose for vanilla js objects. But more importantly why do we have interpose on those? I think that the reason is that the addon loads its scripts via an overlay xul, and I'm not sure that we flag these scripts as part of the addon... Bill, what do you think?
Attachment #8595802 -
Flags: review?(wmccloskey)
Comment on attachment 8595802 [details] [diff] [review]
One interpose call for one property access
Review of attachment 8595802 [details] [diff] [review]:
-----------------------------------------------------------------
I don't know the difference between WebIDL interface objects and WebIDL prototype objects, so Bobby should review.
Attachment #8595802 -
Flags: review?(wmccloskey) → review?(bobbyholley)
Comment 29•10 years ago
|
||
Comment on attachment 8595802 [details] [diff] [review]
One interpose call for one property access
Review of attachment 8595802 [details] [diff] [review]:
-----------------------------------------------------------------
Instead of trying to enumerate the things we _aren't_ interposing on, how about just listing the things we are?
So far, I hear "WebIDL instance objects". Anything else? If that's it, just use IsDOMClass(obj).
Attachment #8595802 -
Flags: review?(bobbyholley) → review-
Assignee | ||
Comment 30•10 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #29)
> Comment on attachment 8595802 [details] [diff] [review]
> One interpose call for one property access
>
> Review of attachment 8595802 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Instead of trying to enumerate the things we _aren't_ interposing on, how
> about just listing the things we are?
>
> So far, I hear "WebIDL instance objects". Anything else? If that's it, just
> use IsDOMClass(obj).
We can try that approach as well, probably it makes a lot more sense.
Will IsDOMClass filter out the proto chain of DOM WebIDL instance objects?
What does it return for natives with webidl support that are not part of the
DOM? (if we have any, or care about them... like console)
We also want to interpose on XPCWrappedNatives.
Comment 31•10 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #30)
> We can try that approach as well, probably it makes a lot more sense.
> Will IsDOMClass filter out the proto chain of DOM WebIDL instance objects?
Yes, those have a different JSClass (IFaceAndProtoJSClass).
See
https://dxr.mozilla.org/mozilla-central/source/dom/bindings/BindingUtils.h?from=BindingUtils.h&case=true#133
and
https://dxr.mozilla.org/mozilla-central/source/dom/bindings/BindingUtils.h?from=BindingUtils.h&case=true#162
> What does it return for natives with webidl support that are not part of the
> DOM? (if we have any, or care about them... like console)
Correct.
> We also want to interpose on XPCWrappedNatives.
Those have different JSClasses.
Assignee | ||
Comment 32•10 years ago
|
||
I needed IS_PROTO_CLASS as well for Components, and we might need it for various other things... Later on I also plan to filter out all the DOM instances if they don't implement any interfaces from a previously registered set. Because I see a lot of unnecessary interposition calls for various DOM instances we never want to do interposition on.
Attachment #8595802 -
Attachment is obsolete: true
Attachment #8598743 -
Flags: review?(bobbyholley)
Comment 33•10 years ago
|
||
Comment on attachment 8598743 [details] [diff] [review]
Doing interposition only on WNs and DOM instances
Review of attachment 8598743 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/xpconnect/wrappers/AddonWrapper.cpp
@@ +29,5 @@
> + RootedObject unwrapped(cx, UncheckedUnwrap(target));
> + const js::Class* clasp = js::GetObjectClass(unwrapped);
> + if (!mozilla::dom::IsDOMClass(clasp) &&
> + !IS_WN_CLASS(clasp) &&
> + !IS_PROTO_CLASS(clasp)) {
I guess this is all certainly an improvement, but I'm skeptical that we really need IS_WN_CLASS and IS_PROTO_CLASS. Are they both just for Components? Why are we interposing there?
Attachment #8598743 -
Flags: review?(bobbyholley) → review+
Assignee | ||
Comment 34•10 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #33)
> Comment on attachment 8598743 [details] [diff] [review]
> Doing interposition only on WNs and DOM instances
>
> Review of attachment 8598743 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: js/xpconnect/wrappers/AddonWrapper.cpp
> @@ +29,5 @@
> > + RootedObject unwrapped(cx, UncheckedUnwrap(target));
> > + const js::Class* clasp = js::GetObjectClass(unwrapped);
> > + if (!mozilla::dom::IsDOMClass(clasp) &&
> > + !IS_WN_CLASS(clasp) &&
> > + !IS_PROTO_CLASS(clasp)) {
>
> I guess this is all certainly an improvement, but I'm skeptical that we
> really need IS_WN_CLASS and IS_PROTO_CLASS. Are they both just for
> Components? Why are we interposing there?
The ones I know about are:
nsICategoryManager, nsIComponentRegistrar, nsIObserverService, nsIXPCComponents_Utils
Comment 35•10 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #34)
> The ones I know about are:
> nsICategoryManager, nsIComponentRegistrar, nsIObserverService,
> nsIXPCComponents_Utils
But aside from nsIXPCComponents_Utils, none of these have PreCreate, which means that we get wrapper-per-scope, and never hit this codepath, right?
Assignee | ||
Comment 36•10 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #35)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #34)
> > The ones I know about are:
> > nsICategoryManager, nsIComponentRegistrar, nsIObserverService,
> > nsIXPCComponents_Utils
>
> But aside from nsIXPCComponents_Utils, none of these have PreCreate, which
> means that we get wrapper-per-scope, and never hit this codepath, right?
Isn't that the reason for having http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCWrappedNativeJSOps.cpp#360 ?
Assignee | ||
Comment 37•10 years ago
|
||
Had some failures on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ae44c5b61d16
So I had to add OuterWindowProxyClass to the list...
It seems to be green now:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e7f0f37dbfad
I might want to further optimize this later (we would only want to do the interposition on chrome windows) but since I think more optimization will come soon, and the aurora uplift is getting close, I would prefer landing this as it is for now. Unless you're against it I will do that tomorrow, or tonight (it's already a big win as it is so it's worth to land it imo).
Comment 38•10 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #36)
> (In reply to Bobby Holley (:bholley) from comment #35)
> > (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #34)
> > > The ones I know about are:
> > > nsICategoryManager, nsIComponentRegistrar, nsIObserverService,
> > > nsIXPCComponents_Utils
> >
> > But aside from nsIXPCComponents_Utils, none of these have PreCreate, which
> > means that we get wrapper-per-scope, and never hit this codepath, right?
>
> Isn't that the reason for having
> http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/
> XPCWrappedNativeJSOps.cpp#360 ?
Oh! I see - I thought that the patch in this bug was wrapper-specific, but both codepaths go through xpc::Interpose. That makes sense.
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #37)
> Had some failures on try:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=ae44c5b61d16
>
> So I had to add OuterWindowProxyClass to the list...
The conventional way to check for outer-windows in this code is to check |clasp->ext.innerObject|, which lets you avoid the extra include. But I guess we include nsGlobalWindow in other places in this code, so maybe this is clearer? Shrug.
> It seems to be green now:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e7f0f37dbfad
>
> I might want to further optimize this later (we would only want to do the
> interposition on chrome windows) but since I think more optimization will
> come soon, and the aurora uplift is getting close, I would prefer landing
> this as it is for now. Unless you're against it I will do that tomorrow, or
> tonight (it's already a big win as it is so it's worth to land it imo).
Yes, please land.
Comment 39•10 years ago
|
||
Comment 40•10 years ago
|
||
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox40:
--- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in
before you can comment on or make changes to this bug.
Description
•