Closed Bug 1325651 Opened 8 years ago Closed 8 years ago

safe mode not working as expected after Bug 1321868

Categories

(Toolkit :: Safe Browsing, defect)

53 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox52 --- unaffected
firefox53 --- fixed

People

(Reporter: streetwolf52, Assigned: ehsan.akhgari)

References

Details

(Keywords: regression)

Attachments

(1 file, 3 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:53.0) Gecko/20100101 Firefox/53.0 Build ID: 20161223033425 Steps to reproduce: Insure Bug 883377 is on your version of Fx. My regression range checking pointed to this bug which may or may not be the cause. Go into safe mode. Go to various sites. Go into about:support. Actual results: Most sites do not load. The spinner on the tab just keeps spinning. Wikipedia for some reason is one of the few sites that works. about:support is missing all information except for the various about: links. Expected results: All sites should load about:support should have all information filled in
Component: Untriaged → JavaScript Engine
Product: Firefox → Core
Here's the regression range. Bad https://hg.mozilla.org/integration/mozilla-inbound/rev/deb743b033b70f97af83172d90008da2a5ce7c59 Goodhttps://hg.mozilla.org/integration/mozilla-inbound/rev/5f494ff3b83be2b5e8529e859d80ca3efacf19f7 Browser Console Error: ReferenceError: event is not defined[Learn More] browser-content.js:1772:1 NS_ERROR_XPC_GS_RETURNED_FAILURE: Component returned failure code: 0x80570016 (NS_ERROR_XPC_GS_RETURNED_FAILURE) [nsIJSCID.getService] nsUrlClassifierListManager.js:80
Blocks: 883377
Keywords: regression
your regression range is https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=5f494ff3b83be2b5e8529e859d80ca3efacf19f7&tochange=deb743b033b70f97af83172d90008da2a5ce7c59 that contains more changes. also, "ReferenceError: event is not defined[Learn More] browser-content.js:1772:1" is caused by bug 1297867 and also fixed there as followup.
Blocks: 1321868
No longer blocks: 883377
Status: UNCONFIRMED → NEW
Component: JavaScript Engine → DOM
Ever confirmed: true
There are 2 regression in -safe-mode on win10 wow64. #1 Tab crashes #2 Not crash, but page is not shown even if about:home #1 Regression window(good -> crash): https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=5f494ff3b83be2b5e8529e859d80ca3efacf19f7&tochange=eb3ac9d2775b8c4e7b089cab4d33d953d6605f0e Regressed by: b3ac9d2775b Ehsan Akhgari — Bug 1321868 - Add an API to nsIDocument to determine whether a script is on the tracking protection list; r=bkelly #2 Regression window(crash -> not crash, but page is not shown): https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c3d4bcee4905b8d322319d9585e76a148a7b707f&tochange=7849f81b44de6f6b1a83545bdd514f0a73f21c5e Regressed by: 7849f81b44de Ehsan Akhgari — Bug 1322204 - Fail gracefully when the URL classifier service isn't available because we're in safe mode; r=smaug
Summary: safe mode not working as expected after Bug 883377 → safe mode not working as expected after Bug 1321868
Flags: needinfo?(ehsan)
(In reply to Alice0775 White from comment #4) > There are 2 regression in -safe-mode on win10 wow64. > #1 Tab crashes > #2 Not crash, but page is not shown even if about:home > > #1 Regression window(good -> crash): > https://hg.mozilla.org/integration/mozilla-inbound/ > pushloghtml?fromchange=5f494ff3b83be2b5e8529e859d80ca3efacf19f7&tochange=eb3a > c9d2775b8c4e7b089cab4d33d953d6605f0e > > Regressed by: > b3ac9d2775b Ehsan Akhgari — Bug 1321868 - Add an API to nsIDocument to > determine whether a script is on the tracking protection list; r=bkelly I think the crash is one of bug 1322204, bug 1323220 or 1325255 which are all fixed. Given the below I think you were hitting the first bug. > #2 Regression window(crash -> not crash, but page is not shown): > https://hg.mozilla.org/integration/mozilla-inbound/ > pushloghtml?fromchange=c3d4bcee4905b8d322319d9585e76a148a7b707f&tochange=7849 > f81b44de6f6b1a83545bdd514f0a73f21c5e > > Regressed by: > 7849f81b44de Ehsan Akhgari — Bug 1322204 - Fail gracefully when the URL > classifier service isn't available because we're in safe mode; r=smaug I'll investigate this one.
Assignee: nobody → ehsan
Flags: needinfo?(ehsan)
Can someone please provide an example website that doesn't load? I tried a few websites and they all loaded fine.
cnn.com doesn't load for me.
Don't forget that about:support is missing information.
Hmm, cnn.com was one of the websites that I tested. I need to know the exact version that you are running to try to reproduce there. Can you please go to about:buildconfig and paste everything there inside a comment here? Thank you!
I tested with clean new profile. https://hg.mozilla.org/mozilla-central/rev/2785aaf276ba29fb2e1f5607d90d441fee42efb4 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0
about:buildconfig Source Built from https://hg.mozilla.org/integration/mozilla-inbound/rev/981fd29c57012c7e8a28168ca0941e00be560462 Build platform target x86_64-pc-mingw32 Build tools Compiler Version Compiler flags c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/vs2015u3/VC/bin/amd64/cl.EXE 19.00.24213 -TC -nologo -wd4091 -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -utf-8 -FS -Gw -wd4244 -wd4267 -we4553 c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/vs2015u3/VC/bin/amd64/cl.EXE 19.00.24213 -TP -nologo -wd5026 -wd5027 -Zc:sizedDealloc- -Zc:threadSafeInit- -wd4091 -wd4577 -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -utf-8 -FS -Gw -wd4251 -wd4244 -wd4267 -wd4800 -wd4595 -we4553 -GR- -Zi -GL -wd4624 -wd4952 -O1 -Oi -Oy- Configure options MOZ_AUTOMATION=1 'MOZILLABUILD=C:\mozilla-build' --host=x86_64-pc-mingw32 --target=x86_64-pc-mingw32 --enable-update-channel= MOZ_PGO=1 WINDOWSSDKDIR=c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/vs2015u3/SDK --enable-jemalloc --enable-js-shell RUSTC=c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/rustc/bin/rustc CARGO=c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/rustc/bin/cargo --enable-eme=+adobe --with-mozilla-api-keyfile=/c/builds/mozilla-desktop-geoloc-api.key --with-google-api-keyfile=/c/builds/gapi.data MAKE=c:/builds/moz2_slave/m-in-w64-pgo-00000000000000000/build/src/mozmake.EXE --enable-crashreporter --enable-release --enable-verify-mar --with-branding=browser/branding/nightly
(In reply to Alice0775 White from comment #11) > I tested with clean new profile. > https://hg.mozilla.org/mozilla-central/rev/ > 2785aaf276ba29fb2e1f5607d90d441fee42efb4 > Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0 The link you give above loads a few pages worth and then will eventually hang on me. This is in normal mode not safe mode.
Hmm, I can reproduce on Nightly but not on my local build. I suspect the reason is that my local build is out of date. I'm making a new one right now.
(In reply to Gary [:streetwolf] from comment #13) > (In reply to Alice0775 White from comment #11) > > I tested with clean new profile. > > https://hg.mozilla.org/mozilla-central/rev/ > > 2785aaf276ba29fb2e1f5607d90d441fee42efb4 > > Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0 > > The link you give above loads a few pages worth and then will eventually > hang on me. This is in normal mode not safe mode. That is a giant page which is super slow to load...
(In reply to Gary [:streetwolf] from comment #13) > (In reply to Alice0775 White from comment #11) > > I tested with clean new profile. > > https://hg.mozilla.org/mozilla-central/rev/ > > 2785aaf276ba29fb2e1f5607d90d441fee42efb4 > > Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0 > > The link you give above loads a few pages worth and then will eventually > hang on me. This is in normal mode not safe mode. You encountered an another bug.
OK, so the reason I couldn't reproduce this in my local build was that I had the fix to bug 1321874 applied (which I'm landing right now). That bug basically moves us off of the code I added in bug 1321868 by doing the URL classification in the parent process, but I would still like to investigate the underlying cause of this bug because it may affect something else. But I don't think we need to track this for the release based on this.
Blocks: 1318768
No longer blocks: 1321868
Component: DOM → Safe Browsing
Product: Core → Toolkit
I misspoke slightly, it is URLClassifierParent::StartClassify which doesn't set *aSuccess to false in the beginning which is causing us to hit this bug, even though the issue in comment 18 is also another similar bug.
Boris, not sure if you're around for the next week or so, baku who reviewed the original patch seems to not be working...
Attachment #8821676 - Flags: review?(bzbarsky)
Hmm, thinking a bit more, we probably want to move setting of *aSuccess to the beginning of ContentParent::RecvPURLClassifierConstructor() instead, since bug 1325255 added one early return there as well.
Attachment #8821677 - Flags: review?(bzbarsky)
Attachment #8821676 - Attachment is obsolete: true
Attachment #8821676 - Flags: review?(bzbarsky)
Safe mode is also destroying sessionstore files in that when starting in safe-mode the session is not restored, and its not restored when returning to normal operation. Safe-mode is mearly opening a blank screen with 1 tab, blank page. Tested with latest m-c win32 hourly build, New Profile, no addons/changes: cset: https://hg.mozilla.org/mozilla-central/rev/da22155a2dc30dafc93d70f38f6bb8a49248c80f Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0 Opened CNN.Com & Foxnews.come Closed browser Shift-key: click Nightly Icon to start Nightly Safe-mode Reply to prompt to 'continue' in safe-mode Note session is gone.
Comment on attachment 8821677 [details] [diff] [review] Assume failure in case of early returns in a couple of places So I'm not sure I follow. Classify() can set aSuccess or it can return a failure. That part seems fine. URLClassifierParent::StartClassify is buggy in the sense that it leaves *aSuccess unset if NS_FAILEED(rv). Shouldn't it set it to something in that case, or fail IPC or something? The RecvPURLClassifierConstructor change is ok to handle that part if it's the only caller of StartClassify; in that case StartClasify should assert that the incoming *aSuccess is false. But I think the changes to nsUrlClassifierDBService::Classify are unnecessary and arguably wrong.
Attachment #8821677 - Flags: review?(bzbarsky) → review-
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #24) > Comment on attachment 8821677 [details] [diff] [review] > Assume failure in case of early returns in a couple of places > > So I'm not sure I follow. Classify() can set aSuccess or it can return a > failure. That part seems fine. URLClassifierParent::StartClassify is buggy > in the sense that it leaves *aSuccess unset if NS_FAILEED(rv). Shouldn't it > set it to something in that case, or fail IPC or something? I guess I can set *aSuccess to false there too. > The RecvPURLClassifierConstructor change is ok to handle that part if it's > the only caller of StartClassify; It is. > in that case StartClasify should assert > that the incoming *aSuccess is false. It may as well set it to false itself too. > But I think the changes to > nsUrlClassifierDBService::Classify are unnecessary and arguably wrong. Yes they are unnecessary for this bug, but I thought I'd do that here. If the caller provides a properly initialized value, then there is no need to do anything here. Anyway, I'll remove that part.
Attachment #8821709 - Flags: review?(bzbarsky)
Attachment #8821677 - Attachment is obsolete: true
Comment on attachment 8821709 [details] [diff] [review] Assume failure in case of early returns in a couple of places So here's a question: what if Classify sets aSuccess to true and then returns error? This is generally ok XPCOM behavior, from what I understand, right?
Flags: needinfo?(ehsan)
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #27) > Comment on attachment 8821709 [details] [diff] [review] > Assume failure in case of early returns in a couple of places > > So here's a question: what if Classify sets aSuccess to true and then > returns error? This is generally ok XPCOM behavior, from what I understand, > right? We handle that correctly here: <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/dom/ipc/URLClassifierParent.cpp#30> Two things to note: 1) The variable name here is misleading. This boolean out arg is supposed to mean whether an asynchronous lookup was performed, so that the caller knows whether to expect a callback. In the specific case of URLClassifierParent, we always need to respond to the child with something -- but we use this information to judge whether we should respond with failure immediately. 2) You should ignore comment 18. In the buggy scenario, what happens is that we get a failure here and we never call Classify() at all, and we don't touch *aSuccess, which means whatever the caller initialized it with will be what is used. Since the call to StartClassify() comes from RecvPURLClassifierConstructor, *aSuccess is guaranteed to be initialized to true by AllocPURLClassifierParent here <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/dom/ipc/ContentParent.cpp#4783>, so we just bail out early, sending a void_t response to the child. On the child side, we receive a true value for the out arg here <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1473> so the caller would expect to receive a callback, but this check won't pass due to the void_t type <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/dom/ipc/URLClassifierChild.cpp#16> and the client code sits there waiting for the callback which will never arrive. Because we were using this API during script loading, this manifested as no script loads ever finishing in the content process in e10s mode.
Flags: needinfo?(ehsan)
(And if you're wondering how we're even using the URL classifier service in the child process since it's supposed to be disabled by safe mode here <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1305>, see bug 1326245...)
This bug is still open but 53 is marked as fixed? What needs to be done here still?
Flags: needinfo?(ehsan)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #30) > This bug is still open but 53 is marked as fixed? What needs to be done here > still? See comment 17 where I set the status for 53 fixed. But we should fix this nonetheless.
Flags: needinfo?(ehsan)
> We handle that correctly here: <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/dom/ipc/URLClassifierParent.cpp#30> OK, I feel like I'm missing something. Say we call Classify() and it returns failure nsresult _and_ sets *aSuccess to true. We will call ClassificationFailed, which calls Send__delete__(this, void_t()). And we will return a true *aSuccess, right? The true *aSuccess means the client will wait for a callback. Passing void_t() to Send__delete__ means we won't call OnClassifyComplete() in URLClassifierChild::Recv__delete__, right? So isn't this precisely the scenario you're talking about, where the consumer expects a callback but will never get it?
Flags: needinfo?(ehsan)
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #32) > > We handle that correctly here: <http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/dom/ipc/URLClassifierParent.cpp#30> > > OK, I feel like I'm missing something. > > Say we call Classify() and it returns failure nsresult _and_ sets *aSuccess > to true. We will call ClassificationFailed, which calls > Send__delete__(this, void_t()). And we will return a true *aSuccess, right? Yes. > The true *aSuccess means the client will wait for a callback. Passing > void_t() to Send__delete__ means we won't call OnClassifyComplete() in > URLClassifierChild::Recv__delete__, right? So isn't this precisely the > scenario you're talking about, where the consumer expects a callback but > will never get it? Ugh, yeah, bug 1323220 changed things so that when OnClassifyComplete is called in the parent in the scenario above, it doesn't do anything, since mIPCOpen will be false due to the actor being destroyed. :( Sorry, I didn't realize this earlier, since the intention of the code was for the above scenario to send a callback to the child process. I'll fix this.
Flags: needinfo?(ehsan)
No wait, I was still misunderstanding before! The comment here clearly suggests that the intention was the opposite. The logic here changed during the review of bug 1318768, so indeed in this case we need to cancel the callback too.
Attachment #8822584 - Flags: review?(bzbarsky)
Attachment #8821709 - Attachment is obsolete: true
Attachment #8821709 - Flags: review?(bzbarsky)
Comment on attachment 8822584 [details] [diff] [review] Assume failure in case of early returns in a couple of places This makes a lot more sense! r=me
Attachment #8822584 - Flags: review?(bzbarsky) → review+
Pushed by eakhgari@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/263b1b07c2ac Assume failure in case of early returns in a couple of places; r=bzbarsky
I don't believe that was this patch's fault.
Flags: needinfo?(ehsan)
Pushed by eakhgari@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/13071c3b691c Assume failure in case of early returns in a couple of places; r=bzbarsky
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
@ehsan I plan to seek uplift of Bug 1318768 so that I can have Bug 1307604 uplifted. Does this need to be uplifted with those?
Flags: needinfo?(ehsan)
(In reply to Kirk Steuber [:bytesized] from comment #42) > @ehsan I plan to seek uplift of Bug 1318768 so that I can have Bug 1307604 > uplifted. Does this need to be uplifted with those? Is that a good idea? There were a few regressions that we fixed after bug 1318768 landed. If you really want to uplift you should look at the history of the code to see what fixes landed on top of it. But I suggest we shouldn't uplift this... Also, see bug 1334616...
Flags: needinfo?(ehsan)
Comment on attachment 8822584 [details] [diff] [review] Assume failure in case of early returns in a couple of places Approval Request Comment [Feature/Bug causing the regression]: This feature is needed for the Shield study to be run on Release 52 (bug 1335232), which we'll use to study the effect of making flash click-to-play by default. [User impact if declined]: Can't run the study as intended [Is this code covered by automated tests?]: Yes [Has the fix been verified in Nightly?]: Yes [Needs manual test from QE? If yes, steps to reproduce]: Not for this feature independently. We'll do QE on the study as a whole to make sure all pieces work as expected [List of other uplifts needed for the feature/fix]: Bug 1318768, Bug 1323220, Bug 1325255, Bug 1322204, Bug 1325651, Bug 1319571, Bug 1321377, Bug 1307604, Bug 1323064, Bug 1335549, Bug 1333303, Bug 1333483, Bug 1336714, Bug 1338287 [Is the change risky?]: No [Why is the change risky/not risky?]: Fixes Bug 1318768 [String changes made/needed]: none
Attachment #8822584 - Flags: approval-mozilla-beta?
Attachment #8822584 - Flags: approval-mozilla-aurora?
Comment on attachment 8822584 [details] [diff] [review] Assume failure in case of early returns in a couple of places It's been in Aurora53 already. Aurora53-.
Attachment #8822584 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Comment on attachment 8822584 [details] [diff] [review] Assume failure in case of early returns in a couple of places this was deemed too risky for beta
Attachment #8822584 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: