Closed
Bug 1385928
Opened 7 years ago
Closed 7 years ago
Mozregression launched nightly after 2017-07-30 don't load start page
Categories
(Core :: Security: Process Sandboxing, defect, P2)
Core
Security: Process Sandboxing
Tracking
()
VERIFIED
FIXED
mozilla57
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | verified |
firefox57 | --- | verified |
People
(Reporter: tcampbell, Assigned: bobowen)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: regression, Whiteboard: sb+)
Attachments
(2 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
jimm
:
review+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
STR:
- On Win10, use mozregression to start nightly 2017-07-30
Expected:
- Welcome to nightly start page appears
Actual:
- No page loaded
- Hidden window appears in task bar
- Window close buttons don't work (but hotkeys Ctrl+Shift+Q does work)
Regression Range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9ff81e376529981389074a985083d161e2f155d3&tochange=27ac1eee8d2952cfb25b32f71a62dd0dd7f9e05e
This seems like an interaction between sandbox updates and mozregression script. Starting a system nightly with --profile works fine for me.
Reporter | ||
Comment 1•7 years ago
|
||
Could changes in Bug 1369669 break the way mozregression starts up new profiles?
Flags: needinfo?(bobowencode)
Reporter | ||
Updated•7 years ago
|
Component: mozregression → Security: Process Sandboxing
Product: Testing → Core
Version: Version 3 → unspecified
Comment 2•7 years ago
|
||
Also I'm unable to navigate to a site. Mozregression-gui is unusable regarding recent regressions right now. I'll attach the output from the log view.
Comment 3•7 years ago
|
||
Reporter | ||
Comment 4•7 years ago
|
||
Temporary work-around is to run mozregression with |--pref security.sandbox.content.level:0| option. This will turn down sandbox and allow using mozregression until issue can be fixed.
Comment 5•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #4)
> Temporary work-around is to run mozregression with |--pref
> security.sandbox.content.level:0| option. This will turn down sandbox and
> allow using mozregression until issue can be fixed.
Is this also possible with mozregression-gui?
Comment 6•7 years ago
|
||
(In reply to Arjen de Jager from comment #5)
> (In reply to Ted Campbell [:tcampbell] from comment #4)
> > Temporary work-around is to run mozregression with |--pref
> > security.sandbox.content.level:0| option. This will turn down sandbox and
> > allow using mozregression until issue can be fixed.
>
> Is this also possible with mozregression-gui?
And yes, it is. I'm sorry for bothering you! I'm not very experienced with mozregression-gui and didn't experiment with it's options that much already.
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #1)
> Could changes in Bug 1369669 break the way mozregression starts up new
> profiles?
This just resolved any symlinks or junction points in the child executable path.
Was it definitely bug 1369669 not bug 1366697?
Maybe it was the strengthening of the sandbox that changed things, does mozregression rely on reading files in the content process?
Flags: needinfo?(bobowencode)
Reporter | ||
Comment 8•7 years ago
|
||
Doesn't look like Bug 1366697 is to blame. We should already have been at level 3 since these are nightly builds. As well, sandbox level 1 still has the issue.
Looking at console output, I see the following when sandbox is on:
[Parent 8576] WARNING: Failed to launch tab subprocess: file z:/build/build/src/ipc/glue/GeckoChildProcessHost.cpp, line 604
Updated•7 years ago
|
Priority: -- → P1
Whiteboard: sb?
Assignee | ||
Comment 9•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #8)
> Doesn't look like Bug 1366697 is to blame. We should already have been at
> level 3 since these are nightly builds. As well, sandbox level 1 still has
> the issue.
>
> Looking at console output, I see the following when sandbox is on:
> [Parent 8576] WARNING: Failed to launch tab subprocess: file
> z:/build/build/src/ipc/glue/GeckoChildProcessHost.cpp, line 604
Yes it must be Bug 1369669 then.
I can't reproduce, but I notice that mozregression is installing firefox in the temp dir, so maybe the resolution of junction points and symlinks is failing in some circumstances.
I noticed that I missed checking for that, so I've filed bug 1388048 for fixing that and hopefully that will fix this.
Updated•7 years ago
|
Priority: P1 → P2
Whiteboard: sb? → sb+
Reporter | ||
Comment 10•7 years ago
|
||
Logs after Bug 1388048 landed.
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\browser
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\browser
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\firefox.exe
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\firefox.exe
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmptahvdw.mozrunner
0:02.75 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmptahvdw.mozrunner
0:03.30 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: C:\Users\tcampbell\AppData\LocalLow\Mozilla\Temp-{6662030a-6ae2-4a45-8ff3-7f2858a2cbb2}
0:03.30 INFO: [Main Thread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\LocalLow\Mozilla\Temp-{6662030a-6ae2-4a45-8ff3-7f2858a2cbb2}
0:03.30 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\firefox.exe
0:03.30 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Parent 7312] WARNING: Failed to launch tab subprocess: file z:/build/build/src/ipc/glue/GeckoChildProcessHost.cpp, line 608
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Parent 7312] WARNING: Failed to launch tab subprocess: file z:/build/build/src/ipc/glue/GeckoChildProcessHost.cpp, line 608
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolving path: c:\users\tcampb~1\appdata\local\temp\tmpewukwq\firefox\firefox.exe
0:03.48 INFO: [Gecko_IOThread]: D/Widget ResolveJunctionPointsAndSymLinks: Resolved path to: C:\Users\tcampbell\AppData\Local\Temp\tmpewukwq\firefox\firefox.exe
0:03.50 INFO: [Parent 7312] WARNING: Failed to launch tab subprocess: file z:/build/build/src/ipc/glue/GeckoChildProcessHost.cpp, line 608
Reporter | ||
Comment 11•7 years ago
|
||
Doesn't reproduce on one of my machines, but does on other. I'm now wondering if problem occurs only when login name is >8 chars and we have short paths.
Reporter | ||
Comment 12•7 years ago
|
||
Confirmed. If I change my temp directory to a path without 8.3names, I no longer get issues.
Reporter | ||
Comment 13•7 years ago
|
||
More specifically, a mix of 8.3name and a directory with >8 chars.
STR:
- Create C:\longpathname\longpathname\
- Put a firefox bundle in it
- In command prompt (boring old windows one), go to C:\longpa~1\longpathname\
- Try to start firefox from here. (Run firefox\firefox.exe)
Reporter | ||
Comment 14•7 years ago
|
||
Will, can we patch mozregression to expand the path returned by mkdtemp as a workaround? It looks like there can be issues on Windows if someone's username is more than 8 characters (or contains a space).
Flags: needinfo?(wlachance)
Comment 15•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #14)
> Will, can we patch mozregression to expand the path returned by mkdtemp as a
> workaround? It looks like there can be issues on Windows if someone's
> username is more than 8 characters (or contains a space).
Should be quite straightforward. Probably just need to use the technique here:
https://stackoverflow.com/questions/11420689/how-to-get-long-file-system-path-from-python-on-windows
Here:
https://github.com/mozilla/mozregression/blob/19cd5b376be731d6b92728716ac9f32a22365184/mozregression/main.py#L51
Or maybe define a utility function performing the transformation, and use it wherever mkdtemp is called (there are a few other spots). If you wanted to do up a PR against mozregression, that would be cool, otherwise I can do it when I get back from PTO/weekend next Monday.
Flags: needinfo?(wlachance)
Assignee | ||
Comment 16•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #13)
> More specifically, a mix of 8.3name and a directory with >8 chars.
>
> STR:
> - Create C:\longpathname\longpathname\
> - Put a firefox bundle in it
> - In command prompt (boring old windows one), go to C:\longpa~1\longpathname\
> - Try to start firefox from here. (Run firefox\firefox.exe)
Ted, thanks for investigating this further.
It looks like we might need to fix this in general.
Although a workaround for mozregression in the short term would be good.
I'll look at this again when I get back.
Comment 17•7 years ago
|
||
By request of :ted, I'm going to look into the mozregression fix. Setting a NI as a reminder.
Flags: needinfo?(wlachance)
Comment 18•7 years ago
|
||
I put a fix in https://github.com/mozilla/mozregression/commit/2a76a90ced4095d557abbd61bab029557e8ae115
and posted a new versions of mozregression command line (the gui should be automatically updated shortly)
Not sure if there's anything left to do for this bug, so I'll just leave it open.
Flags: needinfo?(wlachance)
Reporter | ||
Comment 19•7 years ago
|
||
Thanks. That mozregression patch fixes the issue on my machine and lets me bisect again without turning off sandbox. We can leave this bug open until underlying sandbox issue is fixed.
Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #13)
> More specifically, a mix of 8.3name and a directory with >8 chars.
>
> STR:
> - Create C:\longpathname\longpathname\
> - Put a firefox bundle in it
> - In command prompt (boring old windows one), go to C:\longpa~1\longpathname\
> - Try to start firefox from here. (Run firefox\firefox.exe)
The problem is due to the way the sandbox looks for the base address for the executable.
It compares paths from ::QueryFullProcessImageNameW and ::GetMappedFileNameW (searching through the virtual memory mappings).
These always seem to be consistent on Win7 but on Win10 they don't in circumstances such as these.
(This is probably after creators update as we saw similar issues with firefox.exe if it had been moved after being run and mapped into memory.)
I'll see if I can come up with something.
Assignee: nobody → bobowencode
Status: NEW → ASSIGNED
Assignee | ||
Comment 21•7 years ago
|
||
Chromium have obviously been hitting similar issues [1].
So, I think it makes sense for us to patch our version of GetProcessBaseAddress with this new one.
I might let it bed in for a few days first though. :-)
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=753140
Assignee | ||
Comment 22•7 years ago
|
||
Chromium patch seems to be sticking, here's a try push with there new version of the function:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a8e70452eafef8680ee9def26836dd95a5f9cb6
Assignee | ||
Comment 23•7 years ago
|
||
This should fix issues we have seen with running Firefox from short name paths or moved binaries.
Attachment #8901240 -
Flags: review?(jmathies)
Updated•7 years ago
|
Attachment #8901240 -
Flags: review?(jmathies) → review+
Comment 24•7 years ago
|
||
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d7578086308
Take new implementation of GetProcessBaseAddress from chromium commit f398005bc4ca0cc2dab2198faa99d4ee8f4da60d. r=jimm
Comment 25•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Comment 26•7 years ago
|
||
Please request Beta approval on this when you get a chance.
status-firefox55:
--- → unaffected
status-firefox56:
--- → affected
status-firefox-esr52:
--- → unaffected
Flags: needinfo?(bobowencode)
Assignee | ||
Comment 27•7 years ago
|
||
Comment on attachment 8901240 [details] [diff] [review]
Take new implementation of GetProcessBaseAddress from chromium commit f398005bc4ca0cc2dab2198faa99d4ee8f4da60d
Approval Request Comment
[Feature/Bug causing the regression]:
Windows process sandboxing.
[User impact if declined]:
Users who are running Firefox using a short name path or after moving Firefox may continue to have an unusable version of Firefox.
[Is this code covered by automated tests?]:
This code is in the launching of sandboxed child processes, which happens in nearly all e10s tests.
[Has the fix been verified in Nightly?]:
Yes.
Also the SANDBOX_FAILED_LAUNCH telemetry in Nightly now has no failures for 43 (SBOX_ERROR_CANNOT_FIND_BASE_ADDRESS) since 26th Aug when this landed.
[Needs manual test from QE? If yes, steps to reproduce]:
Yes, STR in comment 13 can be used on Windows 10 (probably needs creators update).
[List of other uplifts needed for the feature/fix]:
None
[Is the change risky?]:
Slightly
[Why is the change risky/not risky?]:
This fix has landed in chromium but it doesn't look like they will uplift it. This is possibly because they have another existing fix that somewhat mitigates the issue.
However the new code is much simpler than the old way of getting the base address, so the risk seems fairly small.
[String changes made/needed]:
None
Flags: needinfo?(bobowencode)
Attachment #8901240 -
Flags: approval-mozilla-beta?
Comment on attachment 8901240 [details] [diff] [review]
Take new implementation of GetProcessBaseAddress from chromium commit f398005bc4ca0cc2dab2198faa99d4ee8f4da60d
Fix for a new regression in 56, let's uplift this for beta 8.
Attachment #8901240 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 29•7 years ago
|
||
bugherder uplift |
Updated•7 years ago
|
Flags: qe-verify+
Comment 30•7 years ago
|
||
I managed to reproduce the bug on an old version of Nightly (2017-07-30) on Windows 10 x64 using the method from comment 13. The page was not loaded, I couldn't close any tabs and couldn't open new ones. The browser crashed.
I retested everything using the same method on latest Nightly 57.0a1 and beta 56.0b12 and the bug is not reproducing anymore.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in
before you can comment on or make changes to this bug.
Description
•