Closed Bug 1219369 Opened 9 years ago Closed 9 years ago

Leak checking does not work in Windows content processes due to being unable to create the bloat log

Categories

(Core :: XPCOM, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
e10s + ---
firefox44 --- affected
firefox47 --- fixed

People

(Reporter: mccr8, Assigned: bobowen)

References

Details

(Whiteboard: [MemShrink])

Attachments

(1 file)

philor did a try run of the current m-c when it is switched over to Aurora, and there are many leaks, in Windows 7 and Windows 8 debug browser chrome mochitests, of a single SyncObject. I'm going to land a suppression for this so the tree doesn't turn orange, but it would be good if we could fix this. https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f45b9cad594&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified
This is a leak in the content process. I filed bug 1219371 for the suppression.
Is this still an issue? Not clear to me
Flags: needinfo?(continuation)
(In reply to Timothy Nikkel (:tnikkel) from comment #2) > Is this still an issue? Not clear to me It is still an issue. If you go to the mozilla-aurora tree, then look at a log for say Windows 7 Mochitest browser-chrome-1 (note that this is non-e10s!), you can see that the leakcheck entries include the following: TEST-INFO | leakcheck | tab process: leaked 1 SyncObject I landed a suppression in bug 1219371 so the tree isn't orange. Though, looking into this now, I'm not sure the underlying issue is actually a graphics leak. In Aurora bc1, the two leaking processes are created during this test: browser/base/content/test/plugins/browser_pluginCrashReportNonDeterminism.js On mozilla-central, this test seems to as part of bc-6 (6? why so high of a number? What is going on with chunking). And, fun fact, it doesn't leak at all. In fact, leakcheck doesn't seem to be aware of the processes that the test creates at all. I'll move this out of graphics and try to figure out what is happening.
Component: Graphics: Layers → General
Flags: needinfo?(continuation)
mconley, you fixed another issue that seemed to be related to browser_pluginCrashReportNonDeterminism.js running on Aurora but not Nightly, in bug 1153659. Did you ever figure out what was going on there?
Flags: needinfo?(mconley)
Running this test locally with Nightly on OSX, I see the two tab process leaks, as on Aurora Windows. I suspect there are two issues here: first, why is this test maybe not really running on Windows Nightly? Two, when we do run this test, why do we leak a SyncObject? I think we may just leak a SyncObject on Windows in a child process, whereas we don't on any other platform. But, maybe that's just some object that CompositorChild or ImageBridge happen to entrain on Windows, in which case there isn't a new leak.
Assignee: nobody → continuation
Flags: needinfo?(mconley)
Ok, I figured out what is happening. On Windows m-c, nsTraceRefcnt (in InitLog) is failing to create the bloat log in content processes. Entries like this appear in the mochitest log: ### XPCOM_MEM_BLOAT_LOG defined -- unable to log bloat/leaks to c:\users\cltbld\appdata\local\temp\tmpefotq8.mozrunner\runtests_leaks_tab_pid3520.log In contrast, the bloat log is being successfully created on Aurora Windows, and Linux m-c. This file is being created directly with a fopen call. I'd guess that Windows m-c has a stricter sandbox than either Aurora Windows or Linux, which is causing the file to fail to be created. Bob, does that sound plausible? There are two problems that need to be addressed: 1) Something needs to be adjusted so that the bloat log is actually created on Windows Nightly. 2) Failure to create the bloat log needs to turn the tree orange somehow. I'll file a separate bug for this, blocked by this bug.
tracking-e10s: --- → ?
Component: General → XPCOM
Flags: needinfo?(bobowen.code)
Summary: SyncObject leaks on Windows when we merge to Aurora → Leak checking does not work in Windows content processes due to being unable to create the bloat log
Whiteboard: [MemShrink]
Blocks: 1051230
Blocks: 1243949
(In reply to Andrew McCreight [:mccr8] from comment #6) > Ok, I figured out what is happening. On Windows m-c, nsTraceRefcnt (in > InitLog) is failing to create the bloat log in content processes. Entries > like this appear in the mochitest log: > > ### XPCOM_MEM_BLOAT_LOG defined -- unable to log bloat/leaks to > c:\users\cltbld\appdata\local\temp\tmpefotq8. > mozrunner\runtests_leaks_tab_pid3520.log > > In contrast, the bloat log is being successfully created on Aurora Windows, > and Linux m-c. This file is being created directly with a fopen call. > > I'd guess that Windows m-c has a stricter sandbox than either Aurora Windows > or Linux, which is causing the file to fail to be created. Bob, does that > sound plausible? Yes, we only have the Windows content sandbox turned on, on Nightly at the moment. Actually I'm hoping to get the low integrity part of the sandbox up to Aurora soon. > There are two problems that need to be addressed: > 1) Something needs to be adjusted so that the bloat log is actually created > on Windows Nightly. Does the bloat log need to be created in a particular place or could it be created in the low integrity temp directory? If it uses the directory service then TmpD in the child process gets changed to point to something that looks like: C:\Users\<user name>\AppData\LocalLow\Mozilla\Temp-<some UUID> The UUID is set per profile in the pref: security.sandbox.content.tempDirSuffix The directory gets deleted by the parent process on shutdown however, so that could well make this difficult. But the content process can write to anywhere in LocalLow. It looks like the env var XPCOM_MEM_BLOAT_LOG is used, so if that could point to LocalLow, instead of Local, that would work at least for Windows. The other alternative would be to have some test specific code that knows the path of the log and adds it as a policy rule for write access, but I'd prefer writing to LocalLow if possible.
Flags: needinfo?(bobowen.code)
(In reply to Bob Owen (:bobowen) from comment #7) > The other alternative would be to have some test specific code that knows > the path of the log and adds it as a policy rule for write access, but I'd > prefer writing to LocalLow if possible. Actually, while I was thinking about bug 1193861, I wonder if we shouldn't add some way for the tests to add file policy rules for this sort of thing. LocalLow, will work for the moment, but we might find even writing there will break as we try to tighten the sandbox. Another alternative, which would be more cross-platform, is some mechanism for the child to request the file handles from the parent, for certain things when running tests. Jed what do you think? ... how does the RemoteOpenFile stuff work and what protections are there to stop a compromised child from just asking for any file it likes?
Flags: needinfo?(jld)
As it exists now, PRemoteOpenFile is only for reading files, to implement remoteopenfile:/// URIs, and it's part of Necko (it's a subprotocol managed by PNecko). In principle it could be extended to allow writing, I think, but it might not be the right place to do that. The authorization logic lives in NeckoParent::AllocPRemoteOpenFileParent; currently https://dxr.mozilla.org/mozilla-central/source/netwerk/ipc/NeckoParent.cpp#569. It's pretty specific to B2G app:// URIs, but that could be changed. Another potential problem here is for files that need be opened before the usual IPC things are set up. In the Linux world that's not as much of a problem, because (most of) the sandbox won't be enabled that early, so the file can be opened directly, and the fd acts as a capbility to allow future reads/writes; but that might not apply as well to other OSes. Having a common interface for dynamically adding FS policy rules could be done on Linux by redoing my broker to make the policy dynamic (which I was hoping to avoid, but in hindsight that might not have been the right idea), but OS X has a problem there because it's not using a broker that can be dynamically adjusted after sandbox startup: the paths are part of the policy program that's given to Sandbox.kext when the sandbox is enabled.
Flags: needinfo?(jld)
Thanks Jed. From looking at ContentProcess::Init I think that the IPC stuff is set up before we initialise XPCOM, if that's early enough. If it isn't, it will be a problem on Windows because we have to start at low integrity even before we've lowered the sandbox. As I've just mentioned in bug 1193861, on Windows I could just add the TEMP directory to the policy, if I can tell that we're running tests. I think the IPC stuff in the Windows sandbox is definitely early enough, because I can add rules for the loading of the DLLs when Firefox and the child process is started from a network drive.
If it helps, even though the bloat log file is opened very early, I think we don't actually write anything to it until XPCOM shutdown. I could refactor things to not do the fopen() it until it is used if that makes something easier.
(In reply to Andrew McCreight [:mccr8] from comment #11) > If it helps, even though the bloat log file is opened very early, I think we > don't actually write anything to it until XPCOM shutdown. I could refactor > things to not do the fopen() it until it is used if that makes something > easier. It probably would, if we go down that route. A quick fix on Windows would be to add write access to TEMP, I think. Do you know if there is an env var that gets set for test runs? If not, I can add a new one specific to this.
Flags: needinfo?(continuation)
(In reply to Bob Owen (:bobowen) from comment #12) > It probably would, if we go down that route. Alright. Well, let me know what I can do to help to refactor stuff. > A quick fix on Windows would be to add write access to TEMP, I think. Alright. Another thing that might work would be to change XPCOM_MEM_BLOAT_LOG to be some kind of low integrity directory. It is set here: http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1214 A brief bit of searching on the internet didn't turn up anything simple about getting the Windows low integrity directory from Python, though. > Do you know if there is an env var that gets set for test runs? > If not, I can add a new one specific to this. XPCOM_MEM_BLOAT_LOG is being set, for one. Another thing that could be useful is that we only do leak checking in DEBUG builds, so you could #ifdef DEBUG anything you add.
Flags: needinfo?(continuation)
OK, I completely misunderstood the process log one (bug 1193861), it's not being blocked by the sandbox. The log is written from the parent anyway, it's just the sandbox launch code doesn't write it. Given that I think adding TEMP write access in DEBUG builds is fine. Debug Try push with those two patches (quite a lot of leaking): https://treeherder.mozilla.org/#/jobs?repo=try&revision=2408383f640b Here's some more with e10s forced on: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c20425cc4b9 Opt push, just to show the process log is always written: https://treeherder.mozilla.org/#/jobs?repo=try&revision=274550410540
Thanks for looking at this, Bob. I'll try to think of some reasonable way to suppress those leaks. For now, you can try the patch in bug 1219919 and see if that covers it. That's what we've been landing on Aurora to cover the Win7 leaks that show up there. If that works, you can just land it and I can work on something more sophisticated separately.
Assignee: continuation → bobowen.code
No problem, here's a push with the latest patch from that bug: https://treeherder.mozilla.org/#/jobs?repo=try&revision=255a9f1f497e By the way, I noticed that the bug number in the comment in the patch says 1219916 not 1219919.
Status: NEW → ASSIGNED
Depends on: 1219919
Blocks: 1091917
Priority: -- → P3
I did a try push with your patches and with an updated version of the leak suppressions in bug 1219919 and it looks green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdc3933abda0 (I fixed the one orange in there by bumping the number of SyncObject leaks in the suppression list.)
Comment on attachment 8715210 [details] [diff] [review] In Windows debug builds allow write access to TEMP for logging purposes Review of attachment 8715210 [details] [diff] [review]: ----------------------------------------------------------------- ::: security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp @@ +55,5 @@ > > +#if defined(DEBUG) > + // Allow write access to TEMP directory in debug builds for logging purposes. > + wchar_t tempPath[MAX_PATH + 2]; > + uint32_t pathLen = ::GetTempPathW(MAX_PATH + 1, tempPath); nit: Please add a comment explaining the "+ 1" and "+ 2"
Attachment #8715210 - Flags: review?(tabraldes) → review+
Thanks Tim. (In reply to Tim Abraldes [:TimAbraldes] [:tabraldes] from comment #19) > > + // Allow write access to TEMP directory in debug builds for logging purposes. > > + wchar_t tempPath[MAX_PATH + 2]; > > + uint32_t pathLen = ::GetTempPathW(MAX_PATH + 1, tempPath); > > nit: Please add a comment explaining the "+ 1" and "+ 2" Added locally: // The path from GetTempPathW can have a length up to MAX_PATH + 1, including // the null, so we need MAX_PATH + 2, so we can add an * to the end.
Blocks: 1247025
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: