Closed Bug 1558944 Opened 5 years ago Closed 5 years ago

Extending OriginAttributesDictionary triggers WPT failures

Categories

(Core :: Networking, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox70 --- fixed

People

(Reporter: esawin, Assigned: ehsan.akhgari)

References

Details

Attachments

(1 file)

Extending OriginAttributesDictionary (ChromeUtils.webidl) triggers OSX WPT (TEST-UNEXPECTED-FAIL | leakcheck | default 2012 bytes leaked (Mutex, PollableEvent, ReentrantMonitor, nsAStreamCopier, nsPipe, ...)) leaks.

Minimal failing commit: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5370ebcf98e159d33560f1ec45e64795847d67b6

See bug 1501108 for more context.

This is blocking a GeckoView API feature required for Fenix Progressive Web App support.

bz, since you have the context, can you please help us finding an owner and the proper component for this issue?

Flags: needinfo?(bzbarsky)

Andrea, Ehsan, do you have any ideas what might be going on here?

What's particularly confusing to me is that the leaks seem somewhat random...

Flags: needinfo?(ehsan)
Flags: needinfo?(bzbarsky)
Flags: needinfo?(amarchesini)

Note that in bug 1436244 a leak looking very similar to this was whitelisted. The leak threshold is 2000 bytes for the default process on OSX. Presumably some change here just made the leak slightly larger.

Now, this is obviously not a great long term solution, but if you want a quick fix to unblock some other thing then you can change this line from testing/web-platform/mozilla/meta/dir.ini
if os == "mac": [tab:10000, gmplugin:20000, default:2000, rdd:400]
to something like
if os == "mac": [tab:10000, gmplugin:20000, default:2100, rdd:400]
...and then these leaks will be hidden away again, until somebody touches some other data structure involved in the leak and then they have a mysterious leak they don't understand.

Also, I really doubt this is a webidl issue.

One thing worth maybe testing: What happens if instead of adding a new member to OriginAttributesDictionary we do the following:

  1. Define a new dictionary type that inherits from OriginAttributesDictionary and has one additional member.
  2. Change all uses of OriginAttributesDictionary in our IDL and C++ to use the new dictionary.

Does that leak? I just tried pushing https://treeherder.mozilla.org/#/jobs?repo=try&revision=320ed7150da9f332be018188a5c100aab77d9369 to test that and https://treeherder.mozilla.org/#/jobs?repo=try&revision=fe8ac220ea046f4870a76b777eb57f8a7d69f048 as a control.

If that shows the leaks, we can try changing only some of the places that use OriginAttributesDictionary to the modified thing and try to hunt things down that way.

Component: DOM: Bindings (WebIDL) → General

nsSocketTransportService is in the leak, so it is probably networking related. nsAStreamCopier is also in there, which is a runnable. A common cause of these intermittent leaks is that there's a shutdown race where somebody creates a runnable at some point after whatever is supposed to deal with the runnable is shutdown.

Component: General → Networking

As I said, a short term measure is just to kick the can down the road and bump up that leak threshold a tiny bit.

Hmm. If there's an existing leak and adding a new member to OriginAttributesDictionary just increased the size of the stuff that's leaking, that could definitely cause this problem.

Simple enough to test: we can add a member to OriginAttributes itself that is completely unused. Pushed https://treeherder.mozilla.org/#/jobs?repo=try&revision=94ee1ae9d7ba9e7a0bb7a60793ad68fdd564b495

Unfortunately, the logs from https://treeherder.mozilla.org/#/jobs?repo=try&revision=5370ebcf98e159d33560f1ec45e64795847d67b6 don't exist anymore, so I can't tell how many OriginAttributes instances were or might have been leaking and hence how much adding an nsString to them would increase the leak size. But we can get that data from the newer pushes once they finish, hopefully.

Bug 1556632 seems to be the modern incarnation of this leak (maybe on Windows, because the leak threshold wasn't bumped there?), and it has at least one log that is visible. However, there's no OriginAttributes in the full list of leaked objects. I'm not sure what that means.

Are dictionaries refcounted? If they are neither refcounted nor calling MOZ_COUNT_CTOR/DTOR, then they won't show up in leak checking.

The leaking object (nsAStreamCopier from comment 6) is the copy context out argument of NS_AsyncCopy. Looking at the call sites of this function, it can be one of the following classes:

In order to debug this, I suggest annotating these three classes with MOZ_COUNT_CTOR/DTOR to see which one is at play to make progress...

Flags: needinfo?(ehsan)

(In reply to Andrew McCreight [:mccr8] from comment #10)

Are dictionaries refcounted? If they are neither refcounted nor calling MOZ_COUNT_CTOR/DTOR, then they won't show up in leak checking.

They aren't, and they won't show up in leak checking (for example see the log in bug 1556632 comment 0).

However, there's no OriginAttributes in the full list of leaked objects.

Yeah, dictionaries don't have ctor/dtor counting. We should consider adding that to the codegen.

I wish I had try data, but the mac jobs have been queued-but-not-started for 3 hours now. :(

Looking at the log from bug 1556632, the only class there that obviously involves OriginAttributes is nsSocketTransport.

Assuming that the 2012-byte leak here had the same thing, that makes sense: we're adding an nsString to OriginAttributes and hence to nsSocketTransport, and nsString is 16 bytes on 64-bit systems. So the leaking amount would have gone from 1996 bytes to 2012 bytes and just crossed the threshold.

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #13)

Assuming that the 2012-byte leak here had the same thing, that makes sense: we're adding an nsString to OriginAttributes and hence to nsSocketTransport, and nsString is 16 bytes on 64-bit systems. So the leaking amount would have gone from 1996 bytes to 2012 bytes and just crossed the threshold.

That fully explains what we're seeing here.

OK, yes, the new try runs are showing the same thing: a 2012-byte leak, and nsSocketTransport is one of the things leaked. So I think comment 7 is the right thing to unblock bug 1501108 and then we can figure out why we're leaking this stuff...

Flags: needinfo?(amarchesini)

The fix is on inbound now. Eugen, this should no longer be blocking you.

Pushed by eakhgari@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e712c1b80f64 Increase the intermittent leak threshold for wpt tests on OSX to 2100 bytes to allow for new fields to be added to OriginAttributes; r=mccr8
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
Assignee: nobody → ehsan
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: