URL.createObjectURL() often creates URLs that point to a file with missing blobs
Categories
(Core :: DOM: Networking, defect, P2)
Tracking
()
People
(Reporter: onurtag, Assigned: nika)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged], [wptsync upstream])
Attachments
(2 files)
(deleted),
text/x-phabricator-request
|
diannaS
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr102+
|
Details |
(deleted),
text/x-phabricator-request
|
diannaS
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr102+
|
Details |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Firefox/102.0
Steps to reproduce:
The bug was found using the Multithreaded Download Manager addon so (https://addons.mozilla.org/en-US/firefox/addon/multithreaded-download-manager/)
You can reproduce the bug with this method: https://github.com/jingyu9575/multithreaded-download-manager/issues/179#issuecomment-1175821990
or you can download 10 identical URLs using the addon with 4+ threads (to create more blobs)
Actual results:
3 out of 10 URLs created point to a corrupt file with missing blobs.
The results are random. More than 3 files can be corrupt or all 10 files can be complete.
Expected results:
All created URLs should point to the same complete file.
More information at the addon's github issue: https://github.com/jingyu9575/multithreaded-download-manager/issues/179
The issue is reproducible on Firefox 102.0.1 and Nightly 104.0a1 (2022-07-18) (Win10 21H2). Also reproducible on Arch Linux & Linux Mint according to some comments on the github issue.
Mozregression result: https://user-images.githubusercontent.com/7430003/177225481-089326ac-567e-47a0-856b-e5925e9ff318.png
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=69fb0f363c1889e054c8aac505a63875e6ce3099&tochange=3f4ebef6a4b1b4fb858d9d8eba17a4ccae268d1a
I should also mention that I can't reproduce the bug using this seperate test addon: https://github.com/Onurtag/bad-object-url
Comment 2•2 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::DOM: Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 3•2 years ago
|
||
Set release status flags based on info from the regressing bug 1754004
Comment 4•2 years ago
|
||
The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.
New test addon that can be used to replicate the bug (Thanks mix5003):
https://github.com/mix5003/bad-object-url
This addon uses indexedDB to store the file's blobs (like the Multithreaded Download Manager addon) and unlike my fork of the test addon above it can be used to replicate the bug.
Comment 6•2 years ago
|
||
Many thanks to both onurtag and mix5003 for the invaluable reproduction case and to onurtag for the regression identification in comment 1!
Based on the changes made to make the reproduction work, it appears like this may be an edge case related to IndexedDB file-backed Blobs when used in composite/multipart blobs. I'm going to quickly try and adapt the scenario's IDB reproduction into a WPT test that also validates FileReader and Blob.prototype.text() methods of reading the composite blob (which is the easy part compared to getting this to reproduce!) and then capture that under the pernosco debugger.
Comment 7•2 years ago
|
||
Comment 8•2 years ago
|
||
It took some playing with the constants to get pernosco to reproduce this, but here's the trace. This corresponds exactly to the state of the WIP test which I ran with ./mach wpt --headless --no-pause-after-test --debugger=rr --debugger-args="record -h --disable-avx-512" /IndexedDB/blob-composite-blob-reads.any.html
to produce this. Note that it seems like fetch/URL.createObjectURL is necessary and FileReader or the direct arrayBuffer()
helper are not sufficient. But that makes sense because those will both do much simpler things to stream the Blob which also doesn't get sent over IPC, although I guess that could be added if we involve a shipped MessagePort.
https://pernos.co/debug/QJ8oNjumKn8yfhEWi7hYnA/index.html
Comment 9•2 years ago
|
||
(needinfo to draw attention to the pernosco trace above)
Updated•2 years ago
|
Comment 10•2 years ago
|
||
Set release status flags based on info from the regressing bug 1754004
Assignee | ||
Comment 11•2 years ago
|
||
Thanks a ton for this pernosco trace! Thanks to that (and quite a bit of time of trying to parse through the trace) I'm pretty sure I know what's happening here.
When serializing the stream from the content to parent process to register it in the BlobURL store the stream is serialized as a nsMultiplexInputStream
consisting of 512 distinct sub-streams. Because each of these sub-streams comes from IDB, they're all RemoteLazyInputStreams. Due to the large-attachment-count, the serialization code decides to serialize a single DataPipe to contain all of the relevant data from all of these streams, starting an AsyncCopy from the nsMultiplexInputStream
into the pipe. This copy ends up missing segments from 2 of the nested input streams: #385 and #469 in this case.
This happens due to a race condition and bug in nsMultiplexInputStream
, where it assumes that a OnInputStreamReady
notification must come from the currently selected stream, so if the passed-in stream is empty, it unconditionally advances to the next stream: https://searchfox.org/mozilla-central/rev/23bf1890e07f780ba70e075bc8f46ffb85d1128c/xpcom/io/nsMultiplexInputStream.cpp#906-908,915. What happens here is that the nsAStreamCopier is woken up by the output stream becoming available at the same time as the input stream #468 is trying to notify that it is done. We enter Read
, try to read data from segment #468, and find that it is completed. Because of that we advance to segment #469, which reports that it doesn't have data available yet, returning NS_BASE_STREAM_WOULD_BLOCK
. Immediately after that, the notification from input stream #468 arrives, and it is again detected that it is empty, meaning that the code attempts to advance to the next segment (#470), skipping #469 completely.
nsMultiplexInputStream
likely needs to add some checks to OnInputStreamReady
which makes sure that aStream
is actually the currently selected stream, and discards the notification if it is not, though there's some complexity around that due to the awkward way that nsMultiplexInputStream
is implemented right now, with its support for a mixture of both blocking and non-blocking streams.
Assignee | ||
Comment 12•2 years ago
|
||
Previously, we could end up in situations where nsMultiplexInputStream
would receive a callback for a previously notified stream after it had
already advanced to the next stream due to the async nature of
notifications. This could cause us to skip a stream accidentally. This
changes the logic to re-call AsyncWait in that situation on the correct
stream rather than skipping streams.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 13•2 years ago
|
||
Comment 15•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4179c610cd95
https://hg.mozilla.org/mozilla-central/rev/79b351c9c742
Comment 17•2 years ago
|
||
base on my test and original github issue. i can confirmed that bug has been fixed.
should this patch merge into beta and esr102 too?
Assignee | ||
Comment 18•2 years ago
|
||
We should probably consider upstreaming it, though it appears that there's a memory leak regression we should perhaps figure out first.
Comment 19•2 years ago
|
||
The patch landed in nightly and beta is affected.
:nika, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.Also, don't forget to request an uplift for the patches in the regressions caused by this fix.
- If no, please set
status-firefox104
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 20•2 years ago
|
||
Comment on attachment 9287315 [details]
Bug 1780054 - Part 2: Double-check notified stream in nsMultiplexInputStream, r=asuth
Beta/Release Uplift Approval Request
- User impact if declined: Large blobs can intermittently become corrupted when being combined and transferred between processes.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: see comment 0
- List of other uplifts needed: Bug 1782181
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): This changes some details about the way that streams are handled for nsMultiplexInputStream, which is a core stream type and used in many different components. We have already encountered one memory leak issue (bug 1782181), which was difficult to diagnose, and we hope is fixed (though cannot be certain due to the intermittent nature).
- String changes made/needed: None
- Is Android affected?: Yes
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Broken behaviour for Blobs in some cases, leading to data corruption.
- User impact if declined: Large blobs can intermittently become corrupted when being combined and transferred between processes.
- Fix Landed on Version: 105
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): This changes some details about the way that streams are handled for nsMultiplexInputStream, which is a core stream type and used in many different components. We have already encountered one memory leak issue (bug 1782181), which was difficult to diagnose, and we hope is fixed (though cannot be certain due to the intermittent nature).
Assignee | ||
Updated•2 years ago
|
Comment 21•2 years ago
|
||
Comment on attachment 9287181 [details]
Bug 1780054 - Add an IndexedDB composite blob stress test. r=nika!
Approved for 104.0b6
Comment 22•2 years ago
|
||
Comment on attachment 9287315 [details]
Bug 1780054 - Part 2: Double-check notified stream in nsMultiplexInputStream, r=asuth
Approved for 104.0b6
Comment 23•2 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/28f3c8d6ea1c
https://hg.mozilla.org/releases/mozilla-beta/rev/4713888960a0
Updated•2 years ago
|
Comment 24•2 years ago
|
||
Reproduced the initial issue using an old Nightly from 2022-07-18 on MacOS and the method described in https://github.com/jingyu9575/multithreaded-download-manager/issues/179#issuecomment-1175821990. Verified that using latest Nightly 105.0a1 and Firefox 104.0b6 across platforms (Windows 10 64bit, macOS 11.6 and Ubuntu 22.04) this looks as fixed.
Comment 25•2 years ago
|
||
Comment on attachment 9287315 [details]
Bug 1780054 - Part 2: Double-check notified stream in nsMultiplexInputStream, r=asuth
Approved for 102.2esr.
Updated•2 years ago
|
Comment 26•2 years ago
|
||
bugherder uplift |
Comment 27•2 years ago
|
||
Also verified as fixed using latest esr102 build from treeherder across platforms (Windows 10 64bit, macOS 11.6 and Ubuntu 18.04).
Description
•