e10s flow control could cause the canceled channel suspended forever
Categories
(Core :: Networking: HTTP, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox66 | --- | wontfix |
firefox67 | --- | fixed |
firefox68 | --- | fixed |
People
(Reporter: kershaw, Assigned: CuveeHsu)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged])
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
|
Details |
When I was investigating the failed tests caused by retargeting (bug 1505493), I found this issue.
Please see the full log at [1] and search the http channel with the address 0x7f14f37ff000 and you can see this channel is not finished (OnStopRequest is not called).
Here is what happened:
- The channel is suspended due to flow control [2].
- At the same time, the channel is canceled in child process [3].
- The channel is canceled at [4].
- Since |HttpChannelChild::mCanceled| is true [5], SendBytesRead [6] will not be called.
- This means that the channel will not be resumed.
I think this still might happen without retargeting. Maybe it's just the retargeting that make this problem easier to reproduce.
[1] https://taskcluster-artifacts.net/bMAOFkkYRAGIHC67I6MXrQ/0/public/logs/live_backing.log
[2] https://searchfox.org/mozilla-central/rev/3d469329a42644b41e66944d8da22d78d8c0f5ec/netwerk/protocol/http/HttpChannelParent.cpp#1617
[3] https://searchfox.org/mozilla-central/rev/3d469329a42644b41e66944d8da22d78d8c0f5ec/netwerk/protocol/http/HttpChannelChild.cpp#2307
[4] https://searchfox.org/mozilla-central/rev/3d469329a42644b41e66944d8da22d78d8c0f5ec/netwerk/protocol/http/HttpChannelParent.cpp#812
[5] https://searchfox.org/mozilla-central/rev/3d469329a42644b41e66944d8da22d78d8c0f5ec/netwerk/protocol/http/HttpChannelChild.cpp#804
[6] https://searchfox.org/mozilla-central/rev/3d469329a42644b41e66944d8da22d78d8c0f5ec/netwerk/protocol/http/HttpChannelChild.cpp#871
Reporter | ||
Comment 1•6 years ago
|
||
I think this should be a P1 bug. Junior, could you take a look? Thanks.
Comment 2•6 years ago
|
||
Excellent find! Agree on P1.
The log can be examined here, for convenience:
Assignee | ||
Comment 3•6 years ago
|
||
Thanks a lot for finding this, Kershaw!
Of course I'll take a look. Beforehand, I thought the cancel
should finish the channel. It looks like it won't.
And thanks for the processed logan log, Honza.
Assignee | ||
Comment 4•6 years ago
|
||
I think this still might happen without retargeting. Maybe it's just the retargeting that make this problem easier to reproduce.
Yes bug 1515809 already addressed that we have some intermittent without retargeting
Assignee | ||
Comment 5•6 years ago
|
||
Assignee | ||
Comment 6•6 years ago
|
||
Assignee | ||
Comment 7•6 years ago
|
||
https://hg.mozilla.org/try/rev/fe819d5d1dbee45c8a1d25f01fdfefecc0d936ee
Looks good for other media wpts
Assignee | ||
Updated•6 years ago
|
Comment 9•6 years ago
|
||
bugherder |
Comment 10•6 years ago
|
||
We plan to nominate this for uplift to beta/release to fix the problem.
Updated•6 years ago
|
Assignee | ||
Comment 11•6 years ago
|
||
Comment on attachment 9054349 [details]
Bug 1539766 - handle the cancel case for e10s bp
Beta/Release Uplift Approval Request
- Feature/Bug causing the regression: Bug 1280629
- User impact if declined: Users may leaks >1MB when cancel a http request during the suspension triggered by e10s back pressure. The suspension happen frequently when we have high speed internet (especially intranet) and request >1MB file.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: Bug 1280629
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): No big logic change
- String changes made/needed: No
Updated•6 years ago
|
Comment 12•6 years ago
|
||
Junior, to the question "List of other uplifts needed:' you put Bug 1280629 which is the regressor and landed in 63, did you mean to indicate another bug or was it a copy/paste error and no other uplift is needed for this uplift?
Assignee | ||
Comment 13•6 years ago
|
||
It’s a paste error. No other bugs for this uplift
Comment 14•6 years ago
|
||
Comment on attachment 9054349 [details]
Bug 1539766 - handle the cancel case for e10s bp
Fix for a major leak, approved for 67 beta 10, thanks.
Comment 15•6 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Updated•6 years ago
|
Selena, you mention uplifting to release. I don't have plans (yet) for another 66 dot release.
Can this fix wait for 67? It looks like potentially a good improvement, but it's not a new problem in 66, so I don't think it needs to be a driver for 66.0.4.
Assignee | ||
Comment 17•6 years ago
|
||
Comment on attachment 9054349 [details]
Bug 1539766 - handle the cancel case for e10s bp
(Steal the ni?)
Given close to next release, it's fine to not have another dot release for this.
Updated•3 years ago
|
Description
•