Closed
Bug 664163
Opened 13 years ago
Closed 9 years ago
Fix Get(Local|Remote)(Address|Port) in HttpChannelChild
Categories
(Core :: Networking: HTTP, defect)
Tracking
()
RESOLVED
FIXED
mozilla14
People
(Reporter: jduell.mcbugs, Assigned: dragana)
References
Details
Attachments
(3 files, 4 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
jduell.mcbugs
:
review+
|
Details | Diff | Splinter Review |
Minor followup. Details in bug 648878 comment 13 and later. Opening this just so we stick to "one bug, one patch." +r on patch copied from there.
Attachment #539226 -
Flags: review+
Reporter | ||
Comment 1•13 years ago
|
||
Whiteboard: [inbound]
Comment 2•13 years ago
|
||
Backed out in http://hg.mozilla.org/integration/mozilla-inbound/rev/d8941c23dcc7 due to causing a high frequency orange on Linux debug in test_traceable_channel_wrap.js.
Whiteboard: [inbound]
Reporter | ||
Comment 3•13 years ago
|
||
Problem was a one-liner--JS object that received observer notifications needed to be global so it didn't go out of scope by time observer tried to notify it.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2d511a124d7
Comment 4•13 years ago
|
||
Sorry, I had to back this out on inbound again:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a4d1da7fb5bb
for causing intermittent failures in test_traceable_channel_wrap.js on Linux (again):
https://tbpl.mozilla.org/php/getParsedLog.php?id=9856149&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=9858528&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=9857101&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=9849268&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=9854666&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=9846576&tree=Mozilla-Inbound
Assignee: nobody → jduell.mcbugs
Reporter | ||
Comment 5•13 years ago
|
||
OK, here's another go at it, with a couple possible fixes, and better info about what call may be failing.
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd30aea30b03
I hate to use inbound as my try server here, but I've tried running test_traceable_channel_wrap.js on regular tryserver 200 times, and haven't seen an error. Any hypotheses about how the inbound buildbots may differ from try would be welcome--I'm fumbling in the dark here. I did try building with python 2.5 vs 2.7, but that's not it.
Reporter | ||
Comment 6•13 years ago
|
||
Backed out yet again, but I have a better idea which tree to bark up--error happening only with local addrs/ports not remote ones.
Updated•13 years ago
|
Target Milestone: --- → mozilla14
Reporter | ||
Comment 7•13 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9899522b6ad9
Yet another go-round on inbound, this time with better diagnostic info so we can see why this only seems to fail on inbound and not try (or my linux box).
When this gets backed out please mention if the log contains either/both of these messages:
Parent: socket has multiple address families!
Child: socket has multiple address families!
Comment 8•13 years ago
|
||
It was backed out, but the test timed out this time so there was no debug output at all.
Comment 9•13 years ago
|
||
philor backed this out:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e81b56331d42
Aside: He also rightly pointed out that, since you already knew this was going to need backing out (per your commit message "Yes this will get backed out"), you actually could have backed it out yourself right away (or rather, as soon as builds were scheduled), instead of knowingly letting it fester for someone else (someone volunteering his time, in this case) to catch later. :) You still would have gotten a full run of tests, which it looks like you were looking for.
As for the logs -- as jdm said, they were all timeouts, sadly. For reference, the log URLs are:
https://tbpl.mozilla.org/php/getParsedLog.php?id=10052061&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10052296&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10052473&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10050949&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10051915&tree=Mozilla-Inbound
Reporter | ||
Comment 10•13 years ago
|
||
I'm punting on this bug for now: e10s isn't important enough at the moment to wade into debugging this. For the record, I'm starting to suspect that the root cause here is the same as bug 661158, i.e. some kind of intermittent IPDL serialization error, except this time on our linux buildbots.
Here's a run-down of the issue:
- The symptom: in test_traceable_channel_wrap.js we occasionally can get the socket's remote host/port but not the local addr/port. This should not be possible--if a load came from cache we'd see neither local/remote, but the cache is turned off for xpcshell (and the test would fail in that case too, yet it never does on non-e10s builds).
- The only place HttpBaseChannel::mSelfAddr gets set on the parent is during nsHttpChannel::OnTransportStatus, and it always either sets both local/remote, or neither (if the request is from cache). As discussed in bug 661158 comment 17, there's no chance of a race existing between when we set these and when we send them to the child in HttpChannelParent::SendOnStartRequest.
- One big difference between this and bug 661158: instead of the socket addr being set to family PR_AF_LOCAL (totally nonsensical), it's set to PR_AF_UNSPEC (aka, 0: also the default value set in the constructor). That's why this isn't a crash. It may also mean that I'm just missing some way that the value might remain at the default, but I don't think so (we're setting it in HttpChannelChild::OnStartRequest immediately before we call the test's OnStartRequest that checks the value).
- I'm seeing this only in e10s, and only on our linux buildbots (I've modified test_traceable_channel.js so I can run the logic 1000's of times in one run, and haven't seen it on my own linux box, nor on the tryservers).
Meanwhle I have two ideas to further debug this: 1) make sure we don't ever see a set/unset local/remote addr pair in the parent, before we send to the child; and 2) reorder the parameters in SendOnStartRequest to send the addr params first, so we can isolate if some other parameter is causing the problem.
Reporter | ||
Comment 11•13 years ago
|
||
Orig fix with a couple updates (PRBool->bool). Still runs into occasional crash on buildbots.
Attachment #539226 -
Attachment is obsolete: true
Attachment #610620 -
Flags: review+
Reporter | ||
Comment 12•13 years ago
|
||
Fixes JS observer to stay in scope throughout test. Adds better checks to see which getLocal functions are failing.
Reporter | ||
Comment 13•13 years ago
|
||
Sanity check that local/remote socket families are the same at OnStartReq time.
Reporter | ||
Comment 14•13 years ago
|
||
Not working on this until e10s is more relevant again.
Assignee: jduell.mcbugs → nobody
Reporter | ||
Comment 15•13 years ago
|
||
...and cjones has let me know that B2G is using e10s, so this is relevant again.
Assignee: nobody → jduell.mcbugs
Reporter | ||
Comment 16•9 years ago
|
||
Dragana, could you take this bug? For starters try just unbitrotting the patch and see if it still crashes occasionally on try. IIRC it was quite mysterious as to why it was happening.
We need this for things like Beacons on e10s (see bug 1217362)
Flags: needinfo?(dd.mozilla)
Assignee | ||
Updated•9 years ago
|
Assignee: jduell.mcbugs → dd.mozilla
Status: NEW → ASSIGNED
Assignee | ||
Comment 18•9 years ago
|
||
Attachment #610620 -
Attachment is obsolete: true
Attachment #8681198 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 19•9 years ago
|
||
Assignee | ||
Comment 20•9 years ago
|
||
Attachment #8681198 -
Attachment is obsolete: true
Attachment #8681198 -
Flags: review?(jduell.mcbugs)
Attachment #8681201 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 21•9 years ago
|
||
Updated•9 years ago
|
tracking-e10s:
--- → m8+
Reporter | ||
Comment 23•9 years ago
|
||
rebased and fixed a couple nits. Seems fine on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f8cc452b675&selectedJob=14756362
Flags: needinfo?(jduell.mcbugs)
Attachment #8700336 -
Flags: review+
Reporter | ||
Updated•9 years ago
|
Attachment #8681201 -
Attachment is obsolete: true
Attachment #8681201 -
Flags: review?(jduell.mcbugs)
Reporter | ||
Comment 24•9 years ago
|
||
Comment 25•9 years ago
|
||
bugherder |
You need to log in
before you can comment on or make changes to this bug.
Description
•