Crash in [@ nsDocShell::DoURILoad]
Categories
(Thunderbird :: Folder and Message Lists, defect)
Tracking
(thunderbird_esr6868+ fixed, thunderbird66 unaffected, thunderbird67 wontfix, thunderbird68 wontfix, thunderbird69 fixed, thunderbird70 fixed)
People
(Reporter: bc, Assigned: benc)
References
(Regression)
Details
(Keywords: crash, regression, topcrash-thunderbird, Whiteboard: [regression:tb67])
Crash Data
Attachments
(1 file, 6 obsolete files)
(deleted),
patch
|
jorgk-bmo
:
review+
jorgk-bmo
:
feedback+
jorgk-bmo
:
approval-comm-beta+
jorgk-bmo
:
approval-comm-esr68+
|
Details | Diff | Splinter Review |
This bug is for crash report bp-d02fba99-c243-4f03-869e-ee8870190325.
Top 10 frames of crashing thread:
0 libxul.so nsDocShell::DoURILoad dist/include/nsILoadInfo.h:730
1 libxul.so nsDocShell::InternalLoad docshell/base/nsDocShell.cpp:9466
2 libxul.so nsDocShell::LoadURI docshell/base/nsDocShell.cpp:763
3 libxul.so mozilla::dom::Location::SetURI dom/base/Location.cpp:237
4 libxul.so mozilla::dom::Location::SetHrefWithBase dom/base/Location.cpp:468
5 libxul.so mozilla::dom::Location::SetHref dom/base/Location.cpp:422
6 libxul.so mozilla::dom::Location_Binding::set_href dom/bindings/LocationBinding.cpp:161
7 libxul.so bool mozilla::dom::binding_detail::GenericSetter<mozilla::dom::binding_detail::CrossOriginThisPolicy> dom/bindings/BindingUtils.cpp:3097
8 libxul.so js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:442
9 libxul.so js::CallSetter js/src/vm/Interpreter.cpp:589
This has happened to me twice today with today's Daily build. It appears to happen when clicking on either a folder or a search folder that has unread messages, but I haven't narrowed down the steps to reproduce any further than that.
Reporter | ||
Updated•6 years ago
|
Comment 1•6 years ago
|
||
regression?
Nightly crashes began with buildid 20190325080840 d608dd98-2cbc-4035-8aa3-a9c160190325 68.0a1
Comment 2•6 years ago
|
||
can't declare it a trend this early. But bp-8802c335-2b8b-48d4-8ba7-cc6ee0190402 is a beta crash.
Recent nightly crashes - both linux, but most crashes are windows:
bp-11413fd0-cae5-41f9-bd4e-90d1c0190329 navigating to next unread nntp message via key shortcut n
bp-84f58d7e-6bc3-4eb2-9bf1-4442f0190328 clicked on empty imap inbox
Updated•6 years ago
|
Comment 3•6 years ago
|
||
Similar to bug 1529792, no?
Reporter | ||
Comment 4•6 years ago
|
||
Those linux crashes are me. I hit this several times a day now. I haven't seen bug 1529792 in a while.
Reporter | ||
Comment 5•6 years ago
|
||
I do however hit this on nntp messages as well: bp-c6c2716e-83a8-4240-bada-c0dd80190403
Comment 6•6 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #4)
Those linux crashes are me. I hit this several times a day now.
You hit it doing what?
Updated•6 years ago
|
Reporter | ||
Comment 7•6 years ago
|
||
Clicking on a folder or message. It doesn't happen every time. Most of the time I click on saved search folders but other times it appears to happen clicking onto a message that is already displayed in either a normal folder or saved search folder.
Comment 8•6 years ago
|
||
I don't think you will get a regression windows with clearly defined STR. Also, it's pretty clear where this comes from, it's bug 1452600 and bug 1529505 / bug 1528971. If you want to confirm, start with a Daily of 2019-02-12 and then move forward. You'll hit the problem on the 13th or 16th, or the 21st the latest.
Comment 9•6 years ago
|
||
I'm hitting this crash in Linux TB 67.0b1 as well. A few times per day.
Comment 10•6 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #9)
I'm hitting this crash in Linux TB 67.0b1 as well. A few times per day.
Same, on mac. Pretty bizarre stacks, too.
(In reply to Bob Clary [:bc:] from comment #7)
Clicking on a folder or message. It doesn't happen every time.
Yep.
Comment 11•6 years ago
|
||
#5 crash for 67.0b1
Comment 12•6 years ago
|
||
What's needed to make progress here? I'm now hitting this several times a day and it's getting annoying. Is there some way I can get more info by using MOZ_LOG or an instrumented build?
Comment 13•6 years ago
|
||
Quite annoying, indeed, I see it frequently when testing stuff in a local build. I guess we need to convince our technical manager that this is a high priority. As a first step, confirming the regression range as per comment #6 would be good by going to a Daily of 2019-02-12. I also believe that this is the same issue as bug 1529792 (listed under "See Also"). Ben worked on that one.
As for your question: A reproducible case would be great.
Comment 14•6 years ago
|
||
I agree this is high priority. But the stack is very weird (wrong) and without steps to reproduce, I'm not sure where to start. Sounds like we could figure out bug 1529792 though and hope that it helps.
Comment 15•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #13)
Quite annoying, indeed, I see it frequently when testing stuff in a local build. ...
As for your question: A reproducible case would be great.
Well, asking for a reproducible case when I claimed myself to have one seemed pretty silly in hindsight. So I went to look at my reproducible case and found out that it's another issue that crashed every time when a certain message is viewed: Bug 1544187.
Comment 16•6 years ago
|
||
OK, looking at bp-346c597f-9ac6-40a8-a277-a4ba50190410, we crash on docshell/base/nsDocShell.cpp:9828 which is here:
https://dxr.mozilla.org/mozilla-beta/source/docshell/base/nsDocShell.cpp#9828
loadInfo->SetOpenerPolicy(oldLoadInfo->GetOpenerPolicy());
As we knew, there seems to be an issue with the load info. Quite a few reports crash at that line. Further up in those stacks is NNTP code.
There's also another type, like bp-fbfa9886-7ec2-40c6-9800-5dcb10190414, which crashes here:
dist/include/nsILoadInfo.h:730 called from here docshell/base/nsDocShell.cpp:9458:
https://dxr.mozilla.org/mozilla-beta/source/docshell/base/nsDocShell.cpp#9458
Sadly nsILoadInfo.h is generated from the IDL file and DXR doesn't track it, but looking in a local build for trunk, that's where we have NS_DEFINE_STATIC_IID_ACCESSOR(nsILoadInfo, NS_ILOADINFO_IID)
, so most likely, it's some issue with the load info again.
I'm pretty sure this is all related to the bugs quoted in comment #8, so somewhere we don't handle the load info correctly.
Ben, does that give you some sort of starting point?
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 17•6 years ago
|
||
Looking into this now. Pretty sure there's something to fix for Bug 1529792, comment 13, but that's only NNTP related, so doesn't explain the non-NNTP-related crashes here. But probably a good starting point. My suspicion is that there are multiple places where unset loadinfo is an issue.
Assignee | ||
Comment 18•6 years ago
|
||
Hmm. The chunk of code the crash was occurring in has been removed (see https://hg.mozilla.org/mozilla-central/rev/646b72163f00 ).
It was:
if (isTopLevelDoc && GetDocument() && GetDocument()->GetChannel()) {
nsCOMPtr<nsILoadInfo> oldLoadInfo = GetDocument()->GetChannel()->LoadInfo();
loadInfo->SetOpenerPolicy(oldLoadInfo->GetOpenerPolicy());
}
I'm pretty sure it was crashing due to the null oldLoadInfo
. The implication here is that Document::StartDocumentLoad()
is being called with a channel which has no loadInfo set. Which we kind of assumed was the case anyway. Still digging through trying to find the culprit(s).
Assignee | ||
Comment 19•6 years ago
|
||
Still a work in progress, but I found a couple of places in the nntp code where channels can be created with a missing loadinfo. A bit fiddly with the connection reuse, but I think this plugs up those gaps.
However, it does now cause a unit test to hang (testConnectionLimit
, in mailnews/news/test/unit/test_server.js
), so it's not ready to land yet.
In a bunch of places the nntp code bypasses the usual channel creation in order to carry along a listener and a window. Seems to break the otherwise-tidy channel abstraction, but I don't have enough bigger-picture understanding to suggest any improvements.
But now I know what to look for, so I'm be looking through the other mailnews protocols for similar alternate code paths.
Comment 20•6 years ago
|
||
If code reformat happens in NNTP, this will be the rebased patch.
Comment 21•6 years ago
|
||
Updated•6 years ago
|
Comment 22•6 years ago
|
||
Does IMAP need similar treatment? IIRC, that has a mock channel as well.
Comment 23•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #20)
Created attachment 9060928 [details] [diff] [review]
1538948-add-missing-loadinfo-in-nntp-1.patch - rebasedIf code reformat happens in NNTP, this will be the rebased patch.
This causes a regression.
I'm afraid I praised the patch in Bug 1529792 too soon. [F5] does not kill the TB because [F5] does not work anymore.
([F5] = menu: File -> Get New Messages for -> Current Account)
- Write a new message to a newsgroup.
- Select the server and type [F5].
Status bar says: "There are no new messages on the server"
- Select the newsgroup.
- News count increases.
Assignee | ||
Comment 24•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #21)
Can you please revert the needless renaming of the parameters here. It
causes extra churn and I don't see why you've done that. We still used
aParameter
in all/most C++ code although there's a discussion to change
that, but that discussion is for JS code I believe.
Sorry - will revert it. For some reason I thought we'd decided for C++ too.
Otherwise, is the patch good to go, there is positive feedback in bug
1529792 comment #37.
No, there's the unit test hanging that worries me (as well as Comment 23). Looking into it now, hopefully have a fixed patch tomorrow.
(In reply to Jorg K (GMT+2) from comment #22)
Does IMAP need similar treatment? IIRC, that has a mock channel as well.
Yes, I suspect so too. I started looking through IMAP last week. Will pick it up again once I've sorted out the NNTP patch.
Assignee | ||
Comment 25•6 years ago
|
||
Just a quick update, since this is dragging on a bit: I'm still trying to work out why my patch is causing one of the tests to hang (testConnectionLimit
, in mailnews/news/test/unit/test_server.js
- the server.performTest()
call never returns).
Spent a lot of today picking through the nntp unit test setup to try and figure out what's happening. There's a lot going on in there.
Anyway, the test hang and the regression in comment #23 definitely means it's not landable, but I'll keep digging.
Comment 26•6 years ago
|
||
Is there something wallpapery we could land on beta that would mean this crashes less? Or is there something we can do to help with getting the "right" fix landed?
Comment 27•6 years ago
|
||
Gijs, you are crashing with IMAP, right? Or is it a news issue?
Comment 28•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #27)
Gijs, you are crashing with IMAP, right? Or is it a news issue?
I'm not sure how to check, but I get crashes both when switching to a different news/nntp folder and when switching to a different IMAP folder (possible that was always switching away from nntp, but not sure).
Comment 29•6 years ago
|
||
OK, for some reason I assumed that you had a pure IMAP crash (hey, who's using news these days?). But since there's news involved, there is some hope that the purely NNTP patch is going into the right direction. I'll take a look.
Comment 30•6 years ago
|
||
I reverted the renaming of the variables.
Comment 31•6 years ago
|
||
I didn't understand the changes in LoadNewsUrl()
, so I reverted them. All tests pass now. I think this is the minimal refactoring we need to supply loadinfo to where it's needed.
Gijs, I'll do you a beta try build for Mac, OK?
Comment 32•6 years ago
|
||
Beta try:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=04ac0826c1448211d4cc7edcfee0d7bdccc87cd4
I hope this will work and give us a Mac build you can try.
Comment 33•6 years ago
|
||
Gijs, here's the beta build which should NOT disrupt your setup:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=04ac0826c1448211d4cc7edcfee0d7bdccc87cd4&selectedJob=246174467
You'd be testing
https://queue.taskcluster.net/v1/task/aqVC8naKSki9pwzWRp5ELw/runs/0/artifacts/public/build/target.dmg
EDIT: Added "NOT" to first sentence.
Comment 34•6 years ago
|
||
Bob, you said in comment #0 and comment #7 that clicking on a (search) folder or a message causes the crash. Does this involve news/NNTP? Or can you make it crash with purely IMAP?
Reporter | ||
Comment 35•6 years ago
|
||
It has been a while since I've seen this. If I recall correctly, this particular signature folders and saved searches which may have included both imap or nntp. I've been stuck on 20190502074238 since the menu fall out. The last time I saw this it was bp-c81a29a5-728b-48b1-918f-664f10190409.
Comment 36•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #33)
Gijs, here's the beta build which should NOT disrupt your setup:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=04ac0826c1448211d4cc7edcfee0d7bdccc87cd4&selectedJob=246174467You'd be testing
https://queue.taskcluster.net/v1/task/aqVC8naKSki9pwzWRp5ELw/runs/0/artifacts/public/build/target.dmgEDIT: Added "NOT" to first sentence.
Sorry, so I can run this instead of beta, is that what you're saying?
Comment 37•6 years ago
|
||
Absolutely. It is a TB 67 beta 3 with one added patch to try to stop the crash.
Comment 38•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #37)
Absolutely. It is a TB 67 beta 3 with one added patch to try to stop the crash.
But it's an unofficial build so it didn't use my profile and came up effectively "empty" / "clean slate"...
Comment 39•6 years ago
|
||
As per my PM, please use the profile manager (start with -p) to select you production profile. All eyes are on you now ;-)
Comment 40•6 years ago
|
||
I just crashed again, https://crash-stats.mozilla.com/report/index/c58488a4-93f2-4520-b010-6060d0190514 . No symbols for that build in crash-stats but I've not seen other crashes so I expect the same issue persists.
Comment 41•6 years ago
|
||
I just hit this crash: https://crash-stats.mozilla.com/report/index/b505768b-1af2-4fe3-a0b5-48ac80190521
I'm running TB 67.0b3 x64. It happened when I was creating a new calendar event that was two days ahead of today's date. I created the event, set time, info, etc., saved the event, then clicked back on today's date on my calendar. That crashed it. I've done that series of events many, may times without a crash.
Comment 42•6 years ago
|
||
Only posted to this bug as TB showed it as possibly a related bug. Sorry for any noise.
Comment 43•6 years ago
|
||
Gijs, have you been using the "special" beta 67 and it kept crashing?
Comment 44•6 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #43)
Gijs, have you been using the "special" beta 67 and it kept crashing?
I used it for a while but yes, still saw crashes (see my comment #40, https://crash-stats.mozilla.com/report/index/8b2574f6-2a1d-4b09-bebc-144200190521 and https://crash-stats.mozilla.com/report/index/d2a94ee1-3f04-4b85-9465-604f20190527 are further symbol-less crashes from that build), so eventually I switched back to "regular" beta because I kept misclicking on the "normal" TB icon in my dock, and I saw no stability difference.
Comment 45•6 years ago
|
||
Thanks, Gijs, at least we know that the patch I tried on beta (https://hg.mozilla.org/try-comm-central/rev/60b7cf97c2f78ad7f737baaf5e8068378c914e83) didn't help. I hope Ben will return to this issue soon. Is there any question I need to answer?
Comment 46•5 years ago
|
||
Gijs, I guess you're on TB 68 beta now and it's equally crashing, right?
Comment 47•5 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #46)
Gijs, I guess you're on TB 68 beta now and it's equally crashing, right?
I am indeed on beta 68, but it's been crashing less? I'm not sure if that's just coincidence... it's possible I've grown learned behaviour to wait a bit after switching to newsgroup folders as that seemed to make it less likely that things crashed...
Comment 48•5 years ago
|
||
(Specifically, it seems I have had 0 crashes with 68 so far - last crash was on June 12th, with 67 beta.)
Comment 49•5 years ago
|
||
(I'm now seeing some slightly different crashes in 68b3, see bug 1565073. )
Comment 50•5 years ago
|
||
Like Gijs, I'm still occasionally seeing crashes when switching between newsgroups, but the frequency isn't as high as it was in the 67 betas. As noted before, the signature has shifted a bit into [@ mozilla::dom::Document::SetUserHasInteracted ] and [@ nsDocShell::EndPageLoad ] for the 3 crashes I've seen on 68 so far.
Comment 51•5 years ago
|
||
(In reply to :Gijs (out until Monday; he/him) from comment #48)
(Specifically, it seems I have had 0 crashes with 68 so far - last crash was on June 12th, with 67 beta.)
It's been pretty crash free for me since moving to 68.0b1 though b5.
Comment 52•5 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #50)
Like Gijs, I'm still occasionally seeing crashes when switching between newsgroups, but the frequency isn't as high as it was in the 67 betas. As noted before, the signature has shifted a bit into [@ mozilla::dom::Document::SetUserHasInteracted ] and [@ nsDocShell::EndPageLoad ] for the 3 crashes I've seen on 68 so far.
Yes, I now have 2 crashes on 68 beta for each of these 2 signatures.
Comment 53•5 years ago
|
||
Still happening with Beta 69.
https://crash-stats.mozilla.com/report/index/3d938c78-e8f2-42e3-92a2-1a9550190730
Updated•5 years ago
|
Comment 54•5 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #53)
Still happening with Beta 69.
https://crash-stats.mozilla.com/report/index/3d938c78-e8f2-42e3-92a2-1a9550190730
I just bumped up to 69 a few minutes ago but prior to my report from a couple month ago, no crashes. Will report any crashes in 69 if they happen.
Assignee | ||
Comment 55•5 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #31)
I didn't understand the changes in
LoadNewsUrl()
, so I reverted them. All tests pass now. I think this is the minimal refactoring we need to supply loadinfo to where it's needed.
It's taken me all day to work it out, but it's the LoadNewsUrl()
changes which are critical. Currently, if all the real connections are busy, a nsNntpMockChannel
is created instead, but it's never given a valid loadInfo
.
The rest of the patch really just fixes up loadInfo
for the nsNntpNewsService::NewChannel()
code path... which I'm not sure is even even used (even though I think it's the 'official' way of instantiating nsIChannel
-derived objects ;-).
Head still spinning from picking through this stuff, but should have a better patch ready tomorrow.
Then it'll be a case of auditing any other protocols which use mock channels (IMAP... not sure if there were others).
Comment 56•5 years ago
|
||
Another signature? Bug 1561861 - Crash in [@ mozilla::dom::Document::BeginUpdate] - happening only for 69 beta
Updated•5 years ago
|
Assignee | ||
Comment 57•5 years ago
|
||
Just a quick progress update - I've got a revised patch... but now the testConnections
test is failing again (as per comment 25). Sigh. So I need to work what's happening and how to fix it.
Assignee | ||
Comment 58•5 years ago
|
||
Revised patch.
I'm pretty sure it was the testConnections
test that was wrong. I cleaned it up slightly and it now passes.
I'm not totally clear that test was ever actually testing what it was intended to test... But I'll leave that for a followup patch if this one looks OK.
Comment 59•5 years ago
|
||
Here's version 1 again plus the test change from version 2.
Some observations:
- Version 1 also passes if you add the hunk re. the test change from version 2.
- I think the aim of the test is to block the first connection it opens, hence it's just calling
_server.loadNewsUrl(url, null, null);
instead ofsetupProtocolTest()
which does about the same thing, only that the latter attaches a listener withonStopRequest()
and that closes the connection. So I'm inclined to say that your test changes invalidate the test. - Version 2 changes
nsNntpIncomingServer::LoadNewsUrl()
which appears undesired. As far as I can see, there's only one caller, so it doesn't appear relevant whether the loadinfo is created before the call and passed it, or created inside the call. Am I missing something?
I guess we first need to understand what the test tries to achieve before changing it. Then version 1 or version 2 are a matter of taste.
In any case, please motivate the changes in nsNntpIncomingServer::LoadNewsUrl()
switching from GetNntpConnection()
to GetNntpChannel()
. Over all, I think this needs more narrative so the reviewer can follow the course of the investigation.
Assignee | ||
Comment 60•5 years ago
|
||
(I think there's three separate things here, so I'll split them up)
(In reply to Jorg K (GMT+2) from comment #59)
- Version 2 changes
nsNntpIncomingServer::LoadNewsUrl()
which appears undesired. As far as I can see, there's only one caller, so it doesn't appear relevant whether the loadinfo is created before the call and passed it, or created inside the call. Am I missing something?
No, you're not missing anything. It's really pretty arbitrary.
For version 2 I was trying to see the bigger picture. Usually whenever a nsIChannel
is created (via nsIIOService.newChannel()
), the caller is the one that should be aware of the circumstances of the request, and thus should be responsible for providing the LoadInfo.
For this case, the outside 'entry' point for causing channel creation is nsNntpService::RunNewsUrl()
, so I figured that that was the best place for it. The idea being that eventually we'd replace all the special-case protocol-specific creation functions and just call generic nsIIOService.newChannel()
instead, as firefox does.
But I'm happy going with version 1. It saves changing existing loadNewsUrl()
calls in the unit tests.
, and landed on deciding to treat nsNntpIncomingServer::LoadNewsUrl()
as analogous to the more generic NewChannel()
, which requires a loadinfo to be passed in. It seemed as it's only called from the
I guess we first need to understand what the test tries to achieve before changing it. Then version 1 or version 2 are a matter of taste.
In any case, please motivate the changes in
nsNntpIncomingServer::LoadNewsUrl()
switching fromGetNntpConnection()
toGetNntpChannel()
. Over all, I think this needs more narrative so the reviewer can follow the course of the investigation.
Assignee | ||
Comment 61•5 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #59)
In any case, please motivate the changes in
nsNntpIncomingServer::LoadNewsUrl()
switching fromGetNntpConnection()
toGetNntpChannel()
. Over all, I think this needs more narrative so the reviewer can follow the course of the investigation.
Yep - fair enough. I shall try and clarify it in the next patch.
GetNntpConnection()
returns a real connection, or null if none available.
GetNntpChannel()
returns either a real connection, or a queued mock channel if all the real connections are busy.
The old logic in nsNntpIncomingServer::LoadNewsUrl()
replicated what GetNntpChannel() was doing, so I was trying to consolidate it in one place.
Assignee | ||
Comment 62•5 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #59)
Some observations:
- Version 1 also passes if you add the hunk re. the test change from version 2.
- I think the aim of the test is to block the first connection it opens, hence it's just calling
_server.loadNewsUrl(url, null, null);
instead ofsetupProtocolTest()
which does about the same thing, only that the latter attaches a listener withonStopRequest()
and that closes the connection. So I'm inclined to say that your test changes invalidate the test.
Hmm. I think you're right.
My take on what the test is supposed to do:
- The server
maximumConnectionsNumber
is 1. (I think this should be made more explicit - later tests change this setting). - the test sets off two requests. The first request gets a real connection, the second a mock connection.
- once the first request finishes, it closes down the connection pool (presumably just leaving the mock request hanging - I'm not sure).
- it asserts that only the first request ran and not the second (thus proving that the connection queuing works).
Assignee | ||
Comment 63•5 years ago
|
||
Ahh! This comment and assertion in the test was confusing me:
// We should have length one... which means this must be a transaction object,
// containing only us and them
Assert.ok("us" in server.playTransaction());
It turns out that playTransaction()
returns either a single object (if there was just one request), or an array of objects (if there were multiple).
So the Assert.ok()
implicitly fails if there is more than one object returned!
Assignee | ||
Comment 64•5 years ago
|
||
Here's a brutally cut-down patch which I think fixes the problem.
testConnectionLimit
now passes fine.
It does work (if I set maximumConnectionsNumber
to 2 the test fails as it should)... but I don't know why it works. I find this a little troubling.
The core of the test is:
// To test make connections limit, we run two URIs simultaneously.
var url = Services.io.newURI(prefix + "*");
_server.loadNewsUrl(url, null, null);
setupProtocolTest(NNTP_PORT, prefix + "TSS1@nntp.invalid");
server.performTest();
The first loadNewsUrl()
should get the single real connection and run fine, with no listener callbacks.
The second request, in setupProtocolTest()
, should be queued up on a mock connection. It also has a listener which calls nsNntpIncomingServer()::closeCachedConnections()
as soon as the request completes.
It seems to me like this closeCachedConnections()
wouldn't be invoked until both requests complete, which should make this test fail. But it doesn't.
The only thing I can think of is that closeCachedConnections()
prevents the fake nntpserver from logging the second transaction. But I'd have thought that'd screw up all the other NNTP tests which rely on that logging!
I feel like I'm missing something obvious.
Comment 65•5 years ago
|
||
Comment 66•5 years ago
|
||
Here we go:
Linux64: https://queue.taskcluster.net/v1/task/M26_ncjxSP-0s3NyLsZYrg/runs/0/artifacts/public/build/target.tar.bz2
Mac: https://queue.taskcluster.net/v1/task/fKH_jRZ7TiOVI4WLR_wlDA/runs/0/artifacts/public/build/target.dmg
Windows64: https://queue.taskcluster.net/v1/task/QowSM2BRRTyrHZoTysUKIA/runs/0/artifacts/public/build/install/sea/target.installer.exe
This is TB 69 beta 2 plus the patch here. Can you please try this out in production and see whether it still crashes. You might have to start with -p to select your production profile.
Reporter | ||
Comment 67•5 years ago
|
||
I've had problems in the past switching back to beta from nightly. Can we do a try build based on Nightly 70? I'll be happy to test that but while I have seen some related crashes recently they are not that frequent.
bp-aec03d78-d21a-4ab8-b187-1d33a0190801 8/1/19, 6:55 AM
bp-6ab0b31e-046d-4d1a-9b2e-0207e0190731 7/30/19, 5:30 PM
bp-23b042f9-1819-45dd-864e-643ca0190729 7/29/19, 2:52 PM
I think we might want to just get this on Nightly now and let those of us who live the world of endless Nightly can try it out over a period of days.
Comment 68•5 years ago
|
||
OK, here's try build based on Nightly:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=5d2b1e65cdfa0da8f772592959f3c2b763ad689c
Linux only, I'll get you the binary when it's done.
Comment 69•5 years ago
|
||
Here's the Linux64 binary: https://queue.taskcluster.net/v1/task/RnUbJrAhTh6bCThS-Ex7vA/runs/0/artifacts/public/build/target.tar.bz2
Reporter | ||
Comment 70•5 years ago
|
||
Downloading now. I'll let you know how it goes.
Reporter | ||
Updated•5 years ago
|
Comment 71•5 years ago
|
||
The frequency of my crashes is so low at this point that it'd take ages for me to tell you anything meaningful - the patch looks sane to me so I'd suggest just landing it, spinning updated betas, keeping the bug open, and checking if crash frequency goes down.
Comment 72•5 years ago
|
||
Seems like a plan. Wayne, when do you want to spin the next beta? I haven't got much else (https://mzl.la/2YWnmJd) and we wanted TB 69 beta 3 for patches to go into TB 68.1. But since TB 68.0 is delayed, how do you feel about doing TB 69 beta 3 now?
Comment 73•5 years ago
|
||
Updated•5 years ago
|
Comment 74•5 years ago
|
||
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/236122c254bd
Add missing LoadInfo in NNTP code. r=jorgk
Updated•5 years ago
|
Assignee | ||
Comment 75•5 years ago
|
||
Want to keep this one open a little longer - with my newfound knowledge of the TB protocols code I'm going to take a look through the IMAP loadinfo usage (and SMTP too, but I suspect that's less likely to be hitting the check in nsDocShell
).
Comment 76•5 years ago
|
||
Preferably in a new bug, no? We can reopen it if the fix didn't help and the crash is still there.
Comment 77•5 years ago
|
||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 78•5 years ago
|
||
TB 68.0 ESR:
https://hg.mozilla.org/releases/comm-esr68/rev/ffa42e41a99b49547a6367b873a07dd13b2e9e9e
Fingers crossed ;-) - I had positive feed back from :bc on IRC.
Description
•