crash [@ nsImapServerResponseParser::ProcessOkCommand] via FetchTryChunking, during chunking/mime_part_on_demand=true
Categories
(MailNews Core :: Networking: IMAP, defect, P2)
Tracking
(thunderbird_esr78 wontfix, thunderbird_esr91 affected, thunderbird_esr102+ affected)
People
(Reporter: wsmwk, Assigned: gds)
References
(Depends on 1 open bug)
Details
(Keywords: crash, steps-wanted, topcrash-thunderbird)
Crash Data
Attachments
(4 files)
(deleted),
image/png
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
Updated•13 years ago
|
Reporter | ||
Comment 2•11 years ago
|
||
Reporter | ||
Comment 3•10 years ago
|
||
Updated•9 years ago
|
Reporter | ||
Comment 5•8 years ago
|
||
Comment 6•8 years ago
|
||
Reporter | ||
Comment 7•7 years ago
|
||
Reporter | ||
Updated•7 years ago
|
Reporter | ||
Updated•7 years ago
|
Reporter | ||
Comment 8•6 years ago
|
||
Comment 9•6 years ago
|
||
Reporter | ||
Comment 10•6 years ago
|
||
Comment 11•6 years ago
|
||
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 12•6 years ago
|
||
Most comments are about large attachments: sending, downloading, detaching.
Is this about chunking? Or just generic attachments issue.
bp-57920d1a-a602-4334-8baa-0fd670190119 I was trying to detach 5 large pictures (total 30MB) from an e-mail message.
0 XUL nsImapServerResponseParser::ProcessOkCommand(char const*) comm/mailnews/imap/src/nsImapServerResponseParser.cpp:436 context
1 XUL nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) comm/mailnews/imap/src/nsImapServerResponseParser.cpp:256 cfi
2 XUL nsImapProtocol::FetchMessage(nsTString<char> const&, nsIMAPeFetchFields, char const*, unsigned int, unsigned int, char*) comm/mailnews/imap/src/nsImapProtocol.cpp:3667 cfi
3 XUL nsImapProtocol::FetchTryChunking(nsTString<char> const&, nsIMAPeFetchFields, bool, char*, unsigned int, bool) comm/mailnews/imap/src/nsImapProtocol.cpp:3700 cfi
4 XUL nsImapProtocol::FallbackToFetchWholeMsg(nsTString<char> const&, unsigned int) comm/mailnews/imap/src/nsImapProtocol.cpp:3400 cfi
5 XUL nsIMAPBodyShell::Generate(char*) comm/mailnews/imap/src/nsIMAPBodyShell.cpp:224 cfi
6 XUL nsImapProtocol::ProcessSelectedStateURL() comm/mailnews/imap/src/nsImapProtocol.cpp:2759 cfi
7 XUL nsImapProtocol::ProcessCurrentURL() comm/mailnews/imap/src/nsImapProtocol.cpp:1784 cfi
8 XUL nsImapProtocol::ImapThreadMainLoop() comm/mailnews/imap/src/nsImapProtocol.cpp:1403 cfi
9 XUL nsImapProtocol::Run() comm/mailnews/imap/src/nsImapProtocol.cpp:1062 cfi
10 XUL non-virtual thunk to nsImapProtocol::Run() comm/mailnews/imap/src/nsImapProtocol.cpp:0 cfi
bp-65608ed1-9914-4a10-9323-a63ff0190205 was downloading a 5 meg jpg
bp- f3bad4be-2e44-4271-b6ad-46dd70190212 I had attached two photos each c5.4MB - too big!
Assignee | ||
Comment 13•6 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #12)
Most comments are about large attachments: sending, downloading, detaching.
Is this about chunking? Or just generic attachments issue.
I can't really tell. I've been poking around in this area and working with big attachments so I will watch for this. Will try some detaching too.
I'm not real familiar with the crash report pages that you reference, but when I click on the blue links inside them I get a bad response and it doesn't go to the code line that it links to.
Reporter | ||
Comment 14•5 years ago
|
||
(In reply to gene smith from comment #13)
(In reply to Wayne Mery (:wsmwk) from comment #12)
Most comments are about large attachments: sending, downloading, detaching.
Is this about chunking? Or just generic attachments issue.
I can't really tell. I've been poking around in this area and working with big attachments so I will watch for this. Will try some detaching too.
All the crashes I examined do have chunking on the stack. So does bug 1222455, but that signature is gone in version 68 so I am closing it out.
Any new ideas?
I'm not real familiar with the crash report pages that you reference, but when I click on the blue links inside them I get a bad response and it doesn't go to the code line that it links to.
Yeah, unfortunately been that way a long time. And not changing soon afaik. :(
Reporter | ||
Comment 15•4 years ago
|
||
68.12.1 ranked #10
78.3.3 ranked #80
Reporter | ||
Comment 16•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #15 on October 10)
68.12.1 ranked #10
78.3.3 ranked #80
If we take version 68.12., 68.11., 68.10.* out of the 6 month graph we see that version 78 crashes are pretty flat [1]. Which indicates that either the crash signature changed in version 78 or we fixed something in 78 that we didn't uplift to 68.
The slow introduction of version 78 makes it hard to pin down a date where things got better, but from [2] should we suspect Bug 1654995 - Thunderbird 79.0b2 corrupting large attachments because of chunking - which went live in 78.2.0 in mid-August? Or should we be looking for something that landed earlier? (sure would be nice to have a year of crash-stats data)
Whatever it was that helped this bug, didn't help Bug 1581766 - Crash in [@ arena_t::DallocSmall | je_free | nsIMAPGenericParser::AdvanceToNextToken] via nsImapServerResponseParser
Assignee | ||
Updated•4 years ago
|
Reporter | ||
Updated•4 years ago
|
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Reporter | ||
Comment 22•3 years ago
|
||
(In reply to Worcester12345 from comment #21)
Thank you.
Wow, looks like a lot in 78.
Not really - it's not in the top 50 signatures https://crash-stats.mozilla.org/topcrashers/?product=Thunderbird&version=78.11.0&_facets_size=200
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Comment 23•3 years ago
|
||
From bug 1708712 "Was moving some folders, moving email between folders, and did a "repair folder" on main inbox folder."
Was the repair done while the moving was in progress?
Comment 24•3 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #23)
From bug 1708712 "Was moving some folders, moving email between folders, and did a "repair folder" on main inbox folder."
Was the repair done while the moving was in progress?
Don't know at this point.
Reporter | ||
Comment 25•3 years ago
|
||
Version 91 is crash count is about the same as 78, but ranking is lower, ~93 vs 66 respectively
Comment 26•3 years ago
|
||
FYI, just encountered it in Thunderbird 94.0 while reading my emails...
Crash Report [@ nsImapServerResponseParser::ProcessOkCommand ]
bp-4c49f26a-881c-4c63-b7bd-a2d0f0211104
Crashing Thread (85), Name: IMAP
Frame Module Signature Source Trust
0 xul.dll nsImapServerResponseParser::ProcessOkCommand(char const*) comm/mailnews/imap/src/nsImapServerResponseParser.cpp:422 context
1 xul.dll nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) comm/mailnews/imap/src/nsImapServerResponseParser.cpp:270 cfi
2 xul.dll nsImapProtocol::FetchMessage(nsTString<char> const&, <unnamed-tag>, char const*, unsigned int, unsigned int, char*) comm/mailnews/imap/src/nsImapProtocol.cpp:3806 cfi
3 xul.dll nsImapProtocol::FetchTryChunking(nsTString<char> const&, <unnamed-tag>, bool, char*, unsigned int, bool) comm/mailnews/imap/src/nsImapProtocol.cpp:3856 cfi
4 xul.dll nsIMAPBodypart::GeneratePart(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:408 cfi
5 xul.dll nsIMAPBodypartLeaf::Generate(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:527 cfi
6 xul.dll nsIMAPBodypartMultipart::Generate(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:888 cfi
7 xul.dll nsIMAPBodypartMultipart::Generate(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:888 cfi
8 xul.dll nsIMAPBodypartMultipart::Generate(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:888 cfi
9 xul.dll nsIMAPBodypartMessage::Generate(nsImapBodyShell*, bool, bool) comm/mailnews/imap/src/nsImapBodyShell.cpp:759 cfi
10 xul.dll nsImapBodyShell::Generate(char*) comm/mailnews/imap/src/nsImapBodyShell.cpp:223 cfi
11 xul.dll nsImapProtocol::ProcessSelectedStateURL() comm/mailnews/imap/src/nsImapProtocol.cpp:2941 cfi
12 xul.dll nsImapProtocol::ProcessCurrentURL() comm/mailnews/imap/src/nsImapProtocol.cpp:1944 cfi
13 xul.dll nsImapProtocol::ImapThreadMainLoop() comm/mailnews/imap/src/nsImapProtocol.cpp:1493 cfi
14 xul.dll nsImapProtocol::Run() comm/mailnews/imap/src/nsImapProtocol.cpp:1159 cfi
Assignee | ||
Comment 27•3 years ago
|
||
Richard,
It looks like you were fetching a multi-part email and it was fetching 1st the bodystructure. Does it always crash when you try to read a specific email?
Since you run without offline store (from what you've told me) getting the bodystructure is needed since parts are saved to memory cache for faster recall. You may have to move to another message, then clear the memory cache and then try to open the "problem" message again to duplicated the problem.
To clear memory cache, under General preferences go near the end and click the "Clear Now" button in the "Disk" region.
If you can duplicate the crash, a IMAP:5,IMAPCache:5 log might show something. Or if you could provide the problem .eml, probably better.
FWIW, the crash point (frame 0 above) may be caused by m_shell (pointer to the bodystructure shell) being null or corrupt.
Comment 28•3 years ago
|
||
(In reply to gene smith from comment #27)
It looks like you were fetching a multi-part email and it was fetching 1st the bodystructure. Does it always crash when you try to read a specific email?
It mostly never happens... it was just expected... after restart of TB and selecting same emails all seems fine...
Since you run without offline store (from what you've told me) getting the bodystructure is needed since parts are saved to memory cache for faster recall. You may have to move to another message, then clear the memory cache and then try to open the "problem" message again to duplicated the problem.
I haven't identified which email exactly caused the issue, but I will try to clear cache and reselect those I may thing it could have been at the time...
If you can duplicate the crash, a IMAP:5,IMAPCache:5 log might show something. Or if you could provide the problem .eml, probably better.
FWIW, the crash point (frame 0 above) may be caused by m_shell (pointer to the bodystructure shell) being null or corrupt.
If I can replicate with a specific email, will provide you some logs...
Comment 29•3 years ago
|
||
(In reply to Richard Leger from comment #28)
(In reply to gene smith from comment #27)
Since you run without offline store (from what you've told me) getting the bodystructure is needed since parts are saved to memory cache for faster recall. You may have to move to another message, then clear the memory cache and then try to open the "problem" message again to duplicated the problem.
I haven't identified which email exactly caused the issue, but I will try to clear cache and reselect those I may thing it could have been at the time...
So far after clearing the cache I haven't been able to reproduce the issue... it is a very random unexpected issue...
Reporter | ||
Updated•3 years ago
|
Comment 30•3 years ago
|
||
en-UK TB 98.0b2 on a macbook pro 10.15.7 en-UK
so I have had crashes over the last few versions of beta and there seems to be a "common" point
TB is capturing (or reloading sometimes) received emails with a number of attachments, none that large, but a large number in some cases
This might link to the "chunking" from what I understand of the string.
I have had to "refresh" a few folders of late as the email contents get mixed - content from one email, usually most recent opened, is shown in anohter email, or no content at all.
I have also sent a few +20 mg emails (a few 5-7 mg images). Though I think the crashng is more relating to the capturing of emails, old and new (as some folders seem to refresh themselves), and those that are swollen with attachments.
I have edited the mail.server.default.fetch_by_chunks to false as you suggest Wayne
Assignee | ||
Comment 31•3 years ago
|
||
I have my doubts that "chunking" really affects this.
But what you describe with message content getting "mixed" sounds like this beta specific: Bug 1734847
Comment 32•3 years ago
|
||
(In reply to gene smith from comment #31)
I have my doubts that "chunking" really affects this.
But what you describe with message content getting "mixed" sounds like this beta specific: Bug 1734847
I am on 99.0b2 and still have msgs being mixed, in that I click a msg, but it shows contents from the one above (my order being newset at top)
Assignee | ||
Comment 33•3 years ago
|
||
(In reply to Antony from comment #32)
(In reply to gene smith from comment #31)
I have my doubts that "chunking" really affects this.
But what you describe with message content getting "mixed" sounds like this beta specific: Bug 1734847I am on 99.0b2 and still have msgs being mixed, in that I click a msg, but it shows contents from the one above (my order being newset at top)
The easiest solution is to "repair" your problem folder(s). If that still doesn't help you will need to delete the mbox and msf file pairs in the profile area (files XXX and XXX.msf) for affected folders (with tb shutdown) and then start TB, visit the folders and let TB re-download and rebuild the folders. (The repair or re-download/re-build of folders may take a long time if folder is huge.)
This was caused by a bug in an earlier post-91 beta that corrupted the mbox folder storage and and requires these measures to fix the stored data.
Note: deleting XXX.msf will remove any custom column setting you have setup.
Comment 34•3 years ago
|
||
(In reply to gene smith from comment #33)
(In reply to Antony from comment #32)
(In reply to gene smith from comment #31)
I have my doubts that "chunking" really affects this.
But what you describe with message content getting "mixed" sounds like this beta specific: Bug 1734847I am on 99.0b2 and still have msgs being mixed, in that I click a msg, but it shows contents from the one above (my order being newset at top)
The easiest solution is to "repair" your problem folder(s). If that still doesn't help you will need to delete the mbox and msf file pairs in the profile area (files XXX and XXX.msf) for affected folders (with tb shutdown) and then start TB, visit the folders and let TB re-download and rebuild the folders. (The repair or re-download/re-build of folders may take a long time if folder is huge.)
This was caused by a bug in an earlier post-91 beta that corrupted the mbox folder storage and and requires these measures to fix the stored data.
Note: deleting XXX.msf will remove any custom column setting you have setup.
First I am seeing this anywhere. So, if on IMAP, you just delete the mailbox file and the associated ".msf file", it will re-download from IMAP? What if there are "mixed messages", and one of the "pair" or "parters" from that bad mix was already deleted? Will it rebuild half a message? Will it delete half the content? Also, Thunderbird is slow enough as it is, and crashes when left open long enough. How long does this "rebuild" take? I don't want it to crash in the middle of that.
Thanks.
Comment 35•3 years ago
|
||
(In reply to gene smith from comment #33)
(In reply to Antony from comment #32)
(In reply to gene smith from comment #31)
I have my doubts that "chunking" really affects this.
But what you describe with message content getting "mixed" sounds like this beta specific: Bug 1734847I am on 99.0b2 and still have msgs being mixed, in that I click a msg, but it shows contents from the one above (my order being newset at top)
The easiest solution is to "repair" your problem folder(s). If that still doesn't help you will need to delete the mbox and msf file pairs in the profile area (files XXX and XXX.msf) for affected folders (with tb shutdown) and then start TB, visit the folders and let TB re-download and rebuild the folders. (The repair or re-download/re-build of folders may take a long time if folder is huge.)
This was caused by a bug in an earlier post-91 beta that corrupted the mbox folder storage and and requires these measures to fix the stored data.
Note: deleting XXX.msf will remove any custom column setting you have setup.
Does it mean TB Repair feature for IMAP folder is buggy and need fixing? This issue could have been a good occasion to look into it...
No one should have to manually delete TB files on file system to fix issues. This could be hazardous and create more issues than it fixes...
It feels like dirt hidden under the carpet in the hope it would disappear ;-)
Comment 36•3 years ago
|
||
The patch for this bug should include code that does what is brought up in Comment 35, above. ^^
This needs to be internal to the program itself, and even better, automatically.
Assignee | ||
Comment 37•3 years ago
|
||
First I am seeing this anywhere. So, if on IMAP, you just delete the mailbox file and the associated ".msf file", it will re-download from IMAP? What if there are "mixed messages", and one of the "pair" or "parters" from that bad mix was already deleted? Will it rebuild half a message? Will it delete half the content? Also, Thunderbird is slow enough as it is, and crashes when left open long enough. How long does this "rebuild" take? I don't want it to crash in the middle of that.
I was thinking only about IMAP when I wrote the above. For local folders or POP, you should just do the repair and, if that doesn't help, only delete the XXX.msf since there is no server to re-fetch message bodies from and you will lose the XXX mbox file content if you delete it!
Again, this is to work-around a bug in a beta, so release users shouldn't need to do this.
I don't know how long it will take, only the bigger the imap folder the longer it will take.
Haven't seen TB being real slow or crashing (which I guess is the original subject of this bug).
I don't know what you mean by "half a message"? The XXX mbox file contains the full messages for a folder concatenated together. The XXX.msf stores the only the message headers (in unreadable form) for the folder. The XXX mbox files's main purpose is for reading messages when offline so it's actually optional for IMAP (but enabled by default). The XXX.msf is always present in IMAP. (For local folders you will have both files. Also, folders created under a pop account, except INBOX, are basically local folders since they have no connection to a remote email server.
Richard wrote:
Does it mean TB Repair feature for IMAP folder is buggy and need fixing? This issue could have been a good occasion to look into it...
No one should have to manually delete TB files on file system to fix issues. This could be hazardous and create more issues than it fixes...
It feels like dirt hidden under the carpet in the hope it would disappear ;-)
I've never actually seen this "emails pasted together" problem so I can't say for sure the just repair won't fix it. That's why I suggested just repair as the first step. But I've read that other beta user's had to delete at least the XXX.msf and some the mbox XXX file itself to recover.
So maybe the best step would be:
- try to repair the folder
if that doesn't work, - delete the XXX.msf file and let TB rebuild it
if that doesn't work - For IMAP only (not pop or local folders), delete XXX and XXX.msf and let TB rebuild them
if that doesn't work, maybe start over with a brand new profile or wait for the complete re-write of IMAP and other stuff in javascript.
Reporter | ||
Comment 38•3 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #25)
Version 91 is crash count is about the same as 78, but ranking is lower, ~93 vs 66 respectively
#80 for 91.8.1, so still roughly the same crash rate.
But for beta 100 the rank is #7. I haven't compared stacks to see if they are the same
https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=nsImapServerResponseParser%3A%3AProcessOkCommand&version=100.0&date=%3C2022-04-24T13%3A46%3A49%2B00%3A00&date=%3E%3D2022-03-27T13%3A46%3A49%2B00%3A00
Comment 39•2 years ago
|
||
Still happening in TB 103.0b2 on Windows 10...
bp-40d4ebab-443b-441e-b651-2a4ff0220711
Comment 40•2 years ago
|
||
(In reply to Richard Leger from comment #39)
Still happening in TB 103.0b2 on Windows 10...
bp-40d4ebab-443b-441e-b651-2a4ff0220711
Yup.
I just noticed I hadn't yet VOTED for this bug. Only 1 vote? Maybe it isn't so much of a problem after all.
Assignee | ||
Comment 41•2 years ago
|
||
From comment 27:
FWIW, the crash point (frame 0 above) may be caused by m_shell (pointer to the bodystructure shell) being null or corrupt.
Looking closer, it appears the m_shell is checked for null and used twice before reaching the crash point at line 422: https://searchfox.org/comm-central/rev/c24995178e8fe0cc7213c8a2095d167c0ad8d56f/mailnews/imap/src/nsImapServerResponseParser.cpp#422
Here's the subject code with "###gds: ..." annotations.
if (GetFillingInShell()) { ###gds: checks that m_shell is not nullptr
// There is a BODYSTRUCTURE response. Now let's generate the stream...
// that is, if we're not doing it already
if (!m_shell->IsBeingGenerated()) { ###gds: m_shell used here but doesn't crash so valid
nsImapProtocol* navCon = &fServerConnection;
char* imapPart = nullptr;
fServerConnection.GetCurrentUrl()->GetImapPartToFetch(&imapPart);
m_shell->Generate(&fServerConnection, imapPart); ###gds: m_shell also used here and doesn't crash so valid
PR_Free(imapPart);
if ((navCon && navCon->GetPseudoInterrupted()) ||
fServerConnection.DeathSignalReceived()) {
// we were pseudointerrupted or interrupted
// if it's not in the cache, then we were (pseudo)interrupted while
// generating for the first time. Release it.
if (!m_shell->IsShellCached()) m_shell = nullptr; ###gds: maybe set null here?
navCon->PseudoInterrupt(false);
li422 } else if (m_shell->GetIsValid()) { ###gds: crashes here !!! Why? GetIsValid just returns a boolean.
// If we have a valid shell that has not already been cached, then cache
// it.
if (!m_shell->IsShellCached() &&
So I'm not seeing how m_shell could be invalid at line 422 when it seemed OK above. I don't see that any of the intervening code is touching m_shell (except where it is possibly set null in the pseudointerrupted block, but that would skip over line 422).
Assignee | ||
Comment 42•2 years ago
|
||
The current default TB configuration no longer uses fetch by parts. So the above code won't be executed so a crash at line 422 can't occur. So a workaround for this crash is to just make sure prefs containing the string "parts_on_demand" are all at default:
mail.imap.mime_parts_on_demand_threshold
mail.server.default.mime_parts_on_demand
When set to their old default values (pre-78 I think) these were useful for a message with large non-inline attachments fetched on a slow network. The defaults were changed here for encryption purposes: bug 1629292
Note: There's one other similar sounding pref (probably legacy and not present on new setups):
mail.imap.mime_parts_on_demand
So if this is present just set it to default also (which may be true, but that won't matter).
Reporter | ||
Comment 43•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #38)
...
But for beta 100 the rank is #7. I haven't compared stacks to see if they are the same
https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=nsImapServerResponseParser%3A%3AProcessOkCommand&version=100.0&date=%3C2022-04-24T13%3A46%3A49%2B00%3A00&date=%3E%3D2022-03-27T13%3A46%3A49%2B00%3A00
For TB102 this is now #3 crash. Only one nightly crash in the past 6 months. Beta crash rate is steady, so I can't tell when it increased.
bp-e7f8073f-744d-4c74-bdbd-b2f360220730 TB102.1.0
bp-d9e2c842-5212-4b21-a511-972440220730 TB91.11.0
The crash rate for 102 is triple that of 91. Adjusting for the difference in user populations for the two versions, v102's crash rate is 15 times that of version 91. It is not yet clear whether this should block making updates automatic for TB102.
Looking at the other bug reports marked as related, the only one with signficant theories is bug 1581017 comment 21. Gene, does that still reproduce for you?
Assignee | ||
Comment 44•2 years ago
|
||
Looking at the other bug reports marked as related, the only one with signficant theories is bug 1581017 comment 21. Gene, does that still reproduce for you?
I don't see that bug 1581017 comment 21 has anything to do with this crash. (Also, the change I proposed there has problems that I saw after a recent re-test. So I either need to keep working on it or mark the attached patch as obsolete. Magnus and Kai also made some recent changes to that code, so maybe the bug is moot.)
My guess is that this crash is only for users having mime_part_on_demand still set to true and it's crashing on a message with a large attachment. I enabled mime_part_on_demand and tested some large messages and the code doesn't crash for me at line 422 of nsImapServerResponseParser.cpp. Looks like pointer m_shell would have to be null or somehow pointing into weeds to cause a crash. But it's used right before line 422 but doesn't cause a crash there. So it doesn't make sense.
Of course, all this has been pointed out several times already above. However, it would be good to know how users seeing this crash have mime_parts_on_demand set (true or false). I thought Richard Leger said he has it false but can't find where he said that.
Reporter | ||
Comment 45•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #43)
...
Looking at the other bug reports marked as related, the only one with signficant theories is bug 1581017 comment 21. Gene, does that still reproduce for you?
Correction, Gene had indicated that bug 1581017 does NOT reproduce.
Comment 46•2 years ago
|
||
(In reply to gene smith from comment #44)
Looking at the other bug reports marked as related, the only one with signficant theories is bug 1581017 comment 21. Gene, does that still reproduce for you?
I don't see that bug 1581017 comment 21 has anything to do with this crash. (Also, the change I proposed there has problems that I saw after a recent re-test. So I either need to keep working on it or mark the attached patch as obsolete. Magnus and Kai also made some recent changes to that code, so maybe the bug is moot.)
My guess is that this crash is only for users having mime_part_on_demand still set to true and it's crashing on a message with a large attachment. I enabled mime_part_on_demand and tested some large messages and the code doesn't crash for me at line 422 of nsImapServerResponseParser.cpp. Looks like pointer m_shell would have to be null or somehow pointing into weeds to cause a crash. But it's used right before line 422 but doesn't cause a crash there. So it doesn't make sense.
Of course, all this has been pointed out several times already above. However, it would be good to know how users seeing this crash have mime_parts_on_demand set (true or false). I thought Richard Leger said he has it false but can't find where he said that.
My current profile settings are as below:
mail.imap.mime_parts_on_demand true
mail.imap.mime_parts_on_demand_threshold 500000
mail.server.default.mime_parts_on_demand false
Those seems to be the default value as creating a new profile and checking settings show me the same settings as above.
Comment 47•2 years ago
|
||
(In reply to Richard Leger from comment #46)
My current profile settings are as below:
mail.imap.mime_parts_on_demand true
mail.imap.mime_parts_on_demand_threshold 500000
mail.server.default.mime_parts_on_demand falseThose seems to be the default value as creating a new profile and checking settings show me the same settings as above.
Mine are:
true
true
true
I've never changed these, so it goes back to whenever I originally installed Thunderbird (10+ years ago?).
What SHOULD these settings be?
Assignee | ||
Comment 48•2 years ago
|
||
What SHOULD these settings be?
The relevant one for this bug is mail.server.default.mime_parts_on_demand false
so Worcester12345 may see this crash. However since Richard has it false, so he shouldn't see the crash.
I don't think it's critical to change the values unless you are seeing the crash. The only other reason might be if you are using the end-to-end encryption which requires the the full message be fetched and not individual mime parts.
Mine are:
true
true (typo?, this one is a number)
true
As Richard lists them, the current defaults for these are:
true
500000
false
Comment 49•2 years ago
|
||
Mine was NOT a typo. The way I listed them is how they are shown. What would change them from the default? I never mess around in that stuff.
Should I change them to match Richard's, which "should be" the default?
Assignee | ||
Comment 50•2 years ago
|
||
(In reply to Worcester12345 from comment #49)
Mine was NOT a typo. The way I listed them is how they are shown. What would change them from the default? I never mess around in that stuff.
I don't know what would change them. Possibly they were manually set by editing the <profile>/prefs.js file ??
You show the middle one as a boolean "true" when it (mail.imap.mime_parts_on_demand_threshold) is an number. So I thought maybe it was a typo. If Config Editor shows it as "true" or "false", it needs to be changed to an integer with default 500000.
Should I change them to match Richard's, which "should be" the default?
I would recommend using the defaults as posted below unless you have reasons not too:
- "Slow" connection to server and
- You have tb set so attachments don't display inline, and
- Your emails tend to have non-inline attachment that you save to a file or open into an application, and
- Not using end-to-end encryption
I would just set them like this:
mail.imap.mime_parts_on_demand false
mail.imap.mime_parts_on_demand_threshold 500000
mail.server.default.mime_parts_on_demand false
The first one defaults to true but it's not really used unless the 3rd one is not present and setting to true (default) is misleading.
Note: To add one more layer of complexity, you can set the 3rd pref to be server specific using the server number (aka, key), e.g.,
mail.server.server10.mime_parts_on_demand true
mail.server.server9.mime_parts_on_demand true
etc. so server 9 and 10 do mime_parts_on_demand and the other imap servers don't since default remains false.
Comment 51•2 years ago
|
||
(In reply to gene smith from comment #50)
(In reply to Worcester12345 from comment #49)
Mine was NOT a typo. The way I listed them is how they are shown. What would change them from the default? I never mess around in that stuff.
I don't know what would change them. Possibly they were manually set by editing the <profile>/prefs.js file ??
You show the middle one as a boolean "true" when it (mail.imap.mime_parts_on_demand_threshold) is an number. So I thought maybe it was a typo. If Config Editor shows it as "true" or "false", it needs to be changed to an integer with default 500000.
Should I change them to match Richard's, which "should be" the default?
I would recommend using the defaults as posted below unless you have reasons not too:
- "Slow" connection to server and
- You have tb set so attachments don't display inline, and
- Your emails tend to have non-inline attachment that you save to a file or open into an application, and
- Not using end-to-end encryption
I would just set them like this:
mail.imap.mime_parts_on_demand false mail.imap.mime_parts_on_demand_threshold 500000 mail.server.default.mime_parts_on_demand false
The first one defaults to true but it's not really used unless the 3rd one is not present and setting to true (default) is misleading.
Note: To add one more layer of complexity, you can set the 3rd pref to be server specific using the server number (aka, key), e.g.,
mail.server.server10.mime_parts_on_demand true
mail.server.server9.mime_parts_on_demand true
etc. so server 9 and 10 do mime_parts_on_demand and the other imap servers don't since default remains false.
Not only have I not changed these, I'm not quite sure how.
On your 1-4:
- "Slow" connection to server and
- You have tb set so attachments don't display inline, and
- Your emails tend to have non-inline attachment that you save to a file or open into an application, and
- Not using end-to-end encryption
- No
- Not sure, never changed anything
- Not sure, never changed anything
- Pretty sure I don't
Reporter | ||
Updated•2 years ago
|
Reporter | ||
Comment 52•2 years ago
|
||
(In reply to Worcester12345 from comment #40)
I just noticed I hadn't yet VOTED for this bug. Only 1 vote? Maybe it isn't so much of a problem after all.
Votes are are irrelevant when an issue is a topcrash, which explicitly means many users are affected
Reporter | ||
Comment 53•2 years ago
|
||
The increase in crash rate, which comes with version 102, is clearly seen in the graph.
- First in early July when we 102.0 released.
- More recently over the last few weeks when the number of 102 users is doubled.
In an attempt to find why v102 crash rate is higher than 91, beta and nightly releases, show strong crash rate as far back as buildid 20220404222700, which is beta 100. The crash rate looks steady throughout the period. Thunderbird returned to crash-stats in early April, so we don't know when in the past year the beta crash rate increased.
And nightly stats are useless because the crash rate is so low there - less than one crash per month.
Reporter | ||
Comment 54•2 years ago
|
||
3/4 of crashes do not have nsImapProtocol::ParseIMAPandCheckForNewMail on the stack. Examples:
- bp-62dc1eda-2e76-40e7-b908-2dd360220905
0 xul.dll nsImapServerResponseParser::ProcessOkCommand(char const*) mailnews/imap/src/nsImapServerResponseParser.cpp:422
1 xul.dll nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) mailnews/imap/src/nsImapServerResponseParser.cpp:270
2 xul.dll nsImapProtocol::FetchMessage(nsTString<char> const&, <unnamed-tag>, char const*, unsigned int, unsigned int, char*) mailnews/imap/src/nsImapProtocol.cpp:3820
3 xul.dll nsImapProtocol::FetchTryChunking(nsTString<char> const&, <unnamed-tag>, bool, char*, unsigned int, bool) mailnews/imap/src/nsImapProtocol.cpp:3853
4 xul.dll nsImapProtocol::FallbackToFetchWholeMsg(nsTString<char> const&, unsigned int) mailnews/imap/src/nsImapProtocol.cpp:3582
5 xul.dll nsImapBodyShell::Generate(nsImapProtocol*, char*) mailnews/imap/src/nsImapBodyShell.cpp:167
6 xul.dll nsImapProtocol::ProcessSelectedStateURL() mailnews/imap/src/nsImapProtocol.cpp:2948
7 xul.dll nsImapProtocol::ProcessCurrentURL() mailnews/imap/src/nsImapProtocol.cpp:1947
8 xul.dll nsImapProtocol::ImapThreadMainLoop() mailnews/imap/src/nsImapProtocol.cpp:1487
9 xul.dll nsImapProtocol::RunImapThreadMainLoop() mailnews/imap/src/nsImapProtocol.cpp:1163 - bp-4d76a8bc-4c26-49de-be6d-3f5900220905
0 xul.dll nsImapServerResponseParser::ProcessOkCommand(char const*) mailnews/imap/src/nsImapServerResponseParser.cpp:422
1 xul.dll nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) mailnews/imap/src/nsImapServerResponseParser.cpp:270
2 xul.dll nsImapProtocol::FetchMessage(nsTString<char> const&, <unnamed-tag>, char const*, unsigned int, unsigned int, char*) mailnews/imap/src/nsImapProtocol.cpp:3820
3 xul.dll nsImapProtocol::FetchTryChunking(nsTString<char> const&, <unnamed-tag>, bool, char*, unsigned int, bool) mailnews/imap/src/nsImapProtocol.cpp:3853
4 xul.dll nsImapProtocol::FallbackToFetchWholeMsg(nsTString<char> const&, unsigned int) mailnews/imap/src/nsImapProtocol.cpp:3582
5 xul.dll nsImapBodyShell::Generate(nsImapProtocol*, char*) mailnews/imap/src/nsImapBodyShell.cpp:167
6 xul.dll nsImapProtocol::ProcessSelectedStateURL() mailnews/imap/src/nsImapProtocol.cpp:2948
7 xul.dll nsImapProtocol::ProcessCurrentURL() mailnews/imap/src/nsImapProtocol.cpp:1947
8 xul.dll nsImapProtocol::ImapThreadMainLoop() mailnews/imap/src/nsImapProtocol.cpp:1487
Assignee | ||
Comment 55•2 years ago
|
||
I still can't duplicate the crash but I found a possible cause. The crashing function nsImapServerResponseParser::ProcessOkCommand() usually gets called at least twice recursively, as seen in the crash dumps. I see also see the recursion with nested attachments, e.g.,, attach an eml that contains jpgs. Near the end of the function m_shell is set to null. If m_shell is nulled in the level 2 call and then Generate() returned from the level 1 call, m_shell would be null when m_shell->GetValid() occurs and it would crash. There seems to be code to preclude this but maybe it still happens somehow.
My proposed fix is to keep track of the recursion level with mOKdepth and only set m_shell null when the value of mOkdepth is 1, meaning there is only one ProcessOkCommand() running and it's about to return. I also added an NS_ASSERTION to check for unexpected null m_shell and don't call m_shell->GetValid() if m_shell is null.
I left in some temporary printf in the diff.
I haven't seen any problems with this diff in place.
Reporter | ||
Comment 56•2 years ago
|
||
Richard, worcester,
Are you able to run a try build if one is provided?
Assignee | ||
Comment 57•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #56)
Richard, worcester,
Are you able to run a try build if one is provided?
If either can, I went ahead and made a try build with my diff here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=e1d35b1718921885aa11b162ec965245bd9c0d1c
The build is still in progress at this time.
However, according to comment 46, Richard has this set false:
mail.server.default.mime_parts_on_demand false
and I never see the crash point occur with it set like that. However, I think Worcester has it set true per comment 47.
Not sure how often they actually see this crash. It's not something easy to reproduce just by opening a specific email so they may have to run with this try build a long time to see if it prevents a crash. But maybe just running it for a while to see that it does no harm would be useful.
Assignee | ||
Comment 58•2 years ago
|
||
Here's the direct link to the windows installer for the now finished try build: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bQVym9vYT2O9WRVYuEs_eg/runs/0/artifacts/public/build/install/sea/target.installer.exe
In case you don't know, a try build is the current daily build with just my patch (diff) applied as the only change.
Assignee | ||
Comment 59•2 years ago
|
||
Still haven't duplicated the crash but, while trying, I did find one definite problem. As messages with attachments are opened, the code tries to save a cache of up to 20 "Body Shells". The idea is that once the cache is full, the oldest shell is removed (ejected) from the cache to make room for the next one. The shells are stored in an array and in a hash list with the key being the message UID and UID Validity concatenated as a string. However, I noticed that once all 20 are in the cache, on accessing the next message, nothing ever actually gets removed so the hash table grows indefinitely as new messages are opened.
The problem is when the oldest entry is attempted to be ejected, it is searched for in the hash table using only the UID as the key and not the correct UID+UID Validity combo. So older entries are never found and "ejected" allowing the cache to grow unchecked as new messages with attachments are accessed.
The attachment bs.diff shows the fix marked with "gds fix". (I also left in some temporary printfs for debugging.)
It's possible this is the cause of this bug, but I doubt it. Before fixing this, I opened a LOT of messages with attachments and never saw the crash.
Comment 60•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #56)
Richard, worcester,
Are you able to run a try build if one is provided?
Trying it...
Comment 61•2 years ago
|
||
(In reply to Richard Leger from comment #60)
(In reply to Wayne Mery (:wsmwk) from comment #56)
Richard, worcester,
Are you able to run a try build if one is provided?Trying it...
There seems to be authentication method issue with this version of TB. I have not been able to connect to my IMAP server with it... by either copying and re-using my existing profile or creating a new one... it fails to authenticate even with the right settings...
Assignee | ||
Comment 62•2 years ago
|
||
Hi Richard, Thanks for trying it. I just re-tried running it on linux and it worked OK. I also booted over to Windows and installed it and it ran OK for me. However, I did see one problem when I connected to my test server with a self-signed certificate. I had to go to Inbox and click "Get Messages" a few time to trigger the override dialog before I could open messages. This is a known feature (bug?) that used to work automatically. Don't know if this is what you are seeing, otherwise, with normal valid certs I see no problem connecting in Linux or Windows with the try build.
Just to make sure you are running the try build, it should see in about daily: 106.0a1 (2022-09-07).
Also, I'm still working on a better fix for this so probably the patch in the try build won't be used completely as-is anyhow. But would still be good to know if it has any good (or bad) effect for you.
Assignee | ||
Updated•2 years ago
|
Comment 63•2 years ago
|
||
(In reply to gene smith from comment #62)
Also, I'm still working on a better fix for this so probably the patch in the try build won't be used completely as-is anyhow. But would still be good to know if it has any good (or bad) effect for you.
I finally managed... trying it now... for the next couple of weeks in a new profile with just my IMAP/SMTP mailbox :-)
Keep you posted.
Comment 64•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #56)
Richard, worcester,
Are you able to run a try build if one is provided?
Probably, but would need some very specific instructions.
Comment 65•2 years ago
|
||
(In reply to Richard Leger from comment #63)
(In reply to gene smith from comment #62)
Also, I'm still working on a better fix for this so probably the patch in the try build won't be used completely as-is anyhow. But would still be good to know if it has any good (or bad) effect for you.
I finally managed... trying it now... for the next couple of weeks in a new profile with just my IMAP/SMTP mailbox :-)
Keep you posted.
I have not noticed crash so far, but I can no longer use the daily as it seems the msf. files got corrupted. Few times some emails lost their headers, not displaying on message list (but not all of them must few of them toward the top), at some point I lost all of them... no header displayed... I tried to repair affected folder couple of time... which fixed the problem for a while, but now for a reason I cannot come to understand all my emails appears as unread in all my folders :-) So will stop the trial here and revert back to using the normal beta. For the record it may not be linked to your fix as I was also trying the unified view to test bug related to it... so it may be that as well... but no crash related to this bug in my small period of testing...
Assignee | ||
Comment 66•2 years ago
|
||
Thanks for testing it Richard. Thought maybe the problem you describe in comment 65 was the same as this bug 1791169 that I caused with another attempted fix for autosync. But, based on dates, apparently not.
We may end up just ripping out the bodystructure/bodyshell stuff since Magnus really wants to get rid of "fetch by parts" from servers. It probably causes more problems than it actually fixes and adds a lot of complexity.
Assignee | ||
Updated•2 years ago
|
Comment 67•2 years ago
|
||
(In reply to gene smith from comment #66)
Thanks for testing it Richard. Thought maybe the problem you describe in comment 65 was the same as this bug 1791169 that I caused with another attempted fix for autosync. But, based on dates, apparently not.
We may end up just ripping out the bodystructure/bodyshell stuff since Magnus really wants to get rid of "fetch by parts" from servers. It probably causes more problems than it actually fixes and adds a lot of complexity.
Sorry I could not pull off this testing for you. Sounds like it might not matter anyhow if you're changing things.
Is there a new bug for this ripping out?
I have been using Firefox much less, and Thunderbird a bit less also at home, and therefore don't go into Bugzilla as much as I used to.
Comment 68•2 years ago
|
||
(In reply to gene smith from comment #42)
The current default TB configuration no longer uses fetch by parts. So the above code won't be executed so a crash at line 422 can't occur. So a workaround for this crash is to just make sure prefs containing the string "parts_on_demand" are all at default:
Was this code removed?
What can we do to help with this going forward?
Reporter | ||
Comment 69•2 years ago
|
||
FWIW, this is the only crash signature for version 102.* that has FetchTryChunking
on the stack [1]. That seems odd.
But, there are no beta and no nightly crashes after 105 beta, i.e. starting with 106.
Reporter | ||
Comment 70•1 year ago
|
||
There are no version 115 crashes https://crash-stats.mozilla.org/signature/?proto_signature=~nsImapServerResponseParser%3A%3AProcessOkCommand&product=Thunderbird&date=%3E%3D2023-08-22T21%3A33%3A00.000Z&date=%3C2023-08-29T21%3A33%3A00.000Z&_sort=-date&signature=nsImapServerResponseParser%3A%3AProcessOkCommand
In fact, there are no version 115 crashes with nsImapProtocol or nsImapServerResponseParser on the stack, period.
Reporter | ||
Comment 71•1 year ago
|
||
Per Gene "bodyshell code is gone with bug 1805186"
Reporter | ||
Updated•1 year ago
|
Updated•1 year ago
|
Description
•