TB 78 and 91 very slow and unresponsive in short bursts, with high CPU, on 54 GB system with three disks
Categories
(Thunderbird :: Untriaged, defect)
Tracking
(Not tracked)
People
(Reporter: hrdubwd, Unassigned)
References
Details
(Keywords: perf, stalled, Whiteboard: [closeme 2022-10-01])
User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
Steps to reproduce:
Almost anything ...
Actual results:
It takes several seconds or more to see any response, such as the cursor.
Expected results:
Almost immediate selection of cursor location, typed text to appear, selection of mailbox or pane, etc etc. Deletion of a simple one line message took 4 seconds just now. It makes it very difficult to type a message - display lags, characters and arrow key actions often do not register.
At the moment, TB has been running for 8 days and has accumulated nearly 13 hours of CPU time. There seems to be an awful lot of spontaneous 'background' activity that prevents foreground actions, about every 3 minutes or so. This seems always to involve about 0.3 GB temporary ramped increase in the System Commit memory with one core fully occupied - even when I am doing nothing with it, such as typing now in Firefox - for 8 to 20 seconds at a time. (For comparison now, I have no such problem with what I am doing here.)
It would appear that the priority for the 'background' work is wrong. I do not doubt that it is sensible activity, but it is getting in the way of use. Indeed, watching the thread activity, there is an awful lot going on all the time, even when messages are not being retrieved (which collection incidentally does not cause a memory usage spike nor use more than about a third of the core's capacity).
It is very frustrating to have such an unresponsive program.
Comment 1 • 3 years ago
The two minute, no muss no fuss test without touching messages or folders ...
Please start Windows' safe mode with networking enabled
Still in Windows safe mode, start Thunderbird in Troubleshoot mode
Does problem go away?
Comment 2 • 3 years ago (Reporter)
2 minutes? That would take a couple of hours at least, for various reasons. When I get a chance, next week perhaps. (91.3 now, and it still does it).
Can you not identify what processes are running as described?
Comment 3 • 3 years ago (Reporter)
Sorry for the delay - lots in the way.
Anyway, I did that, and all was fine in that mode, no sign of a problem.
Going back to normal, in a couple of tests, it all now seems good!
Since reporting this item, I had occasion to uninstall Google Drive (I had concluded that it served no purpose but it seemed to tag a 'crashhandler' onto most running programs where any kind of text was involved - this I thought was odd anyway). I have also removed Macrium Reflect because I could not get it to work properly. There was no change in TB behaviour after that (i.e. no reboot - not apparent that it was required). But, killing the crashhandler on TB made no difference.
There has been no other change except a Windows update (latest .NET Framework, at the first reboot here).
It would appear that some oddity was operating, though whether it was attributable to Google Drive or Macrium I could not say except coincidence.
I will keep an eye on it for a few days. If it does not recur I'll come back and cancel this.
Thanks.
Comment 4 • 3 years ago (Reporter)
I noticed that the problem was reappearing. Typing a reply had multiple interruptions which were accompanied by appreciable spikes in I/O activity and repeated spikes in the Private bytes of the root process. Exit and restart and repeating the reply was perfectly fine - no I/O spikes, no memory spikes - smooth typing.
It would seem that leaving TB running for long periods (this now after two days) is associated with some untoward effects.
Comment 5 • 3 years ago
Two days isn't long. Please give numbers for CPU and memory usage. Please list your AV and backup software.
Comment 6 • 3 years ago (Reporter)
2 days: exactly my point ... not long, but it happens.
Sorry, what numbers? I gave an example - total accumulated: 13 h in 8 days. I can show a graph of Private Bytes memory usage when the problem recurs (only just restarted) (presently 356 MB for the root process). But you have to specify exactly what you want.
AV: M$ Security Essentials,
BU: Easeus - but not running when this occurs.
Comment 7 • 3 years ago
I am looking for point in time numbers/percentages. (cumulative numbers are less interesting)
Please also create a profile https://support.mozilla.org/en-US/kb/profiling-thunderbird-performance
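To illustrate the difference between cumulative and point-in-time figures, here is a minimal Python sketch. It assumes you can read the process's accumulated CPU time at two moments (for example from Process Explorer or Task Manager). The function name and the burst figures are hypothetical, chosen only for illustration; only the "13 h in 8 days" figure comes from this report.

```python
def cpu_percent_between(cpu_seconds_start, cpu_seconds_end, wall_seconds):
    """Share of one core used over an interval, as a percentage.

    cpu_seconds_* are cumulative CPU-time readings for the process;
    wall_seconds is the elapsed real time between the two readings.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * (cpu_seconds_end - cpu_seconds_start) / wall_seconds


# Whole-run average from the figures in the description: 13 h CPU over 8 days.
average = cpu_percent_between(0, 13 * 3600, 8 * 24 * 3600)
print(f"average over the run: {average:.1f}% of one core")

# Hypothetical point-in-time reading: 15 s of CPU accrued in a 20 s window.
burst = cpu_percent_between(1000.0, 1015.0, 20.0)
print(f"during the burst: {burst:.1f}% of one core")
```

A low whole-run average can hide short bursts that saturate a core, which is why readings taken just before and just after a freeze are more informative than the running total.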
Comment 8 • 3 years ago (Reporter)
Point in time: tricky, because this is about much short term activity that cumulatively seems to be the problem. I will endeavour to get some examples.
Profile: OK, set up, will have to wait for the problem to recur in noticeable fashion - it seems to require some time to develop, as I said (I have only just restarted my system).
Thanks.
Comment 9 • 3 years ago
According to this article, perf monitor can be set to alert when high disk activity is detected https://www.cloudsavvyit.com/3931/how-to-set-up-monitoring-to-alert-on-windows-high-system-usage/
Comment 10 • 3 years ago (Reporter)
Not really helpful here I feel. Very elaborate, but misses the point, I think, because the problem is that TB locks in a way that I never see with other programs, no matter what. Asking for action, like read or save, and waiting for that to complete is different. This is about typing as I am now and finding that there is no response whatsoever. As far as I can see, FF can be very busy - CPU peaking up to 40%, lots of disk activity, pages reloading in the background - and still I have no problem. In simply trying to reply to a message in TB it freezes repeatedly for seconds at a time.
Comment 11 • 3 years ago
(In reply to Dr B W Darvell from comment #10)
Not really helpful here I feel. Very elaborate, but misses the point, I think, because the problem is that TB locks in a way that I never see with other programs, no matter what. Asking for action, like read or save, and waiting for that to complete is different. This is about typing as I am now and finding that there is no response whatsoever. As far as I can see, FF can be very busy - CPU peaking up to 40%, lots of disk activity, pages reloading in the background - and still I have no problem. In simply trying to reply to a message in TB it freezes repeatedly for seconds at a time.
How frequently have you set up to auto save to Drafts ?
I encountered this before in Support and the hanging coincided with each time there was a save to drafts. This was worse for those who had AV scanning allowed.
Comment 12 • 3 years ago (Reporter)
It is set at 5 minutes, and that does not match the timing of the problems. One memory event occurs about every two minutes, though not with exact regularity. On occasions there are such spikes every few seconds, but I cannot see what triggers that sequence.
I'll switch off autosave anyway, just to check, but it will need a couple of days to know as I have just restarted TB because it was getting unusable.
AV has been off for the profile folder for a couple of days - no effect.
Thanks.
Comment 13 • 3 years ago
(In reply to Dr B W Darvell from comment #12)
It is set at 5 minutes, and that does not match the timing of the problems. One memory event occurs about every two minutes, though not with exact regularity. On occasions there are such spikes every few seconds, but I cannot see what triggers that sequence.
I'll switch off autosave anyway, just to check, but it will need a couple of days to know as I have just restarted TB because it was getting unusable.
AV has been off for the profile folder for a couple of days - no effect.
Thanks.
In Account Settings > Server Settings how frequently have you set to check for new messages ?
If you made a time note of when hanging/spikes occur, do they coincide with information in the 'Activity Manager'?
Is your Anti-Virus allowed to scan Thunderbird files and folders ?
Comment 14 • 3 years ago (Reporter)
- 5 min
- Noting the time is near impossible because all such events last just a few seconds and a manual log would become ridiculous. I will have to wait a couple of days for the problem to recur because TB has just been updated. However, when it does I will check the Activity Manager. All I have seen there is 'fetching messages' entries, and those do not coincide with the problem. The memory spike is about every 2 minutes, as I said.
- As in comment 12, no.
Comment 15 • 3 years ago
I had similar problems with TB. I think, without proof, that it's related to MBOX files being large and needing to be updated. In any case, I took my chances with MAILDIR (one file per message) and essentially all the delays disappeared. I had previously improved the situation by unsubscribing from All Mail (I'm using Gmail IMAP) and working in the much smaller inbox. I have not tried to resubscribe to All Mail under MAILDIR - in the rare case I need to search it the web interface is better anyhow.
Comment 16 • 3 years ago
Just a drive by comment.
Have you tried disabling the global index/ search in the preferences? On my old device I did have issues with hangs, memory use and CPU. Replacing the old and quite slow hard disk with an SSD did appear to fix a lot of that. But in the process of monitoring to try and determine what was actually hanging, I was seeing a lot of activity manager messages that just sat there with "determining messages to index on" XXXXX. Those folders may have had 2 new messages, but Thunderbird was pondering the situation for minutes at a time. I think that perhaps there may be issues in the global indexing code hence my suggestion to try it without.
Comment 17 • 3 years ago (Reporter)
Activity Manager does not seem very helpful. It logs download (or not), moves and deletions - and that is all as far as I can see. I compacted folders: this led to a lot of I/O as would be expected, but only an occasionally visible and very transient 'compacting a folder' message was shown (there are other transients occasionally but I cannot see what they are for - too quick). The I/O continued with no activity being indicated. The status bar was similarly silent during this most of the time, although the progress bar was clearly moving slowly along. When the status showed that the compacting was completed, the memory use ramped up over a few seconds by 170 MB and has stayed there, apart from the regular and very short spike of about 100 MB at intervals of 2 min 2 or 3 seconds (which corresponds to a spike in CPU use) - until 9 min 20 seconds later when it dropped 120 MB. There was no visible I/O in that time (a few bytes only, it seems). Again, no activity log that I saw.
If the Activity Manager is supposed to help, I cannot see how. Watching several windows at once in the hope of seeing something is not sensible.
BTW: I checked, and the IMAP account I mentioned earlier is not active - there is NO message checking for that account. That means that only local folder accounts are involved.
Comment 18 • 3 years ago (Reporter)
Matt: I can give that a go, but it might take a couple of days to know if it works. But, I have never seen the Activity Manager show any such message. Does that mean that there are hidden AM settings?
Comment 19 • 3 years ago
(In reply to Dr B W Darvell from comment #6)
... total accumulated: 13 h in 8 days.
As you previously noted, 13h is a lot of CPU. With such a significant amount of CPU, it should be easy to determine whether in Windows safe mode the cumulative CPU reduces and problem occurrence reduces - so far I haven't seen that determined.
And given that you also see poor response time while typing, it should also be possible to capture a screenshot of disk IO by file name as shown in resource monitor (as we discussed in email).
Activity manager is time stamped - so you only need to have it open to refer back to it later, i.e. you don't need to be "watching" it to see what was happening in the past. And yes, it's not designed to show everything.
You have one imap account. How many pop accounts? With no server side filters?
Comment 20 • 3 years ago (Reporter)
Safe mode: I don't think I could run in that for the several days it would take for the problem to occur and work in the meantime. It might work if I was away for a few days, but I am not going anywhere of course, given the general situation.
Disk IO: I have not noticed any coincidence of that kind, but that window ends up covered very easily, and by the time I have noticed a problem and restored the view the event is over. It cannot be done to order, but once it starts I can try to get a sharper response. However, $Logfile (NTFS Volume Log) and $Mft (Master File Table) seem to get a lot of traffic, as does popstate.dat and sqlite-related files, even when TB is in the background and nothing else is going on. The memory usage goes up and down for no apparent reason (even with Global Indexing off now), and the 2-minute recurring spike continues (clearly on a timer that is nothing to do with checking mail).
Activity "Manager" seems to be an overstatement: There is no interaction possible except to clear the list, the server connection failures sit on the top of the list without timestamps (even when the connection is regained), duplicates are lost, and nothing relevant now ever appears (there are transient messages - but I cannot read them in that flash). It seems to me to be a waste of space, it tells me nothing useful and serves no diagnostic purpose whatsoever. What did you expect me to see?
Pop accounts: 4, no filters except for spam, but I have no control of those anyway.
Comment 22 • 3 years ago (Reporter)
(Firstly, sorry about that duplicated message - I do not know how that happened, but it might have been because I had two tabs open for the same bug at one point.)
I think I have identified a significant factor in the problem: there seems to be a memory leak. Whenever there is any activity with email, the 'Private Bytes' figure afterwards is (pretty much always) higher than before. (There is a lot of up and down while things are being done, so I am referring only to the 'settled' value.) This creep continues until the delays get unbearable (somewhere over 800 MB) and I have to restart. On my system at least, TB settles on starting to about 240 MB before any actions on my part. At the moment it is 680 MB after about 40 h running.
I can get similar creep by simply stepping through to display mail in a folder, and even stepping through the folder tree seems to do it. If I repeatedly display the same two messages by alternating, arrow up and down in the message list, the creep still occurs. By displaying about 40 messages 4 times by stepping, the usage has gone up by 87 MB. Repeating the exercise, more or less, added about 70 MB. None of this is ever recovered it seems, even after a long pause.
This is with the 'Compact' and 'Indexing' settings off, as is spell check and data collection.
After some more traffic, the figure is now 913 MB ... I wonder how far it will go?
Surely there is a major problem somewhere.
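The figures above can be turned into a rough growth rate. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the creep is linear, which the observations here suggest but do not prove.

```python
def growth_rate_mb_per_hour(start_mb, current_mb, hours):
    """Average growth of the settled Private Bytes figure, in MB per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (current_mb - start_mb) / hours


def hours_until(threshold_mb, start_mb, rate_mb_per_hour):
    """Hours from start until the threshold is reached, assuming linear growth.

    Returns None when memory is not growing.
    """
    if rate_mb_per_hour <= 0:
        return None
    return max(0.0, (threshold_mb - start_mb) / rate_mb_per_hour)


# Figures from this comment: ~240 MB after startup, 680 MB after ~40 h.
rate = growth_rate_mb_per_hour(240, 680, 40)
# Delays are reported as unbearable somewhere over 800 MB.
eta = hours_until(800, 240, rate)
# Stepping test: about 40 messages displayed 4 times added 87 MB.
per_display = 87 / (40 * 4)

print(f"creep: {rate:.1f} MB/h, ~{eta:.0f} h from startup to 800 MB")
print(f"about {per_display:.2f} MB retained per message display")
```

On these numbers the 800 MB mark falls roughly two days after startup, which is consistent with the restart interval described earlier in this report.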
Comment 23 • 3 years ago (Reporter)
The Private Bytes are now up to 1.4 GB, with frequent CPU activity even when I am doing nothing, and it has now become a real pain to work. TB has been up for only ~2.5 days, the main process having run for > 2 hours cumulative.
I suppose most people will reboot more frequently than that and so not notice the problem. As it is, I now definitely have to restart TB and do so more or less daily to avoid hassle. This needs fixing.
Comment 25 • 3 years ago
Access Profile name folder
Exit Thunderbird
What size is the 'panacea.dat' file ?
Delete:
panacea.dat
A new one will get created when you next start Thunderbird.
How many 'prefs.js' files do you see?
There should be one 'prefs.js' file, which is the one in current use by Thunderbird, but do you see others with a number, e.g. 'prefs-1.js', 'prefs-2.js' etc.?
The one with the largest number being the last one used.
If you see a lot of these 'prefs-n.js' files (where n is a number):
Keep the 'prefs.js' file.
Keep the 'prefs-n.js' file which has highest number assuming it has a size same as 'prefs.js' file.
Delete all other 'prefs-n.js' files.
Rename the remaining 'prefs-n.js' to, say, 'prefs-1.js' - it acts as a backup.
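The prefs file check above can be sketched as a dry run in Python. This is only an illustration under stated assumptions: the function name is hypothetical, the script only lists files and removes nothing, and it does not perform the size comparison against 'prefs.js' mentioned above. Run it only while Thunderbird is closed, and point it at your own profile folder rather than the temporary directory used here for demonstration.

```python
import re
import tempfile
from pathlib import Path

# Matches numbered backups such as prefs-1.js or prefs-12.js (not prefs.js).
PREFS_BACKUP = re.compile(r"^prefs-(\d+)\.js$")


def plan_prefs_cleanup(profile_dir):
    """Return (keep, delete) lists of file names; nothing is removed."""
    profile = Path(profile_dir)
    numbered = []
    for entry in profile.iterdir():
        match = PREFS_BACKUP.match(entry.name)
        if match and entry.is_file():
            numbered.append((int(match.group(1)), entry.name))
    numbered.sort()  # numeric order, so prefs-10.js sorts after prefs-2.js

    keep = ["prefs.js"] if (profile / "prefs.js").is_file() else []
    delete = [name for _, name in numbered[:-1]]
    if numbered:
        keep.append(numbered[-1][1])  # highest-numbered backup is kept
    return keep, delete


# Demonstration on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("prefs.js", "prefs-1.js", "prefs-2.js", "prefs-10.js"):
        (Path(tmp) / name).write_text("// placeholder\n")
    keep, delete = plan_prefs_cleanup(tmp)
    print("keep:  ", keep)
    print("delete:", delete)
```

Listing first and deleting by hand keeps the step reversible, which matters because 'prefs.js' holds the account configuration.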
Comment 26 • 3 years ago (Reporter)
Panacea : 453 kB
There is only one prefs*.js (under the profile), and another under the debugger profile. Nothing numbered.
Restarted: panacea is now 0 bytes.
Now what?
Comment 27 • 3 years ago (Reporter)
Panacea is now 1230 kB.
There is still only one prefs.js, plus the copy in the debugger folder.
Private bytes still climbing after the restart: 508 MB. The leak is still there.
Comment 28 • 3 years ago (Reporter)
Now on TB 91.5.0 (64b). A number of memory faults have been fixed, it seems (the bug list does not work so I cannot check what).
The behaviour now does seem to be different in the pattern of variation of CPU and memory, but it is still creeping up: it started at 209 MB and is now ~356 MB, in the space of ~90 min.
We'll see.
(Panacea 205 kB, still only one prefs.js.)
Comment 29 • 3 years ago (Reporter)
After ~21 h and some light traffic, the memory usage is now 770 MB and the delays have started again.
The patterns of CPU and memory variation are different from the previous version, but the leak is still there and the problem persists.
(Panacea 1505 kB, still only one prefs.js. Do I take it that these can now be ignored as of no bearing?)
Comment 30 • 3 years ago (Reporter)
Memory now up to 965 MB in 2 days, the problem is unchanged. The two-minute 'tick' is still there strongly (core maxed, memory spike 350 MB, lasting about 8 s).
I have to restart.
Panacea now 3855 kB. Still only one prefs (at 51 kB).
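The regularity of the 'tick' could be checked from a log of memory samples rather than by eye. The Python sketch below is only an illustration: the function names are hypothetical and the data is synthetic, built to resemble the figures reported in this bug (965 MB settled, 350 MB spikes lasting 8 s, recurring every 2 min 2 s). It is not a real measurement.

```python
from statistics import median


def spike_start_times(samples, threshold_mb):
    """Return the times at which memory first rises above baseline + threshold.

    samples is a list of (seconds, megabytes) pairs ordered by time.
    The baseline is taken as the median, which short spikes barely move.
    """
    if not samples:
        return []
    baseline = median(mb for _, mb in samples)
    starts = []
    in_spike = False
    for t, mb in samples:
        above = mb - baseline > threshold_mb
        if above and not in_spike:
            starts.append(t)  # rising edge: a new spike begins here
        in_spike = above
    return starts


def intervals(times):
    """Gaps between successive spike starts, in seconds."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Synthetic one-per-second samples (NOT measured data).
samples = [
    (t, 965 + (350 if t >= 10 and (t - 10) % 122 < 8 else 0))
    for t in range(400)
]
starts = spike_start_times(samples, threshold_mb=100)
gaps = intervals(starts)
print("spike starts (s):", starts)
print("gaps (s):", gaps)
```

Constant gaps would point to a timer-driven task, whereas irregular gaps would point to something triggered by activity.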
Comment 31 • 3 years ago
Some additional areas for potential exploration
- step through https://wiki.mozilla.org/Thunderbird:Testing:Memory_Usage_Problems
- other high memory bug reports https://mzl.la/3KWMwkv
Comment 32 • 3 years ago (Reporter)
1 I cannot see anything in there relevant or not already tried except logging, which is now started. What are we looking for in there?
2 No apparent relevance (where I can understand what is being said). One thing though is that memory usage is essentially static when nothing is being done (except for the tick).
BTW: what is 'panacea' for? Can we ignore that? 238 kB as I write now - it is very variable in size.
The memory usage seems to increase without end - I have tested again since comment 30 and gone well past 1 GB, making it very painful to use.
Now today on 91.5.1. Nothing apparently relevant in the changelog.
Comment 33 • 3 years ago (Reporter)
I now have a pop3 log file of 98 MB. What shall I do with it?
Comment 34 • 3 years ago
Compare bug 592876.
Those who see this bug: Are you using the calendar? How is your calendar sync configured, exactly?
Comment 35 • 3 years ago
Also, how long has your Thunderbird process been running? How long ago did you start Thunderbird? When I saw a similar bug, it appeared only after a few days of running Thunderbird continuously. A Thunderbird restart fixed it immediately, for a few days.
Comment 36 • 3 years ago (Reporter)
I am not using the calendar very much - there is the very occasional item that appears to go into it, but that is all. It is never open.
As I said above, it takes a day or two (depending on activity) to get the memory use to rise high enough for the problem to appear. I am getting into the habit of checking and restarting accordingly, before trouble. As with you, restart is the only immediate, but always temporary, fix, no matter what other settings (off) I have tried that have been suggested.
Comment 37 • 2 years ago
I face the same problem.
Thunderbird's private memory grows hour after hour - now 1.2 GB after only 2 days. Freezes happen after a few hours or days of use. I have to restart TB every 3-4 days because these freezes become longer and longer with time, resulting in an unusable UI.
Looking at the performance profiler, it looks like the freezes coincide with repeated cycle collection calls: I uploaded an example to https://share.firefox.dev/3xFZzRm
Comment 38 • 2 years ago
The profiler is better in version 102. Please post a new profile using version 102 started in troubleshoot mode.
https://support.mozilla.org/en-US/kb/profiling-thunderbird-performance
Comment 39 • 2 years ago (Reporter)
OK, set up. For how long (or when) do you want me to record?
Comment 40 • 2 years ago
15-20 seconds should be sufficient.
Set it up in advance, but do not turn on the capture until the slowness begins. Or, if you know how to reproduce it, turn on capture before starting your reproduction.
Comment 41 • 2 years ago (Reporter)
It is reproducible by waiting a few days.
I usually restart regularly to avoid the problem, but I will now leave it run.
Thanks.
Comment 42 • 2 years ago (Reporter)
Is this what you need: https://share.firefox.dev/3DlRvtt ?
TB had reached 1 GB Private memory (reached more rapidly than before) and a CPU time of 25+ min for an uptime of 20 h.
I will leave it run for now.
Comment 43 • 2 years ago
(In reply to Dr B W Darvell from comment #42)
Is this what you need: https://share.firefox.dev/3DlRvtt ?
Yes, that is a proper profile. But I don't see anything in there that suggests a long-lasting effect on performance. I will ask someone to look at it.
TB had reached 1 GB Private memory (reached more rapidly than before) and a CPU time of 25+ min for an uptime of 20 h.
FWIW I don't think 25 minutes of CPU in a day is super unusual. Nor 1 GB of memory.
Comment 44 • 2 years ago (Reporter)
Profile: OK, good.
Time: It seems to me to be out of proportion for the actual activity, although it does seem less than previously.
Memory: maybe, but it just grows indefinitely. It is 1.3 GB now ... I am sure there is a memory leak.
Comment 45 • 2 years ago
The profile you submitted has two problems.
- It looks like you didn't do step 5 of https://support.mozilla.org/en-US/kb/profiling-thunderbird-performance
- It does not appear to have been recorded during a period where you were actually experiencing slowness
Comment 46 • 2 years ago (Reporter)
- I thought that was what I did at comment 42. Where should it have gone?
- I experienced slow down at around that point, but it is clearly hard to know whether it is still the case. I'll try again later when I detect it.
Memory now 1.5 GB, CPU time 54 min.
(Sorry, this was supposed to have been sent 2 d ago!)
Comment 47 • 2 years ago (Reporter)
Anyway, memory now at 2.3 GB, CPU 2 h 55 min, and stuttering is becoming obvious (but clearly less rapidly or dramatically than when I started this).
New profile: https://share.firefox.dev/3Dpwden
Comment 48 • 2 years ago (Reporter)
Memory: 2.8 GB, CPU 5:48 after some 6 d.
Stuttering now obvious.
New profile: https://share.firefox.dev/3LlfmeQ
Comment 49 • 2 years ago (Reporter)
2.9 GB (levelled off?), 6:58 - an hour a day of CPU time, with actually relatively little traffic. I have to restart now, the interference is annoying.
New profile: https://share.firefox.dev/3f30825
Comment 50 • 2 years ago
Sorry, these profiles still lack the needed options, Step 1, item #5 of https://support.mozilla.org/en-US/kb/profiling-thunderbird-performance
Configure profiler settings:
- Click "edit settings" which is the last item in the Developer Tools window.
- Select the "Thunderbird" preset (if it isn't already).
- Scroll down and also mark the checkbox for "All File IO".
- Change any other settings that might be needed. For example, if you expect to need a very long sample, longer than a minute, you might need to increase the buffer size from 1GB to 2GB, or reduce the sample interval to less than 1ms.
- Close the settings window.
Comment 51 • 2 years ago (Reporter)
Sorry, I thought I had followed the instructions.
I'll give it another go when it happens again. The creep does seem to be slowing down over recent updates.
Comment 52 • 2 years ago (Reporter)
Update: TB (102.4.1) has been running for 7 days now with no evident slowdown. Memory usage is at 580 MB, which means that it has increased very slowly in comparison with previous reports. CPU usage is at 1 h 12 min, which is also substantially less than before. Very little background activity is evident, with a little I/O from time to time, and the temporary memory usage increase is small.
A second process has 1 h 46 min and 570 MB - which seems to me proportionately more than previously (I have not been looking at this because it was not a factor previously, it seemed).
Whatever changes have been in the updates they seem gradually to have moderated this problem to the point of practical insignificance. Memory creep notwithstanding, it has not become a pain. View message cycling does not cause as much creep, but there is some (40 or 50 view gets +6 MB).
I'll leave it run and not update to 102.4.2 yet, see what happens.
Comment 53 • 2 years ago
Thanks for the update.
Comment 54 • 2 years ago (Reporter)
Resolved? The memory leak is still there.
Comment 55 • 2 years ago
View message cycling does not cause as much creep, but there is some (40 or 50 view gets +6 MB).
This is rather insignificant.
Comment 56 • 2 years ago (Reporter)
On startup, memory usage is ~295 MB. After 5 days (I had to restart) it is up to 601 MB on the main process, 472 MB (vs. 103) on the second. That is not trivial. I thought memory like that was a security risk, apart from the inconvenience. The slowdown may have gone away, but there is clearly still a fault.
Comment 57 • 2 years ago
Thank you for pointing that out. However, one is never going to maintain the amount of memory used from startup to over a period of days or weeks. There may be a fault, but the amount of increase in the time period you cite is not something we would invest time trying to diagnose. If we were talking multiple GB it might be worth investigating.
Comment 58 • 2 years ago (Reporter)
Noted. Thanks.