Closed Bug 679498 Opened 13 years ago Closed 11 years ago

Firefox rainbow beachball locks up frequently with ~10 windows each with ~10 tabs, have to force quit multiple times daily

Categories

(Firefox :: General, defect)

6 Branch
x86_64
macOS
defect
Not set
critical

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: tantek, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: hang)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1

Steps to reproduce:

Open Firefox with about 10 windows, each with about 10 tabs on average. Then perform nearly any action.

Actual results:

Firefox is responsive for a bit (minutes?) but then quickly bogs down, and then *any* of the following actions cause a rainbow beachball cursor, CPU usage of almost 100%, and unresponsiveness:
* open a new window
* start typing a URL in the URL bar, drop down appears, then beachball. You can drag the window and the drop down stays in place.
* refresh one of the existing tabs
* open a new tab
* quit

Some of these improved in responsiveness with FF6 over 5, and I have to force quit less often, but it's still horrible from a responsiveness perspective. What is FF spending all that processor time on, unrelated to the active user thread, to the point where it becomes unresponsive?

Expected results:

NEVER show the rainbow beachball. This is an architectural problem: it should never be possible for *any* action to cause the user event thread to be blocked / starved. Once that is fixed, then start attacking the causes of spiking CPU usage to nearly 100%.

I am running on a less than a month old MacBook Air 11" 4GB RAM with OSX Lion. Same results (but worse) on a previous generation MacBook Air 11" 4GB RAM with OSX Snow Leopard.

Sidenote: I would love it if there was some tool that allowed me to simply export my current window/tab state as a *single double-clickable file* that I could then email or share with anyone else, who could easily open the file, have it launch/open Firefox, and then open the same exact windows with the same exact tabs (for extra credit, include tab history as well). This would greatly help pass on more information about what windows/tabs are open so that folks working on performance, responsiveness etc. could have a more precise starting point for debugging such problems.
And yes I had to use Chrome to file this bug because FF6 was locked up rainbow beachballing.
And it just happened again, clicking on the URL for this bug in #firefox IRC, it loads, then about 10-15 seconds of rainbow beachball before it would respond to me clicking on the (Login) button. Seriously - bugzilla is not that JS/resource heavy - what is going on?
Also: only extensions installed/active: FireFTP, HTTPS-Everywhere, Mass Password Reset, Operator. Disabled extensions: Firebug, Flashblock, Table2Clipboard, TestPilot and a few others. Plugins: only QuickTime Plug-in 7.7.1 is installed/active. Disabled plugins: iPhotoPhotocast, Java, Shockwave Flash
Tantek, can you replicate that behavior in a fresh profile? I assume that all your testing happened in your daily profile so far.
I would suggest going to http://support.mozilla.org and working with someone in realtime on debugging this. Other than that, try running it with a new profile with all extensions and plugins disabled.
:abillings Are you suggesting he use LiveChat?
Yes at https://support.mozilla.com/en-us/chat. Logging a bug isn't a good way to debug an individual support issue.
Let's assume for a second that Tantek's problem is a corrupted profile...

Lars: When the browser repeatedly crashes, the browser is smart enough to detect a second successive crash and not restore tabs. BUT the bigger issue I see is: when users' PROFILES become corrupted, is it smart enough to detect X repeated crashes and suggest that the user run with a new profile and/or with extensions & plugins disabled? If not, I'd be happy to talk through what the UX of that might be with whomever on the platform you would recommend.
@Henrik - would love to without data loss if possible. In order to replicate as you request "in a fresh profile", how do I:
* save my current profile (figured it out per https://support.mozilla.com/en-US/kb/Backing%20up%20your%20information )
* create a new profile (figured it out from https://support.mozilla.com/en-US/kb/Managing-profiles#w_creating-a-profile )
* re-open the same set of windows, tabs (and preferably, tab history as well) - without having to write down the set of URLs and manually open them up one-by-one (see above request for a tool that exports window/tab state so that it can be reopened in another browser/machine/profile etc.) ?

@Al - already pinged Crystal Beasley (working on SuMo) about this problem, and she suggested filing the bug. Would love to test with a new profile - how do you reopen a specific window/tab set in a new profile? (starting up a new profile momentarily and expecting to start with an empty history, windows/tabs etc. ... )
Ok have created a fresh profile, and as predicted/expected, none of previously open windows/tabs are open, however those are necessary for reproducing this bug. Therefore the "try running it with a new profile" suggestion is insufficient to make progress on this bug. Reverting to previous profile. Also, hanging out in #firefox.
Tantek, simply copy the sessionstore.js file from the affected profile to the new one. Then start Firefox again with the fresh profile. That should trigger SessionStore to reopen all of your previous open tabs and windows.
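The copy step described above can be sketched in Python. This is only an illustration, assuming Firefox is fully closed while the file is copied; the function name `copy_session_file` and the `.bak` backup are hypothetical additions, and the two profile directories have to be located by hand.

```python
import shutil
from pathlib import Path


def copy_session_file(old_profile: Path, new_profile: Path) -> Path:
    """Copy sessionstore.js from the affected profile into the fresh one."""
    source = old_profile / "sessionstore.js"
    if not source.is_file():
        raise FileNotFoundError(f"no sessionstore.js in {old_profile}")
    target = new_profile / "sessionstore.js"
    # Keep any session file the fresh profile already has (hypothetical safety net).
    if target.exists():
        shutil.copy2(target, new_profile / "sessionstore.js.bak")
    shutil.copy2(source, target)
    return target
```

Starting Firefox with the fresh profile afterwards should then let SessionStore pick the file up, as described in the comment above.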
Severity: normal → critical
Keywords: hang
Hardware: x86 → x86_64
Thanks Henrik will give that a shot. Also, title updated to be a bit more specific.
Summary: Firefox rainbow beachball locks up on me all the time, have to force quit multiple times daily → Firefox rainbow beachball locks up frequently with ~10 windows each with ~10 tabs, have to force quit multiple times daily
Something else which could be helpful is to use Shark to get a process map of the frozen application. If you could attach one, it would be great.
Ok I copied my previous sessionstore.js file over to the fresh profile, all my windows/tabs re-opened as expected. Could not reproduce frequent rainbow beachballing with > 12 hours of use. I then reconnected to my Firefox Sync account, which restored bookmarks/passwords as expected. Will run for another 12-24 hours that way and then continue with adding things back in.
Tantek, given that you were able to reproduce the issue likely kinda often I would propose the other way. Just create a clone of your daily profile and remove files from it until the issue doesn't appear anymore. You should probably start with the places.sqlite file and the corresponding bookmark backups. Not sure yet, if the history (which is probably 180 days for you) has any influence.
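The "clone and remove files" approach could look roughly like the following Python sketch. The helper name `clone_profile_without` is hypothetical, and it assumes Firefox is not running and that the clone directory does not exist yet; `places.sqlite` and `bookmarkbackups` are the names suggested in the comment above.

```python
import shutil
from pathlib import Path


def clone_profile_without(profile: Path, clone: Path, skip: set[str]) -> list[str]:
    """Copy a profile directory, leaving out the top-level names in `skip`."""

    def ignore(directory: str, names: list[str]) -> list[str]:
        # Only filter entries directly inside the profile root.
        if Path(directory) == profile:
            return [name for name in names if name in skip]
        return []

    shutil.copytree(profile, clone, ignore=ignore)
    # Report what ended up in the clone so each bisection step is easy to log.
    return sorted(p.name for p in clone.iterdir())
```

Calling it with `{"places.sqlite", "bookmarkbackups"}` first, then with other file names on later rounds, gives one clone per bisection step while the daily profile stays untouched.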
I have a similar problem on OSX in Firefox 6. Frequent beachballs, for no obvious reason. They last from 30 seconds to a minute and are very frustrating. Another family member has the same issue, also on OSX, on a totally different machine. I'm sure that replacing my profile would paper over the problem in the short term, but there's some bug which causes profiles to bloat up / bog down over time. Users should not have to do manual maintenance on their profiles every few months to keep the browser working normally. I'll attach a process snapshot from Activity Monitor.
Attached file process snapshot while beachballing (deleted) —
Process snapshot from OSX Activity Monitor during a beachball attack.
Attachment #555124 - Attachment description: process snapshot whiel beachballing → process snapshot while beachballing
This sounds like it could be bug 686025.
Yes, that bug sounds similar to what I've been experiencing at least. I've found that clearing the browser history reduces the hangs significantly for a few weeks.
Well, I can't use Firefox until this gets fixed. It's way too damned frustrating!
(In reply to Dietrich Ayala (:dietrich) from comment #18)
> This sounds like it could be bug 686025.

But that bug is practically Win7 only (the stack there is in jumplists code). If you are on Windows 7, you can try setting browser.taskbar.lists.frequent.enabled to false and tell us if that helps.
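The preference can be flipped in about:config. For a scripted alternative, here is a minimal Python sketch that appends it to the profile's user.js file; the function name `disable_frequent_jumplist` is hypothetical, and the profile path must be supplied by the caller.

```python
from pathlib import Path

# Pref name taken from the comment above; user.js uses the user_pref(...) syntax.
PREF_LINE = 'user_pref("browser.taskbar.lists.frequent.enabled", false);'


def disable_frequent_jumplist(profile: Path) -> None:
    """Append the pref to user.js unless it is already there."""
    user_js = profile / "user.js"
    existing = user_js.read_text() if user_js.exists() else ""
    if PREF_LINE in existing:
        return
    if existing and not existing.endswith("\n"):
        existing += "\n"
    user_js.write_text(existing + PREF_LINE + "\n")
```

Prefs in user.js are applied at every startup, so the line has to be removed again to undo the change.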
Feels like a slow SQL query due to history though. That might be platform invariant.
The slow query (that is not slow due to the query, but due to a broken database information) is only part of the problem.
Prompted by an email query, here are some additional notes. I haven't had the chance to narrow down the original files (which I still have saved). Out for speaking/conference right now, perhaps when I get back.

Here's what I've got so far (since the original bug report). I have been running with a fresh profile but with:
* sessionstore.js copied over (so I didn't lose my open tabs)
* logging into my same Firefox sync account (so I didn't lose passwords)
* all plugins/addons turned off

This seemed to fix it for quite some time (~a month of regular usage), however then I started to see a lot of beachball hanging again (similar # of windows/tabs open as before). Worst (most reproducible) was anytime I tried to type in a new URL in the address bar to navigate to. (This is all on Firefox 7.0.1 at this point.)

When the freezing for each new tab/window/navigation got unbearable, I tried quitting and relaunching Firefox and it seemed to fix most of that. I'm going to continue to keep Firefox running/open for days and see if the problem resurfaces.

This makes me wonder if:
a) this is history related (lots of history = slow queries? = beachball due to awesomebar attempting to suggest URLs while I'm typing)
b) there is a memory leak that has to do with history as well that is making this especially worse when Firefox has been open for some time.

More info when I get it.
Have you run the 'Places Maintenance' add-on tasks and seen if that fixes it?
Just over a week later, with "normal" daily usage as my primary browser, my Firefox profile "places.sqlite" file is back to 10.5 MB and the lock-ups (rainbow beachballs) are back and annoying. In addition, even just typing into a textarea (like this one) is subject to frequent stalls of 30 seconds or more. All of this on a recent top of the line MacBook Air 11". Who is the performance lead for Firefox, and what other higher priority bugs are there? From anecdotal experience, this problem (Firefox being laggy, slow, locking up) is the #1 reason that our users, especially power users, long term users, heavy users of Firefox are switching browsers. And on those that I can convince, deleting the places.sqlite file usually improves performance/responsiveness significantly. This should be priority fix for FF8. Going to try deleting my places.sqlite file again...
Echoing koppah, did you install the 'Places Maintenance' addon and execute all of the maintenance tasks using its preference pane? That fixed this long standing problem for me. I believe it repairs the indexes on the places db. Bug 691509 is the real issue, I believe.
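For anyone who wants to look at the database without the add-on, the following Python sketch runs generic SQLite maintenance on a copy of places.sqlite. It is not what the 'Places Maintenance' add-on does internally (that is not shown in this bug); `check_and_reindex_copy` is a hypothetical helper, and it assumes Firefox is closed so the file is not mid-write.

```python
import shutil
import sqlite3
from pathlib import Path


def check_and_reindex_copy(places_db: Path, out_copy: Path) -> list[str]:
    """Copy the database, then check and reindex the copy, never the original."""
    shutil.copy2(places_db, out_copy)
    conn = sqlite3.connect(out_copy)
    try:
        # A healthy database reports a single row containing "ok".
        rows = conn.execute("PRAGMA integrity_check").fetchall()
        # Rebuild all indexes and refresh the query planner statistics.
        conn.execute("REINDEX")
        conn.execute("ANALYZE")
        conn.commit()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Anything other than `["ok"]` in the result would point at corruption rather than just slow queries.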
And indeed, deleting my places.sqlite file again improved responsiveness greatly again. Strange thing is - it immediately came back at 10.5MB. Perhaps from sync?
(In reply to MasterLeep from comment #27)
> Echoing koppah, did you install the 'Places Maintenance' addon and execute
> all of the maintenance tasks using its preference pane? That fixed this
> long standing problem for me. I believe it repairs the indexes on the
> places db.

I have not, as while that might fix the problem for me, it's not reasonable to expect our users to do that in general (and because I've seen several users have this problem - or at least something similar that is "fixed" by deleting places.sqlite). I'd rather bear the pain and help narrow/track this down to the point where we can get a fix in the app itself.

> Bug 691509 is the real issue, I believe.

Ok I'll look into that. Thanks for the pointer.
(In reply to Tantek Çelik from comment #29)
> (In reply to MasterLeep from comment #27)
> > Echoing koppah, did you install the 'Places Maintenance' addon and execute
> > all of the maintenance tasks using its preference pane? That fixed this
> > long standing problem for me. I believe it repairs the indexes on the
> > places db.
>
> I have not, as while that might fix the problem for me, it's not reasonable
> to expect our users to do that in general (and because I've seen several
> users have this problem - or at least something similar that is "fixed" by
> deleting places.sqlite). I'd rather bear the pain and help narrow/track this
> down to the point where we can get a fix in the app itself.

Doing this is not a waste of time - it gives the developers more information. Executing the 'Expire' option is believed to fix this. If it fixes it for you, it confirms their ideas about what's causing it.

> > Bug 691509 is the real issue, I believe.
>
> Ok I'll look into that. Thanks for the pointer.

One of the devs (Marco Bonardo) is landing patches to fix this. One of those should be landed in the Beta channel soon from the looks of it, but the long-term fix will be going to Aurora.
(In reply to koppah from comment #30)
> (In reply to Tantek Çelik from comment #29)
> > (In reply to MasterLeep from comment #27)
> > > Echoing koppah, did you install the 'Places Maintenance' addon ...
> >
> > I have not, as while that might fix the problem for me, it's not reasonable
> > to expect our users to do that in general ...
>
> Doing this is not a waste of time - it gives the developers more
> information. Executing the 'Expire' option is believed to fix this. If it
> fixes it for you, it confirms their ideas about what's causing it.

Ok I can try running that next time the problem becomes noticeable.

> > > Bug 691509 is the real issue, I believe.
> >
> > Ok I'll look into that. Thanks for the pointer.
>
> One of the devs (Marco Bonardo) is landing patches to fix this. One of those
> should be landed in the Beta channel soon from the looks of it, but the
> long-term fix will be going to Aurora.

From the comments on Bug 691509 it sounds like the fixes have already landed in Beta channel and should be in the current beta (or I'm misreading "status-firefox8: fixed").

I'm switching to using Beta channel as my daily normal use browser to see if the problem returns or not. Here's hoping (it doesn't).

Thanks for the rapid follow-ups everyone.
The bug dependency was not set, so the ongoing (and fervent and concerned) work to fix this bug was not visible to the reporter. Setting that dependency now. A couple of notes: 1. The size of your places.sqlite file is not really meaningful. It is possible to have a perfect experience with 100mb places.sqlite file and a terrible one with a 1mb file. 2. The most helpful thing you can do is to try all of the suggested fixes and report your findings back to this bug. Pushing back on trying potential solutions will tell the developers nothing, while simultaneously reducing the probability that your problem will be solved by them to zero.
Status: UNCONFIRMED → NEW
Depends on: PlacesJank
Ever confirmed: true
A couple of other things to try, since places.sqlite index borkage is not the sole cause of non-responsiveness in Firefox:

1. Turn off hardware acceleration (Preferences/Advanced). I had to do this, and there are several other reports of having it on causing severe responsiveness problems. I found a graphics bug for this, but don't have it offhand.

2. I'm currently investigating a similar scenario (constant beachballing w/ low number of tabs), where the problem was that a bunch of closed tab/window data in the session was the cause. You can try fixing this by closing Firefox, deleting the sessionstore.js file in your profile, and starting again (this will lose your current tabs!). I just found this one yesterday, haven't filed a bug on it yet.
(In reply to Tantek Çelik from comment #28)
> And indeed, deleting my places.sqlite file again improved responsiveness
> greatly again. Strange thing is - it immediately came back at 10.5MB.

That's an effect of fragmentation fighting, the database grows at 10MB chunks.

(In reply to Tantek Çelik from comment #31)
> From the comments on Bug 691509 it sounds like the fixes have already landed
> in Beta channel and should be in the current beta (or I'm misreading
> "status-firefox8: fixed").

Yes it landed in beta, but I think in beta we do build refreshes every few days, so may not yet be in current beta, but it's imminent.
(In reply to Tantek Çelik from comment #26)
> In addition, even just typing
> into a textarea (like this one) is subject to frequent stalls of 30 seconds
> or more. All of this on a recent top of the line MacBook Air 11".

This is exactly the same behavior I found in the closed tabs/windows session data problem I mentioned in comment #33. How big is your sessionstore.js file?

One way to test this without losing your tabs is to bookmark all tabs (command/ctrl-shift-d), then follow the steps above for removing the session file, and then re-open that folder of bookmarks with "open all in tabs" from the Bookmarks menu. That approach will retain the loaded tab URLs, but will lose the tab histories, session cookies, form data, all the other bits of what constitutes a session.
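To answer the "how big is your sessionstore.js" question with a bit more detail, a small Python sketch like the one below can report the file size plus open and closed tab/window counts. It assumes the file is plain JSON with a top-level `windows` list, per-window `tabs` and `_closedTabs` lists, and a top-level `_closedWindows` list; that layout is an assumption about the session format, not something confirmed in this bug, and the function names are hypothetical.

```python
import json
from pathlib import Path


def summarize_session(session: dict) -> dict:
    """Count open and closed windows/tabs in parsed session data."""
    windows = session.get("windows", [])
    return {
        "windows": len(windows),
        "tabs": sum(len(w.get("tabs", [])) for w in windows),
        "closed_tabs": sum(len(w.get("_closedTabs", [])) for w in windows),
        "closed_windows": len(session.get("_closedWindows", [])),
    }


def summarize_session_file(path: Path) -> dict:
    """Same summary for a file on disk, with its size in bytes added."""
    summary = summarize_session(json.loads(path.read_text(encoding="utf-8")))
    summary["bytes"] = path.stat().st_size
    return summary
```

A large `closed_tabs` or `closed_windows` count next to a modest `tabs` count would match the closed-session-data scenario described above.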
(In reply to Marco Bonardo [:mak] from comment #34)
> Yes it landed in beta, but I think in beta we do build refreshes every few
> days, so may not yet be in current beta, but it's imminent.

It's now available in Firefox 8 beta 4 (the current one)
Tantek, have you tried the add-on? Or are you running Beta? Would really help to know if we've addressed the issue properly or if it's some other hang requiring further investigation. Also, see bug 698500 for tracking priority UI hang/responsiveness issues.
Yes, tried the add-on, couldn't tell if it did anything. Have now been running the beta for a while.

Another data point: I just left my laptop to get lunch, came back ~45 min later, and FF10.0 (beta) was locked up again. 99.3% CPU, 994 threads. Had to force quit; trying to update the beta to see if that helps.

However, as far as I can tell, no, the issue has not been addressed. This problem (assuming it is just one) is still here, and there's seemingly no reasonable path to tracking it down (see original request for a tool to export currently open tab/window state into a double-clickable file).

Hearing from web designers/devs *all the time* that "Firefox is just too slow or crashes and hangs all the time" (nearly all use Chrome as their primary development browser now as a result) makes me think I'm not the only one having problems like this, still (originally reported on FF6, still seeing this problem with FF10 beta).
Blocks: 698500
Also, just counted, 9 windows, 101 total tabs.
(In reply to Tantek Çelik from comment #38)
> Hearing from web designers/devs *all the time* that "Firefox is just too
> slow or crashes and hangs all the time" (nearly all use Chrome as their
> primary development browser now as a result) makes me think I'm not the only
> one having problems like this, still (originally reported on FF6, still
> seeing this problem with FF10 beta).

You're absolutely not the only one. The problem is widely acknowledged, and is an organizational priority. There's a massive amount of activity going on right now that is directly related to responsiveness (being generally tracked here: https://etherpad.mozilla.org/snappy).

I've contacted a couple of developers in MV to see if we can get someone to put their hands on your machine to triage what's going on in your specific case.
(In reply to Dietrich Ayala (:dietrich) from comment #40)
> You're absolutely not the only one. The problem is widely acknowledged, and
> is an organizational priority. There's a massive amount of activity going on
> right now that is directly related to responsiveness (being generally
> tracked here: https://etherpad.mozilla.org/snappy).

Thanks for the Etherpad. Read it, great stuff in there. Nothing on high CPU / high threads / rainbow beachball lockup AFAICT.

From the Etherpad I found [1]

[1] https://wiki.mozilla.org/Performance/Snappy

which at least mentions bug 698500 which I've made dependent on this bug.

> I've contacted a couple of developers in MV to see if we can get someone to
> put their hands on your machine to triage what's going on in your specific
> case.

While I very much appreciate the offer, "can get someone to put their hands on your machine" is a non-starter. I travel for standards meetings/conferences, and I'm only in Mountain View every other Monday.

The "get someone to put their hands on your machine" is also a non-starter for all the web designers/developers out there with this problem.

On neither the Etherpad nor the wiki page [1] did I see any listing/mention of:
* diagnosis tools
* diagnosis steps to take

That's what we really need, so that everyone who runs into these problems and cares can take steps in a distributed fashion to start narrowing down these problems.

I'll repeat one concrete suggestion: a tool to export currently open windows/tabs state into a double-clickable file that when opened, launches Firefox and opens up the exact same windows/tabs state. Then anyone that hangs with a set of sites/pages open can simply do the export upon restart, and then email that one file to "developers in MV" to try running.
(In reply to Tantek Çelik from comment #41)
> On neither the Etherpad nor the wiki page [1] did I see any listing/mention
> of:
> * diagnosis tools
> * diagnosis steps to take
>
> That's what we really need, so that everyone who runs into these problems
> and cares can take steps in a distributed fashion to start narrowing down
> these problems.

Yes, this is a big problem. If we relied on everybody who had a crash to manually file a bug and get somebody to investigate it, we wouldn't have a very good overall picture of what crashes are really happening. There are at least three efforts people are working on at Mozilla to address this problem.

One is the hang detector, bug 429592. It isn't live yet, but what the hang detector will do is automatically send back a report to Mozilla whenever the browser hangs for 30 seconds, similar to crash reports. We had a prototype version of this turned on for a few days in Nightly, and it found a few bad problems we were unaware of.

In addition, a number of telemetry probes have been added recently to track how long various known sources of hangs and pauses, such as slow SQL queries, take, again to figure out how bad these problems are in the broader Firefox user population.

Another is about:jank, which uses the new simple profiler system in bug 713227, to analyze smaller sources of problems. This is also in its early stages, but will help users work with developers to identify their particular problems, similar to how about:memory has been extremely useful in tracking down particular memory problems that users have.
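The general idea behind a hang detector can be illustrated with a heartbeat check. The Python sketch below is not the implementation from bug 429592; the class `HangDetector` and its methods are hypothetical, and only the 30 second threshold comes from the comment above. The event loop calls `beat()` after each event, and a separate watchdog thread polls `is_hung()`.

```python
import time


class HangDetector:
    """Flags a hang when the event loop has not reported in for `threshold` seconds."""

    def __init__(self, threshold: float = 30.0, clock=time.monotonic):
        self.threshold = threshold
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_beat = clock()

    def beat(self) -> None:
        # Called by the event loop each time it finishes handling an event.
        self.last_beat = self.clock()

    def is_hung(self) -> bool:
        # Called periodically from a watchdog thread, never from the event loop.
        return self.clock() - self.last_beat >= self.threshold
```

A real detector would also capture the stack of the stuck thread at the moment `is_hung()` first returns true, since that is the part a report needs.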
Can somebody who has a thousand threads crash their browser and post the crash report here?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #43)
> Can somebody who has a thousand threads crash their browser and post the
> crash report here?

If you don't know how to do it, just install the Nightly Tester Tools:
https://addons.mozilla.org/de/firefox/addon/nightly-tester-tools/
Depends on: 724368
(In reply to Tantek Çelik from comment #41)
> I'll repeat one concrete suggestion: a tool to export currently open
> windows/tabs state into a double-clickable file that when opened, launches
> Firefox and opens up the exact same windows/tabs state. Then anyone that
> hangs with a set of sites/pages open can simply do the export upon restart,
> and then email that one file to "developers in MV" to try running.

This would mean passing around all your passwords, session cookies, all kinds of things that no person should share. It's not a practice we want to encourage. Doing this in a way that is "safe" would be near impossible, and anyways would not be an accurate representation of the user's state when the bug was experienced.

Perhaps we could provide builds with the new profiler (or an add-on, if possible?) that users with problems like these could run, and use to submit profiling logs. I don't know if the stack information in the profiler logs is safe to pass around or not. This would require installing a special version of Firefox or an add-on at a minimum, so is still not completely push-button.

Also, I've filed a bug for getting Telemetry data for the number of threads, so we can get a better idea of how widespread your psychothreads situation is.
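A middle ground between the two positions above would be to share only the URL each tab is showing, with no cookies, form data, or history. The Python sketch below illustrates that under the assumption that parsed session data has a `windows` list, each window a `tabs` list, and each tab an `entries` list plus a 1-based `index`; that layout and the function name `current_urls` are assumptions. As noted above, a URL list does not reproduce the user's full state, and URLs themselves can carry private tokens, so the output should still be reviewed before sharing.

```python
def current_urls(session: dict) -> list[list[str]]:
    """Return, per window, the URL each tab is currently showing."""
    result = []
    for window in session.get("windows", []):
        urls = []
        for tab in window.get("tabs", []):
            entries = tab.get("entries", [])
            if not entries:
                continue
            # "index" is assumed to be 1-based; clamp it to the valid range.
            position = min(max(tab.get("index", len(entries)), 1), len(entries))
            url = entries[position - 1].get("url")
            if url:
                urls.append(url)
        result.append(urls)
    return result
```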
Tantek, I would also recommend you to install MemChaser to be able to regularly check the memory usage and garbage collection activity of Firefox: https://addons.mozilla.org/firefox/addon/memchaser/
Could you attach your sessionstore.js? Do you still see this when using a current version in safe mode?
Status: NEW → UNCONFIRMED
Ever confirmed: false
Flags: needinfo?(tantek)
Keywords: testcase-wanted
Whiteboard: [closeme 2014-03-01]
Resolved per whiteboard
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(tantek)
Resolution: --- → INCOMPLETE
Whiteboard: [closeme 2014-03-01]