Closed
Bug 723611
(VacuumManagerPragmas)
Opened 13 years ago
Closed 13 years ago
Browser hung while trying to vacuum
Categories
(Toolkit :: Telemetry, defect)
Tracking
()
RESOLVED
FIXED
mozilla13
People
(Reporter: joe, Assigned: mak)
References
(Blocks 1 open bug)
Details
(Whiteboard: [snappy:p1])
Attachments
(2 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
I walked away from my computer for a few minutes, and when I came back, my browser was hung. Some investigation shows that it's trying to vacuum my telemetry database. I presume that it's currently open in another thread.
Reporter | ||
Comment 1•13 years ago
|
||
I captured a dump that includes the heap: http://people.mozilla.org/~jdrew/telemetry-dump.7z
Reporter | ||
Comment 2•13 years ago
|
||
I should mention that I am running 2012-02-02's nightly build on Windows 7.
Comment 3•13 years ago
|
||
there is no telemetry database. Do you mean it was running vacuum while doing some telemetry queries? I suspect we can make use of the hang detector to detect sqlite contention and break it by cancelling background queries once bug 722243 is done.
Depends on: 722243
Assignee | ||
Comment 4•13 years ago
|
||
If it was doing telemetry queries on mainthread, this is likely related to bug 722242 that Paolo pushed yesterday. Btw, you exactly caught the single month day where we do vacuum to go back to your idle computer :) Apart jokes, we should do what taras said and priority the async bookmarking api as the immediate next Places project.
Reporter | ||
Comment 5•13 years ago
|
||
Ehsan and I found that sqlite was apparently working on a telemetry database: zName 0x66c85118 "telemetry-vfs" const char * Perhaps this was a red herring?
Assignee | ||
Comment 6•13 years ago
|
||
that's just the telemetry vfs we use to measure and report performances, it's a transparent layer.
Updated•13 years ago
|
Whiteboard: [snappy]
Comment 7•13 years ago
|
||
Just to check, you determined that it was vacuuming by looking at zSQL on the relevant object? Do we have any idea what the main thread was doing at this time? If it's in the stack trace attached, I don't know how to read it. It's curious that neither Joe nor I (bug 724288) had run into long hangs like this before, and we're seeing them all of a sudden.
Reporter | ||
Comment 8•13 years ago
|
||
You can actually download the .dmp file if you want to debug, but the stack I uploaded *was* the main thread.
Comment 9•13 years ago
|
||
We vacuum on the main thread? O.o Looking at the stacks again, I'm not sure that bug 724288 is a dupe. This bug's main thread is doing sIdleServiceDaily::Observe nsIdleService::TryNotifyIdleState but bug 724288 has the main thread stuck on nsNavBookmarks::RunInBatchMode (I couldn't figure out exactly what statements it was trying to run).
Assignee | ||
Comment 10•13 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #9) > We vacuum on the main thread? O.o no > Looking at the stacks again, I'm not sure that bug 724288 is a dupe. This > bug's main thread is doing > > sIdleServiceDaily::Observe > nsIdleService::TryNotifyIdleState > > but bug 724288 has the main thread stuck on nsNavBookmarks::RunInBatchMode > (I couldn't figure out exactly what statements it was trying to run). The fact is that decayFrecency has become extremely expensive, and it pushes all the async stuff down by seconds into the actual session, included vacuum. Bug 723044 will bring back the situation to what it was before, that is a small probability to have contention vs a large one.
Assignee | ||
Comment 11•13 years ago
|
||
that means, all this hang after idle bugs are just evidence of the same bug.
Comment 12•13 years ago
|
||
Okay, I think I understand bug 723044; thanks. > The fact is that decayFrecency has become extremely expensive, and it pushes all the async stuff > down by seconds into the actual session, included vacuum. I think I understand "pushing all the async stuff down by seconds", but I don't understand, "into the actual session". Why does the stack in this bug show the main thread doing a vacuum? Or is that not what the main thread stack shows?
Assignee | ||
Comment 13•13 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #12) > I think I understand "pushing all the async stuff down by seconds", but I > don't understand, "into the actual session". Sorry, I was unclear indeed. Before decayFrecency() became expensive, the possibility to have some maintenance task happen when the user comes back from idle was really small, on my system that call takes about 1 or 2 seconds. Vacuum may likely happen after this call, and if decayFrecency() takes seconds, or even minutes, the possibility for vacuum to be delayed up to when the user is back and tries to actually use the browser, is very high. > Why does the stack in this bug show the main thread doing a vacuum? Or is > that not what the main thread stack shows? Ah, this is interesting, it's not the vacuum creating contention here, rather the PRAGMAS that are run before it are contending with frecency.
Alias: VacuumManagerPragmas
Blocks: StorageJank
Updated•13 years ago
|
Whiteboard: [snappy] → [snappy:p1]
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → mak77
Status: NEW → ASSIGNED
Assignee | ||
Updated•13 years ago
|
Summary: Browser hung while trying to vacuum telemetry database → Browser hung while trying to vacuum
Assignee | ||
Comment 14•13 years ago
|
||
The problem: - synchronous pragmas executed before vacuum may contend with asynchronous queries executing on the connection What this patch does: - removes unused NotifyCallback class - removes all synchronous pragmas by simplifying page_size fixup. The database will just be fixed at its turn. - makes assertions fatal
Attachment #598026 -
Flags: review?(sdwilsh)
Comment 15•13 years ago
|
||
Comment on attachment 598026 [details] [diff] [review] patch v1.0 Review of attachment 598026 [details] [diff] [review]: ----------------------------------------------------------------- r=sdwilsh ::: storage/src/VacuumManager.cpp @@ +223,5 @@ > if (os) { > + DebugOnly<nsresult> rv = > + os->NotifyObservers(nsnull, OBSERVER_TOPIC_HEAVY_IO, > + OBSERVER_DATA_VACUUM_BEGIN.get()); > + MOZ_ASSERT(NS_SUCCEEDED(rv), "Should be able to notify"); \o/
Attachment #598026 -
Flags: review?(sdwilsh) → review+
Assignee | ||
Comment 16•13 years ago
|
||
Thanks. Someone complained we were casting to void too many result values :) https://hg.mozilla.org/integration/mozilla-inbound/rev/616f4f937da2
Target Milestone: --- → mozilla13
Comment 17•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/616f4f937da2
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•