Closed
Bug 810454
Opened 12 years ago
Closed 12 years ago
Use fadvise() to speed up cookie db load
Categories
(Core :: Networking: Cookies, defect)
Core
Networking: Cookies
Tracking
()
RESOLVED
FIXED
mozilla23
Tracking | Status | |
---|---|---|
firefox21 | --- | unaffected |
firefox22 | + | fixed |
firefox23 | --- | fixed |
People
(Reporter: jduell.mcbugs, Assigned: bugzilla)
References
(Blocks 1 open bug)
Details
(Whiteboard: c=startup_io u= p=)
Attachments
(6 files, 1 obsolete file)
(deleted),
patch
|
glandium
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
bugzilla
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
ehsan.akhgari
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details |
Taras commented in bug 810209 comment 6 about am idea for speeding up (at least B2G) cookie startup latency:
--
The way to speed up cookie startup overhead is to read in file with readahead to prime the cache. So before sqlite opens the file, open it and call fadvise(will_need), close it, then let sqlite do the job from fs-cache.
Otherwise we are reading 1-2MB of data in 32KB chunks, which results in terrible latency. This may result in up to 10x throughput improvement(depending on how shitty the that flash fs is)
--
I assume posix_fadvise() (works on Linux and OSX) is the right function to use here. Not sure if there's anything to do on Windows here.
Comment 1•12 years ago
|
||
Bug 627591 used reads with the FILE_FLAG_SEQUENTIAL_SCAN flag on Windows. Just saying'
Updated•12 years ago
|
Blocks: start-faster
Comment 2•12 years ago
|
||
http://dxr.mozilla.org/mozilla-central/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp.html?string=preload#l178 does the readahead on linux. we should probably abstract the way various platforms prepopulate the fs cache
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → aklotz
Assignee | ||
Comment 3•12 years ago
|
||
These functions are basically the stuff in xpcom/glue/standalone/nsGlueLinking*.cpp but modified and refactored so that we can just call mozilla::PreloadLib or mozilla::PreloadFile from just about anywhere.
Attachment #717374 -
Flags: review?(benjamin)
Assignee | ||
Comment 4•12 years ago
|
||
Attachment #717377 -
Flags: review?(mak77)
Comment 5•12 years ago
|
||
I'm missing some data here...
First, if anyone actually measured a win from this change, since we should not take perf improvements without an actual proof they are going to help. In some cases, in the past, perf improvements made performances worse, since nobody verified them on the real device before enabling them. So a percentage measure of the win would be really welcome here.
Second, comment 0 says we read 1-2MB of data in 32KB chunks, I understand the chunks part, but I'm not sure what is causing us to read 1-2MB from cookies.sqlite on startup and why that is not avoidable. Could someone enlighten me please?
Finally, this makes the Cookies service always preload, how does that affect performances between large and small files, is the perf benefit visible with both tiny and large databases or may it actually even make things worse in some cases?
Assignee | ||
Comment 6•12 years ago
|
||
(In reply to Marco Bonardo [:mak] (Away Mar 1) from comment #5)
> I'm missing some data here...
> First, if anyone actually measured a win from this change, since we should
> not take perf improvements without an actual proof they are going to help.
> In some cases, in the past, perf improvements made performances worse, since
> nobody verified them on the real device before enabling them. So a
> percentage measure of the win would be really welcome here.
> Second, comment 0 says we read 1-2MB of data in 32KB chunks, I understand
> the chunks part, but I'm not sure what is causing us to read 1-2MB from
> cookies.sqlite on startup and why that is not avoidable. Could someone
> enlighten me please?
> Finally, this makes the Cookies service always preload, how does that affect
> performances between large and small files, is the perf benefit visible with
> both tiny and large databases or may it actually even make things worse in
> some cases?
I've tested on Windows 7, cold startup, no SSD with a 512KB cookie file and a 1.5MB cookie file. Google was set as homepage. Used xperf for tracing file I/O:
Smaller file:
Readahead was 87% worse (avg 16.8ms -> avg 31.6ms)
- SQLite only read in 4 unique 32KB chunks in this case, and those reads were all at offsets within the first 128KB of the file. Preloading the whole file turned out to be more costly because the remainder of the file wasn't needed by SQLite during initialization.
Larger file:
Readahead was 53% better (avg 152.8ms -> avg 71.9ms)
- SQLite read 14 unique 32KB chunks with offsets scattered throughout the database file.
Here's a condensed version of the call stack leading to cookies loading (reads top to bottom):
...
xul.dll!nsXULWindow::OnChromeLoaded
xul.dll!nsXULWindow::SetVisibility
xul.dll!nsWindow::Show
...
xul.dll!nsDocShell::LoadURI
xul.dll!nsDocShell::LoadURI
xul.dll!nsDocShell::InternalLoad
xul.dll!nsDocShell::DoURILoad
xul.dll!nsDocShell::DoChannelLoad
xul.dll!nsURILoader::OpenURI
xul.dll!mozilla::net::nsHttpChannel::AsyncOpen
xul.dll!mozilla::net::HttpBaseChannel::AddCookiesToRequest
xul.dll!nsCOMPtr_base::assign_from_gs_contractid
xul.dll!nsComponentManagerImpl::GetServiceByContractID
xul.dll!nsComponentManagerImpl::CreateInstanceByContractID
xul.dll!mozilla::GenericFactory::CreateInstance
xul.dll!nsCookieService::GetSingleton
xul.dll!nsCookieService::Init
xul.dll!nsCookieService::InitDBStates
xul.dll!nsCookieService::TryInitDB
xul.dll!mozilla::storage::Service::OpenUnsharedDatabase
xul.dll!mozilla::storage::Connection::initialize
xul.dll!mozilla::storage::Connection::initializeInternal
xul.dll!mozilla::storage::Connection::executeSql
...
Comment 7•12 years ago
|
||
do we have telemetry about average cookies.sqlite size?
The difference in SQLite behavior on open is interesting, I wonder if it's actually bound to the size or rather bound to where the cookies for the homepage are found in the database. Would be interesting to check if this is stable across similar sized databases.
It may even be possible to do a telemetry experiment by enabling the preload in Nightly and seeing if there's a positive or negative effect on a dedicated telemetry measure
Reporter | ||
Comment 8•12 years ago
|
||
> I'm not sure what is causing us to read 1-2MB from
> cookies.sqlite on startup and why that is not avoidable.
We read in the entire cookie database the first time any cookie is requested.*** (see below for gory details). The idea is to reduce the latency of following requests (no I/O), since the DB is small enough to keep in memory.
> Smaller file... the remainder of the file wasn't needed
Interesting. I wonder what's in the rest of the file, since we are doing a "select *" essentially. Meta-data? Are you measuring the time for both SQL queries (the big, async one, and the host-specific sync one?--see below for what I'm talking about).
> I've tested on Windows 7, cold startup, no SSD
Would be nice to see data on a phone too, in case flash is very different.
> Small (avg 16.8ms -> avg 31.6ms) // gets worse by 15ms
> Large (avg 152.8ms -> avg 71.9ms) // gets better by 80ms
If these results are comparable on cellphones, I'd be inclined to take this patch (I'm not a cookie peer though, so not my call). I assume we care more about reducing the high latency for large DB than adding a few ms in the smaller case. OTOH it's more code complexity for a not-huge win on a one-time startup cost, and if we switch to initializing the cookie service earlier than the first network request (as we have already for B2G: see bug 810209) the delay isn't blocking anything in the common case.
> do we have telemetry about average cookies.sqlite size?
We don't appear to have any telemetry about cookies :( The cookie DB stores 3000 cookies, max, and since we currently don't aggressively purge old cookies (we just evict as needed to keep 3000 total), I expect many users start to approach that over time. Each cookie can be up to 4K, so there's a logical max of 12MB, but it's generally much smaller. A couple data points: my home Firefox profile has 2756 cookie and cookies.sqlite is 2 MB. My work firefox profile has 1039 cookies and the DB is 512KB.
***To be precise, what we do is on 1st cookie request is 1) issue an async SQLite read for the whole cookie DB, then 2) issue a blocking SQLite read just for the specific domain being asked for. When I last talked to bent about this, it was unclear that SQLite even does the right thing with this approach (it sounds like the "async" query may actually hold a mutex that blocks the sync query until it's done). I'm attaching an IRC chat where I talked with bent about this, but I haven't had time to look into measuring what's actually going on.
Comment 9•12 years ago
|
||
We can land this as a telemetry trial to see which approach allows the database to be read in faster
Comment 10•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #8)
> If these results are comparable on cellphones, I'd be inclined to take this
> patch (I'm not a cookie peer though, so not my call). I assume we care more
> about reducing the high latency for large DB than adding a few ms in the
> smaller case.
Yes, it's plausible and honestly I expect most DBs being in the > 1MB situation. I think it's worth experimenting, though changing an API to experiment doesn't sound like a good thing. And I'm not sure it's a good thing to do this at the Storage level as well considered the results may vary depending on the DB size and it's quite unclear for the API consumer if it's a positive or a negative change (we may end up with add-ons trying to preload 50MB DBs). May we do the fadvise at the nsCookieService::TryInitDB level instead? Since it's a really specific request for this Db and is basically a oneliner looks like a simpler approach for now, we can evaluate moving it into Storage in future once we find more consumers and we have some good data.
This would allow simple Nightly testing to collect Telemetry.
> ***To be precise, what we do is on 1st cookie request is 1) issue an async
> SQLite read for the whole cookie DB, then 2) issue a blocking SQLite read
> just for the specific domain being asked for.
I have to verify the current code, but I suspect this is bad, it should probably happen across 2 separate connections (one for async requests and one for sync requests).
This problem is not really different from the localStorage problem, where the API is synchronous but we'd like to avoid synchronous IO.
(In reply to Taras Glek (:taras) from comment #9)
> We can land this as a telemetry trial to see which approach allows the
> database to be read in faster
Yes, this would be really good, implement the fadvise() call in nsCookieService and add telemetry specific to measure Cookies startup.
Comment 11•12 years ago
|
||
Comment on attachment 717374 [details] [diff] [review]
Part 1 - Generic Preload Functions
If you're just copying the code from nsGlueLinking, why not consolidate it in this file and remove the other code? You can compile FileUtils.cpp into the standalone glue using some ifdefs to skip the fallocate function (which depends on NSPR).
I'm not a great person to review this, glandium would probably be best.
Attachment #717374 -
Flags: review?(benjamin) → review?(mh+mozilla)
Comment 12•12 years ago
|
||
Comment on attachment 717374 [details] [diff] [review]
Part 1 - Generic Preload Functions
I agree with Benjamin, this would be better shared with the standalone glue code.
Attachment #717374 -
Flags: review?(mh+mozilla) → review-
Comment 13•12 years ago
|
||
Comment on attachment 717377 [details] [diff] [review]
Part 2 - Adds PRELOAD_DATABASE_FILE flag to storage service, modifies cookies to use it
Per my previous comment, I think we need a less invasive patch, with telemetry bits too.
Attachment #717377 -
Flags: review?(mak77)
Assignee | ||
Comment 14•12 years ago
|
||
This is a temporary patch for nightly so that we can gather some data about how well readahead works when loading the cookie DB.
Attachment #721328 -
Flags: review?(ehsan)
Comment 15•12 years ago
|
||
Comment on attachment 721328 [details] [diff] [review]
A/B test for readahead in cookie service
Review of attachment 721328 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/cookie/nsCookieService.cpp
@@ +782,5 @@
> }
>
> + Telemetry::ID histID;
> + TimeStamp start = TimeStamp::Now();
> + if (rand() % 2) {
Please add a comment here to mention this is a poor man's solution for doing A/B testing... :(
@@ +796,5 @@
> rv = mStorageService->OpenUnsharedDatabase(mDefaultDBState->cookieFile,
> getter_AddRefs(mDefaultDBState->dbConn));
> NS_ENSURE_SUCCESS(rv, RESULT_RETRY);
> + Telemetry::AccumulateDelta_impl<Telemetry::Millisecond>::compute(histID,
> + start);
Nit: put everything on the same line here.
Attachment #721328 -
Flags: review?(ehsan) → review+
Assignee | ||
Comment 16•12 years ago
|
||
Fixed nits, carrying forward r+
Attachment #721328 -
Attachment is obsolete: true
Attachment #721823 -
Flags: review+
Assignee | ||
Comment 17•12 years ago
|
||
Assignee | ||
Comment 18•12 years ago
|
||
Marking [leave open] since this is a telemetry trial
Whiteboard: [leave open]
Comment 19•12 years ago
|
||
Updated•12 years ago
|
Whiteboard: [leave open] → [leave open] c=startup_io u= p=
Assignee | ||
Comment 20•12 years ago
|
||
An analysis of a week's worth of telemetry data (~2.3 million samples) showed that, while readahead made the cookie db open more expensive, it paid itself off (and then some) when it came to actually reading in the cookie DB. The combined time to open and read the cookie DB dropped by 30% with readahead enabled.
This patch removes the A/B test and leaves readahead enabled.
Attachment #731231 -
Flags: review?(ehsan)
Assignee | ||
Comment 21•12 years ago
|
||
Assignee | ||
Comment 22•12 years ago
|
||
Here are the numbers that were generated by the pydoop script. The date range was March 20-27.
open_avg 13.9539846438
read_avg 34.2212878325
readahead_open_avg 23.9083075127
readahead_read_avg 10.0726330155
Updated•12 years ago
|
Attachment #731231 -
Flags: review?(ehsan) → review+
Assignee | ||
Comment 23•12 years ago
|
||
Comment on attachment 731231 [details] [diff] [review]
Remove A/B test and leave readahead enabled
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 810454, A/B test for readahead of cookie db
User impact if declined: Feature will only be enabled roughly 50% of the time.
Testing completed (on m-c, etc.): Feature has been tested on m-c for nearly a month; this patch removes the A/B test aspect.
Risk to taking this patch (and alternatives if risky): None
String or IDL/UUID changes made by this patch: None
Attachment #731231 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 24•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Whiteboard: [leave open] c=startup_io u= p= → c=startup_io u= p=
Comment 25•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Updated•12 years ago
|
Attachment #731231 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•12 years ago
|
status-firefox21:
--- → unaffected
status-firefox22:
--- → affected
status-firefox23:
--- → fixed
tracking-firefox22:
--- → +
Comment 26•12 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•