Closed Bug 547031 Opened 15 years ago Closed 14 years ago

Improve async error handling in cookies

Tracking

()

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

blocking2.0

---

betaN+

People

(Reporter: dwitte, Assigned: dwitte)

References

Details

Attachments

(6 files, 1 obsolete file)

part 1: multiple DBStates 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review
part 2: TryInitDB() cleanup 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review
part 3: get sync db connection early 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review
part 4: fix EnsureReadComplete() 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review
part 5: fix observers 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review
part 6: async error handling 14 years ago dwitte@gmail.com (deleted), patch		Details \| Diff \| Splinter Review
part 6 v2: async error handling 14 years ago dwitte@gmail.com (deleted), patch	sdwilsh : review+	Details \| Diff \| Splinter Review

dwitte@gmail.com

Assignee

Description

•

15 years ago

+++ This bug was initially created as a clone of Bug #536978 +++ Now that we're about to have spiffy new error handlers for async statement execution, we should do something sensible with them. For instance: 1) On corruption errors, drop the table and repopulate the entire database from memory. 2) On filesystem read/write errors, and other such low-level things, we should probably close our database connection and run entirely from memory. We may try rebuilding the database, but there's only so much we can do. 3) On "file already open" errors (i.e. an AV scanner messing with us), just close the connection and forget about it. Trying to delete or rename the file is likely to fail, and at that point, it's Not Our Problem. 4) On consistency failures (such as "row doesn't exist"), we should probably do 1). This really should never happen, because this indicates a logic error in the code. We certainly want to assert and log it clearly.

dwitte@gmail.com

Assignee

Updated

•

15 years ago

Assignee: nobody → dwitte

dwitte@gmail.com

Assignee

Comment 1

•

14 years ago

I think we want this for 2.0, since (right now) we have less error handling than we used to, and we've definitely seen issues of database corruption in previous releases.

blocking2.0: --- → ?

dwitte@gmail.com

Assignee

Comment 2

•

14 years ago

(I'd recommend betaN+ since we need some good testing on this.)

Johnny Stenback (:jst)

Comment 3

•

14 years ago

BetaN+, due to the data loss nature of this.

blocking2.0: ? → betaN+

dwitte@gmail.com

Assignee

Comment 4

•

14 years ago

Attached patch part 1: multiple DBStates (deleted) — Details — Splinter Review

This makes it so DBStates are refcounted, allowing the default state to change. Firefox may not support switching profiles, but we at least need to not horribly break if one does so with Gecko. Without this, any kind of asynchronous error handling would be prone to complete failure -- we could end up operating on the database from a different profile. This doesn't really introduce any more complexity, it's just a bunch of refactoring.

Attachment #487811 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 5

•

14 years ago

Attached patch part 2: TryInitDB() cleanup (deleted) — Details — Splinter Review

This makes TryInitDB() much more explicit about what the various failure modes are. Basically, any error opening the database or on subsequent operations will result in throwing away the database and retrying. It also makes it so we rename a corrupt database, rather than throwing it away. I don't have a strong opinion on this... it's certainly useful in some cases to have that database around (it's helped us diagnose migration bugs, for instance), but it's also a privacy concern to have orphaned data on disk. My rationale here is that it should happen so rarely that the advantage of having it for debugging purposes outweighs the downside of having orphaned data.

Attachment #487812 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 6

•

14 years ago

Attached patch part 3: get sync db connection early (deleted) — Details — Splinter Review

GetSyncDBConn() can fail with SQLITE_BUSY if a write transaction (e.g. the DELETE kicked off from Read(), or a regular INSERT/DELETE/UPDATE operation) is in progress. If we do it earlier, we're fine. WAL saves us from the possibility of those writes blocking readers, so the read statements themselves shouldn't fail to execute.

Attachment #487813 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 7

•

14 years ago

Attached patch part 4: fix EnsureReadComplete() (deleted) — Details — Splinter Review

Found a silly bug where we don't include a 'WHERE baseDomain NOTNULL" clause in the SELECT statement, like we do in Read(). We kick off a DELETE of those rows in Read(), but the SELECT in EnsureReadComplete() could execute first, and we really don't want those rows showing up there.

Attachment #487815 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 8

•

14 years ago

Those are the prelude patches to the actual meat. I've got the final part written, but I'm still writing tests to prove it works. (There are a ton of cases to test.)

dwitte@gmail.com

Assignee

Comment 9

•

14 years ago

Attached patch part 5: fix observers (deleted) — Details — Splinter Review

Don't know why I didn't see crashes earlier, but reentrancy on observers (which we do in our tests) breaks horribly right now. This makes it so we fire notifications after all list mutations are complete, such that the observers may mutate at leisure. I dropped the "cookie-db-read" notification from EnsureReadComplete() because there's no way to make it safe -- it's called from a bunch of places itself, for which list mutation would be bad. And we don't need it for tests, because it's the synchronous case.

Attachment #488125 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 10

•

14 years ago

Attached patch part 6: async error handling (obsolete) (deleted) — Details — Splinter Review

This actually implements the logic to rebuild the database. I've played it conservative and made it so that, if we fail during an async read or a full sync read (i.e. EnsureReadComplete()), we throw away all the results. There's likely good data to be had, but we don't know which hosts we have complete data for and which we don't. Anything read previously via EnsureReadDomain() will be kept. As an added bonus, this is the final patch, and the whole patchqueue passes tests on try!

Attachment #488903 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Comment 11

•

14 years ago

Attached patch part 6 v2: async error handling (deleted) — Details — Splinter Review

... and this one's actually the right one.

Attachment #488903 - Attachment is obsolete: true

Attachment #488930 - Flags: review?(sdwilsh)

Attachment #488903 - Flags: review?(sdwilsh)

dwitte@gmail.com

Assignee

Updated

•

14 years ago

Attachment #488930 - Attachment is patch: true

Attachment #488930 - Attachment mime type: application/octet-stream → text/plain

Shawn Wilsher :sdwilsh

Comment 12

•

14 years ago

Comment on attachment 487811 [details] [diff] [review] part 1: multiple DBStates >+++ b/extensions/cookie/test/unit/test_cookies_privatebrowsing.js >@@ -0,0 +1,136 @@ >+/* Any copyright is dedicated to the Public Domain. >+ * http://creativecommons.org/publicdomain/zero/1.0/ >+ */ Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/ >+ // Create URIs and channels pointing to foo.com and bar.com. >+ let spec1 = "http://foo.com/foo.html"; >+ let spec2 = "http://bar.com/bar.html"; >+ let uri1 = NetUtil.newURI(spec1); >+ let uri2 = NetUtil.newURI(spec2); You don't create any channels, and there's no need to store the specs in a local. Please update the comment and get rid of the local spec variables. >+++ b/netwerk/cookie/nsCookieService.cpp > // cache frequently used statements (for insertion, deletion, and updating) >- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( >+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( > "INSERT INTO moz_cookies (" This is only used asynchronously, right? Use CreateAsyncStatement please. >- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( >+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( > "DELETE FROM moz_cookies " ditto >- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( >+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING( > "UPDATE moz_cookies SET lastAccessed = :lastAccessed " ditto > // finalize our statements and close the db connection for the default state. > // since we own these objects, nulling the pointers is sufficient here. We don't do any finalizing anymore, so this comment is wrong. >+ // Null out the db. If it's open, this will synchronously close it. >+ mDefaultDBState->dbConn = NULL; This will fail if you've executed any async statements on it, I'm pretty sure. >+ if (mDBState && mDBState->dbConn && >+ !nsCRT::strcmp(aData, NS_LITERAL_STRING("shutdown-cleanse").get())) { >+ // Clear the cookie db if we're in the default DBState. As I recall, we don't need the nsCRT:: anymore. > } else if (NS_LITERAL_STRING(NS_PRIVATE_BROWSING_LEAVE).Equals(aData)) { >+ NS_ASSERTION(mDefaultDBState, "don't have a default state"); >+ NS_ASSERTION(mDBState == mPrivateDBState, "not in private state"); >+ NS_ASSERTION(!mPrivateDBState->dbConn, "private DB connection not null"); >+ >+ // Clear the private DBState, and restore the default one. >+ mPrivateDBState = NULL; >+ mDBState = mDefaultDBState; Hmm, why don't we just call Observe in InitDBStates when private browsing is enabled like we used to? > NS_IMETHODIMP > nsCookieService::RemoveAll() > { > RemoveAllFromMemory(); > > // clear the cookie file > if (mDBState->dbConn) { >- NS_ASSERTION(mDBState == &mDefaultDBState, "not in default DB state"); >+ NS_ASSERTION(mDBState == mDefaultDBState, "not in default DB state"); Wouldn't it be legal to call this while in private browsing mode though? > void > nsCookieService::AsyncReadComplete() > { > // We may be in the private browsing DB state, with a pending read on the > // default DB state. (This would occur if we started up in private browsing > // mode.) As long as we do all our operations on the default state, we're OK. >+ NS_ASSERTION(mDefaultDBState->pendingRead, "no pending read"); >+ NS_ASSERTION(mDefaultDBState->readListener, "no read listener"); Should also assert mDefaultDBState here, correct? > void > nsCookieService::CancelAsyncRead(PRBool aPurgeReadSet) > { > // We may be in the private browsing DB state, with a pending read on the > // default DB state. (This would occur if we started up in private browsing > // mode.) As long as we do all our operations on the default state, we're OK. >+ NS_ASSERTION(mDefaultDBState->pendingRead, "no pending read"); >+ NS_ASSERTION(mDefaultDBState->readListener, "no read listener"); ditto >+ rv = mDefaultDBState->stmtReadDomain->BindUTF8StringByIndex(0, aBaseDomain); Would prefer to see BindUTF8StringByName here >+++ b/netwerk/cookie/nsCookieService.h >+ void CheckPrivateBrowsing(); This isn't implemented. r=sdwilsh

Attachment #487811 - Flags: review?(sdwilsh) → review+

Shawn Wilsher :sdwilsh

Comment 13

•

14 years ago

> >+++ b/extensions/cookie/test/unit/test_cookies_privatebrowsing.js > >@@ -0,0 +1,136 @@ > >+/* Any copyright is dedicated to the Public Domain. > >+ * http://creativecommons.org/publicdomain/zero/1.0/ > >+ */ > Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/ This applies to all your patches here. I won't comment further about it.

Shawn Wilsher :sdwilsh

Comment 14

•

14 years ago

Comment on attachment 487812 [details] [diff] [review] part 2: TryInitDB() cleanup >+++ b/netwerk/cookie/nsCookieService.cpp >+ // Attempt to open and read the database. If TryInitDB() returns false, we >+ // retry; if it returns true, then things either failed catastrophically >+ // (with no further action required, including cleanup) or succeeded fine. This comment is most certainly not correct given: >+ OpenDBResult result = TryInitDB(false); >+// Attempt to open and read the database. If 'aRecreateDB' is true, try to >+// move the existing database file out of the way and create a new one. Return >+// RESULT_OK if opening or creating the database succeeded; RESULT_RETRY if >+// the database cannot be opened, is corrupt, or some other failure occurred >+// that might be resolved by recreating the database; or RESULT_FAILED if there >+// was an unrecoverable error and we must run without a database. If >+// RESULT_RETRY or RESULT_FAILED is returned, the caller should perform cleanup >+// of the default DBState. This belongs in the header file in a javadoc style comment. r=sdwilsh

Attachment #487812 - Flags: review?(sdwilsh) → review+

Shawn Wilsher :sdwilsh

Comment 15

•

14 years ago

(In reply to comment #12) > > // finalize our statements and close the db connection for the default state. > > // since we own these objects, nulling the pointers is sufficient here. > We don't do any finalizing anymore, so this comment is wrong. This is addressed in part 3, so don't worry about it.

Shawn Wilsher :sdwilsh

Comment 16

•

14 years ago

Comment on attachment 487813 [details] [diff] [review] part 3: get sync db connection early r=sdwilsh

Attachment #487813 - Flags: review?(sdwilsh) → review+

Shawn Wilsher :sdwilsh

Comment 17

•

14 years ago

Comment on attachment 487815 [details] [diff] [review] part 4: fix EnsureReadComplete() r=sdwilsh

Attachment #487815 - Flags: review?(sdwilsh) → review+

Shawn Wilsher :sdwilsh

Comment 18

•

14 years ago

Comment on attachment 488125 [details] [diff] [review] part 5: fix observers r=sdwilsh

Attachment #488125 - Flags: review?(sdwilsh) → review+

Shawn Wilsher :sdwilsh

Comment 19

•

14 years ago

Comment on attachment 488930 [details] [diff] [review] part 6 v2: async error handling r=sdwilsh

Attachment #488930 - Flags: review?(sdwilsh) → review+

dwitte@gmail.com

Assignee

Comment 20

•

14 years ago

(In reply to comment #12) > Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/ But it's the same as the PD boilerplate already! > This is only used asynchronously, right? Use CreateAsyncStatement please. Done! > >+ // Null out the db. If it's open, this will synchronously close it. > >+ mDefaultDBState->dbConn = NULL; > This will fail if you've executed any async statements on it, I'm pretty sure. Yeah, that's OK. We never do that, and if we did, Connection::Close would assert all over the place. > As I recall, we don't need the nsCRT:: anymore. Hmm? I thought we did for PRUnichar* strings. > Hmm, why don't we just call Observe in InitDBStates when private browsing is > enabled like we used to? I thought it was a little obfuscated to call Observe when it's clearer just to explicitly write out those two lines, but I'm not really fussed either way. > > NS_IMETHODIMP > > nsCookieService::RemoveAll() > > { > > RemoveAllFromMemory(); > > > > // clear the cookie file > > if (mDBState->dbConn) { > >- NS_ASSERTION(mDBState == &mDefaultDBState, "not in default DB state"); > >+ NS_ASSERTION(mDBState == mDefaultDBState, "not in default DB state"); > Wouldn't it be legal to call this while in private browsing mode though? Yes, but it's guarded by mDBState->dbConn, which is always null in PB. All other comments addressed.

dwitte@gmail.com

Assignee

Comment 21

•

14 years ago

(In reply to comment #20) > > This will fail if you've executed any async statements on it, I'm pretty sure. (The comment above that code wasn't very helpfully explanatory, I've made it clearer.)

Shawn Wilsher :sdwilsh

Comment 22

•

14 years ago

(In reply to comment #20) > But it's the same as the PD boilerplate already! Yours is three lines, and the boilerplate is two. Subtle, sure, but if gerv ever needs to go through a reformat stuff, it's helpful to have it be copied and pasted verbatim from the boilerplate page. > Hmm? I thought we did for PRUnichar* strings. Pretty sure it just works for both. Try it please.

dwitte@gmail.com

Assignee

Comment 23

•

14 years ago

(In reply to comment #22) > Yours is three lines, and the boilerplate is two. I changed all the files in extensions/cookie/test/unit. > Pretty sure it just works for both. Try it please. Nope. You're thinking 'wcscmp', but we don't use that, and I don't think it's portable across platforms (though it may be by now!). 'strcmp' is part of libc and is declared 'const char*', and C doesn't allow function name overloading.

dwitte@gmail.com

Assignee

Comment 24

•

14 years ago

All done. http://hg.mozilla.org/mozilla-central/rev/bbf2de2368f4 http://hg.mozilla.org/mozilla-central/rev/12336d312ad1 http://hg.mozilla.org/mozilla-central/rev/adcc4d2f4070 http://hg.mozilla.org/mozilla-central/rev/09bb5890500d http://hg.mozilla.org/mozilla-central/rev/3e8a89a8be56 http://hg.mozilla.org/mozilla-central/rev/8d5d68434171 http://hg.mozilla.org/mozilla-central/rev/97c3302f124c http://hg.mozilla.org/mozilla-central/rev/0525032a59a3

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

timeless

Updated

•

14 years ago

Depends on: 612072

You need to log in before you can comment on or make changes to this bug.