Closed Bug 1339760 Opened 8 years ago Closed 8 years ago

Break down "nsUrlClassifierDBServiceWorker::ApplyUpdate" and synchronously run update code on update thread

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox55 --- fixed

People

(Reporter: hchang, Assigned: hchang)

References

Details

Attachments

(2 files, 3 obsolete files)

Most of the nsUrlClassifierDBServiceWorker member functions are designed to be run on the same worker thread. In order to offload independent tasks to the update thread, we have to break down the current update code before doing Bug 1339050. This bug will refactor the "update codes" in nsUrlClassifierDBServiceWorker and identify which critical pieces MUST run on which threads. Furthermore, we will actually run them on the correct threads. For example, functions for "building new tables in a temp directory/memory" will run on update thread; functions for "swapping in new tables" will run on worker thread. However, even though we offload the update tasks to the update thread, the update tasks will run synchronously to the worker thread, which means no additional concurrency is introduced. Bug 1339050 will try to add the concurrency and deal with racing issue.
Assignee: nobody → hchang
Blocks: 1339050
Attachment #8837529 - Attachment is obsolete: true
Target Milestone: --- → mozilla53
Attachment #8839846 - Flags: review?(gpascutto)
Attachment #8839846 - Flags: review?(francois)
What the patch does: 1) In DBServiceWorker::FinishUpdate(), instead of directly Classifier::ApplyUpdate(), we **synchronously** move the tasks to the update thread. 2) After the task running on the update thread is done, we then call Classifier::SwapInNewTablesAndCleanup() on the "worker" thread (It was called in at the end of Classifier::ApplyUpdate() but is called by DBServiceWorker.) Note that |SwapInNewTablesAndCleanup| is a mutating work for in-use DBs so it has to be done on the worker thread. 3) That's it! The rest of changes are for the udpate thread life cycle.
Comment on attachment 8839846 [details] Bug 1339760 - Split update process to background/foreground and run background on update thread **synchronously**. https://reviewboard.mozilla.org/r/114430/#review115908 ::: toolkit/components/url-classifier/Classifier.cpp:639 (Diff revision 1) > Classifier::SwapInNewTablesAndCleanup() > { > nsresult rv; > > + bool updatingDirExists = false; > + rv = mUpdatingDirectory->Exists(&updatingDirExists); It seems more normal to check this in SwapDirectoryContent (which can't work anyway if this is false), but perhaps that is tricky if you want the NS_OK on failure? This looks pretty hackish, though. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:643 (Diff revision 1) > + LOG(("Bulding new tables...")); > + mozilla::SyncRunnable::DispatchToThread(gDbUpdateThread, r); > + LOG(("Done bulding new tables. Try to commit them.")); > + > + return MaybeCommitNewTables(buildRv); > +#else Remove the #ifdef and the disabled part before committing. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:698 (Diff revision 1) > +{ > + // We've either > + // 1) failed starting a download stream > + // 2) succedded in starting a download stream but failed to obtain > + // table updates > + // 3) succedded in obtaining table updates but failed to build new typo: succeeded (and a few times below) ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:699 (Diff revision 1) > + // We've either > + // 1) failed starting a download stream > + // 2) succedded in starting a download stream but failed to obtain > + // table updates > + // 3) succedded in obtaining table updates but failed to build new > + // talbes. typo: tables ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:758 (Diff revision 1) > > nsresult > -nsUrlClassifierDBServiceWorker::ApplyUpdate() > +nsUrlClassifierDBServiceWorker::BuildNewTables() > { > - LOG(("nsUrlClassifierDBServiceWorker::ApplyUpdate()")); > + // ***** Please be very careful while adding tasks. ***** > + // ***** It will run on the update thread. ***** "Be very careful" is not useful. What should we be careful about? What can go wrong? ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1561 (Diff revision 1) > rv = NS_NewNamedThread("URL Classifier", &gDbBackgroundThread); > if (NS_FAILED(rv)) > return rv; > > + // Start the update thread. > + rv = NS_NewNamedThread("SB DB Update", &gDbUpdateThread); "SB DB" -> "SafeBrowsing DB". Else people debugging won't know what it is.
Attachment #8839846 - Flags: review?(gpascutto) → review+
Comment on attachment 8839846 [details] Bug 1339760 - Split update process to background/foreground and run background on update thread **synchronously**. https://reviewboard.mozilla.org/r/114430/#review115908 > It seems more normal to check this in SwapDirectoryContent (which can't work anyway if this is false), but perhaps that is tricky if you want the NS_OK on failure? > > This looks pretty hackish, though. It is inheritedly tricky and hackish from [1], which is a long-standing special case in Classifier::ApplyUpdate. If we hit the case, nothing will be done and the update should be treated successful. From DBService point of view, if we hit [1], Classifier::ApplyUpdate() will return |NS_OK| and DBService should then call Classifier::SwapInNewTablesAndCleanup() and notify the observer the result. Without this hashish change, Classifier::SwapInNewTablesAndCleanup() will some error like "directory not existed". To make it look less hackish, we can either 1) Let Classifier::ApplyUpdate return nsresult along with a boolean to indicate if "swap" is required. 2) Copy everything no matter we have an actual update so that Classifier::SwapInNewTablesAndCleanup() will not fail in case [1]. [1] http://searchfox.org/mozilla-central/rev/60ae6514e4c559c0c234f0e7aefccb101b8beb2e/toolkit/components/url-classifier/Classifier.cpp#683
Attachment #8838520 - Attachment is obsolete: true
Attachment #8839098 - Attachment is obsolete: true
I did a non-trivial change to the r+'ed patch for addressing the review comments and redefining the update process to "background" and "foreground" part: 1) Regarding the "hackish" issue in the review comment, since it cannot be handled gracefully "inside" Classifier::ApplyUpdates, I decided to handle it from the caller side. That is, just don't call Classifier::ApplyUpdates if we have nothing to update. (check if mTableUpdates.IsEmpty()) 2) For many reasons, I preserve the semantics of Classifier::ApplyUpdates, which is to completely do the update job. After it returns, the update will be done. (including swapping in the updated tables and cleaning up the update intermediaries.) Then I added two functions: Classifier::ApplyUpdatesBackground and ApplyUpdateForeground. "Background" means everything in use will be guaranteed unchanged so we can do the background update concurrently with any other operation. "Foreground" part will either 1) Take the newly built tables or 2) reset the table which cannot be updated (if the failure is not caused by OOM) according to the background update result. The update intermediaries will be removed in the foreground part as well. [1] 3) Based on (2), Classifier::ApplyUpdates() will be simply a call of ApplyUpdatesBackground() followed by ApplyUpdatesForeground(). Since the semantics remain, the test case doesn't need to be changed. 4) We further split DBServiceWorker::ApplyUpdate to ApplyUpdates[Back/Fore]round. ("s" is added for consistency) The background part will be sent to "update thread" and the foreground part will be done on the worker thread (actually the worker thread is named "background thread" maybe I need a bug to rename it later.) The advantage of this big change (mostly the renaming work) is we don't need to worry about handling the update failure. The failed update table name will be sent to worker thread (foreground) to be reset. Besides, the background/foreground concept is clear to the reader that which part can or cannot, would or wouldn't do something. [1] Note that ideally the "foreground" part is better of NOT removing the update intermediaries. Imagine we have a back-to-back update while the "foreground" is removing update intermediaries... However, the update task will not begin until the previous update is done. This is controlled by the interaction between StreamUpdater and DBService.
Hi gcp, Would you like to review again in regard to the changes described in comment 12? Besides, while working on asynchronously applying update on the update thread, I found there's a out-of-band use of nsIUrlClassifierDBService for addMozEntries [1] but it will NOT be an issue until we asynchronously dispatch to update thread. [1] http://searchfox.org/mozilla-central/rev/546a05fec017cb785fca62a5199d42812a6a1fd3/toolkit/components/url-classifier/SafeBrowsing.jsm#448
Flags: needinfo?(gpascutto)
BTW, I will not try to land this patch until the coming merge day (3/6).
Comment on attachment 8839846 [details] Bug 1339760 - Split update process to background/foreground and run background on update thread **synchronously**. https://reviewboard.mozilla.org/r/114430/#review119236 ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:651 (Diff revision 4) > + mozilla::SyncRunnable::DispatchToThread(gDbUpdateThread, r); > + LOG(("Done bulding new tables. Try to commit them.")); > + > + return ApplyUpdatesForeground(backgroundRv, failedTableName); > +#else > + // The following is what it would look like if we run ApplyUpdatesBackground() As said, remove this and the #if 1 before committing. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:698 (Diff revision 4) > +nsresult > +nsUrlClassifierDBServiceWorker::NotifyUpdateObserver(nsresult aUpdateStatus) > +{ > + // We've either > + // 1) failed starting a download stream > + // 2) succedded in starting a download stream but failed to obtain same typo again
I think this rework makes the code easier to understand, so yay to that.
Flags: needinfo?(gpascutto)
Comment on attachment 8839846 [details] Bug 1339760 - Split update process to background/foreground and run background on update thread **synchronously**. https://reviewboard.mozilla.org/r/114430/#review119408 ::: toolkit/components/url-classifier/Classifier.h:76 (Diff revision 4) > nsresult ApplyUpdates(nsTArray<TableUpdate*>* aUpdates); > > /** > + * The "background" part of ApplyUpdates. Once the background update > + * is called, the foreground update has to be called along with the > + * background result no matter the background update is successful or not. "...no matter _whether_ the background..." ::: toolkit/components/url-classifier/Classifier.h:83 (Diff revision 4) > + nsresult ApplyUpdatesBackground(nsTArray<TableUpdate*>* aUpdates, > + nsACString& aFailedTableName); > + > + /** > + * The "foreground" part of ApplyUpdates. The in-use data (in-memory and > + * on-disk) will be touched so this MUST be mutual exclusive to other "mutually exclusive" ::: toolkit/components/url-classifier/Classifier.h:84 (Diff revision 4) > + nsACString& aFailedTableName); > + > + /** > + * The "foreground" part of ApplyUpdates. The in-use data (in-memory and > + * on-disk) will be touched so this MUST be mutual exclusive to other > + * memeber functions. typo: member ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:617 (Diff revision 4) > { > if (gShuttingDownThread) { > return NS_ERROR_NOT_INITIALIZED; > } > > + NS_ASSERTION(!mProtocolParser, "Should have been nulled out in FinishStream() " This is a good thing to check but it should be a `MOZ_ASSERT()`. I don't think `NS_ASSERTION` is meant to be used in new code. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1613 (Diff revision 4) > return rv; > > + // Start the update thread. > + rv = NS_NewNamedThread(NS_LITERAL_CSTRING("SafeBrowsing DB Update"), > + &gDbUpdateThread); > + if (NS_FAILED(rv)) nit: braces around the `if`
Attachment #8839846 - Flags: review?(francois) → review+
Status: NEW → ASSIGNED
Priority: -- → P2
Pushed by hchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a1dfb38c49b0 Split update process to background/foreground and run background on update thread **synchronously**. r=francois,gcp
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Depends on: 1467581
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: