Closed Bug 1315097 Opened 8 years ago Closed 8 years ago

Race condition between updates to the Mozilla lists and ClassifyLocalWithTables() leads to deadlock

Categories

(Toolkit :: Safe Browsing, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox51 - unaffected
firefox52 + fixed
firefox53 --- fixed

People

(Reporter: francois, Assigned: hchang)

References

Details

(Keywords: regression)

Attachments

(3 files)

I am seeing frequent hangs on Nightly (at least once a week, usually once a day). Here's an example using Nightly 2016-11-02 on Linux64: listmanager: 13:17:47 GMT-0700 (PDT): checkForUpdates with https://shavar.services.mozilla.com/downloads?client=navclient-auto-ffox&appver=52.0a&pver=2.2 listmanager: 13:17:47 GMT-0700 (PDT): this.tablesData: { "googpub-phish-proto": { "updateUrl": "https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=[trimmed-google-api-key]", "gethashUrl": "https://safebrowsing.googleapis.com/v4/fullHashes:find?$req=%REQUEST_BASE64%&$ct=application/x-protobuf&key=[trimmed-google-api-key]", "provider": "google4" }, ... } listmanager: 13:17:47 GMT-0700 (PDT): existing chunks: test-trackwhite-simple;a:1 mozstd-trackwhite-digest256;a:1472162826 goog-badbinurl-shavar;a:111809-114434:s:114307-114314,114316-114319... test-track-simple;a:1-2 base-track-digest256;a:1471874828 test-malware-simple;a:1 goog-malware-shavar;a:242837-251875:s:238338-238711,238713-239044,239046-239130... goog-unwanted-shavar;a:63014-80389:s:56502,56507,56509,56511-56513... goog-phish-shavar;a:458907-463758:s:286633-286890,286892-286931,286933-287066... test-unwanted-simple;a:1 mozplugin-block-digest256;a:1471849627 test-block-simple;a:1 test-phish-simple;a:1 goog-malware-proto;Cg0IARAGGAEiAzAwMTABEJeoARoCGAYoGJZz:ZgS+9xtHHaRD/Ys1tbna7jLNHmx/SmpAh2flW1kuyUU= googpub-phish-proto;Cg0IAhAGGAEiAzAwMTABEIGkARoCGAakvame:jGKcM/6uh+Uf3cGHvqPwffSLwaxDEMjPg701Nz48vZ4= goog-unwanted-proto;Cg0IAxAGGAEiAzAwMTABEJmPARoCGAaluotS:V4NLJ9Kg9S9TtqEoNO+JSWJg/JM0KlvxolDDOS9iTNs= listmanager: 13:17:47 GMT-0700 (PDT): update request: { "tableList": "base-track-digest256,mozstd-trackwhite-digest256,mozplugin-block-digest256", "tableNames": {}, "requestPayload": "mozstd-trackwhite-digest256;a:1472162826\nbase-track-digest256;a:1471874828\nmozplugin-block-digest256;a:1471849627\n", "isPostRequest": true } listmanager: 13:17:47 GMT-0700 (PDT): makeUpdateRequestForEntry_: requestPayload mozstd-trackwhite-digest256;a:1472162826 base-track-digest256;a:1471874828 mozplugin-block-digest256;a:1471849627 update: https://shavar.services.mozilla.com/downloads?client=navclient-auto-ffox&appver=52.0a&pver=2.2 tablelist: base-track-digest256,mozstd-trackwhite-digest256,mozplugin-block-digest256 After attaching to the firefox process using gdb, I got the following backtrace: #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007fa2d20d7b4c in PR_WaitCondVar () from /home/francois/devel/nightly/libnspr4.so #2 0x00007fa2c541b6f5 in mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, bool) () from /home/francois/devel/nightly/libxul.so #3 0x00007fa2c541b7b7 in mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, nsIRunnable*, bool) () from /home/francois/devel/nightly/libxul.so #4 0x00007fa2c6b60eee in UrlClassifierDBServiceWorkerProxy::DoLocalLookup(nsACString_internal const&, nsACString_internal const&, nsTArray<mozilla::safebrowsing::LookupResult>*) () from /home/francois/devel/nightly/libxul.so #5 0x00007fa2c6b667a3 in nsUrlClassifierDBService::ClassifyLocalWithTables(nsIURI*, nsACString_internal const&, nsACString_internal&) () from /home/francois/devel/nightly/libxul.so #6 0x00007fa2c7be26e6 in mozilla::net::nsHttpChannel::BeginConnect() () from /home/francois/devel/nightly/libxul.so #7 0x00007fa2c7be2ecc in mozilla::net::nsHttpChannel::OnProxyAvailable(nsICancelable*, nsIChannel*, nsIProxyInfo*, nsresult) () from /home/francois/devel/nightly/libxul.so #8 0x00007fa2c7b8e756 in mozilla::net::nsAsyncResolveRequest::DoCallback() () from /home/francois/devel/nightly/libxul.so #9 0x00007fa2c7b8ecef in mozilla::net::nsProtocolProxyService::AsyncResolveInternal(nsIChannel*, unsigned int, nsIProtocolProxyCallback*, nsICancelable**, bool) () from /home/francois/devel/nightly/libxul.so #10 0x00007fa2c7bd02fe in mozilla::net::nsHttpChannel::ResolveProxy() () from /home/francois/devel/nightly/libxul.so #11 0x00007fa2c7be3171 in mozilla::net::nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*) () from /home/francois/devel/nightly/libxul.so #12 0x00007fa2c7be32c2 in mozilla::net::nsHttpChannel::AsyncOpen2(nsIStreamListener*) () from /home/francois/devel/nightly/libxul.so #13 0x00007fa2c55c7c2a in mozilla::net::HttpChannelParent::InvokeAsyncOpen(nsresult) () from /home/francois/devel/nightly/libxul.so #14 0x00007fa2c55c8d71 in mozilla::net::HttpChannelParent::DoAsyncOpen(mozilla::ipc::URIParams const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, unsigned int const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, unsigned int const&, nsTArray<mozilla::net::RequestHeaderTuple> const&, nsCString const&, mozilla::ipc::OptionalIPCStream const&, bool const&, unsigned short const&, unsigned int const&, unsigned char const&, bool const&, bool const&, unsigned int const&, bool const&, unsigned long const&, nsCString const&, bool const&, nsCString const&, bool const&, bool const&, mozilla::net::OptionalLoadInfoArgs const&, mozilla::net::OptionalHttpResponseHead const&, nsCString const&, unsigned int const&, nsCString const&, mozilla::net::OptionalCorsPreflightArgs const&, unsigned int const&, bool const&, bool const&, bool const&, nsCString const&, nsCString const&, nsCString const&) () from /home/francois/devel/nightly/libxul.so #15 0x00007fa2c55c929c in mozilla::net::HttpChannelParent::Init(mozilla::net::HttpChannelCreationArgs const&) () from /home/francois/devel/nightly/libxul.so #16 0x00007fa2c56e7e6d in mozilla::net::PNeckoParent::OnMessageReceived(IPC::Message const&) () from /home/francois/devel/nightly/libxul.so #17 0x00007fa2c5818fb8 in mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) () from /home/francois/devel/nightly/libxul.so #18 0x00007fa2c72d7067 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) () from /home/francois/devel/nightly/libxul.so #19 0x00007fa2c72d89fa in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) () from /home/francois/devel/nightly/libxul.so #20 0x00007fa2c72dafac in mozilla::ipc::MessageChannel::MessageTask::Run() () from /home/francois/devel/nightly/libxul.so #21 0x00007fa2c725fd6b in nsThread::ProcessNextEvent(bool, bool*) () from /home/francois/devel/nightly/libxul.so #22 0x00007fa2c727569c in NS_ProcessNextEvent(nsIThread*, bool) () from /home/francois/devel/nightly/libxul.so #23 0x00007fa2c72d6c7d in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/francois/devel/nightly/libxul.so #24 0x00007fa2c7c5b6d3 in MessageLoop::Run() () from /home/francois/devel/nightly/libxul.so #25 0x00007fa2c6614b4a in nsBaseAppShell::Run() () from /home/francois/devel/nightly/libxul.so #26 0x00007fa2c6b4636b in nsAppStartup::Run() () from /home/francois/devel/nightly/libxul.so #27 0x00007fa2c6b8287c in XREMain::XRE_mainRun() () from /home/francois/devel/nightly/libxul.so #28 0x00007fa2c6b82b5d in XREMain::XRE_main(int, char**, nsXREAppData const*) () from /home/francois/devel/nightly/libxul.so #29 0x00007fa2c6b82dc6 in XRE_main () from /home/francois/devel/nightly/libxul.so #30 0x000000000040fca1 in do_main(int, char**, char**, nsIFile*) () #31 0x000000000040c08a in main () It looks like a deadlock between an update of the TP list and a call to ClassifyLocalWithTables().
gcp, have you looked into deadlocks like this before? I'm not sure what the best way to debug this is. I don't yet have a way to reproduce this reliably, but I imagine I could instrument whatever locks we have on the URL Classifier DB and PR_LOG() them?
Flags: needinfo?(gpascutto)
Blocks: 1305486
Hi francois, Do you have the backtrace of safebrowsing background thread ?
SyncRunnable() can be used only if we're certain that the target will never block on sending an event to the source thread. I saw that ClassifyLocalWithTables, main thread try to dispatch to worker thread by sync-dispatching in [2] And updating list, worker thread also tries to dispatch to main thread [1] Any possible deadlock here? [1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierProxies.cpp#142 [2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#169
(In reply to Thomas Nguyen[:tnguyen] (use ni? plz) from comment #4) > SyncRunnable() can be used only if we're certain that the target will never > block on sending an event to the source thread. > I saw that ClassifyLocalWithTables, main thread try to dispatch to worker > thread by sync-dispatching in [2] > And updating list, worker thread also tries to dispatch to main thread [1] > > Any possible deadlock here? > [1] > https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url- > classifier/nsUrlClassifierProxies.cpp#142 > [2] > https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url- > classifier/Classifier.cpp#169 Makes sense to me! [On the main thread] nsChannelClassifier::IsTrackerWhitelisted() nsUrlClassifierDBService::ClassifyLocalWithTables() UrlClassifierDBServiceWorkerProxy::DoLocalLookup() (being blocked the background thread) [On background thread] nsUrlClassifierDBServiceWorker::DoLocalLookup Classifier::Check() Classifier::GetLookupCache() LookupCache::LookupCache() LookupCache::UpdateRootDirHandle() Classifier::GetPrivateStoreDirectory() Classifier::GetProviderByTableName() (being blocked by the main thread) Ideally, what we should fix is to avoid the main thread from being blocked by the background thread. However, since it's not trivial to make nsChannelClassifier::IsTrackerWhitelisted a sync call, I think what we have to fix is "Classifier::GetProviderByTableName" ...
Assignee: nobody → hchang
Attachment #8807469 - Flags: review?(francois)
Flags: needinfo?(gpascutto)
Blocks: 1288633
That facts that we can't change are: 1) On the main thread, nsChannelClassifier::IsTrackerWhitelisted() is a sync call which must be blocked by the SB worker thread. 2) On the worker thread, nsUrlClassifierDBServiceWorker::DoLocalLookup() *might* wait the main thread for fetching preferences. (The actual point is the first time we create LookupCache.) I took the approach to avoid (2) since avoiding (1) seems impossible without entirely changing the architecture. So, to avoid (2), the solution could be a) Find some places (still on the worker thread but where we are certain that the main thread is not waiting for the us) to read the preferences. b) Read the preferences on the main thread and propagate to where they are needed. Obviously (b) is more practical and simple and that's what the patch is doing: 1) Read the prefs and build the "table name => provider" mapping in [1]. It's guaranteed early enough and on the main thread. 2) Propagate the built map to Classifer in all OpenDb(). 3) Once Classifer has the "table name to provider" mapping, it can propagate the corresponding provider to all LookupCache and HashStore whenever needed. [1] https://dxr.mozilla.org/mozilla-central/rev/3b80868f7a8fe0361918a814fbbbfb9308ae0c0a/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#158
Comment on attachment 8807469 [details] Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. https://reviewboard.mozilla.org/r/90606/#review90548 Your approach looks pretty sound to me. Doing the pref lookup once at startup seems much better than every time we need to convert a list to a provider. I would however like gcp to take a good look at this too to make sure he doesn't see anything wrong with this. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:225 (Diff revision 2) > { > mGethashNoise = aGethashNoise; > mCacheDir = aCacheDir; > + nsresult rv = ReadProvidersFromPrefs(mProviderDict); > + if (NS_FAILED(rv)) { > + LOG(("Failed to build provider dictionary.")); Maybe we should add an assert here too. Things will be pretty broken if we can't read providers.
Attachment #8807469 - Flags: review?(francois) → review+
Attachment #8807469 - Flags: review?(gpascutto)
Henry, could you please ensure that this gets uplifted to 51 once it lands? As far as I can tell, we landed GetProviderByTableName() in bug 1254763 which means it has already made it to aurora.
Flags: needinfo?(hchang)
Keywords: regression
Comment on attachment 8807469 [details] Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. https://reviewboard.mozilla.org/r/90606/#review90554 Obviously I don't object to this as I've actually recommended this as an alternate approach in the original bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1254763#c14 But note the remark below about observing pref changes. I think the table-generation call is in the wrong place. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:223 (Diff revision 2) > -nsUrlClassifierDBServiceWorker::Init(uint32_t aGethashNoise, > +nsUrlClassifierDBServiceWorker::Init(uint32_t aGethashNoise, nsCOMPtr<nsIFile> aCacheDir) > - nsCOMPtr<nsIFile> aCacheDir) > { > mGethashNoise = aGethashNoise; > mCacheDir = aCacheDir; > + nsresult rv = ReadProvidersFromPrefs(mProviderDict); I think you need to call this in "nsUrlClassifierDBService::ReadTablesFromPrefs()" too for cases where we change the prefs at runtime. In fact, maybe that's the place where the call should live in the first place, given that nsUrlClassifierDBService will call it before it calls this?
Attachment #8807469 - Flags: review?(gpascutto) → review+
(In reply to Dimi Lee[:dimi][:dlee] from comment #2) > Do you have the backtrace of safebrowsing background thread ? Here it is: #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f3b332d803e in PR_WaitCondVar () from /home/francois/devel/nightly/libnspr4.so #2 0x00007f3b266aae3f in mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, bool) () from /home/francois/devel/nightly/libxul.so #3 0x00007f3b266aaf01 in mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, nsIRunnable*, bool) () from /home/francois/devel/nightly/libxul.so #4 0x00007f3b290ff019 in mozilla::safebrowsing::Classifier::GetPrivateStoreDirectory(nsIFile*, nsACString_internal const&, nsIFile**) () from /home/francois/devel/nightly/libxul.so #5 0x00007f3b290ff269 in mozilla::safebrowsing::LookupCache::UpdateRootDirHandle(nsIFile*) () from /home/francois/devel/nightly/libxul.so #6 0x00007f3b29101b96 in mozilla::safebrowsing::Classifier::SetupPathNames() () from /home/francois/devel/nightly/libxul.so #7 0x00007f3b27da098e in mozilla::safebrowsing::Classifier::RemoveBackupTables() () from /home/francois/devel/nightly/libxul.so #8 0x00007f3b29108133 in mozilla::safebrowsing::Classifier::ApplyUpdates(nsTArray<mozilla::safebrowsing::TableUpdate*>*) () from /home/francois/devel/nightly/libxul.so #9 0x00007f3b27da6d55 in nsUrlClassifierDBServiceWorker::FinishUpdate() () from /home/francois/devel/nightly/libxul.so #10 0x00007f3b290fea27 in mozilla::detail::RunnableMethodImpl<nsresult (nsUrlClassifierDBServiceWorker::*)(), true, false>::Run() () from /home/francois/devel/nightly/libxul.so #11 0x00007f3b2849a77b in nsThread::ProcessNextEvent(bool, bool*) () from /home/francois/devel/nightly/libxul.so #12 0x00007f3b284af5bc in NS_ProcessNextEvent(nsIThread*, bool) () from /home/francois/devel/nightly/libxul.so #13 0x00007f3b2850bc81 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) () from /home/francois/devel/nightly/libxul.so #14 0x00007f3b28e51c33 in MessageLoop::Run() () from /home/francois/devel/nightly/libxul.so #15 0x00007f3b28d6136d in nsThread::ThreadFunc(void*) () from /home/francois/devel/nightly/libxul.so #16 0x00007f3b332d8d7c in _pt_root () from /home/francois/devel/nightly/libnspr4.so #17 0x00007f3b349370a4 in start_thread (arg=0x7f3afb4ff700) at pthread_create.c:309 #18 0x00007f3b33a3e62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Comment on attachment 8807469 [details] Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. https://reviewboard.mozilla.org/r/90606/#review90554 > I think you need to call this in "nsUrlClassifierDBService::ReadTablesFromPrefs()" too for cases where we change the prefs at runtime. > > In fact, maybe that's the place where the call should live in the first place, given that nsUrlClassifierDBService will call it before it calls this? My first approach was exactly as what you suggested: 1) Read in DBService::Init and listen to the prefs change. 2) Either copy or transfer the ownership to DBServiceWorker for propagating to Classifer/LookupCache/HashStore 3) On pref change, generate the mapping again 4) Propagate the up-to-date pref to Classifier/LookupCache/HashStore However, step (4) seems non-trivial since the Classifier/LookupCache/HashStore are singletons. If the prefs change, we can easily update the mapping owned by DBServive but hard to propagate to Classifier/LookupCache/HashStore since the they (Classifer/HashStore/LookupCache) mostly consume the mapping on the worker thread. If we choose only update the mapping on the main thread with propagating to the worker thread, things may be a little inconsistent. Do you think which is worth doing?
This seems to be what we guess in comment 5. Pretty confident now that the patch will solve the race condition! (In reply to François Marier [:francois] from comment #12) > (In reply to Dimi Lee[:dimi][:dlee] from comment #2) > > Do you have the backtrace of safebrowsing background thread ? > > Here it is: > > #0 pthread_cond_wait@@GLIBC_2.3.2 () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 > #1 0x00007f3b332d803e in PR_WaitCondVar () from > /home/francois/devel/nightly/libnspr4.so > #2 0x00007f3b266aae3f in > mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, bool) () > from /home/francois/devel/nightly/libxul.so > #3 0x00007f3b266aaf01 in > mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, nsIRunnable*, bool) > () > from /home/francois/devel/nightly/libxul.so > #4 0x00007f3b290ff019 in > mozilla::safebrowsing::Classifier::GetPrivateStoreDirectory(nsIFile*, > nsACString_internal const&, nsIFile**) () from > /home/francois/devel/nightly/libxul.so > #5 0x00007f3b290ff269 in > mozilla::safebrowsing::LookupCache::UpdateRootDirHandle(nsIFile*) () > from /home/francois/devel/nightly/libxul.so > #6 0x00007f3b29101b96 in > mozilla::safebrowsing::Classifier::SetupPathNames() () > from /home/francois/devel/nightly/libxul.so > #7 0x00007f3b27da098e in > mozilla::safebrowsing::Classifier::RemoveBackupTables() () > from /home/francois/devel/nightly/libxul.so > #8 0x00007f3b29108133 in > mozilla::safebrowsing::Classifier::ApplyUpdates(nsTArray<mozilla:: > safebrowsing::TableUpdate*>*) () from /home/francois/devel/nightly/libxul.so > #9 0x00007f3b27da6d55 in nsUrlClassifierDBServiceWorker::FinishUpdate() () > from /home/francois/devel/nightly/libxul.so > #10 0x00007f3b290fea27 in mozilla::detail::RunnableMethodImpl<nsresult > (nsUrlClassifierDBServiceWorker::*)(), true, false>::Run() () from > /home/francois/devel/nightly/libxul.so > #11 0x00007f3b2849a77b in nsThread::ProcessNextEvent(bool, bool*) () > from /home/francois/devel/nightly/libxul.so > #12 0x00007f3b284af5bc in NS_ProcessNextEvent(nsIThread*, bool) () > from /home/francois/devel/nightly/libxul.so > #13 0x00007f3b2850bc81 in > mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) > () from /home/francois/devel/nightly/libxul.so > #14 0x00007f3b28e51c33 in MessageLoop::Run() () from > /home/francois/devel/nightly/libxul.so > #15 0x00007f3b28d6136d in nsThread::ThreadFunc(void*) () from > /home/francois/devel/nightly/libxul.so > #16 0x00007f3b332d8d7c in _pt_root () from > /home/francois/devel/nightly/libnspr4.so > #17 0x00007f3b349370a4 in start_thread (arg=0x7f3afb4ff700) at > pthread_create.c:309 > #18 0x00007f3b33a3e62d in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Flags: needinfo?(hchang)
Francois, gcp, I revised my approach because of the following reasons: 1) We need to rebuild the mapping on prefs change (comment 11) 2) There's a coming requirement that StreamUpdater also needs the provider mapping. So, instead of having a persistent mapping in DBServiceWorker and Classifier, only one mapping is maintained in nsUrlClassifierUtils in the new patch and can be lookup up by nsIUrlClassifierUtils. Given that nsIUrlClassifierUtils is a thread-safe XPCOM service [1], nsIUrlClassifierUtils is a suitable place to put this mapping. One of the issue to maintaining the mapping in nsIUrlClassifierUtils is we need to ensure the provider lookup thread-safe. So, a mutex associated with the mapping is added to nsUrlClassifierUtils. Another issue is we also need to ensure the first use of nsIUrlClassifierUtils is on the main thread. This is guaranteed by explicitly loading it in nsUrlClassifierDBService::Init(), where no work-thread operations would occur before. Actually the revised patch is still imperfect. The per-table LookupCache will have persistent |mProvider| even if the pref changes. I don't think this is that bad because otherwise we may need to reload the LookupCache from other per-provider folder, which might bring other issue. (To be honest, I don't have an answer to "if we really need to be able to change all of these at runtime" [2]?) Could you have another look again especially the changes in nsUrlClassifierUtils.cpp? Thanks! [1] https://dxr.mozilla.org/mozilla-central/rev/336759fad4621dfcd0a3293840edbed67018accd/toolkit/components/url-classifier/nsUrlClassifierUtils.h#47 [2] https://dxr.mozilla.org/mozilla-central/rev/336759fad4621dfcd0a3293840edbed67018accd/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1304
Flags: needinfo?(gpascutto)
Flags: needinfo?(francois)
Comment on attachment 8807469 [details] Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. https://reviewboard.mozilla.org/r/90606/#review93184 ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1262 (Diff revisions 2 - 5) > nsresult rv; > nsCOMPtr<nsICryptoHash> dummy = do_CreateInstance(NS_CRYPTO_HASH_CONTRACTID, &rv); > NS_ENSURE_SUCCESS(rv, rv); > > + { > + // Force nsIUrlClassifierUtils loading on main thread. Maybe scope the above PSM loading in the same manner, for consistency. ::: toolkit/components/url-classifier/tests/gtest/TestLookupCacheV4.cpp:35 (Diff revision 5) > nsCOMPtr<nsIFile> file; > NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(file)); > > file->AppendNative(GTEST_SAFEBROWSING_DIR); > > - UniquePtr<LookupCacheV4> cache = MakeUnique<LookupCacheV4>(GTEST_TABLE, file); > + UniquePtr<LookupCacheV4> cache = MakeUnique<LookupCacheV4>(GTEST_TABLE, EmptyCString(), file); What's the EmptyCString? Feels like I'm missing a change to the constructor?
(In reply to Henry Chang [:henry][:hchang] from comment #17) > Could you have another look again especially the changes in > nsUrlClassifierUtils.cpp? Seems like a good approach.
Status: NEW → ASSIGNED
Flags: needinfo?(francois)
Thanks you two! I'll be landing this patch right away after addressing gcp's comment 19.
Comment on attachment 8807469 [details] Bug 1315097 - Build the provider dictionary on the main thread to be used everywhere. https://reviewboard.mozilla.org/r/90606/#review93524 ::: toolkit/components/url-classifier/Classifier.cpp:716 (Diff revisions 5 - 8) > +#ifdef MOZ_SAFEBROWSING_DUMP_FAILED_UPDATES > + > +already_AddRefed<nsIFile> > +Classifier::GetFailedUpdateDirectroy() > +{ > + nsCString failedUpdatekDirName = STORE_DIRECTORY + nsCString("-failedupdate"); NS_LITERAL_CSTRING? ::: toolkit/components/url-classifier/Classifier.cpp:731 (Diff revisions 5 - 8) > +} > + > +nsresult > +Classifier::DumpRawTableUpdates(const nsACString& aRawUpdates) > +{ > + // DumpFailedUpdate() MUST be called first to create the Presumably it could be a member variable then, that you can check for null. That would also get rid of duplicating the construction of its nsIFile both in DumpRawTableUpdates and DumpFailedUpdate. ::: toolkit/components/url-classifier/Classifier.cpp:741 (Diff revisions 5 - 8) > + nsCOMPtr<nsIFile> failedUpdatekDirectory = GetFailedUpdateDirectroy(); > + > + // Create tableupdate.bin and dump raw table update data. > + nsCOMPtr<nsIFile> rawTableUpdatesFile; > + nsCOMPtr<nsIOutputStream> outputStream; > + if (NS_FAILED(failedUpdatekDirectory->Clone(getter_AddRefs(rawTableUpdatesFile))) || I'm not a fan of this chaining of functions with side effects as a combined conditional for the if. Note that the log won't point out what exactly failed either.
Flags: needinfo?(gpascutto)
Pushed by hchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3e719c86ef24 Build the provider dictionary on the main thread to be used everywhere. r=francois,gcp
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment on attachment 8811600 [details] [diff] [review] Bug1315097-uplift-for-beta51.patch Approval Request Comment [Feature/regressing bug #]: Bug 1315097 [User impact if declined]: A deadlock might occur [Describe test coverage new/current, TreeHerder]: gtest [Risks and why]: No [String/UUID change made/needed]: No
Attachment #8811600 - Flags: approval-mozilla-beta?
Comment on attachment 8811619 [details] [diff] [review] Bug1315097-uplift-for-aurora52.patch Approval Request Comment [Feature/regressing bug #]: Bug 1315097 (regression bug) [User impact if declined]: Deadlock may occur [Describe test coverage new/current, TreeHerder]: gtest [Risks and why]: No [String/UUID change made/needed]: No
Attachment #8811619 - Flags: approval-mozilla-aurora?
Hi Henry, First, the feature/regressing bug should not be the bug itself. Second, because the patch is huge, the risk should not be "no". Third, given 51 is in beta, I need better justification for beta, otherwise, I would rather let this patch ride the train on Aurora52. Are there any reproduced steps for this issue? I would like to have QA to verify this.
Flags: needinfo?(hchang)
(In reply to Gerry Chang [:gchang] from comment #32) > Hi Henry, > First, the feature/regressing bug should not be the bug itself. This patch is to fix the regression caused by Bug 1254763. So should this field be Bug 1254763. > Second, because the patch is huge, the risk should not be "no". The only risk I can think of is this patch might introduce another deadlock since a mutex is use. > Third, given 51 is in beta, I need better justification for beta, otherwise, > I would rather let this patch ride the train on Aurora52. > > Are there any reproduced steps for this issue? I would like to have QA to > verify this. Unfortunately there's no specific steps to reproduce. The only way to reproduce is just to browser pages. Francois, do you remember any steps which is prone to cause this issue?
Flags: needinfo?(hchang) → needinfo?(francois)
(In reply to Henry Chang [:henry][:hchang] from comment #33) > Unfortunately there's no specific steps to reproduce. The only way to > reproduce > is just to browser pages. Francois, do you remember any steps which is prone > to cause this issue? I have tracking protection enabled globally in my profile and in the course of normal day-to-day browsing, I hit this deadlock on average once or twice a day. It's not guaranteed to happen, sometimes it skips a day, but it's quite frequent and frequently I get another deadlock immediately after restarting Firefox. So it's quite a bad regression for anybody using tracking protection (including anybody using Private Browsing I assume).
Flags: needinfo?(francois)
Comment on attachment 8811619 [details] [diff] [review] Bug1315097-uplift-for-aurora52.patch I'd like this to bake on Aurora a bit more before considering uplifting to Beta51. Aurora52+.
Attachment #8811619 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Gerry is worried about the risk of uplifting to 51 (too many code changes). And I talked to Henry offline. It is fine that we don't uplift this fix to 51. Henry, please help to elaborate the reason.
Flags: needinfo?(hchang)
Re-checking the code that might cause deadlock, only when goog-*-proto tables are present may cause the deadlock [1]. So, the impact without this patch should be, when user modifies the preferences to add goog-*-proto tables, the browser may freeze due to deadlock. Ni'ing francois to see if he agrees not to uplift the patch to beta51. [1] http://hg.mozilla.org/releases/mozilla-beta/file/tip/toolkit/components/url-classifier/Classifier.cpp#l180
Flags: needinfo?(hchang) → needinfo?(francois)
(In reply to Henry Chang [:henry][:hchang] from comment #38) > Re-checking the code that might cause deadlock, only when goog-*-proto > tables are present may cause the deadlock [1]. So, the impact without this > patch should be, when user modifies the preferences > to add goog-*-proto tables, the browser may freeze due to deadlock. You're right, I missed that. We don't need this on beta.
Flags: needinfo?(francois)
Attachment #8811600 - Flags: approval-mozilla-beta?
Track 51- as 51 is unaffected.
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: