Closed
Bug 828630
Opened 12 years ago
Closed 12 years ago
data race on nsSocketTransportService::mOffline
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla21
People
(Reporter: jseward, Assigned: jaas)
References
Details
(Keywords: valgrind)
Attachments
(1 file)
(deleted),
patch
|
unusualtears
:
review+
mcmanus
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
nsSocketTransportService::mOffline and possibly mGoingOffline are
accessed from multiple threads but there is no lock that consistently
protects it.
nsSocketTransportService::Run() does protect them both with mLock.
But nsSocketTransportService::SetOffline(bool offline) accesses them
without any holding mLock.
Reporter | ||
Comment 1•12 years ago
|
||
STR: start up on desktop and let it idle. Then quit.
Helgrind's complainery follows.
----------------------------------------------------------------
Lock at 0x4EDFDA0 was first observed
at 0x4A0A0C9: pthread_mutex_init (hg_intercepts.c:429)
by 0x50589FD: PR_NewLock (ptsynch.c:147)
by 0x6BF5F07: nsSocketTransportService::nsSocketTransportService() (Mutex.h:49)
by 0x6BCBECC: nsSocketTransportServiceConstructor(nsISupports*, nsID const&, void**) (in /home/sewardj/MOZ/MC-08-01-2013-HG/ff-opt/toolkit/library/libxul.so)
by 0x7C3E49A: mozilla::GenericFactory::CreateInstance(nsISupports*, nsID const&, void**) (GenericFactory.cpp:16)
by 0x7C6E930: nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) (nsComponentManager.cpp:1034)
by 0x7C6FD0D: nsComponentManagerImpl::GetServiceByContractID(char const*, nsID const&, void**) (nsComponentManager.cpp:1426)
by 0x7C380DC: CallGetService(char const*, nsID const&, void**) (nsComponentManagerUtils.cpp:62)
by 0x7C38367: nsGetServiceByContractIDWithError::operator()(nsID const&, void**) const (nsComponentManagerUtils.cpp:256)
by 0x7C37105: nsCOMPtr_base::assign_from_gs_contractid_with_error(nsGetServiceByContractIDWithError const&, nsID const&) (nsCOMPtr.cpp:101)
by 0x6BDCF13: nsIOService::InitializeSocketTransportService() (nsCOMPtr.h:682)
by 0x6BDD03A: nsIOService::SetOffline(bool) (nsIOService.cpp:728)
Possible data race during write of size 1 at 0x4EDFD03 by thread #1
Locks held: none
at 0x6BF570E: nsSocketTransportService::SetOffline(bool) (nsSocketTransportService2.cpp:536)
by 0x6BDD153: nsIOService::SetOffline(bool) (nsIOService.cpp:714)
by 0x6BDF24D: nsIOService::Observe(nsISupports*, char const*, unsigned short const*) (nsIOService.cpp:917)
by 0x7C4CD16: nsObserverList::NotifyObservers(nsISupports*, char const*, unsigned short const*) (nsObserverList.cpp:99)
by 0x7C4D210: nsObserverService::NotifyObservers(nsISupports*, char const*, unsigned short const*) (nsObserverService.cpp:161)
by 0x6BB04A8: nsXREDirProvider::DoShutdown() (nsXREDirProvider.cpp:847)
by 0x6BA9B42: ScopedXPCOMStartup::~ScopedXPCOMStartup() (nsAppRunner.cpp:1119)
by 0x6BAF33B: XREMain::XRE_main(int, char**, nsXREAppData const*) (nsAppRunner.cpp:3915)
by 0x6BAF5A6: XRE_main (nsAppRunner.cpp:4093)
by 0x40208E: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:195)
by 0x402195: main (nsBrowserApp.cpp:388)
This conflicts with a previous read of size 1 by thread #4
Locks held: 1, at address 0x4EDFDA0
at 0x6BF6F81: nsSocketTransportService::Run() (nsSocketTransportService2.cpp:660)
by 0x7C74686: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:627)
by 0x7C3B1D0: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:237)
by 0x7C751F9: nsThread::ThreadFunc(void*) (nsThread.cpp:265)
by 0x505DFD6: _pt_root (ptthread.c:158)
by 0x4A09F6C: mythread_wrapper (hg_intercepts.c:219)
by 0x30E2C07D13: start_thread (in /usr/lib64/libpthread-2.15.so)
by 0x30E28F167C: clone (in /usr/lib64/libc-2.15.so)
Address 0x4EDFD03 is 83 bytes inside a block of size 168 alloc'd
at 0x4A077BC: malloc (vg_replace_malloc.c:270)
by 0x62A7078: moz_xmalloc (mozalloc.cpp:54)
by 0x6BCBEC1: nsSocketTransportServiceConstructor(nsISupports*, nsID const&, void**) (mozalloc.h:200)
by 0x7C3E49A: mozilla::GenericFactory::CreateInstance(nsISupports*, nsID const&, void**) (GenericFactory.cpp:16)
by 0x7C6E930: nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) (nsComponentManager.cpp:1034)
by 0x7C6FD0D: nsComponentManagerImpl::GetServiceByContractID(char const*, nsID const&, void**) (nsComponentManager.cpp:1426)
by 0x7C380DC: CallGetService(char const*, nsID const&, void**) (nsComponentManagerUtils.cpp:62)
by 0x7C38367: nsGetServiceByContractIDWithError::operator()(nsID const&, void**) const (nsComponentManagerUtils.cpp:256)
by 0x7C37105: nsCOMPtr_base::assign_from_gs_contractid_with_error(nsGetServiceByContractIDWithError const&, nsID const&) (nsCOMPtr.cpp:101)
by 0x6BDCF13: nsIOService::InitializeSocketTransportService() (nsCOMPtr.h:682)
by 0x6BDD03A: nsIOService::SetOffline(bool) (nsIOService.cpp:728)
by 0x6BDE479: nsIOService::InitializeNetworkLinkService() (nsIOService.cpp:278)
----------------------------------------------------------------
Keywords: valgrind
Taking this in order to find an owner.
Assignee: nobody → joshmoz
Comment 3•12 years ago
|
||
this is probably from adam's patch..
(In reply to Patrick McManus [:mcmanus] from comment #3)
> this is probably from adam's patch..
I don't know what this is a reference to.
Comment 5•12 years ago
|
||
(In reply to Josh Aas (Mozilla Corporation) from comment #4)
> (In reply to Patrick McManus [:mcmanus] from comment #3)
> > this is probably from adam's patch..
>
> I don't know what this is a reference to.
bug 87717
Blocks: 87717
Maybe the fix is this simple? I don't know this code, just trying to help move things along.
Attachment #700706 -
Flags: review?(unusualtears)
Comment on attachment 700706 [details] [diff] [review]
fix v1.0
Thanks Josh. That should do it. I am building a patched, valgrind-enabled build to verify it with helgrind, but I don't really have any doubt.
The traces in comment 1 are clear: the main thread toggles |mGoingOffline| while another thread may be locking it in |Run|.
Attachment #700706 -
Flags: review?(unusualtears)
Attachment #700706 -
Flags: review?(mcmanus)
Attachment #700706 -
Flags: review+
Getting helgrind to behave nicely is more work than I anticipated (building suppression file and all). I don't have the patience at the moment for it. But I stand by my comment that Josh's patch to grab the lock in |SetOffline| will eliminate the race for |mGoingOffline|.
Reporter | ||
Comment 9•12 years ago
|
||
Getting H to behave nicely requires marking up the source in a few critical
places; see bug 551155.
Anyway, with the patch in place I am no longer seeing the race. So, +1 from me.
Updated•12 years ago
|
Attachment #700706 -
Flags: review?(mcmanus) → review+
Assignee | ||
Comment 10•12 years ago
|
||
pushed to mozilla-inbound
http://hg.mozilla.org/integration/mozilla-inbound/rev/6e56521bea69
Assignee | ||
Comment 11•12 years ago
|
||
Comment on attachment 700706 [details] [diff] [review]
fix v1.0
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 87717
User impact if declined: This was found by a tool, not a user. I guess Firefox could get confused about its offline state. Generally races like this are pretty problematic, hard to diagnose issues because they can be hard to reproduce.
Testing completed (on m-c, etc.): on m-c now
Risk to taking this patch (and alternatives if risky): doesn't seem risky
String or UUID changes made by this patch: none
Attachment #700706 -
Flags: approval-mozilla-aurora?
Comment 12•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Comment 13•12 years ago
|
||
Comment on attachment 700706 [details] [diff] [review]
fix v1.0
We'll take a low risk fix on Aurora to prevent the need to diagnose amorphous networking bugs. Approving.
Attachment #700706 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 14•12 years ago
|
||
status-firefox20:
--- → fixed
status-firefox21:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•