Closed Bug 1248915 Opened 9 years ago Closed 9 years ago

TSan: data race on global 'mozilla::net::CacheObserver::sDiskCacheCapacity'

Categories

(Core :: Networking: Cache, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox47 --- affected
firefox48 --- fixed

People

(Reporter: jseward, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-would-take])

Attachments

(4 files)

The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer). * Specific information about this bug Both the main thread and the 'Cache2 I/O' access CacheObserver::sDiskCacheCapacity without any locking. * General information about TSan, data races, etc. Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2]. If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist. [1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong [2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Attached file TSan stack trace, tidied up (deleted) —
Following the landing of bug 1136762, this is now the most frequently reported race, when running the xpcshell test suite on TSan: 45 modules/libpref/Preferences.cpp:1769:38 in mozilla::UintVarChanged <------ 35 ipc/glue/MessagePump.cpp:142:7 in mozilla::ipc::MessagePump::Sched 11 toolkit/components/telemetry/Telemetry.cpp:3318:37 in CanRecordBase 11 toolkit/components/telemetry/Telemetry.cpp:3318:11 in CanRecordBase 10 tools/profiler/core/platform-linux.cc:250:46 in (anonymous namespa 10 tools/profiler/core/platform-linux.cc:249:46 in (anonymous namespa 10 signal handler spoils errno tools/profiler/core/platform-linux.cc: 9 tools/profiler/core/platform.h:350:34 in IsActive 7 netwerk/base/nsPACMan.cpp:344:13 in nsPACMan::Shutdown()
Easy STR: do a TSan-enabled build, then: TSAN_OPTIONS=suppressions=/home/sewardj/MOZ/SUPPS/tsan-empty.txt \ DISPLAY=:1.0 ./mach xpcshell-test --sequential --log-mach - \ devtools/shared/apps/tests/unit/test_webappsActor.js
This is uncritical at all. Can I use Atomic as a static?
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
(In reply to Honza Bambas (:mayhemer) from comment #4) > This is uncritical at all. Can I use Atomic as a static? (per irc discussion) Yes you can use Atomic as a static. No problem there. The difficulty though is that the call to mozilla::Preferences::AddUintVarCache at CacheObserver.cpp:161 now passes an Atomic<int>* instead of an int* to AddUintVarCache. So there is some type fixing-up that needs to happen.
I think adding Atomic variants of AddPrefCache* methods might be the way. Others may benefit too.
Blocks: tsan
Whiteboard: [necko-would-take]
(In reply to Honza Bambas (:mayhemer) from comment #6) > I think adding Atomic variants of AddPrefCache* methods might be the way. > Others may benefit too. I'm in two minds about whether to fix it like that. I looked at it and at first glance, adding Atomic AddPrefCache* variants seems pretty ugly. Is there another way to solve this? The sequence of events is as follows: [1] main thread creates the Cache IO thread [2] the IO thread reads CacheObserver::sDiskCacheCapacity, presumably getting the initial value, |kDefaultDiskCacheCapacity| [3] the main thread then processes preferences (or whatever) and writes to CacheObserver::sDiskCacheCapacity On interpretation is that the IO thread is created too early. But I assume that the startup sequence is massively complex and full of circular dependencies and so there is no possibility of reordering the above to [1] [3] [2]. Correct?
Attached file race-sDiskCacheCapacity.txt (deleted) —
TSan stack traces again, but will full (non-truncated stacks) and ordered in the [1] [2] [3] sequence mentioned in comment #7.
Also I guess reordering would not work if it can be that the cached value can be changed at any point in the run. So, making a more atomic version of AddUintVarCache might be the way. But I am unsure whether it is really enough just to change the types in question from uint32_t to Atomic<uint32_t>. That might make TSan stop complaining, but do we need atomicity at coarser granularity to make this really correct? I am unsure how to proceed.
(In reply to Julian Seward [:jseward] from comment #9) > Also I guess reordering would not work if it can be that the cached > value can be changed at any point in the run. I really don't understand why this is such a big problem. The value itself is an int32_t which (AFAIK) is naturally atomic. OK, not in C++11.. I heard.. The value at a moment is absolutely unimportant. This is nothing critical! > > So, making a more atomic version of AddUintVarCache might be the way. > But I am unsure whether it is really enough just to change the types > in question from uint32_t to Atomic<uint32_t>. I would like to try to have a patch like that. If you feel like there is some showstopper, please let me know, can save me some cycles. > That might make TSan > stop complaining, but do we need atomicity at coarser granularity to > make this really correct? I am unsure how to proceed. I don't understand your concern. The only thing here is to ensure an atomic access to static pref cache. And let you know that there is no arithmetic op happening on that static. Regardless that uint32_t already is atomic to read/write, we need to Atomize<> it to make C++11 compilers and TSan happy. Relaxed ordering is fully sufficient here. What else am I missing?
Flags: needinfo?(jseward)
Attached patch Proposed fix (deleted) — Splinter Review
Flags: needinfo?(jseward)
Attachment #8730257 - Flags: review?(honzab.moz)
Comment on attachment 8730257 [details] [diff] [review] Proposed fix Review of attachment 8730257 [details] [diff] [review]: ----------------------------------------------------------------- Excelent! Thanks. Exactly what I had in mind. Although, I would more prefer this be Atomic<Relaxed>. There is no arithmetic happening on that global, it's only a pref cache and there is absolutely no need for any memory ordering here. We access this only on the main thread and on the cache io thread. Maybe make it templatable on the memory ordering arg? A pref module peer must review this too.
Attachment #8730257 - Flags: review?(honzab.moz) → review+
Attached patch bug1248915c13.cset (deleted) — Splinter Review
Now with templatized memory ordering constraints, as per comment 12.
Attachment #8730771 - Flags: review?(n.nethercote)
Comment on attachment 8730771 [details] [diff] [review] bug1248915c13.cset Review of attachment 8730771 [details] [diff] [review]: ----------------------------------------------------------------- I'll agree to review this because the libpref/ reviewing situation is so dire, but I refute utterly the notion that I am a libpref/ peer :) ::: modules/libpref/Preferences.cpp @@ +1787,5 @@ > gCacheData->AppendElement(data); > return RegisterCallback(UintVarChanged, aPref, data); > } > > +template <MemoryOrdering order> Template params are usually start with an upper-case letter. So use |Order| or |Ordering|. @@ +1790,5 @@ > > +template <MemoryOrdering order> > +static void AtomicUintVarChanged(const char* aPref, void* aClosure) > +{ > + CacheData* cache = static_cast<CacheData*>(aClosure); Use |auto cache = ...| to avoid writing the type twice. @@ +1798,5 @@ > + > +template <MemoryOrdering order> > +// static > +nsresult > +Preferences::AddAtomicUintVarCache(Atomic<uint32_t,order>* aCache, Comma after space. ::: modules/libpref/Preferences.h @@ +274,5 @@ > static nsresult AddUintVarCache(uint32_t* aVariable, > const char* aPref, > uint32_t aDefault = 0); > + template <MemoryOrdering order> > + static nsresult AddAtomicUintVarCache(Atomic<uint32_t,order>* aVariable, User |Order| or |Ordering| again. And comma after space. ::: netwerk/cache2/CacheObserver.cpp @@ +50,5 @@ > // Cache of the calculated memory capacity based on the system memory size > int32_t CacheObserver::sAutoMemoryCacheCapacity = -1; > > static uint32_t const kDefaultDiskCacheCapacity = 250 * 1024; // 250 MB > +Atomic<uint32_t,Relaxed> CacheObserver::sDiskCacheCapacity Space after comma. ::: netwerk/cache2/CacheObserver.h @@ +86,5 @@ > static bool sUseDiskCache; > static uint32_t sMetadataMemoryLimit; > static int32_t sMemoryCacheCapacity; > static int32_t sAutoMemoryCacheCapacity; > + static Atomic<uint32_t,Relaxed> sDiskCacheCapacity; Space after comma.
Attachment #8730771 - Flags: review?(n.nethercote) → review+
(In reply to Julian Seward [:jseward] from comment #13) > Created attachment 8730771 [details] [diff] [review] > bug1248915c13.cset > Now with templatized memory ordering constraints, as per comment 12. Honza, could you please carry over your r+ to this new patch?
Flags: needinfo?(honzab.moz)
No need to. Just land.
Flags: needinfo?(honzab.moz)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: