Closed Bug 1230768 (CVE-2016-1975) Opened 9 years ago Closed 9 years ago

Potential race conditions around block-level statics cause memory-safety bugs

Categories

(Core :: WebRTC: Audio/Video, defect)

42 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox43 --- wontfix
firefox44 --- wontfix
firefox45 + fixed
firefox46 + fixed
firefox47 --- fixed
firefox-esr38 --- unaffected
firefox-esr45 --- fixed

People

(Reporter: q1, Assigned: gcp)

References

Details

(Keywords: csectype-race, sec-moderate, Whiteboard: [adv-main45+][post-critsmash-triage])

Attachments

(1 file)

The codebase contains many instances of code like this: static CC* CC::GetInstance () { static CC theInstance; return &theInstance; } This pattern is not thread-safe under compilers lacking C++11 compliance, in particular compliance with s.6.7(4). Concurrent execution of such code could cause use of incompletely-initialized, partially-deleted, or otherwise broken objects, depending upon how CC is implemented and the exact timing of the racing threads. Since (at least) VS < VS 2015 is non-compliant, this issue potentially affects the Mozilla codebase. The discussion at https://bugzilla.mozilla.org/show_bug.cgi?id=1218124#c11 and following comments describes the problem in detail. Also, you easily can see the problem by examining the code that, e.g., VS 2012 generates for: static CC* CC::GetInstance () { static CC theInstance; return &theInstance; } which is similar to: /* module-level hidden variable */ static bool is_CC_theInstance_initialized; if (!is_CC_theInstance_initialized) { is_CC_theInstance_initialized = true; ::new (&theInstance) CC; } return &theInstance;
VS 2013 also generates thread-unsafe code for the return-address-of-local-static pattern. I just tested it. VS 2015, in contrast, generates thread-safe code.
(In reply to q1 from comment #1) > VS 2013 also generates thread-unsafe code for the > return-address-of-local-static pattern. I just tested it. VS 2015, in > contrast, generates thread-safe code. That's VS 2013 update 4. I see that MS has released VS 2013 update 5, which I'll test soon and report the results. Note that https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Windows_Prerequisites does not specify any particular version of VS 2013.
Also the version of VS used to build FF 42.0 generates thread-unsafe code for this pattern. Examine, e.g., the generated code for AsyncStatementClassInfo::GetScriptableHelper.
Can you detail in what way it's thread-unsafe? Is the failure mode that 2 copies are created and one is lost? Is it two threads racing to initialize the same memory? That helps a lot in evaluating the potential impact of this. (Focus on MSVC 2013; 2012 isn't supported anymore for Firefox builds.) Thanks!
Flags: needinfo?(q1)
Not entirely clear where this belongs; build config (compiler selection/support) seems closest. Feel free to move it.
Component: General → Build Config
(In reply to Randell Jesup [:jesup] from comment #4) > Can you detail in what way it's thread-unsafe? Is the failure mode that 2 > copies are created and one is lost? Is it two threads racing to initialize > the same memory? That helps a lot in evaluating the potential impact of > this. (Focus on MSVC 2013; 2012 isn't supported anymore for Firefox builds.) > Thanks! More than one thread can attempt to initialize the static object simultaneously. For example, the pertinent part of the generated code for AsyncStatementClassInfo::GetScriptableHelper is: ... mov eax, dword ptr [|long name ending "::`local static guard'] test al, 1 je do_the_init return_the_result: lea rax, sJsHelper mov qword ptr [rdx], rax xor eax, eax ret do_this_init: or eax, 1 ... ; initialize sJsHelper mov dword ptr [|long name ending "::`local static guard'],eax ... ; initialize sJsHelper some more jmp return_the_result So clearly more than one thread can do the initialization if all threads execute the initial |mov| and |test| more or less simultaneously. BTW, which VS 2013 update was used to build FF 42.0? about:buildconfig doesn't give the patch level.
Flags: needinfo?(q1)
Two more observations: 1. VS 2013 update 5 (the most recent available) also generates thread-unsafe code for block-level statics. 2. Some patterns similar to the one described in comment 0 are also thread-unsafe for similar reasons, though these patterns are less likely to malfunction if a race occurs, because the competing threads generally don't directly attempt to multiply initialize a single object... class CC { ... static CC *pCC; }; // class-level static OR static CC *pCC; // module-level static OR static CC * CC::GetInstance () { static CC *pCC; // block-level static if (!pCC) { pCC = new CC; } return pCC; } ...unless you assign the result to a smart pointer: static SomeSmartPtr<CC> pCC; ... pCC = new CC;
Many of the instances of comment 0 are simply to avoid global static constructors, and are not race concerns.
Flags: sec-bounty?
(In reply to Nathan Froyd [:froydnj] from comment #8) > Many of the instances of comment 0 are simply to avoid global static > constructors, and are not race concerns. Is there any easy (or not-easy) way to identify which are which? Can someone verify if the latest MSVC 2013 patch fixes this? Has it been reported to MS? If we don't expect a fix in the build servers soon, we need to identify all such uses and a reasonable workaround. (My usecase is likely an odd one, since it's on a hot path for realtime code and I can't just drop a critial section/mutex in without a perf issue.
Flags: needinfo?(nfroyd)
(In reply to Randell Jesup [:jesup] from comment #9) > (In reply to Nathan Froyd [:froydnj] from comment #8) > > Many of the instances of comment 0 are simply to avoid global static > > constructors, and are not race concerns. > > Is there any easy (or not-easy) way to identify which are which? > > Can someone verify if the latest MSVC 2013 patch fixes this? Per comment 7, VS 2013 update 5 (the most recent available) also generates thread-unsafe code for block-level statics. Just build something like: #include <windows.h> struct C { C() { printf("C()\n"); // non-trivial constructor } }; C* f() { static C c; return &c; } DWORD WINAPI g(void *p) { C *c = f(); printf("%p\n", c); return 0; } int _tmain(int argc, _TCHAR* argv[]) { ::CreateThread(NULL, 0, g, NULL, 0, NULL); while (1) {} return 0; } and step the assembly code for f(). > Has it been reported to MS? Not by me. MS does not claim that VS 2013 is compliant with C++11 in this area, which they call "magic statics". https://msdn.microsoft.com/en-us/library/hh567368.aspx . > If we don't expect a fix in the build servers soon, we need to identify all > such uses and a reasonable workaround. (My usecase is likely an odd one, > since it's on a hot path for realtime code and I can't just drop a critial > section/mutex in without a perf issue.
(In reply to Randell Jesup [:jesup] from comment #9) > (In reply to Nathan Froyd [:froydnj] from comment #8) > > Many of the instances of comment 0 are simply to avoid global static > > constructors, and are not race concerns. > > Is there any easy (or not-easy) way to identify which are which? Not that I know of. > Can someone verify if the latest MSVC 2013 patch fixes this? Has it been > reported to MS? Comment 10 suggests this has not been fixed, and I would be surprised if MS fixed the semantics of something like this in a patch update. > If we don't expect a fix in the build servers soon, we need to identify all > such uses and a reasonable workaround. (My usecase is likely an odd one, > since it's on a hot path for realtime code and I can't just drop a critial > section/mutex in without a perf issue. I can believe this, but I am also skeptical that ~100-300 cycles would be noticeable.
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #11) > (In reply to Randell Jesup [:jesup] from comment #9) > > If we don't expect a fix in the build servers soon, we need to identify all > > such uses and a reasonable workaround. (My usecase is likely an odd one, > > since it's on a hot path for realtime code and I can't just drop a critial > > section/mutex in without a perf issue. > > I can believe this, but I am also skeptical that ~100-300 cycles would be > noticeable. You can use a lock-and-spinwait approach, which costs only a few instructions when there is no contention. When there is contention, the waiter(s) waste CPU time, but since this is initialization code, will there be much contention?
Given MSVC, probably the correct approach to use is a safe-using-atomics double-checked lock (and we have Atomics available to make double-checked locking safe). Then the overhead of the lock is only seen once. Or as nathan says bite the bullet and take a hit on each call - it does this a lot, and in real-time-sensitive code - but still, perhaps even on mobile it's not enough to rise up. It rubs me the wrong way though to take that hit if we don't need to.
Note: my comment about perf mostly applies to the very specific use from webrtc.org, not the general usage this bug covers. More generally, we could implement a general GetStaticInstance() template which wraps either a c++11-compliant static instance, or a double-checked lock creator (or just a locked impl). Once the MSVC we use becomes compliant, we can just drop to a single impl.
Mmmm now that you mention a GetStaticInstance() template, I'm thinking we could actually have a template GetSingleton() function that can entirely replace every single occurrence of the pattern. That is, instead of having class A { static A* GetInstance() { static A a; return &a; } }; void foo() { A* singleton = A::GetInstance(); } we'd have: class A; template <typename T> T* GetSingleton() { static T s; return &s; } void foo() { A* singleton = GetSingleton<A>(); } (with GetSingleton fixed to do the right thing, obviously) Whether this does the right thing according to the C++ spec is discussed in detail on http://stackoverflow.com/questions/994353/static-variable-inside-template-function.
Exactly the sort of thing I was thinking of. On MSVC 2013 GetSingleton<> would be different.
security\sandbox\chromium\base\memory\singleton.h/.cc implements something like what you need. I haven't, however, reviewed it for correctness.
(In reply to q1 from comment #17) > security\sandbox\chromium\base\memory\singleton.h/.cc implements something > like what you need. I haven't, however, reviewed it for correctness. Also ipc/chromium/src/base/singleton*, which is an earlier version of the same file. A review of the differences doesn't show any obvious holes being plugged, just refinement. We should consider updating singleton to the newer version. I think for use here this would cover all our needs without relying on compiler guarantees about static initializations. So uses of the static singleton pattern would be converted to something like FooClass *ptr = Singleton<FooClass>::get();
Note that we can't even rely on VS2015 fixing the issue, because of bug 1204752
(In reply to Mike Hommey [:glandium] (VAC: 31 Dec - 11 Jan) from comment #19) > Note that we can't even rely on VS2015 fixing the issue, because of bug > 1204752 Blech!
I have a working patch for the webrtc uses using the ipc/*/singleton.h impl; I need to tweak it for android & b2g.
Nathan, do you have a suggestion for what security rating this should get? Do you think this is something we could actually be hit by or is this more of a theoretical concern?
Flags: needinfo?(nfroyd)
Not as many as I feared... Possible cases, if used from more than 1 thread. Most probably aren't. CamerasChild *might* be (gcp?) xpcom/io/nsUnicharInputStream.cpp:nsSimpleUnicharStreamFactory::GetInstance() dom/media/systemservices/OpenSLESProvider.cpp (though we don't build that with MSVC, so it's ok) dom/media/systemservices/CamerasChild.cpp (static CamerasSingleton& GetInstance()) -- may be single-thread only gfx/skia/skia/src/gpu/batches/GrBatch.cpp (GrMemoryPool* pool() const { static GrMemoryPool gPool(16384, 16384); return &gPool; } layout/base/nsCSSFrameConstructor.cpp (static const FrameConstructionData sSuppressData = SUPPRESS_FCDATA(); return &sSuppressData; There might be more; it's hard to search for the pattern.
Flags: needinfo?(gpascutto)
The CamerasChild one was definitely intended to have the safety guarantee from the C++11 standard. I think it even came up during patch review. I had no idea MSVC2013 was non-compliant there. Jesup, are you going to fix up that one too?
Flags: needinfo?(gpascutto)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #24) > The CamerasChild one was definitely intended to have the safety guarantee > from the C++11 standard. I think it even came up during patch review. I had > no idea MSVC2013 was non-compliant there. > > Jesup, are you going to fix up that one too? I can use the same pattern I'm using in the GetStaticInstance patch (using singleton.h from ipc): just make it a Singleton<T>, and use ::get() to get a pointer to it in GetInstance()
(In reply to Andrew McCreight [:mccr8] from comment #22) > Nathan, do you have a suggestion for what security rating this should get? > Do you think this is something we could actually be hit by or is this more > of a theoretical concern? The conditions to do this are racy, and even if you can make threads enter the supposedly-critical-section simultaneously, the class needs to be written in such a way that you can take advantage of it, which I don't think is very common. The example that comes to mind is if you have something like: class Static { std::vector<T> mInfo; ... }; Static& GetInstance() { static Static x; return x; } Static::Static() { mInfo.reserve(X); // Now you have threads racing to set the state of x::mInfo, above. The vector // allocates memory and stores interior pointers to it. With racing, it's // possible that you wind up with some of those interior pointers pointing at // the memory allocated by thread 1 and some of the pointers pointing at the // memory allocated by thread 2, and that's going to cause some problems // somewhere down the road.... } So...probably sec-high (?).
Flags: needinfo?(nfroyd)
Sounds reasonable, thanks. It sounds like it is not too likely to be an issue in practice, but better safe than sorry.
Keywords: sec-high
Fix using singleton.h.
Attachment #8709462 - Flags: review?(rjesup)
Attachment #8709462 - Flags: review?(rjesup) → review+
Comment on attachment 8709462 [details] [diff] [review] Switch to singleton.h for safe init. [Security approval request comment] How easily could an exploit be constructed based on the patch? See comment 26. Looks fairly difficult and uncertain. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No, it looks like a cleanup. I think you mostly have to be familiar with the original problem. Which older supported branches are affected by this flaw? None (Firefox 43 and later) If not all supported branches, which bug introduced the flaw? Bug 1104616. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? How likely is this patch to cause regressions; how much testing does it need? Seems quite unlikely. Switching from own to existing library code.
Attachment #8709462 - Flags: sec-approval?
We already a release candidate for 44, which is shipping in a week. This is too late for this cycle and will have to go in on Feb 9 or later.
Whiteboard: [checkin on 2/9]
Attachment #8709462 - Flags: sec-approval? → sec-approval+
Group: core-security → dom-core-security
Group: dom-core-security → core-security-release
lowering severity to moderate. If you win the race lottery this could be exploited, but there's no evidence you can do that let alone do it reliably.
Keywords: sec-highsec-moderate
It is okay to land this now. Thanks.
Assignee: nobody → gpascutto
Flags: needinfo?(gpascutto)
Flags: needinfo?(gpascutto)
What's the procedure for moving this to m-a, m-b, m-r? Request approval as usual? Do we want this to bake a few days on Nightly to further eliminate regression risk?
(In reply to Gian-Carlo Pascutto [:gcp] from comment #36) > What's the procedure for moving this to m-a, m-b, m-r? Request approval as > usual? Yes, though as a sec-moderate (rather than a sec-high or sec-critical) it may not be approved for uplift, unless it has some kind of impact on stability. But you should ask anyways if you think it is worth doing. > Do we want this to bake a few days on Nightly to further eliminate regression risk? Sure, if you think that makes sense.
Whiteboard: [checkin on 2/9]
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Flags: sec-bounty? → sec-bounty+
gcp, do you want to request uplift for this to aurora (maybe beta too)?
Flags: needinfo?(gpascutto)
Comment on attachment 8709462 [details] [diff] [review] Switch to singleton.h for safe init. Approval Request Comment [Feature/regressing bug #]: Bug 1104616 [User impact if declined]: sec-moderate on Windows [Describe test coverage new/current, TreeHerder]: Everything that tests WebRTC+camera [Risks and why]: Low risk, breakage would be obvious from non-functioning camera, replacing custom by library code [String/UUID change made/needed]: NA
Flags: needinfo?(gpascutto)
Attachment #8709462 - Flags: approval-mozilla-beta?
Attachment #8709462 - Flags: approval-mozilla-aurora?
Can someone reassign this to the right component? It's definitely not Core::Build Config.
gcp's stuff is webrtc: Audio/video probably. The original report was more general; however I think we've now decided that all the instances of it in the tree have been dealt with other than this in.
Component: Build Config → WebRTC: Audio/Video
Comment on attachment 8709462 [details] [diff] [review] Switch to singleton.h for safe init. Fix a security issue, taking it. Should be in 45 beta 7
Attachment #8709462 - Flags: approval-mozilla-beta?
Attachment #8709462 - Flags: approval-mozilla-beta+
Attachment #8709462 - Flags: approval-mozilla-aurora?
Attachment #8709462 - Flags: approval-mozilla-aurora+
I'm hitting conflicts uplifting this to beta. Haven't tried aurora yet. Can we get a rebased patch?
Flags: needinfo?(gpascutto)
Whiteboard: [adv-main45+]
Whiteboard: [adv-main45+] → [adv-main45+][post-critsmash-triage]
Alias: CVE-2016-1975
FWIW, this broke the build on at least arm64.
(In reply to Mike Hommey [:glandium] from comment #47) > FWIW, this broke the build on at least arm64. Presumably, that's fixed by bug 1250403.
FWIW, this also broke linking signaling standalone tests on esr45 on powerpc. I'll just apply bug 1239866.
Depends on: 1257888
Group: core-security-release
Attachment #8719961 - Attachment description: q1@lastland.net,1500?,2015-12-05,2016-02-12,2016-02-16,true,,, → q1@lastland.net,1500,2015-12-05,2016-02-12,2016-02-16,true,,,
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: