Open Bug 1238735 Opened 9 years ago Updated 2 years ago

Add LSP blocking abilities to DLL blocklist

Categories

(Core :: DLL Services, enhancement)

Unspecified
Windows
enhancement

Tracking

()

Tracking Status
firefox46 --- wontfix
firefox50 - wontfix
firefox51 + wontfix
firefox52 + wontfix
firefox53 + wontfix
firefox54 --- affected

People

(Reporter: bugzilla, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: inj+)

Attachments

(2 files)

I'm not sure how people feel about this, but I've figured out a way to safely block bad LSPs. My patch adds a special flag to the DLL blocklist such that, when seen, the blocklist substitutes its own module handle (ie, mozglue.dll) for the DLL load. The caller then treats mozglue as an LSP. I implemented a dummy LSP that just passes through all requests to the next layer in the LSP chain. There is some non-trivial code in the LSP passthrough part, as we need to do some acrobatics to figure out which LSP in the chain we are acting as a substitute for. We also need to be prepared to be substituting for multiple layers in the unlikely case that there are multiple "bad" LSPs in a chain -- that is something that normal LSPs don't need to worry about. Anyway, I'm going to attach the patches here and we can decide whether or not this is something we want.
Attached patch Proposed patch (deleted) — Splinter Review
Aaron, any progress here? We have a couple of bugs above that might benefit from having LSP blocking capability.
Flags: needinfo?(aklotz)
We might want to bring this up at the channel meeting next Tuesday, or with the platform team, or invite specific people who might have an opinion or review your proposal (i.e. it shouldn't all be on you, and how can I help? )
Well, the patch is basically written and could be rebased, reviewed, and landed. It is somewhat risky in the sense that we don't necessarily know how this code will behave with various LSPs in the wild, so we would definitely want this to ride the trains. I'd like to hear thoughts on this from bsmedberg and mcmanus for sure. I am not sure who else we might need to give the OK to land this.
Flags: needinfo?(mcmanus)
Flags: needinfo?(benjamin)
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] from comment #4) > It is somewhat risky in the sense that we don't necessarily know how > this code will behave with various LSPs in the wild, so we would definitely > want this to ride the trains. I should point out, however, that this patch is written to only block specific versions of specific LSPs (just like the rest of our blocklisting infra) so I would hope that as long as we got a good QA pass with machines running these LSPs, we could be reasonably confident.
I don't know that I have enough expertise to really say anything about the actual suggestion other than its quite clever. I will say the ability to block an LSP would as Liz say be very valuable and I'll add that I'm willing to accept a little risk over that.
Flags: needinfo?(mcmanus)
I think in terms of technical risk assessment, we need somebody other than me. Perhaps mhowell? I think it might also be worth testing this extensively LSPs which are currently working. Try blocking a bunch of those and see what happens when we do testing.
Flags: needinfo?(benjamin)
Matt, can you take a look at Aaron's patch? It would be nice to have this ability (for example, for bug 1290403).
Flags: needinfo?(mhowell)
Aaron actually already pinged me to ask about my familiarity with LSPs, and I had to be honest and tell him I didn't have any. But after that I couldn't help reading up a bit on them, looking up the SDK samples, etc., so I got the basics. In my opinion, I don't know of any risks this patch would introduce that outweigh mitigating bug 1290403 and other crashes like it. But I do agree with comment 7 that we should test with some known good LSPs, as well as ones we really do intend to block.
Flags: needinfo?(mhowell)
Liz do you know who can take on testing known LSPs per comment 9?
Flags: needinfo?(lhenry)
I think Aaron will figure out QA for testing his patch.
Flags: needinfo?(lhenry) → needinfo?(aklotz)
The startup crashes with chtbrkg.dll are currently #1, #5, #12 and #49 in the top crashers list for 50. We're also seeing this crash on Nightly, so if we land this patch we can quickly get confirmation that the LSP blocking is effective. See also bug 523410, where we restricted the allowed LSPs (backed out in bug 636088 because it broke a firewall).
Thanks Marco, maybe we can test this and uplift it to at least 50. Tracking to make sure we keep an eye on that as a possibility.
r? mhowell for the LSP bits (since he's looked the LSP APIs a little bit already), bsmedberg for the blocklist augmentations.
Flags: needinfo?(aklotz)
Aaron, the Winsock_LSP field in https://crash-stats.mozilla.com/report/index/abacab92-1441-41b4-8db7-ea9de2160729#tab-metadata is: > Microsoft Network Filter over [MSAFD Tcpip [TCP/IP]] : 2 : 2 : 1 : 6 : 0x66 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD Tcpip [UDP/IP]] : 2 : 2 : 2 : 17 : 0x609 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD Tcpip [RAW/IP]] : 2 : 2 : 3 : 0 : 0x609 : 0xc : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD Tcpip [TCP/IPv6]] : 2 : 23 : 1 : 6 : 0x66 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD Tcpip [UDP/IPv6]] : 2 : 23 : 2 : 17 : 0x609 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD Tcpip [RAW/IPv6]] : 2 : 23 : 3 : 0 : 0x609 : 0xc : chtbrkg.dll : : > Microsoft Network Filter over [RSVP TCPv6 Service Provider] : 2 : 23 : 1 : 6 : 0x2066 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [RSVP TCP Service Provider] : 2 : 2 : 1 : 6 : 0x2066 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [RSVP UDPv6 Service Provider] : 2 : 23 : 2 : 17 : 0x2609 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [RSVP UDP Service Provider] : 2 : 2 : 2 : 17 : 0x2609 : 0x8 : chtbrkg.dll : : > Microsoft Network Filter over [MSAFD RfComm [Bluetooth]] : 2 : 32 : 1 : 3 : 0x26 : 0x8 : chtbrkg.dll : : > MSAFD Tcpip [TCP/IP] : 2 : 2 : 1 : 6 : 0x20066 : 0x8 : %SystemRoot%\system32\mswsock.dll : : e70f1aa0-ab8b-11cf-8ca3-00805f48a192 > MSAFD Tcpip [UDP/IP] : 2 : 2 : 2 : 17 : 0x20609 : 0x8 : %SystemRoot%\system32\mswsock.dll : : e70f1aa0-ab8b-11cf-8ca3-00805f48a192 > MSAFD Tcpip [RAW/IP] : 2 : 2 : 3 : 0 : 0x20609 : 0xc : %SystemRoot%\system32\mswsock.dll : : e70f1aa0-ab8b-11cf-8ca3-00805f48a192 > MSAFD Tcpip [TCP/IPv6] : 2 : 23 : 1 : 6 : 0x20066 : 0x8 : %SystemRoot%\system32\mswsock.dll : : f9eab0c0-26d4-11d0-bbbf-00aa006c34e4 > MSAFD Tcpip [UDP/IPv6] : 2 : 23 : 2 : 17 : 0x20609 : 0x8 : %SystemRoot%\system32\mswsock.dll : : f9eab0c0-26d4-11d0-bbbf-00aa006c34e4 > MSAFD Tcpip [RAW/IPv6] : 2 : 23 : 3 : 0 : 0x20609 : 0xc : %SystemRoot%\system32\mswsock.dll : : f9eab0c0-26d4-11d0-bbbf-00aa006c34e4 > RSVP TCPv6 Service Provider : 2 : 23 : 1 : 6 : 0x22066 : 0x8 : %SystemRoot%\system32\mswsock.dll : : 9d60a9e0-337a-11d0-bd88-0000c082e69a > RSVP TCP Service Provider : 2 : 2 : 1 : 6 : 0x22066 : 0x8 : %SystemRoot%\system32\mswsock.dll : : 9d60a9e0-337a-11d0-bd88-0000c082e69a > RSVP UDPv6 Service Provider : 2 : 23 : 2 : 17 : 0x22609 : 0x8 : %SystemRoot%\system32\mswsock.dll : : 9d60a9e0-337a-11d0-bd88-0000c082e69a > RSVP UDP Service Provider : 2 : 2 : 2 : 17 : 0x22609 : 0x8 : %SystemRoot%\system32\mswsock.dll : : 9d60a9e0-337a-11d0-bd88-0000c082e69a > MSAFD RfComm [Bluetooth] : 2 : 32 : 1 : 3 : 0x20026 : 0x8 : %SystemRoot%\system32\mswsock.dll : : 9fc48064-7298-43e4-b7bd-181f2089792a > Microsoft Network Filter : 2 : 2 : 0 : 0 : 0x66 : 0x4 : chtbrkg.dll : 0x0 : 99552d36-7b14-4992-b6f5-052f73454665 Does this mean that `chtbrkg.dll` is also a base provider and so can't be blocked via this blocklisting mechanism?
Flags: needinfo?(aklotz)
All provider types are included in that field (and in fact, we need that field to determine the GUID to use for blocklisting). But in this case the field is showing that chtbrkg.dll is a layer provider. Third-party base providers are very rare and are usually found on enterprise servers, as they involve implementation of an entire network protocol stack. The default base provider is mswsock.dll, which is present on the list.
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] from comment #17) > All provider types are included in that field (and in fact, we need that > field to determine the GUID to use for blocklisting). But in this case the > field is showing that chtbrkg.dll is a layer provider. > > Third-party base providers are very rare and are usually found on enterprise > servers, as they involve implementation of an entire network protocol stack. > The default base provider is mswsock.dll, which is present on the list. Great then! I misinterpreted the condition for logging the GUID (I thought we printed the GUID only for base providers, because of https://dxr.mozilla.org/mozilla-central/rev/13f49da109ea460665ad27c8497cb1489548450c/widget/windows/LSPAnnotator.cpp#132, but actually BASE_PROTOCOL is 1, LAYERED_PROTOCOL is 0).
Comment on attachment 8811537 [details] Bug 1238735: Add LSP blocking capabilities to Windows DLL blocklist; https://reviewboard.mozilla.org/r/93626/#review94048 I think we're good; just a few minor comments. The LSP mechanism is pretty strange and underdocumented, even by Windows standards. Everything here is fine as far as I can tell, but I reiterate my earlier comment that I really want to see this tested for both blocking and leaving alone known-good LSP's (assuming we can find any), and preferably on a smattering of different Windows versions. ::: mozglue/build/WindowsDllBlocklist.cpp:17 (Diff revision 1) > #define MALLOC_DECL(name, return_type, ...) \ > extern "C" MOZ_MEMORY_API return_type name ## _impl(__VA_ARGS__); > #include "malloc_decls.h" > #endif > > +#include <ntstatus.h> Not sure why this was added; I don't see anything here that's in ntstatus.h but not in winnt.h. If I missed something, I wouldn't mind a comment saying what it was. ::: mozglue/build/WindowsDllBlocklist.cpp:224 (Diff revision 1) > { "nhasusstrixosd.dll", ALL_VERSIONS }, > { "nhasusstrixdevprops.dll", ALL_VERSIONS }, > + > + // LSPs for unit testing > + // IMPORTANT: Do not specify SUBSTITUTE_LSP_PASSTHROUGH without adding > + // the LSP's Protocol GUID to gLayerGuids in WindowsLSPPassthrough.cpp! It would be awesome to have a way to statically assert this. ::: mozglue/build/WindowsDllBlocklist.cpp:760 (Diff revision 1) > + if (!mozilla::lsp::IsRegisteredBlockedLSP(full_fname.get())) { > + // We want to block this LSP, however this DLL does not match any > + // registered LSP layer providers whose GUIDs are blocked! > + // Failing to allow the DLL load under these conditions will likely > + // break Winsock, so we must allow the LSP to load. > + goto continue_loading; Is there a way for this to happen other than trying to block a DLL that got added to the blocklist but not to gLayerGuids? If not, there should be an assert here.
Attachment #8811537 - Flags: review?(mhowell) → review+
Comment on attachment 8811537 [details] Bug 1238735: Add LSP blocking capabilities to Windows DLL blocklist; https://reviewboard.mozilla.org/r/93626/#review97880 I'm mainly worried about really comprehensive manual testing (in addition to the automated testing) for this change. Do we have a written test plan for this yet?
Attachment #8811537 - Flags: review?(benjamin) → review+
Andrei, is your team covering QE for this? Aaron and Matt, can you work with Andrei to come up with a test plan? Sounds like this may be somewhat challenging.
Flags: needinfo?(mhowell)
Flags: needinfo?(andrei.vaida)
Flags: needinfo?(aklotz)
If you follow bug 636088 and bug 523410 dependencies, duplicates and bugs mentioned in comments, you can get a (not sure how large) list of known good LSPs.
these seem to be the most common legitimate custom lsp providers according to recent crash stats data: *Sophos Web Intelligence IFSLSP: C:\ProgramData\Sophos\Web Intelligence\swi_ifslsp.dll *Forefront TMG Client Service Provider: C:\Program Files (x86)\Forefront TMG Client\FwcWsp.dll *LavasoftLSP: C:\Windows\system32\LavasoftTcpService.dll *PGPlsp: C:\windows\system32\PGPlsp.dll *OpenText SOCKS Client: C:\Program Files (x86)\Open Text\SOCKS Client\HumSOCKS.dll
My idea about testing this was just to install one of the known good LSP's and a Firefox build with this patch, and verify that the good LSP still functions. Use whichever one would be easiest to verify basic functionality for; if we break the thing with this patch, we would most likely break it completely and not partially, so a very basic check should suffice.
Flags: needinfo?(mhowell)
In bug 630481 (linked to the bugs I mentioned in comment 22) the brokenness is not so complete (Firefox was still able to connect, but the LSP was bypassed and so was not blocking what it was supposed to block). I suppose we should also test that the LSPs still do what they are supposed to do.
Sorry, yeah, that's exactly what I was trying to say; likely either the LSP would be entirely working or it would not work at all. I don't think other network functions would be affected.
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #21) > Andrei, is your team covering QE for this? > Aaron and Matt, can you work with Andrei to come up with a test plan? > Sounds like this may be somewhat challenging. Andrei is currently on PTO. Here's my take on this: 1. My understanding is we want to test both that (some) good LSPs still work, and that (some) blocked LSPs won't work. Does anyone have any examples of LSPs that are/will be blocked? 2. Do we have a build that we can use for testing? 3. We'll work to find an owner for testing this. Keeping Andrei's ni so he can see this.
Aaron can you get a build to Florin ? Is there anything you can suggest as 3rd party LSP code that we can install for testing? I think probably the usual antivirus software. Whatever uses LSPs.
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #28) > Aaron can you get a build to Florin ? > Is there anything you can suggest as 3rd party LSP code that we can install > for testing? I think probably the usual antivirus software. Whatever uses > LSPs. We have LSP annotations in crash reports, you might be able to pull some software data out of that to get an idea of what software packages use LSPs. Here's a try build - https://treeherder.mozilla.org/#/jobs?repo=try&revision=137ab18bf0765efcb3b6c313aff04a882c6da1a0 FYI aklotz is concentrating on e10s+a11y at the moment. We'll circle back around on this in the new year.
Flags: needinfo?(aklotz)
We'll probably cover this as a feature. Given that it hasn't landed yet, it's possible that the Engineering QA team will take ownership of this.
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #30) > We'll probably cover this as a feature. Given that it hasn't landed yet, > it's possible that the Engineering QA team will take ownership of this. This has been indeed picked up by the Engineering QA team.
Flags: needinfo?(andrei.vaida)
Status: NEW → ASSIGNED
Blocks: 1435776
No longer blocks: injecteject
Priority: -- → P2
Whiteboard: inj+
Type: defect → enhancement
Component: General → mozglue
Depends on: 1542830
No longer depends on: 1542830
Assignee: bugzilla → nobody
Status: ASSIGNED → NEW
Component: mozglue → DLL Services
Priority: P2 → --
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: