Closed Bug 497495 (FramePoisoning) Opened 15 years ago Closed 15 years ago

Always poison deallocated objects in the frame arena

Categories

(Core :: Layout, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta2-fixed
status1.9.1 --- wontfix

People

(Reporter: zwol, Assigned: zwol)

References

(Depends on 1 open bug, Blocks 1 open bug, )

Details

(Whiteboard: [sg:nse] defense-in-depth)

Attachments

(13 files, 36 obsolete files)

(deleted), text/csv
Details
(deleted), application/x-gzip
Details
(deleted), application/x-gzip
Details
(deleted), application/vnd.ms-excel
Details
(deleted), text/csv
Details
(deleted), application/pdf
Details
(deleted), application/pdf
Details
(deleted), application/vnd.ms-excel
Details
(deleted), patch
zwol
: review+
Details | Diff | Splinter Review
(deleted), patch
zwol
: review+
Details | Diff | Splinter Review
(deleted), patch
zwol
: review+
Details | Diff | Splinter Review
(deleted), patch
zwol
: review+
Details | Diff | Splinter Review
(deleted), patch
zwol
: review+
Details | Diff | Splinter Review
Attached file spreadsheet summarizing crashes (deleted) —
It's been theorized that we can reduce the security risk due to frame tree maintenance bugs by unconditionally -- even in release builds -- overwriting deallocated frame objects with a "poison" pattern that forms a pointer to inaccessible memory. I've done some preliminary testing of this concept and it looks promising: of six crash-on-load test cases that I have, five change from "jump to an address that might be controllable by an attacker" to "data read from the poisoned address", and the other one appears to not involve frame objects directly. Attached will be: 1. A spreadsheet listing the small handful of current sg:crit bugs that contained test cases that actually crash a release-mode build of current trunk (where "current" means "right before the focus patches landed" -- specifically revision 1057ca8f2a91). Each is annotated with the crash syndrome in unmodified trunk and with the patches that I'm proposing. 2. A tarball of all the test cases I'm using. 3. A tarball of the automation I'm using. 2. Two patches, which together implement the poisoning. They are feature complete with one critical exception: both Windows and Linux define certain regions of the address space to be permanently inaccessible, but MacOS X does not. On MacOS we need to instead mmap() a couple pages of memory, mprotect() them PROT_NONE, and use an address in the middle of that range (to catch both positive and negative offsets). I would prefer if someone who actually has a Mac wrote the code for that. (It would also be good if someone else carefully checked whether my understanding of the address space layout on Windows and Linux is accurate. I'm cc:ing bsmedberg largely for that reason.) 5. A third patch, which is *not* intended for inclusion, but which you will need in order to reproduce my results. It does two things: dissociates this experimental build from normal profiles, and adds a special mode to nsSigHandlers.cpp in which we spawn a gdb process to generate stack traces for us, rather than trying to do it ourselves; this is available even in non-debug builds. The automation is set up to use this special mode. I shall also kick off try server builds to get us talos numbers.
Attached file test case bundle (deleted) —
Only 6 of the 17 or so test cases in here actually crash current trunk; the ones that don't are not mentioned in the spreadsheet.
Attached file automation (deleted) —
You use this automation like so: unpack it and the test cases in sibling directories of your build tree, and then run driver.sh from the top level of the build tree, like this: $ ../multicrash/driver.sh ../testcases-crash-on-load-2 >& multicrash.log
Attached patch "baseline" patch (NOT FOR INCLUSION) (obsolete) (deleted) — Splinter Review
This is the "baseline" patch required to reproduce my build environment exactly; the most important piece of it is the MOZ_USE_GDB_FOR_STACKTRACE hack, without which you will not get much information from the "multicrash" automation. It also includes changes to isolate this build from regular profiles, and the fix for bug 486065, which I need to get non-garbage results from reftest.
Now comes the real thing. This piece just splits out the frame arena to its own file (and renames it "nsPresArena", because it's not just for frames).
Attachment #382636 - Flags: review?(roc)
This is the meat of the proposed change.
Attachment #382637 - Flags: review?(roc)
Looking at your results, we can see that in release builds we mostly crash with a null deref. I was wondering, somewhat heretically, whether whether we really should do frame poisoning at all. We've been classifying as critical frame-tree corruption bugs that dereference 0xdddddddd, especially during vtable dispatch. But how exploitable really is a use-after-free of a frame pointer? In almost all the cases we have ever seen, the frame arena memory has not been returned to the general heap (because that doesn't happen until the presshell is torn down, at which time almost all possible references to the presshell's frames, especially from other frames, are gone). So, the only way to exploit these bugs is to spray the frame arena itself. But the usual spraying vectors --- strings, images, in fact, arrays of any kind --- are never allocated in a frame arena. Assuming we split style data and frames into separate arenas, the only way for an attacker to inject data into freed frame memory is by allocating actual new frames and setting fields that are actually under outside control (e.g., not vtable pointers). Now, if the attacker can construct an exploit that way, I suspect they can bypass frame poisoning as well, since I suspect it's usually possible to arrange for frames to be created before the bad dereference is triggered. If this is true then frame poisoning doesn't buy us much. But I'm not sure how easy that is. Still, I propose we do the following further analysis. From debug info we can extract the types and layout of the fields of every frame implementation class. Furthermore we can split those classes into groups which share the same free-list in the frame arena. Then we can automatically check for situations where a pointer field in one frame class aliases a non-pointer field, or less worryingly a pointer field of a different type, in another frame class. Then we can think about whether those situations might be exploitable. Or maybe it would be better and safer to just make finer-grained freelists so that different frame class layouts are always on different freelists.
Attached file talos results spreadsheet (deleted) —
These are talos numbers from the try server; I uploaded a null patch (*not* the "baseline" patch, which apparently breaks things on not-my-computer) and then the combination of the two patches above. I'll add a .csv version shortly. I'm not sure how much I believe these numbers, especially the _shutdown ones which seem to be jumping all over the place. (There is a certain temptation to land the patches just to get official talos numbers.) However, to the extent I do believe them, it looks like this is a small but real performance hit, which makes sense.
Attached file talos results spreadsheet (CSV) (deleted) —
Only the Windows results are possibly significant there. In nsFrame::Destroy, if shell->IsDestroying() we can just skip the call to FreeFrame, I think. That would probably help with any perf issues.
Attached file structure layout analysis of frame objects (obsolete) (deleted) —
The two PDFs I've just attached are analyses of where the pointers are in each object which is identifiably allocated in one or other of the pres shell's arenas: this one is for miscellaneous (mostly style-related) objects and the other one is for frames proper (specifically, things that inherit from nsQueryFrame). Both were generated with a debug build on 64-bit Linux. (A release build, even with --enable-debug-info-modules, does not contain enough debugging information to generate these charts.) Beware that the visualizations, especially that of frames proper, are huge images (hence PDF, for smooth scaling). It's probably easier to understand them if you study the smaller "misc objects" PDF first. The structs are laid out horizontally, with byte 0 at the far left, eight bytes per vertical division. That's one pointer on this architecture. There's a key in the upper left-hand corner -- the thing to know is that green is anything-but-a-pointer, yellow is a pointer to data, and orange is a pointer to a vtable (thus, a double indirection to code). Blue is padding -- there would be less of it on a 32-bit architecture. Red would have been used for direct pointers to code but there aren't any in these classes. All structures that would have generated identical rows in the table have been merged together and labeled with a random one of their names. Thus, for instance, 'BRFrame' in the frame-object visualization stands for nsFrame and several other frame classes that add no additional data members to nsFrame. In a few cases you'll see structures with numeric suffixes on their names - that indicates combinatoric expansion of unions. For instance, there are 16 variations on nsMathMLpaddedFrame shown, differing only in which of various union members is live. The program does not know which of these variations are actually possible. To answer the question roc asked in comment #6, look for a group of rows that are all the same length (the white gaps separate such groups). Any time green or blue lines up vertically with yellow or orange within the group, we have data/pointer aliasing within a free list. Green/orange is more of a risk than green/yellow or yellow/orange, I think, but I am really not a security expert. Code provided on request, but be warned that it requires not a little hand-holding. I can run the analysis for any set of structures easily, now, and plan to mess around with repacking stuff and see how much it helps.
I just noticed an error in the "misc presentation arena" table: some of the objects' sizes are not rounded up to a multiple of 8. The arena code always rounds up to a multiple of 8, so (for instance) nsStyleColumn belongs in the same block as nsStyleMargin and nsStyleText.
This is excellent. It would be useful to be able to identify yellow pointers that are the same type. Yellow/yellow aliasing with different types could be an issue (although seems really hard to exploit), yellow/yellow aliasing with the same types is almost certainly not an issue.
In nsStyleSVG and nsMathMLPaddedFrame (and other places like nsStyleContentData), the unions are clearly tagged and it should be easy to ensure that every access to the union is guarded by a tag check. So we should be able to treat all those unions as non-aliasable data.
nsMathMLFrame is not a real frame type and never stands alone, it's just a mixin for MathML frames, so we should leave it out of the analysis.So can pure virtual classes like nsIFrame, nsIFrameDebug, nsFormControlFrame, nsContainerFrame, nsSVGDIsplayContainerFrame. We already allow for 100 freelists in nsFrameArena, and there are less than 100 different rows here. If we distinguished pointer fields of different type and there are still less than 100 different signatures, then almost certainly we could just have one freelist per signature without causing memory usage regressions. With slightly deeper analysis, and perhaps some moving around of fields, we can probably reduce the number of freelists required, if necessary. I hadn't realized we already use so many freelists. I'm pretty confident we can make this work without actually needing to poison.
(In reply to comment #16) > nsMathMLFrame is not a real frame type and never stands alone, it's just a > mixin for MathML frames, so we should leave it out of the analysis.So can pure > virtual classes like nsIFrame, nsIFrameDebug, nsFormControlFrame, > nsContainerFrame, nsSVGDIsplayContainerFrame. Do you have an exhaustive list of such types, or suggestions for how to make an exhaustive list? I find myself unable to make even an educated guess about whether, e.g. nsSplittableFrame should be considered "not a real frame type". The attached revised PDF takes out the types you listed, nsSplittableFrame, and everything that was just a vtable pointer. I haven't distinguished different pointer types yet. > We already allow for 100 freelists in nsFrameArena, and there are less than 100 > different rows here. If we distinguished pointer fields of different type and > there are still less than 100 different signatures, then almost certainly we > could just have one freelist per signature without causing memory usage > regressions. I'm worried about how to label classes with their signature in a way that won't regress the moment someone decides they need a new field.
Attachment #384762 - Attachment is obsolete: true
The only difference here is that object sizes are properly rounded up to a multiple of 8.
Attachment #384766 - Attachment is obsolete: true
(In reply to comment #17) > I'm worried about how to label classes with their signature in a way that won't > regress the moment someone decides they need a new field. One way would be to just make the freelists a hash table keyed off the vtable pointer.
This additional patch makes nsIPresShell::FreeFrame and nsIPresShell::FreeMisc do nothing if called while the pres shell is being torn down. (This does not cause a memory leak; the arenas themselves will be freed en masse.) Talos results for this addition to follow.
Attachment #385275 - Flags: review?(roc)
This time I went to the trouble of copying down *all* the talos numbers and doing actual statistics on them. (Two-sample unpaired T-test with unequal variances, for each talos subtest on each platform.) The take-home is at the very bottom: none of the observed differences are significant when corrected for multiple comparisons. The statistical power of this analysis is very weak -- I would actually read it to say that talos on the try server gives no useful information, even if you do a baseline run. There are too few observations in each bin, and too many bins.
The above whingeing about statistical power is now formalized as bug 500562, for the record.
Does this bug/investigation need to be a hidden security bug? I don't see anything here that would help someone attack us (though I may have missed it).
Whiteboard: [sg:nse] defense-in-depth
My current feeling is that we should take all these patches. But we should do some more work so that the frame arena uses one freelist per frame class, by hashing on the frame's vtable pointer. The frame poisoning should be extremely frustrating for fuzzers, but perhaps can be worked around by a dedicated attacker. But using one freelist per frame class means that even if you dereference a dangling pointer, you'll only see either poisoned data or a live frame of the same type, so type safety is not actually violated. Well, there is one issue left unaddressed: the free-list 'next' pointer that is currently stored in the first word of each object. Potentially that pointer itself could be an exploitation vector. No layout code would overwrite the vtable pointer (unless something was already corrupt), so we don't have to worry about the 'next' pointer being corrupted. But we do have to worry about someone doing a virtual dispatch treating the 'next' pointer as a vtable pointer. Can we reduce the 'next' pointer to an index into the arena, so it's an integer with some reasonable upper bound, and then flip some high bits so that if treated as a pointer it always points to non-addressable memory?
(In reply to comment #24) > Can we reduce the 'next' pointer to an index into the arena, so it's an integer > with some reasonable upper bound, and then flip some high bits so that if > treated as a pointer it always points to non-addressable memory? Or we could keep arrays of free objects on the side. (nsVoidArray? nsTArray<void*>? Can either of those be put in a PLDHash?)
nsTArray can be put in an nsTHashtable. That would work.
I have just pushed a new edition of these patches to the try server. I will attach revised patches tomorrow morning when I've seen the results; my SO is being very patient but I really ought to stop ignoring her, it's past 9pm. :)
I've found a bunch of problems with the current iteration of the patches; also, it seems like m-c is mildly broken right now. Stay tuned.
Attachment #382636 - Attachment is obsolete: true
Attachment #382635 - Attachment is obsolete: true
Attachment #382637 - Attachment is obsolete: true
Comment on attachment 385275 [details] [diff] [review] Patch 3/2: skip deallocation during pres shell teardown Marking all the old patches obsolete to avoid confusion, as I'm about to submit a whole new set.
Attachment #385275 - Attachment is obsolete: true
New patch series begins here. This is functionally the same change as the previous "1/2: separate the frame arena object from nsPresShell.cpp", and nearly the same text. The overall patch series is different enough that I'm going to re-request review on everything, though. This patch can be compiled by itself and could go in ahead of the rest of the changes.
Attachment #390075 - Flags: review?(roc)
This patch makes API changes to nsIPresShell and nsFrame. The changes have no semantic effect right now, but will at the end of the patch series. The goal is to be able to index the free lists in nsPresArena by (size, frame id) pairs rather than just size; this information therefore has to be available to AllocateFrame and FreeFrame, and this means cascading adjustments, ultimately to all 200-odd places where new frame objects are created :-/ So I'm doing all that ahead of the patch that actually makes functional changes. This patch cannot be compiled by itself; the next two patches, which make a long series of search-and-replace changes to update all users of the changed APIs, are required. I've split it up for ease of review only. I think this is the only patch in the series that wants an sr under the new rules.
Attachment #390078 - Flags: superreview?(dbaron)
Attachment #390078 - Flags: review?(roc)
This is a search-and-replace of new (aPresShell) frameclass(args); with NS_CREATEFRAME(aPresShell, frameclass)(args); There should be no other functional changes in this patch. I did strip out some useless boilerplate comments and trailing whitespace /en passant/.
Attachment #390080 - Flags: review?(roc)
This patch is a search-and-replace of AllocateFrame with AllocateMisc and FreeFrame with FreeMisc everywhere except nsFrame::operator new and nsFrame::Destroy; in other words, everywhere they're not being used for actual frame objects (subclasses of nsFrame). 2a/2b/2c must be applied together (on top of 1) to get something that compiles. There should be no runtime consequences of 1+2a+2b+2c (a handful of places are now passing around code numbers that get ignored, but that's it) and I am running a try server build to test this (build identifier "framepoison-prep").
Attachment #390082 - Flags: review?(roc)
*sigh* The try server appears to have melted down.
We discussed modifying NS_DECLARE_FRAME_ACCESSOR to override operator new on a per-frame-class basis. Did that not work?
That's patch 3a and 3b, coming Real Soon Now.
Actually, if you meant to say that I could leverage that to avoid having to change every NS_New*Frame function ... should have realized that was what you meant the first time, and will see if I can rejigger things to make that happen.
roc expressed interest in seeing the complete patchset now, even though it's not really done yet. This *appears* to work, but I've only thrown the reftests at it. Try server builds are running (code name framepoison-2). On Monday I'll be making a proper patch series out of this and also confirming that I hit every last frame class with the queryframe API changes (see the comments in nsFrame.h for why this is important). There's a hook in place for a dehydra check for that, yet to be written.
Attachment #390078 - Attachment is obsolete: true
Attachment #390080 - Attachment is obsolete: true
Attachment #390082 - Attachment is obsolete: true
Attachment #390080 - Flags: review?(roc)
Attachment #390082 - Flags: review?(roc)
Attachment #390078 - Flags: superreview?(dbaron)
Attachment #390078 - Flags: review?(roc)
Incidentally, I didn't mark patch 1/4 obsolete but I am going to revise it slightly, because I just noticed that there are some headers that nsPresShell.cpp no longer needs.
Blocks: 507294
Here goes with series 3, which is working but for a bug in the very last patch that I can't find; I'll talk about that with that patch. This is a very slight revision of series 2 patch 1, which split the frame arena to its own file; the only change is that I noticed nsPresShell.cpp no longer needs to #include "plarena.h" or #define PL_ARENA_CONST_ALIGN_MASK. Carrying previous review forward.
Attachment #390075 - Attachment is obsolete: true
Attachment #390612 - Attachment is obsolete: true
Attachment #391502 - Flags: review+
The frame arena is used for variety of miscellaneous objects in addition to true frames (that is, subclasses of nsFrame). This patch introduces a separate API to allocate and free these objects: nsIPresShell::AllocateMisc/FreeMisc. They end up being allocated from the same PLArenaPool, but we are equipped to distinguish them in the allocator now. In the interest of only changing nsIPresShell once, this patch also adds new "code" parameters to AllocateFrame and FreeFrame. These are not used yet, but will be in later patches. I fix up the calls from nsFrame to these functions in this patch, but I do not fix up the calls that are to become calls to AllocateMisc/FreeMisc. That's part 2b. Thus, this patch cannot be compiled by itself. I also move the clearing on allocation of true-frame memory from nsFrame operator new to AllocateFrame; this is because patches 3a/b will give every frame class its own operator new. Finally, there is an /en passant/ correction to nsFrame.h: the declaration of ComputesOwnOverflowArea was in the middle of the initialization functions. I moved it down by the other box layout methods.
Attachment #391503 - Flags: superreview?(dbaron)
Attachment #391503 - Flags: review?(roc)
This is the purely mechanical change to fix up all the places that were calling AllocateFrame but should, after part 2a, call AllocateMisc. There aren't very many of them.
Attachment #391504 - Flags: review?(roc)
Now it starts to get a little messy. This patch arranges to have three methods exposed from every subclass of nsFrame: an operator new, GetAllocatedSize (which does exactly what it sounds like) and GetAllocatedIID (which returns the IID of the most-derived object). These are automatically declared by NS_DECL_QUERYFRAME and defined by NS_QUERYFRAME_HEAD, along with QueryFrame itself. Every nsFrame subclass must define its own version of these functions, so this patch also includes a dehydra module and annotations that enforce that. There's no machinery to actually *run* this check, but I did it manually for part 3b. In case you're wondering, the only reason I changed the QueryFrame implementation to a switch statement is so the compiler would catch duplicate entries.
Attachment #391507 - Flags: superreview?(dbaron)
Attachment #391507 - Flags: review?(roc)
This is the very large and almost entirely mechanical change required by part 3a; every single frame class must invoke NS_DECL_QUERYFRAME, NS_QUERYFRAME_HEAD, NS_QUERYFRAME_TAIL now. Something like a third of the 200-odd frame classes didn't already have it. I am not sure whether it is actually appropriate to make all inheritance from nsQueryFrame be virtual. It might be better instead to say that frame interfaces (things like nsIFrameDebug, nsIStatefulFrame, nsIScrollableViewProvider) can be QueryFrame'd *to* but do not themselves provide a QueryFrame method.
Attachment #391508 - Flags: superreview?(dbaron)
Attachment #391508 - Flags: review?(roc)
Here finally is the substance of the change, after all the groundwork. nsPresArena now maintains separate free lists for each (size, FrameIID) pair. The miscellaneous stuff all goes under a dummy FrameIID (0xFFFF), so it's only segregated by size, as before. This patch DOES NOT WORK. The browser consistently goes into infinite (or possibly just very lengthy) loops inside pldhash.c if you load any page of significant complexity -- I've been testing with Tinderbox. gdb tells me that there are order of 20,000 entries in the hash table of freelists when this happens, which is ridiculous; there should only be order of 200. I assume that I have done something wrong in my use of nsTHashtable, but I cannot find it.
Attachment #391510 - Flags: review?(roc)
Attachment #391510 - Attachment description: series 3 patch 4/4: separate frame freelists by IID → series 3 patch 4/4: separate frame freelists by IID (broken)
Incidentally, I have submitted the complete patch series to the try server with the #define DEBUG_TRACEMALLOC_FRAMEARENA uncommented in nsPresArena.cpp, which should test everything except the part with the buggy hash table. Look for it under build identifier "framepoison-no-arena".
cc:ing brendan to request help with the hash table problem in patch 4/4.
Why are you assigning to keyHash? Don't do that, you are clobbering a critical bit of information. /be
(In reply to comment #49) > Why are you assigning to keyHash? Don't do that, you are clobbering a critical > bit of information. You mean in the FreeList constructor? What am I supposed to do with the aKey argument to the constructor, then?
You should probably use nsDataHashtable and nsPtrHashKey rather than trying to use nsTHashtable directly. (Though a direct answer to your question would be "store it in an additional member variable".)
This one needed rediffing against latest m-c for no apparent reason.
Attachment #391508 - Attachment is obsolete: true
Attachment #391733 - Flags: superreview?(dbaron)
Attachment #391733 - Flags: review?(roc)
Attachment #391508 - Flags: superreview?(dbaron)
Attachment #391508 - Flags: review?(roc)
This works now; I ran it against tinderbox and several very dynamic websites with no trouble. Thanks, Brendan, your advice was spot on. Try server build is running,
Attachment #391510 - Attachment is obsolete: true
Attachment #391735 - Flags: review?(roc)
Attachment #391510 - Flags: review?(roc)
Try server unit test logs: OSX: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1248995607.1249003091.18922.gz Linux: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1248995607.1249002085.7977.gz Win32: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1248995607.1249002086.7980.gz In all three builds there is one failure, in mochitest-plain/layout/generic/test/test_bug468167.html. I will investigate this tomorrow. Talos is green.
Comment on attachment 391735 [details] [diff] [review] series 3 patch 4/4: separate frame freelists by IID I still don't think you have a good reason to write 20 lines of hand-rolled hashtable boilerplate code when you could just write: nsDataHashtable<nsUint32HashKey, nsTArray<void*> > and move your FreeList::Index method somewhere else and be done with it.
Attachment #391735 - Flags: review-
(That said, a sensible place to put that Index method might be to make a key class derived from nsUInt32HashKey and make that Index method its constructor.)
(In reply to comment #55) > (From update of attachment 391735 [details] [diff] [review]) > I still don't think you have a good reason to write 20 lines of hand-rolled > hashtable boilerplate code when you could just write: > > nsDataHashtable<nsUint32HashKey, nsTArray<void*> > Does that do individual mallocs for each hash table entry's storage? Being sure to avoid that was the only reason I wrote the boilerplate.
On trying to convert to nsDataHashtable, I am now convinced it is inappropriate for this usage. Specifically, in nsPresArena::Free, with the "raw" nsTHashtable API, we can do FreeList* list = mFreeLists.PutEntry(FreeList::Index(aSize, aCode)); and be done: we get the existing list back if there was already one, a new one if there wasn't, and NULL only on allocation failure (and with infallible malloc I hope I can stop worrying about that soon). With the nsBaseHashtable API I am stuck doing this horrid thing instead: FreeListKey listKey(aSize, aCode); nsTArray<void*>* list = NULL; if (!mFreeLists.Get(key, &list)) { if (mFreeLists.Put(key, nsTArray<void*>())) { mFreeLists.Get(key, &list)) } } That's three hash table lookups for something that can be done in one with the lower-level API, much less readable and I'm not even sure it works correctly. I think 20 lines of boilerplate PLDHashEntryHdr subclass is an acceptable price for not having to do that.
Comment 58 cites a problem that I intentionally solved with jsdhash.h all those years ago. It makes for a low-level API with sharp edges, but the savings can be important. nsBaseHashtable removes the edge, sparing some users at the price of the needed precision for certain surgical operations. Anyway, I agree with zwol in this case: 20 lines of subclassing is ok. One must read the *dhash.h comments, of course! /* * Table entry header structure. * ... The stored keyHash value is table size invariant, and it is * maintained automatically by JS_DHashTableOperate -- users should never set * it, and its only uses should be via the entry macros below. * ... */ ;-) /be
I think we should add a second get method to nsBaseHashtable: /** * retrieve memory location at which the value for a key is stored. * @param aKey the key to retreive * @param pData location of data associated with this key will be * stored in this pointer. This pointer is only valid until the * next mutation of the table. * @return PR_TRUE if the key exists. If key does not exist, pData is not * modified. */ PRBool GetMutable(KeyType aKey, UserDataType** pData NS_OUTPARAM) const { EntryType* ent = GetEntry(aKey); if (!ent) return PR_FALSE; *pData = &ent->mData; return PR_TRUE; } You're not the first person who's needed this, and it's well past time to add it.
Er, actually, that's not quite right, since we want to add the entry if it doesn't exist yet, which makes it a bit more complicated. So I suppose this is ok, but we really ought to have that method on nsBaseHashtable...
+#define NS_DECL_QUERYFRAME_IF(class) \ Can we call this NS_DECL_QUERYFRAME_INTERFACE? IF might mean "if" or something + * The NS_MUST_OVERRIDE user attribute marks members that all + * immediate subclasses must override. It's probably most useful for + * virtual member functions, but this analysis doesn't care whether + * the must-override member is a function or data, nor whether it is + * ordinary, static, virtual, or pure virtual. A subclass override can + * itself be NS_MUST_OVERRIDE, in which case its own subclasses must + * override as well. Basically, think of this as allowing you to + * define a function but treat it as pure virtual from a subclass's + * point of view. This last sentence is a little confusing. NS_MUST_OVERRIDE means that every subclass must override the method. But unlike pure virtual functions, it means nothing if there are no subclasses. It would be helpful if you could call out this distinction. Also it would be a good idea to give the detailed documentation for NS_MUST_OVERRIDE in nscore.h, not in the analysis code.
+class nsIPercentHeightObserver : virtual public nsQueryFrame Ugh. No virtual base classes, please. Why are you adding this?
(In reply to comment #63) > +class nsIPercentHeightObserver : virtual public nsQueryFrame > > Ugh. No virtual base classes, please. Why are you adding this? From comment #45: > I am not sure whether it is actually appropriate to make all inheritance from > nsQueryFrame be virtual. It might be better instead to say that frame > interfaces (things like nsIFrameDebug, nsIStatefulFrame, > nsIScrollableViewProvider) can be QueryFrame'd *to* but do not themselves > provide a QueryFrame method. But it's one or the other, because if the nsI* things inherit from nsQueryFrame, then nsQueryFrame becomes an ambiguous base.
(In reply to comment #62) > +#define NS_DECL_QUERYFRAME_IF(class) \ > > Can we call this NS_DECL_QUERYFRAME_INTERFACE? IF might mean "if" or something Sure. > + * The NS_MUST_OVERRIDE user attribute marks members that all > + * immediate subclasses must override. It's probably most useful for > + * virtual member functions, but this analysis doesn't care whether > + * the must-override member is a function or data, nor whether it is > + * ordinary, static, virtual, or pure virtual. A subclass override can > + * itself be NS_MUST_OVERRIDE, in which case its own subclasses must > + * override as well. Basically, think of this as allowing you to > + * define a function but treat it as pure virtual from a subclass's > + * point of view. > > This last sentence is a little confusing. NS_MUST_OVERRIDE means that every > subclass must override the method. But unlike pure virtual functions, it means > nothing if there are no subclasses. It would be helpful if you could call out > this distinction. Ok, will clarify. > Also it would be a good idea to give the detailed > documentation for NS_MUST_OVERRIDE in nscore.h, not in the analysis code. The other nscore.h annotations don't seem to do that, but sure.
Sorry I missed this earlier. (In reply to comment #45) > I am not sure whether it is actually appropriate to make all inheritance from > nsQueryFrame be virtual. It might be better instead to say that frame > interfaces (things like nsIFrameDebug, nsIStatefulFrame, > nsIScrollableViewProvider) can be QueryFrame'd *to* but do not themselves > provide a QueryFrame method. I think only being able to queryframe to those interfaces would be OK. However, I don't think there's a problem with nsQueryFrame being an ambiguous base class. nsISupports is an ambiguous base class, and that's fine.
(In reply to comment #66) > > However, I don't think there's a problem with nsQueryFrame being an ambiguous > base class. nsISupports is an ambiguous base class, and that's fine. I was getting compile errors.
You need to cast to one of the intermediate types in some cases. That's better than introducing virtual inheritance.
Yeah, it requires significantly more assembly to call a method on a virtually-derived class because you have to do two or more vtable dereferences to find the method.
Attachment #391503 - Flags: superreview?(dbaron) → superreview+
Comment on attachment 391503 [details] [diff] [review] series 3 patch 2a/4: introduce nsIPresShell::AllocateMisc sr=dbaron
Attachment #391733 - Flags: superreview?(dbaron) → superreview-
Comment on attachment 391733 [details] [diff] [review] 391508: series 3 patch 3b/4: mechanical changes to frame classes (rediffed) We don't want virtual base classes here. If you need help with the compilation errors, find me on IRC.
Attachment #391507 - Flags: superreview?(dbaron) → superreview+
Comment on attachment 391507 [details] [diff] [review] series 3 patch 3a/4: expose size and frameID of all frame objects sr=dbaron, although I think it's sort of ugly that the macro is called NS_QUERYFRAME_HEAD but really hides a bunch of complete functions and then the head of another. Not sure what to do about that, though. I'm finding the patch-separation here a little odd. It might be good to land in slightly more merged changesets such that each changeset you're landing actually compiles.
(In reply to comment #72) > (From update of attachment 391507 [details] [diff] [review]) > sr=dbaron, although I think it's sort of ugly that the macro is called > NS_QUERYFRAME_HEAD but really hides a bunch of complete functions and then the > head of another. Not sure what to do about that, though. I can't think of a better name for the one macro. We could have two macros, one for the complete function and one for the partial one, but that seems like it would actually be worse. > I'm finding the patch-separation here a little odd. It might be good to land > in slightly more merged changesets such that each changeset you're landing > actually compiles. Basically, 2a+2b should be treated as one patch, and 3a+3b should be treated as one patch. I split them so that the *interesting* (non-mechanical) changes would be all together in the 'a' part. Combining them for landing makes sense to me. > We don't want virtual base classes here. If you need help with the > compilation errors, find me on IRC. I seem to have missed you. There turns out to be only one class of compilation error: generic/nsGfxScrollFrame.cpp: In member function ‘nsresult nsGfxScrollFrameInner::CreateAnonymousContent(nsTArray<nsIContent*>&)’: generic/nsGfxScrollFrame.cpp:1626: error: ‘nsQueryFrame’ is an ambiguous base of ‘nsContainerFrame’ This triggers on *every use* of do_QueryFrame, because, in debug builds, we have this inheritance diagram: nsQueryFrame - nsIFrame - nsBox - nsFrame \- nsIFrameDebug --/ and the do_QueryFrame constructor takes an nsQueryFrame*. I can kludge around this by adding another constructor to class do_QueryFrame: #ifdef DEBUG // nsQueryFrame is an ambiguous base of nsFrame in DEBUG mode do_QueryFrame(nsFrame *s) : mRawPtr((nsQueryFrame*)(nsBox*)(s)) { } #endif but that's really ugly and I'm not even sure it's safe, since the declarations of the intermediate classes are not visible here (it won't let me static_cast, and I think a pointer adjustment is needed in the general case). Instead, I propose to do a separate patch - probably in a separate bug - that eliminates nsIFrameDebug, replacing it with #ifdef DEBUG blocks in nsIFrame and/or nsFrame. Patches 1, 2a and 2b here could go in now, but patches 3a, 3b, and 4 would need to wait for that. An alternative would be to move do_QueryFrame to nsFrame.h where we can safely express the above cast chain. Independent of what we do about the ambiguous base, what say we land 1+2a+2b ASAP? The existing patches apply cleanly to the current trunk and I'd be happy to provide a combined 2a+2b for landing.
Depends on: 510651
> This triggers on *every use* of do_QueryFrame, because, in debug builds, we > have this inheritance diagram: > > nsQueryFrame - nsIFrame - nsBox - nsFrame > \- nsIFrameDebug --/ > > and the do_QueryFrame constructor takes an nsQueryFrame*. I can kludge around > this by adding another constructor to class do_QueryFrame: I tend to think we should perhaps just make do_QueryFrame a template class, taking any T that has a QueryFrame method. (This is how we do CallQueryInterface, although not do_QueryInterface.) However, you should probably run this by Benjamin.
This is a slight revision to patch 3a, adding an NS_QUERYFRAME_ENTRY_CONDITIONAL macro that's now needed for nsTextControlFrame.
Attachment #391507 - Attachment is obsolete: true
Attachment #394918 - Flags: review?(roc)
Attachment #391507 - Flags: review?(roc)
Rediffed for latest trunk, virtual inheritance removed. Now requires the patch in bug 510651 (removal of nsIFrameDebug).
Attachment #391733 - Attachment is obsolete: true
Attachment #394919 - Flags: review?(roc)
Attachment #391733 - Flags: review?(roc)
Comment on attachment 394918 [details] [diff] [review] series 3.1 patch 3a/4: expose size and frameID of all frame objects carrying old sr+ forward
Attachment #394918 - Flags: superreview+
I missed some review comments in the last cycle; this version gets them all.
Attachment #394918 - Attachment is obsolete: true
Attachment #394947 - Flags: superreview+
Attachment #394947 - Flags: review?(roc)
Attachment #394918 - Flags: review?(roc)
more missed review comments.
Attachment #394948 - Flags: review?(roc)
Attachment #394919 - Attachment is obsolete: true
Attachment #394919 - Flags: review?(roc)
Attachment #394918 - Flags: superreview+
Identical to "series 3 patch 1/4" but annotated with reviewer.
Attachment #391502 - Attachment is obsolete: true
Attachment #394952 - Flags: review+
Combined version of "series 3 patch 2a/4" and "series 3 patch 2b/4", annotated with r and sr.
Attachment #394953 - Flags: superreview+
Attachment #394953 - Flags: review+
Attachment #391503 - Attachment is obsolete: true
Attachment #391504 - Attachment is obsolete: true
I'm requesting check-in to m-c at this time, of attachment 394952 [details] [diff] [review] ("patch 1 for commit") and attachment 394953 [details] [diff] [review] ("patch 2 for commit") ONLY. This does not resolve the bug, but does chop off part of the chain of patches that are being maintained here.
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth → [sg:nse] defense-in-depth [c-n]: see comment 82
I'm putting this in my tree for try-server shortly and then (presumably) checkin later.
One note on preparing patches for commit: If you add [defaults] qnew = -U to your ~/.hgrc, your mq patches will have the correct From: line in them, which will be helpful in the future.
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth [c-n]: see comment 82 → [sg:nse] defense-in-depth
Attachment #394947 - Flags: superreview?(mrbkap)
Attachment #394947 - Flags: superreview+
Attachment #394947 - Flags: review?(roc)
Attachment #394947 - Flags: review?(dbaron)
Comment on attachment 394947 [details] [diff] [review] series 3.2 patch 3a/4: expose size and frameID of all frame objects changing reviewers since roc is on vacation. this had sr=dbaron before.
Attachment #394948 - Flags: superreview?(mrbkap)
Attachment #394948 - Flags: review?(roc)
Attachment #394948 - Flags: review?(dbaron)
Comment on attachment 394948 [details] [diff] [review] series 3.2 patch 3b/4: mechanical changes to frame classes changing reviewers since roc is on vacation.
Comment on attachment 391735 [details] [diff] [review] series 3 patch 4/4: separate frame freelists by IID switching reviewers since roc is on vacation. clearing old r-dbaron, please reconsider the hash table code in view of the discussion above.
Attachment #391735 - Flags: superreview?(mrbkap)
Attachment #391735 - Flags: review?(roc)
Attachment #391735 - Flags: review?(dbaron)
Attachment #391735 - Flags: review-
Attachment #394952 - Attachment description: patch 1 for commit → patch 1 final [checkin: comment 85]
Attachment #394953 - Attachment description: patch 2 for commit → patch 2 final [checkin: comment 85]
Comment on attachment 394947 [details] [diff] [review] series 3.2 patch 3a/4: expose size and frameID of all frame objects It seems like this patch makes it so that you can QueryFrame *to* any concrete frame class. I'm not sure we want that feature; it has some performance overhead, since the checks we don't use are the ones that happen first. We've previously only used QueryFrame to a limited set of classes. That said, it would be nice if QueryFrame to a concrete class that didn't support QueryFrame didn't compile; I fear that that wasn't the case before this patch. I'm not quite sure yet how I think we should proceed here. >-#define NS_QUERYFRAME_TAIL_INHERITING(class) \ >- return class::QueryFrame(id); } > >-#define NS_QUERYFRAME_TAIL return nsnull; } >+#define NS_QUERYFRAME_TAIL(class) \ >+ default: break; \ >+ } \ >+ return class::QueryFrame(id); \ >+ } >+ >+// This variant of NS_QUERYFRAME_TAIL should only be used by nsFrame. >+#define NS_QUERYFRAME_TAIL_INHERITANCE_ROOT \ >+ default: break; \ >+ } \ >+ return nsnull; \ >+ } You should undo the renaming of NS_QUERYFRAME_TAIL to NS_QUERYFRAME_TAIL_INHERITING. This change makes things not match other patterns in our tree. There are some similar query-map macros where the "end" macro requires the name of the class itself to be repeated; your rename makes this macro sound like one of those. Having _INHERITING in the name makes it clear that the argument is the class it inherits from. (There might be some that need both names, but that's another story.) >diff --git a/xpcom/analysis/must-override.js b/xpcom/analysis/must-override.js I think it would probably be good if somebody who knows our static analysis code reviewed this (and the nscore.h changes). (e.g., bsmedberg, dmandelin)
(In reply to comment #89) > (From update of attachment 394947 [details] [diff] [review]) > It seems like this patch makes it so that you can QueryFrame *to* any concrete > frame class. I'm not sure we want that feature; it has some performance > overhead, since the checks we don't use are the ones that happen first. We've > previously only used QueryFrame to a limited set of classes. In addition to performance overhead: also code size overhead, and making a feature that we really would prefer people didn't use much more widely available.
(In reply to comment #89) > It seems like this patch makes it so that you can QueryFrame *to* any concrete > frame class. I'm not sure we want that feature; it has some performance > overhead, since the checks we don't use are the ones that happen first. We've > previously only used QueryFrame to a limited set of classes. Hmm, yeah, it seems to be mostly nsI things and a handful of SVG parent classes. Maybe this is a reason to split up the macros, after all ... how's this? NS_IMPL_FRAME_CLASS(classname) -- defines operator new, GetAllocatedSize, GetAllocatedIID -- must be used by every nsFrame subclass NS_QUERYFRAME_HEAD(classname) NS_QUERYFRAME_ENTRY(nsIFoo) NS_QUERYFRAME_ENTRY(nsIBar) NS_QUERYFRAME_TAIL[_INHERITING(superclass)] -- just defines QueryFrame() -- can be inherited I probably won't get to that revision till tomorrow though. Incidentally, I see a whole bunch of queryframes to nsIFrame. Should that be allowed? nsIFrame is the first parent of every frame class, so static_cast or even simple conversion on assignment should be fine. (It'll be even sillier if we ever get around to folding nsIFrame/nsBox/nsFrame together.)
And, furthermore, I don't think we even need the code size overhead of having the QueryFrame method on all frame classes. We just need the other (new) methods.
Er, sorry, managed not to see this comment yesterday. (In reply to comment #91) > Hmm, yeah, it seems to be mostly nsI things and a handful of SVG parent > classes. Maybe this is a reason to split up the macros, after all ... how's > this? Also a few concrete classes like nsBlockFrame. > NS_IMPL_FRAME_CLASS(classname) > -- defines operator new, GetAllocatedSize, GetAllocatedIID > -- must be used by every nsFrame subclass > > NS_QUERYFRAME_HEAD(classname) > NS_QUERYFRAME_ENTRY(nsIFoo) > NS_QUERYFRAME_ENTRY(nsIBar) > NS_QUERYFRAME_TAIL[_INHERITING(superclass)] > -- just defines QueryFrame() > -- can be inherited This seems good. (I did like your renaming of the non-INHERITING version to TAIL_INHERITANCE_ROOT, though.) The one thing I worry about is making do_QueryFrame not compile if you're trying to query to something that doesn't implement QueryFrame. Maybe if we have one macro that declares the above methods plus queryframe plus a good typedef that do_QueryFrame refers to, and the macro that declares the above methods without QueryFrame makes the typedef something that won't compile? > Incidentally, I see a whole bunch of queryframes to nsIFrame. Should that be > allowed? nsIFrame is the first parent of every frame class, so static_cast or > even simple conversion on assignment should be fine. (It'll be even sillier if > we ever get around to folding nsIFrame/nsBox/nsFrame together.) We do use it for QueryFrame from an interface that doesn't inherit from nsIFrame (e.g., nsIPageSequenceFrame, nsIObjectFrame).
Attached patch series 4 patch 3a/4: queryframe macro tweaks (obsolete) (deleted) — Splinter Review
Here's a revision along the lines described in comment 91. There are now four sub-patches, two of which are purely mechanical. This piece: rationalize the queryframe-implementation macro naming scheme, restructure the implementation to detect duplicate entries, and add one missing frame ID to the enumeration. Needs the next piece to compile.
Attachment #394947 - Attachment is obsolete: true
Attachment #394948 - Attachment is obsolete: true
Attachment #396031 - Flags: superreview?(mrbkap)
Attachment #396031 - Flags: review?(dbaron)
Attachment #394947 - Flags: superreview?(mrbkap)
Attachment #394947 - Flags: review?(dbaron)
Attachment #394948 - Flags: superreview?(mrbkap)
Attachment #394948 - Flags: review?(dbaron)
Mechanical update of all queryframe macro users.
Attachment #396032 - Flags: superreview?(mrbkap)
Attachment #396032 - Flags: review?(dbaron)
Attached patch series 4 patch 3c/4: expose frame identity (obsolete) (deleted) — Splinter Review
This defines new macros, NS_DECL_FRAMEARENA_HELPERS and NS_IMPL_FRAMEARENA_HELPERS(classname), that every nsFrame subclass must use, and reworks the frame allocation/deallocation mechanism assuming they are in place. It also includes the static analysis script to ensure that they are so used. This compiles by itself (as long as you don't activate the static analysis) but will waste memory without part 3d, because absolutely everything will go on the nsFrame-sized freelist, but only nsFrame objects will be allocated from that freelist. The macros use the nsQueryFrame::classname##_id constants directly, so they don't stomp on the mechanism for ensuring that QueryFrame to a class that doesn't declare itself a queryframe target will fail at compile time.
Attachment #396033 - Flags: superreview?(mrbkap)
Attachment #396033 - Flags: review?(dbaron)
And this is the long, boring, touches-hundreds-of-files mechanical update to make all frame classes declare the frame arena methods.
Attachment #396034 - Flags: superreview?(mrbkap)
Attachment #396034 - Flags: review?(dbaron)
Attached patch series 3 patch 3 consolidated (obsolete) (deleted) — Splinter Review
For convenience, here is all of patch 3a-d rolled into one. This is probably how it should be checked in.
Comment on attachment 396031 [details] [diff] [review] series 4 patch 3a/4: queryframe macro tweaks Patches 3a and 3b: I'd say leave NS_DECL_QUERYFRAME's name alone rather than changing it to NS_DECL_QUERYFRAME_SOURCE; it declares a function called QueryFrame, so NS_DECL_QUERYFRAME makes sense. (This is easily fixable by unapplying the patches, search-replacing them, reapplying them, and refreshing them (to clean up the places where the search-replace introduces an identical +/- pair), assuming you're using mq.) + nsXULLabelFrame_id, This was actually renamed from nsAreaFrame; you should also remove nsAreaFrame_id. (nsAreaFrame used to do more things, but it gradually did less until the only thing it was needed for was XUL labels.) r=dbaron with that
Attachment #396031 - Flags: superreview?(roc)
Attachment #396031 - Flags: superreview?(mrbkap)
Attachment #396031 - Flags: review?(dbaron)
Attachment #396031 - Flags: review+
Attachment #396032 - Flags: superreview?(mrbkap)
Attachment #396032 - Flags: review?(dbaron)
Attachment #396032 - Flags: review+
Attachment #396033 - Flags: superreview?(roc)
Attachment #396033 - Flags: superreview?(mrbkap)
Attachment #396033 - Flags: review?(dbaron)
Attachment #396033 - Flags: review+
Comment on attachment 396033 [details] [diff] [review] series 4 patch 3c/4: expose frame identity Patch 3c and 3d: Sorry this just occurred to me (rather than earlier): it seems like it might be better if GetAllocatedSize and GetAllocatedIID were a single function returning a struct; then you'd only pay the cost of calling (and implementing) a single virtual function. (Though the ABI for returning small structs tends not to be very nice.) I think changing that would be relatively simple, since it requires changing only the macros and the one callsite. But I won't insist on it if you prefer it as it is now. + nsMathMLmencloseFrame_id, Might want to put this in a patch other than 3d. (Actually, maybe the same for Area -> XULLabel. Maybe its own patch? Doesn't really matter, though.) r=dbaron It seems like 3a+3b is independent from 3c+3d. Or are they interlinked somehow I'm not seeing?
Attachment #396034 - Flags: superreview?(mrbkap)
Attachment #396034 - Flags: review?(dbaron)
Attachment #396034 - Flags: review+
Comment on attachment 391735 [details] [diff] [review] series 3 patch 4/4: separate frame freelists by IID >+#ifndef DEBUG_TRACEMALLOC_PRESARENA > >+#include "nsTArray.h" >+#include "nsTHashtable.h" On principle, it's better to keep universally-available #includes outside of ifdefs rather than inside them. (Having #includes #ifdef-ed on things other than their presence leads to unneeded compilation errors when people developing on a system where the #ifdef is true write code *outside* the ifdef that depends on the header.) > void > PresShell::FreeMisc(size_t aSize, void* aPtr) > { >- mFrameArena.Free(aSize, aPtr); >+ if (!PRESARENA_AUTOFREE_ON_DESTROY || !mIsDestroying) >+ mFrameArena.Free(aSize, 0xFFFF, aPtr); > } > > void* > PresShell::AllocateMisc(size_t aSize) > { >- return mFrameArena.Allocate(aSize); >+ return mFrameArena.Allocate(aSize, 0xFFFF); > } Rather than using 0xFFFF, could you add a value to nsQueryFrame::FrameIID enum called something like not_a_frame_id, to make it clear that this value is part of the frame IID space and you're depending on it not intersecting frame IIDs? >+ typedef PRUword KeyType; Use PRUint32 >+ static KeyTypePointer KeyToPointer(KeyType aKey) >+ { return (KeyTypePointer)aKey; } Use NS_INT32_TO_PTR here rather than a cast. + static PLDHashNumber HashKey(KeyTypePointer aKey) + { return (PLDHashNumber)(KeyType)aKey; } And NS_PTR_TO_INT32 here. >-#ifdef DEBUG >- , mAllocCount(0) >-#endif Would you mind keeping mAllocCount, the tracking of it, and the assertion in the destructor? That assertion has caught a whole bunch of bugs. + shell->FreeFrame(sz, id, (void*)this); May as well drop the silly (void*) cast while you're there. r=dbaron with that
Attachment #391735 - Flags: superreview?(roc)
Attachment #391735 - Flags: superreview?(mrbkap)
Attachment #391735 - Flags: review?(dbaron)
Attachment #391735 - Flags: review+
Attachment #396031 - Flags: superreview?(roc) → superreview+
Comment on attachment 391735 [details] [diff] [review] series 3 patch 4/4: separate frame freelists by IID What if we got rid of the aSize argument to Free, and just used aCode, passing the size for aCode for the non-frame allocations? Seems that would simplify things, we wouldn't need GetAllocatedSize.
Comment on attachment 396033 [details] [diff] [review] series 4 patch 3c/4: expose frame identity Looks good except for the issue that I hope GetAllocatedSize is not needed.
Attachment #396033 - Flags: superreview?(roc) → superreview+
(In reply to comment #99) > (From update of attachment 396031 [details] [diff] [review]) > Patches 3a and 3b: > > I'd say leave NS_DECL_QUERYFRAME's name alone rather than changing it to > NS_DECL_QUERYFRAME_SOURCE; it declares a function called QueryFrame, so > NS_DECL_QUERYFRAME makes sense. Ok. > + nsXULLabelFrame_id, > > This was actually renamed from nsAreaFrame; you should also remove > nsAreaFrame_id. (nsAreaFrame used to do more things, but it gradually > did less until the only thing it was needed for was XUL labels.) Will do. > It seems like 3a+3b is independent from 3c+3d. Or are they interlinked > somehow I'm not seeing? They are independent, and in fact I was thinking of filing a separate bug for 3a+3b. Or we could just land it as [3a+3b] [3c+3d] from this bug.
(In reply to comment #100) > (From update of attachment 396033 [details] [diff] [review]) > Patch 3c and 3d: > > Sorry this just occurred to me (rather than earlier): it seems like it > might be better if GetAllocatedSize and GetAllocatedIID were a single > function returning a struct; then you'd only pay the cost of calling > (and implementing) a single virtual function. (Though the ABI for > returning small structs tends not to be very nice.) I did it this way precisely because the ABI for returning small structs tends to not be very nice. But I agree that it would be good to get rid of the extra virtual call. I'm going to implement roc's suggestion of just using the GetAllocatedIID value to index the frame lists (conveniently, this will also simplify the hash table a bunch), plus some defensive code to ensure that a frame ID always corresponds to exactly one size. There's going to have to be some trickiness so that AllocateMisc() doesn't break, since it does not provide an ID to the arena code, but I have a plan. > + nsMathMLmencloseFrame_id, > > Might want to put this in a patch other than 3d. (Actually, maybe the > same for Area -> XULLabel. Maybe its own patch? Doesn't really matter, > though.) I moved this to 3a, since 3a is already touching the enumeration for Area -> XULLabel.
(In reply to comment #104) > They are independent, and in fact I was thinking of filing a separate bug for > 3a+3b. Or we could just land it as [3a+3b] [3c+3d] from this bug. That sounds fine.
Attached patch series 4.1 patch 3a/4: queryframe macro tweaks (obsolete) (deleted) — Splinter Review
Here's dbaron's requested changes to patch 3a ...
Attachment #396031 - Attachment is obsolete: true
Attachment #396606 - Flags: superreview+
Attachment #396606 - Flags: review+
... and 3b. Will post consolidated patch 3[a+b] for checkin next.
Attachment #396032 - Attachment is obsolete: true
Attachment #396035 - Attachment is obsolete: true
Attachment #396607 - Flags: review+
Attached patch patch 3a+3b final (obsolete) (deleted) — Splinter Review
Combined patches 3a and 3b. This can land now, I hope.
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth → [sg:nse] defense-in-depth [c-n: attachment 396608 only]
Someone landed attachment 396608 [details] [diff] [review] (thanks!) but didn't update the bug (argh!) http://hg.mozilla.org/mozilla-central/rev/e2927bb26412
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth [c-n: attachment 396608 only] → [sg:nse] defense-in-depth
Attachment #396608 - Attachment description: patch 3a+3b for checkin → patch 3a+3b final [Checkin: comment 110]
Attachment #396606 - Attachment is obsolete: true
Attachment #396607 - Attachment is obsolete: true
Attachment #396608 - Flags: superreview+
Attachment #396608 - Flags: review+
I usually wait for Tinderboxes to cycle green before updating the bugs.
Depends on: 512726
I've split the new static analysis module to bug 512726 as requested in comment 89.
Attached patch bustage fix for part 3a+b (obsolete) (deleted) — Splinter Review
I guess I get to eat my hat. (Conveniently, my hat is in San Diego.) I'm attaching an additional bustage-fix patch rather than regenerating the "3a+3b" patch to make it easier to see what the problem was. Basically, one lousy frame subclass had a QueryFrame definition with some custom code in it that didn't work with the new QueryFrame structure. I actually had caught this the last time through, which is why there was a NS_QUERYFRAME_ENTRY_CONDITIONAL added in the first place, but when I rev'd part 3b I forgot it. I would love suggestions for a way to make the compiler catch this. I can't think of anything.
Attachment #396914 - Flags: review?(roc)
Comment on attachment 396914 [details] [diff] [review] bustage fix for part 3a+b OK, let's reland this
Attachment #396914 - Flags: review?(roc) → review+
Attachment #396608 - Attachment description: patch 3a+3b final [Checkin: comment 110] → patch 3a+3b final
Attached patch series 5 patch 3c/4: expose frame identity (obsolete) (deleted) — Splinter Review
Changes as requested by dbaron, notably to create only one additional function that has to be called from nsFrame::Destroy(). Note that as of *this* patch, it returns the object's size, not its ID - this is so that 3c+3d works without 4.
Attachment #396033 - Attachment is obsolete: true
Attachment #396034 - Attachment is obsolete: true
Attachment #398557 - Flags: superreview+
Attachment #398557 - Flags: review?(roc)
Rediffed.
Attachment #398558 - Flags: review?(roc)
Attached patch series 5 patch 4/4: free lists by ID (obsolete) (deleted) — Splinter Review
And here's the thing that actually does the work, finally. Requesting re-review due to substantial changes since the last iteration. Alas, I had to change nsIPresShell again. Try server is running.
Attachment #398559 - Flags: superreview?(dbaron)
Attachment #398559 - Flags: review?(roc)
After this last patch, should we get rid of GetAllocatedSize?
I did get rid of GetAllocatedSize.
All builds failed; unfortunately, the builds do not appear on the waterfall (presumably due to bug 514615) so I can't find the logs, and the emails don't give me enough information to tell what went wrong. I'm betting it's patch application failure due to hg base revision skew, though. I will not retry the patchset until the backlog from bug 514615 clears.
For what it's worth, though, a local release-mode build from the patched tree has no trouble at all with tinderboxpushlog, which should be a good smoke test for this stuff.
You have a Mozilla LDAP account so you should be able to push to the try servers: hg push -f ssh://hg.mozilla.org/try
No, LDAP gets you access to the web interface, but not to the push interface (that requires commit access to m-c) -- it's a long story, one best not retold here, but if your patch adds files then you will find that the web interface does not suffice, and you must rely on the kindness of colleagues.
Comment on attachment 398559 [details] [diff] [review] series 5 patch 4/4: free lists by ID > <html><head> > <meta http-equiv="content-type" content="text/html; charset=UTF-8"></head><body><pre> Not sure what's adding this HTML to the beginning of your patches, but probably worth fixing. nsPresArena.cpp: >+// We know implicitly that the values we hash fit in 32 bits (see >+// below) and so we do not bother actually hashing them. However, >+// we must preserve our own copy of the original key, since the hash >+// library messes with the value returned from HashKey. Please don't represent your misunderstanding of the internals of pldhash / nsTHashtable as a bug in its API. You need to maintain the key because the nsTHashtable API requires that the user implement KeyEquals using its own data. (No need to discuss that you found something in the internals called keyHash and then discovered that it wasn't what you thought it was.) + // Try to recycle this entry. + FreeList* list = mFreeLists.PutEntry(aCode); + NS_ABORT_IF_FALSE(list, "no free list for pres arena object"); Shouldn't this be GetEntry rather than PutEntry, especially now that you're relying on the size being stored for filling the correct amount with the poison pattern? + poison = (~PRUword(0x0FFFFF00) | PRUword(0x0DEA700)); Would it be clearer to just write 0xF0DEA7FF ? + return mState->Allocate(PRUint32(aSize)|nsQueryFrame::NON_FRAME_MARKER, + aSize); Might it help to cast both sides of the | to PRUint32? (same in FreeBySize) (AllocateBySize) - NS_ABORT_IF_FALSE(aSize > 0, "PresArena cannot allocate zero bytes"); (FreeBySize) - NS_ABORT_IF_FALSE(aSize > 0, "PresArena cannot free zero bytes"); Maybe keep these? nsPresArena.h: +// The debugging version of nsPresArena does not free all the memory it +// allocated when the arena itself is destroyed. +#ifdef DEBUG_TRACEMALLOC_PRESARENA +#define PRESARENA_AUTOFREE_ON_DESTROY 0 +#else +#define PRESARENA_AUTOFREE_ON_DESTROY 1 +#endif I find the name of this macro confusing. Maybe call it PRESARENA_FREE_DURING_DESTROY, make the values PR_TRUE and PR_FALSE, and remove the ! from where it's used? That said, I'm not convinced we should have this at all; I think it might weaken some of the security protection we get from this. nsQueryFrame.h: + // The PresArena implementation uses this as a flag to distinguish s/uses this as a flag/uses this bit/ sr=dbaron with those things fixed
Attachment #398559 - Flags: superreview?(dbaron) → superreview+
(In reply to comment #127) > (From update of attachment 398559 [details] [diff] [review]) > > <html><head> > > <meta http-equiv="content-type" content="text/html; charset=UTF-8"></head><body><pre> > > Not sure what's adding this HTML to the beginning of your patches, but probably > worth fixing. This sometimes happens when I download patches from bugzilla to move them between computers. Don't know why. Will make sure it's not in the next set of patches. > nsPresArena.cpp: > > >+// We know implicitly that the values we hash fit in 32 bits (see > >+// below) and so we do not bother actually hashing them. However, > >+// we must preserve our own copy of the original key, since the hash > >+// library messes with the value returned from HashKey. > > Please don't represent your misunderstanding of the internals of pldhash > / nsTHashtable as a bug in its API. You need to maintain the key > because the nsTHashtable API requires that the user implement KeyEquals > using its own data. (No need to discuss that you found something in the > internals called keyHash and then discovered that it wasn't what you > thought it was.) ok. > + // Try to recycle this entry. > + FreeList* list = mFreeLists.PutEntry(aCode); > + NS_ABORT_IF_FALSE(list, "no free list for pres arena object"); > > Shouldn't this be GetEntry rather than PutEntry, especially now that > you're relying on the size being stored for filling the correct amount > with the poison pattern? Is that the Free method? If so, yeah, that's an editing mistake (it used to create the entry on free, but then I had to move it to alloc to record the size). > + poison = (~PRUword(0x0FFFFF00) | PRUword(0x0DEA700)); > > Would it be clearer to just write 0xF0DEA7FF ? That would give the wrong value on 64-bit systems other than amd64. > + return mState->Allocate(PRUint32(aSize)|nsQueryFrame::NON_FRAME_MARKER, > + aSize); > > Might it help to cast both sides of the | to PRUint32? Help what, exactly? > (AllocateBySize) > - NS_ABORT_IF_FALSE(aSize > 0, "PresArena cannot allocate zero bytes"); > (FreeBySize) > - NS_ABORT_IF_FALSE(aSize > 0, "PresArena cannot free zero bytes"); > > Maybe keep these? They moved to the Alloc/Free methods that actually have this limitation (the non-DEBUG_TRACEMALLOC_FRAMEARENA versions). > I find the name of this macro confusing. Maybe call it > PRESARENA_FREE_DURING_DESTROY, make the values PR_TRUE and PR_FALSE, and > remove the ! from where it's used? That rename seems just as confusing to me. PRESARENA_MUST_FREE_DURING_DESTROY? > That said, I'm not convinced we should have this at all; I think it > might weaken some of the security protection we get from this. roc was concerned that the overall change would be an unacceptable performance hit without. I suggest we land it with this hack, and then file a new bug in which we disable it, land that and see what happens to Tp. > + // The PresArena implementation uses this as a flag to distinguish > > s/uses this as a flag/uses this bit/ ok
(In reply to comment #128) > Is that the Free method? If so, yeah, that's an editing mistake (it used to > create the entry on free, but then I had to move it to alloc to record the > size). Yes. > > + poison = (~PRUword(0x0FFFFF00) | PRUword(0x0DEA700)); > > > > Would it be clearer to just write 0xF0DEA7FF ? > > That would give the wrong value on 64-bit systems other than amd64. Ah, oops. > > + return mState->Allocate(PRUint32(aSize)|nsQueryFrame::NON_FRAME_MARKER, > > + aSize); > > > > Might it help to cast both sides of the | to PRUint32? > > Help what, exactly? It's not obvious to me whether (PRUint32 | enum), where enum can be pretty much any integral type that holds [ 0, 2^31 ), always yields what one would expect; i.e., that the conversions involved don't involve any type conversions that might involve sign-extension. Though maybe it's ok since the *values* involved won't be changed by sign-extension; however, a cast might make it clearer that I don't have to worry. > > I find the name of this macro confusing. Maybe call it > > PRESARENA_FREE_DURING_DESTROY, make the values PR_TRUE and PR_FALSE, and > > remove the ! from where it's used? > > That rename seems just as confusing to me. PRESARENA_MUST_FREE_DURING_DESTROY? Sure. > roc was concerned that the overall change would be an unacceptable performance > hit without. I suggest we land it with this hack, and then file a new bug in > which we disable it, land that and see what happens to Tp. ok
Comment on attachment 398559 [details] [diff] [review] series 5 patch 4/4: free lists by ID >+#if defined __x86_64 Also, I think the ifdef you probably want here is: #if defined(__x86_64__) || defined(_M_AMD64) Since: * I seem to think the gcc macros with __ both at the start and end are preferred since only a small number of them have the version without trailing __ (although maybe you'd know a reason otherwise) * the latter part gets x86_64 Windows too
One more thing we should do (though it can be a followup patch): When we're reallocating from the free lists, we should NS_ABORT_IF_FALSE that the memory is still filled with the poison pattern.
You mean if the memory is not still filled with the poison pattern? That sounds like it would be a performance hit. Zack, can we get final patches here and get this stuff checked in?
This is what I hope will be the true final version of the big mechanical update to all frame classes. I merged 3a, 3b, bustage fix, 3c, and 3d together and patched a few glitches. A try server run earlier today blew up due to what *appears* to be unrelated fallout (from mrbkap's landing bug 509557). I'm going to start another try server run late tonight after tinderbox is well and truly green, and may try to herd the patch in over the weekend if that goes well.
Attachment #396608 - Attachment is obsolete: true
Attachment #396914 - Attachment is obsolete: true
Attachment #398557 - Attachment is obsolete: true
Attachment #398558 - Attachment is obsolete: true
Attachment #398559 - Attachment is obsolete: true
Attachment #400210 - Flags: superreview+
Attachment #400210 - Flags: review+
Attached patch patch 4 final (obsolete) (deleted) — Splinter Review
And here's the revisions to patch 4 requested by dbaron. As with part 3, I am not sure of its correctness due to the tree being busted for most of today. Will throw it at the try server yet again once tinderbox goes green this weekend. In any case I think patch 3 should be given a chance to bake on trunk for at least 24 hours before this lands.
Attachment #391735 - Attachment is obsolete: true
Attachment #400211 - Flags: superreview+
Attachment #400211 - Flags: review+
Attachment #400210 - Attachment description: series 5 patch 3 final (for sure this time) → patch 3 final (for sure this time)
Attachment #400210 - Attachment description: patch 3 final (for sure this time) → patch 3 final
Requesting checkin at this time of "patch 3 final" (attachment 400210 [details] [diff] [review]) ONLY.
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth → [sg:nse] defense-in-depth [c-n: "patch 3 final" only]
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth [c-n: "patch 3 final" only] → [sg:nse] defense-in-depth
Attachment #400210 - Attachment description: patch 3 final → patch 3 final [checkin: comment 137]
Comment on attachment 400210 [details] [diff] [review] patch 3 final [checkin: comment 137] Tree cycled green. I'm gonna leave this for at least 24 hours before requesting landing of part 4, though.
Incidentally, > When we're reallocating from the free lists, we should NS_ABORT_IF_FALSE that > the memory is still filled with the poison pattern. I did this under #ifdef DEBUG in the current revision of part 4. Yeah, it'll be a performance hit, but I think it should be okay in debug builds, and it may help narrow down some of these use-after-free problems we have.
bz landed part 4: http://hg.mozilla.org/mozilla-central/rev/a3f33def2dca and I backed it out per zwol's request due to leak test failures. backout merge: http://hg.mozilla.org/mozilla-central/rev/38f9517ffc7a
Rediffed + fixed an inverted condition in an NS_ABORT_IF_FALSE that was causing yesterday's leak test failures.
Attachment #400211 - Attachment is obsolete: true
Attachment #400857 - Flags: superreview+
Attachment #400857 - Flags: review+
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth → [sg:nse] defense-in-depth [c-n: attachment 400857]
Resolving. Patch for 1.9.2 branch shortly.
Group: core-security
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
... this should probably still be security sensitive. sorry about that.
Group: core-security
Attached patch complete patch for 1.9.2 branch (deleted) — Splinter Review
Attachment #402221 - Flags: superreview+
Attachment #402221 - Flags: review+
Attachment #402221 - Flags: approval1.9.2?
Flags: blocking1.9.2?
Have you pushed the 1.9.2 patch to try server? (You can push a current 1.9.2 tree to try server with the patch committed (or in an mq queue) on top of it.)
Keywords: checkin-needed
Whiteboard: [sg:nse] defense-in-depth [c-n: attachment 400857] → [sg:nse] defense-in-depth
(In reply to comment #147) > Have you pushed the 1.9.2 patch to try server? (You can push a current 1.9.2 > tree to try server with the patch committed (or in an mq queue) on top of it.) I won't be able to do that till tomorrow morning, but I'll do it then.
Why can't you do it now? You don't need to watch the try servers, just push and forget and check the results tomorrow.
Because I always fail to find try server results if they're more than eight or so hours old.
Can you just push it? I'll find the results if necessary.
I pushed it as: changeset: 31872:d6a39f25ca2c tag: qtip tag: tip tag: qbase tag: fp-192-rollup.diff user: Zack Weinberg <zweinberg@mozilla.com> date: Tue Sep 22 17:34:15 2009 -0700 summary: Bug 497495 (frame poisoning). r=roc sr=dbaron on top of: changeset: 31871:7b4f35e3a442 tag: qparent user: Marco Bonardo <mbonardo@mozilla.com> date: Tue Sep 22 23:26:43 2009 +0200 summary: Bug 512854 - VACUUM places.sqlite database on daily idle but not more than once a month, r=sdwilsh a=beltzner This means I'll get the email with the results, though.
Comment on attachment 402221 [details] [diff] [review] complete patch for 1.9.2 branch Try server run built successfully on Maemo, Linux, WINNT 5.2, WinCE, WinMo, and OS X 10.5.2. Passed unit tests on OS X 10.5.2. Had one unit test failure on Linux that I'm pretty sure is known random: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1253666107.1253673757.20624.gz 29147 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_media_selection.html | Test timed out. Passed unit tests on WINNT 5.2. So a1.9.2=dbaron
Attachment #402221 - Flags: approval1.9.2? → approval1.9.2+
What I should have said last night was, I was leaving at 5:30 and did not have time to look up how to push to the try server. I realized after I posted comment #150 that things that are pushed to the try server show up in tbpl, which makes them much easier to find later. Anyway, I tried to land this on 1.9.2 but I get "could not lock repository .../releases/mozilla-1.9.2: Permission denied". Filed bug 518413, but if anyone wants to do the push, I can watch the tree.
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P1
fixed1.9.2. Marking wontfix for 1.9.1 on the theory that we're not gonna backport the queryframe changes, which this would need. Given that this is not going to be fixed in 1.9.1/ff3.5, is it now safe to talk about this stuff in public?
This is marked FIXED and as fixed on 1.9.2, but I have some questions: - I don't see any tests, such as making sure that our poison address is really not allocatable or whatever, or that we correctly poison after freeing in various cases even. - Is the architecture, especially the technique we use to pick the poisoned address, documented anywhere? I don't see a URL in this bug, but I'll admit that it's pretty long and I didn't read every comment in depth again. - Have we re-evaluated the severity rating of the bugs affected by this? Reducing the impact of these defects in our code was the whole point of this (involved and lengthy!) exercise, so we should track what effect it had on our outstanding set of bugs. I don't think we should talk about it in detail until a 3.6 is available, even if we're not going to backport to 3.5.x, but that's just my opinion.
(In reply to comment #157) > - Have we re-evaluated the severity rating of the bugs affected by this? > Reducing the impact of these defects in our code was the whole point of this > (involved and lengthy!) exercise, so we should track what effect it had on our > outstanding set of bugs. We haven't. We should.
(In reply to comment #157) > This is marked FIXED and as fixed on 1.9.2, but I have some questions: > > - I don't see any tests, such as making sure that our poison address is really > not allocatable or whatever, or that we correctly poison after freeing in > various cases even. I have a crude test program that validates the poison address, I suppose it could become a standalone unit test. Poisoning is verified in debug builds with assertions on reallocation; I don't see any easy way to do better. > - Is the architecture, especially the technique we use to pick the poisoned > address, documented anywhere? I don't see a URL in this bug, but I'll admit > that it's pretty long and I didn't read every comment in depth again. I guess I never mentioned it here. https://intranet.mozilla.org/Frame_poisoning_mitigation_rules I mean to turn this into a public blog post as soon as we can talk about it in public. > - Have we re-evaluated the severity rating of the bugs affected by this? > Reducing the impact of these defects in our code was the whole point of this > (involved and lengthy!) exercise, so we should track what effect it had on our > outstanding set of bugs. There are some small suggestions for severity reduction in the above document. > I don't think we should talk about it in detail until a 3.6 is available, even > if we're not going to backport to 3.5.x, but that's just my opinion. Fine by me.
Blocks: 522088
I'm reopening this. I don't know the exact process for this, but I'm not confident we're at the right place wrt shipping this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: PoisonFrameCrash
Damon, are your concerns due to bug 526587?
(In reply to comment #160) > I'm reopening this. I don't know the exact process for this, but I'm not > confident we're at the right place wrt shipping this. I'm thinking that Damon meant to clear the status1.9.2:fixed flag as well, so that this shows up on blocker reports again.
Re-opening is one way of doing it, especially if your intent is that we should back this patch out. If you have a specific concern, it's usually better to open new bugs and mark them as blocking and dependent on this one.
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
(oops, I didn't mean to do that; up to Damon how he wants to handle!)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Yes, my concern is over bug 526587. It's not blocking, but there's a list of 183 crash sigs there that we haven't analyzed. I don't care which bug is blocking, this one, that one, a new bug that says we must analyze the crashes first. Something must block, because if we don't go through those first, we need to back out FP, turn it off, or back port it. None of the bugs that block bug 526587 are blocking 1.9.2 (at least the open ones), but, again, we still don't know if they exist. Feel free to modify this bug, or whatever, but there should be something that indicates we're not going to ship until we resolve these issues.
Let's use that bug, then. I believe it's pretty clear that the crashes which are fallout from frame poisoning are all operations on deleted frames, which would almost certainly be potential sg:critical bugs if we turned poisoning off.
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
I made some proposals on how to move forward in https://bugzilla.mozilla.org/show_bug.cgi?id=526587#c34 and 35 interested in feedback on these suggestions.
Alias: FramePoisoning
Group: core-security
(In reply to Markus Stange [:mstange] from comment #143) > Landed corrected patch 4: > http://hg.mozilla.org/mozilla-central/rev/82e988788229 Landed a late-breaking followup to update a comment that was missed in this cset. https://hg.mozilla.org/integration/mozilla-inbound/rev/669a973d69d7
Depends on: 1375331
Product: Core → Core Graveyard
Component: Layout: Misc Code → Layout
Product: Core Graveyard → Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: