Closed Bug 489345 Opened 15 years ago Closed 15 years ago

OOM Implementation

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect, P1)

x86
macOS
defect

Tracking

(Not tracked)

VERIFIED FIXED
flash10.1

People

(Reporter: treilly, Assigned: treilly)

References

Details

(Whiteboard: Tracking)

Attachments

(19 files, 5 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
stejohns
: review+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: review+
Details | Diff | Splinter Review
(deleted), patch
stejohns
: review+
edwsmith
: superreview+
Details | Diff | Splinter Review
(deleted), patch
edwsmith
: superreview+
Details | Diff | Splinter Review
(deleted), patch
edwsmith
: superreview+
Details | Diff | Splinter Review
(deleted), patch
stejohns
: review+
Details | Diff | Splinter Review
(deleted), patch
stejohns
: review+
Details | Diff | Splinter Review
(deleted), patch
edwsmith
: superreview+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: superreview+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: superreview+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: superreview+
Details | Diff | Splinter Review
(deleted), patch
lhansen
: superreview+
Details | Diff | Splinter Review
Attached patch impl first take (deleted) — Splinter Review
Bug for OOM Implementation patches
Attachment #373843 - Flags: review?(lhansen)
This is the first take, status is that its been lightly debugged (passes regression tests, including oom tests) and while it tries to incorporate all the player's requirements it hasn't been integrated their yet. This isn't for pushing, just code review, once its been integrated into the player another patch will be posted.
I've done a once-over. Here are some odds and ends in no particular order. I don't understand why you removed signalling of start and end of collection from the policy manager and GCManager; this call was not used for anything at present, true, but is intended to aid in making gc decisions when the policy code evolves. Making GCHeap a friend of GCManager is stylistically dodgy. It's not obviously correct to remove MMGC_LOCKING and assume it always defaults to true, but I think it's worth discussing (in the larger group) whether the small costs incurred in the case where it was false are worth the simplification. Even if MMGC_LOCKING has been removed some places, it lives on in other places, like in GC::~GC. FixedAlloc::~FixedAlloc introduces an extra guard; it does not perform shutdown checking if status is kMemAbort. But VerifyFreeBlockIntegrity, which is called outside this guard, does not have a guard. Should it? The GC introduces a "CollectionNeeded" check but this needs to go through the policy manager, not be some hidden mechanism. The point of having the policy manager in the first place is to centralize policy. Please don't change this. Same thing with fullCollectionQueued - don't scatter policy around in hard-to-reach places. WriteBarrierWrite is now even more expensive, this does not look so good. It's correct for write barrier work to drive the gc, but this could be handled more localized, eg as part of an overflow check. (And the policy should be separated out, not be another call to CollectionNeeded.) Is the definition of CollectionNeeded correct? I doubt it; unless there's a guarantee that we can get out of the reserve state at the end of a GC we'll immediately start another gc. It looks like the status only goes back to kMemNormal after memory is returned to GCHeap but it's possible to free up a lot of memory (and thus drive the program forward) without returning any blocks at all to GCHeap, no?
Depends on: 490623
Depends on: 488733
Depends on: 490378
Depends on: 491314
Depends on: 491068
Depends on: 491454
Blocks: 488738
Depends on: 491495
Target Milestone: --- → flash10.x
Depends on: 396921
No longer depends on: 491495
Target Milestone: flash10.x → ---
Depends on: 491531
Depends on: 492024
Attached patch makes tr compile in frr (deleted) — Splinter Review
Attachment #378675 - Flags: review?(stejohns)
Attachment #378675 - Flags: review?(stejohns) → review+
pushed changeset: 1913:57007bee31b1
Attachment #379712 - Flags: review?(lhansen)
Flags: flashplayer-qrb+
Priority: -- → P2
Target Milestone: --- → flash10.x
Comment on attachment 379712 [details] [diff] [review] distilled version of first take, should be small refinements after this Stylistic & other: GCHeap::StatusChangeNotify could use the locking version of BasicList::next, presumably.
Attachment #379712 - Flags: review?(lhansen) → review+
changeset: 1954:014ecfe95c47
Attachment #373843 - Flags: review?(lhansen)
Attachment #380482 - Flags: review?(lhansen)
Comment on attachment 380482 [details] [diff] [review] Full collection and stackless scanning, callback pruning cancelled review, missing call to completeFullCollection
Attachment #380482 - Flags: review?(lhansen)
Attachment #380482 - Attachment is obsolete: true
Attachment #380844 - Flags: review? → review?(lhansen)
Attachment #380484 - Attachment is obsolete: true
Comment on attachment 380844 [details] [diff] [review] Implement collection queuing, stack less scanning and fixup some enter/exit issues Minor point: Needs documentation for queryFullCollectionQueued(), fullCollectionQueued. Style: in SetStackEnter, and several related functions, you're using "0" but you mean "NULL". I'm fairly sure that the ReapZCT call in SetStackEnter needs to be controlled by policy, as two reaps back to back can be bad for performance if the ZCT is not close to empty. I'm sure I don't understand what the term "edge" is meant to convey. "enter_or_exit" might be more indicative of what's going on, not sure.
Attachment #380844 - Flags: review?(lhansen) → review+
pushed: changeset: 1975:079b86c8954a fixed the 0/NULL instances A stack less, edge ZCT reap is pretty cheap back to back because there's no stack scan and the RCRoot segments should have all been popped, also it early returns if count is zero (which its guaranteed to be after a Reap occurs with no stack and no RCRoot segments) you got it, edge == primary enter or final exit.
Attached patch Sync with player, add new VMPI api's (obsolete) (deleted) — Splinter Review
Attachment #381770 - Flags: superreview?(lhansen)
Attachment #381770 - Flags: review?
Attachment #381770 - Flags: review? → review?(rishah)
Attached patch real patch (obsolete) (deleted) — Splinter Review
Attachment #381770 - Attachment is obsolete: true
Attachment #381773 - Flags: superreview?(lhansen)
Attachment #381773 - Flags: review?(rishah)
Attachment #381770 - Flags: superreview?(lhansen)
Attachment #381770 - Flags: review?(rishah)
Attachment #381773 - Flags: review?(rishah) → review+
Comment on attachment 381773 [details] [diff] [review] real patch Just one comment: In GCDebug.cpp vsnprintf is assumed to always return 0 or positive value. In theory, it could return a negative value. Minor issue but thought I'd point out.
Attachment #381773 - Flags: superreview?(lhansen) → superreview+
Comment on attachment 381773 [details] [diff] [review] real patch Approved, with need for minor revision. Must fix: The documentation for VMPI_cleanStack is inadequate. It needs to specify whether the cleaning is into the unallocated space ("above" SP) or the other direction; it needs to specify whether it is OK for this function to do nothing and what the consequences of that would be. The naming of VMPI_spyCallback is inappropriate, since it is a callback for allocation. The name must be changed. Eg, VMPI_spyAllocationEvent. Should fix: VMPI_areNewPagesDirty does not need to be a function; a configuration parameter should be enough (I assume the return value is always constant for a given invocation of the VM). The cost of this bug is small, though, so it's OK with me for now. We'll do a bigger cleanup job post-release.
Comment on attachment 381773 [details] [diff] [review] real patch redux changeset: 2001:37483f093b65, with the two required changes as noted in the superreview, and with two additional bug fixes to allow the acceptance tests to pass: widen the scope of the MMGC_ENTER in the avmshell, and fix the implementation of System.totalMemory.
Attachment #381773 - Attachment is obsolete: true
Attached patch Slightly more correct patch (obsolete) (deleted) — Splinter Review
This patch is better than Tommy's (compiles on more platforms, fixes more bugs) but fails smoke testing on Windows and Solaris. On windows, the is an assert in the locking code, probably during OOM shutdown: ownerThread != current thread in a call to Release. (The exit code is then 2^32-1; this is normal.) It's not known what the values of the thread IDs are. The guess du jour is that ownerThread is zero, ie, it's already been released and the problem is a double release of the lock. Other platforms do not have similar asserts in the threading code, so it's not known if the problem is platform specific or just goes undetected elsewhere. On Solaris, the exit code is '133' instead of the expected '128'; don't know why. I know some platforms have strange exit code conventions and that may be worth looking into before doing anything else.
One more thing: the problem on Windows is so far not reproducible locally on my Thinkpad but is reliably reproducible on the buildbot.
Attached patch Even more correct patch (deleted) — Splinter Review
(In reply to comment #22) > Created an attachment (id=382535) [details] > Slightly more correct patch > > This patch is better than Tommy's (compiles on more platforms, fixes more > bugs) but fails smoke testing on Windows and Solaris. The Windows bug has been diagnosed and fixed: the problem is that the buildbot build scripts enable C++ exception handling in debug builds. Since the longjmp that comes with Microsoft Visual C++ performs stack unwinding (!) when exception handling is enabled, the locking mechanism used in MMgc, which relies on the constructor of an auto object acquiring the lock and the destructor releasing it, breaks. (MMgc assumes that a longjmp is just that - a long jump, discarding the stack contents, and explicitly releases the lock before the longjmp.) The new patch removes -EHsc from configure.py, documents the problem in that file, fixes (temporarily) a logic bug in GCHeap.cpp that was discovered by inspection, and copies the assertions about acquire/release patterns from the Windows spinlock implementation to the Mac spinlock implementation.
Attachment #382535 - Attachment is obsolete: true
Depends on: 497624
The better patch was committed but then reverted again (redux changesets 2010, 2011). The logic bug fix in GCHeap.cpp ripples backward into AllocBlockIncremental and AllocBlockNonincremental, which probably need to pass kCanFail to heapAlloc, as do other callers of heapAlloc that test for a NULL return. More extensive testing seems in order. Consider this an r- on the current patch.
Attachment #384849 - Flags: review?
Attachment #384849 - Flags: review? → review?(lhansen)
Attachment #384849 - Flags: review?(lhansen) → review+
Attached patch hack fix (deleted) — Splinter Review
Please, please tell me you have a better idea on how to fix this...
Attachment #391127 - Flags: review?(lhansen)
If enterFrame isn't NULL then we are on thread and have previously entered so ShouldNotEnter returns false
Attachment #391130 - Flags: review?(lhansen)
Comment on attachment 391127 [details] [diff] [review] hack fix I'm holding my nose but this seems expedient and not obviously wrong.
Attachment #391127 - Flags: review?(lhansen) → review+
Comment on attachment 391130 [details] [diff] [review] Clean up and fix enter guard to allow on thread enters Stylistic nit: ShouldNotEnter can simply return the value of the test, you don't need if (...) return true; else return false;
Attachment #391130 - Flags: review?(lhansen) → review+
Attachment #391133 - Flags: review?(lhansen)
Attachment #391133 - Flags: review?(lhansen) → review+
(In reply to comment #30) > (From update of attachment 391130 [details] [diff] [review]) > Stylistic nit: ShouldNotEnter can simply return the value of the test, you > don't need if (...) return true; else return false; Yeah I originally coded it that way then found myself wanting to break on the return true case a lot and needed it on its own line, stupid debuggers.
Comment on attachment 391127 [details] [diff] [review] hack fix This patch does not compile and thus should not have been pushed...!
Comment on attachment 391127 [details] [diff] [review] hack fix Backed out with change changeset: 2263:9651f5d8a598 temporarily to get build working again
Flags: flashplayer-triage+
pushed (In reply to comment #26) > Created an attachment (id=384849) [details] > Allow MMgc to handle oom exit for standalone processes pushed: changeset: 2349:cd8b12ad3ca9
Attached patch silence asserts in player (deleted) — Splinter Review
Attachment #394091 - Flags: review?(stejohns)
Attachment #394091 - Flags: superreview?(edwsmith)
Attachment #394096 - Flags: superreview?(edwsmith)
Attachment #394091 - Flags: review?(stejohns) → review+
Attachment #394103 - Flags: review?(edwsmith)
Attachment #394096 - Flags: superreview?(edwsmith) → superreview+
Comment on attachment 394097 [details] [diff] [review] during browser OOM restart we need to detect the GCHeap is NULL and re-create it, this assert gets in the way Comment please, explaining why its okay for instance to be NULL in OOM situations
Attachment #394097 - Flags: superreview?(edwsmith) → superreview+
Attachment #394099 - Flags: review?(edwsmith) → review?(stejohns)
Attachment #394103 - Flags: review?(edwsmith) → review?(stejohns)
Attachment #394091 - Flags: superreview?(edwsmith) → superreview+
Attachment #394099 - Flags: review?(stejohns) → review+
Attachment #394103 - Flags: review?(stejohns) → review+
Attached patch go valgrind! (deleted) — Splinter Review
Attachment #394340 - Flags: superreview?(edwsmith)
Comment on attachment 394340 [details] [diff] [review] go valgrind! VG error gone after patch applied
Attachment #394340 - Flags: superreview?(edwsmith) → superreview+
Depends on: 511858
Attachment #398417 - Flags: superreview?(lhansen)
Attachment #398417 - Flags: superreview?(lhansen) → superreview+
No longer depends on: 491454
Attached patch clear GC lock when aborting (deleted) — Splinter Review
Attachment #398647 - Flags: superreview?(lhansen)
Attachment #398647 - Flags: superreview?(lhansen) → superreview+
Depends on: 508610
No longer blocks: 488738
Depends on: 488738
Depends on: 509067
Depends on: 512976
Depends on: 491388
Depends on: 506013
Attached patch don't re-jump (deleted) — Splinter Review
Attachment #399489 - Flags: superreview?(lhansen)
Depends on: 515659
Depends on: 515935
Attachment #399489 - Flags: superreview?(lhansen) → superreview+
Attachment #399505 - Flags: superreview?(lhansen) → superreview+
changeset: 2487:f4effd9bc9cb
Depends on: 516694
Depends on: 516848
Depends on: 516379
Depends on: 491734
Whiteboard: Tracking
Priority: P2 → P1
No longer depends on: 506013
No longer depends on: 516379
Depends on: 523389
Depends on: 524473
knock on wood...
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Engineering work item. Marking verified fixed.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: