Closed Bug 599610 Opened 14 years ago Closed 14 years ago

PurgeScriptFragments may not purge all fragments

Tracking

()

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

blocking2.0

---

beta8+

status1.9.2

---

wontfix

status1.9.1

---

unaffected

People

(Reporter: igor, Assigned: igor)

References

Details

(Whiteboard: [sg:critical] [qa-examined-192] [qa-needs-STR])

Attachments

(2 files, 4 obsolete files)

v1 14 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v2 14 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v3 14 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v4 14 years ago Igor Bukanov (deleted), patch	gal : review+	Details \| Diff \| Splinter Review
patch for 192 14 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
patch for 192 14 years ago Igor Bukanov (deleted), patch	gal : review+ dveditz : approval1.9.2.14- dveditz : approval1.9.2.18- dveditz : approval1.9.2.20-	Details \| Diff \| Splinter Review

Igor Bukanov

Assignee

Description

•

14 years ago

PurgeScriptFragments only looks for trace fragments that are found in TraceMonitor for the current thread. This misses other threads that also may contain the fragments for the script. For example, the GC never runs on a web worker thread so the GC will never purge any fragments for web worker functions. This is hazardous as the fragments may match a newly allocate script that reused memory from the deleted script.

Igor Bukanov

Assignee

Comment 1

•

14 years ago

The safest fix for branches would be to enumerate over all threads in search for script fragment. But with web workers activated this would imply that any finalized script (including function scripts) would need to scan at least 2 extra vmfragments instances implying a read of minimum 2 2K segment per script finalization. That could influence the GC pause timing. An alternative is to put all fragments from JSScript into a list so a script can enumerate them all.

Assignee: general → igor

Robert Sayre

Updated

•

14 years ago

Whiteboard: [sg:critical]

Igor Bukanov

Assignee

Updated

•

14 years ago

blocking2.0: --- → ?

Robert Sayre

Updated

•

14 years ago

blocking2.0: ? → beta8+

Robert Sayre

Comment 2

•

14 years ago

Any update here?

blocking1.9.2: --- → ?

status1.9.2: --- → ?

Igor Bukanov

Assignee

Comment 3

•

14 years ago

On branches there is a problem with just enumerating over all threads. There JS_DestroyScript really destroy script and that function can be called outside the GC. Thus enumerating over all threads brings the complexity of using the GC machinery just to enumerate the thread data. This is too complex to add to branches. A safer solution would be to add a field to JSScript that would track the thread that was used for the first recording and will abort the following recordings if that thread would mismatch the current one. That could hurt web workers if the worker is transferred to another thread, but that should be acceptable.

Igor Bukanov

Assignee

Comment 4

•

14 years ago

Attached patch v1 (obsolete) (deleted) — Details — Splinter Review

The idea above of using an owner thread field in JSScript would not lead to a simple patch. With such change it would be necessary to clear such field when JSThread is GC-ed but the script is not. That brings a lot of extra complexity. So here is an implementation of that enumerating all threads idea that should be easy to port to branches. On branches it would not address the issue of JS_DestroyScript been called outside the GC for scripts having fragments on other threads, but such usage does not exists in FF code base.

LegNeato

Comment 5

•

14 years ago

I would imagine this affects 1.9.1 as well, yes?

blocking1.9.1: --- → ?

blocking1.9.2: ? → .12+

status1.9.1: --- → ?

status1.9.2: ? → wanted

Igor Bukanov

Assignee

Comment 6

•

14 years ago

(In reply to comment #5) > I would imagine this affects 1.9.1 as well, yes? No - on 1.9.1 the GC flashes all JIT-generated code. So there the bug can only affect explicitly destroyed scripts outside the GC phase, but those scripts do not escape to other threads in FF code base.

Daniel Veditz [:dveditz]

Updated

•

14 years ago

blocking1.9.1: ? → ---

status1.9.1: ? → unaffected

Igor Bukanov

Assignee

Comment 7

•

14 years ago

Attached patch v2 (obsolete) (deleted) — Details — Splinter Review

Rebased version of v1 with minimal amount of changes.

Attachment #481230 - Attachment is obsolete: true

Igor Bukanov

Assignee

Comment 8

•

14 years ago

Attached patch v3 (obsolete) (deleted) — Details — Splinter Review

The new patch adds a hash of scripts with recorded fragments to offset performance impact of enumerating the fragments for all threads when destructing a script.

Attachment #482507 - Attachment is obsolete: true

Attachment #482522 - Flags: review?(gal)

Igor Bukanov

Assignee

Comment 9

•

14 years ago

Attached patch v4 (deleted) — Details — Splinter Review

The new version adds missing OOM reporting.

Attachment #482522 - Attachment is obsolete: true

Attachment #482547 - Flags: review?(gal)

Attachment #482522 - Flags: review?(gal)

Johnny Stenback (:jst)

Comment 10

•

14 years ago

gal, can you do this review please?

Daniel Veditz [:dveditz]

Updated

•

14 years ago

Whiteboard: [sg:critical] → [sg:critical][needs r=gal]

Andreas Gal :gal

Updated

•

14 years ago

Attachment #482547 - Flags: review?(gal) → review+

Andreas Gal :gal

Updated

•

14 years ago

Whiteboard: [sg:critical][needs r=gal] → [sg:critical]

Andreas Gal :gal

Comment 11

•

14 years ago

Fell off the plate until sayrer pinged me. Sorry for the delay.

Igor Bukanov

Assignee

Comment 12

•

14 years ago

landed to TM - http://hg.mozilla.org/tracemonkey/rev/f9785814bdbc For 1.9.2 the patch requires some porting efforts as the hashmap is not available there. So I suggest to move this to the next release on that branch.

blocking1.9.2: .13+ → ?

Robert Sayre

Comment 13

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/f9785814bdbc

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Daniel Veditz [:dveditz]

Comment 14

•

14 years ago

per comment 12 it's now the "next release" on the branch.

blocking1.9.2: ? → .14+

Igor Bukanov

Assignee

Comment 15

•

14 years ago

Attached patch patch for 192 (obsolete) (deleted) — Details — Splinter Review

Here is untested backport to 1.9.2 that uses JSDHashTable to record traced scripts.

Igor Bukanov

Assignee

Comment 16

•

14 years ago

Attached patch patch for 192 (deleted) — Details — Splinter Review

The patch for 1.9.2 deviates from the trunk too much and requires own review. Part of the reason for that is that JSDhashTable does not have the clear method that required to finish table to remove all elements during flushing.

Attachment #497967 - Attachment is obsolete: true

Attachment #498162 - Flags: review?(gal)

Igor Bukanov

Assignee

Comment 17

•

14 years ago

gal - do you have time to look at 192 patch today before the code freeze ?

Andreas Gal :gal

Comment 18

•

14 years ago

Comment on attachment 498162 [details] [diff] [review] patch for 192 If I don't review a patch within 24h, please nag (I overlooked the email in that case).

Attachment #498162 - Flags: review?(gal) → review+

Igor Bukanov

Assignee

Updated

•

14 years ago

Attachment #498162 - Flags: approval1.9.2.14?

Andreas Gal :gal

Comment 19

•

14 years ago

Its blocking, no approval needed.

Daniel Veditz [:dveditz]

Comment 20

•

14 years ago

(In reply to comment #19) > Its blocking, no approval needed. That's never the case on the branch, please see the tree rules at the top of every tinderbox page. That may be true (most of the time) on the trunk; the branches, however, operate in permanent "last days of the release" mode. (I assume you're referring to branch since it was checked in on the trunk in comment 13.)

LegNeato

Comment 21

•

14 years ago

Comment on attachment 498162 [details] [diff] [review] patch for 192 a=LegNeato for 1.9.2.14. Please land this asap today.

Attachment #498162 - Flags: approval1.9.2.14? → approval1.9.2.14+

Andreas Gal :gal

Comment 22

•

14 years ago

I see. My mistake. In that case though, whats the point of "blocking" a release branch? We should probably not use blocking on branches then, only wanted and approved.

Daniel Veditz [:dveditz]

Comment 23

•

14 years ago

blocking means we'll hold the release for it, wanted means we want it but won't hold the release. Because of the short cycle times we want extra eyes on the actual patch -- think of it as a super-review-required type step.

Andreas Gal :gal

Comment 24

•

14 years ago

We briefly had a super-review-required rule for security patches, but we backed off of that because it was pointless. Now the rule seems to be back, just different and even more pointless. Previously the super review was coming from a super-reviewer who was at least potentially qualified to review the patch at a technical level. Now the super-review is coming from a driver who most likely has no clue what the patch does? Chris, do you have the slightest idea what the code in the patch does you just approved? Don't feel bad if not. Its a super specialized corner case that probably 3-4 people on the planet really understand. You can't expect drivers to have a technical clue/review opinion for individual patches. That's simply not their job. These extra eyes are not qualified to see anything useful (and shouldn't be expected to). Reviews should come from module owners/peers. For the last few branch patches I have seen the approval step is basically just a delay/paperwork/stamping step. Why do it then? If you mark a bug blocking a branch, it means: module owners/peers, write a patch, have it reviewed, and land it asap. If you are worried about upcoming freezes/timing/scheduling and want to stabilize a branch, just close it.

Igor Bukanov

Assignee

Comment 25

•

14 years ago

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/a20b66107a79

status1.9.2: wanted → .14-fixed

Reed Loden [:reed]

Comment 26

•

14 years ago

(In reply to comment #24) > Now the rule seems to be back, just different and even more pointless. "back"? The branches have had the same rules now for as long as I can remember... probably at least 4-5 years, if not longer. This process isn't anything new. All branch patches require approval. While it may seem burdensome, this process has caught numerous problems in the past that would have been missed otherwise had the branch drivers not specifically looked at the patch (even if they didn't understand all of it). Considering the branches get very little testing due to lack of nightly users and limited QA, having an extra pair of eyes is important.

LegNeato

Comment 27

•

14 years ago

(In reply to comment #24) > Chris, do you have the slightest idea what the code in the patch does you just > approved? Don't feel bad if not. Its a super specialized corner case that > probably 3-4 people on the planet really understand. You can't expect drivers > to have a technical clue/review opinion for individual patches. That's simply > not their job. These extra eyes are not qualified to see anything useful (and > shouldn't be expected to). Reviews should come from module owners/peers. Nope, I do not. Sometimes I do, though more often I just look at the extent of the change and approve unless it looks huge. I see the approval required more to meter what goes into a release, not to determine if the fix is "correct". It also gives an additional chance for me (or other release drivers) to ask clarifying questions, make sure 1.9.1 gets a patch if needed, etc. Sure, the same thing can be done w/o needing the up front approvals, but the reality would be that the additional checking would go away after a month or so. > For the last few branch patches I have seen the approval step is basically just > a delay/paperwork/stamping step. Why do it then? If you mark a bug blocking a > branch, it means: module owners/peers, write a patch, have it reviewed, and > land it asap. If you are worried about upcoming freezes/timing/scheduling and > want to stabilize a branch, just close it. Yes, it is generally rubberstamping for JS blackmagic these days :-). It makes a lot more sense for other areas of the code. I do think we need to revamp the way we manage branch blocking/approvals, I just haven't figured out the best way to do it. I've never encountered a release process that only approves bugs without requiring approval on the individual patches as well FWIW.

Igor Bukanov

Assignee

Comment 28

•

14 years ago

On 192 the patch lead to busted leak test: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.6/1295472073.1295478386.7556.gz&fulltext=1 My plan is to check the patch one more time and backout in an hour or so.

Igor Bukanov

Assignee

Comment 29

•

14 years ago

To 1.9.2 drivers: I could not find anything suspicions in the patch. Given the closed tree status I think it is better to wait until a fresh build on tree reopening. I will be offline for the next 9 hours or so (need to sleep in CEST timezone).

Igor Bukanov

Assignee

Comment 30

•

14 years ago

The tinderbox failure was caused by the bug 627225 and unrelated to the landed patch.

Al Billings [:abillings - ex-MoCo]

Comment 31

•

14 years ago

Are there any STR for 1.9.2 to verify this fix?

Whiteboard: [sg:critical] → [sg:critical] [qa-examined-192] [qa-needs-STR]

Daniel Veditz [:dveditz]

Updated

•

14 years ago

Depends on: 631105

LegNeato

Comment 32

•

14 years ago

I backed this out of 1.9.2 (both default and the .14 relbranch) due to it probably causing bug 631105

blocking1.9.2: .14+ → needed

status1.9.2: .14-fixed → ---

Daniel Veditz [:dveditz]

Comment 33

•

14 years ago

It probably didn't cause bug 631105, not directly. Backing this out moved the crash to bug 633869 but the underlying cause of both now appears to be frankenbuilds.

Daniel Veditz [:dveditz]

Comment 34

•

14 years ago

We've landed and then (fairly publicly) backed out this change. We should try to get it into the next release if we can.

blocking1.9.2: needed → ?

status1.9.2: --- → wanted

Daniel Veditz [:dveditz]

Updated

•

14 years ago

Attachment #498162 - Flags: approval1.9.2.15?

Daniel Veditz [:dveditz]

Updated

•

14 years ago

blocking1.9.2: ? → .15+

LegNeato

Updated

•

14 years ago

Attachment #498162 - Flags: approval1.9.2.15? → approval1.9.2.15+

Daniel Veditz [:dveditz]

Comment 35

•

14 years ago

Comment on attachment 498162 [details] [diff] [review] patch for 192 This turned out not to be the culprit for the crash spike, we want it back in now. Approved for 1.9.2.15, a=dveditz for release-drivers clegnitto will try to re-land the patch he backed out, but may need a merge if something else changed.

Attachment #498162 - Flags: approval1.9.2.14+ → approval1.9.2.14-

Daniel Veditz [:dveditz]

Updated

•

14 years ago

blocking1.9.2: .17+ → .18+

Daniel Veditz [:dveditz]

Comment 36

•

14 years ago

Comment on attachment 498162 [details] [diff] [review] patch for 192 Since this is one of the ones that led to the frankenbuild crash spike we want to wait another release. 3.6.17 is getting additional install/update fixes that should help that problem.

Attachment #498162 - Flags: approval1.9.2.17+ → approval1.9.2.18?

Daniel Veditz [:dveditz]

Comment 37

•

14 years ago

Comment on attachment 498162 [details] [diff] [review] patch for 192 Approved for 1.9.2.18, a=dveditz for release-drivers Now that users have the install fixes we want to land this.

Attachment #498162 - Flags: approval1.9.2.18? → approval1.9.2.18+

Daniel Veditz [:dveditz]

Updated

•

14 years ago

Flags: in-testsuite?

Keywords: testcase-wanted

Daniel Veditz [:dveditz]

Updated

•

14 years ago

blocking1.9.2: .18+ → .19+

Daniel Veditz [:dveditz]

Updated

•

13 years ago

Attachment #498162 - Flags: approval1.9.2.19?

Attachment #498162 - Flags: approval1.9.2.18-

Attachment #498162 - Flags: approval1.9.2.18+

LegNeato

Comment 38

•

13 years ago

We decided there is just too much risk at this point to take this in 3.6 due to our lower test population. Marking as wontfix for 1.9.2.

blocking1.9.2: .20+ → ---

status1.9.2: wanted → wontfix

Daniel Veditz [:dveditz]

Updated

•

13 years ago

Attachment #498162 - Flags: approval1.9.2.20? → approval1.9.2.20-

Daniel Veditz [:dveditz]

Updated

•

10 years ago

Group: core-security

Ryan VanderMeulen [:RyanVM]

Updated

•

9 years ago

Keywords: testcase-wanted

You need to log in before you can comment on or make changes to this bug.