Closed Bug 604704 Opened 14 years ago Closed 6 years ago

Optimize the exact tracing methods

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: lhansen, Unassigned)

Attachments

(2 files, 1 obsolete file)

Once the exact tracing infrastructure has landed in Tamarin, we need to worry about performance. There are many optimization ideas in the roadmap (https://acrobat.com/#d=UCWhgWVzJeNr7Ph3M7y1Fw), and we should pursue them. In particular, we should look into fast scanning of slot arrays.
Depends on: 617943
Attached patch WIP: Specialized string tracers (obsolete) (deleted)
Warm-up exercise. Strings are bounded-depth data structures (a string is either a leaf, possibly with atomic data hanging off it, or it is dependent on a master that is a leaf). Thus we can avoid pushing and popping work items when marking a string value, and we can specialize TraceLocation for String pointers (yay C++). The attached code is a preliminary attempt to do that. A simple test shows some performance wins too (MacPro Release build w/GCC 4.0, five iterations, further testing required):

                      avm             avm2
test              best     avg    best     avg   %dBst   %dAvg
Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8.5/js/
crypto             541   540.2     548   547.2     1.3     1.3  +
deltablue          382   381.2     383   382.2     0.3     0.3  +
earley-boyer      1254  1247.8    1293    1289     3.1     3.3  +
raytrace           814   812.4     853   851.6     4.8     4.8  +
regexp             103   102.2     116     116    12.6    13.5  ++
richards           323     323     317     317    -1.9    -1.9  -
splay              925     844     974   965.8     5.3    14.4
Dir: v8.5/optimized/
crypto            4523  4514.6    4679  4665.8     3.4     3.3  +
deltablue         4061  4042.4    4068  4056.2     0.2     0.3
earley-boyer      1262    1253    1287  1277.8     2.0     2.0  +
raytrace          9383    9364    9715  9685.8     3.5     3.4  +
regexp             103   102.2     117   116.2    13.6    13.7  ++
richards          4578  4575.6    4602  4599.6     0.5     0.5  +
splay             6610    6582    6888  6870.2     4.2     4.4  +
Dir: v8.5/typed/
crypto            3516  3510.6    3563    3557     1.3     1.3  +
deltablue         4122  4116.8    4183    4176     1.5     1.4  +
earley-boyer      1261  1254.6    1291  1286.6     2.4     2.6  +
raytrace          9392  9360.4    9690    9673     3.2     3.3  +
regexp             102     102     117   116.6    14.7    14.3  ++
richards          4572    4572    4615    4607     0.9     0.8  +
splay             1147  1143.6    1208  1207.2     5.3     5.6  ++
Dir: v8.5/untyped/
crypto             600   599.4     602   600.6     0.3     0.2  +
deltablue         2041  2038.6    2090  2085.2     2.4     2.3  +
earley-boyer      1253  1249.4    1284  1281.8     2.5     2.6  +
raytrace          3894    3886    3959  3953.4     1.7     1.7  +
regexp             102     102     117   116.2    14.7    13.9  ++
richards           527   526.2     504   500.8    -4.4    -4.8  -
splay             1088  1085.6    1143  1141.2     5.1     5.1  ++

Not sure what the slowdown on Richards means.
Apples-to-apples now, same setup:

                      avm             avm2
test              best     avg    best     avg   %dBst   %dAvg
Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8.5/js/
crypto             546   545.6     548   547.4     0.4     0.3  +
deltablue          383   381.4     383   382.8       0     0.4
earley-boyer      1292  1287.6    1293    1285     0.1    -0.2
raytrace           868     866     853   852.6    -1.7    -1.5  -
regexp             117   116.2     117   116.2       0       0
richards           321     321     318     318    -0.9    -0.9
splay              984   972.6     980     973    -0.4     0.0
Dir: v8.5/optimized/
crypto            4688    4680    4697  4688.6     0.2     0.2  +
deltablue         4096  4077.8    4089  4075.8    -0.2    -0.0
earley-boyer      1280  1272.4    1283  1278.8     0.2     0.5
raytrace          9764    9743    9690  9680.8    -0.8    -0.6  -
regexp             117   116.6     116     116    -0.9    -0.5  -
richards          4602  4592.4    4615  4609.4     0.3     0.4  +
splay             7007  6988.4    6851  6846.4    -2.2    -2.0  -
Dir: v8.5/typed/
crypto            3592  3589.2    3572  3569.8    -0.6    -0.5  -
deltablue         4183  4176.2    4175  4166.6    -0.2    -0.2
earley-boyer      1278  1274.6    1287  1281.8     0.7     0.6  +
raytrace          9745    9729    9681  9673.4    -0.7    -0.6  -
regexp             117   116.8     117   116.4       0    -0.3
richards          4602  4594.8    4608  4605.6     0.1     0.2
splay             1219  1216.4    1213  1211.4    -0.5    -0.4  -
Dir: v8.5/untyped/
crypto             602   601.4     604   602.8     0.3     0.2  +
deltablue         2088  2081.4    2084  2079.6    -0.2    -0.1
earley-boyer      1273  1267.6    1283  1275.6     0.8     0.6
raytrace          3959  3948.8    3959    3951       0     0.1
regexp             117   116.2     117   116.2       0       0
richards           496   494.4     503     502     1.4     1.5  +
splay             1153  1151.4    1143  1140.8    -0.9    -0.9  -
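For readers without access to the deleted patch, here is a minimal sketch of the string-tracer idea, not the attached code. It assumes a marker with a work stack and a TraceLocation-style overload set; every name in it (GCObject, Buffer, Str, Marker, traceLocation, pushes) is hypothetical and is not taken from the Tamarin sources. Because a string is either a leaf or depends on a leaf master, the string overload can mark everything reachable inline and never touch the work stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Marker;

// Hypothetical base class for collectable objects.
struct GCObject {
    bool marked = false;
    virtual ~GCObject() = default;
    // Generic objects report their children to the marker.
    virtual void traceChildren(Marker&) {}
};

// Pointer-free character data hanging off a leaf string.
struct Buffer : GCObject {};

struct Str : GCObject {
    Buffer* buffer = nullptr;  // set on a leaf string that owns data
    Str* master = nullptr;     // set on a dependent string; must be a leaf
    void traceChildren(Marker& m) override;
};

struct Marker {
    std::vector<GCObject*> workStack;
    std::size_t pushes = 0;  // counts work items, for illustration only

    // Generic path: mark the object and queue it for later scanning.
    void traceLocation(GCObject* p) {
        if (!p || p->marked) return;
        p->marked = true;
        workStack.push_back(p);
        ++pushes;
    }

    // String path: depth is bounded, so mark everything inline.
    void traceLocation(Str* s) {
        if (!s || s->marked) return;
        s->marked = true;
        if (s->buffer) s->buffer->marked = true;
        if (Str* m = s->master) {
            assert(m->master == nullptr);  // a master is always a leaf
            m->marked = true;
            if (m->buffer) m->buffer->marked = true;
        }
    }

    // Processes queued work items until the stack is empty.
    void drain() {
        while (!workStack.empty()) {
            GCObject* p = workStack.back();
            workStack.pop_back();
            p->traceChildren(*this);
        }
    }
};

// Used only when a string is reached through the generic path.
inline void Str::traceChildren(Marker& m) {
    m.traceLocation(static_cast<GCObject*>(buffer));
    m.traceLocation(static_cast<GCObject*>(master));
}
```

In this sketch, marking a dependent string through the Str overload needs no pushes at all, while reaching the same string through a GCObject pointer queues one work item per object.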
Attached patch WIP: Optimized slot tracer (deleted)
This is a faster bit-scan loop for slot tracers. Here are some numbers, but the big win from this change should show up on Flex apps, not on microbenchmarks:

                      avm             avm2
test              best     avg    best     avg   %dBst   %dAvg
Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8.5/js/
crypto             547   546.6     549   547.8     0.4     0.2  +
deltablue          382   381.8     376   375.6    -1.6    -1.6  -
earley-boyer      1292  1286.6    1294    1292     0.2     0.4
raytrace           868   867.6     860   859.4    -0.9    -0.9  -
regexp             117   116.4     117   116.8       0     0.3
richards           322   321.4     320     320    -0.6    -0.4  -
splay              984   977.6     989     976     0.5    -0.2
Dir: v8.5/optimized/
crypto            4699  4691.6    4695  4686.6    -0.1    -0.1
deltablue         4075  4065.2    4108  4086.6     0.8     0.5  +
earley-boyer      1274  1260.4    1270  1265.4    -0.3     0.4
raytrace          9764  9747.2    9745  9708.6    -0.2    -0.4
regexp             117   116.4     117   116.2       0    -0.2
richards          4602  4593.6    4602  4591.2       0    -0.1
splay             6968  6945.8    7015  6986.2     0.7     0.6  +
Dir: v8.5/typed/
crypto            3600    3592    3583    3580    -0.5    -0.3  -
deltablue         4190  4185.8    4183  4174.8    -0.2    -0.3
earley-boyer      1269  1265.2    1279  1275.4     0.8     0.8  +
raytrace          9755  9745.4    9764    9745     0.1    -0.0
regexp             117   116.4     117   116.6       0     0.2
richards          4602  4592.4    4602  4594.8       0     0.1
splay             1215  1214.6    1225  1222.6     0.8     0.7  +
Dir: v8.5/untyped/
crypto             603   600.6     599   597.6    -0.7    -0.5  -
deltablue         2088  2084.8    2086  2081.2    -0.1    -0.2
earley-boyer      1269  1260.8    1272  1263.6     0.2     0.2
raytrace          3943  3942.4    3955  3947.4     0.3     0.1  +
regexp             116   115.8     117   116.4     0.9     0.5  +
richards           498   492.2     493   491.2    -1.0    -0.2
splay             1153    1152    1162  1159.2     0.8     0.6  +
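The patch itself is no longer available, so the following is only a sketch of what a bit-scan loop over a slot bitmap can look like. It assumes each set bit in a 32-bit word flags a slot that holds a pointer; that layout and the names scanSlots and countTrailingZeros are assumptions for illustration, not the Tamarin code. The loop jumps straight to each set bit and clears it, so empty words and runs of non-pointer slots cost almost nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable count of trailing zero bits in a nonzero word. A production
// build would more likely use a compiler intrinsic for this step.
inline unsigned countTrailingZeros(std::uint32_t w) {
    assert(w != 0);
    unsigned n = 0;
    while ((w & 1u) == 0) {
        w >>= 1;
        ++n;
    }
    return n;
}

// Calls visit(slotIndex) once for every set bit in the bitmap, in
// ascending slot order. Words that are zero are skipped in one test.
template <typename Visit>
void scanSlots(const std::uint32_t* bitmap, std::size_t wordCount, Visit visit) {
    for (std::size_t i = 0; i < wordCount; ++i) {
        std::uint32_t w = bitmap[i];
        while (w != 0) {
            unsigned bit = countTrailingZeros(w);
            visit(i * 32 + bit);
            w &= w - 1;  // clear the lowest set bit
        }
    }
}
```

A tracer built this way would pass a callback that loads the slot at the given index and hands it to the marker.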
Flags: flashplayer-bug-
This is somewhat more sophisticated but makes no real difference on any benchmark we currently have. A benchmark that ought to show a difference would have a very large database of mixed dependent/indirect strings (probably a Vector.<String>) and would run the allocator enough to cause a lot of GC in that database. That said, it may be that the high bit for GC performance is elsewhere right now and that these performance tweaks should be back-burnered until we've tackled other problems.
Attachment #502756 - Attachment is obsolete: true
Priority: P3 → --
Target Milestone: Q3 11 - Serrano → Future
Depends on: 650102
Another idea, which probably would not make a difference today, is to exploit the new Leaf types to avoid testing ContainsPointers, i.e., TraceLocation overrides would call TraceLeafPointer, which would just set the mark bit.
Actually, if the only !kContainsPointers allocations were Leaf objects, we could also remove the test from TracePointer, because ContainsPointers would always be true.
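A rough sketch of the leaf-pointer idea follows, under the assumption that the generic path marks an object and then tests a contains-pointers flag before queueing it. The types and member names here (Obj, Marker, tracePointer, traceLeafPointer, flagTests) are hypothetical stand-ins, not the MMgc API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object header; containsPointers stands in for the
// kContainsPointers allocation flag.
struct Obj {
    bool marked = false;
    bool containsPointers = false;
};

struct Marker {
    std::vector<Obj*> workStack;
    std::size_t flagTests = 0;  // counts flag tests, for illustration only

    // Generic path: mark, then test the flag to decide whether to queue.
    void tracePointer(Obj* p) {
        if (!p || p->marked) return;
        p->marked = true;
        ++flagTests;
        if (p->containsPointers) workStack.push_back(p);
    }

    // Leaf path: the static type already says there is nothing to scan,
    // so only the mark bit is set.
    void traceLeafPointer(Obj* p) {
        if (!p) return;
        assert(!p->containsPointers);
        p->marked = true;
    }
};
```

If every pointer-free allocation went through the leaf path, the flag test in tracePointer would always succeed, so the branch could be dropped and the object queued unconditionally.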
Assignee: lhansen → nobody
Status: ASSIGNED → NEW
Flags: flashplayer-qrb+
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.