Closed Bug 457582 Opened 16 years ago Closed 16 years ago

performance metric changes between 1pm-2pm 2008-09-26

Categories

(Release Engineering :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: karlt, Unassigned)

Details

(Keywords: memory-footprint, memory-leak, perf)

Similar issues appear on Windows, but the time is a little earlier. I guess that is because the time reported is the code build time, whereas the talos files are pulled at a later date.

XP, Vista Tp3 Mem-WS 1.8, 2.0
http://graphs.mozilla.org/graph.html#show=787138,787155,787156,395024,395034,395054,53282,61128&sel=1222438002,1222474026

XP Tp3 1.8, 1.9, 2.0
http://graphs.mozilla.org/graph.html#show=53218,53236,53254,53276,395008,395020,395048&sel=1222437590,1222484358
It seems like a bad idea to be making changes that could affect the performance results either (a) while the tree is open or (b) without announcing them so people looking at the graphs know what caused the changes.
The change from 5->10 was actually a fix for a previous error where 10 was changed to 5 (this happened during talos config file clean-up work a couple of months ago and was not noticed until now).

The linux number drop was part of an attempt to fix bug 457464. I noticed a long-running, high-CPU-usage process on the boxes and killed it off, hoping to reduce variance in the numbers. A side effect of this was to cause the linux talos numbers on all boxes to drop slightly.

I'm sorry for the confusion caused by these changes. In future I'll try to keep the risky stuff to scheduled downtimes.
In terms of the change in memory metrics, that is connected to bug 450666 - my test system didn't show that there would be any change in the numbers, but it looks like that was untrue.

So, this Friday there was:
1) bug fix for 5 test cycles to 10 (should have ended up in a bug, but didn't)
2) attempt to fix oddness in linux numbers that resulted in a drop in the linux tp numbers across the board
3) check in of a patch to reduce the amount of memory related data points collected by talos that resulted in the drop in memory related talos numbers across the board

Overall, not a very good Friday afternoon for me.

As it stands now, #1 is going to be left as is, as it was an attempt to return to the stable point. I would argue that #2 should also be left as is, as the long term result should be a reduction in variance in the linux talos numbers. Discussion regarding #3 should be moved to the appropriate bug for follow up.
(In reply to comment #3)
> The change from 5->10 was actually a fix for a previous error where 10 was
> changed to 5 (this happened during talos config file clean up work that
> happened a couple of months ago and was only noticed till now).

I can't see where 10 was changed to 5. testing/performance/talos/sample.config has had "-tpcycles 5" since 2007-09-19 11:02 at least.
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/testing/performance/talos/sample.config&rev=1.15&root=/cvsroot&mark=86#85

Was a different file used previously? Can you point to where the change happened, please?

> The linux number drop was part of an attempt to fix bug 457464. I noticed a
> long running, high cpu usage process on the boxes and killed it off - hoping
> to reduce variance in the numbers. A side effect of this was to cause the
> linux talos numbers on all boxes to drop slightly.

This was "an attempt to fix bug 457464" before the bug was filed, I assume.

I would expect the -tpcycles change to affect performance numbers on Linux noticeably. The first cycle is always much slower, because the later cycles use cached data from the first run (even though there are 400+ pages, yes). Increasing the number of cycles reduces the contribution of the first cycle to the average.

I don't know what performance change to expect on other platforms, but MS XP Tp also dropped. The "high cpu usage process" was on Linux only, I assume. Merely accessing the necessary files/blocks in the first cycle can put them in OS filesystem caches ready for faster access in subsequent cycles.

(In reply to comment #4)
> In terms of the change in memory metrics that is connected to bug 450666

Again, this could be the -tpcycles change. It could be a leak in the app, or it could be that the faster testing means that more items in our caches are being reused before expiry and thus don't expire.
The only way we'd know whether it was the -tpcycles change or the resolution change would be to compare with what happened when -tpcycles was changed from 10 to 5 (but I don't know when that was).
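To make the averaging argument above concrete, here is a minimal sketch of how the number of cycles changes a reported average when only the first cycle is slow. It assumes the reported figure is a plain mean over all cycles, which may not match how Talos actually aggregates its results, and the timings and the function name mean_cycle_time are hypothetical illustrations, not measured data or Talos code.

```python
def mean_cycle_time(first_cycle_ms, warm_cycle_ms, cycles):
    """Plain mean over `cycles` runs where only the first run is cold.

    Assumption: every cycle after the first takes the same (cached) time.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    total = first_cycle_ms + warm_cycle_ms * (cycles - 1)
    return total / cycles


# Hypothetical timings, chosen only to illustrate the effect.
COLD_MS = 600.0  # first cycle, nothing cached yet
WARM_MS = 300.0  # later cycles, served from caches

avg_5 = mean_cycle_time(COLD_MS, WARM_MS, 5)    # (600 + 4 * 300) / 5
avg_10 = mean_cycle_time(COLD_MS, WARM_MS, 10)  # (600 + 9 * 300) / 10

print(f"-tpcycles 5:  {avg_5:.1f} ms")
print(f"-tpcycles 10: {avg_10:.1f} ms")
```

Under these assumed numbers the mean falls from 360 ms to 330 ms purely because the cold first cycle carries a smaller weight, with no change in the application itself.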
Keywords: footprint, mlk, perf
OS: Linux → All
The change from 10->5 occurred as a side effect of bug 432883 (we went from multiple talos config files to one centralized config file; the previous batch of config files used 10 cycles and the difference wasn't caught during review). That change was checked in 08/07. You can see a very obvious hike in the tp numbers for windows on 08/07 (http://graphs.mozilla.org/#spst=range&spstart=1207337829&spend=1208210198&bpst=cursor&bpstart=1207337829&bpend=1208210198&m1tid=394872&m1bl=0&m1avg=0&m2tid=394900&m2bl=0&m2avg=0&m3tid=394887&m3bl=0&m3avg=0) - that would be the numbers jumping after the check-in of -tpcycles 5.
OS: All → Linux
Sorry, that was a link to the hike observed on the tiger boxes, not winxp. There is a similar hike on the leopard machines as well.
Blocks: 450401
I'll mark this fixed then, as it fixed bug 450401.
Status: NEW → RESOLVED
Closed: 16 years ago
OS: Linux → All
Resolution: --- → FIXED
Component: Release Engineering: Talos → Release Engineering
Product: mozilla.org → Release Engineering