457582 - performance metric changes between 1pm-2pm 2008-09-26

Reporter

Description

16 years ago

Linux Tp3 1.8, 1.9, 2.0
http://graphs.mozilla.org/graph.html#show=395125,395135,395166,1431032,61138,61148,61164,61214,146480&sel=1222436388,1222487458

Linux Tp3 RSS 1.8, 1.9, 2.0
http://graphs.mozilla.org/graph.html#show=395139,395141,395170,1431037,61142,61152,61168,61218,146484&sel=1222450246,1222469838

I suspect this change may be involved:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2008-09-26+13%3A00&maxdate=2008-09-26+14%3A00&cvsroot=%2Fcvsroot
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=mozilla/testing/performance/talos&command=DIFF_FRAMESET&file=sample.config&rev1=1.17&rev2=1.18&root=/cvsroot

The change from -tpcycles 5 to -tpcycles 10 looks unintentional. It was not part of the reviewed patch:
https://bugzilla.mozilla.org/attachment.cgi?id=336106&action=diff
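The sel= values in the graph links above look like Unix timestamps in seconds. That reading is an assumption; nothing in this bug documents the graph server's URL format. Under that assumption, a short Python sketch can turn a selected range into UTC dates, which helps when comparing graph times against check-in times that may be reported in another timezone. The helper name selection_range_utc is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def selection_range_utc(graph_url):
    """Return the sel= range of a graph URL as UTC datetimes.

    Assumption: the URL fragment holds key=value pairs and sel= carries
    comma-separated Unix timestamps in seconds.
    """
    fragment = urlparse(graph_url).fragment
    params = parse_qs(fragment)
    stamps = params["sel"][0].split(",")
    return [datetime.fromtimestamp(int(s), tz=timezone.utc) for s in stamps]


# First Linux Tp3 link from the description.
URL = ("http://graphs.mozilla.org/graph.html#show=395125,395135,395166,"
       "1431032,61138,61148,61164,61214,146480&sel=1222436388,1222487458")

for moment in selection_range_utc(URL):
    print(moment.isoformat())
```

For the first Linux Tp3 link this prints 2008-09-26T13:39:48+00:00 and 2008-09-27T03:50:58+00:00.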

Karl Tomlinson (:karlt)

Reporter

Comment 1

16 years ago

Similar issues on Windows, but the time is a little earlier. I guess that is because the time reported is the code build time, while the talos files are pulled at a later time.

XP, Vista Tp3 Mem-WS 1.8, 2.0
http://graphs.mozilla.org/graph.html#show=787138,787155,787156,395024,395034,395054,53282,61128&sel=1222438002,1222474026

XP Tp3 1.8, 1.9, 2.0
http://graphs.mozilla.org/graph.html#show=53218,53236,53254,53276,395008,395020,395048&sel=1222437590,1222484358

David Baron :dbaron:

Comment 2

16 years ago

It seems like a bad idea to be making changes that could affect the performance results either (a) while the tree is open or (b) without announcing them so people looking at the graphs know what caused the changes.

alice nodelman [:alice] [:anode]

Comment 3

16 years ago

The change from 5->10 was actually a fix for a previous error where 10 was changed to 5. This happened during talos config file cleanup work a couple of months ago and was not noticed until now.

The Linux number drop was part of an attempt to fix bug 457464. I noticed a long-running, high-CPU-usage process on the boxes and killed it off, hoping to reduce variance in the numbers. A side effect of this was to cause the Linux talos numbers on all boxes to drop slightly.

I'm sorry for the confusion caused by these changes. In future I'll try to keep the risky stuff to scheduled downtimes.

alice nodelman [:alice] [:anode]

Comment 4

16 years ago

The change in memory metrics is connected to bug 450666. My test system didn't show that there would be any change in the numbers, but it looks like that was untrue.

So, this Friday there was:
1) bug fix for 5 test cycles to 10 (should have ended up in a bug, but didn't)
2) attempt to fix oddness in Linux numbers that resulted in a drop in the Linux tp numbers across the board
3) check-in of a patch to reduce the amount of memory-related data points collected by talos, which resulted in the drop in memory-related talos numbers across the board

Overall, not a very good Friday afternoon for me.

As it stands now, #1 is going to be left as is, as it was an attempt to return to the stable point. I would argue that #2 should also be left as is, as the long-term result should be a reduction in variance in the Linux talos numbers. Discussion regarding #3 should be moved to the appropriate bug for follow-up.

Karl Tomlinson (:karlt)

Reporter

Comment 5

16 years ago

(In reply to comment #3)
> The change from 5->10 was actually a fix for a previous error where 10 was
> changed to 5 (this happened during talos config file clean up work that
> happened a couple of months ago and was only noticed till now).

I can't see where 10 was changed to 5. testing/performance/talos/sample.config has had "-tpcycles 5" since 2007-09-19 11:02 at least.
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/testing/performance/talos/sample.config&rev=1.15&root=/cvsroot&mark=86#85

Was a different file used previously? Can you point to where the change happened, please?

> The linux number drop was part of an attempt to fix bug 457464. I noticed a
> long running, high cpu usage process on the boxes and killed it off - hoping
> to reduce variance in the numbers. A side effect of this was to cause the
> linux talos numbers on all boxes to drop slightly.

This was "an attempt to fix bug 457464" before the bug was filed, I assume.

I would expect the -tpcycles change to affect performance numbers on Linux noticeably. The first cycle is always much slower, because the later cycles use cached data from the first run (even though there are 400+ pages, yes). Increasing the number of cycles reduces the contribution of the first cycle to the average.

I don't know what performance change to expect on other platforms, but MS XP Tp also dropped. The "high cpu usage process" was on Linux only, I assume. Merely accessing the necessary files/blocks in the first cycle can put them in OS filesystem caches ready for faster access in subsequent cycles.

(In reply to comment #4)
> In terms of the change in memory metrics that is connected to bug 450666

Again, this could be the -tpcycles change. It could be a leak in the app, or it could be that the faster testing means that more items in our caches are being reused before expiry and thus don't expire.

The only way we'd know whether it was the -tpcycles change or the resolution change would be to compare with what happened when -tpcycles was changed from 10 to 5 (but I don't know when that was).
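To make the first-cycle argument concrete, here is a minimal arithmetic sketch. It assumes the reported number is a plain mean over all cycles and that only the first cycle is cold; the real Talos aggregation may differ. The timings (500 ms cold, 300 ms warm) and the function name are hypothetical, chosen only to show the direction of the effect.

```python
def mean_page_time(cold_ms, warm_ms, cycles):
    """Mean load time over `cycles` runs when only the first run is cold.

    Assumption: the reported metric is a simple arithmetic mean and every
    cycle after the first takes the same warm-cache time.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return (cold_ms + warm_ms * (cycles - 1)) / cycles


# Hypothetical timings, not measured values.
COLD_MS = 500.0
WARM_MS = 300.0

for cycles in (5, 10):
    print(cycles, mean_page_time(COLD_MS, WARM_MS, cycles))
```

With these made-up inputs the mean drops from 340.0 ms at 5 cycles to 320.0 ms at 10 cycles, even though nothing about the application changed.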

Keywords: footprint, mlk, perf

Karl Tomlinson (:karlt)

Reporter

Updated

16 years ago

OS: Linux → All

alice nodelman [:alice] [:anode]

Comment 6

16 years ago

The change from 10->5 occurred as a side effect of bug 432883. We went from multiple talos config files to one centralized config file; the previous batch of config files used 10 cycles, and the difference wasn't caught during review. That change was checked in on 08/07.

You can see a very obvious hike in the tp numbers for Windows on 08/07:
http://graphs.mozilla.org/#spst=range&spstart=1207337829&spend=1208210198&bpst=cursor&bpstart=1207337829&bpend=1208210198&m1tid=394872&m1bl=0&m1avg=0&m2tid=394900&m2bl=0&m2avg=0&m3tid=394887&m3bl=0&m3avg=0

That would be the numbers jumping after the check-in of -tpcycles 5.
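One way to check a suspected config-driven jump like this without eyeballing a graph is to compare the mean of the samples before and after the check-in time. The sketch below is illustrative only: the function name and the sample values are hypothetical, and it assumes the data is available as (timestamp, value) pairs, which is not something the graph server is known to export in this form.

```python
from statistics import mean


def shift_at(samples, change_ts):
    """Return mean(after) - mean(before) around a known change time.

    `samples` is an iterable of (timestamp, value) pairs. A positive result
    means the metric went up after `change_ts`.
    """
    samples = list(samples)
    before = [value for ts, value in samples if ts < change_ts]
    after = [value for ts, value in samples if ts >= change_ts]
    if not before or not after:
        raise ValueError("need samples on both sides of the change")
    return mean(after) - mean(before)


# Hypothetical data: two runs before the change, two after.
EXAMPLE = [(1, 100.0), (2, 102.0), (3, 110.0), (4, 112.0)]
print(shift_at(EXAMPLE, change_ts=3))
```

A shift that is large compared with the normal run-to-run variance is a hint that the change is responsible, not proof; an unrelated check-in in the same window would look the same.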

OS: All → Linux

alice nodelman [:alice] [:anode]

Comment 7

16 years ago

Sorry, that was a link to the hike observed on the tiger boxes, not winxp. There is a similar hike on the leopard machines as well.

Karl Tomlinson (:karlt)

Reporter

Updated

16 years ago

Blocks: 450401

Karl Tomlinson (:karlt)

Reporter

Comment 8

16 years ago

I'll mark this fixed then, as it fixed bug 450401.

Status: NEW → RESOLVED

Closed: 16 years ago

OS: Linux → All

Resolution: --- → FIXED

timeless

Updated

15 years ago

Component: Release Engineering: Talos → Release Engineering

Nobody; OK to take it and work on it

Assignee

Updated

11 years ago

Product: mozilla.org → Release Engineering

Categories

(Release Engineering :: General, defect)

Tracking

(Not tracked)

People

(Reporter: karlt, Unassigned)

References

Details

(Keywords: memory-footprint, memory-leak, perf)

Crash Data

Security

(public)
