Open Bug 1633927 Opened 5 years ago Updated 5 years ago

54.02% build times (linux64) regression on push 655d98fff192e4733f3317e233efcc6193534872 (Fri April 24 2020)

Tracking

(firefox-esr68 unaffected, firefox75 unaffected, firefox76 unaffected, firefox77 fix-optional, firefox78 affected)

Status: NEW

Tracking Flags:

                Tracking  Status
firefox-esr68   ---       unaffected
firefox75       ---       unaffected
firefox76       ---       unaffected
firefox77       ---       fix-optional
firefox78       ---       affected

People

(Reporter: marauder, Unassigned)

References

(Regression)

Details

(Keywords: perf-alert, regression)

Marian Raiciof [:marauder]

Reporter

Description

•

5 years ago

Perfherder has detected a performance regression from push 655d98fff192e4733f3317e233efcc6193534872. As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

54% build times linux64 opt taskcluster-c5d.4xlarge valgrind 1,039.40 -> 1,600.85
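For reference, the percentage in the bug title follows directly from the two values in the alert line above. A minimal sketch of that arithmetic (the function name `percent_change` is only illustrative, it is not part of Perfherder):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values copied from the alert line above (units as reported by Perfherder).
before, after = 1039.40, 1600.85
print(f"{percent_change(before, after):.2f}%")  # prints "54.02%"
```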

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Marian Raiciof [:marauder]

Reporter

Updated

•

5 years ago

status-firefox75: --- → unaffected

status-firefox76: --- → unaffected

status-firefox77: --- → affected

status-firefox-esr68: --- → unaffected

Component: Performance → DOM: Content Processes

Flags: needinfo?(nika)

Product: Testing → Core

Version: Version 3 → unspecified

Nika Layzell [:nika] (ni? for response)

Comment 1

•

5 years ago

Looking into the issue. Not sure yet how my patch could've caused the average build time to stabilize at the worst-case scenario.

Mike Hommey [:glandium]

Comment 2

•

5 years ago

I retriggered the build on that push and the one that follows. Both ended up with "normal" build times. So what happened is that the push in question changed some central headers, which caused cache misses and forced most things to rebuild. Then something else happened: those builds were made less frequent, and as an indirect result they get fewer cache hits. That's something to keep in mind when we change task scheduling.

Joel, do you know which bug made the valgrind builds less frequent? That seems to have happened this week, but a quick Bugzilla search didn't show anything. Also, what would the right component be for this bug?
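To illustrate why less frequent builds see fewer cache hits, here is a toy model, not a description of the real build cache. It assumes a task only gets a cache hit for files that have not changed since that same task last built, and that every push touches a fixed number of random files. All names and numbers are hypothetical, and it ignores cache eviction and header fan-out.

```python
import random


def average_hit_rate(num_files, touched_per_push, num_pushes, build_every, seed=0):
    """Average fraction of files served from cache per build (toy model)."""
    rng = random.Random(seed)  # fixed seed: same push sequence for every run
    dirty = set()              # files changed since this task last built
    rates = []
    for push in range(1, num_pushes + 1):
        # Each push modifies a few distinct files.
        dirty.update(rng.sample(range(num_files), touched_per_push))
        if push % build_every == 0:
            # Every dirty file is a cache miss; everything else is a hit.
            rates.append(1 - len(dirty) / num_files)
            dirty.clear()
    return sum(rates) / len(rates)


every_push = average_hit_rate(1000, 20, 100, build_every=1)
every_tenth = average_hit_rate(1000, 20, 100, build_every=10)
print(f"build on every push: {every_push:.2%} hits")
print(f"build on every 10th: {every_tenth:.2%} hits")
```

In this model the changes from ten pushes pile up before each build, so the hit rate per build drops even though no single push got slower.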

Flags: needinfo?(nika) → needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 3

•

5 years ago

Check out bug 1621764: we reduced builds that don't run tests on autoland so that they run only on every 10th push.

Flags: needinfo?(jmaher)

Mike Hommey [:glandium]

Comment 4

•

5 years ago

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #3)

> Check out bug 1621764: we reduced builds that don't run tests on autoland so that they run only on every 10th push.

Except the valgrind builds are tests.

Mike Hommey [:glandium]

Updated

•

5 years ago

Component: DOM: Content Processes → General

Product: Core → Testing

Regressed by: 1621764
No longer regressed by: 1580565

Joel Maher ( :jmaher ) (UTC -8)

Comment 5

•

5 years ago

There is nothing to change here? We build 90% less often, but each build's runtime increases by about 50%. I recommend wontfix, but I'm open to hearing other suggestions.
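The trade-off above can be put in rough numbers. This is a back-of-the-envelope sketch using only the two rounded figures from this comment (10% of the builds, each about 1.5x as long). It assumes every remaining build pays the full slower time and that the number of pushes stays the same.

```python
# Rounded figures from the comment above; both are approximations.
build_fraction = 0.10   # builds now run on every 10th push
time_ratio = 1.50       # each build takes roughly 50% longer

# Total build time relative to building on every push at the old speed.
relative_total = build_fraction * time_ratio
savings = 1 - relative_total

print(f"total build time: {relative_total:.0%} of before")  # prints "total build time: 15% of before"
print(f"net savings: {savings:.0%}")                        # prints "net savings: 85%"
```

Under these assumptions the total machine time still drops by roughly 85%, which is the reasoning behind the wontfix suggestion.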

Mike Hommey [:glandium]

Comment 6

•

5 years ago

Whatever policy applies to tests should be applied to valgrind, not the policy for builds.

Joel Maher ( :jmaher ) (UTC -8)

Comment 7

•

5 years ago

Running a task only on every 10th push lets all tasks keep finding regressions before merging to m-c, while spending less on tasks that are at low risk of finding regressions. This build time regression is an infrastructure-only issue, not a regression that would cause, or need to cause, a backout to keep nightly green. Once a certain task yields a few unique regressions, it becomes higher value and we want to ensure it runs more frequently. Often there are tasks that have no history of finding a unique regression (such a task can still fail as part of a regression that is easily found by another platform or task), and we target those for reduced frequency.

Ideally we could separate the build from the tests, so this confusion over what is a build and what is a test wouldn't happen as often.
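The frequency criteria described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual scheduling code: the `Task` class, the `unique_regressions` counter, and the threshold of 3 (standing in for "a few") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    unique_regressions: int  # regressions that only this task caught (hypothetical metric)


def run_interval(task: Task, threshold: int = 3, low_value_interval: int = 10) -> int:
    """Run high-value tasks on every push, low-value ones on every 10th."""
    return 1 if task.unique_regressions >= threshold else low_value_interval


def should_run(task: Task, push_index: int) -> bool:
    """Decide whether `task` is scheduled on the push with this index."""
    return push_index % run_interval(task) == 0


low_value = Task("task-without-unique-regressions", unique_regressions=0)
high_value = Task("task-with-unique-regressions", unique_regressions=5)
print([p for p in range(1, 21) if should_run(low_value, p)])    # prints [10, 20]
print(sum(should_run(high_value, p) for p in range(1, 21)))     # prints 20
```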

Pascal Chevrel:pascalc (PTO until August 21)

Updated

•

5 years ago

status-firefox77: affected → fix-optional

status-firefox78: --- → affected

Geoff Brown [:gbrown]

Updated

•

5 years ago

Severity: -- → S4

Priority: -- → P3


Categories

(Testing :: General, defect, P3)
