Closed Bug 697802 Opened 13 years ago Closed 13 years ago

Pushes don't get Windows tests if they have newlines in the push comment

Categories

(Release Engineering :: General, defect, P2)

x86
Windows 7
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: rail)

References

Details

Attachments

(1 file, 4 obsolete files)

Rumor has it that the state of https://tbpl.mozilla.org/?tree=Try&rev=40ee397bbc6e where the Windows builds completed just fine, and show successful sendchanges, but didn't get any Windows tests, is not entirely unusual. If so, probably not a good thing, since when faced with that sort of thing we're likely to do one of two things, either just push without noticing we didn't do the testing we intended to, or, notice and push to try again (or retrigger the build, the least expensive but still expensive possibility).
Belatedly occurs to me that this could be the same thing as when results were appearing on other trees yesterday six hours after they had actually finished, that something may be seriously slow about sticking results in the db.
Priority: -- → P3
Whiteboard: [tryserver]
Blocks: 700168
Not actually try-only, but on non-try trees, we expect crappy times to result in coalescing all of a push's tests away. My marker for this-hood is that self-serve doesn't show any tests having ran, since when they are coalesced it does show them as having run. By that measure, https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=940adaea65a1 was also this, as was one or as many as a half dozen some night last week, where we had a Windows regression that appeared after six builds didn't have Windows tests run, and then the tip push also didn't have Windows tests, which I chalked up to a backward-coalescing bug or some such thing.
Severity: normal → major
Priority: P3 → --
Summary: Try push didn't get Windows tests → Some pushes don't get Windows tests
Whiteboard: [tryserver]
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&onlyunstarred=1&rev=0246515e5fda

Might want to add some more Windows build slaves, I'm going to start solving this problem by just retriggering builds.
Let's see if this more recent issue on try (which does not coalesce let's me figure this out).

Same problem with:
https://tbpl.mozilla.org/?tree=Try&rev=42d3729cc5e9
where I used the syntax:
try: --build do --platform win32 --unittests all --talos none
Assignee: nobody → armenzg
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #10)
> Let's see if this more recent issue on try (which does not coalesce let's me
> figure this out).
> 
> Same problem with:
> https://tbpl.mozilla.org/?tree=Try&rev=42d3729cc5e9
> where I used the syntax:
> try: --build do --platform win32 --unittests all --talos none

Let's scratch this.

Let's use this more recent job for our analysis:
https://tbpl.mozilla.org/?tree=Try&jobname=WINNT&rev=e22826384f38
I see something different between a succeeding sendchange and a failing one:

2 lines (fails to trigger):
>    comments: try: -b do -p all -u all -t none
>Bug 698570 - Use weak references in nsFormHistory.js. r=mak.

one line (succeeds to trigger):
>    comments: try: -b do -p win32 -u all -t all --post-to-bugzilla Bug 696162 - Fix jsgcchunk's AllocGCChunk to be more efficient and to avoid potential problems on Mac 10.7 (with Windows optimistic/pessimistic fix).
From looking at the first failing changeset https://hg.mozilla.org/try/rev/40ee397bbc6e we have another case of multi-line issue:
> try: -b do -p all -u all -t none
>
>Bug 693341 - Test arrays of iid_is params. r=khuey 


philor, are you sure sure that the inbound cases are the same and not coalescing?
You are well aware of how much it happens on inbound.
We need more win32 builders on the buildpool but that's another case.
https://build.mozilla.org/buildapi/self-serve/mozilla-inbound/rev/95d98e8ab9f3 shows no Win7 opt tests at all, tbpl shows no Win7 opt tests, I think that's this.

https://build.mozilla.org/buildapi/self-serve/mozilla-inbound/rev/ff13fc7b5e42 does show a "Rev3 WINNT 6.1 mozilla-inbound opt test mochitests-2/5" but tbpl does not show a Win7 opt M2, I think that's coalescing.
For try rev e22826384f38, the master for the win32 opt build thinks there were sendchanges on branches try-win32-talos and try-win32-opt-unittest at 11:34:35 PST, but I can't see any record of them in 
  buildbot-master10:/builds/buildbot/build_scheduler/master/twistd.log*

There are sendchanges recorded for mac,linux, android etc, so we might have a failure on the slave to sendchange properly.
If you look at the sendchange steps in https://tbpl.mozilla.org/php/getParsedLog.php?id=7313406&tree=Try (try e22826384f38 again) the actual calls to buildbot.bat don't have the properties or files appended at the end:

========= Started sendchange (results: 0, elapsed: 1 secs) ==========
    master: buildbot-master10.build.mozilla.org:9301
    branch: try-win32-talos
    revision: e22826384f38
    comments: try: -b do -p all -u all -t none
Bug 698570 - Use weak references in nsFormHistory.js. r=mak.
    user: respindola@mozilla.com
    files: ['http://stage.mozilla.org/pub/mozilla.org/firefox/try-builds/respindola@mozilla.com-e22826384f38/try-win32/firefox-11.0a1.en-US.win32.zip']
    properties: [('buildid', '20111109095439'), ('builduid', u'ef2453d624374e269cfb59a412831910')]
'python' 'e:/builds/moz2_slave/try-w32/tools/buildfarm/utils/retry.py' '-s' '5' '-t' '1800' '-r' '5' '--stdout-regexp' 'change sent successfully' 'buildbot' 'sendchange' '--master' 'buildbot-master10.build.mozilla.org:9301' '--username' 'respindola@mozilla.com' '--branch' 'try-win32-talos' '--revision' 'e22826384f38' '--comments' u'try: -b do -p all -u all -t none\nBug 698570 - Use weak references in nsFormHistory.js. r=mak.' '--property' 'buildid:20111109095439' '--property' u'builduid:ef2453d624374e269cfb59a412831910' 'http://stage.mozilla.org/pub/mozilla.org/firefox/try-builds/respindola@mozilla.com-e22826384f38/try-win32/firefox-11.0a1.en-US.win32.zip'
...
Calling <function run_with_timeout at 0x00AE43B0> with args: (['buildbot.bat', 'sendchange', '--master', 'buildbot-master10.build.mozilla.org:9301', '--username', 'respindola@mozilla.com', '--branch', 'try-win32-talos', '--revision', 'e22826384f38', '--comments', 'try: -b do -p all -u all -t none'], 1800, 'change sent successfully', None, False, True), kwargs: {}, attempt #1
Executing: ['buildbot.bat', 'sendchange', '--master', 'buildbot-master10.build.mozilla.org:9301', '--username', 'respindola@mozilla.com', '--branch', 'try-win32-talos', '--revision', 'e22826384f38', '--comments', 'try: -b do -p all -u all -t none']

So the sendchange is happening but the master drops it because it is missing necessary information. It's the same problem on inbound 2b40312773a5, b259af61fe3f, 95d98e8ab9f3 (comments #5 thru #7).

Armen, I suspect this is an argument passing issue in retry.py.
Summary: Some pushes don't get Windows tests → Pushes don't get Windows tests if they have newlines in the push comment
No longer blocks: 700168
I think the reason why it behaves differently on Windows lay in the way how buldbot runs command on windows. On windows it passes everything to cmd.exe and it interprets everything.

I think, we could use just the first line of comments to get rid of this error and minimize the possibility of too long arg length.
Attached patch Strip comments (obsolete) (deleted) — Splinter Review
Maybe we'll need to add some sanity check to prevent IndexError/AttributeError exceptions.
<catlee-away> the test masters need them to decide what jobs to run, or what kind of notifications to send
<rail> oh
<rail> so, they use that syntax
<catlee-away> yeah
<rail> hmmmmm
<catlee-away> we can reuse try_parser's processMessage I bet
<catlee-away> or figure out a different way of doing this
<catlee-away> passing this stuff around in comments is really hacky
Assignee: armenzg → rail
Priority: -- → P2
Attached patch Strip comments, preserve try syntax (obsolete) (deleted) — Splinter Review
This one should use try syntax line if it exists or the first line of comments otherwise.
Attachment #573417 - Attachment is obsolete: true
Attachment #573549 - Flags: feedback?(lsblakk)
Attached patch Strip comments, preserve try syntax (obsolete) (deleted) — Splinter Review
This patch should also fix bug 702884. To be tested in staging.
Attachment #573549 - Attachment is obsolete: true
Attachment #573549 - Flags: feedback?(lsblakk)
Attachment #574930 - Flags: feedback?(catlee)
Blocks: 703206
Severity: major → critical
Comment on attachment 574930 [details] [diff] [review]
Strip comments, preserve try syntax

Review of attachment 574930 [details] [diff] [review]:
-----------------------------------------------------------------

looks fine. I'd use re.sub rather than the string module though.
Attachment #574930 - Flags: feedback?(catlee) → feedback+
Attached patch Strip comments, preserve try syntax (obsolete) (deleted) — Splinter Review
I'm testing this patch in staging. Hope to finish with testing by the end of the day.
Attachment #574930 - Attachment is obsolete: true
I tested the patch in staging, but couldn't run it under the same conditions as in production. I would like to land it after the next reconfig and let it bake a couple of days in preproduction.
Attachment #575246 - Attachment is obsolete: true
Attachment #575712 - Flags: review?(catlee)
Attachment #575712 - Flags: review?(catlee) → review+
Comment on attachment 575712 [details] [diff] [review]
Strip comments, preserve try syntax

http://hg.mozilla.org/build/buildbotcustom/rev/6926c19b7a0e
Attachment #575712 - Flags: checked-in+
This made it to production today.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: