Closed Bug 741688 (icanhazkanzaz) Opened 13 years ago Closed 12 years ago

buildIDSchedFunc can generate invalid build ids

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: nthomas)

References

Details

(Whiteboard: [buildbot][schedulers])

Attachments

(1 file)

eg http://preproduction-master.build.mozilla.org:8010/builders/b2g%20fedora16-i386%20ionmonkey%20build/builds/1 has 20120402184463, ie 18:44:63 and we're not in kansas any more.

When we upload the log we pass that on to post_upload.py which tries to do 
  int(mktime(strptime(buildid, "%Y%m%d%H%M%S")))
in BuildIDToUnixTime, which quite reasonably fails and raises an error. If we weren't suppressing the actual error we'd be getting
 ValueError: unconverted data remains: 3

This value appears to be first set when we do
 make -f client.mk build MOZ_BUILD_DATE=20120402184463

It's happening on bld-centos6-hp-001 and bld-centos6-hp-002 but apparently not in production, so it might be something funky with builder_master on preprod.
Win64 too:
http://preproduction-master.build.mozilla.org:8010/builders/WINNT%206.1%20x86-64%20mozilla-inbound%20build/builds/0 

20120402184471 on a build that started at Mon Apr 2 23:33:30 2012
the problem comes from 
http://hg.mozilla.org/build/buildbotcustom/file/309e0a7cddbd/misc_scheduler.py#l92

where we add 1 to the last buildid to make sure we don't re-use buildids and are always getting build ids in increasing order.

I think the fix should be calling genBuildId again with a different value to generate a new, valid, build id.

http://hg.mozilla.org/build/buildbotcustom/file/309e0a7cddbd/common.py#l24
Priority: -- → P2
Summary: 'preproduction-master.build.sjc1:Command Queue is CRITICAL: 4 dead items' from buildIDs with more than 60 seconds → buildIDSchedFunc can generate invalid build ids
Whiteboard: [buildbot][schedulers]
http://build.mozillamessaging.com/tinderboxpushlog/?tree=SeaMonkey&rev=9998f4b43479
sgautherie.bz@free.fr – Sat Apr 7 22:09:55 2012
Severity: normal → major
OS: Linux → All
Hardware: x86 → All
Gonna look at converting the existing buildID to an epoch timestamp before adding 1, then casting it back to the YYYY... style.
Assignee: nobody → nrthomas
Status: NEW → ASSIGNED
this is causing a bunch of failures today, e.g.:

Exception: Command ['ssh', '-l', 'ffxbld', '-i', '/home/cltbld/.ssh/ffxbld_dsa', '-p', '22', 'stage.mozilla.org', 'post_upload.py --tinderbox-builds-dir mozilla-central-linux64-debug -b mozilla-central -p firefox -i 20120718033863 --revision bf09f11b69d8 --release-to-tinderbox-dated-builds /tmp/tmp.HevH3EXAVQ /tmp/tmp.HevH3EXAVQ/mozilla-central-linux64-debug-bm32-build1-build4.txt.gz'] returned non-zero exit code 1:
sys.argv: ['/usr/local/bin/post_upload.py', '--tinderbox-builds-dir', 'mozilla-central-linux64-debug', '-b', 'mozilla-central', '-p', 'firefox', '-i', '20120718033863', '--revision', 'bf09f11b69d8', '--release-to-tinderbox-dated-builds', '/tmp/tmp.HevH3EXAVQ', '/tmp/tmp.HevH3EXAVQ/mozilla-central-linux64-debug-bm32-build1-build4.txt.gz']
Traceback (most recent call last):
  File "/usr/local/bin/post_upload.py", line 508, in <module>
    func(options, upload_dir, files)
  File "/usr/local/bin/post_upload.py", line 215, in ReleaseToTinderboxBuilds
    ReleaseToBuildDir(TINDERBOX_BUILDS_PATH, TINDERBOX_URL_PATH, options, upload_dir, files, dated)
  File "/usr/local/bin/post_upload.py", line 193, in ReleaseToBuildDir
    buildid = str(BuildIDToUnixTime(options.buildid))
  File "/usr/local/bin/post_upload.py", line 126, in BuildIDToUnixTime
    raise "Could not parse buildid!"
TypeError: exceptions must be old-style classes or derived from BaseException, not str
INFO:util.commands:command: END (32.13 elapsed)

the buildid is 20120718033863.
Blocks: 775168
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101455'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': 'cb3c90eea64e4206beff5da1a53428f1'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101456'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': '748a148ee0374dbebf7b4cef4862cb31'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101457'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': 'f4815111964e4d7b9807e85085bbbbc2'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101458'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': '46dafa27bf204ad3961b2bf405e0ea29'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101459'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': '8574c6de800346a4b7ae6c323d80add3'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101460'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': '483cd1dc417f46b7943f935197a223f8'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101461'}
2012-07-18 10:14:55-0700 [-] mozilla-central: propfunc returned {'builduid': '9950b6951b3740d0827bab4d12e3bfb3'}
2012-07-18 10:14:56-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101462'}
2012-07-18 10:14:56-0700 [-] mozilla-central: propfunc returned {'builduid': '1f30fdffd98641e0a1fa9e0bc5e3e438'}
2012-07-18 10:14:56-0700 [-] mozilla-central: propfunc returned {'buildid': '20120718101463'}
2012-07-18 10:14:56-0700 [-] mozilla-central: propfunc returned {'builduid': 'b70846cfd65d47ce9b54a20618eab3ae'}
Looks like the property functions stack up somehow? Or we've got multiple instances of the scheduler running? In any case, I've restarted the build scheduler for now to see if that helps.
restarting the master doesn't seem to have helped this particular issue:
2012-07-18 12:44:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124415'}
2012-07-18 12:44:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'dc7c5294bae44dc2aa8d618d431d4583'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124416'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '58365902091c46538c865f2bded5185a'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124417'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'd1128660da4f43e9b1328fc9f36ee7ec'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124418'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '17840901fc634c819f8a899428f61c78'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124419'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '0ceda90c2d9343d59292fcd3ceb2ef8d'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124420'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '41d922a49f9d4fa29729854148838402'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124421'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '82b073fa837c47cbaff6d337d6da2af4'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124422'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '3adcdd6825a24d278881a7fbd218adb3'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124423'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': '4c2d0a21c4f746cda9eae968e2aae675'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120718124424'}
2012-07-18 12:44:16-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'cd89584f73ff4e2d901e9f2dced9274a'}
I can't speak to the repeated calls, but this keeps the buildIDs valid when we increment. Passes the test but not tried on a master.
Attachment #643722 - Flags: review?(catlee)
Comment on attachment 643722 [details] [diff] [review]
[buildbotcustom] Increment buildID better

Review of attachment 643722 [details] [diff] [review]:
-----------------------------------------------------------------

::: test/test_misc_scheduler_propfuncs.py
@@ +57,5 @@
>              # If time happens to go backwards, our buildid shouldn't
>              time_method.return_value = 8
>              self.dbc.runInteractionNow(lambda t: buildIDSchedFunc(self.s, t, None))
>              state = self.dbc.runInteractionNow(lambda t: self.s.get_state(t))
> +            self.assertEquals(state['last_buildid'], time.strftime("%Y%m%d%H%M%S", time.localtime(61)))

Looks great.

Can you also check that these last two buildids don't end with 60 or 61? They should end with 00 and 01 I think.
Attachment #643722 - Flags: review?(catlee) → review+
Blocks: 775545
The repeated calls are triggered by the poller being reset:

2012-07-19 07:31:14-0700 [HTTPPageGetter,client] http://hg.mozilla.org/integration/mozilla-inbound has been reset
2012-07-19 07:32:14-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073214'}
2012-07-19 07:32:14-0700 [-] mozilla-inbound: propfunc returned {'builduid': '5429c5d334c14868b9ffc0726af075ea'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073215'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': '362dbd406c2e472e99d11a3a50ee4b54'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073216'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'd9b0fe1125f544c1aad2024d060f92c3'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073217'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': '2995c14965374ef992eb9bf18e260211'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073218'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'ebf054d576074f7787c51530d639383b'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073219'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'e2b98e91e8e14203ab2505158e104e8d'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073220'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': '3a86b210702c4133926a65c784cee9f3'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073221'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': '2040ae25543c4919bd85eb8120b3b29a'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073222'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': 'e9695bcf16f844d39e40b0198d621536'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'buildid': '20120719073223'}
2012-07-19 07:32:15-0700 [-] mozilla-inbound: propfunc returned {'builduid': '060157a88bac449798dd134cbc16a41d'}
Depends on: 774862
Comment on attachment 643722 [details] [diff] [review]
[buildbotcustom] Increment buildID better

Landed on default with the additional test:
http://hg.mozilla.org/build/buildbotcustom/rev/1155eadd6e87
Attachment #643722 - Flags: checked-in+
In production
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: