Closed
Bug 664539
Opened 13 years ago
Closed 13 years ago
update verify should retry if it gets an empty result from AUS
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: catlee)
References
Details
Attachments
(3 files)
(deleted), patch | bhearsum: review+ | nthomas: checked-in+
(deleted), patch | bhearsum: review+ | nthomas: checked-in+
(deleted), patch | nthomas: review+ | catlee: checked-in+
We already retry in update verify if we fail to _retrieve_ the snippet, but we don't retry if we get an empty snippet. This can cause spurious burning if AUS loses its mind momentarily.
We might be able to add some arguments to http://hg.mozilla.org/build/tools/file/276961805e0a/release/common/download_mars.sh#l12 to do it. If not, we'll have to do something smarter in that function.
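A rough sketch of the "something smarter" option, for illustration only. is_empty_snippet and fetch_with_retry are hypothetical names, the MAX_TRIES and RETRY_SLEEP variables and their defaults are assumptions, and the fetch command is passed in as arguments so the loop can be exercised without contacting AUS:

```shell
# Hypothetical helper: succeed if the line right after <updates> is </updates>,
# i.e. AUS returned a snippet that offers no update.
is_empty_snippet () {
    [ "$(grep -A1 '<updates>' "$1" | tail -1)" = "</updates>" ]
}

# Hypothetical helper: run the given fetch command (expected to write
# update.xml) until the snippet is non-empty or the attempts run out.
fetch_with_retry () {
    max_tries="${MAX_TRIES:-5}"      # assumed default
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@" && ! is_empty_snippet update.xml; then
            return 0
        fi
        echo "Empty or failed response, sleeping (attempt $try of $max_tries)" >&2
        sleep "${RETRY_SLEEP:-10}"   # assumed default
        try=$((try + 1))
    done
    return 1
}
```

It could be called as `fetch_with_retry wget --no-check-certificate -S -O update.xml "$update_url"`, which keeps the existing wget invocation unchanged and only wraps it.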
Comment 1 • 13 years ago (Reporter)
We'll probably get it for free, but we should make sure this happens for final verification, too. We hit a similar issue during 6.0b5:
FAIL: no complete update found for https://aus2.mozilla.org/update/1/Firefox/6.0/20110705195857/Darwin_x86_64-gcc3-u-i386-x86_64/ta/releasetest/update.xml?force=1
FAIL: download_mars returned non-zero exit code: 1
Comment 2 • 13 years ago
This gives output like this:
Using https://aus3.mozilla.org/update/1/Firefox/8.0/20111006182035/Linux_x86-gcc3/zu/betatest/update.xml?force=1
Calling <function run_with_timeout at 0xb7c5f02c> with args: (['wget', '--no-check-certificate', '-S', '-O', 'update.xml', 'https://aus3.mozilla.org/update/1/Firefox/8.0/20111006182035/Linux_x86-gcc3/zu/betatest/update.xml?force=1'], 300, None, None, False, True), kwargs: {}, attempt #1
Executing: ['wget', '--no-check-certificate', '-S', '-O', 'update.xml', 'https://aus3.mozilla.org/update/1/Firefox/8.0/20111006182035/Linux_x86-gcc3/zu/betatest/update.xml?force=1']
Process stdio:
Process stderr:
--13:39:11-- https://aus3.mozilla.org/update/1/Firefox/8.0/20111006182035/Linux_x86-gcc3/zu/betatest/update.xml?force=1
Resolving aus3.mozilla.org... 63.245.209.149
Connecting to aus3.mozilla.org|63.245.209.149|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Wed, 12 Oct 2011 21:07:27 GMT
Server: Apache
X-Backend-Server: pm-app-dist05
X-Powered-By: PHP/5.1.6
Set-Cookie: aus2a=63.245.220.220.1318453647.0357; expires=Wed, 12-Oct-2016 02:11:17 GMT; path=/; domain=aus2.mozilla.org
Cache-Control: no-store, must-revalidate, post-check=0, pre-check=0, private
Content-Length: 919
Keep-Alive: timeout=5, max=199
Connection: Keep-Alive
Content-Type: text/xml;
Cookie coming from aus3.mozilla.org attempted to set domain to aus2.mozilla.org
Length: 919 [text/xml]
Saving to: `update.xml'
0K 100% 535M=0s
13:39:11 (535 MB/s) - `update.xml' saved [919/919]
Got this response:
<?xml version="1.0"?>
<updates>
<update type="minor" version="8.0 Beta" extensionVersion="8.0" buildID="20111011182523" detailsURL="https://www.mozilla.com/zu/firefox/8.0/releasenotes/">
<patch type="complete" URL="http://stage-old.mozilla.org/pub/mozilla.org/firefox/nightly/8.0b3-candidates/build1/update/linux-i686/zu/firefox-8.0b3.complete.mar" hashFunction="SHA512" hashValue="19bf9de7cc2d8664147af90c919ba701e05616a21888705d294c4fd1cfff99f6b40473c970683093dc9f36926dcc013e76995c26733125cb1d3bb26ebb065d05" size="16480509"/>
<patch type="partial" URL="http://stage-old.mozilla.org/pub/mozilla.org/firefox/nightly/8.0b3-candidates/build1/update/linux-i686/zu/firefox-8.0b2-8.0b3.partial.mar" hashFunction="SHA512" hashValue="463d9b70759e1b873dbeb67265646355600dfd0987c88d015159b07285570765b306ce09660829c77e868c77d312a35364a5f6b17217caff28faa497b18ee69b" size="840361"/>
</update>
</updates>
X-Backend-Server might help us tell if one particular webhead is being a problem, and seeing update.xml will confirm we're getting an empty update.
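To tally which webhead answered each query, the header can be pulled out of saved wget output. This is only a sketch: backend_of is a hypothetical helper, and it assumes the `wget -S` output has been captured to a file first.

```shell
# Hypothetical helper: print the X-Backend-Server value found in a file
# of saved `wget -S` header output (first occurrence only).
backend_of () {
    sed -n 's/^[[:space:]]*X-Backend-Server:[[:space:]]*//p' "$1" | head -n 1
}
```

Since wget writes the `-S` headers to stderr, something like `wget -S -O update.xml "$update_url" 2> headers.txt; backend_of headers.txt` would print a value such as pm-app-dist05 for the response above.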
Attachment #566627 - Flags: review?(bhearsum)
Updated • 13 years ago (Reporter)
Attachment #566627 - Flags: review?(bhearsum) → review+
Comment 3 • 13 years ago
Comment on attachment 566627 [details] [diff] [review]
[tools] Add some debug info
http://hg.mozilla.org/build/tools/rev/3aa8baeb3c2e
Attachment #566627 - Flags: checked-in+
Comment 4 • 13 years ago
The debugging patch broke final verification by adding one of these lines for each AUS query:
HTTP request sent, awaiting response...
These lines then match a 'grep HTTP' when we don't want them to. Adding the forward slash fixes it by matching only lines like HTTP/1.1. We don't need to worry about Windows and slash-munging because all the verifications run on Linux.
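The difference between the two patterns can be seen on a trimmed copy of the wget output from comment 2. The temporary file and variable names here are made up for the illustration:

```shell
# Hypothetical sample of `wget -S` stderr, trimmed from the log in comment 2.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Content-Length: 919
EOF

# 'HTTP' also matches wget's progress message;
# 'HTTP/' matches only the status line.
loose_matches=$(grep -c 'HTTP' "$sample_log")
strict_matches=$(grep -c 'HTTP/' "$sample_log")
echo "grep HTTP:  $loose_matches line(s)"
echo "grep HTTP/: $strict_matches line(s)"
rm -f "$sample_log"
```

The loose pattern counts two lines and the strict one counts one, which is why the extra debug output confused final verification.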
Attachment #567036 - Flags: review?(bhearsum)
Updated • 13 years ago (Reporter)
Attachment #567036 - Flags: review?(bhearsum) → review+
Comment 5 • 13 years ago
Comment on attachment 567036 [details] [diff] [review]
[tools] Fix final verification
http://hg.mozilla.org/build/tools/rev/79b0fcb2d90f
Attachment #567036 - Flags: checked-in+
Comment 6 • 13 years ago (Reporter)
I had a quick peek at an update verify log for 8.0b5 and found that there were at least three webheads serving empty snippets (01, 05 and 08): http://buildbot-master08.build.sjc1.mozilla.com:8001/builders/release-mozilla-beta-win32_update_verify_5%2F10/builds/19/steps/run_script/logs/stdio
Comment 7 • 13 years ago
I saw 04 and 07 as well.
Comment 8 • 13 years ago
OK, we should ask IT if this could be an issue with the Zeus load balancer which is in front of the web heads.
Comment 9 • 13 years ago (Assignee)
I couldn't get AUS to give me an empty response on purpose, so I tested by adding this:
diff --git a/release/common/download_mars.sh b/release/common/download_mars.sh
--- a/release/common/download_mars.sh
+++ b/release/common/download_mars.sh
@@ -9,16 +9,22 @@ download_mars () {
test_only="$3"
max_tries=5
try=1
while [ "$try" -lt "$max_tries" ]; do
echo "Using $update_url"
$retry wget --no-check-certificate -S -O update.xml $update_url
+ if [ "$RANDOM" -lt 30000 ]; then
+ echo "<?xml version=\"1.0\"?>" > update.xml
+ echo "<updates>" >> update.xml
+ echo "</updates>" >> update.xml
+ fi
+
echo "Got this response:"
cat update.xml
# If the first line after <updates> is </updates> then we have an
# empty snippet. Otherwise we're done
if [ "$(grep -A1 '<updates>' update.xml | tail -1)" != "</updates>" ]; then
break;
fi
echo "Empty response, sleeping"
Attachment #576631 - Flags: review?(nrthomas)
Comment 10 • 13 years ago
Comment on attachment 576631 [details] [diff] [review]
retry on empty responses
>+ max_tries=5
>+ try=1
>+ while [ "$try" -lt "$max_tries" ]; do
Use -le to get max_tries instead of max_tries-1, and add a comment like this
# retrying until we get offered an update
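The off-by-one can be checked in isolation. count_attempts below is a hypothetical helper, not part of download_mars.sh; it runs the same counter loop with the comparison operator passed in:

```shell
# Hypothetical helper: count how many times the loop body runs for a
# given comparison operator, with try starting at 1 and max_tries=5.
count_attempts () {
    op="$1"
    max_tries=5
    try=1
    attempts=0
    while [ "$try" "$op" "$max_tries" ]; do
        attempts=$((attempts + 1))
        try=$((try + 1))
    done
    echo "$attempts"
}

count_attempts -lt   # prints 4
count_attempts -le   # prints 5
```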
>+ echo "Using $update_url"
>+ $retry wget --no-check-certificate -S -O update.xml $update_url
Please add a comment like
# retrying until AUS gives us any response at all
>+ if [ "$RANDOM" -lt 30000 ]; then
>+ echo "<?xml version=\"1.0\"?>" > update.xml
>+ echo "<updates>" >> update.xml
>+ echo "</updates>" >> update.xml
>+ fi
Please leave out the test code on landing.
Attachment #576631 - Flags: review?(nrthomas) → review+
Comment 11 • 13 years ago (Assignee)
Comment on attachment 576631 [details] [diff] [review]
retry on empty responses
http://hg.mozilla.org/build/tools/rev/55d97b119691
Attachment #576631 - Flags: checked-in+
Updated • 13 years ago (Assignee)
Assignee: nobody → catlee
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated • 11 years ago
Product: mozilla.org → Release Engineering