Closed
Bug 829207
Opened 12 years ago
Closed 12 years ago
Fix Expires header settings for firefox nightly/aurora
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: bhearsum, Assigned: nmaul)
15:07 < jakem> bhearsum|buildduty: okay so... I can mark firefox-nightly-latest and firefox-aurora-latest as SSL-only
15:07 -!- catlee-mtg is now known as catlee
15:07 < jakem> that will force them to the https://ftp.mozilla.org mirror
15:07 < jakem> which general release updates will never use
15:08 < jakem> it's a bit of a workaround, but it means we can safely leave the http://ftp.mozilla.org mirror disabled permanently
15:08 < bhearsum|buildduty> ah, i see
15:08 < bhearsum|buildduty> and then we'll never kill ftp, even if the cdn goes down
15:08 < jakem> the other option is to spin up a new mirror just for those 2 products
15:08 < jakem> precisely
15:08 < jakem> there won't *be* a local mirror to fall back to and self-immolate
15:08 < bhearsum|buildduty> that makes sense to me, but mind if i file a bug and let nthomas weigh in before we make the change?
Comment 1•12 years ago (Assignee)
Options here:
1) Mark these 2 products as SSL Only. This forces them to use https://ftp.mozilla.org, allowing us to keep http://ftp.mozilla.org disabled. This prevents Sentry from screwing up and sending all of our update traffic to us, causing us to fall over. This is definitely the simplest option. I'd eventually like to deliver *all* installers over SSL, regardless of whether or not they're part of stub installer... if/when that happens, this would naturally happen anyway.
2) Set up a separate 'mirror' (instead of ftp.mozilla.org) that handles specifically nightly and aurora (and soon maybe beta, too, if/when it changes to the nightly/aurora update model). I don't like this, it's more stuff to maintain. I mention it for completeness only.
3) Allow nightly and aurora files to be served up via CDN. This has caching/delay implications... it would be possible to download the latest stub installer, and then get fed a not-latest full installer. How far out of date would depend on the Cache-Control headers set... probably 4-24 hours or so. This causes a poor UX... downloading the latest version and immediately having to update. Like #1, this is also very easy to do.
Comment 2•12 years ago (Assignee)
The stub installers on nightly and aurora are broken until we make a decision and act on this.
I am marking them as SSL-Only for now. This will at least get them working again, and can easily be changed at any time.
Updated•12 years ago
tracking-firefox20:
--- → ?
Comment 3•12 years ago
Jake sent http://jakem.pastebin.mozilla.org/2052419 as evidence that he's able to hit the download server.
Comment 4•12 years ago (Assignee)
Problem identified: the stub installer does not actually know how to do SSL/TLS connections. It sends plaintext to port 443. This doesn't work, and the stub installer doesn't know how to handle the failure either... it simply tries to access download.mozilla.org again, and gets stuck in an infinite loop.
This doesn't show up with curl, because curl *does* properly support HTTPS connections.
This was ultimately identified by examining the incoming traffic to ftp.mozilla.org... the log entries looked like this:
<ip> - - [11/Jan/2013:16:20:27 -0800] " " - 0 "-" "-" "-"
This is a common log entry to see when you do precisely this (send plaintext to an SSL port).
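As a rough illustration of how entries like this could be picked out of an access log, here is a minimal Python sketch. It assumes only the entry shape quoted above (a request field containing nothing but whitespace); the regular expression, the function names, and the sample address are hypothetical and not taken from any real tooling on ftp.mozilla.org.

```python
import re

# Matches the entry shape quoted above: client, two ident fields, a
# bracketed timestamp, then a quoted request field that is empty or
# whitespace-only. The exact log format is an assumption.
EMPTY_REQUEST = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] "\s*" '
    r'(?P<status>\S+) (?P<size>\S+)'
)


def is_empty_request(line: str) -> bool:
    """Return True when an access-log line has a blank request field."""
    return EMPTY_REQUEST.match(line) is not None


def count_empty_requests(lines) -> int:
    """Count the lines that look like plaintext sent to an SSL port."""
    return sum(1 for line in lines if is_empty_request(line))
```

A sudden run of such entries from many clients would be consistent with the plaintext-to-443 behavior described here, though other causes are possible.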
I believe this is something that should be fixed within stub installer, so we can (eventually) deliver all installers over SSL. It should definitely not exhibit this infinite loop behavior without an error message and an eventual fallback page.
However, in the meantime, akeybl has agreed that option #3 in comment 1 is an acceptable workaround for now. I have deployed this out, and disabled the "SSL Only" option. It seems to be working for me now.
I don't know if this bug is still needed directly... I think we need some follow-up bugs at least:
1) Stub installer should support fetching from HTTPS
2) Stub installer should error out when redirected to a URL it can't fetch, not loop indefinitely
3) Tune Cache-Control settings for /pub/mozilla.org/firefox/nightly/* for proper Zeus/CDN caching
4) Find a long-term solution for nightly/aurora delivery (if CDN isn't it).
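Follow-up 2 in the list above amounts to bounding the redirect loop and refusing URLs the client cannot fetch. The sketch below shows that idea in Python only as an illustration; the real stub installer is not written in Python, and `resolve_download`, `DownloadError`, the `fetch` callable, and the hop limit are all hypothetical names and values.

```python
from urllib.parse import urlsplit


class DownloadError(Exception):
    """Raised when a download URL cannot be resolved safely."""


def resolve_download(start_url, fetch, supported_schemes=("http",), max_hops=5):
    """Follow redirects, failing loudly instead of looping forever.

    `fetch` is a hypothetical callable: fetch(url) -> (status, location).
    """
    url = start_url
    for _ in range(max_hops):
        scheme = urlsplit(url).scheme
        if scheme not in supported_schemes:
            # Error out on a URL this client cannot fetch.
            raise DownloadError(
                f"cannot fetch {url!r}: unsupported scheme {scheme!r}"
            )
        status, location = fetch(url)
        if status == 200:
            return url
        if status in (301, 302, 303, 307) and location:
            url = location
            continue
        raise DownloadError(f"unexpected status {status} for {url!r}")
    # The hop limit is what prevents an indefinite loop.
    raise DownloadError(f"gave up after {max_hops} redirects")
```

A caller would catch `DownloadError` and show the fallback page rather than retrying the same entry point.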
Comment 5•12 years ago
The stub was never supposed to download via SSL (note that we already verify that the cert is present and correct, which is why SSL support was never added or tested). Which bug implemented this change on the server side, and why was it changed?
Comment 6•12 years ago
(In reply to Jake Maul [:jakem] from comment #4)
> Problem identified: the stub installer does not actually know how to do
> SSL/TLS connections. It sends plaintext to port 443. This doesn't work, but
> the stub installer doesn't really know how to handle it either... it simply
> tries to access download.mozilla.org again, and get stuck in an infinite
> loop.
>
> This doesn't show up with curl, because curl *does* properly support HTTPS
> connections.
>
> This was ultimately identified by examining the incoming traffic to
> ftp.mozilla.org... the log entries looked like this:
>
> <ip> - - [11/Jan/2013:16:20:27 -0800] " " - 0 "-" "-" "-"
>
> This is a common log entry to see when you do precisely this (send plaintext
> to an SSL port).
>
> I believe this is something that should be fixed within stub installer, so
> we can (eventually) deliver all installers over SSL. It should definitely
> not exhibit this infinite loop behavior without an error message and an
> eventual fallback page.
>
> However, in the meantime, akeybl has agreed that option #3 in comment 1 is
> an acceptable workaround for now. I have deployed this out, and disabled the
> "SSL Only" option. It seems to be working for me now.
>
>
> I don't know if this bug is still needed directly... I think we need some
> follow-up bugs at least:
>
> 1) Stub installer should support fetching from HTTPS
Bug 829829
> 2) Stub installer should error out when redirected to a URL it can't fetch,
> not loop indefinitely
Already filed and I have a patch in progress
Would love to know why we want to download over https, especially since we put a decent amount of work into the security checks so that we don't have to download over https, per requests and stub requirements.
Comment 7•12 years ago (Assignee)
The change happened in this very bug... see comment 1 and comment 2 specifically.
As to why: we cannot safely offer http://ftp.mozilla.org/ as a valid mirror. Circumstances have conspired (more than once) to attempt to send *all* traffic to us, which takes an entire datacenter offline. The only sure-fire way to prevent this is to simply not offer that as a valid mirror anymore. However, at present, "firefox-nightly-latest" and "firefox-aurora-latest" depend on it. So once we did this, those no longer worked... stub installer was getting 404's from bouncer and throwing up the fallback page.
You can see the various ways out of this predicament in comment 1. This SSL trick was simply one way out of it (or so it seemed to me).
Comment 2 implemented option #1 as a quick-fix (didn't know at the time that it wouldn't work anyway). Comment 4 reverts that and implements option #3, as it's the next best option. That's where we're at now.
I do believe it would be nice to support SSL, but I understand the motivations behind not doing so in this case. I knew we were not *requiring* stub installer to fetch over SSL, but I did not realize it wouldn't actually work at all. It's obvious in hindsight...
Comment 8•12 years ago
Is #2 the only option that makes the stub installer functional and prevents the lag time in CDN delivery for Nightly/Aurora? Also, will this lag time impact Beta/Release once we roll out stub installers to those channels?
Comment 9•12 years ago (Assignee)
I'll answer your second question first: this situation can't affect the current beta releases, or the release channel. The underlying problem is that the file contents are changing even though the file names are not... that doesn't happen on Release (or on the current Beta system, at least until that goes to nightly builds).
On your first question:
Sort of... there are 3 basic ways to make option #3 work better:
1) Just have really short Cache-Control: max-age headers for these files. This works fine, the CDNs will obey it. It costs more due to inefficient caching, but it works. For nightly and aurora (and beta when it moves to nightly builds), the volume is low enough that it's not significant. For release, we should be able to stomach a larger max-age header because we aren't ever (AFAIK) planning to do daily release builds. Even if we did, they'd have different filenames (18.0.1 rather than 18.0) so this is a non-issue.
2) Set the max-age based on file modification time instead of access time. If you can predict with any sort of accuracy when the file will next be updated, this works great. Once that time is reached, the CDN will start fetching from the origin... essentially degrading into #1 above. As long as your next change is X hours away (+/- a couple hours), you can set this to X-2 hours and have essentially zero lag on the new version. This only breaks down if you need to build a new version sooner than the header allows for.
3) Include some form of unique query string when linking to the file. This is by far the most common general solution to caching issues within web development and operations. If you can link to "http://file/?build=something", the problem goes away and your max-age can be "forever". This might take bouncer changes... but maybe we can munge in some rewrite rules instead... we'd need to investigate.
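Option 3 above can be sketched as a small URL helper. This is only an illustration in Python: the `build` parameter name comes from the example URL in the comment, while `with_build_id`, the host, and the build ID format are hypothetical. Whether bouncer or a rewrite rule would produce such links is not settled in this bug.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_build_id(url: str, build_id: str) -> str:
    """Return `url` with a build=<build_id> query parameter.

    Any existing build parameter is replaced, so each new build yields a
    distinct URL and therefore a distinct cache key on the CDN.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "build"
    ]
    query.append(("build", build_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the URL changes whenever the build changes, a stale cached copy is never served under the new link, which is why max-age can be very long with this approach.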
If none of that is feasible, then the original option #2 is the way to go. I'd prefer to find a way to do this through the CDN, because I'd prefer to handle all 4 channels in the same way if we can. The prior situation up until this bug was in itself a hack/workaround for this situation, so we shouldn't necessarily look to that design as the proper way to move forward.
Comment 10•12 years ago
(In reply to Jake Maul [:jakem] from comment #9)
> I'll answer your second question first: this situation can't affect the
> current beta releases, or the release channel. The underlying problem is
> that the file contents are changing even though the file names are not...
> that doesn't happen on Release (or on the current Beta system, at least
> until that goes to nightly builds).
Yeah, that makes this issue less urgent. We should still get a fix in for Nightly/Aurora though.
> On your first question:
>
> Sort of... there are 3 basic ways to make option #3 work better:
>
> 1) Just have really short Cache-Control: max-age headers for these files.
> This works fine, the CDNs will obey it. It costs more due to inefficient
> caching, but it works. For nightly and aurora (and beta when it moves to
> nightly builds), the volume is low enough that it's not significant. For
> release, we should be able to stomach a larger max-age header because we
> aren't ever (AFAIK) planning to do daily release builds. Even if we did,
> they'd have different filenames (18.0.1 rather than 18.0) so this is a
> non-issue.
This is mostly true. The one caveat is that we may one day find ourselves in a position where we want to pull an 18.0 release build and re-spin after it's been pushed to CDNs, but before we actually announce the release and push out updates. As long as we have a one-off solution, we should be fine.
> 2) Set the max-age based on file modification time instead of access time.
> If you can predict with any sort of accuracy when the file will next be
> updated, this works great. Once that time is reached, the CDN will start
> fetching from the origin... essentially degrading into #1 above. As long as
> your next change is X hours away (+/- a couple hours), you can set this to
> X-2 hours and have essentially zero lag on the new version. This only breaks
> down if you need to build a new version sooner than the header.
>
> 3) Include some form of unique query string when linking to the file. This
> is by far the most common general solution to caching issues within web
> development and operations. If you can link to
> "http://file/?build=something", the problem goes away and your max-age can
> be "forever". This might take bouncer changes... but maybe we can munge in
> some rewrite rules instead... we'd need to investigate.
I'm OK with any of the above for Nightly/Aurora. I don't think Beta needs any special resolution, but I'd like to also figure out a plan for any channel, if we ever absolutely need to pull a build off the wire and make sure that the stub is offering up the latest (regardless of filename).
Do you mind if I assign to you for final resolution? I think we want to:
1) Ensure that a build from the last 24 hours is provided to Nightly/Aurora users when they use the stub
2) Be able to force the latest bits out to stub installers, if ever necessary, regardless of the filename
Assignee: nobody → nmaul
tracking-firefox20:
? → ---
Comment 11•12 years ago (Assignee)
I have instituted the following config:
<DirectoryMatch "firefox/nightly/">
ExpiresDefault "modification plus 23 hours"
</DirectoryMatch>
This causes anything in the firefox/nightly directory tree to be served with a max-age and Expires header set for 23 hours after the mtime of the file on the server. This should be fairly effective, as long as new nightly/aurora builds always hit at roughly the same time of day, every day.
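The arithmetic behind a "modification plus 23 hours" rule can be sketched as follows. This is a Python illustration, not the server's code: `expiry_headers` is a hypothetical helper, and clamping an already-elapsed expiry to the request time (so max-age never goes negative) is an assumption about how the server behaves.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(mtime, now, lifetime=timedelta(hours=23)):
    """Compute Expires and Cache-Control from a file's modification time.

    `mtime` and `now` must be timezone-aware UTC datetimes.
    """
    # Assumption: a file older than `lifetime` expires immediately
    # rather than producing a negative max-age.
    expires = max(mtime + lifetime, now)
    max_age = int((expires - now).total_seconds())
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"max-age={max_age}",
    }


# A build that landed at 03:00 UTC, requested at 13:00 UTC the same day,
# has 13 hours of cache lifetime left.
headers = expiry_headers(
    mtime=datetime(2013, 2, 20, 3, 0, tzinfo=timezone.utc),
    now=datetime(2013, 2, 20, 13, 0, tzinfo=timezone.utc),
)
```

The remaining lifetime shrinks as the file ages, so every cache ends up revalidating at about the time the next daily build is expected.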
I have also issued a purge of this content from the 2 CDNs, and this is completed. They are currently showing up-to-date content. However, it's also been more than 24 hours since a fresh build landed... I presume that has something to do with the Firefox 19 release and will be remedied naturally, soon.
Marking this bug as resolved... don't think there's anything more to do here.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 12•12 years ago (Assignee)
This change is temporarily reverted for testing purposes in bug 836044.
Updated•12 years ago (Assignee)
Summary: mark firefox-{nightly,aurora}-latest as SSL-only → Fix Expires header settings for firefox nightly/aurora
Comment 13•12 years ago (Assignee)
Green light given, this fix is now applied again and both CDNs have had purge requests issued. They should be purged and up-to-date within a couple hours.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 14•12 years ago
Is there something QA can look out for to verify this is fixed?
Comment 15•12 years ago (Assignee)
Well, you can fetch nightly and aurora installers, and double-check that the Expires and Cache-Control headers seem sane... that'll tell you if the actual change is in effect, but won't necessarily tell you if things are really fixed.
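One way to make "seem sane" concrete is a small check over the response headers of a fetched installer. The Python sketch below is only an illustration: `check_cache_headers` is a hypothetical helper, the 24-hour limit is an assumption based on the 23-hour setting from comment 11, and the headers are passed in as a plain dict with exactly these key spellings.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def check_cache_headers(headers, now, limit=timedelta(hours=24)):
    """Return a list of problems; an empty list means the headers look sane.

    `now` must be a timezone-aware datetime.
    """
    problems = []

    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match is None:
        problems.append("Cache-Control has no max-age")
    elif timedelta(seconds=int(match.group(1))) > limit:
        problems.append("max-age exceeds limit")

    expires = headers.get("Expires")
    if expires is None:
        problems.append("Expires header missing")
    else:
        try:
            if parsedate_to_datetime(expires) - now > limit:
                problems.append("Expires is too far in the future")
        except (TypeError, ValueError):
            problems.append("Expires header is not a valid date")

    return problems
```

As the comment says, passing this check only shows that the config change is in effect, not that stub installs succeed.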
The best test of whether things are actually fixed is to just install using stub-installer many times, and see if it ever fails. If the problem is not fixed, you should get sporadic failures. It may take a few days for these failures to really become apparent.
FWIW, I have pretty high confidence here. We identified a concrete problem, and rolled out a solid fix for it... as well as additional bulletproofing within stub installer itself (in bug 836044).
And just to reiterate, this is a problem that could not feasibly have occurred in Beta or Release. Naming conventions on those channels dictate all new builds have different version numbers in the filenames... the underlying problem wouldn't have triggered on those channels. This is in contrast to nightly and aurora, where the file names don't change but the contents do, which is hard for caching to deal with.
Comment 16•12 years ago
And for the stub, I received preliminary data last night that implies the error seen in the stub is no longer occurring.
Comment 17•12 years ago
We've installed Aurora using the stub dozens of times in the last few days and haven't encountered any errors or problems on normal install.
https://wiki.mozilla.org/Stub_Installer/QA#Aurora_21_Sign-off
Also considering bug 836044 comment 62 and Robert's data from comment 16, this should be safe to mark as verified.
Status: RESOLVED → VERIFIED
Updated•11 years ago
Product: mozilla.org → Release Engineering