Closed Bug 238654 Opened 21 years ago Closed 21 years ago

[FIXr]Content-location possible regression in 1.7b: may 'fail' on many servers

Categories

(Core :: DOM: HTML Parser, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.7final

People

(Reporter: wolruf, Assigned: bzbarsky)

References

()

Details

(Keywords: fixed1.7, regression)

Attachments

(2 files)

After fix for bug 109553 was committed, one major server failed because of HTTP content-location badly configured: bug 231072. After 1.7b was released, it seems like this rate is increasing: bug 238156, bug 238314, and now a bank site in bug 238626. The worrying part is that 10% of the IIS servers seem to be wrongly configured on the Internet (not to mention the Intranet servers), according to: http://www.securityspace.com/s_survey/data/man.200402/firewalled_cloc.html [...] Number of IIS Servers that revealed an IP in the Content-Location tag different from the IP accepting the connection 75,938 9.40% [...] 1.7 will to be the first non-beta release of Mozilla (Firefox 0.8 doesn't have it) to ship with this new feature and I'm afraid it could fail on many servers. Possible solutions: - a pref to disable it ? by default ? - not taking care of content-location when served by IIS and Oracle9 (the servers we've seen so far) ? Mozilla has done something similar with HTTP pipelining in the past: http://lxr.mozilla.org/seamonkey/source/netwerk/protocol/http/src/nsHttpConnection.cpp#236 (originally posted on n.p.m.netlib on march 22nd)
it must be possible to implement the feature differently. If an image is missing, the whole pageload doesn't stall for 10 minutes before it loads with a "broken image". It would be a disaster to the user experience. So why allow other broken tags cause pauses like that? Isn't it possible to detect and deal with the missing path, throw it aside and and let the browser fall back to "life before bug 109553 was fixed"?
QA Contact: ian
Flags: blocking1.7?
> If an image is missing, the whole pageload doesn't stall for 10 minutes No, but if a script is missing it does..... and has to to support document.write. :( What exact failures are we seeing in those evang bugs? Bogus IPs? Those should be pretty fast (DNS lookups giving "no such host" are fast), but could result in bad rendering. A bigger problem is random ports specified on which the server doesn't respond and so we have to wait out a TCP timeout.... If someone can collect a list of the actual hostnames and content-location headers for the affected sites, that would be great. I do think we should decide this one way or the other (fix, wontfix, back out bug 109553, whatever) for 1.7; I'd rather not ship several slightly different versions of content-location support.
> What exact failures are we seeing in those evang bugs? Bogus IPs? Trying to summarize the issues in the content-location header from the various bugs reported so far: bug 238156: bogus port (3003), bad rendering (images don't load), bug 238314: bogus IP (192.168.130.34), can't load site after home page, bug 238626: bogus port (8080), bad rendering, doesn't finish loading, bug 231072: bogus port (3002), bad rendering (CSS+images), > If someone can collect a list of the actual hostnames and content-location > headers for the affected sites, that would be great. using http://web-sniffer.net/ bug 238156: http://www.gv.com.sg/Booking/servlet/BkShowDetailsServlet?gvAction=buyTickets Content-Location: http://www.gv.com.sg:3003/Booking/BkSelectShowTime.jsp bug 238314: http://www.petro-points.com/ Content-Location: http://192.168.130.34/ns-pp-prod/index.html bug 238626: https://ibank.amsouth.com/ Content-Location: http://ibank.amsouth.com:8080/home.html bug 231072: http://pub.tv2.no/nettavisen/spray/slideshow/article195382.ece Content-Location: http://pub.tv2.no:3002/dyn-nettavisen/spray/slideshow/article.jsp
OK. I would be perfectly happy to do any of the following: 1) Ignore content-location from those two server types 2) As #1, but only if the content-location is the same host but a different port 3) Anything else reasonable that's suggested. Darin? What do you think?
To repeat the steps above with the Server: lines and a few more sites that I happen to know get broken because of Content-Location headers: bug 238156: http://www.gv.com.sg/Booking/servlet/BkShowDetailsServlet?gvAction=buyTickets Content-Location: http://www.gv.com.sg:3003/Booking/BkSelectShowTime.jsp Server: Oracle9iAS/9.0.2 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.3 bug 238314: http://www.petro-points.com/ Content-Location: http://192.168.130.34/ns-pp-prod/index.html Server: Microsoft-IIS/5.0 X-powered-by: ASP.NET bug 238626: https://ibank.amsouth.com/ Content-Location: http://ibank.amsouth.com:8080/home.html Server: Microsoft-IIS/5.0 bug 231072: http://pub.tv2.no/nettavisen/spray/slideshow/article195382.ece Content-Location: http://pub.tv2.no:3002/dyn-nettavisen/spray/slideshow/article.jsp Server: Oracle9iAS/9.0.2 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.3.1.0 (TH;max-age=214364716+0;age=66945) bug 237372: https://e.pkobp.pl/ssl/ Content-Location: http://e.pkobp.pl:81/ssl/Default.htm Server: Microsoft-IIS/4.0 http://www.stj.gov.br/webstj/ Content-Location: http://www.stj.gov.br/webstjDefault.htm Server: Microsoft-IIS/5.0 http://www.hihostels.com/openHome.sma Content-Location: http://www.hihostels.com/pages/layouts/main.jsp Server: Oracle9iAS (1.0.2.2.1) Containers for J2EE http://bookwire.com/bookwire/ Content-Location: http://bookwire.com:3170/bookwire/bookwire.html Server: Microsoft-IIS/4.0 http://www.deutschebahn.sagenet.co.uk/ Content-Location: http://www.deutschebahn.sagenet.co.uk/d/e/u/deutschebahn_sagenet_co_uk/Default.htm Server: Microsoft-IIS/5.0 https://www.motortax.ie/mtoapp/logout.do (click the "Exit" button at the top right of http://www.motortax.ie/) Content-Location: https://www.motortax.ie/pages/ Server: Oracle9iAS/9.0.2 Oracle HTTP Server http://www.personal-system.com/ Content-Location: http://www.personal-system.com/publicSite12/NMORETTI001/Default.htm Server: Microsoft-IIS/5.0 So if we want to sniff for servers, it looks like we have to search for any of: - an exact match for "Microsoft-IIS/4.0" - an exact match for "Microsoft-IIS/5.0" - a leading substring match for "Oracle9iAS (1.0.2.2.1)" - a leading substring match for "Oracle9iAS/9.0.2" I can't see any heuristics that would work out if the Content-Location is reliable or not.
Well, all the content-locations fall into the following categories: 1) Private IP address range 2) Different port 3) Ends in Default.htm But yes, just killing off servers (prefix match on "Microsoft-IIS/" and "Oracle9iAS", I would guess) would work.... If we can get some consensus here, I'll whip up a patch. Darin?
(In reply to comment #6) > > But yes, just killing off servers (prefix match on "Microsoft-IIS/" and > "Oracle9iAS", I would guess) would work.... I think we should make sure we include version numbers; we don't want to prematurely pigeon hole future products into this particular quirks mode.
I agree. IIS/6.0 might not have this bug.
according to http://support.microsoft.com/default.aspx?scid=kb;en-us;218180 and http://support.microsoft.com/default.aspx?scid=kb;EN-US;244998 IIS/4.0 & 5.0 are affected, 6.0 seems to be ok, eventhough it's optional and can be configured by an admin (may be a few cases, which can be treated by Tech Evangelism then ?): http://support.microsoft.com/default.aspx?scid=kb;en-us;245099
Those articles just talk about the IP screwup and a possible port screwup. They only cover some of the problem cases we are seeing; in particular, they don't cover the cases when a bogus port is stuck in the content-location, nor do they cover the case of totally bogus content-location (a la the http://www.personal-system.com/ example and others in comment 5). I'm fine with including specific versions only in the blacklist, but we should include IIS 6.0. Note also that those articles give me the impression that MS is treating this header as meaning something different than what we think it means, which does not give me much hope for it changing in IIS any time soon.
I hope it is not intended to ship 1.7 with this bug.
As a matter of fact, I was testing a patch as you wrote that. Patience, please. ;)
Attached patch Proposed patch (deleted) — Splinter Review
I felt the pref justified, so people can easily deal with new bogus servers popping up without recompiling. I also still feel that we should just blacklist IIS altogether, because, as I said, it looks to me from the MS documentation that they disagree with us on what this header actually means.
Assignee: parser → bzbarsky
Status: NEW → ASSIGNED
Attachment #145378 - Flags: superreview?(darin)
Attachment #145378 - Flags: review?(bugmail)
Priority: -- → P1
Summary: Content-location possible regression in 1.7b: may 'fail' on many servers → [FIX]Content-location possible regression in 1.7b: may 'fail' on many servers
Target Milestone: --- → mozilla1.7final
Comment on attachment 145378 [details] [diff] [review] Proposed patch looks good, except i'm not too happy with the preference name. IMO "network.http." should be for necko's http implementation only.
Attachment #145378 - Flags: superreview?(darin) → superreview+
Hmm... In a way, this is affecting our "HTTP" support, which is why I put it in this namespace. But I'm open to suggestions on pref names, if someone feels something else would be better.
No performance issue with checking the pref branch on every hit or anything?
Not really. This would happen once per pageload, basically. That's absolutely negligible.
And it would only happen on pages that have a content-location header.
> Hmm... In a way, this is affecting our "HTTP" support, which is why I put it in > this namespace. sure, it is affecting our HTTP support, but then i think there's something to be said for having pref namespaces correspond to code modules. however, it's not like we've been very consistent at all :-/ note: we only honor this preference for some consumers of HTTP. suppose some other consumer wants to inspect the content-location header. because we don't put this in necko, we don't affect all consumers. hence, it seems strange to mention this as a necko preference. it's only changing the behavior of gecko's layout engine.
OK, I buy that. jst, would this pref be OK in the "dom." namespace? Or should it be in the "browser." namespace?
(In reply to comment #17) > Not really. This would happen once per pageload, basically. That's absolutely > negligible. Cool, just checking.
Comment on attachment 145378 [details] [diff] [review] Proposed patch Wow, this is so sad. Do we have any idea what IE does? Anyhow, r=me
Attachment #145378 - Flags: review?(bugmail) → review+
Comment on attachment 145378 [details] [diff] [review] Proposed patch Actually, do you think we'll be adding more HeaderCallback functions? Otherwise i'd just add a |if (ContentLocationOK(...))| in ProcessHeaderData. No biggie though
Sicking, IE simply doesn't support this HTTP header. I'd rather keep the table-driven style of this code so that new headers are easy to add. And yes, I do suspect that we may need more callbacks like that if we add any more headers not supported by IE....
Comment on attachment 145378 [details] [diff] [review] Proposed patch Could this be approved for 1.7? This disables support for the content-location header when we encounter a server known to have a high probability of sending a bogus content-location.
Attachment #145378 - Flags: approval1.7?
Summary: [FIX]Content-location possible regression in 1.7b: may 'fail' on many servers → [FIXr]Content-location possible regression in 1.7b: may 'fail' on many servers
Comment on attachment 145378 [details] [diff] [review] Proposed patch a=chofmann for 1.7
Attachment #145378 - Flags: approval1.7? → approval1.7+
Fix checked in, with the pref renamed to "browser.content-location.bogus-servers", but backed out due to the typical Windows bustage. Anyone have any idea why the following code: 344 static const HeaderData headers[] = { 345 { "link", nsnull }, Gives the errors: c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345) : error C2440: 'initializing' : cannot convert from 'char [5]' to 'const struct HeaderData' c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345) : error C2440: 'initializing' : cannot convert from 'const int' to 'const struct HeaderData' c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345) : fatal error C1903: unable to recover from previous error(s); stopping compilation ?
hmmm, maybe the char needs to be a "const char*"?
Maybe. I seem to recall that breaking the Windows compiler (which is why I did not do it), but maybe. Could someone with a Windows env try it?
Checked in, no bustage this time. Fixed for 1.7. I guess a bunch of those evang bugs can be resolved now....
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Blocks: 239798
See Bug 239845, there the server is a "Oracle-Application-Server-10g/9.0.4.0.0 Oracle-HTTP-Server" with bogus ports. Should this server name be added in the pref?
If that's a common server, I suppose we could add it to the list. Otherwise we should evangelize them. Or we could add "Oracle" to the list in general, I guess.
*** Bug 240758 has been marked as a duplicate of this bug. ***
See Bug 241094: Site is broken even with current 1.8a builds like 2004041809 Win or 20040419 Linux. Is the patch only in the 1.7 branch?
the site in Bug 241094 uses the following: "Oracle9iAS (9.0.2.2)", which is not checked for in the patch ("Oracle9iAS/9.0.2,Oracle9iAS (1.0.2.2.1)"). Seems Mozilla should check even more browsers...
Yep. So everyone in favor of blacklisting that oracle server altogether raise your hand? They change the server line for every single minor revision, so trying to deal is just a whack-a-mole exercise.
Fine by me; it's not a major server family anyway.
shouldn`t we try to contact them ?
Sure. So I propose we convert bug 241094 to evang on the server and I'll change the pref to just blacklist that server family if darin is OK with it.
(In reply to comment #39) > Sure. So I propose we convert bug 241094 to evang on the server and I'll change > the pref to just blacklist that server family if darin is OK with it. you don't think trying to evang someone, about something that works perfectly fine on all browsers (once this server is blacklisted), is somewhat unlikely to work?
We tell them that their bug limits their customers' ability to use a very useful HTTP feature. <shrug>. And maybe threaten them with a "list of shame" like thing.
Depends on: 241094, 241320
Depends on: 241402
To summarize the impact this has on Oracle: Mozilla 1.7 will ignore a Content-Location header sent by the following Oracle application servers (identified by Server header): Oracle9iAS/9.0.2 Oracle9iAS (1.0.2.2.1) This change makes Mozilla 1.7 behave like older versions of Mozilla, Netscape, and IE 6 -- w.r.t. its handling of the Content-Location header -- when talking to an instance of one of these servers. The point to emphasize is that installations of these servers that worked with Mozilla 1.6 will be unaffected by the changes made to Mozilla 1.7 to support the Content-Location header. Mozilla 1.7 will honor any Content-Location header sent by an Oracle server, whose Server header does not match the list given above. Also, the patch in bug 241320 proposes ignoring a Content-Location header sent by any server with a Server header starting with "Oracle9iAS". At the moment, that patch has not been checked into the 1.7 branch. Chofmann is in touch with folks at Oracle regarding this issue. He is working to find the right folks within Oracle to possibly suggest a better Server header "filter" string (or even a better solution for Oracle9iAS). He said he'd be commenting in this bug when he has more info.
Depends on: 239845
Depends on: 234778
Note that issues with two more servers have come up: Server: HDS/1.0 (GZ/44) Server: Orion/1.5.2
Depends on: 241652
One has to wonder why so many servers send this header with the wrong value? Anyway, might it be better to use a whitelist or compleatly disable this feature for 1.7?
> One has to wonder why so many servers send this header with the wrong value? IIS just does (and MS knows about it). For the rest, they do it because IE doesn't support it, of course. If we disable this feature, we may as well back it out. It's not going to be better for 1.8. I'm fine with a whitelist, but Darin seems opposed...
I think a whitelist is conceptually the wrong way to go. I don't want to have to make sure I claim to be Apache just because I want to use Content-Location. And I do think we should support this; much like Link: headers, it is something that I would actually use. Currently we have four server families that require blacklisting. Our DOCTYPE blacklist is in the high double figures, so I don't think this is a big issue.
while I agree that a whitelist is the WrongThing it sucks that so many random servers send a broken value. The current DOCTYPE list took us a lot longer to arrive at then we've had between enabling Content-Location and creating a stable branch. This is why I think disabling (or whitelisting) could be a good thing to do for 1.7. We would then have a few months to iron out a blacklist for 1.8. Hixie does bring up a good point though, a whitelist might end up making servers pretend to be Apache just so they can support Content-Location. So my vote goes towards disabling this compleatly for 1.7 and then reenable and finalize the blacklist for 1.8 and forward.
If this is really that big a problem then yeah, let's back it out on the 1.7 branch and get it right in 1.8.
Depends on: 241065
1. I think I agree that diabling completely for 1.7 is the path that we should take, based on 1.7 as long lived branch, desire for max backward compatibilty with past behavior in 1.7, and lowest possible number of percieved regressions, etc... 2. I also think that if it's not to much work it would be good to enable an on/off pref *and* whitelist/blacklist prefs to allow folks that are interested in continuing to work with this, and test out the feature easily with any distributed build. That would also make choices for which selections go in the mozilla deafult builds, and other distributions easy to make, and allow it to be easily turnned on again for 1.8. can someone drum up a patch for 1, so we can get this into the next RC?
> can someone drum up a patch for 1, so we can get this into the next RC? It's just a straightforward CVS backout. I don't have time to turn the relevant cvs up commands into a patch, but if someone else does I'll review it. Branch-only backout, mind you.
note that tinderbox/bonsai will generate the cvs commands to do the backout...
Depends on: 241981
Attached patch Backout (deleted) — Splinter Review
This should do it.
Comment on attachment 147337 [details] [diff] [review] Backout In light of bug 241981 I propose to check this in on both trunk and branch and just let the matter be.
Attachment #147337 - Flags: superreview?(darin)
Attachment #147337 - Flags: review?(darin)
Attachment #147337 - Flags: approval1.7?
Comment on attachment 147337 [details] [diff] [review] Backout r+sr=darin for backout on both the trunk and 1.7 branch.
Attachment #147337 - Flags: superreview?(darin)
Attachment #147337 - Flags: superreview+
Attachment #147337 - Flags: review?(darin)
Attachment #147337 - Flags: review+
Comment on attachment 147337 [details] [diff] [review] Backout a=chofmann for 1.7
Comment on attachment 147337 [details] [diff] [review] Backout Setting flag per chofmann's comment.
Attachment #147337 - Flags: approval1.7? → approval1.7+
Keywords: fixed1.7
Comment on attachment 147337 [details] [diff] [review] Backout Backout checked in, trunk and branch.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: