Closed Bug 98929 Opened 23 years ago Closed 23 years ago

Implementation of Content-Language in HTTP

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: drepper, Assigned: neeti)

Details

Attachments

(7 files)

Please excuse me using the "Browser-General" component. I don't know better. Please reassign if necessary. I'll shortly attach a patch to implement the handling of the Content-Language header of the HTTP protocol at document level. It's actually a bit more. If the HTTP data doesn't contain the info of if it is no HTTP data, the preference data is used. With this (tested!) patch I can extend my implementation of the :lang() pseudo class to handle all cases correctly (see bug http://bugzilla.mozilla.org/show_bug.cgi?id=35768). To see the patch working you can configure Apache to use the MultiView option for a directory and add files like index.html.de. With an appropriate setting the mod_negotiate module will pick that file if "de" (German) is the preferred language. The transferred data looks like this: HTTP/1.1 200 OK Date: Sun, 09 Sep 2001 06:20:36 GMT Server: Apache/1.3.20 (Unix) (Red-Hat/Linux) mod_ssl/2.8.4 OpenSSL/0.9.6b DAV/1.0.2 PHP/4.0.6 mod_perl/1.24_01 mod_throttle/3.1.2 Content-Location: index.html.de Vary: negotiate,accept-language TCN: choice Last-Modified: Sun, 09 Sep 2001 06:14:17 GMT ETag: "a2c5-147-3b9b08b9;3b9b08b9" Accept-Ranges: bytes Content-Length: 327 Connection: close Content-Type: text/html Content-Language: de <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <HTML> <HEAD> <TITLE>Myware home</TITLE> <STYLE type="text/css"> q:lang(de) { quotes: "»" "«" } </STYLE> </HEAD> <!-- Background white, links blue (unvisited), navy (visited), red (active) --> <BODY BGCOLOR="#FFFFFF"> <q>Nichts</q> zu sehen. </BODY> </HTML> This document is displayed with the quoting from the style sheet although no lang attribute is present in the HTML.
Keywords: patch, review
-> http, I guess
Assignee: asa → neeti
Component: Browser-General → Networking: HTTP
QA Contact: doronr → tever
drepper: nsIHttpChannel::GetLanguage could be replaced with a call to nsIHttpChannel::GetResponseHeader i don't see any reason to introduce storage for mContentLanguage in nsHttpResponseHead.
> i don't see any reason to introduce storage for mContentLanguage in > nsHttpResponseHead. I was just trying to be consistent. This is what is done for mContentType and mContentLanguage. Since response headers are not too frequently allocated the additional memory is IMO a price well payed to get some consistency in the code. If you disagree I can change it but maybe you want to reconsider.
content-type and content-length are stored for reasons internal to the http implementation. the content-language header, however, is not needed interally by the http protocol implementation. therefore, it does not need extra storage. also, IMO GetLanguage should not be added. content-language should be treated similarly to content-disposition, which does not have a special accessor.
I the patch in attachement 48928 more to your liking? I've tested it and it still works for me.
your patch looks fine for the most part, but you'll need to get someone in content to give you an r=, then i (or someone else) can give you an sr=.
Maybe you should modify nsDocument rather than nsHTMLDocument? (Then you'd end up hitting nsHTMLDocument and nsXMLDocument. Or is there a reason you don't want to modify nsXMLDocument?) I'm not a big fan of |nsString*| members. Why not just |nsString|? (Well, one answer is that |nsString| isn't as lightweight as we'd like, but...) Also, the "intl.accept_languages" pref can contain multiple languages (mine is "en, es, eo"). Is that appropriate here, or do you want to take only the part before the first comma? (And is that even an appropriate guess?)
> Or is there a reason you don't want to modify nsXMLDocument?) The patch changes nsXMLDocument as well. I'm just concerned about all other uses of nsDocument. If you say it's good to move it there I'll do it. But I don't want to do it right away. > I'm not a big fan of |nsString*| members. Why not just |nsString|? (Well, one > answer is that |nsString| isn't as lightweight as we'd like, but...) I've used the same method elsewhere in these class. > Also, the "intl.accept_languages" pref can contain multiple languages (mine is > "en, es, eo"). Is that appropriate here, or do you want to take only the part > before the first comma? (And is that even an appropriate guess?) The Content-Lanaguage value in HTTP also can contain multiple languages (stupid but true). So, storing the list is OK.
Attached patch move main changes to nsDocument (deleted) — Splinter Review
I've incorporated the comments of dbaron and jst. The code compiles and works. I saw some funny effects when loading some HTML docs. But I suspect that in my tree due to missing dependencies not all files got recompiled after the change. I'm rebuilding the whole tree which takes a while on my machine. But the patch is already attached to this bug (attachment 48965 [details] [diff] [review]).
Comment on attachment 48965 [details] [diff] [review] move main changes to nsDocument sr=jst
Attachment #48965 - Flags: superreview+
Attached patch little bug fix + optimizations (deleted) — Splinter Review
why does mContentLanguage need to be stored as unicode? isn't it's value guaranteed to be ascii?
> isn't it's value guaranteed to be ascii? Yes, it is. I can change this as well but it's getting more and more different from the rest of the code. If this should serve as a good example I'm fine with it but changing it only "because it's nice" isn't worth it IMO. More code will be needed since then the resources for the string have to be handled explicitly.
so, if mContentLanguage never needs to be converted to a unicode string, then there is no reason to store it as a unicode string. if it does, however, need to be accessed as a unicode string, then perhaps storing it as unicode makes sense.
> if it does, however, need to be accessed as a unicode string, then perhaps > storing it as unicode makes sense. First thing I do in the code where I use GetContentLanguage() is to convert it to ASCII. So, this would mean using const char* is correct. But I don't know what other uses will follow in future.
Next try. darin insisted on using char data. This is indeed what I expect in the :lang() implementation so I updated the patch. Still works for me.
Attached patch Use nsCString instead of char (deleted) — Splinter Review
For completeness, I've attached the patch to use nsCString instead of char. jst currently leans towards using the original patch with nsString since the strings this string will be compared with is also in nsString format and so we avoid converting again and again.
Darin, storing this string as unicode makes sense since the code that will be using this string (the style system) is all unicode based, if we'd store this as an 8-bit string we'll end up having to convert the string for every access from the style system, which can be a fair amount of accesses while laying out a page. The overhead of storing this string as unicode is lost in the noice anyway, so storing it as unicode is IMO the better of the two alternatives. I'd say we go with attachment 48982 [details] [diff] [review]. Thanks Ulrich for fixing this!
Comment on attachment 48982 [details] [diff] [review] little bug fix + optimizations > >+ > nsDOMStyleSheetList::nsDOMStyleSheetList(nsIDocument *aDocument) > { > NS_INIT_REFCNT(); Unnecessary extra line ;). r=peterv.
Attachment #48982 - Flags: review+
sounds good to me... thanks for walking through the content-language usage. r/sr=darin
It was checked in by timeless@mac.com on September 11th. -> Fixed.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: