Closed Bug 146062 Opened 22 years ago Closed 22 years ago

nsWebBrowserPersist::SetDocumentBase() does not work with XHTML/XML

Tracking

(Not tracked)

Status:

VERIFIED FIXED

Milestone:

mozilla1.3alpha

People

(Reporter: hjtoi-bugzilla, Assigned: hjtoi-bugzilla)

References

Details

(Keywords: xhtml)

Attachments

(1 file, 2 obsolete files)

Fix v1.0 22 years ago Heikki Toivonen (remove -bugzilla when emailing directly) (deleted), patch	adamlock : review+	Details \| Diff \| Splinter Review
Fix v1.1 22 years ago Heikki Toivonen (remove -bugzilla when emailing directly) (deleted), patch	hjtoi-bugzilla : superreview-	Details \| Diff \| Splinter Review
Fix v1.2 22 years ago Heikki Toivonen (remove -bugzilla when emailing directly) (deleted), patch	adamlock : review+ alecf : superreview+	Details \| Diff \| Splinter Review

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Description

•

22 years ago

nsWebBrowserPersist::SetDocumentBase() uses GetElementsByTagName() method which most likely will work unexpectedly in XHTML/XML documents. Look at bug 133654 how to recognize when it is plain HTML and when it is XHTML, so that you'd need to use GetElementsByTagNameNS(). In case it is not HTML and not XHTML, you might want to use XML Base (http://www.w3.org/TR/xmlbase/)

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Keywords: xhtml

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Blocks: 136717

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Keywords: nsbeta1

sairuh (rarely reading bugmail)

Updated

•

22 years ago

QA Contact: sairuh → petersen

Adam Lock

Updated

•

22 years ago

Target Milestone: --- → Future

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 1

•

22 years ago

I have a patch, taking...

Assignee: adamlock → heikki

Priority: -- → P2

Whiteboard: [fixinhand]

Target Milestone: Future → mozilla1.3alpha

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 2

•

22 years ago

Attached patch Fix v1.0 (obsolete) (deleted) — Details — Splinter Review

The beginning of this patch contains the fix for bug 136718. The rest of the patch is for this bug. This works even for XML documents that have nothing but |<img xmlns="http://www.w3.org/1999/xhtml"/>| as content, in which case no head or base tags will be inserted. There is something weird going on here, though. SetDocumentBase() is called twice: first without |aBaseURI|, then with it. Regardless, the base tag does never appear to be serialized, at least if none existed to begin with. This code seems to create and append one so that is weird. This happens with HTML and XML alike, and has nothing to do with my changes.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Attachment #105664 - Flags: review?(adamlock)

Adam Lock

Comment 3

•

22 years ago

Comment on attachment 105664 [details] [diff] [review] Fix v1.0 r=adamlock with some minor nits: NeedXHTMLBaseTag should run a loop over an array of elements to reduce bloat. xhtml_ns, head, base should probably be called kXHTMLNS, kHead etc. for consistency with other consts. EmptyList should be renamed IsEmptyList for readability, or deleted altogether if NeedXHTMLBaseTag is rewritten to use a loop.

Attachment #105664 - Flags: review?(adamlock) → review+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 4

•

22 years ago

Thanks Adam, good points, will fix. But what about the SetDocumentbase() weirdness, is my observed behavior expected or not? Because right now the output that is produced for a document (say, |<doc><img/></doc>| will output an empty head tag, but no base.

Status: NEW → ASSIGNED

Adam Lock

Comment 5

•

22 years ago

The base tag is removed during persistence to ensure that the relative links are not fouled up when the document is saved to disk. It is then restored afterwards to ensure the DOM document still works properly when control returns to the browser. So actually thinking about it, perhaps there should be no base tag there at all - our code should strip all base tags (XML or otherwise) before saving and restore them afterwards. Local files require all links to external documents need to be made absolute, all links to inline content, stylesheets etc need to be made relative.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 6

•

22 years ago

Attached patch Fix v1.1 (obsolete) (deleted) — Details — Splinter Review

Fixed problems Adam pointed out.

Attachment #105664 - Attachment is obsolete: true

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Attachment #106127 - Flags: superreview?(kin)

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 7

•

22 years ago

I logged the weird behavior I saw as bug 179990.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Attachment #106127 - Flags: superreview?(kin) → superreview?(alecf)

Adam Lock

Comment 8

•

22 years ago

As I commented when I realised before, the removal of base tags is actually deliberate during persistence which is why I think this patch needs more consideration. The base tags are removed because it hoses relative links in the file. For example, if the HTML had a base tag pointing to http://www.mozilla.org then an img tag fixed up to point to "mozilla.org_files/image.gif" would resolve to "http://www.mozilla.org/mozilla.org_files/image.gif" which is wrong. Therefore the base tag has to be removed in the local copy to ensure relative links correctly resolve. Anchors and other links are made absolute to ensure they also work properly. As the DOM document was supplied by the caller, the base tag must be reinstated before returning. As an alternative to this remove/reinstate behaviour it may be possible to strip out the base tags during the encoding phase during the call to FixupNode. If the node is a base element, we replace it with an empty comment node as it is saved to disk. We would still have to make a note of what the base tag actually was for fixup, but it would mean the original document would not have to be modified in this weird way.

Alec Flett

Comment 9

•

22 years ago

Comment on attachment 106127 [details] [diff] [review] Fix v1.1 Well.. I gotta say that the use of GetElementsByTagNameNS seems a little ugly to me. I mean, isn't this going to do a search of the whole tree? The especially scary one is of course NeedXHTMLBaseTag() which calls it in a loop. As for the code itself, I think you meant to put kXMLNS into the loop in NeedXHTMLBaseTag(), rather than that NS_LITERAL_STRING()? I guess before I sr= this, I want to hear some justification as to why we can't do something like a depth first, short-circuiting search for the first occurance of a given tag, rather than walking the whole tree.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 10

•

22 years ago

Attached patch Fix v1.2 (deleted) — Details — Splinter Review

Address Alec's issues, including depth-first-search. The reason I had not done it earlier was because use of XML is still pretty limited, and it could be argued we should only check the content type, and not the contents. But this is almost as fast for typical XHTML-only documents as checking the content type, and this will also work for stuff served without that content type so this should be more robust.

Attachment #106127 - Attachment is obsolete: true

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Attachment #106421 - Flags: superreview?(alecf)

Attachment #106421 - Flags: review?(adamlock)

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

22 years ago

Attachment #106127 - Flags: superreview?(alecf) → superreview-

Alec Flett

Comment 11

•

22 years ago

Comment on attachment 106421 [details] [diff] [review] Fix v1.2 this is better - I might have made abstracted the depth-first searching method a bit so that I could replace GetElementsByTagName[NS] but since GetElementsByTagName is probably only called once or twice per call to SetDocumentBase, Can you just add a comment where you're calling GetElementsByTagName[NS] saying that this is a one-time operation since you're doing base tag fixup? with the comment sr=alecf

Attachment #106421 - Flags: superreview?(alecf) → superreview+

Adam Lock

Comment 12

•

22 years ago

Comment on attachment 106421 [details] [diff] [review] Fix v1.2 I think the patch for bug 179990 should go in first or be incorporated into this one. The issue of having to remove the base tag and then restore it should go away then. This patch does look okay otherwise, except for the way the tag array is defined. Is there a better way to define a PRUnichar string list?

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 13

•

22 years ago

Do I need to wait long for the other bug? I am asking because this patch here also contains the fix to the xml-stylesheet issue which is what is holding me from landing the fix in 136717. The good thing about the string list is that it is done statically. I don't know any better way to do this. I could make it cleaner-looking by making it use dynamic memory, but we don't want that. I guess I could make this code use atoms, but that would add new dependencies I think. I don't think this code is so performance sensitive to warrant that.

Adam Lock

Comment 14

•

22 years ago

Comment on attachment 106421 [details] [diff] [review] Fix v1.2 r=adamlock, though the other bug will have to be updated to remove some of the stuff in SetDocumentBase and the calls to it.

Attachment #106421 - Flags: review?(adamlock) → review+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 15

•

22 years ago

Checked in.

Status: ASSIGNED → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

Whiteboard: [fixinhand]

Chris Petersen

Comment 16

•

22 years ago

Verified per last comments.

Status: RESOLVED → VERIFIED

Nobody; OK to take it and work on it

Updated

•

8 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.