Closed Bug 246441 (e4x) Opened 20 years ago Closed 12 years ago

Implement E4X in SpiderMonkey

Categories

(Core :: JavaScript Engine, enhancement, P1)

enhancement

Tracking

()

RESOLVED WONTFIX
mozilla1.8beta2

People

(Reporter: brendan, Unassigned)

References

()

Details

(Keywords: js1.5)

Attachments

(12 files, 17 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), application/x-gzip
Details
(deleted), patch
shaver
: superreview+
brendan
: approval1.8a5+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), application/x-javascript
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
shaver
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
We should support ECMAScript for XML (E4X), which is a draft ECMA standard. /be
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.8alpha2
I'll try to pry it loose tomorrow at the ECMA TG1 meeting. /be
http://lambda.weblogs.com/discuss/msgReader$6380 has some good feedback on the E4X spec, much of which I agree with. Maybe we can remedy some of them in the JS24X stuff?
Depends on: 40757
Attached patch work in progress, parses xml but doesn't yet codegen (obsolete) (deleted) — — Splinter Review
But @a::b, *, etc. work. I'm compiling a big E4X errata sheet; I'll mail that to the right folks next week. /be
Attached patch checkpoint (obsolete) (deleted) — — Splinter Review
Attachment #156645 - Attachment is obsolete: true
Target Milestone: mozilla1.8alpha2 → mozilla1.8alpha4
Alias: e4x
Attached file compressed checkpoint patch, nearly ready to branch (obsolete) (deleted) —
Attachment #156860 - Attachment is obsolete: true
Attached file Errata so far (obsolete) (deleted) —
Possibly some of these claimed errata are erroneous -- there sure are a lot. The spec seems like a somewhat buggy translation of slightly buggy Java code, which is unfortunate, but possibly better than a spec based on no implementation! /be
Another erata: In E4X specs the section 11.5.1 The Abstract Equality Comparison Algorithm states: ... 3. If Type(x) is the same as Type(y) ... c. If Type(x) is Object and x.[[Class]] == "Namespace", return the results of the comparison x.uri == y.uri That should be changed to c. If Type(x) is Object and x.[[Class]] == "Namespace" and y.[[Class]] == "Namespace", return the results of the comparison x.uri == y.uri A similar treatment should be applied to QName part. Note also that the algorithm can be simplified if [[Equals]] would be defined to return triple true|false|null where null indicates that the object does not know how to do the equality. Then the method can be defined for Namespace and QName to return null if the argument is not Namespace/QName or do the proper job and similarly for XML. Then 1-2-3-4 in the algorithm can be replaced by: 1. If x implements [[Equals]], a. Call x.[[Equals]](y) b. If result of a is not null, return it 2. If y implements [[Equals]], a. Call y.[[Equals]](x) b. If result of a is not null, return it
Sorry for sounding harsher than I intended in comment 7. Consider that JS was more than slightly buggy in the old days, and ECMA standardization of it involved at least two other implementations. I think E4X will benefit from other attempts to implement the spec, and I hope the errata here are helpful. /be
A note on a pretty minor glitch in spec, that I'd like to share here (I'll roll it into the complete errata file, which needs to be HTML-ized and checked for errors): 11.1.4 The sub-grammar here allows tag content of the form <T A={e1}{e2}>, but ExcapeAttributeValue is called only on the result of evaluating e1, not e2, and unless e2 evaluates to the empty string, the resulting string concatenation will not be well-formed XML because only the e1 result is escaped and quoted. The spec clearly intends that one can have exactly one {expr} in lieu of a properly quoted and escaped attribute value, but the cover grammar does not require this, and it's up to the XML parser to try to catch such errors. This simplifies the E4X spec, but it deserves at least an informative comment, if not a tighter cover grammar. Implementations where the XML parsing is built into the JS parser can be simplified based on the restriction, and we don't want to court interop bugs where an {e2} that evaluates to the empty string is allowed in one implementation but not in another. I remember bugging John about the cover grammar issue just before ECMA-357 was finalized, but I didn't have time to find this quirk, or ambiguity. I should have seen it then; better late than never. /be
Attached file compressed checkpoint patch, landing today (obsolete) (deleted) —
Need to fix a few more bugs, test again, and land. Current plan is to avoid a branch by landing on the trunk, configured off. /be
Attachment #160708 - Attachment is obsolete: true
Checked into cvs.mozilla.org trunk, configured off (via jsconfig.h). See the TODO comment near the top of jsxml.c. /be
Attached file latest errata file (obsolete) (deleted) —
I included the erratum Igor gave in comment 8. Feel free to add more to this bug, and I'll roll them up. /be
Attachment #160709 - Attachment is obsolete: true
Here is another errata: ECMA 357, 11.1 Primary Expressions ... Semantics The production PrimaryExpression : PropertyIdentifier is evaluated as follows: 1. Let n be the result of evaluating PropertyIdentifier 2. Let name = ToXMLName(n) 3. While (true) a. If there are no more objects on the scope chain, i. Return undefined ^^^^^^^^^^^^^^^^^^^^^^^^ b. Let o be the next object on the scope chain. NOTE: on the first iteration, o will be the first object on the scope chain c. If Type(o) {XML, XMLList} i. Let hasProp be the result of calling the [[HasProperty]] method of o, passing name as the property ii. If hasProp == true 1. Return a value of type Reference whose base object is o and whose property name is name The algorithm should not return undefined for unbound name. Rather in this case it should return a value of type Reference whose base object is the first XML or XMLList object found on the scope chain or null otherwise and whose property name is <i>name</i> so the algorithm should read: 1. Let n be the result of evaluating PropertyIdentifier 2. Let name = ToXMLName(n) 3. Let firstXML = null 4. While (true) a. If there are no more objects on the scope chain, i. Return a value of type Reference whose base object is firstXML and whose property name is name. NOTE: firstXML will be null if no XML or XMLList value was found on the scope chain. b. Let o be the next object on the scope chain. NOTE: on the first iteration, o will be the first object on the scope chain c. If Type(o) {XML, XMLList} i. Let hasProp be the result of calling the [[HasProperty]] method of o, passing name as the property ii. If hasProp == true 1. Return a value of type Reference whose base object is o and whose property name is name iii. If firstXML == null 1. Set firstXML to o Without the change while script consisting of the single "x" generates ReferenceError, its E4X modifications, *::x or @x results in undefined. In addition the following would not work if xmlList contains an element without child subtag: xmlList.(child.@name == 'something') as child would evaluates to undefined resulting to ReferenceError on child.@name.
Igor, thanks for writing that up -- it was in the errata file at attachment 161180 [details] already at the top, in cryptic form: "??? @a::b => undefined if not found in scope chain !?!?" I noted this while implementing but forgot to go back and explain that note. /be
Erratum in 13.5.4.18 XMLList.prototype.propertyIsEnumerable ( P ): The first step in the semantic algorithms should be changed from: 1. if ToNumber(P) is greater than or equal to 0 and ToNumber(P) is less than x.[[Length]], return true to 1. Let i = ToUint32(P) 2. If ToString(i) == P a. If i is greater than or equal to 0 and i is less than x.[[Length]], return true b. Otherwise return false Without the change x.propertyIsEnumerable(0.1) or even x.propertyIsEnumerable(-0.0) (depending on the meaning of "or equal to 0") can return true.
Possible errata in XMLList.prototype.child 13.5.4.5 defines XMLList.prototype.child(propertyName) to return empty list if the list itself is empty as well without any checks for valid propertyName. That means that x.child(null) or x.child(undefined) returns empty list if x is empty and throw TypeError when x is not empty during the eventual call to toXMLName(null|undefined). Compare that to 13.4.4.4 XML.prototype.attribute(attributeName) which always throws TypeError when attributeName is null or undefined. Should XMLList.prototype.child always require valid name even when the list is empty?
Blocks: 265341
It seems > does not need to be escaped in element values (text), and the test in e4x/Expressions/11.1.4.js wants it not to be escaped -- this is a ToXMLString erratum. Also, an element's attribute and namespace uri values are not escaped, but should be. More when I get back from the ECMA TG1 meeting. /be
Attachment #161086 - Attachment is obsolete: true
Attached patch checkpoint, more tests passed, closing in (obsolete) (deleted) — — Splinter Review
Attachment #163500 - Attachment is obsolete: true
Plus some lurking bugs, and some bogus tests in the suite. Testsuite errata doc soon, along with updated spec errata. Reduced context from -pu8 to fit within bugzilla's 500K patch size limit. /be
Attachment #164502 - Attachment is obsolete: true
Attached patch checkpoint, just about done (deleted) — — Splinter Review
Impurity lurking, plus testsuite bugs (I think) that give false failures in six tests (could be real bugs in the SpiderMonkey e4x lurking still, causing one or two of these): # Retest List, smdebug, generated Fri Nov 12 16:13:00 2004. # Original test base was: e4x. # 129 of 129 test(s) were completed, 6 failures reported. e4x/Expressions/11.6.1.js e4x/Types/9.1.1.9.js e4x/XML/13.4.3.js e4x/XML/13.4.4.38.js e4x/XML/13.4.4.39.js e4x/Regress/regress-264369.js /be
Attachment #165651 - Attachment is obsolete: true
Attached file version for 1.8a5, ifdef'd off still (deleted) —
I checked in something close to attachment 166178 [details], rs=shaver. It passes the js/tests suite, excluding e4x, configured without E4X support (the default for now). With JS_HAS_XML_SUPPORT defined to 1 for JS_VERSION == 150 in jsconfig.h, it passes all e4x tests but the six noted in comment 21, and seems pure undef Purify on Windows (although I had some "ghost" leaks that were once real, then fixed, but persisted even after the fixes, which demonstrably eliminated all such leaks using the DEBUG_notme code in jsxml.c). I'm leaving this bug open to work on further testing, correctness, footprint and performance wins necessary to justify turning on E4X by default. One idea: use NSPR's dlopen/dlsym portable veneer to make the E4X support runtime-conditional, and put the bulk of the code in a separate DLL/DSO. Before resorting to that, I'm going to try to recover space throughout SpiderMonkey, by reducing the use of macros (optimizations done in the '90s on RISC targets, mostly, and possibly counterproductive on superscalar x86 machines now). /be
Oh, and I put off thread safety for "later". The plan is to avoid penalizing the all single-threaded use-cases and embeddings by building on the zero-cost locking implemented in bug 54743. In the case of E4X, this isn't trivial, because just using a per-XML-object zero-cost lock doesn't avoid nested locks and AB-BA deadlock hazards inherent in XMLLists, [[Insert]], [[Append]], and [[Replace]]. Just for starters! /be
Feedback in bug 270552 and bug 270553. - N.
Attached patch fix a spec bug that I faithfully transcribed into a code bug (obsolete) (deleted) — — Splinter Review
Updated errata file next. /be
Attached file latest errata file (obsolete) (deleted) —
I really should clean this up and HTML-ize it, perhaps tomorrow. /be
Comment on attachment 166406 [details] [diff] [review] fix a spec bug that I faithfully transcribed into a code bug I'd like to fix this for 1.8a5, just for the early adopters, but it's not a big deal either way -- remember that E4X is configured off by default, and not hooked up to the <script> tag by a MIME type parameter yet. John, check my erratum claim here. It sure seems like the [[Replace]]-calls-[[Insert]] case, replacing at or beyond the length with a a list value, will create a hole and an off-by-1 x.[[Length]]. /be
Attachment #166406 - Flags: superreview?(shaver)
Attachment #166406 - Flags: review?(john.schneider)
Comment on attachment 166406 [details] [diff] [review] fix a spec bug that I faithfully transcribed into a code bug Optimizing, since this is all #if'd off for 1.8a5. /be
Attachment #166406 - Flags: approval1.8a5?
Comment on attachment 166406 [details] [diff] [review] fix a spec bug that I faithfully transcribed into a code bug a=asa for 1.8a5 checkin.
Attachment #166406 - Flags: approval1.8a5? → approval1.8a5+
Comment on attachment 166406 [details] [diff] [review] fix a spec bug that I faithfully transcribed into a code bug Thanks, but there's a better patch that probably does what the spec should do: avoid incrementing x.[[Length]] until we know the value being inserted at the end is not an XMLList. /be
Attachment #166406 - Attachment is obsolete: true
Attachment #166406 - Flags: superreview?(shaver)
Attachment #166406 - Flags: review?(john.schneider)
Attachment #166406 - Flags: approval1.8a5+
Attached patch better approach (deleted) — — Splinter Review
I'll check this into 1.8a5. Reviews welcome, after the fact! ;-) /be
Comment on attachment 166536 [details] [diff] [review] better approach Restoring Asa's stamp. /be
Attachment #166536 - Flags: superreview?(shaver)
Attachment #166536 - Flags: review?(john.schneider)
Attachment #166536 - Flags: approval1.8a5+
Attached patch js/tests/e4x patch to fix test bugs and glitches (obsolete) (deleted) — — Splinter Review
Who should review this patch? /be
Attached file latest errata file (obsolete) (deleted) —
I didn't get the time to HTML-ize this yet, but it grew a bit. Diff against the last one: 94a95,106 > 9.2.1.2 step 2(c)(ii) sets _y.[[Parent]] = r_ where _r_ is the result of > [[ResolveValue]] called on _x.[[TargetObject]] in 2(a)(i). This can > result in text parenting text: > > var MYXML = new XML(); > MYXML.appendChild(new XML("<TEAM>Giants</TEAM>")); > > (testcase from Werner Sharp <wsharp@macromedia.com>). > > To match insertChildAfter, insertChildBefore, prependChild, and > setChildren, we should silently do nothing in this case. > /be
Attachment #166407 - Attachment is obsolete: true
Attached patch updated e4x testsuite patch (obsolete) (deleted) — — Splinter Review
- Vacuously ported e4x/shell.js's NL function to SpiderMonkey. - Fixed bugs (deviations from the ECMA-357 spec) in e4x/Expressions/11.6.1.js sections 3 and 6. - Fixed backward use of ignoreWhitespace (or backward sense of equality test, but I went with minimal change here) in e4x/Types/9.1.1.9.js section 9. - e4x/XML/13.4.3.js repeated used XML.defaultSettings() where it wanted XML.setSettings() -- this suggests that the verb-less "defaultSettings" name is confusing, because hackers love to verb nouns, and "default" is often used as a verb, so XML.defaultSettings connotes "set defaults", not "get defaults". - e4x/XML/13.4.4.38.js sets XML.prettyPrinting = false but then expects XMLLists to use NL() as the list element separator, contrary to 10.2.2 Step 2(a). Same goes for 13.4.4.39.js. - Added missing newline at end of file to a bunch of tests. - Fixed misspelled "arbitrary" in e4x/Regress/regress-257679.js. I'd like to check this in soon, but not before someone reviews the substantive changes and agrees (and files any needed Rhino E4X bugs). Igor, are you game? /be
Attachment #166821 - Attachment is obsolete: true
Attached patch patch I just checked in (deleted) — — Splinter Review
to finish off several PutProperty bugs, a few other odd bugs, and fixes based on the recently discovered errata. /be
Breaking out the E4X <=> DOM hookup to the bug that Nigel kindly filed. This bug can track that, and accumulate errata and testsuite fixes, for a while. When the DOM hookup is done, I'd like to close both this bug and bug 270553, and let any further problems (which should arise at much lower rate) be tracked by individual bugs. For regression testing purposes, it would have been ideal to make separate bugs for each erratum or SpiderMonkey bug fixed by the last few patches attached here, but that would have slowed things down to a crawl. The consolidated errata file can help tell us where we need to add more tests, instead of a list of micro-bugs to treat as "regressions". /be
Depends on: 270553
(In reply to comment #36) > Created an attachment (id=166969) > updated e4x testsuite patch > > I'd like to check this in soon, but not before someone reviews the substantive > changes and agrees (and files any needed Rhino E4X bugs). The changes triggered test failures in Rhino but AFAICS these are real bugs. Note also that e4x/XMLList/13.5.4.16.js contains: x = new XML(); TEST(2, undefined, x.parent()); Should it be changed to null? > /be
Brief feedback. Errata: Section 8 of E4X incorrectly pluralizes "Punctuators" in the second paragraph. Bugs: js -x could error more gracefully and earlier when presented with this test data: var a = <>; // put anything lexable here Seen with Linux/gcc 3.3.3-7, latest trunk. - N.
Igor wrote: > x = new XML(); > TEST(2, undefined, x.parent()); > > Should it be changed to null? Yes, to match the spec. I changed SpiderMonkey's E4X to match the testsuite the other week, and noted the confusion in the errata file: 13.5.4.16 Steps 1 and 3 specify undefined return, not null as is done for XML.prototype.parent(). The testsuite wants null for "no parent" from XMLList.prototype.parent(). It's handy to be able to distinguish "not there" from "null" in cases such as this, so I think the spec has it right. I'll post an updated testsuite patch with this fix, and a SpiderMonkey E4X patch to match, within the next day or two. /be
Attached file Sample showing <> error lateness (obsolete) (deleted) —
Now that it's not the wee hours, here's a more coherent report. Error from this attachment reports Line 5 not 2: 5: SyntaxError: missing } in XML expression: 5: flag = false; 5: ..............^ - N.
Attached file Sample showing <> error lateness (deleted) —
Now that it's not the wee hours, here's a more coherent report. Error from this attachment reports Line 5 not 2: 5: SyntaxError: missing } in XML expression: 5: flag = false; 5: ..............^ - N.
Attachment #167237 - Attachment is obsolete: true
Attachment #167237 - Attachment mime type: application/octet-stream → application/x-javascript
Comment on attachment 167238 [details] Sample showing <> error lateness The original Netscape HTML parser went to great lengths to cope with unclosed tags and improperly terminated entities. Of course here we don't want to "correct" errors by auto-closing tags, but how much code footprint do we want to throw at diagnosing these issues? Food for thought. /be
Attachment #167238 - Attachment mime type: application/octet-stream → application/x-javascript
Attached patch latest e4x testsuite patch (deleted) — — Splinter Review
Fixes tests to-do with XML.prototype.parent() and XMLList.prototype.parent(). SpiderMonkey patch next. /be
Attachment #166969 - Attachment is obsolete: true
I'm checking this in today, along with the testsuite patch. /be
Attached file toXMLString() comment serialisation problem (obsolete) (deleted) —
Trivial case showing non-comment serialization in toXMLString() - N.
Attached file Entity reference serialization problems (deleted) —
Trivial test case showing entity reference loss with to[XML]String(). - N.
Re: comment 47: comments are ignored by default (XML.ignoreComments defaults to true -- set it to false to get the desired results). Re: comment 48: that's a bug, I'll fix it (feel free to report such lesser followup bugs separately from this bug). Thanks, /be
Fixed the bug reported in comment 48. This leads to results that might be considered surprising: toString() --------- &nbsp; toXMLString() --------- <tag>&amp;nbsp;</tag> But that's per spec, and applies to any unknown entity (nbsp is not defined by the XML spec). /be
Just out of interest, why wouldn't: var xml = <tag>&nbsp;</tag>; ...cause a well-formedness error?
Hixie: cuz my integrated XML parser isn't doing that kind of well-formedness check? D'oh. Easily fixed, thanks.... /be
So E4X is written without specifying XML apart from 1.0 (vs. 1.1, e.g. in what makes an identifier an XMLName). But it doesn't say anything about entity refs that refer to entities other than the built-in XML 1.0 ones (amp, lt, gt, apos, quot). This means that an XML parser used in conjunction with a JS engine to implement E4X could process <!ENTITY ...> declarations and expand entity references, but XML.prototype.toXMLString would not convert these expansions back into references. That means no round-tripping via toXMLString, which seems like a bug to me. Probably ECMA TG1 considered this already. Anyway, 10.3.2.1 step 8 is quite clear, and SpiderMonkey's E4X should throw a TypeError for any unexpanded entity ref. I'll fix that now. /be
E4X's XML language doesn't support <!ENTITY> at all, right? So to end up in this position you'd need something like: document.createEntityReference('nbsp').toXML().toXMLString(); ...right? That would presumably just return a string consisting of U+00A0, assuming "document" is an HTML document or an XML document with the appropriate entity defined. Do the specs define this well enough?
Hixie: > document.createEntityReference('nbsp').toXML().toXMLString(); There is no toXML in the DOM specs, but if we added such a thing, we'd have to extend ECMA-357 as well. 10.3.2.1 MapInfoItemToXML step 8 says: 8. If i is an unexpanded entity reference information item a. Throw a TypeError exception /be
> There is no toXML in the DOM specs My bad, I misunderstood ECMA-357 section 10.3. Take this example instead: new XML(document.createEntityReference('nbsp')).toXMLString(); > 10.3.2.1 MapInfoItemToXML step 8 says: > > 8. If i is an unexpanded entity reference information item > a. Throw a TypeError exception By "unexpanded" I assume they mean any entity reference, as opposed to one that has been flattened. I guess that works. So basically, entities in general aren't handled by E4X. (That's a good thing. I wish they'd dropped entities other than the numeric ones and the five predefined ones from XML in the first place.)
Attachment #167900 - Attachment is obsolete: true
Attached file latest errata file (obsolete) (deleted) —
This should include everyone's errata -- if not, please mail me. /be
Attachment #161180 - Attachment is obsolete: true
Attachment #166822 - Attachment is obsolete: true
Regarding erratas: It would be nice if E4X clarifies whether XML.[[Get]] or Object.[[Get]] should be used during prototype lookup if the head of the prototype chain is Object. Consider: function Foo() { } Foo.prototype = XML.prototype; var bar = new Foo(); var test = bar.toXMLString The question is whether test should refer to the toXMLString method defined in XML.prototype or should it be empty XMLList? Here is ECMAScript for Object.[[Get]] from ECMA-262 v3, section 8.6.2.1: -------------------------------------- 8.6.2.1 [[Get]] (P) When the [[Get]] method of O is called with property name P, the following steps are taken: 1 . If O doesn't have a property with name P, go to step 4. 2 . Get the value of the property. 3 . Return 4 . If the [[Prototype]] of O is null, return undefined. 5 . Call the [[Get]] method of [[Prototype]] with property name P. 6 . Return Result(5). -------------------------------------- Now E4X specs gives in ECMA-357, section 9.1.1.1, first paragraph: -------------------------------------- 9.1.1.1 [[Get]] (P) Overview The XML type overrides the internal [[Get]] method defined by the Object type. ... -------------------------------------- Given the traditional definition of "override" this tells that the prototype lookup should use XML.[[Get]] and in the example "test" should be empty XMLList. On the other hand the 3-rd paragraph of 9.1.1.1 [[Get]] (P) states: -------------------------------------- NOTE Unlike the internal Object [[Get]] method, the internal XML [[Get]] method is never used for retrieving methods associated with XML objects. E4X modifies the ECMAScript method lookup semantics for XML objects as described in section 11.2.2. -------------------------------------- which indicates that Object.[[Get]] should always call Object.[[Get]] and "test" should alias XML.prototype.toXMLString method. Note that 11.2.2 does not consider the case of prototype chain and simple refers to Object.[[Get]] definition. So what should be the right thing? I would prefer the second interpretation since as far as I can see it is the only way to get reference to methods defined in XML.prototype to use them in continuation with Function.protype.apply, but it is just my preference.
NCRs are implemented now, and all entity references other than the fab five defined in XML 1.0 are illegal. /be
Attached file latest errata file (deleted) —
Diff with previous errata file on left: 113a114,124 > 9.2.1.2 Step 2(e)(i, ii), > 9.2.1.2 Step 7(e)(i), > 9.2.1.3 Step 2(b)(ii)(1)(a) > All uses of a.[[Name]] for an attribute a in these sections that pass > that QName object to [[Delete]] must pass an AttributeName cloned from> a.[[Name]]. The [[Name]] internal property is always a QName instance> and never an AttributeName or AnyName instance. But [[Delete]] will > not operate on x.[[Attributes]] when given a QName by these sections, > so a child could be wrongly deleted instead of the attribute of the > same name. > /be
Attachment #168427 - Attachment is obsolete: true
That was ugly -- I wish bugzilla would wrap long text better, or not at all if it appears to be "cited" (diff > at start of line looks like mail citation line mark) -- fudging the indentation so the diff is readable, to foster discussion: 113a114,124 > 9.2.1.2 Step 2(e)(i, ii), > 9.2.1.2 Step 7(e)(i), > 9.2.1.3 Step 2(b)(ii)(1)(a) > All uses of a.[[Name]] for an attribute a in these sections that pass > that QName object to [[Delete]] must pass an AttributeName cloned from > a.[[Name]]. The [[Name]] internal property is always a QName instance > and never an AttributeName or AnyName instance. But [[Delete]] will > not operate on x.[[Attributes]] when given a QName by these sections, > so a child could be wrongly deleted instead of the attribute of the > same name. > The problem in these steps is that an attribute's [[Name]] is used as if it were of (internal) type AttributeName, but it's specified in 9.1.1 that [[Name]] must be null or a QName object. /be
Oh, and I just enabled E4X at compile time, with shaver prodding me to action. The footprint hit is ~80k, but I have some wins coming, and this is a case where we gain a feature that may be worth the cost, even if we can't save elsewhere to try to reduce the size of the hit. /be
dmose pointed out the followup bug patched here. E4X breaks the invariant, preserved until now for all JS objects, native or host: for all method m in an object o, o.m(arguments) is equivalent to var f = o.m; f.apply(o, arguments). With E4X, you cannot extract a function-valued method from an XML instance. In SpiderMonkey's E4X, you can extract a method using the function:: namespace, but that extension has not made its way back to ECMA, and it doesn't, in any case, help prevent the broken symmetry from rippling through other layers of C or C++ code built on the JS API. In this case, XPConnect. Still seems like a lose to break the invariant for XML, but I haven't got a way to unify the conflicting goals: 1. For E4X, that every get of a property id that is not an index (in the case of an XMLList; reserved and an error for XML) returns an XMLList, which may be empty, to return all matching attributes or children. 2. Preserving the JS method extraction invariant mentioned above, and related ones (call as well as apply, manual invocation if the |this| parameter binding to o in the examples above doesn't matter, or can be faked via o.m2 = f; o.m2(arguments)). Since 1 does not apply uniformly to all XML properties (get of an index id on an XML object is illegal, reserved for future use; get on an XMLList returns the indexed child, not an XMLList containing the indexed child), I would simply add another special case to 1, preserving 2. But it's a bit late for that -- dmose is using E4X for CalDAV hacking already. /be
I checked in a slightly better version of attachment 169767 [details] [diff] [review], so dmose can (I hope) get past the problem he's seeing. /be
(In reply to comment #63) ... > In SpiderMonkey's E4X, you can extract a method using the function::namespace, but that extension has not made its way back to ECMA, You can extract method in the standard E4X ( well, at least the way I read E4X specs, see comment 58) via: function getMethod(object, name) { var Proxy = function() { } Proxy.prototype = object; var proxy = new Proxy(); return proxy[name]; } and then a.function::method can be done via getMethod(a, "method") > > 1. For E4X, that every get of a property id that is not an index (in the case > of an XMLList; reserved and an error for XML) returns an XMLList, which may be > empty, to return all matching attributes or children. > > 2. Preserving the JS method extraction invariant mentioned above, and related > ones (call as well as apply, manual invocation if the |this| parameter binding > to o in the examples above doesn't matter, or can be faked via o.m2 = f; > o.m2(arguments)). In Rhino the above is satisfied via allowing to apply "()" to XMLList which remember sufficient history to be able to support var f = xmlobject.m; f.apply(xmlobject, arguments) and related operations. > > Since 1 does not apply uniformly to all XML properties (get of an index id on > an XML object is illegal, reserved for future use; get on an XMLList returns > the indexed child, not an XMLList containing the indexed child), I would simply > add another special case to 1, preserving 2. But it's a bit late for that -- > dmose is using E4X for CalDAV hacking already. > > /be
> and then a.function::method can be done via getMethod(a, "method") Clever! But it still doesn't handle name collisions between prototype methods and (user-defined) XML children. > In Rhino the above is satisfied via allowing to apply "()" to XMLList which > remember sufficient history to be able to support var f = xmlobject.m; > f.apply(xmlobject, arguments) and related operations. Again the problem is that someone may write XML of the form: x = <parent><name>...</name><child id="1">...</child>...</parent>; and then you cannot call x.child() or x.name(). E4X really wanted to put XML.prototype.* methods delegated from XML instances into a separate namespace, but instead specified so that they can't be extracted easily or in general. It sounded at the last TG1 meeting as though everyone agreed with the idea of putting the methods in a separate namespace, which would be automatically selected when calling. Then at least the following issues arise: (1) What URI for the namespace? It would not be useful to use a URI that could identify an XML schema, since no schema can describe the grammar for native functions. It would be better if it were a reserved "anti-URI", which could not collide with any XML schema URI. (2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of the function reserved identifier, so that users don't have to repeatedly create a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f = x.fn::child; etc.)? Obviously I think so, although "function" isn't that short (but at least it's reserved, and clear). /be
(In reply to comment #66) > > Again the problem is that someone may write XML of the form: > > x = <parent><name>...</name><child id="1">...</child>...</parent>; > > and then you cannot call x.child() or x.name(). But why? var f = x.child; f.apply(x); can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]] of child to see if it come from xml and then check for Function.prototype as a source of additional properties since x.child() is defined. In fact it does not work in Rhino this way due to fixable bugs. > > E4X really wanted to put XML.prototype.* methods delegated from XML instances > into a separate namespace, but instead specified so that they can't be extracted > easily or in general. It sounded at the last TG1 meeting as though everyone > agreed with the idea of putting the methods in a separate namespace, which would > be automatically selected when calling. > > Then at least the following issues arise: > > (1) What URI for the namespace? It would not be useful to use a URI that could > identify an XML schema, since no schema can describe the grammar for native > functions. It would be better if it were a reserved "anti-URI", which could not > collide with any XML schema URI. > > (2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of > the function reserved identifier, so that users don't have to repeatedly create > a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f = > x.fn::child; etc.)? Obviously I think so, although "function" isn't that short > (but at least it's reserved, and clear). > > /be
(In reply to comment #66) > Then at least the following issues arise: > > (1) What URI for the namespace? It would not be useful to use a URI that could > identify an XML schema, since no schema can describe the grammar for native > functions. It would be better if it were a reserved "anti-URI", which could not > collide with any XML schema URI. > > (2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of > the function reserved identifier, so that users don't have to repeatedly create > a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f = > x.fn::child; etc.)? Obviously I think so, although "function" isn't that short > (but at least it's reserved, and clear). > > /be But why not to provide a global function, say methods(object), that would return the method-only view of the object? Then one can write methods(x).child or even methods(x)['child'] which is much more clear IMO then x[QName(fn, 'child')].
> can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]] > of child to see if it come from xml and then check for Function.prototype as a > source of additional properties since x.child() is defined. The E4X spec says that the XMLList returned in that example should contain one element for reach immediate child of the parent tag, and not include a function object in the array. We'd have to say where it goes, and it would make the list inhomogeneous, breaking operators such as the filtering predicate. It's yet more asymmetry. Second reason not to do this is that it adds random logic to the XML method calling (really, getting -- GetMethod is factored from CallMethod in SpiderMonkey, and could be in E4X spec) path, which is already complicated enough. The Macromedia folks commented on how the magic-ness of E4X objects required extra random logic to be added to generic and type-specific layers, and that was true in SpiderMonkey and (IIRC) in Rhino too. Simpler is better, and a namespace isolates all functions and preserves homogeneity of XMLList [[Get]] results. > But why not to provide a global function, say methods(object), that would > return the method-only view of the object? Then one can write > methods(x).child or even methods(x)['child'] which is much more clear IMO > then x[QName(fn, 'child')]. First, there's no need to write x[QName(fn, 'child')] -- x.fn::child works and is the natural way to express it in E4X or ECMA Edition 4. Second, a global function would have to create or find a safely reusable methods-only view object. That's ineffecient. Third, a global function pollutes the global namespace. Given namespaces in E4X and Edition 4, why would you not use them to handle this asymmetry imposed by E4X's current design, which was based on the reasonable goal mentioned in comment 63 (numbered 1 there): to have homogeneous lists of child or attribute results from [[Get]]? Using a global method is a "JS 1" type of hack (I should know, I perpetrated JS 1 and its hacks), given namespaces. We should not spend much more space arguing in this bug, so if you still feel strongly, write something up and attach it. I'll take it to the TG1 group. /be
(In reply to comment #69) > > can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]] > > of child to see if it come from xml and then check for Function.prototype as a > > source of additional properties since x.child() is defined. > > The E4X spec says that the XMLList returned in that example should contain one > element for reach immediate child of the parent tag, and not include a function > object in the array. We'd have to say where it goes, and it would make the list > inhomogeneous, breaking operators such as the filtering predicate. It's yet > more asymmetry. > > Second reason not to do this is that it adds random logic to the XML method > calling (really, getting -- GetMethod is factored from CallMethod in > SpiderMonkey, and could be in E4X spec) path, which is already complicated enough. > > The Macromedia folks commented on how the magic-ness of E4X objects required > extra random logic to be added to generic and type-specific layers, and that was > true in SpiderMonkey and (IIRC) in Rhino too. Simpler is better, and a > namespace isolates all functions and preserves homogeneity of XMLList [[Get]] > results. > > > But why not to provide a global function, say methods(object), that would > > return the method-only view of the object? Then one can write > > methods(x).child or even methods(x)['child'] which is much more clear IMO > > then x[QName(fn, 'child')]. > > First, there's no need to write x[QName(fn, 'child')] -- x.fn::child works and > is the natural way to express it in E4X or ECMA Edition 4. I was refering to the case when property is not known at run time like in: function callit(object, method_name) { return object[method_name](); } then to prepare it for E4X I first thought it would have to be modified as: function callit(object, method_name) { if (typeof object == "xml") { return object[new QName(fn, method)](); } return object[method](); } which would be more verbouse then function callit(object, method_name) { return methods(object)[method](); } but then I forgot that :: can be followed by [] so callit would look like: function callit(object, method_name) { return object.function::[method](); } which is nice and I guess now function:: should go to Rhino as well.
(In reply to comment #63) > In SpiderMonkey's E4X, you can extract a method using the function:: namespace, It sems that CVS tip does not treat function:: specially if it starts expression like in: with (<xml>...</xml>) { function::name(); } Should FunctionStmt be XML-ised as well?
1) Propagate the TCF_HAS_DEFXMLNS flag from the parser to the constant folder, so that the latter doesn't fold XML literals whose default namespace can't be known till runtime. 2) The xml_mark object-op must call the native mark object-op (js_Mark), because XML objects are like native objects: they have scopes (almost always, a share in the XML.prototype object's scope), and they have slots (almost always, just the first four). /be
Attachment #169878 - Flags: review?(shaver)
Attached patch patch for bug pointed out in comment 71 (deleted) — — Splinter Review
This is a jsparse.c patch generated on top of the patch at attachment 169878 [details] [diff] [review]. Thanks, Igor! /be
Comment on attachment 169878 [details] [diff] [review] further fixes for dmose's JS component re-registration case This patch and the next one after it are checked into the cvs trunk now. /be
Attachment #169878 - Attachment description: furher fixes for dmose's JS component re-registration case → further fixes for dmose's JS component re-registration case
Another erratum: 13.3.5.4 [[GetNamespace]] 10.3.2.1 Step 6(c) NOTE does not say how a [prefix] containing "no value" (see http://www.w3.org/TR/xml-infoset 2.2.3) maps to x.[[Name]].[[Prefix]], but it is reasonable to assume it maps to the *undefined* prefix value mentioned various places in ECMA-357 as a distinct value from any string value, to mean "no prefix". 10.3.2.1 Step 6(h)(i)(1) specifies: 1. If the [local name] property of a is "xmlns" a. Map ns.prefix to the empty string Now 13.3.5.4 [[GetNamespace]] Step 3 includes this note: NOTE: implementations that preserve prefixes in qualified names may additionally constrain ns, such that ns.prefix == q.[[Prefix]] But "" != undefined, or *undefined* represented by null or any other non-void value that is distinct from any string when compared with == (note: it would be better if the spec used ===, since 0 == "" and 0 == "0"). Therefore, without further special-casing in [[GetNamespace]], 10.2.1 "ToXMLString Applied to the XML Type" step 11, given XML of the form: <t xmlns="http://foo.com"/> will call [[GetNamespace]] on x.[[Name]] = (prefix=undefined, localName="t") with argument { (prefix="", uri="http://foo.com") }. According to the NOTE in 13.3.5.4 Step 3, the singleton namespace won't match the QName because "" != undefined, so [[GetNamespace]] will create a new Namespace (prefix=undefined, uri="http://foo.com"). Then 10.2.1 Step 12 will generate an implementation-defined namespace prefix distinct from any prefix in { "", undefined }, sets namespace.prefix to that generated string, and unions namespace into namespaceDeclarations. The effect on the example ToXMLString input is the string: '<pfx:t xmlns:pfx="http://foo.com" xmlns:pfx2="http://foo.com"/>' (pfx2 and pfx may be the same generated string, if the prefix auto- generation algorithm deterministically computes a variation on a part of the namespace's uri.) Need more toXMLString tests! /be
Here's an even subtler suboptimality in the spec: 10.2.1 "ToXMLString applied to the String type" Step 12 Erratum: if both namespace.prefix and x.[[Name]].[[Prefix]] are undefined, we know that x was named using the default namespace (proof: see [[GetNamespace]] and the Namespace constructor called with two arguments). So we ought not generate a new prefix here, when we can declare namespace as the default namespace in x. The fix is to change Step 12 to something like this: 12. If (namespace.prefix == undefined), a. If (x.[[Name]].[[Prefix]] == undefined), i. Let namespace.prefix be the empty string b. Else ii. Let namespace.prefix be an arbitrary implementation defined namespace prefix, such that there is no ns2 in (AncestorNamespaces U namespaceDeclarations) with namespace.prefix == ns2.prefix c. [what was 12(b) goes here] This helps descendants inherit the namespace instead of redundantly redeclaring it with generated prefixes in each descendant. If I'm missing a good reason not to serialize default namespaces via default namespace declarations (xmlns=...) via toXMLString, let me know. With the code version of this fix in SpiderMonkey's E4X, the testsuite still passes fine (it fails to test toXMLString on a deep tree that uses only the default namespace). This case, from dmose, is even better: default xml namespace = "urn:ietf:params:xml:ns:caldav"; var queryXml = <calendar-query> <calendar-query-result/> <filter> <icalcomp-filter name="VCALENDAR"> <icalcomp-filter name="VEVENT"> </icalcomp-filter> </icalcomp-filter> </filter> </calendar-query>; print(queryXml.toXMLString()); Without the fix, you get: <caldav:calendar-query xmlns:caldav="urn:ietf:params:xml:ns:caldav"> <caldav:calendar-query-result xmlns:caldav="urn:ietf:params:xml:ns:caldav"/> <caldav:filter xmlns:caldav="urn:ietf:params:xml:ns:caldav"> <caldav:icalcomp-filter xmlns:caldav="urn:ietf:params:xml:ns:caldav" name="VCALENDAR"> <caldav:icalcomp-filter xmlns:caldav="urn:ietf:params:xml:ns:caldav" name="VEVENT"></caldav:icalcomp-filter> </caldav:icalcomp-filter> </caldav:filter> </caldav:calendar-query> Painful! With the fix, you get: <calendar-query xmlns="urn:ietf:params:xml:ns:caldav"> <calendar-query-result/> <filter> <icalcomp-filter name="VCALENDAR"> <icalcomp-filter name="VEVENT"></icalcomp-filter> </icalcomp-filter> </filter> </calendar-query> Nice! /be
The proposed fix in comment 76 needs fixing: 12. If (namespace.prefix == undefined), a. If (x.[[Name]].[[Prefix]] == undefined), i. Let namespace.prefix be the empty string ii. If there exists a member ns2 of namespaceDeclarations that has the empty string as its prefix 1. Remove ns2 from namespaceDeclarations b. Else ii. Let namespace.prefix be an arbitrary implementation defined namespace prefix, such that there is no ns2 in (AncestorNamespaces U namespaceDeclarations) with namespace.prefix == ns2.prefix c. [what was 12(b) goes here] This came up via bug 277779, which wants XML.prototype.setNamespace (at least; what about setName as well?) to auto-magically add to the declared namespaces, but not to duplicate default namespace declarations. /be
Blocks: 277887
Errata for E4X 13.1.2.1 isXMLName(value) The specs states: 2. If q.localName does not match the production NCName, return false but it does not define the meaning of "match". This is important since NCName, http://w3.org/TR/xml-names11/#NT-NCName, includes among other thinks characters beyond 64K limit, http://www.w3.org/TR/xml11/#NT-NameStartChar : NameStartChar ::= ... [#x10000-#xEFFFF] Since JS strings are just 2byte chunks, the specs should define the meaning of "match". IMO a clean way to deal with this is to require to interpret JS strings as UTF-16 encoding and to define what to do if the JS string is not a valid UTF-16.
Comment on attachment 166536 [details] [diff] [review] better approach rs=shaver
Attachment #166536 - Flags: superreview?(shaver) → superreview+
Latest thinking from ECMA TG1, or the subset working on E4X as it goes to ISO: the [[InScopeNamespaces]] internal property should be mapped from the XML Infoset's [in-scope namespaces], excluding the xml: prefix's namespace. Not mapped from the [namespace attributes] property, as ECMA-357 10.3.2.1 step 6(g-h) have it. TG1, or at least the subset consisting of Jeff Dyer of Macromedia and me, also wants to constrain [[InScopeNamespaces]] to include only *used* namespaces. This changes a bunch of the code I wrote in SpiderMonkey, and it probably changes the Rhino E4X implementation. It means that you can always compute the declared namespaces (returned by namespaceDeclarations, and serialized in the output of toXMLString) for an element are always the set difference of that element's in-scope namespaces minus the union of its ancestors' in-scope namespaces. So there's no need for the |declared| flag in JSXMLNamespace. More in a bit. /be
Need to finish this off for 1.8: - Revise namespace inheritance so [[InScopeNamespaces]] matches [in-scope namespaces] from XML-infoset (this bug). - Implement XML(document) for two-way DOM <-> E4X data binding (bug 270553). - Enable E4X by default for XUL contexts (bug not filed yet). /be
Target Milestone: mozilla1.8alpha4 → mozilla1.8beta2
Depends on: 289630
More ECMA-357 errata: I claim ECMA-357 has a spec bug in 10.2.1 "ToXMLString Applied to the XML Type": 17. For each an in attrAndNamespaces a. Let s be the result of concatenating s and the space <SP> character b. If Type(an) is XML and an.[[Class]] == "attribute" i. Let ans be a copy of the result of calling [[GetNamspace]] on a.[[Name]] with argument AncestorNamespaces The last line should read with argument AncestorNamespaces U namespaceDeclarations as namespaceDeclarations (the xmlns= and xmlns:prefix= attributes in this element) are in-scope for this element's attributes. I also claim 13.4.4.35 XML.prototype.setName and 13.4.4.36 XML.prototype.setNamespace are incomplete. They say absolutely nothing about updating [[InScopeNamespaces]]. Other parts of the spec manually update this set, but these methods assume that it magically keeps itself up-to-date with respect to x.[[Name]]. Warning: you cannot use AddInScopeNamespace easily from 13.4.4.{35,36} to fix this erratum. /be
errata nit: spelling "initializer" in 12.1 example on page 58, while rest of document uses "initialiser"
Section 10.1.2 ToString Applied to the XMLList Type of the specification says the following: Given an XMLList object l, ToString performs the following steps: !!!1. If l.hasSimpleContent() == true a. Let s be the empty string b. For i = 0 to l.[[Length]]-1, i. If x[i].[[Class]] &#8713; {"comment", "processing-instruction"} 1. Let s be the result of concatenating s and ToString(l[i]) c. Return s !!!2. Else a. Return ToXMLString(x) Where does the 'x' in 1.b.i and 2.a come from? I guess that should be i. If l[i].[[Class]] &#8713; {"comment", "processing-instruction"} and a. Return ToXMLString(l)
Blocks: 290393
Blocks: branching1.8
Blocks: 300861
No longer blocks: 300861
Flags: testcase?
we've got what we're going to get, I believe, so pulling this off of the branch blocking list.
No longer blocks: branching1.8
Attachment #166536 - Flags: review?(john.schneider)
Flags: testcase? → testcase-
I see bug 321547, bug 321549 in E4X implementation
Depends on: 336921
QA Contact: pschwartau → general
WHy is this bug still open rather than marked as fixed?
Not because of bug 349263, that can wait. It must be due to bug 270553. /be
Since we've tracked errata here so far... The specification currently says that, during parsing of an XML literal or a string passed to XML(), if XML.ignoreWhitespace is true whitespace-only text nodes are removed from the resultant XML object; it does not say that leading and trailing whitespace in XML text nodes should be trimmed. Per discussion in bug 369394, this is a mistake in the specification -- when XML.ignoreWhitespace is true, leading and trailing whitespace should be stripped from XML text nodes.
Someone else should lead the charge here; I've got bigger fish to fry. /be
Assignee: brendan → general
Status: ASSIGNED → NEW
Depends on: 410192
Depends on: 410263
Depends on: 410366
Depends on: 355422, 355674, 381197, 382339, 384435
Depends on: 410740
Blocks: 435398
Depends on: 452729
Depends on: 490758
E4X will be removed again from Spidermonkey (bug 788293)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
No longer depends on: 349263
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: