Closed Bug 113611 Opened 23 years ago Closed 22 years ago

branch for xpath context, node tests and namespace handling

Categories

(Core :: XSLT, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.1alpha

People

(Reporter: axel, Assigned: axel)

References

Details

(Keywords: perf, Whiteboard: XPATH_CONTEXT_113611_2_BRANCH (XPATH_CONTEXT_113611_2_MERGE_20020603 trunk))

Attachments

(4 files, 4 obsolete files)

As the xpath context and the namespace handling are strongly dependent of each other, we're putting this on a branch. We put in the node tests, too, as they pretty much depend on the namespace foo, and change quite a bit. Axel
tagged extensions/tranformiix/source/xpath with XPATH_CONTEXT_113611_BASE, tag -b XPATH_CONTEXT_113611_BRANCH
Status: NEW → ASSIGNED
QA Contact: kvisco → axel
Whiteboard: XPATH_CONTEXT_113611_BRANCH, xpath
landed the xpath part of attachement 60321 of bug 102293 on the branch
tagged Jonas' stuff with JONAS_60321. So one gets his xpath part of the diff with cvs diff -u -rXPATH_CONTEXT_113611_BASE -rJONAS_60321
checked in my current state. It lacks documentation, I still want to write a html giving the reasons for all those levels. I have prepared a set of optimisations, which aren't in yet. There is currently no way to parse a pattern. That is, the ExprParser::createPattern won't create something that you can call matches() on. I tend to create a txPatternParser, inheriting from ExprParser. Anyway, this is as far as I get now, I do need to get home. Short: Pattern will die. txPattern is the one we want. Expr will only have an evaluate method. Once the expressions come down to the LocationStep level, you loose the context. foo/bar/baz never really creates a context set for the bar level. The context set is created completely new once you get down to a Predicate. Therefor I added a txStep class, which has a evalStep method, which does just this. At this level, the grammars for Pattern and Expr merge, so those objects are txPatterns, too. I added another eval method to PredicateList, mainly because there are nice shortcuts in the case of just one predicate. That is used for matches() only.
synched the branch with output and nodeset rewrite. the base directory is now mozilla/extensions/transformiix, and to ease that, I created XPATH_CONTEXT_113611_2_BASE and XPATH_CONTEXT_113611_2_BRANCH. I gave a first stab at the mac project file, too. Nothing beyond xpath builds right now, next on the sleave is changing receiveError to take a nsresult instead of a enum, and to make the NodeTests depend on LocationStep again. Jonas didn't like my attempt to have them full expr/patterns at all.
receiveError is changed, I added two XPATH errors to txError.h, too, +#define NS_ERROR_XPATH_EVAL_FAILED NS_ERROR_FAILURE +#define NS_ERROR_XPATH_INVALID_ARG NS_ERROR_INVALID_ARG currently wired to standard nsresults, we don't have an error module yet.
xslt/functions builds. Actually I'm gonna head for "build this" first, then I'll start modifying.
xslt/Numbering.* compiles.
actually, all in standalone but XSLTProcessor compiles now. Missing are two pieces of new code, the pattern objects and the pattern parser. Should those go to xslt or xpath? Should the pattern parser be a member of ExprParser, or inherit (making most private methods of ExprParser protected) I personally favor putting that to xslt, I don't think adding inheritance to txPatternParser will bloat the binary.
What's up with eval contexts? http://www.axel.pike.org/xslt/Callgraph-XSLTProcessor.xml shows what XSLTProcessor is doing right now. A good deal of which is bad. The context node set is only changed for xsl:apply-templates and xsl:for-each. eval context should be handled there as well as iterating the context node. That shouldn't happen in processMatchedTemplate at all. Now there are three ways for getting the txIEvalContext around. 1) ProcessorState implements txIEvalContext. pros: easy to do, the argument aPs just stays in calls to evaluate. cons: alot of logic in the ProcessorState, for example a stack of txNodeSetContexts. Plus functionality of moving the context node, so the interface get's large. 2) Another argument of a txIEvalContext in a bunch of places. Ok, I didn't evaluate bunch, but I guess it's all over the place. 3) Just store the current txIEvalContext in ProcessorState, and cache the previous context when creating a new one, and resetting the eContext when done. The arguments to evaluate would then be a aPs->getEvalContext(); I'm favoring 3 right now. (On a side note, aPs->findTemplate might wanna be in processMatchedTemplate. And void XSLTProcessor::process(Node* node, const String& mode, ProcessorState* ps) should die)
> What's up with eval contexts? > http://www.axel.pike.org/xslt/Callgraph-XSLTProcessor.xml shows what > XSLTProcessor is doing right now. A good deal of which is bad. > The context node set is only changed for xsl:apply-templates and xsl:for-each. > eval context should be handled there as well as iterating the context node. > That shouldn't happen in processMatchedTemplate at all. why not, processMatchedTemplate is part of the code for xsl:apply-templates since we have that code in two places (one for default-templates). It's also used by xsl:apply-imports even though that's not strictly neccesary, but xsl:apply-imports does share most of it's logic with xsl:apply-templates > Now there are three ways for getting the txIEvalContext around. I don't fully understand the issue, so if i don't make sence then that's why :) > 1) ProcessorState implements txIEvalContext. > pros: easy to do, the argument aPs just stays in calls to evaluate. > cons: alot of logic in the ProcessorState, for example a stack of > txNodeSetContexts. Plus functionality of moving the context node, so the > interface get's large. One could take that as an indication that iterating the context-nodeset shouldn't live in the context ;-) > 2) Another argument of a txIEvalContext in a bunch of places. Ok, I didn't > evaluate bunch, but I guess it's all over the place. Could you be a bit more specific as to where the context needs to be forwarded? Is it basically around between the processAction/processChilren/processTemplate functions? If so then shouldn't it be possible to forward that around instead of the current aNode? Hmm.. will the current() function cause this solution any problems (or worse?) > 3) Just store the current txIEvalContext in ProcessorState, and cache the > previous context when creating a new one, and resetting the eContext when > done. > The arguments to evaluate would then be a aPs->getEvalContext(); 1 and 3 seems best to me. IIRC i saw that in your code the ProcessorState implements nsIXPathPatternContext (don't remember the exact name), if so it feels more right to have it implement the eval-context too, no? Other wise you'd have two separate nsIXPathPatternContext implementations which sounds like a source of trouble. > (On a side note, aPs->findTemplate might wanna be in processMatchedTemplate. > And > void XSLTProcessor::process(Node* node, > const String& mode, > ProcessorState* ps) > should die) YES, PLEASE! Please merge it with ::startTransform. After the vars-landing we should be able to move the setting of the top context-node/nodeset there too.
> why not, processMatchedTemplate is part of the code for xsl:apply-templates Well, the issue is: in the callers to processMatchedTemplate you iterate over the nodeset, take that node, pass it over to processMatchedTemplate, which puts that node in the context. You're tearing one logical action into two functions. The logic is, for each node in your context, find the template and process it. > One could take that as an indication that iterating the context-nodeset > shouldn't live in the context ;-) It isn't. Those parts that need to iterate the context nodes within the context node set use txNodeSetContext, which provides extra functionality for iterating. txIEvalContext doesn't have any logic to modify it's context. If you need that, you create a new context forwarding the calls that belong to txIMatchContext to your parent. That way you can guarantee the state of the context that you created. > Hmm.. will the current() function cause this solution any problems (or worse?) the current() function will need the current txIEvalContext of the XSLTProcessor, so I'll go with a member for that in ProcessorState. 3) it is. A bit more precision about txIMatchContext and txIEvalContext. txIMatchContext is the context you need for matching XSLT patterns, but even more so, it's the part of txIEvalContext that doesn't change during the evaluation of a single expression (context node set changes from Step to Step) That's why it's pretty sane for ProcessorState to implement txIMatchContext. It's pretty sick to make it implement txIEvalContext, I even think it would break severely. Unless you make it behave like it didn't implement it.
moved patterns to xslt, killed txStep (which was wrong), the Pattern and ExprPattern typedefs, added a txPatternParser, made |txNodeTest|s not be Patterns or Exprs, like Jonas originally did it. Need to resolve IdKeyPattern production, and prolly I'm going to roll back the mFilterExpr in PathExpr, now that txStep is gone. Need the txIEvalContext in ProcessorState to get things linking.
ok, here is the current status: the branch compiles on unix for module and standalone. it runs the xalan tests in standalone with just two regressions: QNames in variables, that's up to Jonas, and idkey in patterns. I need some help in building on windows and see if other crash on copy16 and copy30. copy16 crashes somewhere in Document::getElementById, not that I touched that. I need to merge again, too. And I'll see what the diffs look like.
merged to the tip, tagged with XPATH_CONTEXT_113611_2_MERGE1
IdKeyPattern landed. namespaces in AVT got fixed. Now the stuff is ready to get reviewed. Or like xregress.pl says it, "position69 now succeeds" bad impact: xsl:variable with QNames regressed, bug 117658 should handle that issue good impact: - namespaces work now. Stuff like <afoo xmlns:bar="http://www.w3.org/1999/XSL/Transform" foo="{element-available('bar:output')}" /> does not confuse us anymore. We are better than others here. - SPEED. veiligheid test is twice as fast, large stylesheets like docbook REALLY rock, we have a factor of 8-10 here. The most significant impact is findTemplate returning early if a match was found and calculating the default priority only once. This has to put more work on the addTemplate function, plus infrastructure to handle UnionPatterns right. /me lays back and gets ready for the flood.
Whiteboard: XPATH_CONTEXT_113611_BRANCH, xpath → XPATH_CONTEXT_113611_2_BRANCH, transformiix
Attached patch patch of branch vs. trunk (obsolete) (deleted) — Splinter Review
merged darins checkins to the branch and created a diff.
adding dependencies, note that this patch obsoletes the wallpaper in bug 110266
Blocks: 40672, 96410, 110266
this bug should make it for 1.0. Big speed and conformance impact here.
Keywords: mozilla1.0
Target Milestone: --- → mozilla1.0
Comments so far (i'm about half-way through) General comments: I don't like the nsIMatchContext name. It looks very strange that we deal with match-contexts during evaluations. How about renaming it to nsIStaticContext or some such? The way the parse-context is handled in the parser is very dangerous. You turn the context into a state of the parser and it would break horribly if any recursion was done. Please either make the parser stateless again by making aContext an argument to all functions and forward it everywhere. Or make it "backup" the old mContext before setting it and then restore it on exit. Or at least add an assertion before setting mContext to make sure that it always is null. Why make PathExpr::evalStep recursive? That's bad both perf-wise and codecomplexity-wise. I didn't review XPathProcessor.cpp since that'll be depecated soon. Code comments: -ExprResult* AttributeValueTemplate::evaluate(Node* context, ContextState* cs) { - ListIterator* iter = expressions.iterator(); +ExprResult* AttributeValueTemplate::evaluate(txIEvalContext* aContext) +{ + txListIterator iter(&mExpressions); String result; - while ( iter->hasNext() ) { - Expr* expr = (Expr*)iter->next(); - ExprResult* exprResult = expr->evaluate(context, cs); + while (iter.hasNext()) { + Expr* expr = (Expr*)iter.next(); + ExprResult* exprResult = expr->evaluate(aContext); need a null-check here + // check left expression for early decision + if ((mOp == OR) && (lval)) + return new BooleanResult(MB_TRUE); + if ((mOp == AND) && (!lval)) + return new BooleanResult(MB_FALSE); ugh, overuse of () Do we still need/use ErrorFunctionCall? + txNameTest(String& aPrefix, String& aLocalName, PRUint32 aNSID, + Node::NodeType aNodeType); Why not atomize aLocalName and aPrefix here already? namespace-ids are PRInt32 - VariableRefExpr(const String& name); + VariableRefExpr(const String& aPrefix, + const String& aLocalName, PRInt32 aID); Atomize aPrefix and aLocalName. aID->aNSID or some such. - /** - * Selects from the descendants of the context node - * all nodes that match the Expr - * -- this will be moving to a Utility class - **/ - void evalDescendants(Expr* expr, - Node* context, - ContextState* cs, - NodeSet* resNodes); + /** + * Selects from the descendants of the context node + * all nodes that match the Expr + * -- this will be moving to a Utility class + **/ + void evalDescendants(Expr* aStep, Node* aNode, + txIEvalContext* aContext, + NodeSet* resNodes); that last **/ looks strange, and please remove the utility-class comment 'cause it won't. (right?) - expr = new VariableRefExpr(tok->value); + { + String prefix, lName; + nsresult res = NS_OK; + PRInt32 nspace; + res = resolveQName(tok->value, prefix, lName, nspace); + if (NS_FAILED(res)) { + // XXX error report namespace resolve failed + return 0; + } + expr = new VariableRefExpr(prefix, lName, nspace); + } Why not report the error now that you have a context? No need to set res to NS_OK initially. Please name res rv, that's the mozilla-convetion. - //-- Most likely an Extension Function, or error, but it's - //-- not our job to report an invalid function call here - fnCall = new ExtensionFunctionCall(fnName); + txAtom* name; + PRInt32 namespaceID; + int idx = tok->value.indexOf(':'); + if (idx >= 0) { + String nameStr, prefixStr; + tok->value.subString(idx+1, nameStr); + name = TX_GET_ATOM(nameStr); + + tok->value.subString(0, idx, prefixStr); + txAtom* prefix = TX_GET_ATOM(prefixStr); + res = mContext->resolveNamespacePrefix(prefix, namespaceID); + // XXX report error + TX_IF_RELEASE_ATOM(prefix); + } + else { + name = TX_GET_ATOM(tok->value); + namespaceID = kNameSpaceID_None; + } Use resolveQName instead. + nsresult res = NS_OK; + PRInt32 nspace; + res = resolveQName(tok->value, prefix, lName, nspace); + if (NS_FAILED(res)) { + // XXX error report namespace resolve failed + return 0; + } + switch (axisIdentifier) { + case LocationStep::ATTRIBUTE_AXIS: no need to set res=NS_OK initially. res->rv +nsresult ExprParser::resolveQName(const String& aQName, + String& aPrefix, String& aLocalName, + PRInt32& aNamespace) +{ + nsresult res = NS_OK; atomize aPrefix and aLocalName. Don't initialize res to NS_OK. res->rv +NodeSet* FunctionCall::evaluateToNodeSet(Expr* aExpr, txIEvalContext* aContext) { NS_ASSERTION(aExpr, "Missing expression to evaluate"); - ExprResult* exprResult = aExpr->evaluate(aContext, aCs); + ExprResult* exprResult = aExpr->evaluate(aContext); if (!exprResult) return 0; if (exprResult->getResultType() != ExprResult::NODESET) { String err("NodeSet expected as argument"); - aCs->recieveError(err); + aContext->receiveError(err, NS_ERROR_XPATH_EVAL_FAILED); NS_ERROR_XPATH_INVALID_ARG seems better? +void LocationStep::fromDescendants(Node* node, txIMatchContext* cs, + NodeSet* nodes) plase use an txIEvalContext instead - if (!context || !expressions.getLength()) - return new StringResult("error"); + if (!aContext || (expressions.getLength() == 0)) { + NS_ASSERTION(aContext, "malformed PathExpr"); + return new NodeSet(); + } return StringResult("error"). That's a search-n-replace to 0 once all our |evaluate| clients properly null-check the return-value In PathExpr: - NodeSet* nodes = new NodeSet(context); + NodeSet* nodes = new NodeSet(aContext->getContextNode()); |nodes| can be created on the stack now that it's not the returnvalue. +void PathExpr::evalDescendants (Expr* aStep, Node* aNode, + txIEvalContext* aContext, + NodeSet* resNodes) +{ + NodeSet set(aNode); + txNodeSetContext eContext(&set, aContext); + eContext.next(); + ExprResult *res = aStep->evaluate(&eContext); why not simply use aContext? +ExprResult* RootExpr::evaluate(txIEvalContext* aContext) { - if (!context) - return new StringResult("error"); + Node* context; + if (!aContext && !(context=aContext->getContextNode())) { + NS_ASSERTION(0, "internal error"); + return 0; + } + context = aContext->getContextNode(); + if (!context) + return 0; that first |if| isn't nice :( and most of it is duplicated in the second |if| simply an assertion should be enough + virtual NodeSet* getContextNodeSet() = 0; this function never seems used Please add comments to the functions in txIXPathContext.h +MBool txNameTest::matches(Node* aNode, txIMatchContext* aContext) +{ + if (!aNode || aNode->getNodeType() != mNodeType) + return MB_FALSE; + + // Totally wild? + if (mLocalName == txXPathAtoms::_asterix && + (kNameSpaceID_None == mNamespace)) + return MB_TRUE; Don't assume that mNamespace is kNameSpaceID_None if and only if there is no prefix. Check for the precense of a prefix instead. +double txNameTest::getDefaultPriority() +{ + if (mLocalName == txXPathAtoms::_asterix) { + if (kNameSpaceID_None == mNamespace) same here +void txNameTest::toString(String& aDest) +{ + if (kNameSpaceID_None != mNamespace) { and here +PRUint32 txNodeSetContext::position() +{ + NS_ASSERTION(mPosition, "Should have called next() at least once"); + return mPosition; +} inline this instead since it's used non-virtually in predicates. Not sure if that applies to any more functions in txNodeSetContext + nsresult res = NS_ERROR_NULL_POINTER; + if (mInner) { Just assert for mInner instead. It should never be null + MBool hasNext() + { + return mPosition < size(); + } + void next() + { + NS_ASSERTION(mPosition < size(), "Out of bounds."); + mPosition++; + } merge these two functions into one that steps one step forward and returns false when it walks off the end of the list. +MBool txNodeTypeTest::matches(Node* aNode, txIMatchContext* aContext) +{ + if (!aNode) + return MB_FALSE; + + Node::NodeType type = (Node::NodeType)aNode->getNodeType(); + + switch (mNodeType) { + case COMMENT_TYPE: + return Node::COMMENT_NODE == type; please make that |type == Node::COMMENT_NODE|. The other way always makes me very confused. (same applies to a few other places in that function)
merged changes to the trunk up to (and including) XPathDOM. That misses contexts, though. tagged the trunk with XPATH_CONTEXT_113611_2_MERGE_20020314 for later merging fun
- String valueAttr; - xslNumber->getAttr(txXSLTAtoms::value, kNameSpaceID_None, - valueAttr); + Expr* expr = ps->getExpr(xslNumber, ProcessorState::ValueAttr); //-- check for expr - if (!valueAttr.isEmpty()) { - Expr* expr = ps->getExpr(xslNumber, ProcessorState::ValueAttr); - if (!expr) - return; + if (expr) { getExpr will report an error if the attribute is missing, use hasAttr instead.
re #23, getExpr does not report on error if the attribute isn't there, but returns 0. It's just bad to get the attribute twice, getExpr should handle that.
re #21 > I don't like the nsIMatchContext name. It looks very strange that we deal with > match-contexts during evaluations. How about renaming it to nsIStaticContext > or some such? Static is worse, IMHO. It's a XSLTism, I agree, but that's the context that you use for pattern matching. You don't static anything ;-). And it's not something static either, I guess it's more a const. But basically I don't think that an access specifier is a qualifying name. How about making it a native american? txITheContextThatDancesWithWolfesAndDoesn'tChangeWhenEvaluatingSteps The parse context in ExprParser is fine, you should always work on the lexer and not a string argument internally. I'll add that assertion though. > Why make PathExpr::evalStep recursive? That's bad both perf-wise and > codecomplexity-wise. That keeps track of the right txIEvalContext for the evaluation of a step. > I didn't review XPathProcessor.cpp since that'll be depecated soon. died already. > Code comments: > +ExprResult* AttributeValueTemplate::evaluate(txIEvalContext* aContext) we shouldn't push 0, this is internal, so no need to do this. Fixing AVTs is not part of this bug, if you want that, file a separate one. (Yes, they could be better, removing AttributeValueTemplate completely) > Do we still need/use ErrorFunctionCall? unimplemented XSLT functions do I guess I gonna atomize resolveQName, and it's friends (like the constructors and resolving functions can use that, too) > Why not report the error now that you have a context? No need to set res to > NS_OK initially. Please name res rv, that's the mozilla-convetion. fixing error reporting is a different bug. I'll switch to rv, but keep initializing it. > plase use an txIEvalContext instead not without a reason. What you need in that function is a txIMatchContext, so that's the argument to use txIMatchContext. > +void PathExpr::evalDescendants (Expr* aStep, Node* aNode, > + txIEvalContext* aContext, > + NodeSet* resNodes) > +{ > + NodeSet set(aNode); > + txNodeSetContext eContext(&set, aContext); > + eContext.next(); > + ExprResult *res = aStep->evaluate(&eContext); > > why not simply use aContext? I'll ponder > + virtual NodeSet* getContextNodeSet() = 0; > this function never seems used nice observation ;-), should I remove it? > please make that |type == Node::COMMENT_NODE|. The other way always makes me > very confused. (same applies to a few other places in that function) confused me too, but I changed my style to have the non-lvalue on the left, so I can't accidently hork == to =. May I? Stuff not answered is either agreed on or waiting for peterv's comments to make up my mind.
> The parse context in ExprParser is fine, you should always work on the lexer > and not a string argument internally. I'll add that assertion though. I didn't even think of that, i guess that is why you had to add a lexer to the AVT parser? My concern was if somebody calls the lexer during, for example, resolveFunctionCall. Then you'd loose the current parse-context and end up using a null one. But since noone is doing that now it should work anyway. > > Why make PathExpr::evalStep recursive? That's bad both perf-wise and > > codecomplexity-wise. > That keeps track of the right txIEvalContext for the evaluation of a step. What is the "right" vs. "wrong" eval-context? And why? You do know that predicates don't affect PathExprs, only FilterExprs/Steps, right? The way you do it now you could end up evaluating a series of step seval times for the same node since you don't merge nodesets at each step. > > Code comments: > > +ExprResult* AttributeValueTemplate::evaluate(txIEvalContext* aContext) > we shouldn't push 0, this is internal, so no need to do this. > Fixing AVTs is not part of this bug, if you want that, file a separate > one. > (Yes, they could be better, removing AttributeValueTemplate completely) Sorry, didn't paste enough. I ment that you should check the resulting exprResult for null. Yes I know that the old code didn't do it either. > > Do we still need/use ErrorFunctionCall? > unimplemented XSLT functions do Just return null if the function doesn't exist. There's no point in parsing an expression that can't be evaluated anyway. > > Why not report the error now that you have a context? No need to set res to > > NS_OK initially. Please name res rv, that's the mozilla-convetion. > fixing error reporting is a different bug. The reason I didn't do errorreporting for the old code is that it was impossible since there was nothing to report the error to. However since there is now I don't see a reason not to do it in new code. But if peterv is ok with pushing it to when we fix the rest of the errorreporting in the parser then I guess it's ok. > > plase use an txIEvalContext instead > not without a reason. What you need in that function is a txIMatchContext, so > that's the argument to use txIMatchContext. Ok, since that context is only used for evaluating the node-tests I see good reason. > > + virtual NodeSet* getContextNodeSet() = 0; > > this function never seems used > nice observation ;-), should I remove it? Please do, if we need it later then we add it then. > > please make that |type == Node::COMMENT_NODE|. The other way always makes me > > very confused. (same applies to a few other places in that function) > confused me too, but I changed my style to have the non-lvalue on the left, so > I can't accidently hork == to =. May I? I'd rather not, single = should give compile-warning. But I'm not religious.
checked in changes to makefile.wins, tested for standalone. module needs verification
did: res->rv, resolveQName with atoms, txNameTest, VariableRefExpr constructor with atoms, ExprParser::createFunctionCall with resolveQName, txNameTest needs to check for no prefix for totally wild
hmm.. the DOM-XPath spec uses |null| as return-value when no mapping is found in the XPathNSResolver. This means that in both XSLT1 and DOM-XPath the resolved namespace is null if and only if there is no prefix. So the code in txNameTest will work the way it is now, so I'm ok with it. We should be aware of this though since our internal interfaces allows a prefix to be mapped to the null namespace (we return kNameSpaceID_Unknown on error), and in XPath2 this will not work.
killed PathExpr::evalStep
checked in the contexts for the DOM level 3 XPath implementation
*sigh* i just lost a big chunk of comments in a crash. /me curses mozilla + txList simpleMatches; + pattern->getSimplePatterns(simpleMatches); You'd save a few cycles and a few bytes if you didn't do this for the, probably common, case when all "sub patterns" in a UnionPattern have the same priority. You could do this by having txUnionPattern::getDefaultPriority return NaN only when the patterns don't have the same priority. And then use that (and !hasPriority) to determin if getSimplePatterns needs to be called. The splitup txUnionPattern is leaked. - MatchableTemplate* templ = new MatchableTemplate; + txPattern* root = new txRootPattern(); + MatchableTemplate* templ = new MatchableTemplate(aStylesheet, + root, + Double::NaN); if (!templ) { // XXX ErrorReport: out of memory return; } - - templ->mTemplate = aStylesheet; - String match("/"); - templ->mMatch = exprParser.createPattern(match); - if (templ->mMatch) - templates->add(templ); - else - delete templ; + templates->add(templ); nullcheck the new txRootPattern. I think you need to priority-sort the new template since AFAICT there is nothing that prohibits LRE stylesheets to be imported. in getExpr: + if (!hasAttr) + return 0; How would a client know if the returned null is due to a parse-failure or a missing attribute? Have you checked all callers that don't allow missing attributes so that they report the error. IMHO it's better to use a hasAttr before using getExpr in the few cases where the attribute is optional. hasAttr should be very cheap, especially if we make it use nsIContent::HasAttr. Same in getPattern AttributeValueTemplate* avt = - exprParser.createAttributeValueTemplate(aAttValue); + exprParser.createAttributeValueTemplate(aAttValue, &pContext); + if (!avt) { - // XXX ErrorReport: out of memory + // shortcut, this is just a regular string + aResult.append(aAttValue); return; } Hmm.. i see no such change in ::createAttributeValueTemplate. avt should only be null if an error occured; the avt didn't parse or if we ran out of memory. + if (kNameSpaceID_None != aID) { + return NS_OK; please switch places around the !=, you confuse my brain. And return an error, there's no point in continuing parsing if the function is unknown. You should set aFunction to null as well. Same at the bottom of the function. + if (CHECK_FN(unparsedEntityUri)) { + err = "unparsed-entity-uri function not yet implemented"; + aFunction = new ErrorFunctionCall(err); + return NS_ERROR_NOT_IMPLEMENTED; never return an error and a value. XSLTProcessor is next
> nullcheck the new txRootPattern. I think you need to priority-sort the new > template since AFAICT there is nothing that prohibits LRE stylesheets to be > imported. err, i mean "included"
Attached patch branch status as of today (obsolete) (deleted) — Splinter Review
updated patch, not all review comments are in, merged the checkins to the trunk into this patch. Didn't check them in to the branch yet
Attachment #72805 - Attachment is obsolete: true
I should have put that in here, too, so citing http://bugzilla.mozilla.org/show_bug.cgi?id=94036#c3 "I can wholeheartly change the signature to nsresult getPattern(Element*, PatternAttr, txPattern*&);" is my way to answer > in getExpr: > + if (!hasAttr) > + return 0; ... of #32
- txNameTestItem* nti = new txNameTestItem(name, aShouldStrip); + String prefix, lname; + PRInt32 aNSID = kNameSpaceID_None; + XMLUtils::getPrefix(name, prefix); + if (!prefix.isEmpty()) { + txAtom* prefixAtom = TX_GET_ATOM(prefix); + aNSID = aElement->lookupNamespaceID(prefixAtom); + TX_IF_RELEASE_ATOM(prefixAtom); + } + XMLUtils::getLocalPart(name, lname); + txNameTestItem* nti = new txNameTestItem(prefix, lname, aNSID, + aShouldStrip); you could maybe use txExpandedName to parse the qname here. + // "node()" + txNodeTest* nt = new txNodeTypeTest(txNodeTypeTest::NODE_TYPE); + mNodeExpr = new LocationStep(nt, LocationStep::CHILD_AXIS); need out-of-memory test on |nt| - - //-- push nodeSet onto context stack - aPs->getNodeSetStack()->push(nodeSet); + txNodeSetContext evalContext(nodeSet, aPs); + txIEvalContext* priorEC = + aPs->setEvalContext(&evalContext); for apply-templates you didn't change the evalcontext until after sorting, any reason why it's different in for-each? How about removing the |aNode| argument to processAction/Children/ChildrenAsValue/Template/MatchedTemplate/DefaultTemplate/ TemplateParams/Variable/yay-we-have-many-process-functions? It is now very seldomly used, and having the same value propagated in two places in parallell always makes me worried. Since the old code did this too it is ok with to do it some other time, but it would be nice to do. txPatternParser is next...
+txPattern* txPatternParser::createPattern(const String& aPattern, + txIParseContext* aContext, + ProcessorState* aPs) +{ + mContext = aContext; + mProcessorState = aPs; As in the expression-parser I'm not overly crazy about haveing the context as state. But if you absolutly wanna do it then please add assertions to protect against bad codepaths/recursion. +nsresult txPatternParser::createUnionPattern(ExprLexer& aLexer, + txPattern*& aPattern) +{ + nsresult rv = NS_OK; + txPattern* locPath = 0; init aPattern to 0 in case something fails further down. + #if 0 // XXX addPattern can't fail yet, it doesn't check for mem + if (NS_FAILED(rv)) { + delete unionPattern; + delete locPath; + return rv; + } + #endif remove the |#if 0|. No reason to wait until addPattern can do proper errorchecking + } while (Token::UNION_OP == type); + + if (Token::END != type) { please switch these around
merged spell and string fixes in source/base and the death of nsAppShellCIDs.h in source/xml/parser
The patterparser is compleatly lacking error-reporting. Do you want to do that later? If so, how and where would errorreporting work? The ExprParser also always makes sure that when the parsing stops due to error the lexer is always positioned right before the token where parsing failed, which might be something that is good if the PatterParser also does. +nsresult txPatternParser::createLocPathPattern(ExprLexer& aLexer, + txPattern*& aPattern) +{ init aPattern to 0 in case something fails further down (i guess this applies to all txPatternParser functions) + short type = aLexer.peek()->type; + switch (type) { + case Token::ANCESTOR_OP: + isChild = MB_FALSE; + aLexer.nextToken(); + break; The old code used to ensure that "//foo" only matched <foo>s that were in the main tree (i.e. added a RootExpr first in the expr-list). Not sure if we care about that or not? IMHO it's not really importat to optimize "stupid" expressions like that. + case Token::PARENT_OP: + aLexer.nextToken(); + isAbsolute = MB_TRUE; + if (Token::END == aLexer.peek()->type) { + aPattern = new txRootPattern(); + return aPattern ? NS_OK : NS_ERROR_OUT_OF_MEMORY; + } you need to at least check for UNION_OP here too, not sure if there are any more. I would rather not have a list "ending tokens" that determines if the function should return, a better way IMHO is to have a list of tokens that determines if parsing in this function should continue. This was one of the main problems with the pre-rewrite-ExprParser and caused a lots of hangs and parse-errors. However patterns have a bit simpler syntax... + if (txXPathAtoms::id == nameAtom) { you have tons of these :( + if (Token::CNAME == tok->type) { + // resolve QName + String prefix, lName; + nsresult res = NS_OK; res -> rv txXSLTPatterns.cpp next..
> you need to at least check for UNION_OP here too, not sure if there are any > more. I would rather not have a list "ending tokens" that determines if the > function should return, a better way IMHO is to have a list of tokens that > determines if parsing in this function should continue you could simply call the |isLocationStepToken| function
+void txRootPattern::toString(String& aDest) +{ + #ifdef DEBUG + aDest.append("txRootPattern{"); + #endif + aDest.append("/"); you need to do the "should serialize" trick here so that "/foo/bar" don't get serialized as "//foo/bar". (or make all txPattern::toString debug-only, which IMHO would be nice) + const nsString ids = mIds.getConstNSString(); do we really need this string-copy? couldn't you use mIds.getConstNSString() directly instead? Or was |const nsString& ids =..| the intention? + PRBool found = FindCharInReadable(space, pos, end); + + while (found) { + if (value.Equals(Substring(begin, pos))) { + return MB_TRUE; + } + while (space == *pos) { + ++pos; // skip ' ' + } + begin = pos; + found = FindCharInReadable(PRUnichar(' '), pos, end); + } You need to handle other whitespace the ' '. Could do |while (FindCharInReadable (...))| instead of the double Find. A nice thing to do speed-wise would be to split up the strings in the ctor, but that could definitely wait. Hmm.. you could normalize whitespace in the ctor though, that should be pretty easy, and that way you'd only have to search for ' '. + aDest.append("id("); + aDest.append(mIds); + aDest.append(")"); need "'"s around the literal. Same in keys. + if (!nodes.contains(aNode)) { + return MB_FALSE; + } You could do this faster by making the |switch(exprResult->getResultType())| set a |MBool add| variable and then do |if (!add && predContext.getContextNode () == aNode) return MB_FALSE| in the inner loop. Though the win isn't as big once attributenodes behave properly. Looks like txNodeSetContext::getContextNode could benefit from inlineing as well. in txStepPattern::matches: + } +} // matches add some foo here to make sure that we don't get bogus warnings. Maybe change the switch to an if and move out the |default:|? + * @return The current node of the context node set + */ +ExprResult* CurrentFunctionCall::evaluate(txIEvalContext* aContext) the comment isn't quite right. Should be something like "NodeSet containing the current XSLT node" Key* key=(Key*)iter.next(); - if (key->matchPattern->matches(aNode, 0, mProcessorState)) { + if (key->matchPattern->matches(aNode, aContext)) { NodeSet contextNodeSet(aNode); - mProcessorState->getNodeSetStack()->push(&contextNodeSet); - mProcessorState->pushCurrentNode(aNode); - ExprResult* exprResult = key->useExpr->evaluate(aNode, mProcessorState); - mProcessorState->popCurrentNode(); - mProcessorState->getNodeSetStack()->pop(); + txNodeSetContext evalContext(&contextNodeSet, aContext); + evalContext.next(); + ExprResult* exprResult = key->useExpr->evaluate(&evalContext); You need to set the "current()" node so the ProcessorState is still needed. Either as an argument or as a member
- String expr("."); - ExprParser parser; - mDefaultExpr = parser.createExpr(expr); + txNodeTest* test = new txNodeTypeTest(txNodeTypeTest::NODE_TYPE); + mDefaultExpr = new LocationStep(test, LocationStep::SELF_AXIS); nullcheck |test| + mEContext = new EvalContext(aNodes, mPs); + need to set this as current evalcontext in the ProcessorState so that "current()" and AVTs work (hmm.. i wonder if adding a txIEvalContext-argument to processAttrValueTemplate would be a good idea?). And please don't make the sorter statefull by adding the context as a member. Forward it to compareNodes as an argument instead, that way you can create it on the stack. I don't like the EvalContext class :(. I think it's making things much more complicated and error-prone then needed. This way you force the first argument to compareNodes to always be the same node as stored in iterating part of the context. The entire class can be replaced with a setContext(Node*) function in txForwardContext, which IMHO would make the logic much simpler and also remove any extra requirements on the SortableNode-arguments to compareNodes. Hey, that's all!! Even though I have a lot of comments I really do like the patch. Having the context-node as a member in the context turned out much nicer then i thought, and of course getting our namespace act together is more then nice. The part that i'm mostly worried about functional-wise is the new PatternParser (since it's code) and the added states in some of the classes. However I think the risk of landing this is much lower just looking at the patchsize indicates. The flow in the code doesn't really change that much and most changes are checked at compiletime. And of course, the xalan-testsuit is really good against regressions.
Attached file Test case for namespace resolver (deleted) —
Simple test case to show that namespace prefixes in the path expression are not being resolved.
Blocks: 134295
merged 131899, 110156, 110155 (outliner->tree), 134608
Attached patch The Nits Volume 1 (deleted) — Splinter Review
Mostly nits.
> case TX_LANG: > { > - if (!requireParams(1, 1, aCs)) > + if (!requireParams(1, 1, aContext)) > return new StringResult("error"); > > XXX + aContext->receiveError(err, NS_ERROR_UNEXPECTED); ? requireParams will do errorreporting +class txNodeTypeTest : public txNodeTest +{ +public: > + enum NodeType { > + COMMENT_TYPE, > + TEXT_TYPE, > + PI_TYPE, > + NODE_TYPE > + }; > > XXX txNodeType ? IMHO it's not needed since it's scoped in a tx-prefixed class, so the full name is txNodeTypeTest::NodeType. Same for a few others > - if (attValue.isEmpty()) > + if (attValue.isEmpty()) { > + mContext = 0; > return avt; //XXX should return 0, but that causes crash in lre12 > > XXX Does this still crash lre12? Yes, that's my comment and it's wrong. We *should* return avt (or an empty StringExpr). > ExprResult* exprResult; > - exprResult = ((Expr*)iter.next())->evaluate(aContext, aCs); > + exprResult = ((Expr*)iter.next())->evaluate(aContext); > > XXX Do we need to null-check .next()? the requireParams will ensure that we have enough parameters, and FunctionCall::addParam will ensure that they are != null
Merged bug 5693, bug 56087, bug 132300, bug 113401, bug 70855 and the buster update. Tagged the trunk with XPATH_CONTEXT_113611_2_MERGE_20020424
Whiteboard: XPATH_CONTEXT_113611_2_BRANCH, transformiix → XPATH_CONTEXT_113611_2_BRANCH (XPATH_CONTEXT_113611_2_MERGE_20020424 trunk)
forgot to merge bug 34849, that's in now.
Addressed comments against ExprParser::mContext by adding the context (and ProcessorState) to the signatures as needed. Made ExprParser and txPatterParser purely static, too, to get rid of any hickups like this in the future.
> XXX What's the rationale for putting this in the Expression parser? > Could this be a static method? It uses a txParseContext to resolve the prefix, so I put it in the parser. It's static now. > + Node::NodeType type = (Node::NodeType)aNode->getNodeType(); > XXX do you really need the cast? getNodeType returns short :-( I don't feel like inlining txNodeSetContext::getContextNode or txNodeSetContext::position, it's nicer this way around (having all of txIEvalContext in one place) and I don't expect to see the speed difference. I don't want to merge txNodeSetContext::hasNext() and next(), either, I like testing and iterating being separate better, and it won't matter codewise, as they're inlined anyway. I won't do a bunch of those beautifying nits peterv had, we should really have a branch for that, and everybody stabs on some files for a week. Oh, and #transformiix decided to not prefix inner classes or structs. I didn't do error reporting either, I'd prefer to do that in bug 112622 or a new sibling of that. The checkins http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=XPATH_CONTEXT_113611_2_BRANCH&branchtype=match&dir=mozilla%2Fextensions%2Ftransformiix&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=04%2F26%2F2002+08%3A00&maxdate=04%2F26%2F2002+13&cvsroot=%2Fcvsroot should pretty much deal with attachment 79641 [details] [diff] [review], but I still have a list of things to do, at least investigating eval contexts in PathExpr::evalDescendants and fixing txRootPattern to work for //foo. (it's currently equivalent to foo, which has at least a different default priority). I added txSingleNodeContext.h, which implements txIEvalContext for a single node. Didn't spend a .cpp for that one, it's used only in a few files. not ready for another patch just yet, IMHO. A request for comments, either put your comments in your branch tree, and diff against that, or use something else than XXX, we have way too many of those.
re #21 > why not simply use aContext? 'cause it's context node is wrong. I changed that to txSingleNodeContext. re #26 > Just return null if the function doesn't exist. There's no point in parsing an > expression that can't be evaluated anyway. We really should cope with this in a non-fatal way, and ErrorFunctionCall is one way. (As we have to return a FunctionCall, otherwise we could have used a StringExpr.) notes to self: still need to fix the UnionExpr leak. I'll have to make a list of Patterns that are owned by the ProcessorState, all other ways I came up with where way more complicated. (Like adding at least a clone method to txPattern.) ProcessorState::resolveFunctionCall needs to return a warning nsresult instead of an error for unparsedEntityUri. See above. #33, too. const nsString ids = mIds.getConstNSString(); a reference in txXSLTPatterns? (continue at #41) I think #35 isn't answered yet.
Don't leak txUnionPatterns in ProcessorState::addTemplate. I fixed this by fixing the ownership of txUnionPatterns and it's children. Once you call getSimplePatterns, the union pattern is empty and it's therefor deleted in ProcessorState::addTemplate. Changed to comment to txPattern::getSimplePatterns to make that clear. (That was easier than I thought before I thought. Think more before, my mug says, and it's right.)
fixed resolving non-implemented functions. The error is generated in ExprParser::createFunctionCall on NS_ERROR_NOT_IMPLEMENTED (that has to return a Expr*, now). We have to parse parameters, still, so I made that 0-proof. We can now kill ErrorFunctionCall, so dead it is. txStepPattern::matches now remembers if aNode is in the NodeSet given by the predicate (if it's not an attribute) fixed current() is txKeyFunctionCall need to fix whitespace in txIdPattern. Then I'll be thru. Hey.
merging 140687, 141173, 135825. fixed txRootPattern serialisation and parsing. id() doesn't assert anymore for non-found IDs (like standalone)
merged 132302 from back in the days (april 10), created diff against the trunk (tag XPATH_CONTEXT_113611_2_DIFF_0508). I'll read that diff myself before attaching it. I didn't go for txExpandedName (yet), and still need some answers on #35
My oppinion on comment 35: If you make getExpr/getPattern *not* report an error for missing attributes you need to do two things: 1. You need to signal back to the caller that the lack of returned Expr/Pattern is due to a missing attribute and not due to a parse-error. This signaling is probably best done by changing the signature to xpcom-style 2. Make sure all clients of getExpr/getPattern reports an error when no Expr/Pattern is returned due to missing attribute and the attribute isn't optional IMHO 2. will add too much error-reporting code to be worth the saved cycles of an extra hasAttr call in the cases where the attribute is optional.
cvs -z3 ci -m"fixing optional arguments, after lots of arguments. I just shouldn't change that here and now" Numbering.cpp patch coming up :-)
Attached patch patch of the branch (obsolete) (deleted) — Splinter Review
+4125/-3058 some 1000 lines are just removed files.
Attachment #74957 - Attachment is obsolete: true
Keywords: review
Attached patch Slightly simpler and smaller patch (obsolete) (deleted) — Splinter Review
We decided to keep most of the cleanup for when we do our spring cleaning branch for transformiix (RSN). I have a separate patch that has all the cleanup I removed from the branch.
Attachment #83346 - Attachment is obsolete: true
Comment on attachment 83802 [details] [diff] [review] Slightly simpler and smaller patch r=axel@pike.org on peterv's changes which he landed on the branch
trunk branch speedup attsets 5136 4179 19% avts 29967 19532 35% axis 1211 1006 17% bottles 24188 24703 -2% breadth 890940 31262 96% chart 9499 9382 1% creation 31559 28648 9% current 1915 1689 12% dbtail 29469 17163 42% decoy 2252815 311982 86% depth 17057 8153 52% encrypt 11243 10773 4% 3304999 468472 86% A 86% speedup, that's x7!
Keywords: perf
Target Milestone: mozilla1.0 → mozilla1.1alpha
merged bugs 136756, 134295, 146964, 146965
Attached file Reviews comments (deleted) —
Comment on attachment 86079 [details] Reviews comments >Index: source:xpath:Expr.h > >// YYY TX_DECL_EXPR; those are non-virtual > >Index: source:xslt:ProcessorState.cpp >+nsresult ProcessorState::resolveFunctionCall(txAtom* aName, PRInt32 aID, >// YYY Should this be an error or just silent failure? this is a parse error, and even if we some day get extensibility for functions, those would hook up to the ProcessorState, so that's still the one to throw the error. >Index: source:xslt:XSLTProcessor.cpp >// YYY Error output should be settable by API user (as discussed). this used to go to cerr, we need to really make up what's our API and what is just a helper function. but that's bug 85408 or a friend of that >Index: source:xslt:txPatternParser.cpp >+nsresult txPatternParser::createStepPattern(ExprLexer& aLexer, >+ txIParseContext* aContext, >+ txPattern*& aPattern) > >// YYY How does this work with tok->type == Token::AT_SIGN ? only senseful case is "@node()", which works, as found out on irc. >Index: source:xslt:txXSLTPatterns.cpp >+double txLocPathPattern::getDefaultPriority() >// YYY Can't mSteps.get(0) be null? no, that's a parse error. >+#define TX_IS_WHITE(c) (c == ' ' || c == '\r' || c == '\n'|| c == '\t') > >// YYY We might have a function to do this in Mozilla, maybe we should use it I didn't find it. >+txIdPattern::txIdPattern(const String aString) >// YYY Why not a stringlist? I was thinking forth and back, a single normalized string was the easiest to maintain. I fixed the other comments.
new patch coming up that addresses the comments. comments I got on irc and fixed are: crasher in nsXPathEvaluator::ParseContextImpl::resolveNamespacePrefix unindenting #if and friends bad logic in RootExpr::evaluate
Whiteboard: XPATH_CONTEXT_113611_2_BRANCH (XPATH_CONTEXT_113611_2_MERGE_20020424 trunk) → XPATH_CONTEXT_113611_2_BRANCH (XPATH_CONTEXT_113611_2_MERGE_20020603 trunk)
Attached patch addressing comments (deleted) — Splinter Review
Attachment #83802 - Attachment is obsolete: true
Comment on attachment 86087 [details] [diff] [review] addressing comments r=peterv
Attachment #86087 - Flags: review+
Comment on attachment 86087 [details] [diff] [review] addressing comments Pike had already fixed all the issues that I found while reviewing the earlier revision of this diff, so no more comments from me. sr=jst (that was one large diff :-) )
Attachment #86087 - Flags: superreview+
This have added a lot of warnings on brad TBox: +extensions/transformiix/source/xslt/Numbering.cpp:80 + `enum txNodeTypeTest::NodeType nodetype' might be used uninitialized in this function + +extensions/transformiix/source/xslt/ProcessorState.cpp:194 + `double priority' might be used uninitialized in this function + +extensions/transformiix/source/xslt/ProcessorState.cpp:513 + `MBool hasAttr' might be used uninitialized in this function + +extensions/transformiix/source/xslt/ProcessorState.cpp:555 + `MBool hasAttr' might be used uninitialized in this function See also bug 126463 for "might be used uninitialized" warnings previously fixed in extensions/transformiix
not a lot. No biggie anyway, the logic in the code takes care that those values are initiliased once they're used. Gonna fix those warnings later, in a different bug.
patch is in. Thanx to Peter and Jonas for reviewing and input, and Johnny for the sr on short notice.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
we didn't verify for a long time. I really checked, so VERIFIED.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: