Open Bug 313207 Opened 19 years ago Updated 2 years ago

Get stats on QI calls

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

People

(Reporter: sicking, Unassigned)

References

Details

(Keywords: perf)

Attachments

(3 files, 7 obsolete files)

Towards the end of the security meeting we ended up discussing QI performance.
Two ideas were brought up that both need statistical data for which interfaces
we QI to on which classes. Using such data we can

1. Add explicit early-returns for interfaces that we QI to often but don't
implement. This way we won't have to run down a long list of iid-checks but can
rather bail right away.

2. Rearrange order of iid-checks so that the interfaces we QI to most often are
on top of the list.
So... doing this in general would take work, but I did it for HTML elements by
applying a patch (coming up), then running bc's spider at depth 1 on
http://bclary.com/2004/07/10/tests/test-sites.html and post-processing the
result (with a script that's also coming up).  I really wish we had a better way
to get interface names for non-idl stuff.  :(
Attached patch Patch I used for logging (obsolete) (deleted) β€” β€” Splinter Review
Attached file Script used for post-processing (obsolete) (deleted) β€”
I put in some common IIDs by hand, but that really doesn't scale, esp if people
change IIDs....
Attached file Output (obsolete) (deleted) β€”
First thing that jumps out at me is nsIXPConnectWrappedJS.  I believe we do a
QI to that in XPConnect in various cases, but in particular when first creating
an XPCWrappedNative...

That said, the 25000-some QIs to that put it at about 10th place in the total
QI calls list.

Also note that this build doesn't have my caps/xpconnect patches that should
eliminate most of the QIs to nsIScriptObjectPrincipal.

I guess I'm wondering where to go from here.  Should I try for a bigger
dataset?  Or instrumenting a wider variety of classes?	Or both?

Also note that I'm not outputting the concrete class anywhere here.  Should I
be, perhaps?  Do we care?
nsIDOMEventReceiver is pretty noteworthy. Mostly because it is implemented as a
tearoff and so it will often result in a malloc in addition to the normal QI
cost. We should probably consider making that a normal interface or caching a
pointer in nsDOMSlots.

I think it might be interesting to get these stats on a per element bases. And
for xul elements too. I'm wondering if we're QIing to nsIImageLoadingContent and
nsILink on all elements, or if we just do it a lot on the elements where it
makes sense.

Other then that, i think we might be able to rearrange code to get nsIDOMNode
and nsIDOMElement be tested earlier.
Depends on: 313249
Perhaps we should resurrect the possibility of table-driven-QI? According to Don
Box table-driven QI can significantly improve performance due to locality of CPU
cache and reduced codesize (single shared QI impl). It might allow allow us to
instrument QI to do the other optimizations here more easily.
> We should probably consider making that a normal interface or caching a
> pointer in nsDOMSlots.

I think generally caching our tearoffs in domslots and using an aggregated
refcount with the content node would be a great idea....

I'll try to change the instrumentation to note the element class.
> Perhaps we should resurrect the possibility of table-driven-QI?

How would you support NS_INTERFACE_MAP_ENTRY_CONDITIONAL (and some of the other
nasty QI macros that we have)?
We also have tearoffs that can be tricky. Is there a writeup somewhere about how
tabledriven QI works?
> We also have tearoffs that can be tricky. Is there a writeup somewhere about 
> how tabledriven QI works?

Don Box's book is the best reference I know.  I have a copy sitting in my book
shelf, so I've never bothered to search the web looking for similar info.  In a
nut-shell, you replace the QI implementation with a simple table of IIDs and
offsets.  The offsets are used to increment the |this| pointer to get at the
correct vtable for the corresponding IID.  Then, you just loop over the table,
and when the IIDs match, you adjust the |this| pointer according to the offset
and return that value.  So, it's pretty trivial to implement NS_IMPL_ISUPPORTS.
 The challenge is in supporting all of the variations allowed by QueryInterface.
You have to supply the class that is queried too, no? And that along with the
IID is the key into the table. Can you use the vtablepointer as class? Or would
we just a giant enum?
You don't have to do the variations table-driven: you can do something like this:

** Simple (no-tearoff, no-conditional) case **

nsresult Class::QueryInterface(aIID, aResult)
{
  return Table_driven_QI(NS_STATIC_CAST(void*, this), static_lookup_table, aIID,
aResult);
}

** Complex case **

nsresult Class::QueryInterface(aIID, aResult)
{
  nsresult rv = Table_driven_QI(as above);
  if (NS_SUCCEEDED(rv))
    return rv;

  do conditional_and_tearoff_QI_manually_here
}
bsmedberg: exactly, but it's not clear how to do that with our existing macros :-/
i think we'd probably have to touch a lot of code to fix up the macro usage. 
any clever ways to avoid that?
I'm willing to fixup the NS_IMPL_ISUPPORTSx macros first, and then do some
reordering and fixup of the INTERFACE_MAP macros which are harder... I don't
think it's going to be possible to automatically fixup the current INTERFACE_MAP
usages which have _CONDITIONAL or _TEAROFF usages.
Sounds like a good plan to me.
Table-driven QI forked off to bug 313309.
Assignee: bzbarsky → dougt
Assignee: dougt → bzbarsky
Attached patch New patch -- logs XUL as well, and logs classname (obsolete) (deleted) β€” β€” Splinter Review
Attachment #200317 - Attachment is obsolete: true
Attached file Updated script to analyze the data (deleted) β€”
Attachment #200318 - Attachment is obsolete: true
Attached file New output from bc's spider (obsolete) (deleted) β€”
This is a log from bc's spider on the top 70 sites again.  Note that this was
using the spider chrome, not the usual Firefox chrome and that I did not
interact with the browser much while it was running.
Attachment #200319 - Attachment is obsolete: true
Interesting things to note here:

1)  Whence QIs to nsPIDOMWindow?  Someone QIing event targets?
2)  Lots of QIs on XUL to nsIDOMEventTarget.
Ah, the nsPIDOMWindow QI is a debug-only QI in
nsDOMEvent::SetTarget/SetCurrentTarget.

Come to think of it, doing such stats in a debug build is silly.  I'll try to
patch one of my opt builds and retest.
I'm very curious who is QIing to nsIDOMHTMLSelectElement and
nsIDOMHTMLOptGroupElement. My money is on someone is searching for
select/optgroup elements and QIs rather then checking the nodename.
Jonas: the results are certainly bogus in light of Assert_NoQueryNeeded :-)
I filed bug 313343 on the nsIDocumentObserver change.  I'll build an opt build
with this instrumentation and have some data on Sunday, hopefully.
Depends on: 313343
Attached file Log for an opt build (obsolete) (deleted) β€”
As expected, a lot less QI to nsIStyledContent.  But the QIs to nsIDOMHTMLSelectElement and nsIDOMHTMLOptGroupElement are still there.

The nsIDOMHTMLSelectElement QIs come from frame construction -- ContentAppended and ContentInserted always check whether the parent content node is an <html:select> before doing a WipeContainingBlock check.  That's silly.  What really matters here, imo, is not the type of the parent node but the type of the parent _frame_, since that's what WipeContainingBlock cares about.  We should fix this.

The nsIDOMHTMLOptGroupElement QIs come from <select> layout.  In particular, GetMaxOptionHeight() (in nsListControlFrame.cpp) QIs all options to this interface because it would need to recurse in that case.  We could try to move to a tag test here, I suppose, or just go with a QI to nsIDOMHTMLOptionElement instead (would be hit more often) and reverse the two branches of the if.
Depends on: 313496
Depends on: 313968
Depends on: 314878
Attached file Current log (deleted) β€”
No more nsIDOMHTMLSelectElement.  We still have misses for nsIXPConnectWrappedJS, nsIScriptObjectPrincipal (which is on its way out), nsIDOMHTMLOptGroupElement (should probably file a bug), nsIDocument, nsIScriptGlobalObject, nsIDOMCharacterData.  And small fry after that.  Note that the perl script needs new IIDs for nsIContent and nsIDocument to generate this output; I won't bother attaching the update, since those IIDs keep changing...
Attachment #200397 - Attachment is obsolete: true
Attachment #200401 - Attachment is obsolete: true
Attachment #200519 - Attachment is obsolete: true
See also bug 71528.
Assignee: bzbarsky → dougt
Attached patch Patch merged to trunk (deleted) β€” β€” Splinter Review
I'm removing this from my tree for now...
Attachment #200395 - Attachment is obsolete: true
mass reassigning to nobody.
Assignee: dougt → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: