Open Bug 1384727 Opened 7 years ago Updated 2 years ago

Make QueryInterface use integer IDs

Categories

(Core :: XPCOM, enhancement)

enhancement

Tracking

()

Performance Impact low

People

(Reporter: ehsan.akhgari, Unassigned)

References

(Blocks 3 open bugs)

Details

There is no reason to continue to use UUIDs to represent our interface identifiers any more. Comparing 128 bits wastes precious cycles. It's time to make this an integer. We would need some central list of identifiers mapped to unique ints maintained somehow for this to work. One solution is one header containing a giant enum (which would have the downside of triggering a tree-wide rebuild every time it changes).
Component: DOM → XPCOM
We use nsID for more things than just XPCOM today. For example, we use UUIDs to identify things created in different processes without a communications protocol to coordinate. The nsID UUID provides a reasonable way to do that without colliding. Are you really talking about all nsID here or just XPCOM's use of nsID?
Flags: needinfo?(ehsan)
Also note we serialize nsID in a variety of places like Cache API, service worker registry, and probably a lot more. We would need to handle all those disk representations.
You're right, I should have thought about that! My goal here is to improve the performance of QueryInterface() specifically, there are certainly other valid use cases of UUIDs (and other cases where they have been performance problems FTR, like bug 1348462 or bug 1348461). :-)
Flags: needinfo?(ehsan)
Summary: Make nsID an integer → Make QueryInterface use integer IDs
Heh, I filed this 18 hours later. See bug 1384914 for some implementation notes. I don't think the tree-wide enum is a big deal, because adding new interfaces is relatively uncommon.
> My goal here is to improve the performance of QueryInterface() specifically, > there are certainly other valid use cases of UUIDs (and other cases where > they have been performance problems FTR, like bug 1348462 or bug 1348461). > :-) Yea, I'm aware UUIDs have cost. I just wanted to point out changing nsID would be a huge lift since its used in a lot of places. And many of those places have requirements that a simple integer is unlikely to meet.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > Heh, I filed this 18 hours later. See bug 1384914 for some implementation > notes. > > I don't think the tree-wide enum is a big deal, because adding new > interfaces is relatively uncommon. While that's probably true, the workflow on a patch that does it would be a huge PITA. I have several patches that change something at the core of dom/base etc. and every time I have to switch back to central (to write a quick fix for something else for example, or to compare), the complete Gecko rebuild happens, which these days is sometimes getting up to 90 minutes on my super fast, state of the art laptop with icecream at the office. That's really long. Imagining that every time I go back and forth with a small patch that adds an interface I'd need a complete rebuild is scary. Can we look for a way to avoid the monolithic impact here and keep it modular? Like, and that's totally a shot in the dark - is there an integer large enough that randomly generating it we're unlikely to collide?
> is there an integer large enough that randomly generating it we're unlikely to collide? Yes, 128 bits. Aka UUIDs. What we could do is assign different parts of the 32-bit range to different source directories or something, then only have a single list per directory, maybe.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #7) > (In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > > Heh, I filed this 18 hours later. See bug 1384914 for some implementation > > notes. > > > > I don't think the tree-wide enum is a big deal, because adding new > > interfaces is relatively uncommon. > > While that's probably true, the workflow on a patch that does it would be a > huge PITA. I'll note that we lived with revving the uuid for every change to IDL files for 10+ years, and that seemed to work pretty OK. The change proposed here is at least an order of magnitude less disruptive. > I have several patches that change something at the core of dom/base etc. > and every time I have to switch back to central (to write a quick fix for > something else for example, or to compare), the complete Gecko rebuild > happens, which these days is sometimes getting up to 90 minutes on my super > fast, state of the art laptop with icecream at the office. That's really > long. > > Imagining that every time I go back and forth with a small patch that adds > an interface I'd need a complete rebuild is scary. > Can we look for a way to avoid the monolithic impact here and keep it > modular? > > Like, and that's totally a shot in the dark - is there an integer large > enough that randomly generating it we're unlikely to collide? 128 bits is too wide for performance, so that's out of the question. What Boris proposed in comment 8 is one option. We can even go to extremes, such as assigning each directory (not top level directory) a unique space of let's say 10,000 IDs, where the build system would generate a central header with the correct assignments so that the situation you described above would only be a tree-wide recompile if you added a new directory.
Whiteboard: [qf]
Blocks: 1425466
Whiteboard: [qf] → [qf:f63][qf:p3]
Two implementation ideas that have come up after discussions and thinking about this for the last day: - Interface UUIDs are baked into serialization of implementing classes: https://searchfox.org/mozilla-central/source/xpcom/io/nsBinaryStream.cpp#327-341 And we serialize these things all over. We can work around this when we do switch by inventing a magic UUID to effectively serve as a version specifier. When we read that magic UUID, we can recognize that we have serialized a "new-UUID" object. And we already do a little of this, e.g.: https://searchfox.org/mozilla-central/source/xpcom/io/nsBinaryStream.cpp#911-924 - ID allocation can be managed by hooking into the build system, assuming that we traverse directories in a stable order. The first directory we traverse gets IDs 0-999, the second directory IDs 1000-1999, and so forth. The XPIDL compiler can then be given the ID "base" and sequentially assign IDs to the interfaces that it sees. Adding and removing IDLs by e.g. pushing and popping patches in a patch queue would trigger some small amount of recompilation, since the IDs for a given directory might be renumbered as a result. This scheme would require that all interfaces be defined in IDL, which might be a tall order. I don't have good ideas about any entanglement of nsISupports and IUnknown, suggested in bug 1384914.
The idea that I've been playing around with is just having a centralized list of interface IDs. New interfaces always get added to the end of the list, with the next available interface ID, and old IDs never get reused.
Apparently we pervasively define IIDs for classes outside of IDL files: https://searchfox.org/mozilla-central/search?q=NS_DEFINE_STATIC_IID_ACCESSOR&path= So people would probably revolt if we tried to enforce the IDL requirement. =/ (along with the potential ergonomic loss from requiring methods to follow XPIDL rules)
(We could probably just compile that list into a header, and reference it from existing C++ interfaces without too many changes, based on interface name. The IDL side of things would be even easier.)
(In reply to Nathan Froyd [:froydnj] from comment #12) > Apparently we pervasively define IIDs for classes outside of IDL files: > > https://searchfox.org/mozilla-central/ > search?q=NS_DEFINE_STATIC_IID_ACCESSOR&path= > > So people would probably revolt if we tried to enforce the IDL requirement. > =/ (along with the potential ergonomic loss from requiring methods to > follow XPIDL rules) Requiring all interfaces to be defined in XPIDL isn't really an option. We have a bunch of concrete classes that we QI to, and those can't be XPIDL-ified. We'd have to QI and then static_cast, which is gross...
Depends on: 1469389
(In reply to Kris Maglione [:kmag] from comment #11) > The idea that I've been playing around with is just having a centralized > list of interface IDs. New interfaces always get added to the end of the > list, with the next available interface ID, and old IDs never get reused. I was trying to come up with a scheme that would avoid the problems referenced in comment 7. Maybe we just have to eat another global dependency that requires recompiling thousands of lines for a single-line change, though...
Blocks: 1469390
(In reply to Nathan Froyd [:froydnj] from comment #15) > I was trying to come up with a scheme that would avoid the problems > referenced in comment 7. Maybe we just have to eat another global > dependency that requires recompiling thousands of lines for a single-line > change, though... Given that it only needs to be changed when we add a completely new interface ID, I don't think likely to be a huge issue. We do that less and less these days, anyway. However, we could always store the canonical value in a centralized list and then copy it into the header for non-IDL classes, I suppose. We should really only need to include that header in header files for C++-implemented interfaces, though. In practice, that'd probably mean recompiling most of the tree whenever it changes. But so would changing a single one of those headers, in a lot of cases. So it probably won't make a huge difference either way.
> Given that it only needs to be changed when we add a completely new > interface ID, I don't think likely to be a huge issue. We do that less and > less these days, anyway. Removing entries (once this list exists) would probably be more common. Strictly speaking we wouldn't have to remove dead entries, but it would poor hygiene not to.
Performance Impact: --- → P3
Whiteboard: [qf:f63][qf:p3]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.