Prototype semantic linkage mechanism
Categories
(Webtools :: Searchfox, enhancement)
Tracking
(Not tracked)
People
(Reporter: asuth, Unassigned)
References
(Blocks 2 open bugs)
Details
This is a spin-off of the bug 1641372 review action item https://github.com/mozsearch/mozsearch/pull/408#discussion_r596683311 which led me to post https://bugzilla.mozilla.org/show_bug.cgi?id=1641372#c3 which I reproduce below in its entirety, though I may update it over time. I'll make another comment right after this one.
Linkage
Problem Statement
An important issue raised by :kats in the github review is that the IPC linkage mechanism implemented in the branch in crossref.rs could potentially be made more generic / higher level, covering cases like the preferences use-case described in bug 1699048 or perhaps the URL mapping discussed in bug 1697671. However, in this comment I'm only going to deal with the IDL cases, though I think the general idea here should generalize to the preferences situation.
When digging into this a bit more, it became clear that when addressing this we could potentially also simplify the implementations of our IDL-related processing. Much of the complexity in the XPIDL analyzer and the IPDL analyzer involves having them determine what C++ analysis header file to load (:billm's explanation) and then to establish a mapping between the expected "pretty" human-readable identifier (ex: "Class::Method") and the underlying mangled symbol (which the analyzers don't understand and depend on there being no overloads to get it right). This is work that could instead be handled by the cross-referencing and/or new linker process.
IDL analyzer simplification
For XPIDL/IPDL/WebIDL, the analyzer/indexer could be simplified such that it emits analysis records that express:
- The IDL token definitions, including the peek ranges that cover the preceding comments.
- The expected linkage identifiers for each "slot" that relates to the symbol. This can then be used by the linker to determine the actual (mangled) symbol(s).
  - The linkage identifiers would have a structured scheme like an array of { symbol: "exact_symbol" } / { symbol_prefix: "ZBlahBlah" } / { pretty: "Foo::Bar" } / { raw_prototype: "void blah(int blah, char blah) or something clever" } objects.
    - This would allow the XPIDL compiler to indicate C++ bindings via "pretty" name and rust bindings via symbol (which is actually the same as pretty right now), supporting both in a single pass.
  - IDL Slot schemes could be:
    - XPIDL: getter (for C++/Rust), setter (for C++/Rust), attribute (for JS given our current analysis), method
    - IPDL: send, recv
    - WebIDL: getter, setter, method, enabling_pref, enabling_func.
  - There'd probably also be a binding slot on each interface that allows a direct mapping from the interface to the binding classes.
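To make the linker step above concrete, here is a minimal sketch in Python of how per-slot linkage identifiers might be resolved to mangled symbols. Everything in it is an assumption for illustration: the record layout, the `resolve` and `link_slots` helpers, the `XPIDL_nsIFoo_bar` symbol name, and the sample mangled symbols are all hypothetical, and the real step would presumably live in the Rust cross-referencer. `raw_prototype` matching is deliberately left out.

```python
# Hypothetical symbol table as the cross-referencer might see it after
# reading the C++ analysis: mangled symbol -> pretty identifier.
SYMBOLS = {
    "_ZN5nsFoo6GetBarEPi": "nsFoo::GetBar",
    "_ZN5nsFoo6SetBarEi": "nsFoo::SetBar",
    "_ZN5nsFoo4DoItEv": "nsFoo::DoIt",
}


def resolve(identifier, symbols):
    """Return the sorted mangled symbols matching one linkage identifier."""
    if "symbol" in identifier:
        return [s for s in symbols if s == identifier["symbol"]]
    if "symbol_prefix" in identifier:
        prefix = identifier["symbol_prefix"]
        return sorted(s for s in symbols if s.startswith(prefix))
    if "pretty" in identifier:
        wanted = identifier["pretty"]
        return sorted(s for s, pretty in symbols.items() if pretty == wanted)
    # Unknown schemes (including raw_prototype) resolve to nothing here.
    return []


def link_slots(record, symbols):
    """Map each slot name to the union of symbols its identifiers resolve to."""
    linked = {}
    for slot, identifiers in record["slots"].items():
        found = set()
        for identifier in identifiers:
            found.update(resolve(identifier, symbols))
        linked[slot] = sorted(found)
    return linked


# A hypothetical record the XPIDL analyzer could emit for one attribute.
record = {
    "idl_symbol": "XPIDL_nsIFoo_bar",
    "slots": {
        "getter": [{"pretty": "nsFoo::GetBar"}],
        "setter": [{"symbol_prefix": "_ZN5nsFoo6SetBar"}],
    },
}
linked = link_slots(record, SYMBOLS)
```

Because a "pretty" identifier can match several mangled symbols, each slot resolves to a list, which is one way overloads could be handled without the IDL analyzer needing to understand mangling.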
Slots, Kinds, Symbols and UX
UX Status Quo
Searchfox currently really only has:
- From context menu:
  - Go to definition (for which we know there's only one definition, otherwise this option is hidden)
  - Search for everything about the unioned set of symbols associated with this token that all share the same pretty identifier. (There may be multiple searches for cases like constructors where implicitly invoked field constructors end up collapsed onto the same token, but inherently have different pretty identifiers.)
    - This is how the current XPIDL bindings work: We associate the .idl file tokens with the C++ symbol name for the binding we scrape from the binding header file, plus the wildly-unscoped JS property that the method/attribute would be indexed under as implemented or consumed.
- The search UI:
  - Search for this pretty identifier and then tell me everything about the symbols associated with the pretty identifier.
  - Fulltext search that is largely separate from all this semantic stuff but highly useful and powerful in its own right and serves as a backstop for when the semantic stuff falls down.
In the search results, we facet and prioritize the result by (target) "kind", on trunk these are: def, decl, idl, use, assign.
What Slots Enable
The additional slots potentially enable new direct access context menu options as well as additional faceting. For example, for the above:
- XPIDL:
  - Menu:
    - Go to {getter/setter/method} implementation. This is distinct from "go to definition" which would be in the IDL file.
  - Search Faceting:
    - Uses of {getter/setter/attribute} as distinct items instead of all combined together. It might make sense to mix them together by default but provide an inline control in the sub-heading like: Uses ([x] getter [x] setter [x] attribute).
- IPDL:
  - Menu:
    - Go to recv implementation. (This would be directly accessible from the IPDL file and any calls to the Send method.)
  - Diagramming: It really helps the auto-diagramming logic to be able to understand what calls have IPC semantics and the send-recv pairings.
- WebIDL:
  - Menu:
    - Go to {getter/setter/method} implementation. This is distinct from "go to definition" which would be in the WebIDL file.
    - Go to enabling pref "foo.bar" (The presence of the menu option indicating the use of an enabling pref is probably most useful; maybe this also wants to be an icon?)
    - Go to enabling func.
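As a rough illustration of how slots could drive these menu options, the sketch below derives "Go to ..." entries from a symbol-to-slots map. The map shape, the `menu_items` helper, and the `IPDL_PFoo_Ping` / `PFooChild::SendPing` / `FooParent::RecvPing` names are all hypothetical; it only assumes that each binding symbol carries an "idl" slot pointing back at its IDL symbol.

```python
def menu_items(symbol, xref):
    """List ("label", target) menu entries reachable from a symbol via slots."""
    slots = xref.get(symbol, {})
    items = []
    # A binding symbol first hops to its IDL symbol, then fans out to the
    # sibling bindings (for example from a Send call to the Recv method).
    for idl_sym in slots.get("idl", []):
        items.append(("Go to definition", idl_sym))
        for slot, targets in xref.get(idl_sym, {}).items():
            for target in targets:
                if target != symbol:
                    items.append((f"Go to {slot} implementation", target))
    # An IDL symbol links directly to its implementations.
    for slot, targets in slots.items():
        if slot != "idl":
            for target in targets:
                items.append((f"Go to {slot} implementation", target))
    return items


# Hypothetical cross-reference data for one IPDL message.
XREF = {
    "IPDL_PFoo_Ping": {
        "send": ["PFooChild::SendPing"],
        "recv": ["FooParent::RecvPing"],
    },
    "PFooChild::SendPing": {"idl": ["IPDL_PFoo_Ping"]},
    "FooParent::RecvPing": {"idl": ["IPDL_PFoo_Ping"]},
}

from_send_call = menu_items("PFooChild::SendPing", XREF)
from_ipdl_file = menu_items("IPDL_PFoo_Ping", XREF)
```

The same send/recv pairing is what a diagramming pass could consume to recognize calls with IPC semantics.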
One Symbol Per Token / Abstract Symbols
A big conceptual change I'd been pursuing in the fancy branch was that rather than having each token being associated with a set of symbols, the token would explicitly be associated with a single symbol and have explicit relationships to the related symbols. That is, right now, there is no symbol that corresponds exactly to the given XPIDL token; it only references the presumed JS binding and the presumed C++ binding symbols. With this change we would create an explicit symbol namespace for XPIDL interfaces and the token in the .idl file would be associated with only that symbol.
That XPIDL-namespaced symbol would then reference the symbols of all the bindings. The C++ binding's analysis file would explicitly only have its own C++ symbol associated with the token. However, the cross-referencing process would establish these symbol relationships via the slots. The C++ binding symbol would have an "idl" slot that references the XPIDL-namespaced symbol, etc.
The search mechanism and context menu popups would all know how to pierce these relationships and potentially pre-compute/pre-aggregate the information as necessary.
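A minimal sketch of the relationship-building described above, assuming the slots have already been resolved to concrete symbols. The `crossref` function, the record layout, and the `XPIDL_nsIFoo_bar` and `#bar` symbol names are hypothetical and are not taken from crossref.rs.

```python
from collections import defaultdict


def crossref(idl_records):
    """Build a symbol -> {slot: [symbols]} map with links in both directions."""
    xref = defaultdict(lambda: defaultdict(list))
    for rec in idl_records:
        idl_sym = rec["symbol"]
        for slot, targets in rec["slots"].items():
            for target in targets:
                # The IDL-namespaced symbol references each binding...
                xref[idl_sym][slot].append(target)
                # ...and each binding gets an "idl" slot pointing back.
                xref[target]["idl"].append(idl_sym)
    # Convert to plain dicts so lookups of unknown symbols raise KeyError.
    return {sym: dict(slots) for sym, slots in xref.items()}


# One hypothetical XPIDL attribute with a C++ getter and a JS property.
records = [
    {
        "symbol": "XPIDL_nsIFoo_bar",
        "slots": {
            "getter": ["_ZN5nsFoo6GetBarEPi"],
            "attribute": ["#bar"],
        },
    }
]
xref = crossref(records)
```

With this shape, the token in the .idl file needs only the single `XPIDL_nsIFoo_bar` symbol, and search or menu code can walk the slots in either direction.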
Comment 1 • 3 years ago (Reporter)
The core action items for this bug will be a prototype which first:
- Changes some subset of the in-tree mozilla-central XPIDL, IPDL, and WebIDL to generate linkage data for searchfox.
  - It won't be necessary to have full implementations for every piece in order to get eyes on this and opinions about whether the changes and their benefits are reasonable.
- Changes searchfox to perform the linkage step for the subset.
- Demonstrates a few searchfox-tool based "checks" queries that usefully answer some kind of code question by following the linkage and which can intuitively map to some kind of context menu support or maybe a top-level search query.
From there I think we can evaluate how well this works and consider the specific context-menu features to implement, etc. (How the context menus get implemented could change if we move from the array-based jumps/searches ANALYSIS_DATA emission in the HTML to just a keyed-by-symbol dictionary/map, but the structured analysis changes need to land first before that can happen.)
Comment 2 • 3 years ago (Reporter)
:m_kato provided some information about JNI bindings in https://github.com/mozsearch/mozsearch/pull/314#issuecomment-1005626355 that looks like an example of more complicated linkages:
- RegisterNatives which takes a table like this media video capture logic, but where we seem to have standardized on a custom template glue mechanism:
  - widget/android/jni/Natives.h has a call to RegisterNatives in its Init method
  - All the things we expose then call that Init method
  - The called classes are like GeckoAppShellSupport which subclasses GeckoAppShell::Natives which builds an explicit table with a lot of useful markers and subclasses the NativeImpl template class from Natives.h which exposes the Init method which calls RegisterNatives.
- Here's the C++ Java_org_mozilla_gecko_mozglue_GeckoLoader_loadGeckoLibsNative that seems to correspond to org/mozilla/gecko/mozglue/GeckoLoader.java's loadGeckoLibsNative
I don't think this will be the first thing to hook up, but it's a very interesting real-world scenario! Edit: And likely this is something where some kind of rules engine would be appropriate so that some kind of declarative schema could be processed by the crossref mechanism as it digests symbols and then establishes those links in a second pass once it's seen all the symbols.
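For the simpler statically named case (the GeckoLoader example), a two-pass rule could look roughly like the sketch below. It relies on the standard JNI short-name convention ("Java_", then the fully-qualified class name with "." replaced by "_", then "_" and the method name, with literal underscores escaped as "_1"). The `jni_symbol` and `link_jni` helpers and the input shapes are hypothetical, the escapes for non-ASCII characters and overloaded signatures are omitted, and the RegisterNatives table case would need a different rule.

```python
def jni_symbol(java_class, method):
    """Mangle a dotted Java class name and method into a JNI short name."""

    def escape(name):
        # JNI escapes a literal underscore as "_1".
        return name.replace("_", "_1")

    parts = [escape(part) for part in java_class.split(".")]
    return "Java_" + "_".join(parts) + "_" + escape(method)


def link_jni(java_natives, cpp_symbols):
    """Link Java native methods to C++ symbols once all symbols are known."""
    # Pass 1: digest every C++ symbol the analysis produced.
    seen = set(cpp_symbols)
    # Pass 2: apply the naming rule and keep only links that really exist.
    links = {}
    for java_class, method in java_natives:
        expected = jni_symbol(java_class, method)
        if expected in seen:
            links[f"{java_class}.{method}"] = expected
    return links


links = link_jni(
    java_natives=[
        ("org.mozilla.gecko.mozglue.GeckoLoader", "loadGeckoLibsNative"),
        ("org.mozilla.gecko.mozglue.GeckoLoader", "notImplemented"),
    ],
    cpp_symbols=[
        "Java_org_mozilla_gecko_mozglue_GeckoLoader_loadGeckoLibsNative",
    ],
)
```

Deferring the match to a second pass is what lets the rule stay declarative: it never needs to know the order in which the Java and C++ analysis files were read.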