Closed Bug 1729407 Opened 3 years ago Closed 3 years ago

Cache friendly accessible text implementation

Categories

(Core :: Disability Access APIs, task)

task

Tracking

()

RESOLVED FIXED
94 Branch
Tracking Status
firefox94 --- fixed

People

(Reporter: Jamie, Assigned: Jamie)

References

(Blocks 2 open bugs)

Details

Attachments

(6 files, 2 obsolete files)

To support text in the parent process a11y cache, we need to cache any information we need from layout, but also cache as little information as possible. Our current HyperTextAccessible implementation is not really suitable:

  1. Because of embedded objects, each ancestor/descendant HyperTextAccessible has its own "view" of a given boundary. That means we'd potentially end up querying layout multiple times for the same boundary.
  2. For the same reason, this would probably be less memory efficient in the cache, since we're storing multiple representations of the same boundary.
  3. We'd have to cache word offsets as well as line offsets.
  4. Because of the extreme complexity of calculating boundaries using HyperTextAccessible (walking in and out of embedded objects to retrieve a single word or line, etc.), Mac doesn't use this. If the cache was based on HyperTextAccessible, we'd have to completely rethink the Mac implementation.
  5. The same complexity would cause problems when we want to support UIA Automation.
  6. Said complexity has caused a lot of obscure bugs and quirks which are extremely difficult to debug and fix.
  7. HyperTextAccessible relies on nsIFrame::PeekOffset. This has a lot of obscure quirks which have caused us problems in the past.

In this bug, we will implement new TextLeafPoint and TextLeafRange classes for text navigation. These will eventually replace TextRange and TextPoint. Rather than using HyperTextAccessible, these will use leaf Accessibles and offsets therein. HyperTextAccessible will eventually be a thin wrapper around these new classes, translating from/to HyperTextAccessible offsets only when absolutely necessary.

We need information from layout to calculate line boundaries, so we'll provide cache friendly methods to retrieve those, using layout only where absolutely necessary. We can calculate words without information from layout, so we don't need to cache those and will instead provide unified word navigation for local and remote Accessibles. As much as possible, the code to find boundaries should be unified for local and remote.

As well as being cache friendly, my hope is that this work will resolve a lot of obscure text bugs we've collected over the years (or at least make them easier to fix).

Depends on: 1729412

This adds TextLeafPoint::FindBoundary, which will be the main entry point for finding single boundaries.
FindBoundary searches individual Accessibles for a boundary, walking the a11y tree and continuing the search if a boundary isn't found.

Support for line start boundaries uses layout to determine line edges, but otherwise traverses the accessibility tree.
This avoids depending on PeekOffset, which has a lot of quirks that cause problems for a11y.
More importantly, it's possible to find line boundaries within a single LocalAccessible using the lower level FindPrev/NextLineStartSameLocalAcc methods.
This will be useful for line boundary caching, since we want to cache this info on individual Accessibles.
In contrast, PeekOffset might walk outside the current Accessible, which makes caching more difficult.

This implementation uses WordBreaker and a bunch of code to emulate layout's handling of punctuation, space, words spanning across nodes, words broken due to lines, etc.

While this does mean we're effectively duplicating layout, there are a couple of advantages.
First, PeekOffset has a lot of quirks which have caused us problems in the past and are difficult to debug.
They don't matter so much for other consumers, but a11y is very sensitive to these kinds of bugs.
Having our own logic means we can make changes as needed without potentially affecting other PeekOffset consumers.
Second, while the implementation currently only works on LocalAccessibles (because we don't have text in RemoteAccessible yet), it's been designed so that it should be very easy to adapt for RemoteAccessible.
It already walks the unified Accessible tree, so it just has to be taught how to get text from RemoteAccessible once that's possible.
This means we don't have to cache word boundaries; they can be calculated on demand in the parent process.

Again, this should be very easy to adapt to RemoteAccessible once we cache text there.

We won't use HyperTextAccessible for actual caching.
However, we have a lot of existing single process mochitests for HyperTextAccessible.
Putting it behind this pref makes it easy for us to run those tests against this new text implementation.
Eventually, we want to use this implementation for both local and remote Accessibles.

Blocks: 1730085
Blocks: 1730086
Blocks: 1730090
Blocks: 1730093
Attachment #9239809 - Attachment is obsolete: true
Depends on: 1730809
Blocks: 1730096
Attachment #9239810 - Attachment description: Bug 1729407 part 6: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. → Bug 1729407 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled.

Previously, the test asserted that the line offsets for an embedded object should be an empty range.
This is actually incorrect, since there is content on the line (an empty text box and a button).
Update this with the correct result, but mark it as an expected failure.
The new implementation fixes this, so once we enable it by default, we can assert success.

Blocks: 1730862
Blocks: 1730868
Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f08b66f1bfc2 part 1: Add new TextLeafPoint and TextLeafRange classes. r=eeejay https://hg.mozilla.org/integration/autoland/rev/25c73785043e part 2: Add TextLeafPoint support for finding line start boundaries for LocalAccessibles. r=eeejay https://hg.mozilla.org/integration/autoland/rev/1bdee10f6384 part 3: Add TextLeafPoint support for finding word boundaries. r=eeejay https://hg.mozilla.org/integration/autoland/rev/58281d3cac4c part 4: Add TextLeafPoint support for finding character boundaries. r=eeejay https://hg.mozilla.org/integration/autoland/rev/2c4dc811beb9 part 5: Use TextLeafRange for word/line starts in HyperTextAccessible::TextAtOffset if the cache is enabled. r=eeejay https://hg.mozilla.org/integration/autoland/rev/9fb2520417b2 part 6: Fix an existing line boundary test. r=eeejay
Blocks: a11y-ctw-text
No longer blocks: texta11y, a11y-ctw
Blocks: 1734582
Blocks: 1735464

IsStartOfGroup wants to peek at the previous character.
It calls PrevChar, but it doesn't check for failure.
It then unconditionally increments mOffset to "undo" PrevChar.
If PrevChar failed, that means we're now 1 character after where we started.
This wasn't causing any test failures and I can't think of a test case that shows breakage here.
Nevertheless, it definitely doesn't make sense and could cause difficult-to-diagnose bugs.

Comment on attachment 9245660 [details]
Bug 1729407: PrevWordBreakClassWalker::IsStartOfGroup: Don't advance mOffset if PrevChar fails.

Revision D128332 was moved to bug 1735464. Setting attachment 9245660 [details] to obsolete.

Attachment #9245660 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: