Closed Bug 1819802 Opened 2 years ago Closed 2 years ago

[CTW] HyperTextAccessibleBase::OffsetAtPoint is very slow with large text containers

Categories

(Core :: Disability Access APIs, defect)

defect

Tracking

()

VERIFIED FIXED
112 Branch
Tracking Status
firefox112 --- verified
firefox113 --- verified

People

(Reporter: Jamie, Assigned: Jamie)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [ctw-m6])

Attachments

(2 files)

STR:

  1. Open this test case (a paragraph with 20000 characters):
    data:text/html,<div id="div"><p id="p"></p><script>for (let i = 0; i < 10000; ++i) p.textContent += 'a ';</script>
  2. Ensure the document is focused.
  3. Open the Browser Console (control+shift+j).
  4. Paste this code:
    serv = Cc["@mozilla.org/accessibilityService;1"].getService(Ci.nsIAccessibilityService);
    doc = serv.getAccessibleFor(document.activeElement).firstChild;
    p = doc.firstChild.firstChild.QueryInterface(Ci.nsIAccessibleText);
    p.getOffsetAtPoint(0, 0, Ci.nsIAccessibleCoordinateType.COORDTYPE_PARENT_RELATIVE);
    
    • Expected: The response should be nearly instantaneous.
    • Actual: The browser hangs for more than 5 seconds.

This happens because OffsetAtPoint indirectly calls CharBounds on every character, which calls BoundsWithOffset for each character. BoundsWithOffset walks the Accessible ancestry for every call. This gives us O(n^2).

In theory, we should be able to avoid most of these BoundsWithOffset calls because many of them will occur on the same Accessible. If a text leaf has 20000 characters, we only need to fetch that leaf's bounds once. In practice, this is a bit tricky because there might be more than one text leaf in a HyperText container.

I can think of two possible solutions.

Solution 1:

  • Give TextLeafPoint::CharBounds an option to return coordinates relative to the leaf. Note that this is subtly different to COORDTYPE_PARENT_RELATIVE. Parent relative coordinates would be relative to the parent Accessible. We want them relative to the leaf itself so we don't have to do any acc bounds calculation at all.
  • In OffsetAtPoint, we'd get the bounds for the TextLeafPoint's mAcc, but only if the mAcc hasn't changed since the last character. Then we'd add the self relative CharBounds.
  • But we probably can't just add the CharBounds if there's a transform on mAcc. I'm not sure how to deal with that. Applying the transform in OffsetAtPoint seems awful.
  • This is kinda ugly. :(

Solution 2:

  • Lazily cache the calculated bounds for each Accessible in a hash map on the document. This way, BoundsWithOffset can just early return those if they're already calculated.
  • Whenever a bounds cache update occurs (on any Accessible), clear the entire hash map. This is a naive caching strategy, but it should work well enough.
  • This avoids OffsetAtPoint or CharBounds needing to manage any of this.
  • We still have the transform problem, though.

Morgan, I'd love your thoughts on this when you get a chance.

Flags: needinfo?(mreschenberg)

For the second solution (caching calculated bounds), I guess we could cache the bounds up to this Accessible. BoundsWithOffset could still do the bit where it applies the offset and then the transform. Then it'd just add the cached calculated bounds after that.

But... arrrgh! Actually, none of this will work because transforms from ancestors also affect the character rects. Once we cache any calculated coordinates, we've lost the granularity needed to transform things correctly.

Can we somehow convert the requested point into a point relative to the leaf's mAcc? We'd then compare with leaf relative character rects directly. This is in contrast to the current approach where we have to get a screen rect for every character.

Note to self: profile before making assumptions about where the real problem lies.

O(n^2) is not ideal, but I don't know if we're going to be able to fix that due to transforms. However, this profile suggests that memory allocation might be the real bottleneck here. When we get the character bounds array (which we do for every single character), we have to convert it from an array of numbers to an array of rects. There are two problems with this:

  1. We don't specify the capacity of the array (which we can calculate), so each append causes reallocation.
  2. Ideally, we wouldn't build the rect array at all, as we're doing a lot of processing we don't need to there.
Assignee: nobody → jteh
Flags: needinfo?(mreschenberg)
Whiteboard: [ctw-m6]

When hit testing, we retrieve the bounds for every character in a text leaf in a separate call.
Because we need to convert from an array of flat ints to rects, this previously meant we built the entire array for every character.
For a large text leaf, building this rect array is relatively expensive.
We only need a single rect at a time.
Therefore, RemoteAccessibleBase now returns the rect for a single requested character instead of building an array of rects.

For the viewport cache and character rects, we have at least a reasonable estimate of the number of elements that will be required.
Pre-allocating these saves on potentially costly re-allocation as we append elements.

Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8b31039c5cab part 1: When getting cached a11y character bounds, return a single rect rather than an array of rects. r=morgan https://hg.mozilla.org/integration/autoland/rev/57f7ab1f9f72 part 2: Pre-allocate some arrays in LocalAccessible::BundleFieldsForCache. r=morgan,nlapre
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 112 Branch
Flags: qe-verify+

Reproduced the issue on Firefox 112.0a1 under macOS 13.3 by following the STR from Comment 0.

The issue is fixed on Firefox 112.0b9 and Firefox 113.0a1 (2023-04-02). Tests were performed on macOS 13.3, Windows 11 and Ubuntu 22.04.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: