atk_text_get_text_at_offset returns bogus offsets and incorrect units of text
Categories
(Core :: Disability Access APIs, defect, P3)
Tracking
()
People
(Reporter: jdiggs, Assigned: MarcoZ)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files)
Steps to reproduce:
-
Load the attached test case in Firefox
-
Launch Accerciser and locate the object associated with the contents in the code block.
-
In Accerciser's iPython console, type the following:
t = acc.queryText()
t.getTextAtOffset(95, TEXT_BOUNDARY_LINE_START)
t.getTextAtOffset(96, TEXT_BOUNDARY_LINE_START)
t.getTextAtOffset(100, TEXT_BOUNDARY_LINE_START)
t.getTextAtOffset(120, TEXT_BOUNDARY_LINE_START)
Expected results: A single line and valid start and end offsets would be returned each time.
Actual results:
t.getTextAtOffset(95, TEXT_BOUNDARY_LINE_START) returns ('', 95, 1)
t.getTextAtOffset(96, TEXT_BOUNDARY_LINE_START) returns ('', 95, 1)
t.getTextAtOffset(100, TEXT_BOUNDARY_LINE_START) returns ('', 95, 1)
t.getTextAtOffset(120, TEXT_BOUNDARY_LINE_START) returns ('module Configurable\n def self.included(host_class)\n host_class.extend ClassMethods\n end\n\n module ClassMethods\n def config\n', 1, 132)
Impact: This degree of brokenness makes it very difficult for Orca to provide access to this content.
Reporter | ||
Comment 1•6 years ago
|
||
By the way, until things start failing (at offset 95), results look normal:
t.getTextAtOffset(0, TEXT_BOUNDARY_LINE_START): ('\n', 0, 1)
t.getTextAtOffset(1, TEXT_BOUNDARY_LINE_START): ('module Configurable\n', 1, 21)
t.getTextAtOffset(21, TEXT_BOUNDARY_LINE_START): (' def self.included(host_class)\n', 21, 53)
t.getTextAtOffset(53, TEXT_BOUNDARY_LINE_START): (' host_class.extend ClassMethods\n', 53, 88)
t.getTextAtOffset(88, TEXT_BOUNDARY_LINE_START): (' end\n', 88, 94)
t.getTextAtOffset(94, TEXT_BOUNDARY_LINE_START): Out[7]: ('\n', 94, 95)
Comment 2•6 years ago
|
||
Wow. We have quite a few nasty text bugs, but this is probably the wackiest I've ever seen. Thanks for reporting, even if I'm going to be scarred forever after... :)
I distilled the test case down to this:
data:text/html,<span style="display: inline-block;"><span>a<br>b<br><span></span></span>c</span>
Line for each offset:
0: okay, (0, 2, 'a\n')
1: okay, (0, 2, 'a\n')
2: fail, expected (2, 4, 'b\n'), actual (2, 0, null)
3: fail, expected (2, 4, 'b\n'), actual (2, 0, null)
4: fail, expected (4, 5, 'c'), actual (0, 5, 'a\nb\nc')
If you remove the inline-block style, all works as expected. The inner spans do affect the result; removing each of them changes the result in different ways.
Comment 3•6 years ago
|
||
(In reply to James Teh [:Jamie] from comment #2)
Wow. We have quite a few nasty text bugs, but this is probably the wackiest
I've ever seen. Thanks for reporting, even if I'm going to be scarred
forever after... :)I distilled the test case down to this:
data:text/html,<span style="display: inline-block;"><span>a<br>b<br><span></span></span>c</span>
Line for each offset:
0: okay, (0, 2, 'a\n')
1: okay, (0, 2, 'a\n')
2: fail, expected (2, 4, 'b\n'), actual (2, 0, null)
3: fail, expected (2, 4, 'b\n'), actual (2, 0, null)
4: fail, expected (4, 5, 'c'), actual (0, 5, 'a\nb\nc')If you remove the inline-block style, all works as expected. The inner spans
do affect the result; removing each of them changes the result in different
ways.
Do you reproduce this on Windows also? The is marked currently for Linux-only.
Best regards,
Alex.
Comment 4•6 years ago
|
||
The testing in comment 2 was done on Windows. I just forgot to update the platform. Thanks for catching.
Comment 5•6 years ago
|
||
I tracked case "2:" down to nsTextFrame::GetChildFrameContainingOffset : the loop quickly stops because f->GetNextContinuation() returns NULL, before we reach the requested offset. AIUI that's not expected, we should find a frame containing the requested offset before calling frameAtOffset->PeekOffset()? Could it just be that the code producing the text frames is missing setting mNextContinuation in this precise case?
Comment 6•6 years ago
|
||
Any idea who we could ask for advice on this on the text frame production side?
Comment 7•6 years ago
|
||
Emilio, can you help here? (See comment 5.) I confess I know very little (yet) about how a11y interacts with layout for text offsets. However, it does seem odd that GetNextContinuation() returns null in this specific case.
Comment 8•6 years ago
|
||
(In reply to James Teh [:Jamie] from comment #7)
Emilio, can you help here? (See comment 5.) I confess I know very little (yet) about how a11y interacts with layout for text offsets. However, it does seem odd that GetNextContinuation() returns null in this specific case.
Well, there are no continuations in this test-case, so that's not really surprising. A continuation happens when you split a text-node into multiple lines. In this test-case each text-node is different so there should be no continuation.
What are the offsets that you're giving to layout? What's the expected interpretation of those offsets? Editing (caret positioning and such) uses the same APIs, and seems to work fine, so there's probably a bogus assumption somewhere else.
Could somebody paste a stack trace of where the error happens and how do the arguments given to GetChildFrameContainingOffset look like?
Assignee | ||
Comment 9•4 years ago
|
||
Emilio, the calls start with HyperTextAccessible::TextAtOffset, steps into HyperTextAccessible::FindLineBoundaries, and from there into HyperTextAccessible::FindOffset. aOffset is the parameter Jamie is giving above, e. g. the offsets 0, 1, ... 4, 5. What we want returned is always the line at the offset, when queried on the accessible that is created for the span that has the inline-block style in Jamie's distilled testcase from comment #2.
FindLineBoundary tries to get both start and end offsets. The start offset is correct, the end offset is not. The end offset is always the point where the next interesting thing starts. For offsets 0 and 1, the end offset is 2, which is where the next line starts, etc.
FindLineBoundaries, when looking for the NextLineBegin offset, first simulates the press of the DownArrow into the next line, then a Home key press to get to the start of the line it lands on. It uses different variations of FindOffsets for that, and since bug 1670541, has some special handlings when landing on embedded characters. That bug doesn't solve this one, though, so FindOffset has some bug where it returns 0 for the end offset.
FindOffset is quite a beast. It first tries to find the deepest text leaf for its current HyperTextAccessible (first loop). With the offset it finds for the deepest text leaf, it then finds the frame, and then tries to get the offset from layout.
Sorry, I don't know how to create a real stack trace because debug builds are terrible to use with a screen reader. Does this above help in any way?
Assignee | ||
Comment 12•4 years ago
|
||
If an inline-block element contains nested elements that may contain line breaks, or whitespace that comes from how the HTML file is formatted, both start and end offsets returned from layout may be incorrect. Adjust for the following cases:
- The start offset goes too far back from the passed-in offset, overshooting a line boundary.
- The end offset returned is smaller than the passed-in offset, even though we want the start of the next line. Even on the last line, this should never happen.
Updated•4 years ago
|
Comment 13•4 years ago
|
||
Comment 14•4 years ago
|
||
bugherder |
Updated•4 years ago
|
Comment 15•3 years ago
|
||
Cancelling NI because 1) the patch in this bug works around the problem here; and 2) the new TextLeafPoint implementation in bug 1729407 also solves this, but in a cleaner way.
Description
•