Closed
Bug 961737
Opened 11 years ago
Closed 11 years ago
Regression: atk_text_get_text() is broken w.r.t. embedded object characters
Categories
(Core :: Disability Access APIs, defect)
Tracking
()
RESOLVED
FIXED
mozilla30
People
(Reporter: jdiggs, Assigned: surkov)
References
(Blocks 1 open bug)
Details
Attachments
(4 files, 2 obsolete files)
(deleted),
text/x-python
|
Details | |
(deleted),
patch
|
tbsaunde
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
tbsaunde
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
tbsaunde
:
review+
|
Details | Diff | Splinter Review |
Steps to reproduce:
1. Load the attached accessible-event listener in a terminal
2. (Re)load the test case (attachment 827653 [details])
3. Compare the results from Firefox 24 and nightly.
Things that implement AtkText and claim to have text (string, character count) used to report that text when using atk_get_text_at_offset(). That seems to no longer be the case when the text consists of one or more embedded object characters. Only getting a character works. See results below.
I am admittedly not a fan of digging down via the accessible text interface trying to piece together lines and implementing an Orca-controlled caret for Gecko content. ;) BUT, given that this seems necessary, whatever change is responsible for this new behavior, it breaks stuff for Orca. And even if, as part of the Great Orca Rewrite I'm doing, I solve this in Orca, that's not going to solve things for users of previous versions of Orca. So I'm going to assume this is an accidental regression, but if it is by design I still think it needs to be reversed for those "long term stable release" folks.
==================
Firefox 24 results
==================
-> [document frame | List item bug] (0, 137, 1080, 1756)
textinfo: offset: 0 length: 1 string: [OBJ]
lineinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
wordinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
charinfo: <[OBJ]> (0, 1) (8, 145, 1064, 52)
-> [list | ] (8, 145, 1064, 52)
textinfo: offset: -1 length: 2 string: [OBJ]
lineinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
wordinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
charinfo: <[OBJ]> (0, 1) (48, 145, 1024, 26)
-> [list item | • Item 1] (32, 145, 1040, 26)
textinfo: offset: -1 length: 7 string: •Item 1
lineinfo: <•Item 1> (0, 7) (32, 145, 65, 26)
wordinfo: <1> (6, 7) (88, 145, 9, 26)
charinfo: <•> (0, 1) (32, 155, 8, 10)
-> [list item | • Item 2] (32, 171, 1040, 26)
textinfo: offset: -1 length: 7 string: •Item 2
lineinfo: <•Item 2> (0, 7) (32, 171, 71, 26)
wordinfo: <2> (6, 7) (93, 171, 10, 26)
charinfo: <•> (0, 1) (32, 181, 8, 10)
===============
Nightly results
===============
-> [document frame | List item bug] (0, 110, 1080, 1783)
textinfo: offset: 0 length: 1 string: [OBJ]
lineinfo: <> (-1, -1) (0, 0, 0, 0)
wordinfo: <> (0, 0) (0, 0, 0, 0)
charinfo: <[OBJ]> (0, 1) (8, 118, 1064, 52)
-> [list | ] (8, 118, 1064, 52)
textinfo: offset: -1 length: 2 string: [OBJ]
lineinfo: <> (-1, -1) (0, 0, 0, 0)
wordinfo: <> (1, 1) (0, 0, 0, 0)
charinfo: <[OBJ]> (0, 1) (48, 118, 1024, 26)
-> [list item | • Item 1] (32, 118, 1040, 26)
textinfo: offset: -1 length: 7 string: •Item 1
lineinfo: <•Item 1> (0, 7) (32, 118, 65, 26)
wordinfo: <> (1, 1) (0, 0, 0, 0)
charinfo: <•> (0, 1) (32, 128, 16, 10)
-> [list item | • Item 2] (32, 144, 1040, 26)
textinfo: offset: -1 length: 7 string: •Item 2
lineinfo: <•Item 2> (0, 7) (32, 144, 71, 26)
wordinfo: <•> (0, 1) (32, 154, 16, 10)
charinfo: <•> (0, 1) (32, 154, 16, 10)
Assignee | ||
Comment 1•11 years ago
|
||
I think it's reasonable if text consisting of embedded char returns the char when line is requested
Assignee | ||
Comment 2•11 years ago
|
||
Not sure what wanted behavior is though. Say if we have
<body>
<p>first paragraph</p>
<p>second paragrahp</p>
</body>
so the document text is two embed characters. Say if the first paragraph takes multiple lines then what is getTextAtOffset(0, LINE_BOUNDARY) supposed to return? Should it be (0, 1) or (0, 0)?
Flags: needinfo?(jamie)
Comment 3•11 years ago
|
||
(In reply to alexander :surkov from comment #2)
> so the document text is two embed characters. Say if the first paragraph
> takes multiple lines then what is getTextAtOffset(0, LINE_BOUNDARY) supposed
> to return? Should it be (0, 1) or (0, 0)?
It should be (0, 1), which indicates that the embedded object consumes at least one entire line. (0, 0) is a collapsed range suggesting that the line contains nothing, which is incorrect.
Flags: needinfo?(jamie)
Assignee | ||
Comment 4•11 years ago
|
||
part1: fix a generic (non list) case
Assignee: nobody → surkov.alexander
Attachment #8363303 -
Flags: review?(trev.saunders)
Assignee | ||
Comment 5•11 years ago
|
||
Attachment #8363303 -
Attachment is obsolete: true
Attachment #8363303 -
Flags: review?(trev.saunders)
Attachment #8363305 -
Flags: review?(trev.saunders)
Comment 6•11 years ago
|
||
Comment on attachment 8363305 [details] [diff] [review]
part1:patch2
>+ addTextOffset = static_cast<uint32_t>(
>+ aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));
imho that's a really hard way to read that.
if (aIsEndOffset)
addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
else
addOffset = 0
is a lot clearer even though its longer. It also seems like it would be nice if addOffset could be a bool if we should add one or not, not a number to add.
Also that's going to make that loop a good bit slower, maybe check if its the first content child? I guess that's not a great idea. Maybe we should just finally devirtualize GetChildAt() and then use FirstChild()
>+ testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
>+ [ [ 0, 0, kEmbedChar, 0, 1 ] ]);
why not test BOUNDARY_LINE_END isn't that what you fix?
Attachment #8363305 -
Flags: review?(trev.saunders) → review+
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Trevor Saunders (:tbsaunde) from comment #6)
> Comment on attachment 8363305 [details] [diff] [review]
> part1:patch2
>
> >+ addTextOffset = static_cast<uint32_t>(
> >+ aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));
>
> imho that's a really hard way to read that.
>
> if (aIsEndOffset)
> addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
> else
> addOffset = 0
>
> is a lot clearer even though its longer.
ok if you want
> It also seems like it would be
> nice if addOffset could be a bool if we should add one or not, not a number
> to add.
addTextOffset may also be used to keep an offset in text node
> Also that's going to make that loop a good bit slower, maybe check if its
> the first content child? I guess that's not a great idea. Maybe we should
> just finally devirtualize GetChildAt() and then use FirstChild()
instead IndexInParent? not sure
> >+ testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
> >+ [ [ 0, 0, kEmbedChar, 0, 1 ] ]);
>
> why not test BOUNDARY_LINE_END isn't that what you fix?
original test is LINE_START, iirc _END consts were finally deprecated so I don't longer worry about it
Comment 8•11 years ago
|
||
(In reply to alexander :surkov from comment #7)
> (In reply to Trevor Saunders (:tbsaunde) from comment #6)
> > Comment on attachment 8363305 [details] [diff] [review]
> > part1:patch2
> >
> > >+ addTextOffset = static_cast<uint32_t>(
> > >+ aIsEndOffset && (addTextOffset > 0 || descendantAcc->IndexInParent() > 0));
> >
> > imho that's a really hard way to read that.
> >
> > if (aIsEndOffset)
> > addOffset = addOffset > 0 || child->IndexInParent() > 0 ? 1 : 0;
> > else
> > addOffset = 0
> >
> > is a lot clearer even though its longer.
>
> ok if you want
thx, my eyes bleed on the other form ;)
> > It also seems like it would be
> > nice if addOffset could be a bool if we should add one or not, not a number
> > to add.
>
> addTextOffset may also be used to keep an offset in text node
yeah, I guess this is text stuff its doomed to being hard to read.
> > Also that's going to make that loop a good bit slower, maybe check if its
> > the first content child? I guess that's not a great idea. Maybe we should
> > just finally devirtualize GetChildAt() and then use FirstChild()
>
> instead IndexInParent? not sure
yeah, just trying to think of way to make this less slow.
> > >+ testTextAtOffset(["ht_5" ], BOUNDARY_LINE_START,
> > >+ [ [ 0, 0, kEmbedChar, 0, 1 ] ]);
> >
> > why not test BOUNDARY_LINE_END isn't that what you fix?
>
> original test is LINE_START, iirc _END consts were finally deprecated so I
> don't longer worry about it
hm, and windows / jsat don't use them? I guess that's fine, just looks kind of unbalanced.
Assignee | ||
Comment 9•11 years ago
|
||
(In reply to Trevor Saunders (:tbsaunde) from comment #8)
> > ok if you want
>
> thx, my eyes bleed on the other form ;)
I cannot say I disliked the previous form :)
> > > It also seems like it would be
> > > nice if addOffset could be a bool if we should add one or not, not a number
> > > to add.
> >
> > addTextOffset may also be used to keep an offset in text node
>
> yeah, I guess this is text stuff its doomed to being hard to read.
heritage :)
> > instead IndexInParent? not sure
>
> yeah, just trying to think of way to make this less slow.
yep, shouldn't be bottleneck though
> > original test is LINE_START, iirc _END consts were finally deprecated so I
> > don't longer worry about it
>
> hm, and windows / jsat don't use them? I guess that's fine, just looks kind
> of unbalanced.
agree. windows yes, jsat is quite likely too, just patching saving some time for urgent stuff
Assignee | ||
Comment 10•11 years ago
|
||
Comment on attachment 8363305 [details] [diff] [review]
part1:patch2
landed http://hg.mozilla.org/integration/mozilla-inbound/rev/700b3557403d
Assignee | ||
Updated•11 years ago
|
Whiteboard: [leave open]
Comment 11•11 years ago
|
||
Assignee | ||
Comment 12•11 years ago
|
||
make cleaning up of DOMPointToOffset, the result code is not really equivalent to old one but mochitest suite passes. CharacterCount() as fallback value in DOMPointToOffset is primarily needed for "stranges" of PeekOffset work.
Attachment #8365545 -
Flags: review?(trev.saunders)
Assignee | ||
Comment 13•11 years ago
|
||
Attachment #8365545 -
Attachment is obsolete: true
Attachment #8365545 -
Flags: review?(trev.saunders)
Attachment #8366114 -
Flags: review?(trev.saunders)
Assignee | ||
Comment 14•11 years ago
|
||
Attachment #8368950 -
Flags: review?(trev.saunders)
Comment 15•11 years ago
|
||
Comment on attachment 8366114 [details] [diff] [review]
part2_cleanup:patch2
>+HyperTextAccessible::DOMPointToOffset(nsINode* aNode, int32_t aNodeOffset,
>+ bool aIsEndOffset) const
> {
>- if (!aHyperTextOffset)
>- return nullptr;
>- *aHyperTextOffset = 0;
>-
> if (!aNode)
>- return nullptr;
>+ return 0;
when can it happen? can we just ban it?
A>+ // If the given DOM point cannot be mapped into offset relative this hypertext
>+ // offset then return length as fallback value.
when exactly is this needed?
Its really hard to say if this is exactly correct, but it doesn't seem particularly unreasonable.
Attachment #8366114 -
Flags: review?(trev.saunders) → review+
Assignee | ||
Comment 16•11 years ago
|
||
(In reply to Trevor Saunders (:tbsaunde) from comment #15)
> >+HyperTextAccessible::DOMPointToOffset(nsINode* aNode, int32_t aNodeOffset,
> >+ bool aIsEndOffset) const
> > if (!aNode)
> >- return nullptr;
> >+ return 0;
>
> when can it happen? can we just ban it?
somebody relies on it, I saw crashes
>
> A>+ // If the given DOM point cannot be mapped into offset relative this
> hypertext
> >+ // offset then return length as fallback value.
>
> when exactly is this needed?
when you do PeekOffset on input element then you can get outside input element, that was a case iirc
> Its really hard to say if this is exactly correct, but it doesn't seem
> particularly unreasonable.
the thesis is common for text interface stuff, we increase test coverage and one day it will guarantee the correct work
Comment 17•11 years ago
|
||
Comment on attachment 8368950 [details] [diff] [review]
part3 (list items):patch1
>+ int32_t offset = aOffset;
>+ Accessible* descendant = aDescendant;
>+ while (descendant && !descendant->IsDoc()) {
shouldn't we always find this before we get to a document? why can't we just assert that?
>+ // HTML list items needs special processing because PeekOffset doesn't work
>+ // with list bullets.
please file a bug to fix it
>+ (aAmount == eSelectBeginLine || aAmount == eSelectLine)) {
>+ if (text != this)
>+ return TransformOffset(text, 0, false);
>+
>+ // If we are in list item text then fall down into original procedure
>+ // since text may be multiline.
ending the block with a comment seems weird on top of this already being ubber ugly
>+ }
>+
extra blank line
>+ } else {
>+ // We assume that bullet is one single char.
can we assert that?
>+ // Ask a text leaf next to the bullet for an offset since list item
>+ // may be multiline.
>+ return aOffset + 1 < CharacterCount() ?
when is this not true?
>- if (hyperTextOffset == CharacterCount() && aDirection == eDirPrevious)
>- return 0;
>+ if (aDirection == eDirPrevious) {
>+ // If we reached the end during search, this means we didn't find the DOM point
>+ // and we're actually at the start of the paragraph
when is this ok? if there's text there shouldn't we have accessibles?
can you make sure we test list items with no bullet and test stuff with :before and :after?
Attachment #8368950 -
Flags: review?(trev.saunders) → review+
Assignee | ||
Comment 18•11 years ago
|
||
Comment on attachment 8366114 [details] [diff] [review]
part2_cleanup:patch2
https://hg.mozilla.org/integration/mozilla-inbound/rev/a70d1fe8d510
Assignee | ||
Comment 19•11 years ago
|
||
(In reply to Trevor Saunders (:tbsaunde) from comment #17)
> >+ int32_t offset = aOffset;
> >+ Accessible* descendant = aDescendant;
> >+ while (descendant && !descendant->IsDoc()) {
>
> shouldn't we always find this before we get to a document? why can't we just
> assert that?
it's sounds good
> >+ // HTML list items needs special processing because PeekOffset doesn't work
> >+ // with list bullets.
>
> please file a bug to fix it
I'm not sure this a bug since keyboard skips list bullets from navigation
> >+ (aAmount == eSelectBeginLine || aAmount == eSelectLine)) {
> >+ if (text != this)
> >+ return TransformOffset(text, 0, false);
> >+
> >+ // If we are in list item text then fall down into original procedure
> >+ // since text may be multiline.
>
> ending the block with a comment seems weird on top of this already being
> ubber ugly
ok, I'll remove the comment since it doesn't really seem useful
> >+ } else {
> >+ // We assume that bullet is one single char.
>
> can we assert that?
assert is not really needed since this is obviously wrong assumption (for example in case of list-style-type:upper-roman style). I changed it to comment: "It works only when the bullet is one single char." I will file a bug.
> >+ // Ask a text leaf next to the bullet for an offset since list item
> >+ // may be multiline.
> >+ return aOffset + 1 < CharacterCount() ?
>
> when is this not true?
no text list items, I added "(if not empty)" after "Ask a text leaf next"
> >- if (hyperTextOffset == CharacterCount() && aDirection == eDirPrevious)
> >- return 0;
> >+ if (aDirection == eDirPrevious) {
> >+ // If we reached the end during search, this means we didn't find the DOM point
> >+ // and we're actually at the start of the paragraph
>
> when is this ok? if there's text there shouldn't we have accessibles?
I guess it's something related with PeekOffset work, it's full of surprises
> can you make sure we test list items with no bullet and test stuff with
> :before and :after?
:before and :after will be broken since PeekOffset presumably don't work with them. I could have a test for no bullet case but it rather belongs to list bullets bug
Comment 20•11 years ago
|
||
Flags: in-testsuite+
Assignee | ||
Comment 21•11 years ago
|
||
Comment on attachment 8368950 [details] [diff] [review]
part3 (list items):patch1
https://hg.mozilla.org/integration/mozilla-inbound/rev/826d951bd6fa
Assignee | ||
Updated•11 years ago
|
Whiteboard: [leave open]
Comment 22•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in
before you can comment on or make changes to this bug.
Description
•