Closed Bug 478792 Opened 16 years ago Closed 16 years ago

svg text highlighting & hit-testing should use glyph advance, not ink extents

Categories

(Core :: SVG, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: jfkthame, Assigned: jfkthame)

References

Details

Attachments

(2 files, 2 obsolete files)

Attached image svg-rendered text showing glyph areas (deleted) —
The SVG code currently uses glyph "ink extents" for functions such as GetExtentOfChar() and GetCharNumAtPosition() (implemented in nsSVGGlyphFrame.cpp). This is a problem when these functions are used for what is likely to be their main purpose, determining the appropriate areas for highlighting or hit-testing text.

When glyphs extend beyond their typographic origin and advance, it is better for operations such as selection, highlighting, and hit-testing to be based on the glyph advance rather than the bounding box of the ink. Otherwise there is a risk of glyphs that overhang their neighbors effectively making it impossible to "hit" certain characters in the text.

The attachment shows some text drawn with SVG using the Zapfino font. Consider the character "f", highlighted in red in the first line. If we use getExtentOfChar to provide typical "selection highlighting" for this character, the area highlighted will be the rect shown in green in the second line. To the user, this appears to include several neighboring characters, although it is in fact just the bounding box of the "f" glyph.

Worse still, consider trying to map a mouse-click to an underlying character using getCharNumAtPosition. Because the function in nsSVGGlyphFrame.cpp takes the last-drawn character whose rect includes the given point, and because the "g" in the following word has a left-overlapping descender, the only area where a click will map to the "f" is the blue rect shown on the third line. And certain characters such as the "i" of "if" and even the "s" at the end of "happens" will be completely impossible to hit, as they are entirely overshadowed by the bounding box of the "f".

The user experience will be greatly improved by using the glyph origin and advance (along with the font ascent and descent) rather than the ink rect in these situations. The one exception would be zero-width glyphs (typically diacritics); for these it is probably preferable to use the ink rectangle, otherwise they have no valid "hit area" at all.
This patch will provide more reasonable behavior in the overlapping-glyph situations described.

It will also allow the ClearType-related glyph extent patches required for bug 445087 and 475968 to land without breaking existing SVG tests (in particular svg/content/test/test_text.html, which uses getExtentOfChar and getCharNumAtPosition for highlight-area and hit-test type of functionality).

Besides the places patched here, the glyph-metrics mBoundingBox is also used in nsSVGGlyphFrame::ContainsPoint and nsSVGGlyphFrame::AddBoundingBoxesToPath. I am not sure how these are most likely to be used, and so it is not clear to me at the moment whether they should be patched in a similar way.
Assignee: nobody → jfkthame
Attachment #362640 - Flags: review?(roc)
Having second thoughts about the behavior of getExtentOfChar; seems to me that it's questionable which kind of "extent" this should give, depending how we expect it to be used. Perhaps only getCharNumAtPosition should be patched here.

Hmm, perhaps I should look up the SVG spec. :)  Ah, it says that getExtentOfChar "Returns a tightest rectangle which defines the minimum and maximum X and Y values in the user coordinate system for rendering the glyph(s) that correspond to the specified character." So that does sound like an ink extent. However, it also says "The calculations assume that all glyphs occupy the full standard glyph cell for the font", which means it ignores the ink vertically. A strange hybrid.

OK, I've read that our policy is to implement the spec as written, wherever possible. In that case, neither our existing (ink bounding box) behavior nor this patch is correct for getExtentOfChar. Guess I should revise the patch to implement something closer to the official spec, though I'm not convinced it's preferable.
The spec is quite clear about how diacritics should be handled:
http://www.w3.org/TR/SVG/text.html#DOMInterfaces
"If multiple consecutive characters are rendered inseparably (e.g., as a single glyph or a sequence of glyphs), then each of the inseparable characters will return the same extent."

I'm not sure if switching to advance widths and font metrics is actually allowed:

"Returns a tightest rectangle which defines the minimum and maximum X and Y values in the user coordinate system for rendering the glyph(s) that correspond to the specified character. The calculations assume that all glyphs occupy the full standard glyph cell for the font."

"Returns the index of the character whose corresponding glyph cell bounding box contains the specified point. The calculations assume that all glyphs occupy the full standard glyph cell for the font."

I agree this is confusing. I'm not sure what exactly the "glyph cell bounding box" means, and I'm not sure whether saying "all glyphs occupy the full standard glyph cell for the font" is meant to be setting a lower bound on the area returned or meant to be setting equal the area returned. Maybe jwatt or Robert can comment here.

ContainsPoint is used for mouse event targeting, and AddBoundingBoxesToPath is used to calculate the "covered region". The latter definitely requires ink extents, the former may as well too.
*ahem* It might be worth testing Webkit, Opera and Batik to see how they interpreted the spec.
In Safari, my code for highlighting the rects isn't working, not sure why offhand. However, playing with selection in the svg text makes it clear that they're using the "standard glyph cell" from (origin, font-ascent) to (advance, font-descent) for hit-testing and highlight-drawing purposes, at least.

Text selection isn't working for me in firefox (is it supposed to?), so I can't compare that directly.
(In reply to comment #3)

> I agree this is confusing. I'm not sure what exactly the "glyph cell bounding
> box" means, and I'm not sure whether saying "all glyphs occupy the full
> standard glyph cell for the font" is meant to be setting a lower bound on the
> area returned or meant to be setting equal the area returned.

The only reasonable interpretation I can come up with is that the "glyph cell" means the rectangle from (origin, font-ascent) to (advance, font-descent). They're telling us to use the font ascent/descent metrics throughout, rather than true glyph heights and depths, because they don't want glyph selection areas (e.g., for highlighting or hit-testing) that go up and down like a city skyline where there are ascenders, capitals, etc. They don't want an "a" (let alone ".") to have a shorter hit-area than a "b" just because it doesn't reach up to the font ascent line.

I think this text was written to address the situation where glyphs fall considerably short of the "glyph cell", whether in the vertical direction (like many lowercase letters, punctuation, etc) or the horizontal (like narrow letters or punctuation in a monospaced font); it's less clear what we're supposed to do about glyphs where the ink extends beyond the "full standard glyph cell", but for the getCharNumAtPosition function, at least, I think our current behavior is clearly not useful.
Opera was disappointing; it displays the Zapfino text with clipping of the ascenders and descenders, obviously using the font ascent and descent to clip the drawing. It then crashes trying to run my onload() handler that's supposed to highlight the glyph rects. :(

If I remove that script, I can try text selection; this confirms that they're using the glyph origin/advance as the basis of hit-testing, not the actual ink extent.
Batik's implementation of getExtentOfChar() is returning the "standard glyph cell" from (origin, font-ascent) to (advance, font-descent) with no regard for ink that extends beyond this cell. So the "extent" of the Zapfino "f", for example, is actually narrower than if I replace it with "n".
Alright, I'm convinced! The only thing is that I don't think we should special-case for zero-width glyphs. Instead we should just fix clusters to give each character the extents of the full cluster. That should be easy using gfxTextRun::IsClusterStart(), but it could be done in a separate bug if you like.
This implements what the spec is trying to say, AFAICT. It passes our existing SVG mochitests; I'll look into adding some new tests specifically for the char-cluster support, as that's currently untested.
Attachment #362640 - Attachment is obsolete: true
Attachment #362700 - Flags: review?(roc)
Attachment #362640 - Flags: review?(roc)
The SVG specification is known to be pretty close to impenetrable at this point. bug 369054 contains the last discussion scattered amongst the general yelling as does http://lists.w3.org/Archives/Public/www-svg/2007Feb/thread.html#msg14 

How does your change fit in with these discussions?
Thanks for the pointers. Didn't realize this (in other manifestations) had been quite such an issue!

AFAICT, the spec remains less than crystal-clear (as discussed in earlier comments here), but the interpretation that character "boxes" do not necessarily correspond strictly to a tight bounding box of the painted area, but rather are based on metrics that may include whitespace within the "glyph cell", seems to have fairly broad support both in the discussions and among other implementations.

I haven't noticed much prior discussion of how to handle the case where the glyph's ink extends *outside* the metrics-based cell; the concern previously has been about varying heights (where many glyphs do not actually reach to the ascent or descent line) and/or gaps between characters (due to the sidebearings of the glyphs within their cells).

One could argue that we should consider the rect from (origin, ascent) .. (advance, descent) as a *minimum* rather than necessarily the final dimensions of the glyph cell, and extend it further if the ink falls outside this initial rect, but (a) nobody else seems to be using or promoting that interpretation at present, and (b) at least for hit-testing purposes, it can introduce major usability problems as illustrated in the Zapfino example. (Yes, that's a relatively extreme font, at least for Latin script, but serves to demonstrate the problem; and with simpler fonts, it's a non-issue anyway as glyphs don't typically extend far beyond the simple metrics-based cell.)

So overall, I think this change is in line with the general thrust of the discussions, and with a reasonable interpretation of the spec, although the spec itself remains open to various readings as it fails to define its terms clearly. This change is also in line with the interpretation that other actual implementations seem to be using, and it does begin to address one issue - character clusters - that we haven't handled until now.
Comment on attachment 362700 [details] [diff] [review]
patch v2, based on comments - supports character clusters

+  while (start > 0 && !mTextRun->IsClusterStart(start))
+    --start;
+  while (limit < mTextRun->GetLength() && !mTextRun->IsClusterStart(limit))
+    ++limit;

{} around these statements

+    while (limit < mTextRun->GetLength() && !mTextRun->IsClusterStart(limit))
+      ++limit;

Here too

+    while (++i < limit)
+      iter.NextChar();

And here
It'd be nice to get this landed soon, so that we can also land the Cairo ClearType-clipping patch (bug 445087) without breaking SVG mochitests.
Attachment #362700 - Attachment is obsolete: true
Keywords: checkin-needed
Whiteboard: [needs landing]
Comment on attachment 362858 [details] [diff] [review]
updated according to review comments
[Checkin: Comment 16]


http://hg.mozilla.org/mozilla-central/rev/f6cdd2d6a9ea
Attachment #362858 - Attachment description: updated according to review comments → updated according to review comments [Checkin: Comment 16]
Blocks: 445087
Status: NEW → RESOLVED
Closed: 16 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla1.9.2a1
Version: unspecified → Trunk
I think we need to land this on 1.9.1 also, because it blocks bug #445087, which is a 1.9.1 blocker.
Flags: blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1+
Whiteboard: [needs 1.9.1 landing]
Could we re-evaluate 1.9.1 blocking status for this, please. As we didn't take bug 445087 on the branch, I don't think this needs to block either any longer.
Flags: blocking1.9.1+ → blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1-
Whiteboard: [needs 1.9.1 landing]
Depends on: 590291
Can someone cc me on bug 590291 please?
(done)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: