Closed Bug 514968 Opened 15 years ago Closed 15 years ago

optimize FT2 text run creation

Categories

(Core :: Graphics, defect)

ARM
Windows CE
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta1-fixed
fennec 1.0b1+ ---

People

(Reporter: vlad, Assigned: vlad)

References

Details

(Whiteboard: [nv])

Attachments

(2 files, 1 obsolete file)

Attached patch add glyph cache to ft2 (obsolete) (deleted) — Splinter Review
The current FT2 code is highly unoptimal, especially where text run creation is concerned -- it calls FT_Load_Glyph for every glyph in the textrun, sometimes twice (due to the kerning). FT_Load_Glyph isn't cached in any way on the freetype side, so the entire glyph lookup and creation steps happen, including scaling the outline and hinting. This code adds a simple glyph metrics cache to the gfxFT2Font, caching just the pieces that are needed for measurement. This is 16 bytes per glyph plus hash table overhead, so memory usage isn't significant. Before this patch, gfxTextRunPerfTest was taking 1773s under WinCE; after, it's 213s, for an 8.3x speedup. Further gains could be made by optimizing the hash table code. This translates into a 2-3x speedup on pageload for many text-heavy pages, as well as a Ts speedup. (For reference, on my desktop fast x86 box, normal win32 font code takes 15s for the test.. before this patch, the ft2 patch took 100s, after it takes 5s.) This patch is built on top of the patch for bug 490267, which in turn depends on bug 493280. We still call FT_Load_Glyph separately from cairo, but cairo only does it for glyphs that are rendered.. however, there's a lot of overlap there, especially for latin text, and it'd be nice to move all of this stuff into one spot so that we only ever have to load each glyph just once.
Flags: wanted-fennec1.0?
Flags: blocking1.9.2+
Attachment #398981 - Flags: review?(jfkthame)
tracking-fennec: --- → 1.0b1+
Comment on attachment 398981 [details] [diff] [review] add glyph cache to ft2 This is obviously a huge win. :) A few small comments... >+ struct CachedGlyphData { >+ CachedGlyphData() >+ : glyph_index(0xffffffffU) { } >+ >+ CachedGlyphData(PRUint32 gid) >+ : glyph_index(gid) { } >+ >+ PRUint32 glyph_index; >+ PRInt32 lsb_delta; >+ PRInt32 rsb_delta; >+ PRInt32 x_advance; >+ }; Surely we normally use camelCase rather than underscores? To be really picky about following the style guide, I suppose the "gid" argument should be "aGid" (or "aGlyphID"). >+ const CachedGlyphData* GetGlyphDataForChar(PRUint32 ch) { Similarly, "aCh". (Do we really care in cases like this? I would most naturally write it the way you have.) >+ FT_Face mLockedFace; >+ PRUint32 mFaceLockCount; See below. >+ void FillGlyphDataForChar(PRUint32 ch, CachedGlyphData *gd); Argument names (sigh). >+ //nsDataHashtable<nsUint32HashKey, PRUint32, FT_Pos> mKerningCache; Thinking of caching kern data as well, apparently. Might be worthwhile, but I'd suggest leaving it until we get an idea of Harfbuzz performance, and then see if we need to maintain a simpler, non-HB path. >+ if (!mHasSpaceGlyph) { >+ FT_Face face = LockFace(); >+ >+ FT_UInt gid = FT_Get_Char_Index(face, ' '); > FT_Load_Glyph(face, gid, FT_LOAD_DEFAULT); >+ > mSpaceGlyph = gid; > mHasSpaceGlyph = PR_TRUE; >+ >+ UnlockFace(); >+ } I realize this particular line is pre-existing code, but as we're here.... is there a purpose to the FT_Load_Glyph call here? It's not obvious to me that it achieves anything useful. >+FT_Face >+gfxFT2Font::LockFace() >+{ >+ if (++mFaceLockCount == 1) { >+ mLockedFace = cairo_ft_scaled_font_lock_face(CairoScaledFont()); >+ if (!mLockedFace) >+ mFaceLockCount = 0; >+ } >+ >+ return mLockedFace; >+} >+ >+void >+gfxFT2Font::UnlockFace() >+{ >+ if (--mFaceLockCount == 0) { > cairo_ft_scaled_font_unlock_face(CairoScaledFont()); >+ mLockedFace = nsnull; > } >- return mSpaceGlyph; > } Could we please replace the use of gfxFT2Font::LockFace() / UnlockFace(), and the related member variables, with a simple stack-based helper class with a constructor that locks the face and destructor that unlocks it? Seems like that would be a better idiom. >+void >+gfxFT2Font::FillGlyphDataForChar(PRUint32 ch, CachedGlyphData *gd) >+{ >+ FT_Face face = LockFace(); >+ >+ FT_UInt gid = FT_Get_Char_Index(face, ch); >+ >+ if (gid == 0) { >+ // uh, how did we get here? we should totally not have gotten here, >+ // this font doesn't support this char! This comment sounds like maybe it should be an assertion. >+ gd->glyph_index = 0; >+ goto finish; Using a stack-based FT2FaceLocker would allow you to simply do an early return here. >+ } >+ >+ FT_Error err = FT_Load_Glyph(face, gid, FT_LOAD_DEFAULT); >+ >+ if (err) { >+ // hmm, this is weird, we failed to load a glyph that we had? >+ printf ("FT load error: %d (ch: %d)\n", err, ch); Use a aarning rather than a bare printf? >+ gd->glyph_index = 0; >+ goto finish; And here. >+ } >+ >+ gd->glyph_index = gid; >+ gd->lsb_delta = face->glyph->lsb_delta; >+ gd->rsb_delta = face->glyph->rsb_delta; >+ gd->x_advance = face->glyph->advance.x; >+ >+finish: Oops, the label has trailing whitespace. But in any case, I'm hoping you'll eliminate it entirely (see above).
(In reply to comment #1) > (From update of attachment 398981 [details] [diff] [review]) > This is obviously a huge win. :) > > A few small comments... > > >+ struct CachedGlyphData { > >+ CachedGlyphData() > >+ : glyph_index(0xffffffffU) { } > >+ > >+ CachedGlyphData(PRUint32 gid) > >+ : glyph_index(gid) { } > >+ > >+ PRUint32 glyph_index; > >+ PRInt32 lsb_delta; > >+ PRInt32 rsb_delta; > >+ PRInt32 x_advance; > >+ }; > > Surely we normally use camelCase rather than underscores? > > To be really picky about following the style guide, I suppose the "gid" > argument should be "aGid" (or "aGlyphID"). > > >+ const CachedGlyphData* GetGlyphDataForChar(PRUint32 ch) { > > Similarly, "aCh". (Do we really care in cases like this? I would most naturally > write it the way you have.) Opinions vary -- I normally don't care especially in gfx code, and didn't use mFoo notation because those aren't private members; it's just a normal struct that happens to have a constructor for convenience. Though I also hate the "aFoo" syntax for args in general, so maybe I'm not a good guideline. > >+ FT_Face mLockedFace; > >+ PRUint32 mFaceLockCount; > > See below. > > >+ void FillGlyphDataForChar(PRUint32 ch, CachedGlyphData *gd); > > Argument names (sigh). > > >+ //nsDataHashtable<nsUint32HashKey, PRUint32, FT_Pos> mKerningCache; > > Thinking of caching kern data as well, apparently. Might be worthwhile, but I'd > suggest leaving it until we get an idea of Harfbuzz performance, and then see > if we need to maintain a simpler, non-HB path. Yeah; I'll remove that line, maybe replace it with a comment in the function saying "maybe we should cache kerning data". The win is huge enough even without caching it, and caching the kerning data is more complex (though I was thinking of caching kerning data only for glyphs with indices that fit into 16 bits, so that the hash key remains simple). > >+ if (!mHasSpaceGlyph) { > >+ FT_Face face = LockFace(); > >+ > >+ FT_UInt gid = FT_Get_Char_Index(face, ' '); > > FT_Load_Glyph(face, gid, FT_LOAD_DEFAULT); > >+ > > mSpaceGlyph = gid; > > mHasSpaceGlyph = PR_TRUE; > >+ > >+ UnlockFace(); > >+ } > > I realize this particular line is pre-existing code, but as we're here.... is > there a purpose to the FT_Load_Glyph call here? It's not obvious to me that it > achieves anything useful. Erm. That's a very good point, I think this is actually a bug -- it's supposed to get the advance of the space glyph and put it into mMetrics.spaceWidth. But this is also done in GetMetrics as well. I'll replace this with a call to the new cached function, not sure why I didn't do that before. > >+FT_Face > >+gfxFT2Font::LockFace() > >+{ > >+ if (++mFaceLockCount == 1) { > >+ mLockedFace = cairo_ft_scaled_font_lock_face(CairoScaledFont()); > >+ if (!mLockedFace) > >+ mFaceLockCount = 0; > >+ } > >+ > >+ return mLockedFace; > >+} > >+ > >+void > >+gfxFT2Font::UnlockFace() > >+{ > >+ if (--mFaceLockCount == 0) { > > cairo_ft_scaled_font_unlock_face(CairoScaledFont()); > >+ mLockedFace = nsnull; > > } > >- return mSpaceGlyph; > > } > > Could we please replace the use of gfxFT2Font::LockFace() / UnlockFace(), and > the related member variables, with a simple stack-based helper class with a > constructor that locks the face and destructor that unlocks it? Seems like that > would be a better idiom. I thought about that, but this seemed simpler since it's just for internal usage. It does clean up the 'goto finish' case though, so I'll take a look. > >+void > >+gfxFT2Font::FillGlyphDataForChar(PRUint32 ch, CachedGlyphData *gd) > >+{ > >+ FT_Face face = LockFace(); > >+ > >+ FT_UInt gid = FT_Get_Char_Index(face, ch); > >+ > >+ if (gid == 0) { > >+ // uh, how did we get here? we should totally not have gotten here, > >+ // this font doesn't support this char! > > This comment sounds like maybe it should be an assertion. I agree, though I was actually hitting this case in practice... I assumed that with the previous correct fallback patch, we'd ensure that the font supports the character before getting here. > >+ gd->glyph_index = 0; > >+ goto finish; > > Using a stack-based FT2FaceLocker would allow you to simply do an early return > here. > > >+ } > >+ > >+ FT_Error err = FT_Load_Glyph(face, gid, FT_LOAD_DEFAULT); > >+ > >+ if (err) { > >+ // hmm, this is weird, we failed to load a glyph that we had? > >+ printf ("FT load error: %d (ch: %d)\n", err, ch); > > Use a aarning rather than a bare printf? Whoops, yes. Will do. > > >+ gd->glyph_index = 0; > >+ goto finish; > > And here. > > >+ } > >+ > >+ gd->glyph_index = gid; > >+ gd->lsb_delta = face->glyph->lsb_delta; > >+ gd->rsb_delta = face->glyph->rsb_delta; > >+ gd->x_advance = face->glyph->advance.x; > >+ > >+finish: > > Oops, the label has trailing whitespace. But in any case, I'm hoping you'll > eliminate it entirely (see above). Hmm, will fix my editor :-) Yeah, I'll look at getting rid of it.
Attached patch updated (deleted) — Splinter Review
Updated to use a stack-based locking class, also fixed struct member names and such.
Assignee: nobody → vladimir
Attachment #398981 - Attachment is obsolete: true
Attachment #399582 - Flags: review?(jfkthame)
Attachment #398981 - Flags: review?(jfkthame)
Attached patch kerning cache (deleted) — Splinter Review
This is a simple kerning cache; it was an 7% speedup on a pure textrun test on my fast box, though the code could probably be written better and get a bit more of a speedup. The complexity probably isn't worth it though, especially since the win would be marginal on a real web page compared to a pure textrun benchmark.
Attachment #399582 - Flags: review?(jfkthame) → review+
Comment on attachment 399582 [details] [diff] [review] updated Looks good to me. I'd have been inclined to inline the FaceLock constructor and destructor (though maybe the compiler will do that anyway?)... don't suppose it really makes a significant difference though. Oh, I noticed a couple places where there are no braces around an "if" body, when it's a single statement. As the style guide calls for them everywhere, it might be good to add them. Examples (I may not have caught them all): + if (!entry) + return nsnull; + if (hGDI && ::GetObjectW(hGDI, sizeof(logFont), &logFont)) familyArray.AppendElement(nsDependentString(logFont.lfFaceName)); + if (gidNext && gidNext != spaceGlyph) + lsbDeltaNext = cgdNext->lsbDelta; (I know the existing gfx code is inconsistent on this anyhow.) Marking this as r+, but I think it'd be good to tidy up these before landing.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
(In reply to comment #2) > Opinions vary -- I normally don't care especially in gfx code, and didn't use > mFoo notation because those aren't private members; it's just a normal struct > that happens to have a constructor for convenience. Though I also hate the > "aFoo" syntax for args in general, so maybe I'm not a good guideline. I hate these guidelines too, but my attempts to change them have been unsuccessful. So I think we need to stick to them.
Who (or what -- could just be the high-inertia codebase) wants aFoo and mFoo? In JS we decided to drop both, in favor of this->foo if ambiguous, otherwise foo. /be
I proposed dropping the 'a' convention in a thread starting here in mid 2008: http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/3e6fc203449c91c4/e04bbfaf0bd394f0 Got pushback from Shawn Wilsher, fantasai, also Peter Weilbacher and Sergey Yanovich said they found it useful, plus negative feedback from a number of other people that I could disregard without losing any sleep. It's particular the feedback from people who say they find it useful when reading code that made me decline to push harder.
While JS may get to make style decisions independently of the rest of the project, I don't think it's wise to extend that freedom to other modules, as I hope I convinced you of in the recent dev.platform thread.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: