Closed Bug 1667977 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::fontlist::Family::SearchAllFontsForChar]

Categories

(Core :: Layout: Text and Fonts, defect)

Firefox 83
defect

Tracking

()

RESOLVED FIXED
88 Branch
Fission Milestone M8
Tracking Status
firefox-esr78 --- disabled
firefox81 --- disabled
firefox82 --- disabled
firefox83 --- disabled
firefox84 --- disabled
firefox85 --- disabled
firefox86 --- disabled
firefox87 --- disabled
firefox88 --- disabled
firefox89 --- fixed

People

(Reporter: over68, Assigned: jfkthame)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files)

Steps to reproduce:

  1. Download and install the font ColorTube-Regular.ttf.
  2. Download Font Loader.
  3. Download and extracts the archive.
  4. Open this page.
  5. Open the Font Loader, Click on the Add Fonts button, Select the extracted font files then click Open.
  6. Click on the Load button then Unload.
  7. If it doesn't reproduce, close and open the page again.
  8. Click on the Load button then Unload.

Actual results:

The tab crashed.

Crash report: bp-72d6f5ea-66a2-41f7-870d-04a670200929

Top 10 frames of crashing thread:

0 xul.dll mozilla::fontlist::Family::SearchAllFontsForChar gfx/thebes/SharedFontList.cpp:347
1 xul.dll gfxFontGroup::FindFallbackFaceForChar gfx/thebes/gfxTextRun.cpp:2930
2 xul.dll gfxFontGroup::FindFontForChar gfx/thebes/gfxTextRun.cpp:3095
3 xul.dll gfxFontGroup::MakeTextRun gfx/thebes/gfxTextRun.cpp:2442
4 xul.dll BuildTextRunsScanner::FlushFrames layout/generic/nsTextFrame.cpp:1657
5 xul.dll BuildTextRunsScanner::ScanFrame layout/generic/nsTextFrame.cpp:2023
6 xul.dll nsTextFrame::EnsureTextRun layout/generic/nsTextFrame.cpp:2998
7 xul.dll nsTextFrame::ReflowText layout/generic/nsTextFrame.cpp:9162
8 xul.dll nsLineLayout::ReflowFrame layout/generic/nsLineLayout.cpp:881
9 xul.dll nsBlockFrame::DoReflowInlineFrames layout/generic/nsBlockFrame.cpp:4319
Blocks: 1533462
Status: UNCONFIRMED → NEW
Ever confirmed: true

It's unlikely this is really a regression; it's probably always been possible to hit this crash if the shared font-list pref is enabled.

Having said that, I can't seem to reproduce it locally at the moment (although the multiple crash-reports appear to indicate that it reproduces pretty reliably for the reporter). Reproducing may be dependent on some details of timing and/or font configuration, as what is happening looks like it may be some sort of raciness involving how the chrome and content processes respond to the flurry of Windows font-update notifications that we receive.

QA Whiteboard: [qa-regression-triage]

Almost all crashes with this signature are in 83 Nightly after build 20200926211645 but I am not seeing font specific patches landed in this build.

Nightly-only crash, no need for tracking for the release.

As far as default settings goes, it appears to be a regression on Firefox.
Checking for the range; with affected builds, the crash occurred for both loading page and unloading the font.

Regression range:

  • last good: 2020-09-06;
  • first bad: 2020-09-07;
  • Pushlog: URL.
Has Regression Range: --- → yes
Has STR: --- → yes
Hardware: x86 → All

:over68, since this bug is a regression, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(over68)

See comment 1. This started crashing after bug 1533462, but would presumably have happened with that pref set beforehand.

Flags: needinfo?(over68)
Keywords: regression

The signature is mozilla::fontlist::Pointer::ToPtr with the gfx.e10s.font-list.shared enabled by manually flipping the pref, has changed to mozilla::fontlist::Family::SearchAllFontsForChar after landing bug 1533462.

Crash report: bp-cc2fb240-2da4-4598-84c8-8c20c0201003

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll mozilla::fontlist::Pointer::ToPtr const gfx/thebes/SharedFontList.cpp:41
1 xul.dll mozilla::fontlist::Family::SearchAllFontsForChar gfx/thebes/SharedFontList.cpp:347
2 xul.dll gfxFontGroup::FindFallbackFaceForChar gfx/thebes/gfxTextRun.cpp:2930
3 xul.dll gfxFontGroup::FindFallbackFaceForChar gfx/thebes/gfxTextRun.cpp:2942
4 xul.dll gfxFontGroup::FindFontForChar gfx/thebes/gfxTextRun.cpp:3097
5 xul.dll gfxFontGroup::MakeTextRun gfx/thebes/gfxTextRun.cpp:2442
6 xul.dll BuildTextRunsScanner::BuildTextRunForFrames layout/generic/nsTextFrame.cpp:2553
7 xul.dll BuildTextRunsScanner::FlushFrames layout/generic/nsTextFrame.cpp:1649
8 xul.dll BuildTextRunsScanner::ScanFrame layout/generic/nsTextFrame.cpp:2015
9 xul.dll nsTextFrame::EnsureTextRun layout/generic/nsTextFrame.cpp:2990

I can also reproduce it on this page with the steps in comment 0.

See https://youtu.be/dc8g2GwKwDQ

Crash report: bp-db0a94db-ba10-437b-8c95-0307c0201006

Jonathan, do the comment 8 STR work for you?

Flags: needinfo?(jfkthame)

Ah - I finally got it to happen on my machine, with a variation on the given STR: to increase the chances of hitting the problem, I clicked Reload on the page while Font Loader was in the process of loading or unloading the various activated fonts.

I assume we're mishandling some kind of raciness related to the font-changed notifications from Windows. I'll see what I can figure out...

Severity: -- → S2
Flags: needinfo?(jfkthame)

I can also reproduce on Ubuntu with the page in comment 0 and comment 8 after uninstalling then install the font ColorTube-Regular.ttf.

See https://youtu.be/Ogt7-jwzZbk

Crash report: bp-f9b6c0a7-c8d9-4a29-8554-5e52d0201024

No longer blocks: 1533462
Regressed by: 1533462

The crash signature has changed to gfxFontGroup::FindFallbackFaceForChar after landing bug 1676966.

Crash report: bp-f3aed4a3-9624-402a-969a-7163c0210106

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll gfxFontGroup::FindFallbackFaceForChar gfx/thebes/gfxTextRun.cpp:2934
1 xul.dll gfxFontGroup::FindFontForChar gfx/thebes/gfxTextRun.cpp:3111
2 xul.dll gfxFontGroup::MakeTextRun gfx/thebes/gfxTextRun.cpp:2443
3 xul.dll gfxFontGroup::MakeHyphenTextRun gfx/thebes/gfxTextRun.cpp:2418
4 xul.dll GetHyphenTextRun layout/generic/nsTextFrame.cpp:2157
5 xul.dll AddHyphenToMetrics layout/generic/nsTextFrame.cpp:6072
6 xul.dll nsTextFrame::ReflowText layout/generic/nsTextFrame.cpp:9350
7 xul.dll nsLineLayout::ReflowFrame layout/generic/nsLineLayout.cpp:878
8 xul.dll nsInlineFrame::Reflow layout/generic/nsInlineFrame.cpp:361
9 xul.dll nsLineLayout::ReflowFrame layout/generic/nsLineLayout.cpp:875
Crash Signature: [@ mozilla::fontlist::Family::SearchAllFontsForChar] → [@ gfxFontGroup::FindFallbackFaceForChar ] [@ mozilla::fontlist::Family::SearchAllFontsForChar]
Blocks: 1694174

If we want shared font-list (bug 1694174) for Fission MVP, Jonathan recommends (in bug 1694174 comment 3) that we fix this Windows crash before Fission rides to Release (Fission M8).

Fission Milestone: --- → M8

Note this also happens on Linux, see comment 11.

(In reply to blinky from comment #14)

Note this also happens on Linux, see comment 11.

In that case, changing Platform from "Windows 10" to "All".

OS: Windows 10 → All

In trying to reproduce this on Linux, I have been able to occasionally hit the assertion at https://searchfox.org/mozilla-central/rev/f6ffb71dca9eb491e85aa95042380b2602008b00/gfx/thebes/SharedFontList.cpp#64, which I believe can arise when there's a race between the content process requesting shared-memory blocks and the parent flushing and rebuilding its list. As the assertion wouldn't be present in a release build, we should ensure that we safely return null in this case, rather than trying to use the out-of-range block value to index into the (stale) array.

This may not be the only issue, however. Playing with the STR here on Linux, I have also seen an occasional crash that is preceded by GFX failure messages "Failed initializing FreeType face from filename" and "Attempted to deserialize Fontconfig scaled font without FreeType face", followed by an assertion failure in gfx/2d/InlineTranslator.h. I'm not certain whether a release build (where that assertion doesn't fire and kill us) would survive this or not. In any case, this issue is not related to the shared font list; it reproduces even when the pref is turned off.

So I'll post a patch for the issue in Pointer::ToPtr(), but also continue testing/investigating this....

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/98d0713fd804 Safely return null if we fail to get a shared-mem block in Pointer::ToPtr. r=jwatt
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 88 Branch

I can still reproduce the crash with the latest Nightly build.

bp-93333990-8bbe-4933-b8de-506c60210319

Thanks for testing; I was afraid this might not be the only issue, but reproducing the crash has been intermittent for me, making it hard to be confident.

I assume this is still related to installing/removing the ColorTube font while those example files are loaded, and/or then reloading the pages?

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Try to reproduce the crash with this page with installing/removing the ColorTube font as in comment 11.

If it doesn't reproduce, try installing/removing the font several times.

You can also try to reproduce it with installing/removing the font on Mac OS.

I updated the page in comment 22 to reproduce more easily.

Steps to reproduce 2:

  1. Open this page in two tabs.
  2. Installing/removing the ColorTube font several times.

See https://youtu.be/n0Ec1-Rzk5E

Crash report: bp-a84726b1-f520-40ec-9a89-da7270210402

Thanks for the testcases and ongoing testing here; sorry it's taken so long to get to the bottom of it.

I have a patch that I think addresses the core of this issue; I've started a tryserver job at https://treeherder.mozilla.org/jobs?repo=try&revision=5be7af1401613f9fc69cc24b457bff2cabb91154, so if you could test that build once it's ready, that would be really helpful.

Flags: needinfo?(over68)

I can not reproduce the crash with the build in comment 26. Thanks.

Flags: needinfo?(over68)

If there's a textrun already cached in the frame, it's fine to check its attrbutes such as direction,
but using it to retrieve a fontGroup to shape new text is risky because this could lead to a font-
matching search via the platform font list (if the fonts already configured in the group don't
support the text to be shaped), and the list could have been mutated in the meantime, making the
old fontgroup's references invalid. Instead, we should retrieve the fontgroup from the context to
ensure it is current.

Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a2ef634333b3 Don't risk using a stale fontGroup from a pre-existing textrun in GetHyphenTextRun; ensure it retrieves fresh metrics from the context. r=emilio
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: