Closed Bug 1697256 Opened 3 years ago Closed 2 years ago

crash at null in [@ nsFocusManager::SetFocusInner]

Categories

(Core :: DOM: Core & HTML, defect, P2)

Tracking

()

VERIFIED FIXED
98 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox88 --- wontfix
firefox96 --- wontfix
firefox97 --- wontfix
firefox98 --- verified

People

(Reporter: tsmith, Assigned: saschanaz)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, testcase, Whiteboard: [bugmon:bisected,confirmed])

Crash Data

Attachments

(3 files)

Attached file testcase.html (deleted)
==28375==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f4328c4bde5 bp 0x7ffc5dddc8c0 sp 0x7ffc5dddc380 T0)
==28375==The signal is caused by a READ memory access.
==28375==Hint: address points to the zero page.
    #0 0x7f4328c4bde5 in nsCOMPtr /builds/worker/workspace/obj-build/dist/include/nsCOMPtr.h:525:7
    #1 0x7f4328c4bde5 in nsFocusManager::SetFocusInner(mozilla::dom::Element*, int, bool, bool, unsigned long) src/dom/base/nsFocusManager.cpp:1741:40
    #2 0x7f4328c4e125 in nsFocusManager::SetFocus(mozilla::dom::Element*, unsigned int) src/dom/base/nsFocusManager.cpp:485:3
    #3 0x7f4328a501a7 in mozilla::dom::Element::Focus(mozilla::dom::FocusOptions const&, mozilla::dom::CallerType, mozilla::ErrorResult&) src/dom/base/Element.cpp:462:16
    #4 0x7f4328a98f94 in mozilla::dom::nsAutoFocusEvent::Run() src/dom/base/Document.cpp:12194:15
    #5 0x7f432558e696 in mozilla::RunnableTask::Run() src/xpcom/threads/TaskController.cpp:472:16
    #6 0x7f432558b253 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:760:26
    #7 0x7f4325589127 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:611:15
    #8 0x7f432558957d in mozilla::TaskController::ProcessPendingMTTask(bool) src/xpcom/threads/TaskController.cpp:395:36
    #9 0x7f4325595d04 in operator() src/xpcom/threads/TaskController.cpp:136:37
    #10 0x7f4325595d04 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_4>::Run() src/xpcom/threads/nsThreadUtils.h:534:5
    #11 0x7f43255b1024 in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1158:16
    #12 0x7f43255bbb9c in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:548:10
    #13 0x7f432dd6079b in SpinEventLoopUntil<mozilla::ProcessFailureBehavior::ReportToCaller, (lambda at src/layout/printing/ipc/RemotePrintJobChild.cpp:30:31)> /builds/worker/workspace/obj-build/dist/include/mozilla/SpinEventLoopUntil.h:93:25
    #14 0x7f432dd6079b in mozilla::layout::RemotePrintJobChild::InitializePrint(nsTString<char16_t> const&, nsTString<char16_t> const&, int const&, int const&) src/layout/printing/ipc/RemotePrintJobChild.cpp:30:3
    #15 0x7f432cfc17b4 in nsDeviceContextSpecProxy::BeginDocument(nsTSubstring<char16_t> const&, nsTSubstring<char16_t> const&, int, int) src/widget/nsDeviceContextSpecProxy.cpp:140:34
    #16 0x7f4327b4a09e in nsDeviceContext::BeginDocument(nsTSubstring<char16_t> const&, nsTSubstring<char16_t> const&, int, int) src/gfx/src/nsDeviceContext.cpp:524:32
    #17 0x7f432dd72041 in nsPrintJob::SetupToPrintContent() src/layout/printing/nsPrintJob.cpp:1391:31
    #18 0x7f432dd78474 in DocumentReadyForPrinting src/layout/printing/nsPrintJob.cpp:1032:17
    #19 0x7f432dd78474 in nsPrintJob::MaybeResumePrintAfterResourcesLoaded(bool) src/layout/printing/nsPrintJob.cpp:1537:10
    #20 0x7f432dd6eef0 in nsPrintJob::InitPrintDocConstruction(bool) src/layout/printing/nsPrintJob.cpp:1493:3
    #21 0x7f432dd7d498 in nsPrintJob::Observe(nsISupports*, char const*, char16_t const*) src/layout/printing/nsPrintJob.cpp:2688:17
    #22 0x7f4330a17139 in mozilla::embedding::PrintProgressDialogChild::RecvDialogOpened() src/toolkit/components/printingui/ipc/PrintProgressDialogChild.cpp:37:18
    #23 0x7f4326eff914 in mozilla::embedding::PPrintProgressDialogChild::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PPrintProgressDialogChild.cpp:234:28
    #24 0x7f4326a4af13 in mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PContentChild.cpp:8701:32
    #25 0x7f43267d5a6a in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) src/ipc/glue/MessageChannel.cpp:2157:25
    #26 0x7f43267d20ce in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) src/ipc/glue/MessageChannel.cpp:2081:9
    #27 0x7f43267d3a88 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) src/ipc/glue/MessageChannel.cpp:1929:3
    #28 0x7f43267d45eb in mozilla::ipc::MessageChannel::MessageTask::Run() src/ipc/glue/MessageChannel.cpp:1960:13
    #29 0x7f432558e696 in mozilla::RunnableTask::Run() src/xpcom/threads/TaskController.cpp:472:16
    #30 0x7f432558b253 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:760:26
    #31 0x7f4325589127 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:611:15
    #32 0x7f432558957d in mozilla::TaskController::ProcessPendingMTTask(bool) src/xpcom/threads/TaskController.cpp:395:36
    #33 0x7f4325595d04 in operator() src/xpcom/threads/TaskController.cpp:136:37
    #34 0x7f4325595d04 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_4>::Run() src/xpcom/threads/nsThreadUtils.h:534:5
    #35 0x7f43255b1024 in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1158:16
    #36 0x7f43255bbb9c in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:548:10
    #37 0x7f43267dd274 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:109:5
    #38 0x7f43266e7841 in RunInternal src/ipc/chromium/src/base/message_loop.cc:335:10
    #39 0x7f43266e7841 in RunHandler src/ipc/chromium/src/base/message_loop.cc:328:3
    #40 0x7f43266e7841 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:310:3
    #41 0x7f432cfb7b87 in nsBaseAppShell::Run() src/widget/nsBaseAppShell.cpp:137:27
    #42 0x7f4330a84b6f in XRE_RunAppShell() src/toolkit/xre/nsEmbedFunctions.cpp:902:20
    #43 0x7f43266e7841 in RunInternal src/ipc/chromium/src/base/message_loop.cc:335:10
    #44 0x7f43266e7841 in RunHandler src/ipc/chromium/src/base/message_loop.cc:328:3
    #45 0x7f43266e7841 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:310:3
    #46 0x7f4330a842ff in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:733:34
    #47 0x557ee8cd89fd in content_process_main(mozilla::Bootstrap*, int, char**) src/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
    #48 0x557ee8cd8e21 in main src/browser/app/nsBrowserApp.cpp:309:18
    #49 0x7f434c6e2b96 in __libc_start_main /build/glibc-2ORdQG/glibc-2.27/csu/../csu/libc-start.c:310
Flags: in-testsuite?

A Pernosco session is available here: https://pernos.co/debug/c3g8B1OitoXpPDfvlALU0Q/index.html

Assignee: nobody → krosylight
Severity: -- → S3
Priority: -- → P2

Bugmon Analysis:
Verified bug as reproducible on mozilla-central 20210309161138-5f0f6477c734.
The bug appears to have been introduced in the following build range:

Start: 056bbc57ca7c4eaff9ed44bbde2a9595a2258216 (20200904033504)
End: d871d71f519666171d7c300d585125d98ffd6a4e (20200904033328)
Pushlog: https://hg.mozilla.org/mozilla-unified/pushloghtml?fromchange=056bbc57ca7c4eaff9ed44bbde2a9595a2258216&tochange=d871d71f519666171d7c300d585125d98ffd6a4e

Whiteboard: [bugmon:bisected,confirmed]
Blocks: domino

:saschanaz, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(krosylight)

I haven't taken a deep look, and I'm not sure how nsCOMPtr can fail to be created.

Masayuki, the bisect range includes your commit. Do you have any clue why this happens?

Assignee: krosylight → nobody
Flags: needinfo?(krosylight) → needinfo?(masayuki)

And I don't think this is really a recent regression. Something may have exposed the issue in the testcase now, but nsFocusManager looks buggy here.

FWIW, a dumb search for GetDocShell() shows that we fail to (immediately) check its return value in quite a few places, not only in nsFocusManager. In some cases the pointer is "just" passed along and might be handled gracefully elsewhere; in other cases it is dereferenced directly afterwards.

Aha, we're trying to focus an element in the static document.

Tyson, do you recall how to reproduce this?

Flags: needinfo?(twsmith)

(In general all the Get* methods in DOM may return null.)

This is a somewhat obvious issue, but having a way to reproduce it would be nice to verify the fix.
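
The failure mode discussed above (a getter that may return null, followed by an unchecked access) can be modeled in a few lines. This is only a sketch: the real code is C++ in nsFocusManager, and every class and function name below is a hypothetical stand-in, not Gecko API. It also assumes, based on the comments above, that a static document has no docshell.

```python
from typing import Optional


class DocShell:
    """Hypothetical stand-in for the docshell hosting a live document."""

    def __init__(self) -> None:
        self.focused: Optional["Element"] = None


class Document:
    """Hypothetical document; a static (print) clone is modeled with no docshell."""

    def __init__(self, doc_shell: Optional[DocShell]) -> None:
        self._doc_shell = doc_shell

    def get_doc_shell(self) -> Optional[DocShell]:
        # Like the Get* methods mentioned above, this may return None.
        return self._doc_shell


class Element:
    def __init__(self, owner_doc: Document, name: str) -> None:
        self.owner_doc = owner_doc
        self.name = name


def set_focus_unchecked(element: Element) -> None:
    # Mirrors the buggy pattern: the result is used without a null check.
    doc_shell = element.owner_doc.get_doc_shell()
    doc_shell.focused = element  # raises AttributeError when doc_shell is None


def set_focus_checked(element: Element) -> bool:
    # Mirrors the guarded pattern: bail out early when there is no docshell.
    doc_shell = element.owner_doc.get_doc_shell()
    if doc_shell is None:
        return False
    doc_shell.focused = element
    return True
```

In this model, calling set_focus_checked on an element of a document built with Document(None) returns False instead of failing, which is the behavior an early return on a null docshell would give.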

I am able to use Grizzly to reproduce it reliably.

To install Grizzly:

pip install grizzly-framework

To replay the test case:

python3 -m grizzly.replay <path_to_build>/firefox testcase.html --repeat 10 --relaunch 1
Flags: needinfo?(twsmith)
Assignee: nobody → krosylight

Okay, Pernosco shows a slightly different call stack:

nsFocusManager::SetFocus () at nsFocusManager.cpp:485
nsFocusManager::SetFocusInner () at nsFocusManager.cpp:1743
<signal handler called>
WasmTrapHandler () at WasmSignalHandlers.cpp:981
js::UnixExceptionHandler () at MemoryProtectionExceptionHandler.cpp:272

So it was just that I didn't (or still don't?) understand how ASan fails 😄

Pushed by krosylight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1abffab5d5e9
Check docshell existence in SetFocusInner r=smaug

Backed out for causing reftest failures in crashtests/1697256.html

Backout link: https://hg.mozilla.org/integration/autoland/rev/d0fa0ab5a929f13e53f9970437e4dbf38031b72b

Push with failures

Failure log

Flags: needinfo?(krosylight)
Flags: needinfo?(krosylight)

Sigh, I thought I had confirmed that the original testcase fails on CI, but it turns out I saw another, unrelated crash instead. The testcase somehow times out on CI but not on my local machine, even on the same Windows. 🤷‍♀️

Tyson, can you still reproduce this one? I tried your grizzly method with the latest asan builds on both Windows and Ubuntu but it didn't reproduce.

(And could you also share the working profile? It seems print.always_print_silent is set to true with Grizzly but Windows still opens a file dialog anyway, while Ubuntu just does nothing 🤔)

Flags: needinfo?(twsmith)

(In reply to Kagami :saschanaz from comment #16)

> Tyson, can you still reproduce this one? I tried your grizzly method with the latest asan builds on both Windows and Ubuntu but it didn't reproduce.

Yes, I can reproduce with m-c 20210416-a62c94365ebb on Ubuntu. I'm not sure what has changed, but by default it does not reproduce for me either. Here is what I did to get it to repro:

python3 -m grizzly.replay m-c-20210416155149-fuzzing-asan-opt/firefox testcase.html --repeat 10 --relaunch 1 --no-harness -p prefs.js

> (And could you also share the working profile? It seems print.always_print_silent is set to true with Grizzly but Windows still opens a file dialog anyway, while Ubuntu just does nothing 🤔)

Grizzly will use prefpicker to generate a temporary prefs.js file if one is not provided. I will attach the prefs.js file I used to repro the issue. Perhaps some default prefs have changed or something?

Let me know how that works.

Flags: needinfo?(twsmith)
Attached file prefs.js (deleted)

Thanks, it also reproduces here with the prefs (but somehow only on Ubuntu) 👍

Sigh, this is consuming much more time than I expected. The test intermittently times out on CI with a confusing JavaScript error: , line 0: uncaught exception: undefined error message, which requires deeper investigation...

So these are the minimal required prefs (the network prefs are only needed for Grizzly to work):

user_pref("browser.search.region", 'US');
user_pref("network.proxy.autoconfig_url", "data:text/plain,function FindProxyForURL(url, host) { if (host == 'localhost' || host == '127.0.0.1') { return 'DIRECT'; } else { return 'PROXY 127.0.0.1:6'; } }");
user_pref("network.proxy.share_proxy_settings", true);
user_pref("network.proxy.type", 2);
user_pref("print.always_print_silent", true);
user_pref("print.print_to_file", true);
user_pref("print.print_to_filename", '/dev/null');

I'm not sure why browser.search.region matters, but it is required anyway, which implies that the full browser chrome is needed to reproduce this. But the crashtest system does not show the browser chrome. 🤷‍♀️

Chrome-less printing is probably not expected, so maybe this should use a normal mochitest. (Not sure WPT can be used; there is no existing non-manual WPT test with a window.print() call.)

I will investigate more next week.
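
For anyone retrying the repro, the minimal prefs listed in this comment can be written out as a prefs.js with a short script. This is a sketch only: render_prefs and write_prefs are hypothetical helper names, not part of Grizzly or prefpicker, and the pref values are copied from the list above.

```python
import json
from pathlib import Path

# Minimal prefs copied from the comment above. The network.* entries are,
# per that comment, only needed for Grizzly itself to work.
MINIMAL_PREFS = {
    "browser.search.region": "US",
    "network.proxy.autoconfig_url": (
        "data:text/plain,function FindProxyForURL(url, host) { "
        "if (host == 'localhost' || host == '127.0.0.1') { return 'DIRECT'; } "
        "else { return 'PROXY 127.0.0.1:6'; } }"
    ),
    "network.proxy.share_proxy_settings": True,
    "network.proxy.type": 2,
    "print.always_print_silent": True,
    "print.print_to_file": True,
    "print.print_to_filename": "/dev/null",
}


def render_prefs(prefs: dict) -> str:
    """Render a dict of prefs as user_pref(...) lines (hypothetical helper)."""
    lines = []
    for name, value in prefs.items():
        # json.dumps produces JS-compatible literals: quoted strings,
        # true/false for booleans, and plain integers.
        lines.append(f"user_pref({json.dumps(name)}, {json.dumps(value)});")
    return "\n".join(lines) + "\n"


def write_prefs(path: str, prefs: dict = MINIMAL_PREFS) -> None:
    """Write the rendered prefs to the given path (hypothetical helper)."""
    Path(path).write_text(render_prefs(prefs), encoding="utf-8")


if __name__ == "__main__":
    # Print instead of writing, so running the sketch has no side effects.
    print(render_prefs(MINIMAL_PREFS), end="")
```

The resulting file can then be passed to grizzly.replay with the -p option, as in the replay command quoted earlier in this bug.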

There's an r+ patch which didn't land and no activity in this bug for 2 weeks.
:saschanaz, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(krosylight)
Flags: needinfo?(bugs)
Flags: needinfo?(krosylight)
Flags: needinfo?(bugs)
Pushed by krosylight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a428c4aa428e
Check docshell existence in SetFocusInner r=smaug

Backed out for causing crashtest failures in dom/base/crashtests/1697256.html

Backout link: https://hg.mozilla.org/integration/autoland/rev/9a8e1f2aa036dce9121f499168041b7a99baafb6

Push with failures

Failure log

INFO -  REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/dom/base/crashtests/1697256.html | 367 / 3906 (9%)
[task 2022-01-21T17:34:34.046Z] 17:34:34  WARNING -  REFTEST TEST-UNEXPECTED-FAIL | dom/base/crashtests/1697256.html | load failed: timed out waiting for reftest-wait to be removed
[task 2022-01-21T17:34:34.046Z] 17:34:34     INFO -  REFTEST INFO | Saved log: START http://10.0.2.2:8854/tests/dom/base/crashtests/1697256.html
[task 2022-01-21T17:34:34.046Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering WaitForTestEnd
[task 2022-01-21T17:34:34.047Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] WaitForTestEnd: Adding listeners
[task 2022-01-21T17:34:34.047Z] 17:34:34     INFO -  REFTEST INFO | Saved log: Initializing canvas snapshot
[task 2022-01-21T17:34:34.047Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress
[task 2022-01-21T17:34:34.047Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT
[task 2022-01-21T17:34:34.048Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: dispatching MozReftestInvalidate
[task 2022-01-21T17:34:34.048Z] 17:34:34     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress
Flags: needinfo?(krosylight)
Flags: needinfo?(krosylight)
Pushed by krosylight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/25666969c318
Check docshell existence in SetFocusInner r=smaug
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch
Crash Signature: [@ nsFocusManager::SetFocusInner]
Flags: in-testsuite? → in-testsuite+

Bugmon Analysis
Verified bug as fixed on rev mozilla-central 20220122095122-61861c0babc6.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon

The patch landed in nightly and beta is affected.
:saschanaz, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(krosylight)

I believe this is rare enough, so no.

Flags: needinfo?(krosylight)