Closed Bug 1584953 Opened 5 years ago Closed 5 years ago

Crash in [@ NS_NewLocalFile]

Categories

(Core :: Storage: IndexedDB, defect, P2)

Unspecified
All
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr68 --- wontfix
firefox74 --- wontfix
firefox75 --- unaffected

People

(Reporter: gsvelto, Assigned: sg)

References

Details

(5 keywords)

Crash Data

This bug is for crash report bp-ca1c9b60-adeb-4b45-b3ff-a813e0190930.

Top 10 frames of crashing thread:

0 xul.dll NS_NewLocalFile xpcom/io/nsLocalFileWin.cpp:3120
1 xul.dll mozilla::dom::indexedDB::FileManager::GetDirectory dom/indexedDB/ActorsParent.cpp:15439
2 xul.dll mozilla::dom::indexedDB::FileInfo::GetFileForFileInfo dom/indexedDB/FileInfo.cpp:175
3 xul.dll static nsresult mozilla::dom::indexedDB::`anonymous namespace'::DatabaseOperationBase::GetStructuredCloneReadInfoFromExternalBlob dom/indexedDB/ActorsParent.cpp:18341
4 xul.dll static nsresult mozilla::dom::indexedDB::`anonymous namespace'::DatabaseOperationBase::GetStructuredCloneReadInfoFromStatement dom/indexedDB/ActorsParent.cpp:5447
5 xul.dll nsresult mozilla::dom::indexedDB::`anonymous namespace'::ObjectStoreGetRequestOp::DoDatabaseWork dom/indexedDB/ActorsParent.cpp:24712
6 xul.dll void mozilla::dom::indexedDB::`anonymous namespace'::TransactionDatabaseOperationBase::RunOnConnectionThread dom/indexedDB/ActorsParent.cpp:21881
7 xul.dll nsresult mozilla::dom::indexedDB::`anonymous namespace'::TransactionDatabaseOperationBase::Run dom/indexedDB/ActorsParent.cpp:22048
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
9 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486

This is not a new crash as it shows up on multiple versions and both on macOS and Windows with the same stack trace. There's a handful of comments mentioning a Facebook game called "Coin Master" and many, many crashes come from URLs starting with https://apps.facebook.com/coin-master/

Blocks: 1541370
Priority: -- → P2
Group: dom-core-security

I suspect a UAF on the FileManagerobject here. Did anybody try to reproduce this yet?

Maybe something you want to look at as you are revisiting all the windows file functions?

Flags: needinfo?(ttung)

(In reply to Jens Stutte [:jstutte] from comment #1)

I suspect a UAF on the FileManagerobject here. Did anybody try to reproduce this yet?

There seem to have multiple issues involved.
For instance, https://crash-stats.mozilla.org/report/index/eac56a17-f616-4d26-86d6-a75f70200122 involves QuotaManager.

Also, if I assume FileManager is a nullptr, then it's not clear to me why it didn't crash at here. I said that because it accesses the this of FileManager. It's possible that FileManager is null'ed right after this line, but the crash rate on Release is too high to say so. Besides, FileManager is held by FileInfo since FileInfo is initialized.

I probably need to download mini dumps to see if there are more clues.

Assignee: nobody → ttung
Status: NEW → ASSIGNED
Flags: needinfo?(ttung)

So, I checked the mini dump of f8170e9c-d0d2-4ba8-8f15-2a1cb0200201.
When crashing:
The FileManager seems to be a nullptr(The address of mDirectoryPath is 0xfffffffffffffff7)
The FileInfo doesn't look like a nullptr

I still don't quite understand how can we have a null FileManager and non-null FileInfo since FileInfo holds a RefPtr for FileManager and it's set by a FileManager (FileManager propagates itself to FileInfo's constructor by passing this; And, I cannot find a place where FileInfo::mFileManager is set to a nullptr)

FileInfo and FileManager want to be threadsafe as of NS_INLINE_DECL_THREADSAFE_REFCOUNTING but use unsafe pointers on themselves at least here:

nsCOMPtr<nsIFile> FileInfo::GetFileForFileInfo(FileInfo* aFileInfo) {
  FileManager* const fileManager = aFileInfo->Manager();
  const nsCOMPtr<nsIFile> directory = fileManager->GetDirectory();
  if (NS_WARN_IF(!directory)) {
    return nullptr;
  }

  nsCOMPtr<nsIFile> file =
      FileManager::GetFileForId(directory, aFileInfo->Id());
  if (NS_WARN_IF(!file)) {
    return nullptr;
  }

  return file;
}

which is part of the call stack. I suspect, that we have a race here and on some other thread the FileManager object dies through refcounting between the first and the second line of that function (maybe together with the containing FileInfo). We might want to ensure thread safety in a more thorough way on these classes (there is probably more than just pointer handling to fix).

I cannot reproduce this by writing a simple test which just reading external blobs in rr chaos mode.

Checking the reports again and I found same build id crashed multiple times, I suspect that indicate this issue might be related to wrong value storing in the database. I will check the code around persisted data.

Tom do you have access to the crash reports' URLs? There's a bunch of different URLs involved but I'd say that the overwhelming majority of it are games of some sort, either Flash- or WebGL-based. Also if you poking into the crash minidumps could help just let me know and I'll have a look.

(In reply to Gabriele Svelto [:gsvelto] from comment #7)

Tom do you have access to the crash reports' URLs? There's a bunch of different URLs involved but I'd say that the overwhelming majority of it are games of some sort, either Flash- or WebGL-based. Also if you poking into the crash minidumps could help just let me know and I'll have a look.

Thanks for asking! I have access to that and have tried playing with some URLs (e.g. coin master) with ASAN build, but I couldn't reproduce the crash. I checked the minidumps, but I didn't find out more clues to figure the reason why.
My understanding from the minidumps I read is that:

  • FileManager seems to be a nullptr and its address is poisoned
  • FileInfo seems to be a nullptr as well but its address is not poisoned
    I don't have any other clues to proceed, so I will probably try to add more release assertion on Nightly to see if I can get more information.

Simon, would you mind taking a look at this issue when you have time? Maybe you can find something here, otherwise, I might prepare a patch to add more assertions to get more clues.

However, I am worried if these assertions could help to fix the real issue or not.

There are at least two issues based on the crash report. One of them is related to QM and the other is related to IDB. Since there are more reports related to IDB, I would probably start with IDB cases.

From my understanding, IDB cases here can be related to persisted data (something store inside IDB database) because I see the same build id crash multiple times in a short time and it seems it's not easy to reproduce.

I went to the websites which cause them to crash with an ASAN debug/opt build, but I didn't see even any warning related to the code around (there were warnings related to storage busy while opening an IDB database in some cases).

We suspect both FileManager and FileInfo are nullptr (in theory, they shouldn't) while crashing. So, if we eventually decide to add some assertions, I might prefer adding CheckedUnsafePtr to FileInfo since we handle the ref count of it in a special way.

Flags: needinfo?(sgiesecke)

I looked at the latest beta crashreport (https://crash-stats.mozilla.org/report/index/cb795a42-007b-4bcc-8cea-14a120200219), which is one of the IndexedDB cases.

FileManager is not nullptr here, but it looks as if it has been freed, in the top frame, aPath is at an e5 pattern address, which binds to a member of the FileManager. This must have happened between https://searchfox.org/mozilla-central/rev/fca0be7e2cf2f922c9b927423ce28e8a04b3fd90/dom/indexedDB/ActorsParent.cpp#19068 (where the FileInfo object was retrieved from the FileManager) and https://searchfox.org/mozilla-central/rev/fca0be7e2cf2f922c9b927423ce28e8a04b3fd90/dom/indexedDB/ActorsParent.cpp#19105

I am not sure where exactly you would want to use CheckedUnsafePtr? At least FileInfo::mFileManager already is a RefPtr.

One possible cause for this is that the custom refcounting of FileInfo is not correct (if the FileInfo. FileManager itself, however uses normal refcounting. From my point of view, it would be strongly advisable to have unit tests for FileInfo and its custom refcounting. Not sure how hard this is too achieve given its dependencies.

Flags: needinfo?(sgiesecke)

(In reply to Simon Giesecke [:sg] [he/him] from comment #10)
Thanks for looking into it!

FileManager is not nullptr here, but it looks as if it has been freed, in the top frame, aPath is at an e5 pattern address, which binds to a member of the FileManager. This must have happened between https://searchfox.org/mozilla-central/rev/fca0be7e2cf2f922c9b927423ce28e8a04b3fd90/dom/indexedDB/ActorsParent.cpp#19068 (where the FileInfo object was retrieved from the FileManager) and https://searchfox.org/mozilla-central/rev/fca0be7e2cf2f922c9b927423ce28e8a04b3fd90/dom/indexedDB/ActorsParent.cpp#19105

Ah, sorry for the confusion. I meant both FileManager and FileInfo were freed (because it doesn't make sense to me if FileManager was freed while FileInfo was alive).

And,

where the FileInfo object was retrieved from the FileManager
I guess you meant here (inside of that function)?

I am not sure where exactly you would want to use CheckedUnsafePtr? At least FileInfo::mFileManager already is a RefPtr.

I meant adding that for FileInfo since it was gotten from a raw pointer array. And, my concern, if the cause is that FileInfo was freed, then what we really want is to figure out how to make ref-counting of FileInfo incorrect rather than just finding the place where it has already incorrect. I'm not saying finding where it has already incorrect is not good, but if it's made from previous Firefox session, then it's still hard for us to fix the root cause completely (We probably can only return an error when we find the ref-count is incorrect).

One possible cause for this is that the custom refcounting of FileInfo is not correct (if the FileInfo. FileManager itself, however uses normal refcounting. From my point of view, it would be strongly advisable to have unit tests for FileInfo and its custom refcounting. Not sure how hard this is too achieve given its dependencies.

That's a good point! My first thought is that this is not easy to achieve. I will think about whether it's not really hard to do that.

Assignee: ttung → sgiesecke
Status: ASSIGNED → NEW
Depends on: 1617170

This should probably marked with some sec-* keyword. I thought the process to request this is set the sec-review flag, but I can't set that. Could you have a look?

Flags: needinfo?(dveditz)

It's clear a UAF so defaulting to sec-high, but if it's a race condition that's hard to get right maybe we'll downgrade to sec-moderate later. If you suspect a race then TSAN would be a better tool for finding it than ASAN -- check with decoder or other members of the fuzzing team.

The sec-review flag has long been deprecated -- I hadn't noticed it was still showing up in the UI. Thanks for pointing that out: I've now disabled it. For a feature/implementation review request please see https://wiki.mozilla.org/Security/Reviews/. For questions like comment 12 or other general issues please ping us on #security (slack or matrix) to take a look (without mentioning bug details since those are public channels). Or needinfo, as you did, works too. Normally a "hidden security bug without a rating" will get caught in our weekly triage; will investigate how this one has slipped through repeatedly.

Flags: needinfo?(dveditz)

Just for the records: The bug has been added to dom-core-security quite early in order to avoid information leakage. But we missed to trigger your review together with that , sorry.

Depends on: 1618541

There are no crashes with builds later than 20200309095159. And we had occurrences in beta 74 (and 73) but not yet in beta 75. The fix from bug 1618541 and/or the FileManager/Info cleanup from bug 1617170 seem to help. Not sure if we can then just close this bug here or if there is something left to do? We might want to backport at least bug 1618541 to ESR ? But from 1618541#c6 it seems, it is not even in the 75 beta? Then we should check, which patches from bug 1617170 are already contained in beta?

Flags: needinfo?(dveditz)
Flags: needinfo?(sgiesecke)

This is not happening much on ESR-68, especially considering those crash reports aren't throttled. bug 1617170 is way too big to take on ESR and we don't want to try to cherry-pick our way through it without a reliable testcase. I think we call it "worksforme" on mainline and "wontfix" for ESR

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(dveditz)
Resolution: --- → WORKSFORME
Flags: needinfo?(sgiesecke)
Group: dom-core-security
You need to log in before you can comment on or make changes to this bug.