Crash in mozilla::dom::PlacesObservers::NotifyListeners
Categories
(Toolkit :: Places, defect, P1)
Tracking
()
People
(Reporter: Crashdows, Assigned: daisuke)
References
(Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file, 1 obsolete file)
(deleted),
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
|
Details |
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0
Steps to reproduce:
first please make sure go to bookmarks and search for an object(you must bookmarks page before) and select all bookmarks with ctrl+a or right click and select all, then try to delete with delete key or etc( at least 100 bookmark)
Actual results:
crash happens and nothing delete.
here is 3 times that my browser crash:
https://crash-stats.mozilla.org/report/index/90f49bf7-c3ad-4ec0-a56b-7f7e80200322
https://crash-stats.mozilla.org/report/index/770fd0a7-bcc7-4ee3-96b6-e5a620200322
https://crash-stats.mozilla.org/report/index/a6ffbdbd-146d-48ef-8331-a403e0200322
Expected results:
bookmarks must be removed.
Updated•5 years ago
|
Comment 1•5 years ago
|
||
I wonder if this is because deleting that many bookmarks at once fires a notification that causes some other code to run and also fire a notification, so we end up re-entering notifyObservers
. This is enough to trigger that assertion:
PlacesObservers.addListener(["bookmark-added"], events =>
PlacesObservers.notifyListeners([new PlacesBookmarkAddition({ dateAdded: 0, guid: "bookmarkAAAA", id: -1, index: 0, isTagging: false, itemType: 1, parentGuid: "fakeAAAAAAAA", parentId: -2, source: 0, title: "A", url: "http://example.com/a" })])
);
// PlacesObservers.notifyListeners([]); // Uncomment to crash!
Javad, do you have Sync turned on? That would be my first guess for what's firing notifyObservers
...
(In reply to :Lina Cambridge from comment #1)
I wonder if this is because deleting that many bookmarks at once fires a notification that causes some other code to run and also fire a notification, so we end up re-entering
notifyObservers
. This is enough to trigger that assertion:PlacesObservers.addListener(["bookmark-added"], events => PlacesObservers.notifyListeners([new PlacesBookmarkAddition({ dateAdded: 0, guid: "bookmarkAAAA", id: -1, index: 0, isTagging: false, itemType: 1, parentGuid: "fakeAAAAAAAA", parentId: -2, source: 0, title: "A", url: "http://example.com/a" })]) ); // PlacesObservers.notifyListeners([]); // Uncomment to crash!
Javad, do you have Sync turned on? That would be my first guess for what's firing
notifyObservers
...
yes ofc, browser sync turned onf and work well
Comment 4•5 years ago
|
||
Doug, what do you think is the best way to make notifyListeners
re-entrant? I see we have a queue for addListener
and removeListener
calls; should we also queue up notifyListeners
calls, and pop them off in a loop? (Flushing our addListener
and removeListener
queues after each turn, so that re-entrant calls to notifyListeners
that end up calling {add, remove}Listener
...my head hurts a bit thinking about it! 😅) Or is this not worth it or unsafe?
Comment 5•5 years ago
|
||
I don't have strong opinions here, but I can give my two cents regarding the trade-offs.
I don't have any technical objections for there being a queue of notifyListeners
calls - however it can lead to bad assumptions regarding the ordering guarantees. For the most part if you call notifyListeners
you can assume that all listeners have run by the time the function returns. This however would now only be true if you're not calling it in a reentrant way.
You could conceivably just allow it to nest, which would keep the ordering guarantees, but you would need to fix the CallListeners
function which removes listeners that no longer exist (due to weak references) - since if we were in a nested notifyListeners
call, this would mess up the iteration of the parent notifyListeners
call. There might be other things that you would need to tweak to get this to be robust - I can't guarantee CallListeners
is the end of it.
The third option would just be to require any consumers to defer their notifyListeners call, which would make the ordering at the call site explicit and not require any risky changes inside the PlacesObservers
system. But, it would leave the risk around of introducing unexpected crashes like this in the future.
I would be okay with any of these options - I'm not in this code often enough anymore to feel like I have significant stake in it. If you would like me to have a preference though, I would say option 2 (keeping the ordering guarantees and allowing nested notifyListeners
calls) sounds the most appealing. It also has the benefit that it will crash if we find ourselves in an infinite loop by running out of stack size, rather than just happily eating resources in the background like the other options.
Comment 6•5 years ago
|
||
The priority flag is not set for this bug.
:mak, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•5 years ago
|
Updated•3 years ago
|
Comment 7•3 years ago
|
||
Marco, you have this as P1 for 2 years. We have a regression in bug 1734566, can you advise?
Comment 8•3 years ago
|
||
P1 and P2 are just used internally to the "team" to track things, what matters for the project is the severity field.
I must refresh my memory about this issue, it's a protective assertion to avoid re-entrant calls, and we must pick a strategy for that, as explained in comment 5.
In practice there are 3 options:
- enqueue new notifications. In this case we manage a FIFO queue of notifications, whoever is notifying will check if there's more to send. The nested order would be messed up, but could be easier to do.
- nest notifications. In this case new ones are just fired, CallListeners must change to check if it's a nested call, and remove dead listeners only at the end if it's not nested. Nested order is respected, but it has some risks of mismanaging in CallListeners.
- just annotate in the webidl that it's not possible to nest listeners, and fix any consumers doing it to enqueue. It's easy, but doesn't remove the risk of hitting the assertion. The order would be pretty much similar to (1), so this is likely not our best option.
I agree with Doug that a first try should be made to implement (2), if that becomes too hairy, we can fallback to (1).
Daisuke, would you have time to investigate this? Feel free to push back if it's too much or you're overloaded and we'll find a different solution.
Assignee | ||
Comment 9•3 years ago
|
||
Yes, let me do this.
I got the reason for this crash. So, I have questions about implementation.
(In reply to Marco Bonardo [:mak] from comment #8)
- enqueue new notifications. In this case we manage a FIFO queue of notifications, whoever is notifying will check if there's more to send. The nested order would be messed up, but could be easier to do.
Does this mean, we keep the events as like below?
void PlacesObservers::NotifyListeners(aEvents) {
mEventsList.AppendElement(aEvents);
.... notify
}
- nest notifications. In this case new ones are just fired, CallListeners must change to check if it's a nested call, and remove dead listeners only at the end if it's not nested. Nested order is respected, but it has some risks of mismanaging in CallListeners.
And, I'm so sorry, I don’t understand clearly yet what kind of implementation you are thinking of for these nested notifications. If possible, could you explain a bit more?
Comment 10•3 years ago
|
||
Yeah, the implementation bits are the tricky part to figure out here and may require playing with the code a bit.
I assume in (1) we'd detect that we are already notifying, thus we'd add those notifications to some sort of a queue, the original code that is currently notifying, when done with its notifications would check if there's anything in queue and serve it.
For (2) instead I think you'd basically just have to remove the assert and let the current code continue, but don't remove null listeners (also see Doug's comment 5) in nested calls, to avoid changing the listeners array position for the outside loop. Thus each call should track whether its nested or not.
This is on a first stance, as I said it's possible there are non visible issues that will arise when we start touching the code.
Doing this in a test-driver manner is probably suggested, it should be easy to write some nested notifications cases.
Assignee | ||
Comment 11•3 years ago
|
||
Thank you very much!
Perhaps, I could understand :)
Assignee | ||
Comment 12•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 14•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Comment 15•3 years ago
|
||
Set release status flags based on info from the regressing bug 1729423
Comment 18•3 years ago
|
||
I just got a crash after deleting just a single bookmark: https://crash-stats.mozilla.org/report/index/0fd9ead8-20a9-4229-b783-c9e590211113
Comment 19•3 years ago
|
||
Workaround for the crashes starting in 94 (Bug 1734566) is to set widget.wayland.async-clipboard.enabled = false
in about:config
.
Comment 20•3 years ago
|
||
No need to track for 95 given the volume on beta.
Updated•3 years ago
|
Comment hidden (metoo) |
Updated•3 years ago
|
Comment 27•3 years ago
|
||
(In reply to Kestrel from comment #19)
Workaround for the crashes starting in 94 (Bug 1734566) is to set
widget.wayland.async-clipboard.enabled = false
inabout:config
.
The workaround here will be removed as it causes more clipboard issues (failed paste, non-working clipboard in dialogs).
Is there any way how to fix that?
Comment 28•3 years ago
|
||
I also experience a crash when deleting just one bookmark.
I don't know whether it's this bug or another one.
Here is the log: https://crash-stats.mozilla.org/report/index/94a7c5ad-7920-415e-b3e9-9497d0211208
How can I work around it?
Will there be a fix?
Comment 29•3 years ago
|
||
Distribution ID: mozilla-MSIX | User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0 | OS: Windows_NT 10.0 19044. No crash.
Comment 30•3 years ago
|
||
Comment 31•3 years ago
|
||
Backed out for causing leaks.
Backout link: https://hg.mozilla.org/integration/autoland/rev/61ce5f7c90e883d5db84a810e13720ae135f3a7f
Failure log: https://treeherder.mozilla.org/logviewer?job_id=361122436&repo=autoland&lineNumber=2447
Comment hidden (metoo) |
Comment 33•3 years ago
|
||
(In reply to Tizio Caio from comment #32)
Still crashing if I delete a one single bookmark
Please be patient, this bug is not yet marked as fixed, and even when it is fixed, it'll probably be 10th Jan before it is included in a release.
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Comment 38•3 years ago
|
||
Comment 39•3 years ago
|
||
bugherder |
Comment 40•3 years ago
|
||
Can this be backported to Firefox 96? It is rather inconvinient to not be able to delete a bookmark...
Comment 41•3 years ago
|
||
This patch is not a good backport candidate for 96 due to the large size and complexity. It needs time to bake before we ship it to the release population at large. We are hoping to backport to Beta for Fx97 due to ship in a couple weeks, however.
Assignee | ||
Comment 42•3 years ago
|
||
Comment on attachment 9248579 [details]
Bug 1624384: Allow nested notifications by notifing events sequentially.
Beta/Release Uplift Approval Request
- User impact if declined: Might be crash when deleting bookmark on linux.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): The patch changes the way to notify places events, and is not so small.
- String changes made/needed: none
Comment 44•3 years ago
|
||
Comment on attachment 9248579 [details]
Bug 1624384: Allow nested notifications by notifing events sequentially.
Moderately-risky patch needed to address a pretty frequent Linux crash spike. Approved for 97.0b6.
Comment 45•3 years ago
|
||
bugherder uplift |
Updated•3 years ago
|
Description
•