Closed Bug 1624384 Opened 5 years ago Closed 3 years ago

Crash in mozilla::dom::PlacesObservers::NotifyListeners

Status: RESOLVED FIXED

Milestone: 98 Branch

Tracking Flags:

Flag           Tracking  Status
firefox-esr78  ---       wontfix
firefox-esr91  ---       wontfix
firefox74      ---       wontfix
firefox93      ---       wontfix
firefox94      ---       wontfix
firefox95      ---       wontfix
firefox96      ---       wontfix
firefox97      ---       fixed
firefox98      ---       fixed

People

(Reporter: Crashdows, Assigned: daisuke)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

Bug 1624384: Allow nested notifications. 3 years ago Daisuke Akatsuka (:daisuke) (deleted), text/x-phabricator-request		Details
Bug 1624384: Allow nested notifications by notifing events sequentially. 3 years ago Daisuke Akatsuka (:daisuke) (deleted), text/x-phabricator-request	RyanVM : approval-mozilla-beta+	Details

Luker

Reporter

Description

•

5 years ago

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0

Steps to reproduce:

First, open the bookmarks view and search for an item (you must have bookmarked pages beforehand). Select all of the results with Ctrl+A, or right-click and choose Select All, then try to delete them with the Delete key or a similar action (with at least 100 bookmarks selected).

Actual results:

The browser crashes and nothing is deleted.
Here are three crash reports from my browser:
https://crash-stats.mozilla.org/report/index/90f49bf7-c3ad-4ec0-a56b-7f7e80200322

https://crash-stats.mozilla.org/report/index/770fd0a7-bcc7-4ee3-96b6-e5a620200322

https://crash-stats.mozilla.org/report/index/a6ffbdbd-146d-48ef-8331-a403e0200322

Expected results:

The bookmarks should be removed.

[:philipp]

Updated

•

5 years ago

Crash Signature: [@ mozilla::dom::PlacesObservers::NotifyListeners ]

status-firefox74: --- → affected

Component: Untriaged → Places

Keywords: crash

Product: Firefox → Toolkit

Summary: mozilla::dom::PlacesObservers::NotifyListeners → Crash in mozilla::dom::PlacesObservers::NotifyListeners

Lina Butler [:lina]

Comment 1

•

5 years ago

I wonder if this is because deleting that many bookmarks at once fires a notification that causes some other code to run and also fire a notification, so we end up re-entering notifyObservers. This is enough to trigger that assertion:

PlacesObservers.addListener(["bookmark-added"], events =>
  PlacesObservers.notifyListeners([new PlacesBookmarkAddition({ dateAdded: 0, guid: "bookmarkAAAA", id: -1, index: 0, isTagging: false, itemType: 1, parentGuid: "fakeAAAAAAAA", parentId: -2, source: 0, title: "A", url: "http://example.com/a" })])
);
// PlacesObservers.notifyListeners([]); // Uncomment to crash!

Javad, do you have Sync turned on? That would be my first guess for what's firing notifyObservers...
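
To make the suspected failure mode concrete outside the browser, here is a minimal sketch of a notifier that refuses re-entrant calls. It is a toy model, not the PlacesObservers code: `StrictNotifier`, its `notifying` flag, and the event names are all hypothetical, and a thrown error stands in for the assertion.

```javascript
// Toy model only: a notifier that, like the asserting code path, refuses
// to be re-entered while it is still delivering events.
class StrictNotifier {
  constructor() {
    this.listeners = [];
    this.notifying = false; // hypothetical flag standing in for the assertion
  }

  addListener(fn) {
    this.listeners.push(fn);
  }

  notifyListeners(events) {
    if (this.notifying) {
      throw new Error("re-entrant notifyListeners call");
    }
    this.notifying = true;
    try {
      for (const fn of this.listeners) {
        fn(events);
      }
    } finally {
      this.notifying = false;
    }
  }
}

const strict = new StrictNotifier();

// The listener reacts to one notification by firing another one.
strict.addListener(events => {
  if (events.includes("bookmark-removed")) {
    strict.notifyListeners(["sync-changed"]);
  }
});

let reentryError = null;
try {
  strict.notifyListeners(["bookmark-removed"]);
} catch (e) {
  reentryError = e;
}
console.log(reentryError.message);
```

The outer call never finishes delivering its own events, which matches the reported symptom that nothing gets deleted.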

Luker

Reporter

Comment 2

•

5 years ago

(In reply to :Lina Cambridge from comment #1)

I wonder if this is because deleting that many bookmarks at once fires a notification that causes some other code to run and also fire a notification, so we end up re-entering notifyObservers. This is enough to trigger that assertion:
PlacesObservers.addListener(["bookmark-added"], events =>
  PlacesObservers.notifyListeners([new PlacesBookmarkAddition({ dateAdded: 0, guid: "bookmarkAAAA", id: -1, index: 0, isTagging: false, itemType: 1, parentGuid: "fakeAAAAAAAA", parentId: -2, source: 0, title: "A", url: "http://example.com/a" })])
);
// PlacesObservers.notifyListeners([]); // Uncomment to crash!
Javad, do you have Sync turned on? That would be my first guess for what's firing notifyObservers...

Yes, of course. Browser Sync is turned on and works well.

Marco Bonardo [:mak]

Comment 3

•

5 years ago

Do you have a plan of action for this?

Flags: needinfo?(lina)

Lina Butler [:lina]

Comment 4

•

5 years ago

Doug, what do you think is the best way to make notifyListeners re-entrant? I see we have a queue for addListener and removeListener calls; should we also queue up notifyListeners calls, and pop them off in a loop? (Flushing our addListener and removeListener queues after each turn, so that re-entrant calls to notifyListeners that end up calling {add, remove}Listener...my head hurts a bit thinking about it! 😅) Or is this not worth it or unsafe?
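
A rough sketch of the queueing idea, to make the question easier to reason about. Everything here is an assumption for illustration (`QueueingNotifier`, `pendingEvents`, `pendingListenerOps`); it only models the control flow described above, not the real class.

```javascript
// Toy model of the queueing idea; not the PlacesObservers implementation.
class QueueingNotifier {
  constructor() {
    this.listeners = [];
    this.pendingListenerOps = []; // add/remove calls made during delivery
    this.pendingEvents = []; // FIFO of event batches waiting to be sent
    this.draining = false;
  }

  addListener(fn) {
    this._changeListeners(() => this.listeners.push(fn));
  }

  removeListener(fn) {
    this._changeListeners(() => {
      const i = this.listeners.indexOf(fn);
      if (i !== -1) this.listeners.splice(i, 1);
    });
  }

  _changeListeners(op) {
    // While delivering, defer the change so the loop below is not disturbed.
    if (this.draining) this.pendingListenerOps.push(op);
    else op();
  }

  notifyListeners(events) {
    this.pendingEvents.push(events);
    if (this.draining) return; // the outer call will pick this batch up
    this.draining = true;
    try {
      while (this.pendingEvents.length) {
        const batch = this.pendingEvents.shift();
        for (const fn of this.listeners) fn(batch);
        // Flush listener changes after each turn, before the next batch.
        for (const op of this.pendingListenerOps.splice(0)) op();
      }
    } finally {
      this.draining = false;
    }
  }
}

const queued = new QueueingNotifier();
const queuedLog = [];
queued.addListener(batch => {
  queuedLog.push(`A start ${batch}`);
  if (batch === "outer") queued.notifyListeners("inner");
  queuedLog.push(`A end ${batch}`);
});
queued.addListener(batch => queuedLog.push(`B ${batch}`));
queued.notifyListeners("outer");
console.log(queuedLog);
```

In this model the re-entrant call returns before any listener has seen "inner"; that batch is delivered only after every listener has finished with "outer".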

Flags: needinfo?(lina) → needinfo?(dothayer)

Doug Thayer [:dthayer] (he/him)

Comment 5

•

5 years ago

I don't have strong opinions here, but I can give my two cents regarding the trade-offs.

I don't have any technical objections to there being a queue of notifyListeners calls; however, it can lead to bad assumptions regarding the ordering guarantees. For the most part, if you call notifyListeners you can assume that all listeners have run by the time the function returns. With a queue, this would only be true if you're not calling it in a reentrant way.

You could conceivably just allow it to nest, which would keep the ordering guarantees, but you would need to fix the CallListeners function which removes listeners that no longer exist (due to weak references) - since if we were in a nested notifyListeners call, this would mess up the iteration of the parent notifyListeners call. There might be other things that you would need to tweak to get this to be robust - I can't guarantee CallListeners is the end of it.

The third option would just be to require any consumers to defer their notifyListeners call, which would make the ordering at the call site explicit and not require any risky changes inside the PlacesObservers system. But, it would leave the risk around of introducing unexpected crashes like this in the future.

I would be okay with any of these options - I'm not in this code often enough anymore to feel like I have significant stake in it. If you would like me to have a preference though, I would say option 2 (keeping the ordering guarantees and allowing nested notifyListeners calls) sounds the most appealing. It also has the benefit that it will crash if we find ourselves in an infinite loop by running out of stack size, rather than just happily eating resources in the background like the other options.
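
For illustration, option 2 could look roughly like the sketch below. It is a simplified model under stated assumptions: `NestingNotifier`, the `depth` counter, and the `dead` flag (standing in for a cleared weak reference) are hypothetical names, and the real CallListeners may need more than this.

```javascript
// Toy model of nested notifications with deferred cleanup of dead listeners.
class NestingNotifier {
  constructor() {
    this.entries = []; // { fn, dead } records; `dead` models a cleared weak ref
    this.depth = 0; // how many notifyListeners calls are on the stack
  }

  addListener(fn) {
    const entry = { fn, dead: false };
    this.entries.push(entry);
    return entry;
  }

  notifyListeners(events) {
    this.depth++;
    try {
      // Iterate by index over a length snapshot; nothing is removed while
      // any notification is in flight, so indices stay valid.
      const count = this.entries.length;
      for (let i = 0; i < count; i++) {
        const entry = this.entries[i];
        if (!entry.dead) entry.fn(events);
      }
    } finally {
      this.depth--;
      // Only the outermost call compacts the array, so no loop that is
      // still running ever sees its positions shift.
      if (this.depth === 0) {
        this.entries = this.entries.filter(e => !e.dead);
      }
    }
  }
}

const nested = new NestingNotifier();
const nestedLog = [];
nested.addListener(ev => {
  nestedLog.push(`A start ${ev}`);
  if (ev === "outer") {
    doomed.dead = true; // a listener "dies" in the middle of a notification
    nested.notifyListeners("inner");
  }
  nestedLog.push(`A end ${ev}`);
});
const doomed = nested.addListener(ev => nestedLog.push(`dead ${ev}`));
nested.addListener(ev => nestedLog.push(`C ${ev}`));
nested.notifyListeners("outer");
console.log(nestedLog, nested.entries.length);
```

Here the nested call has fully run by the time it returns, so the ordering guarantee is kept, and a listener that keeps re-notifying without bound would exhaust the stack instead of looping quietly.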

Flags: needinfo?(dothayer)

BugBot [:suhaib / :marco/ :calixte]

Comment 6

•

5 years ago

The priority flag is not set for this bug.
:mak, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mak)

Marco Bonardo [:mak]

Updated

•

5 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Flags: needinfo?(mak)

Priority: -- → P1

Jan Alexander Steffens [:heftig]

Updated

•

3 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1734566

Gian-Carlo Pascutto [:gcp]

Comment 7

•

3 years ago

Marco, you have had this as P1 for 2 years. We have a regression in bug 1734566, can you advise?

Flags: needinfo?(mak)

Marco Bonardo [:mak]

Comment 8

•

3 years ago

P1 and P2 are just used internally to the "team" to track things, what matters for the project is the severity field.

I must refresh my memory about this issue, it's a protective assertion to avoid re-entrant calls, and we must pick a strategy for that, as explained in comment 5.
In practice there are 3 options:

1. Enqueue new notifications. In this case we manage a FIFO queue of notifications, and whoever is notifying checks whether there is more to send. The nested order would be lost, but this could be easier to do.
2. Nest notifications. In this case new ones are just fired; CallListeners must change to check whether it is in a nested call, and remove dead listeners only at the end, when it is not nested. Nested order is respected, but there is some risk of mismanaging the listeners in CallListeners.
3. Just annotate in the WebIDL that notifyListeners calls cannot be nested, and fix any consumers doing it to enqueue instead. It's easy, but doesn't remove the risk of hitting the assertion. The order would be much the same as in (1), so this is likely not our best option.

I agree with Doug that a first attempt should be made to implement (2); if that becomes too hairy, we can fall back to (1).
Daisuke, would you have time to investigate this? Feel free to push back if it's too much or you're overloaded and we'll find a different solution.
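
As a point of comparison, the consumer-side deferral of the third option might be sketched as follows. This is only a model: `guardedNotify`, `deferredWork`, and `runDeferred` are hypothetical names, and the explicit work list stands in for whatever deferral mechanism a real consumer would use (for example a microtask).

```javascript
// Strict notifier: re-entrant calls are treated as a programming error.
const guardedListeners = [];
let isNotifying = false;

function guardedNotify(events) {
  if (isNotifying) throw new Error("nested notifyListeners is not allowed");
  isNotifying = true;
  try {
    for (const fn of guardedListeners) fn(events);
  } finally {
    isNotifying = false;
  }
}

// Hypothetical consumer-side helper: collect follow-up work and run it
// only once the current notification has fully returned.
const deferredWork = [];

function runDeferred() {
  while (deferredWork.length) {
    const task = deferredWork.shift();
    task();
  }
}

const deferLog = [];
guardedListeners.push(events => {
  deferLog.push(`saw ${events}`);
  if (events === "bookmark-removed") {
    // Instead of notifying from inside the listener, schedule it.
    deferredWork.push(() => guardedNotify("sync-changed"));
  }
});

guardedNotify("bookmark-removed");
runDeferred();
console.log(deferLog);
```

The ordering is explicit at the call site, but any consumer that forgets to defer still hits the guard, which is the remaining risk mentioned above.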

Flags: needinfo?(mak) → needinfo?(daisuke)

Daisuke Akatsuka (:daisuke)

Assignee

Comment 9

•

3 years ago

Yes, let me do this.
I understand the reason for this crash now, so I have some questions about the implementation.

(In reply to Marco Bonardo [:mak] from comment #8)

enqueue new notifications. In this case we manage a FIFO queue of notifications, whoever is notifying will check if there's more to send. The nested order would be messed up, but could be easier to do.

Does this mean we keep the events as shown below?

void PlacesObservers::NotifyListeners(aEvents) {
  // Remember the incoming events first...
  mEventsList.AppendElement(aEvents);
  // ...then notify the listeners.
}

nest notifications. In this case new ones are just fired, CallListeners must change to check if it's a nested call, and remove dead listeners only at the end if it's not nested. Nested order is respected, but it has some risks of mismanaging in CallListeners.

And, I'm so sorry, I don’t understand clearly yet what kind of implementation you are thinking of for these nested notifications. If possible, could you explain a bit more?

Flags: needinfo?(daisuke) → needinfo?(mak)

Marco Bonardo [:mak]

Comment 10

•

3 years ago

Yeah, the implementation bits are the tricky part to figure out here and may require playing with the code a bit.
I assume that in (1) we'd detect that we are already notifying and add the new notifications to some sort of queue; the original code that is currently notifying would, once done with its own notifications, check whether there's anything in the queue and serve it.
For (2), instead, I think you'd basically just have to remove the assert and let the current code continue, but not remove null listeners (also see Doug's comment 5) in nested calls, to avoid changing positions in the listeners array for the outer loop. Thus each call should track whether it's nested or not.
This is only a first impression; as I said, it's possible that non-obvious issues will arise when we start touching the code.
Doing this in a test-driven manner is probably advisable; it should be easy to write some nested-notification test cases.

Flags: needinfo?(mak)

Daisuke Akatsuka (:daisuke)

Assignee

Comment 11

•

3 years ago

Thank you very much!
I think I understand now :)

Daisuke Akatsuka (:daisuke)

Assignee

Comment 12

•

3 years ago

Attached file Bug 1624384: Allow nested notifications. (obsolete) (deleted) — Details

Phabricator Automation

Updated

•

3 years ago

Assignee: nobody → daisuke

Status: NEW → ASSIGNED

Dianna Smith [:diannaS]

Updated

•

3 years ago

status-firefox93: --- → wontfix

status-firefox94: --- → wontfix

status-firefox95: --- → affected

tracking-firefox95: --- → +

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

status-firefox74: affected → wontfix

status-firefox-esr78: --- → wontfix

status-firefox-esr91: --- → wontfix

Daisuke Akatsuka (:daisuke)

Assignee

Comment 14

•

3 years ago

Attached file Bug 1624384: Allow nested notifications by notifing events sequentially. (deleted) — Details

Martin Stránský [:stransky] (ni? me)

Updated

•

3 years ago

Regressed by: 1729423

BMO Automation

Updated

•

3 years ago

Has Regression Range: --- → yes

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago

Keywords: regression

BugBot [:suhaib / :marco/ :calixte]

Comment 15

•

3 years ago

Set release status flags based on info from the regressing bug 1729423

status-firefox96: --- → affected

matias

Comment 18

•

3 years ago

I just got a crash after deleting just a single bookmark: https://crash-stats.mozilla.org/report/index/0fd9ead8-20a9-4229-b783-c9e590211113

Kestrel

Comment 19

•

3 years ago

Workaround for the crashes starting in 94 (Bug 1734566) is to set widget.wayland.async-clipboard.enabled = false in about:config.

Pascal Chevrel:pascalc (PTO until August 21)

Comment 20

•

3 years ago

No need to track for 95 given the volume on beta.

tracking-firefox95: + → ---

Pascal Chevrel:pascalc (PTO until August 21)

Updated

•

3 years ago

status-firefox95: affected → wontfix

Comment hidden (metoo)

Phabricator Automation

Updated

•

3 years ago

Attachment #9246084 - Attachment is obsolete: true

Martin Stránský [:stransky] (ni? me)

Comment 27

•

3 years ago

(In reply to Kestrel from comment #19)

Workaround for the crashes starting in 94 (Bug 1734566) is to set widget.wayland.async-clipboard.enabled = false in about:config.

The workaround here will be removed as it causes more clipboard issues (failed paste, non-working clipboard in dialogs).
Is there any way to fix that?

Tizio Caio

Comment 28

•

3 years ago

I also experience a crash when deleting just one bookmark.

I don't know whether it's this bug or another one.

Here is the log: https://crash-stats.mozilla.org/report/index/94a7c5ad-7920-415e-b3e9-9497d0211208

How can I work around it?

Will there be a fix?

lamalbrut

Comment 29

•

3 years ago

Distribution ID: mozilla-MSIX | User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0 | OS: Windows_NT 10.0 19044. No crash.

Luker

Reporter

Updated

•

3 years ago

status-firefox97: --- → affected

Pulsebot

Comment 30

•

3 years ago

Pushed by dakatsuka.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7736689d07b7 Allow nested notifications by notifing events sequentially. r=mak

Atila Butkovits

Comment 31

•

3 years ago

Backed out for causing leaks.

Backout link: https://hg.mozilla.org/integration/autoland/rev/61ce5f7c90e883d5db84a810e13720ae135f3a7f

Push with failures: https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel&searchStr=linux%2C18.04%2Cx64%2Cwebrender%2Casan%2Copt%2Cmochitests%2Ctest-linux1804-64-asan-qr%2Fopt-mochitest-devtools-chrome-e10s%2Cdt2&revision=7736689d07b77f115ac850926bd1aee2cacede5b&selectedTaskRun=Vf3sd20fSCGEmi54kLnH3Q.0

Failure log: https://treeherder.mozilla.org/logviewer?job_id=361122436&repo=autoland&lineNumber=2447

Flags: needinfo?(daisuke)

Comment hidden (metoo)

Mark Banner (:standard8)

Comment 33

•

3 years ago

(In reply to Tizio Caio from comment #32)

Still crashing if I delete a single bookmark

Please be patient, this bug is not yet marked as fixed, and even when it is fixed, it'll probably be 10th Jan before it is included in a release.

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

status-firefox96: affected → wontfix

Daisuke Akatsuka (:daisuke)

Assignee

Updated

•

3 years ago

Flags: needinfo?(daisuke)

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago

Crash Signature: [@ mozilla::dom::PlacesObservers::NotifyListeners ] → [@ mozilla::dom::PlacesObservers::NotifyListeners ] [@ libc.so.6@0xf6b5d | libglib-2.0.so.0@0x594dc]

Pulsebot

Comment 38

•

3 years ago

Pushed by dakatsuka.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/56b41210698b Allow nested notifications by notifing events sequentially. r=mak

Cristina Cozmuta (:CrissCozmuta)

Comment 39

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/56b41210698b

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

status-firefox98: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 98 Branch

Frank Steinborn

Comment 40

•

3 years ago

Can this be backported to Firefox 96? It is rather inconvenient to not be able to delete a bookmark...

Ryan VanderMeulen [:RyanVM]

Comment 41

•

3 years ago

This patch is not a good backport candidate for 96 due to the large size and complexity. It needs time to bake before we ship it to the release population at large. We are hoping to backport to Beta for Fx97 due to ship in a couple weeks, however.

Daisuke Akatsuka (:daisuke)

Assignee

Comment 42

•

3 years ago

Comment on attachment 9248579 [details]
Bug 1624384: Allow nested notifications by notifing events sequentially.

Beta/Release Uplift Approval Request

User impact if declined: Firefox might crash when deleting a bookmark on Linux.
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: No
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: None
Risk to taking this patch: Medium
Why is the change risky/not risky? (and alternatives if risky): The patch changes how Places events are notified, and it is not small.
String changes made/needed: none

Attachment #9248579 - Flags: approval-mozilla-beta?

Ryan VanderMeulen [:RyanVM]

Comment 44

•

3 years ago

Comment on attachment 9248579 [details]
Bug 1624384: Allow nested notifications by notifing events sequentially.

Moderately-risky patch needed to address a pretty frequent Linux crash spike. Approved for 97.0b6.

Attachment #9248579 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Ryan VanderMeulen [:RyanVM]

Comment 45

•

3 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/aa7d6cfa821d

status-firefox97: affected → fixed

Flags: in-testsuite+

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago
