Closed Bug 110226 Opened 23 years ago Closed 23 years ago

nsObserverService corruption

Categories

(Core :: XPCOM, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla0.9.7

People

(Reporter: KaiE, Assigned: alecf)

Details

Attachments

(2 files, 1 obsolete file)

I wrote code that used AddObserver("xpcom-shutdown"). But the notification was never received. During debugging I got the impression the internal list of nsObserverService might become corrupted, as for the same class, the notification was sent multiple times. For fun, I changed my code to call AddObserver("xpcom-shutdown") multiple times. Even twice were enough, and the notification was received. Hunting this down, I saw that some implementations of Observe do use RemoveObserver(this) while being notified. For test purposes, I changed the implementation of nsObserverService. I set a flag while notification is active. If RemoveObserver is reached while this flag is set, I simply return without trying to update the list. With this change, my code works. A simple registration is enough and my code gets notified. I don't know whether it is forbidden to call RemoveObserver during notification, or whether nsObserverService::RemoveObserver is buggy. David Baron suggested we could add some assertion checks that tell us that this happens. I will attach a patch that implements this assertion. (Note that this patch does not fix my problem.) In my environment, three "remove during notification" were encountered: 1) nsExternalHelperAppService::Observe 2) nsObserverService::RemoveObserver(nsIObserver *, char const *)+0x00000044 [/home/kai/new-trunk/mozilla-debug/dist/bin/libxpcom.so +0x0008F1C4] nsEventStateManager::~nsEventStateManager(void)+0x00000432 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x0027224E] nsEventStateManager::Release(void)+0x00000099 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00272575] nsCOMPtr<nsIEventStateManager>::~nsCOMPtr(void)+0x0000004A [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x00338F3E] nsPresContext::~nsPresContext(void)+0x00000487 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E62BF] GalleyContext::~GalleyContext(void)+0x00000036 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E37E2] nsPresContext::Release(void)+0x000000A1 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E6545] nsDOMEvent::~nsDOMEvent(void)+0x000000F1 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00287369] nsDOMEvent::Release(void)+0x0000009F [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00287603] XPCJSRuntime::GCCallback(JSContext *, JSGCStatus)+0x000007E7 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libxpconnect.so +0x0005E3F3] nsJSEnvironment::GetScriptingEnvironment(void)+0x000000CC [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libjsdom.so +0x0006162C] js_GC+0x00000D9F [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so +0x0004857B] js_ForceGC+0x0000005C [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so +0x000477C8] JS_GC+0x00000070 [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so +0x00014584] nsDOMSOFactory::Observe(nsISupports *, char const *, unsigned short const *)+0x000000D8 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libjsdom.so +0x00058A60] 3) nsObserverService::RemoveObserver(nsIObserver *, char const *)+0x00000044 [/home/kai/new-trunk/mozilla-debug/dist/bin/libxpcom.so +0x0008F1C4] nsCacheProfilePrefObserver::Remove(void)+0x0000018B [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001BF63] nsCacheService::Shutdown(void)+0x00000074 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001D59C] nsCacheProfilePrefObserver::Observe(nsISupports *, char const *, unsigned short const *)+0x000000B4 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001C3DC]
oh! Yeah, this is a known bug.. (sorry I didn't understand what you meant on IRC, it's more clearly explained here) I like what you're attempting here, but I think we're going to hit a lot of false positives... because its possible to addObserver() on one list when you're notifying on another list. if you can make this mNotifying somehow know which list we're notifying (i.e. store it in the observerlist or something) then I'll review that..
kaie, could you try the patch in bug 105719
dougt: I tested the patch, it does not seem to help. I still do not get notified. But this might be, because with the patch from bug 105719 Mozilla hangs on shutdown. Right after the console says "WebShell 0", the application stops without any CPU cycles being used.
Attached patch Better assertion patch (deleted) — Splinter Review
alecf: Good idea. The attached patch remembers the pointer to the notification topic while notification is active, and only triggers the assertion if adding/removing for the same topic is tried. I assume topic to list is a 1:1 relationship.
Attachment #57897 - Attachment is obsolete: true
alecf: Do you want to review the new patch?
I guess this seems reasonable, but what happens if notifying on one topic causes notification to happen on another topic? lots of possibilities here. That's why I was hoping for something that was based/stored in the observer list. That said, if there's no way to do that (I haven't seen any indication that we've tried) I guess we can go with this temporary solution.
> I was hoping for something that was based/stored in the observer list. Me too. Since we have an object that is specialized for this purpose, nsObserverList, it could keep track of what ObserverListEnumerators it had made. It adds to its list on GetObserverList. We would need to add another method, ReleaseIterator() to nsObserverList. Between those two calls, any call to AddObserver/ReleaseObserver goes through the list of currently attached iterators and tells each at what index an item is being added or removed. The iterator can then adjust its index accordingly.
The simplest way to fix this is to notice when the observer list has changed out from under the enumeration of the list and adjust the enumeration index accordingly. I don't see why we need to get too fancy here, but I haven't tried to fix it (I fixed it before for the 99% case of the current item getting removed during notification just by noticing if the current item still matches the item that got notified, but doug removed that code when he reworked it.)
david, didn't intent to.. sorry. do you want to fix my mistake or should I fix this?
Target Milestone: --- → mozilla0.9.8
Doug, if you have the time, I'd appreciate it if you could fix it.
will do. I am upset that I broke this in the first place. :-/
ok, I got sick of dealing with this so I fixed this the "right" way - notify observers in reverse order, so that if they remove themselves, we're ok because we've already passed them in the list.
Comment on attachment 59158 [details] [diff] [review] notify observers in reverse order + mValueArray->GetElementAt(mIndex-1, aResult); + mIndex--; should be mValueArray->GetElementAt(--mIndex, aResult);
Attachment #59158 - Flags: review+
Comment on attachment 59158 [details] [diff] [review] notify observers in reverse order sr=bienvenu, with doug's comment; I've been running with this for a while w/o any problems
Attachment #59158 - Flags: superreview+
cool thanks - checked in (with the minor tweak)
Assignee: dougt → alecf
Target Milestone: mozilla0.9.8 → mozilla0.9.7
fix is in
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Are there users of this that were banking on the "other" firing order?
They better not have :) If so it wouldn't have been intentional...so if there are any such consumers, then this will help us flush them out... bienvenu and I were both running with this, and since this also fixed 40 bytes of leaks, I think its worth finding any regressions with still two more weeks left in the milestone... this might even fix a few crash-on-shutdown bugs
Alec, shouldn't you remove PRUint32 cnt; nsresult rv = mValueArray->Count(&cnt); too (from both ::HasMoreElements and ::GetNext)?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: