Closed Bug 59102
Opened 24 years ago · Closed 24 years ago
Last install dir skin is clobbered by last profile dir skin

Categories: Core Graveyard :: RDF, defect, P3
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: (Reporter: bugs, Assigned: waterson)
Whiteboard: [rtm++]
Attachments (2 files)
patch (deleted)
patch (deleted)
OK, this is an interesting bug ;)
Two ways to reproduce:
Use the installer and run a recommended install. Launch the app and go to
View->Apply Theme. Note the two themes supplied, Modern and Classic. Go to
http://gooey/client/themes/ and install two of the skins there (selecting them
is optional; only installation is required). Alternatively,
http://jrgm/bugs/58515 has variations on Classic to install.
Quit and restart the app. The Classic skin (from the install directory) is no
longer in either the menu or in the list in preferences.
Other method:
Use the installer, choose custom install, and deselect "Classic Skin" from the
list. Launch the app; you should see just Modern in both lists. Now install any
other skin and restart as above, and you'll find that Modern is no longer listed.
It seems that when the number of skins in the install directory (one RDF Seq)
and the number in the profile directory (another RDF Seq) match, the last skin
in the install directory is clobbered by the last skin in the profile directory
(clobbered in terms of being reflected in the UI; all the files and datasources
on disk are correct). It could be that this is a XUL Template Builder bug, so
filing against RDF.
Installing further skins, however, causes Classic/Modern to appear again.
Needless to say, this is pretty nasty, and could leave a user in an
unrecoverable state (especially with method 2) if the installed skin is
switched to and is malicious, since the user is then unable to download further
themes (see the previous paragraph). Short of knowing the right files to
remove, or creating a new profile and pitching the old one, there is no
recovery.
Reporter
Comment 2 • 24 years ago
More comments:
hyatt says this could be an issue with how the clash in numbering between the
install dir skin Seq and the profile dir skin Seq is resolved. (To display the
list of available skins, he overlays the user profile graph on top of the
install graph; both Seqs have identical numbering starting at _1, _2, etc., and
apparently code in rdf/ resolves this conflict. This could be the source of
the bug.)
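To picture what that overlay produces, here is a minimal runnable C++ model
(the container shape and skin names are illustrative assumptions, not the real
chrome registry code):

#include <map>
#include <string>
#include <vector>

int main() {
    // Each one-based ordinal (_1, _2, ...) in the aggregate graph can hold
    // a target from the install Seq *and* one from the profile Seq, because
    // both Seqs number their members identically.
    std::map<int, std::vector<std::string>> aggregated;
    aggregated[1] = {"install:classic", "profile:new-skin-1"};
    aggregated[2] = {"install:modern",  "profile:new-skin-2"};
    // An enumerator over the aggregate must visit every target at every
    // index; dropping a target at the last index is exactly this bug.
    return 0;
}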
Reporter
Updated • 24 years ago
Severity: normal → critical
OS: Windows 2000 → All
Hardware: PC → All
Comment 3 • 24 years ago
Comment 4 • 24 years ago
waterson, the above patch fixes the problem. I'm not completely sure about it
though. Take a look and let me know.
The basic idea is that when there are multiple targets for each ordinal (_1, _2,
etc.), the enumerator only walks one of the final set of targets.
I view this as being a stop-ship bug, since the user will quite easily hose
their entire setup when they start installing skins.
Upgrading to rtm need info.
Whiteboard: [rtm need info]
Comment 5 • 24 years ago
PDT is very concerned about this one. Can folks comment on this code path, and
the impact it has on testing? If we took this, we'd have to focus extra testing
on it, and we'd like to know how broad the impact (and regression potential) is.
Adding some folks to the cc list. This is a critical issue for Theme
installation.
Comment 7 • 24 years ago
Wanted to add that this is a crucial bug: I can't open up the theme park and
have themes available from there if this can't be fixed. This of course cuts
into the page views/impressions/revenue to Netcenter, currently the only way
the themes feature is making any money.
Assignee
Comment 8 • 24 years ago
hyatt's fix is right. RDF containers are one-indexed, so we need to iterate
until index <= count.
Comment 9 • 24 years ago
waterson: Does that count as an sr=? Who can r=?
Assignee
Comment 10 • 24 years ago
cc'ing rjc and alecf, who are familiar with this stuff. Could you two guys take
a look at hyatt's patch above?
Comment 11 • 24 years ago
If this could get checked into the trunk (pending r=) then I (and others)
could test this in the commercial build for the 8pm spin tonight. [Although
I see there was no 8pm win32 trunk spin last night (?)].
waterson: any particular things I should be looking for in other applications
of this code, outside of whether we retain all the profiles?
Assignee
Comment 12 • 24 years ago
The other places that use RDF sequences are bookmarks, search, and the
localstore. However, I can't think of any other place that's using them the
way that the chrome registry is...
Assignee
Comment 13 • 24 years ago
I checked in hyatt's fix on the trunk.
Comment 14 • 24 years ago
great - jrgm, can you test on the trunk and post your results?
Comment 15 • 24 years ago
Okay, so with this and bug 59130 (a crash that could happen on skin-switching,
among other things) in the trunk, I should be good to go when the builds
come out. I'm starting in my own win2k debug build as soon as that completes.
Comment 16 • 24 years ago
r=rjc on hyatt's fix
Comment 17 • 24 years ago
I ran the smoketests on win32 and linux and found no new problems, especially
no new problems that could be related to RDF (and on windows, when I found
something that might be suspect, I found I could get the same behaviour with
or without the patch). I also worked with search and bookmarks, but found no
(new) problems there.
I tested the skin install procedures, and I did not lose Modern or Classic
as I added more skins (but I will test more tomorrow). I plan to also look
some more at search/localstore/bookmarks tomorrow, but as it stands now, I
don't see any problems with the patch (though this code is used in a lot of
places, so greater depth of testing (e.g., more people) would be good).
Reporter
Comment 18 • 24 years ago
I installed respins with this fix and the problem was solved. General use of
the browser yielded no RDF-related problems.
Comment 19 • 24 years ago
Seeing all the reviews and approvals, the patch sure seems correct. My
confusion is how we could have an off-by-one error on the upper limit in this
code and not impact other sections of the system. Is it really the case that
this method was not used by other parts of our system? ...or was count always
zero (along other code paths), and hence we never used the upper limit in *ANY*
other place in the code?
Has anyone put in a breakpoint to try to verify that this code is not used
elsewhere? Chris mentioned that bookmarks, search, and localstore are the
likely candidates for possibly exercising this code... so running with a
breakpoint set during some testing would at least be interesting. Setting a
conditional breakpoint for "count > 0" on the changed line would be the
interesting thing. In any case where the breakpoint triggered, I'd be
interested in understanding how the calling code worked at all :-/ (for the full
range of items).
To be clear: I'm really trying to understand risk involved with this change.
Identifying ANY additional users of this code is pivotal to that eval.
Thanks,
Jim
Comment 20 • 24 years ago
It's more complex than that. The bug only occurs if you have multiple elements
occupying the same slot. The chrome registry is (as far as I know) the only RDF
data source that uses this oddity of RDF: basically you can have several
elements at position 1, several at 2, several at 3, etc.
The bug was that when we got to the last index, we only returned one of the
multiple elements. So all other data sources still worked. Now we properly
hang out without bailing until we're sure that the entire enumerator (for the
last index) has been depleted.
We were (before) always bailing without depleting the enumerator, but we did
hang around just long enough to return one element (which made everybody else
work).
Comment 21 • 24 years ago
hmmm... given that explanation, why didn't you just say
while (mCurrent || mNextIndex < count)
and leave the count limit as it was? Why do you want to go into the loop
construct "an extra time"?
If you were always willing to "go an extra time," it seems like you could
almost have done:
while (1)
and counted on the code to bail (return) in the first if (!mCurrent) block.
Isn't that almost equivalent to what has been done, now that mNextIndex will
never get bumped any further? (re: mCurrent won't be loaded).
To be real specific, the current patch is *probably* identical to
if (count > 0) while (1) ...
since mNextIndex will never exceed count due to the loop.
Doing it the "go one extra time" way would cause the effort to create an
mCurrent to *always* happen even when there was no work to be done.
What am I missing now?
Thanks,
Jim (junior code reviewer, and paranoid PDT rep) Roskind
Assignee
Comment 22 • 24 years ago
RDF containers are "one-indexed"; that is, their indices run from 1..n. So, in
this case, we really do want to run the index up until it's == count.
Assignee
Comment 23 • 24 years ago
Ok, I missed jar's point on this. It turns out that jar is right: we want to
check if mCurrent is still valid *or* we still have indices we could try.
Although hyatt's fix works, it works mainly because the code recovers properly
from "running off the end".
Assignee
Updated • 24 years ago
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Assignee
Comment 24 • 24 years ago
I've checked jar's version of the fix into the branch and trunk, after verifying
that it fixed this bug:
diff -u -r1.12 nsContainerEnumerator.cpp
--- nsContainerEnumerator.cpp 2000/07/24 22:42:32 1.12
+++ nsContainerEnumerator.cpp 2000/11/06 04:53:48
@@ -180,7 +180,7 @@
     }

     // Now iterate through each index.
-    while (mNextIndex < count) {
+    while (mCurrent || mNextIndex < count) {
         if (! mCurrent) {
             NS_WITH_SERVICE(nsIRDFContainerUtils, rdfc, kRDFContainerUtilsCID, &rv);
             if (NS_FAILED(rv)) return rv;
Marking bug "fixed"...
Comment 25 • 24 years ago
Hyatt's patch looks correct to me.
Jar's patch seems to break the singleton case (count == 1), because on first
call to hasMoreElements, mNextIndex == 1 and mCurrent is null, so the while loop
will not iterate. What am I missing?
/be
Assignee
Comment 26 • 24 years ago
rjc had to set me straight on this. The local variable "count" *isn't* really
the count of elements in the container. It's the largest value of "next
available index" in the container. Which is "1" if the container is empty, "2"
if the container has one element, and so on. (I picked a shitty name for that
local variable.)
Assignee
Comment 27 • 24 years ago
See, for example,
http://lxr.mozilla.org/mozilla/source/rdf/base/src/nsRDFContainer.cpp#207
I'm going to change the variable name from "count" to "max" on the trunk...
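In a runnable nutshell (the variable names here simply mirror the discussion;
this is not the nsRDFContainer code itself):

#include <cassert>

int main() {
    // An RDF Seq records its "next available index"; that is the value the
    // enumerator's badly-named local ends up holding.
    int nextAvailableIndex = 1;      // empty container
    ++nextAvailableIndex;            // append one element: now 2

    int count = nextAvailableIndex;  // the local the loop reads
    int mNextIndex = 1;              // containers are one-indexed

    // This also resolves the singleton worry from comment 25: with one
    // element, count == 2, so the loop's first test does iterate.
    assert(mNextIndex < count);
    return 0;
}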
Comment 28 • 24 years ago
Oh, and you couldn't use mContainer->GetCount because it does not handle
multiple targets
(http://lxr.mozilla.org/mozilla/source/rdf/base/src/nsRDFContainer.cpp#168).
Blech. Trunk cleanup?
(After mid-air collision:) Don't use max, it connotes maximum value. How about
limit?
/be
Assignee
Comment 29 • 24 years ago
Comment 30 • 24 years ago
Or fencepost, if limit sounds like asymptotic value. But why not fix GetCount
so it works on mContainer in this case?
/be
Comment 31 • 24 years ago
If you fixed GetCount or used a "maximum index value not to exceed" as the loop
control, then I claim hyatt's fix is minimal and avoids extra tests.
Right now, with the limit or fencepost as the loop control, we jump from 0 to 2,
3, ... for the empty, single, double, etc. cases, and therefore waste an extra
iteration with hyatt's patch, to terminate the loop.
If the loop control used 0, 1, 2, ... for empty, single, double, etc. case
counts, and used mNextIndex <= count, then no extra mCurrent test would be
needed, and maybe the next person to maintain this code will not be confused.
And you could still call the loop control variable 'count'.
/be
Comment 32 • 24 years ago
The old loop test was wrong. The loop is not traversed in proportion to the
size of "count." The loop counter assures that we actually generate distinct
values of mCurrent in proportion to the value of "count" (with the right
off-by-one adjustment).
The loop must be traversed many more times: it must be traversed until each
mCurrent is depleted, and until all mCurrents have been created. That is why
the new limit needed to be put in place.
Comment 33 • 24 years ago
The other confusing thing about this loop is that unlike a traditional loop
that merely iterates, this method constantly returns (with values), and then
must be called again and fight (via conditionals) to get back into the same
state (same level of loop nesting, etc.). The extra condition check assures
that when we re-visit this method, we'll be able to quickly get back to "where
we left off." In this case (the bug), we were working on the last mCurrent;
there were no more mCurrents to be created, but we had more than one element
left to take from mCurrent.
The alternate fix by Hyatt simply asserted that there was an extra mCurrent
that needed to be created. Although the code somewhat gracefully handled this
case, the performance impact of the extra mCurrent creation would need to be
examined. I was not familiar enough with the code to estimate the cost of
*always* creating an extra mCurrent, even though it ends up having no elements
and is quickly discarded. I don't know if there are lingering effects (is the
construction lazy, but leaves remnants now that construction has taken place?
What is the cost or impact of these remnants, if any?).
In the end, Waterson, Churchill and I felt more confident that the extra test
avoided impact on other calls to this method that did not have a multitude of
elements in the *last* mCurrent.
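A runnable sketch of the re-entrant shape being described here (the simplified
names and types are assumptions; this is not the actual nsContainerEnumerator):

#include <cstddef>
#include <cstdio>
#include <utility>
#include <vector>

class Enumerator {
    std::vector<int> mElements;     // stand-in for the targets being walked
    std::size_t mNextIndex = 0;     // state that survives between calls
    bool mHaveResult = false;
    int mResult = 0;
public:
    explicit Enumerator(std::vector<int> e) : mElements(std::move(e)) {}

    // Each call rebuilds its position from member state, caches what it
    // finds, and returns; no loop simply runs to completion.
    bool HasMoreElements() {
        if (mHaveResult)
            return true;                      // resume: cached on a prior call
        if (mNextIndex < mElements.size()) {  // the condition under debate
            mResult = mElements[mNextIndex++];
            mHaveResult = true;
            return true;
        }
        return false;                         // truly depleted
    }

    int GetNext() { mHaveResult = false; return mResult; }
};

int main() {
    Enumerator e({10, 20, 30});
    while (e.HasMoreElements())
        std::printf("%d\n", e.GetNext());     // prints 10, 20, 30
    return 0;
}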
Whiteboard: [rtm need info] → [rtm++]
Assignee
Comment 34 • 24 years ago
"What jar said." The bug here was *not* off-by-one indexing. The bug was that we
prematurely terminated the loop in the case that we had >1 RDF Sequence
aggregated with exactly the same number of elements.
Let's say we have two RDF containers aggregated together with exactly one
element each.
Our first call to HasMoreElements() will succeed:
- We'll compute "count" to be "2" (that is, both containers will report the next
available index to be two).
- We'll do mCurrent gets GetTargets(container, "1").
- We'll increment "mNextIndex" so that it's value is now "2".
- We'll pull the first element out of mCurrent, and cache it as mResult.
We'll then later call GetNextElement(), to pull the cached value out. Now we're
on our *second* call to HasMoreElements():
- We'll again compute the "count" to be "2".
- We'll now *fail* the loop condition, because mIndex == 2, even though mCurrent
still hadn't been depleted!
The fix is to alter the loop condition so that we *also* check mCurrent to see
if it has been depleted (we set it to null when it's been depleted at line 204).
So, hyatt's fix "worked" because it masked the bug, but was not optimal. With
hyatt's change, we took an extra trip through the loop and *avoided* the case
where mCurrent had not been depleted, but mNextIndex == count. jar's fix is optimal.
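That trace can be checked with a toy model (runnable C++; the struct and skin
names are assumptions, and HasMoreElements()/GetNextElement() are collapsed
into a single GetNext() for brevity):

#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Enumerator {
    std::map<int, std::vector<std::string>> targets;  // index -> aggregated targets
    int count;                                        // next available index
    int mNextIndex;
    std::vector<std::string> mCurrent;                // undrained targets at the current index
    bool jarsFix;                                     // new loop condition vs. the old one

    bool GetNext(std::string* result) {
        while (jarsFix ? (!mCurrent.empty() || mNextIndex < count)
                       : (mNextIndex < count)) {
            if (mCurrent.empty())
                mCurrent = targets[mNextIndex++];
            if (!mCurrent.empty()) {
                *result = mCurrent.back();
                mCurrent.pop_back();
                return true;
            }
        }
        return false;
    }
};

int main() {
    // Two aggregated containers with one element each: index 1 holds two
    // targets, and "count" (the next available index) is 2.
    for (bool jarsFix : {false, true}) {
        Enumerator e{{{1, {"install:classic", "profile:skin"}}}, 2, 1, {}, jarsFix};
        std::string s;
        std::printf("%s:", jarsFix ? "fixed" : "old  ");
        while (e.GetNext(&s))
            std::printf(" %s", s.c_str());
        std::printf("\n");  // old: one target is silently dropped; fixed: both appear
    }
    return 0;
}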
Whiteboard: [rtm++] → [rtm need info]
Comment 35 • 24 years ago
Yeah, I knew that it wasn't optimal. I was just trying to pick a minimal patch
that fixed the bug. :)
Comment 36 • 24 years ago
Verified fixed on mac/linux/win32 2000110616 MN6 builds -- installed win32 as
both modern-only and modern+classic, installed mac as modern+classic, and
linux as modern+classic (and forced it to also be modern-only). Added skins,
shut down, restarted and added another skin, etc. -- Modern and/or Classic
were always available and could be switched to at any time. Also took win32 up
to a total of 40 installed skins (hey, it works, but it's a long menu :-]).
Marking vtrunk (e.g. verified on branch).
Keywords: vtrunk
Whiteboard: [rtm need info] → [rtm++]
Updated • 6 years ago
Product: Core → Core Graveyard