Closed
Bug 25693
Opened 25 years ago
Closed 24 years ago
request for a table for duplicates
Categories: Bugzilla :: Bugzilla-General, enhancement, P3
Status: VERIFIED FIXED
Target Milestone: Bugzilla 2.12
People: Reporter: cbegle, Assigned: Chris.Yeh
Attachments: 3 files, all deleted (application/zip, application/octet-stream, patch)
It would be nice if we could query for the bugs which have the most duplicates,
to make it easy to generate the "Most Frequently Filed bugs" list.
I think a table that had the bug number and a tally of the number of bugs that
are duplicates of that bug would be fine (i.e. it would not have to manage a list
of the bugs that are duplicates of the bug)... If we want to see details of the
duplicates, that info is stored in the comments.
Comment 1•25 years ago
Doing some kind of half-assed tables to create duplicates will only cause more
problems and confusion.
I completely agree that the way duplicates are handled by Bugzilla is woefully
inadequate, and that reports like this should at least be feasible.
Status: NEW → ASSIGNED
Summary: request for a table for duplicates → request for a table for duplicates
Comment 2•25 years ago
Bug 26053, "[RFE] Ability to limit search to bugs with DUPs", if implemented,
may provide the basis for a simple, not quite perfect, but potentially
good enough solution for this problem. If a field were added to each record
that simply counted how many bugs had been made a DUP of a given bug as
they come in, it would allow not only searching for bugs with DUPs, but,
with the addition of a column for the bug list ( "#DUP"?), ordering the list
by number of duplicates.
There would be at least two inaccuracies. Sometimes bugs are marked as DUPs
incorrectly (mistaken resolution or typo), which would inflate the number
recorded over reality. Also, sometimes trees of DUPs form, with some of the
bugs marked as DUPs of the bug that will get fixed having DUPs themselves,
which would cause under-recording of the true number of DUPs in all.
But for the purposes of finding the most frequently reported bugs, these
inaccuracies may not matter greatly.
Any solution that would properly handle those cases would probably require
a complete revamp of the way DUPs are handled, and would probably be a real
pain to get to work for the existing records.
Comment 4•25 years ago
I'm actually not scared of the work to handle existing records. All I have to
do is write some code that searches for bugs that are closed with a resolution
of DUPLICATE, and find the last chunk of text in such bugs that says "*** This
bug has been marked as a duplicate of <something> ***". Given that information,
it can go populate the database properly.
I'm just not sure how to represent this all in the database. I suppose I could
just add two fields: one is the bug number that a given bug is a duplicate of,
and one is the count you propose of how many duplicates a bug has. Given those,
I can do reasonable sanity checking and stuff, and alleviate the problems of
typos and bugs that are REOPENED.
Hmm.
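A minimal sketch of the back-fill described in this comment, written in Python rather than Bugzilla's own Perl. It assumes the marker text is exactly the string quoted above followed by a numeric bug id, and that each bug's comments are available as one block of plain text. The names `last_duplicate_target` and `build_dupe_map`, and the `(bug_id, text)` input shape, are hypothetical and chosen only for illustration.

```python
import re

# Assumed marker format, copied from the comment text quoted above.
DUP_MARKER = re.compile(
    r"\*\*\* This bug has been marked as a duplicate of (\d+) \*\*\*"
)


def last_duplicate_target(comment_text):
    """Return the bug number from the last duplicate marker, or None."""
    matches = DUP_MARKER.findall(comment_text)
    return int(matches[-1]) if matches else None


def build_dupe_map(resolved_duplicates):
    """Map each bug resolved DUPLICATE to the bug it was last marked a dupe of.

    `resolved_duplicates` is a hypothetical iterable of (bug_id, comment_text)
    pairs, already filtered to bugs whose resolution is DUPLICATE.
    """
    dupe_of = {}
    for bug_id, text in resolved_duplicates:
        target = last_duplicate_target(text)
        # Basic sanity check: skip missing markers and self-references.
        if target is not None and target != bug_id:
            dupe_of[bug_id] = target
    return dupe_of
```

Taking only the last marker means a bug that was reopened and later closed against a different bug ends up pointing at the most recent target.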
Comment 5•25 years ago
You are right that it would be easier to start from the "bug that is a DUP"
end than the "bug that has DUPs" end, and doing so would provide more
potential benefits. FWIW, bug 26053 started out as a proposal for a
quick-n-dirty way of identifying the bugs that have DUPs, but I'd rather
see a solution that addresses more problems than that.
How much additional work would it be to flatten those trees of DUPs for
the purposes of getting a better count? Could this count be done without
actually changing which bug a leaf bug is a DUP of from a branch bug to the
root bug? Or should those bugs be changed to match the bug that the branch
bug is made a DUP as that is done? And should this be a new bug report?
Comment 6•25 years ago
I think it would be possible to maintain the count as being the size of the
total tree, not just the number of direct duplicates.
I don't see anything worthy of another bug report...
Comment 7•25 years ago
To get the other half you'd need to have a "at most x bugs" capability (no bug
report that I'm aware of), as well as the ability to specify the columns in the
URL (bug #12284).
Instead, if you want to order this within product categories, then you're
looking at general summary reports (bug #12282).
Comment 8•25 years ago
tara@tequilarista.org is the new owner of Bugzilla and Bonsai. (For details,
see my posting in netscape.public.mozilla.webtools,
news://news.mozilla.org/38F5D90D.F40E8C1A%40geocast.com .)
Assignee: terry → tara
Status: ASSIGNED → NEW
Comment 10•25 years ago
Bug 38850 is one possible solution to the problem discussed here. It recommends
simply storing the bug number a bug is a dupe of, as a field.
Then, all the intelligence could be put in the script which walks the database
to generate the most-frequent-bugs list.
In outline, it would search the entire thing, keeping a running total:
Bug 34567 3
Bug 34590 6
...
Then, it would go through every bug on that list and, if it itself were a dupe,
move the number, i.e. if 34590 were marked a dupe of 34567, the above would
become:
Bug 34567 9
...
At the end of this procedure, you'd have the info you need - which bugs were
most duped, and how many times, even indirectly. And it hardly requires any
Bugzilla changes AFAICS.
Gerv
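A minimal sketch of the roll-up outlined above, assuming the only stored data is a mapping from each duplicate to the bug it is a duplicate of. Instead of tallying first and moving numbers afterwards, it credits each duplicate directly to the bug at the end of its chain, which should give the same totals. The function name and the guard against loops are additions for illustration and are not taken from the patch.

```python
from collections import Counter


def rolled_up_counts(dupe_of):
    """Count duplicates per final bug, including indirect duplicates.

    `dupe_of` maps a duplicate bug id to the bug id it is a duplicate of.
    """
    def root(bug):
        # Follow the chain until reaching a bug that is not itself a dupe.
        # `seen` stops the walk if the markings form a loop.
        seen = set()
        while bug in dupe_of and bug not in seen:
            seen.add(bug)
            bug = dupe_of[bug]
        return bug

    totals = Counter()
    for dupe in dupe_of:
        totals[root(dupe)] += 1
    return totals
```

With the numbers from the example (three direct dupes of 34567, one of which is 34590 with six dupes of its own), `rolled_up_counts(...).most_common()` lists bug 34567 with a total of 9.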
Comment 11•25 years ago
Tara: What are the chances of this happening any time soon? If none, I'll go
away and do most-frequent-bugs another way...
Gerv
Comment 12•24 years ago
*** Bug 38850 has been marked as a duplicate of this bug. ***
Comment 13•24 years ago
I'm about to attach a zip of a duplicates.cgi (the most-frequent-bugs list) and
a diff of changes to a reasonably recent CVS version which implements
much-improved duplicate handling in Bugzilla.
It would be very cool if this patch could be tested on Landfill or evaluated by
other people, for as-soon-as-possible addition to bugzilla.mozilla.org (as it
would make my life, as maintainer of most-frequent-bugs at the moment, a lot
easier) :-)
Gerv
Keywords: patch
Comment 14•24 years ago
Comment 15•24 years ago
Chris--can you get this on landfill, and at some point could you give me access
to landfill?
:)
Assignee: tara → cyeh
Comment 16•24 years ago
*** This bug has been marked as a duplicate of 38850 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Comment 18•24 years ago
Since when do you get to verify your own change? I thought the point of
verifying was to get a second opinion. And I disagree with this being marked a
duplicate of bug 38850. And according to your comments on 38850, so do you, so
I'm confused as to why you did this...
Comment 19•24 years ago
*** Bug 38850 has been marked as a duplicate of this bug. ***
Comment 20•24 years ago
Based on this statement from bug 38850:
>I believe the duplicate should be marked the other way round. Bug 25693 has
>more information than this bug. I'll try to mark 25693 as the initial bug and
>this one (38850) as the duplicate
And the fact that I agree with the reasoning behind this statement, I am changing
it back to match, and making this the original, and 38850 the duplicate.
Reopening.
Status: VERIFIED → REOPENED
Resolution: DUPLICATE → ---
Comment 21•24 years ago
If anyone disagrees, please discuss before switching it back. As someone who
stands a good chance of helping with development in making this change, I want
this bug open because this is the one that has the most information in it. If
you have a valid reason to close this in favor of a different duplicate, speak
up, and I might be convinced, but otherwise please leave this one open.
Thanks, and apologies to everyone else reading this for the spam.
Status: REOPENED → ASSIGNED
Comment 22•24 years ago
This is all very well, but is there any chance anyone is going to get around to
evaluating and checking in my PATCH for this problem? Then we can close both
bugs ;-)
Gerv
Comment 23•24 years ago
Yeah, I left that out, this bug has a patch on it, too, which is a definite plus
for keeping this one open. :)
I believe the hold up at the moment is waiting for the configuration changes on
landfill so some others of us besides Chris can apply patches to landfill.
Otherwise we're waiting on Chris for anything that needs trial time on landfill.
(Chris?)
Comment 24•24 years ago (Assignee)
landfill has been both moved and reconfigured to allow for outside people
access. working with Gerv on getting the patch in. It's a large enough change
that we'll want to have people hammer it and give it some bake time.
Comment 25•24 years ago (Assignee)
trying to install your patch onto landfill, and i ran into two things:
1) $regenerateshadow = 1 should be $::regenerateshadow = 1;
2) collectstats.pl makes calls to dbmopen and dbmclose, which barfs the script
with the following error:
>Can't locate NDBM_File.pm in @INC (@INC contains:
/usr/local/lib/perl5/5.6.0/i686-linux /usr/local/lib/perl5/5.6.0
/usr/local/lib/perl5/site_perl/5.6.0/i686-linux
/usr/local/lib/perl5/site_perl/5.6.0 /usr/local/lib/perl5/site_perl .) at (eval
7) line 3.
Is this a perl module that I need to get from CPAN, or is my perl installation
screwed up?
Comment 26•24 years ago
yes.
Comment 27•24 years ago (Assignee)
fixed it. a use DB_File; line was missing from the patch
Comment 28•24 years ago (Assignee)
okay, i got this working. I also had to add a use DB_File; to duplicates.cgi,
and to fix up a typo in collectstats.pl where you wrote the file without a .db
extension, yet were expecting to read in a file that had it. i added .db to the
output file and it works.
you did test this before you gave me the patch, right? :)
it's up and running on landfill. please pound it and see if it behaves the way
you expect it to. if you generate a lot of duplicates i'll have to re-run
collectstats.pl by hand to get those to show up in a quick fashion.
Comment 29•24 years ago
> you did test this before you gave me the patch, right? :)
Er, yeah - but quite a while ago now...
I'll pound on it on landfill. Did the import work correctly?
Gerv
Comment 30•24 years ago (Assignee)
import? what am i importing where?
Comment 31•24 years ago
When you update (run checksetup.pl) it should have automatically picked up any
dupes in the database. But there weren't any, I now discover ;-) So that bit
will need testing in some way (I did test it, but only on a small scale.)
Gerv
Comment 32•24 years ago
How's it importing the existing duplicates? If it's going by the "this bug has
been marked a duplicate of xxxxx" in the text, you might want to ask for manual
intervention if loops are detected (or are you already doing that? I haven't
looked it over real good yet). This bug right here is a real good example of
one that'll trip it up, since it got closed as a duplicate of another bug that
was already a dupe of this one, and then reversed again.
Comment 33•24 years ago
It detects the text on bugs currently resolved DUPLICATE, and adds the two bugs
in a relationship in a hash - this means that only the last dupe marking counts,
as earlier ones get overwritten. It doesn't need to do loop detection - as
there's no "right answer" in that case, it just does "something". I remember
checking the code against loops; I can't remember how it decides what to do, but
it turns out it doesn't need to do anything special.
Gerv
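A rough Python sketch of the two behaviours discussed in the last two comments: storing markings in a hash so that only the last one per bug survives, and (as an optional extra that the patch reportedly does not need) listing bugs that sit on a duplicate loop so they could be reviewed by hand. Both function names are hypothetical, and the input is assumed to be (dupe, dupe_of) pairs in chronological order.

```python
def record_markings(markings):
    """Build the dupe -> dupe_of hash; later markings overwrite earlier ones."""
    dupe_of = {}
    for dupe, target in markings:
        dupe_of[dupe] = target
    return dupe_of


def bugs_in_loops(dupe_of):
    """Return the set of bug ids that lie on a cycle of duplicate markings."""
    looped = set()
    for start in dupe_of:
        path = []
        seen = set()
        bug = start
        while bug in dupe_of and bug not in seen:
            seen.add(bug)
            path.append(bug)
            bug = dupe_of[bug]
        if bug in seen:
            # The walk returned to a bug already on the path: that bug and
            # everything after it on the path form the cycle.
            looped.update(path[path.index(bug):])
    return looped
```

If both 25693 and 38850 were resolved DUPLICATE of each other at import time, `bugs_in_loops` would report both ids; with only one of them currently resolved DUPLICATE, the hash holds a single entry and no loop exists.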
Comment 34•24 years ago
*** Bug 38857 has been marked as a duplicate of this bug. ***
Comment 35•24 years ago
adding myself and endico to this bug.
Comment 36•24 years ago
Comment 37•24 years ago
Checked in Gervase's patch.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 38•24 years ago
In search of accurate queries.... (sorry for the spam)
Target Milestone: --- → Bugzilla 2.12
Comment 39•24 years ago
reopening... the duplicate scan failed miserably on bugzilla.mozilla.org's
database when Dawn tried to do a test update on a copy of the database. The
regexp that tests for the duplicate string in the bug text is apparently not
strict enough, and is catching invalid dupes. Bug 26913 is an example of a bug
that tripped it up.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 40•24 years ago
DBD::mysql::db do failed: You have an error in your SQL syntax near 't feel like
generating more spam just to be picky, making VERIFIED.
', '26913')' at line 1 at ./checksetup.pl line 2048.
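The error above suggests that free text containing an apostrophe was captured where a bug number was expected and then pasted into the SQL string. A hedged sketch of two defences follows, in Python with `sqlite3` standing in for the MySQL database: a pattern that only accepts digits on a line of its own, and placeholders instead of string interpolation. The regular expression is an assumption about what "strict enough" could look like, not the pattern from the actual patch; the `duplicates` table and its `dupe_of` and `dupe` columns are the ones shown later in this thread.

```python
import re
import sqlite3

# Anchored to a whole line and limited to digits, so quoted or free-form text
# after "duplicate of" cannot be captured as a bug number.
STRICT_MARKER = re.compile(
    r"^\*\*\* This bug has been marked as a duplicate of (\d+) \*\*\*$",
    re.MULTILINE,
)


def record_duplicate(conn, bug_id, comment_text):
    """Insert a duplicates row for bug_id if a valid marker is found."""
    matches = STRICT_MARKER.findall(comment_text)
    if not matches:
        return False
    # Placeholders keep any stray quote characters out of the SQL text.
    conn.execute(
        "INSERT INTO duplicates (dupe_of, dupe) VALUES (?, ?)",
        (int(matches[-1]), bug_id),
    )
    return True
```

A marker-like line whose target is not a number is simply skipped, so it can neither create a bogus row nor break the statement.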
Comment 41•24 years ago
Comment 42•24 years ago
The attached patch worked on my database, and also worked correctly on b.m.o's
data.
Comment 43•24 years ago
at least, as correctly as it could. The highest numbered bug in my test db
(a backup from this morning) is 70984. I searched for dups and dups of, and
found quite a lot.
mysql> select * from duplicates where dupe_of > 70984;
+---------+-------+
| dupe_of | dupe |
+---------+-------+
| 71024 | 71040 |
| 71042 | 71043 |
| 71024 | 71061 |
| 71024 | 71072 |
| 71082 | 71083 | all of these except the last entry have both the
| 71024 | 71084 | dup and dup_of field bogus. i can understand one field
| 71096 | 71104 | being bad, but both?
| 71109 | 71111 |
| 71024 | 71120 |
| 91026 | 21027 | dave fixed this bug
+---------+-------+
10 rows in set (0.01 sec)
mysql> select * from duplicates where dupe > 70984;
+---------+-------+
| dupe_of | dupe |
+---------+-------+
| 64100 | 70985 |
| 68336 | 70986 |
| 70773 | 70997 |
| 50758 | 71001 |
| 70756 | 71007 |
| 67574 | 71017 |
| 70773 | 71039 |
| 71024 | 71040 |
| 71042 | 71043 |
| 71024 | 71061 |
| 71024 | 71072 |
| 49141 | 71073 |
| 71082 | 71083 |
| 71024 | 71084 |
| 70361 | 71092 |
| 70057 | 71093 |
| 60151 | 71095 |
| 70924 | 71103 |
| 71096 | 71104 |
| 59655 | 71109 |
| 71109 | 71111 |
| 71024 | 71120 |
| 43847 | 71149 |
+---------+-------+
23 rows in set (0.03 sec)
Comment 44•24 years ago
those are all accurate.
select max(bug_id) from bugs
instead of
select count(*) from bugs :)
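A small illustration of why the two queries can disagree, using an in-memory SQLite table as a stand-in for the `bugs` table. The ids are made up and deliberately non-contiguous; whether and why the real database has gaps in its numbering is not stated in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bugs (bug_id INTEGER PRIMARY KEY)")
# Hypothetical ids with gaps in the numbering.
conn.executemany("INSERT INTO bugs (bug_id) VALUES (?)", [(1,), (2,), (5,), (9,)])

# COUNT(*) is the number of rows; MAX(bug_id) is the highest id in use.
row_count = conn.execute("SELECT COUNT(*) FROM bugs").fetchone()[0]
highest_id = conn.execute("SELECT MAX(bug_id) FROM bugs").fetchone()[0]
print(row_count, highest_id)
```

Whenever ids are skipped, the row count is lower than the highest bug number, so using it as an upper bound makes valid high-numbered bugs look bogus.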
Comment 45•24 years ago
oops, never mind. I computed the max bug number incorrectly. There was only
one incorrect value and Dave just fixed the bug. Someone had marked a bug
a dup of an invalid bug number.
Comment 46•24 years ago
Since this worked on b.m.o and on mine, I'm thinking to just go ahead and check
it in, but since Gerv knows this code better, I'll wait for his say-so.
Comment 47•24 years ago
Oh, come on, how was I supposed to know people would do pathologically evil
things like bug 26913? ;-)
Looks good to me. Sorry I haven't got to this earlier; I've been, er, asleep :-)
Gerv
Comment 48•24 years ago
OK, it's checked in.
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 49•24 years ago
When printing bug summaries, the summary text should first quote html
meta-characters.
The summary of bug 39992 begins with <SELECT> and a little dropdown
box is displayed instead of the summary.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
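A minimal sketch of the escaping asked for in this comment, using Python's standard `html.escape` rather than whatever routine Bugzilla itself uses. The `summary_cell` helper and the `<td>` wrapper are hypothetical and only show where the escaping would happen.

```python
import html


def summary_cell(summary):
    """Render a bug summary as a table cell with HTML metacharacters quoted."""
    # html.escape replaces &, < and > (and quotes) with character entities,
    # so markup inside a summary is displayed as text instead of rendered.
    return "<td>" + html.escape(summary) + "</td>"
```

A summary that begins with `<SELECT>` would then show up literally in the list instead of producing a dropdown box.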
Comment 50•24 years ago
Jake's patch from bug 22041 fixed this bug too. marking fixed.
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 51•23 years ago
Verified fixed. We have a table for duplicates now :-)
Status: RESOLVED → VERIFIED
Comment 52•23 years ago
Moving closed bugs to Bugzilla product
Component: Bugzilla → Bugzilla-General
Product: Webtools → Bugzilla
QA Contact: matty
Version: other → unspecified
Updated•12 years ago
QA Contact: matty_is_a_geek → default-qa