Closed Bug 67036 Opened 24 years ago Closed 14 years ago

Can't query for product, component, version names that have commas

Categories

(Bugzilla :: Query/Bug List, defect, P2)

2.13

Tracking

()

RESOLVED FIXED
Bugzilla 4.0

People

(Reporter: jesse, Assigned: mkanat)

References

Details

(Whiteboard: [wanted-bmo])

Attachments

(2 files, 16 obsolete files)

(deleted), patch
mkanat
: review+
Details | Diff | Splinter Review
(deleted), patch
LpSolit
: review+
Details | Diff | Splinter Review
Component names which contain commas seem to prevent queries by component from obtaining any results.
So I guess component names shouldn't be allowed to have commas?
Target Milestone: --- → Bugzilla 2.16
Priority: -- → P2
Mass moving to new product Bugzilla...
Assignee: tara → endico
Component: Bugzilla → Query/Bug List
Product: Webtools → Bugzilla
Version: other → 2.13
We are currently trying to wrap up Bugzilla 2.16. We are now close enough to release time that anything that wasn't already ranked at P1 isn't going to make the cut. Thus this is being retargetted at 2.18. If you strongly disagree with this retargetting, please comment, however, be aware that we only have about 2 weeks left to review and test anything at this point, and we intend to devote this time to the remaining bugs that were designated as release blockers.
Target Milestone: Bugzilla 2.16 → Bugzilla 2.18
Sounded like a security issue at first but isn't. This seems to be because buglist.cgi uses the advanced query code behind the scenes which probably separates on space and comma. I suggest prohibiting commas is the appropriate solution ... users of advanced query will want to use these as separators explicitly.
This is similar to bug 179309.
This also affect products and versions....
Summary: comma in component name prevents query by component → Can't query for product, component, version names that have commas
*** Bug 205018 has been marked as a duplicate of this bug. ***
Assignee: endico → nobody
Attached is a proposed patch against CVS HEAD. It escapes commas to \, and then modifies the anyexact sub to use quotewords() from Text::ParseWords (part of the standard Perl distribution) instead of split().
Comment on attachment 142271 [details] [diff] [review] Patch to use Text::ParseWords instead of split() Setting review? so I remember to come back and look at it when I have time to play. Question about this: is Text::ParseWords part of a core Perl 5.6.0 distribution? If not, we need a version check for it added to checksetup.pl as well.
Attachment #142271 - Flags: review?(justdave)
I'm not sure how far back we support, but I just downloaded a copy of 5.005_03 and Text::ParseWords is in there. I found some usenet posts indicating it was present at least as far back as 5.004.
These bugs appear to be abandoned. Retargeting to 2.20
Target Milestone: Bugzilla 2.18 → Bugzilla 2.20
Not abandoned; waiting for patch review. I don't have permission to retarget this bug to 2.18, but I'd really like to see a fix for this issue incorporated into the next stable release.
(In reply to comment #10) > I'm not sure how far back we support, but I just downloaded a copy of 5.005_03 > and Text::ParseWords is in there. OK, that's far enough for me. :)
Don't you need to DBI escape this, instead?
It does not seem to be the database getting confused. Rather, the code is concatenating a bunch of search criteria joined with commas and then later using a split(',',$whatever) to seperate them.
Dave: is there a reason not to split on /(?<!\\),/ instead of having to use Text::ParseWords?
If I'm not mistaken, quotewords() also ignores delimiters that are inside quotes.... which means if you wanted to search for something that did contain commas, you could put quotes around it, and the quotes would get removed by quotewords() and the comma within it would be part of that term isntead of getting split. This is kind of a cool concept. That would automatically allow for you to search things with commas in them that didn't match something in a @legal_xxxx array, too. :)
(In reply to comment #17) > commas, you could put quotes around it, and the quotes would get removed by > quotewords() and the comma within it would be part of that term isntead of This is indeed pretty cool. Is there a parser in there? I don't mind, but wouldn't it perhaps be easier then to bracket the parameters with quotes instead of escaping the commas?
OS: Linux → All
Hardware: PC → All
*** Bug 205861 has been marked as a duplicate of this bug. ***
Blocks: 179309
Comment on attachment 142271 [details] [diff] [review] Patch to use Text::ParseWords instead of split() Looks like nobody ever got around to looking at this. Marc, did you want to take a crack at it?
Attachment #142271 - Flags: review?(justdave) → review?(wurblzap)
Assignee: nobody → dave.pifke
Comment on attachment 142271 [details] [diff] [review] Patch to use Text::ParseWords instead of split() bitrotten.
Attachment #142271 - Flags: review?(wurblzap) → review-
Attachment #142271 - Attachment is obsolete: true
Attachment #185708 - Flags: review?
Comment on attachment 185708 [details] [diff] [review] Patch to use Text::ParseWords instead of split() The patch deals only with the anyexact case. It ignores for example ",anywordssubstr" or the ",allwordssubstr" cases, which are right below it. Their "split" is hidden at the end of the Search.pm file, in subs like: # Support for "any/all/nowordssubstr" comparison type ("words as substrings") sub GetByWordListSubstr { or sub GetByWordList { They have in them an implicit split which is based on commas as well (and spaces). They would need to be updated as well if we're taking this approach.
Attachment #185708 - Flags: review? → review-
*** Bug 300348 has been marked as a duplicate of this bug. ***
I was going to submit a new bug but I found this one already filed. Is there any way that this bug can get allocated for repair? I only ask since this was reported a mere four and a half years ago. The work we had to do to identify the bug in the first place came to a total of about 3 man-hours of principal-level engineers. I understand that that's how the ball bounces but I just hate to find that it's the result of something that's been known for so long.
*** Bug 324265 has been marked as a duplicate of this bug. ***
I think the right approach for this is as follows.... The @funcdefs list should be redefined replacing the "string1,string2,string3" => sub {} form with..... { key =>("string1", "string2", "string3"), ref => sub {} } %funcsbykey is used in 2 ways.... in all but one place it is used to directly look up the member of @funcdefs where string1 is null and we are looking only for we are looking only for a member of @funcdefs where string2 is exactly what we are looking for. In the remaining place, we are sequentially checking @funcdefs for the first entry where each string matches the corresponding portion of the chart row as follows... foreach my $key (@funcnames) { if ("$f,$t,$rhs" =~ m/$key/) { my $ref = $funcsbykey{$key}; If we replace this with a loop like... foreach my ($key, $ref) (@funcnames) { if (($f eq '' || $f =~ m/$key[0]) && ($t eq '' || $t =~ m/$key[1]) && ($rhs eq '' || $rhs =~ m/$key[2])) { etc... Someone will have to do all of this and make sure the data structures are all correct, but it should work well.
OK... more like this... @funcdefs = ( { key => ['^(?:assigned_to|reporter|qa_contact)$', '^(?:notequals|equals|anyexact)$', '^%group\\.(\\w+)%$'], ref => sub { the code....; } }, { key => ['^(?:assigned_to|reporter|qa_contact)$', '^(?:notequals|equals|anyexact)$', '.*'], ref => sub { the code....; } }, etc... Which begs the question... shoudl the anchors go inside every regexp or should they go outside and require the use of wildcards where anchored strings are not wanted. I am leaning towards the latter
Ugh!! The last thing I want to do is to change all these around programatically. This filter does about 90% of it, but the spurious "}," at the beginning of funcdefs needs to be moved to the end. #!/usr/bin/perl while ($l = <>) { $l =~ s| "\^(.*),(.*)(?:,(.*))?" => sub |}, { key => ['$1', '$2', '$3'], ref => sub |; $l =~ s| ",(.*)(?:,(.*))?" => sub |}, { key => ['-----', '$1', '$2'], ref => sub |; print $l; }
Assignee: dave.pifke → bugreport
Target Milestone: Bugzilla 2.20 → Bugzilla 2.24
This still needs a massive whitespace fix, but it is worth testing.
Attachment #185708 - Attachment is obsolete: true
Attached patch The full patch (obsolete) (deleted) — Splinter Review
Attachment #211520 - Attachment is obsolete: true
Attachment #211533 - Flags: review?(LpSolit)
Since there is a lot of whitespace only change here, this is a patch with that removed
Comment on attachment 211533 [details] [diff] [review] The full patch There is still an additional issue.... We need to make the search forms use a list of quoted strings for multi-selects and permit the anyexact searches to use EITHER a list of comma-delimited strings or, if the list starts witha quotation mark, a comma-seperated list of quoted strings.
Attachment #211533 - Flags: review?(LpSolit)
Attached patch v3 - use quotewords also (obsolete) (deleted) — Splinter Review
Attachment #211533 - Attachment is obsolete: true
Attachment #211535 - Attachment is obsolete: true
Attachment #211546 - Flags: review?(LpSolit)
Comment on attachment 211546 [details] [diff] [review] v3 - use quotewords also mkanat said Search.pm wasn't that hard to read. So I let him reviewing this one. ;)
Attachment #211546 - Flags: review?(LpSolit) → review?(mkanat)
QA Contact: mattyt-bugzilla → default-qa
Comment on attachment 211546 [details] [diff] [review] v3 - use quotewords also Oh, passing the buck. :-) justdave's the owner of Search.pm...
Attachment #211546 - Flags: review?(mkanat) → review?(justdave)
This bug is retargetted to Bugzilla 3.2 for one of the following reasons: - it has no assignee (except the default one) - we don't expect someone to fix it in the next two weeks (i.e. before we freeze the trunk to prepare Bugzilla 3.0 RC1) - it's not a blocker If you are working on this bug and you think you will be able to submit a patch in the next two weeks, retarget this bug to 3.0. If this bug is something you would like to see implemented in 3.0 but you are not a developer or you don't think you will be able to fix this bug yourself in the next two weeks, please *do not* retarget this bug. If you think this bug should absolutely be fixed before we release 3.0, either ask on IRC or use the "blocking3.0 flag".
Target Milestone: Bugzilla 3.0 → Bugzilla 3.2
On bmo, in the "Mozilla Localizations" product, the components "es-AR / Spanish, Argentina", "ja-JP, ja-JPM / Japanese", "en-GB / English, United Kingdom" and "en-ZA / English, South Africa" are hit by this bug. I have now idea how those working on those localizations can stand the situation. I don't know why Joel gave up pushing this bug, but it seems all the patch requires is a review by justdave ? (if it didn't bit-rot already)
Comment on attachment 211546 [details] [diff] [review] v3 - use quotewords also (In reply to comment #39) > I don't know why Joel gave up pushing this bug, but it seems all the patch > requires is a review by justdave ? (if it didn't bit-rot already) Probably because the review request was made at a time when I was very buried with work stuff, and was lost in the pile by the time I ever had time to look at it. Yes, this is definitely very bitrotted :( Concept behind it looks fairly sound though, and now's the perfect time to play with such a major overhaul to the basic search infrastructure (3.1 dev period)
Attachment #211546 - Flags: review?(justdave) → review-
as before, there were a lot of whitespace-only changes. To see only the actual changes this patch was making: cvs -q up -dPC -D"18 Dec 2005 19:24:32 -0000" patch -p0 < v3patchfile cvs -q diff -uw
You know, also, I'd be really happy if somebody actually *named* those subs and just put them together at the bottom of Search.pm in their own section. I think that would make Search.pm a lot easier to read. (That is, you'd have key => [/some regex/], sub => \&_some_sub_name.
Status: NEW → ASSIGNED
Attached patch v4 (obsolete) (deleted) — Splinter Review
Updated the previous patch for bitrot, and added tightly related fixes mentioned in in bugs 179309 and 340884. I tried naming all the anonymous funcs and moving them outside @funcdefs, but didn't find a way to do this that I felt actually improved readability. I noticed that v3 did not require types to match at the start of the string, so "notequals" would match "equals". Matching at the start should be appropriate for any patterns that were written in the comma style. Quotewords does not fail very gracefully when given a string ending in a backslash that is not escaped properly, and it is possible for this to be incorrectly reported as a "field_type_mismatch" error.
Attachment #280682 - Flags: review?(justdave)
Attachment #280682 - Flags: review?(justdave) → review?(bugreport)
Depends on: 399371
I just experienced this bug on 2.22. It needs to have it's severity raised because it causes bugzilla to give incorrect results. This *silently* causes Zarro Bugs Found to be returned makeing the searcher feel that the bugs in question are zero. There is nothing that they can do about it either. The admin can go change the version numbers with commas but the damage is already done.
Severity: minor → major
Attached patch v5 (obsolete) (deleted) — Splinter Review
This patch uses quotewords to parse strings with commas and adapts selectable params such as Status to the quotewords format. It also creates a direct hash of search functions by type to avoid the confusion caused by concatenation of commas onto search types, and clears up a good deal of other comma-related cruft. Compound search functions are not placed in the constant hash because they may need runtime information such as the list of multiselect fields.
Attachment #280682 - Attachment is obsolete: true
Attachment #299973 - Flags: review?(mkanat)
Attachment #280682 - Flags: review?(bugreport)
Attached patch v6 (obsolete) (deleted) — Splinter Review
there was a small typo in the previous version
Attachment #299973 - Attachment is obsolete: true
Attachment #299977 - Flags: review?(mkanat)
Attachment #299973 - Flags: review?(mkanat)
Attached patch v7 (obsolete) (deleted) — Splinter Review
Added more user error text per bug 340892.
Attachment #299977 - Attachment is obsolete: true
Attachment #299982 - Flags: review?(mkanat)
Attachment #299977 - Flags: review?(mkanat)
Comment on attachment 299982 [details] [diff] [review] v7 > if ($key =~ /^[^,]*$/) { >- die "All defs in %funcs must have a comma in their name: $key"; >+ die "All function patterns must contain a comma: $key"; >+ } >+ if ($key =~ /^[^\^]/) { >+ die "All function patterns must begin with a carat: $key"; > } > if (exists $funcsbykey{$key}) { >- die "Duplicate key in %funcs: $key"; >+ die "Duplicate function pattern: $key"; > } What's that? We should never call die() directly. We should always use Throw*Error() instead, especially because Throw*Error() will correctly rollback pending SQL changes.
Attached patch v8 (obsolete) (deleted) — Splinter Review
replace "die" calls with "ThrowCodeError"
Attachment #299982 - Attachment is obsolete: true
Attachment #299982 - Flags: review?(mkanat)
Attachment #300100 - Flags: review?(mkanat)
Comment on attachment 300100 [details] [diff] [review] v8 >+ ThrowCodeError("illegal_search_pattern", >+ { pat => $key, msg => "does not contain a comma." }); >+ } >+ if ($key =~ /^[^\^]/) { >+ ThrowCodeError("illegal_search_pattern", >+ { pat => $key, msg => "does not begin with a carat." }); > } > if (exists $funcsbykey{$key}) { >+ ThrowCodeError("illegal_search_pattern", >+ { pat => $key, msg => "is not unique." }); > } Do not hardcode strings in Search.pm. It's impossible to translate them. Define msg => 'no_comma', 'no_character', and 'not_unique' (or names like that) and in code-error: bla bla ... [% IF msg = 'no_comma' %] does not contain a comma. [% ELSIF msg = 'no_character' %] does not begin with .... [% ELSIF msg = 'not_unique' %] is not unique. [% END %]
Attachment #300100 - Flags: review?(mkanat) → review-
Attached patch v9 (obsolete) (deleted) — Splinter Review
Moved all ThrowCodeError text to the template file.
Attachment #300100 - Attachment is obsolete: true
Attachment #300114 - Flags: review?(LpSolit)
Attachment #300114 - Flags: review?(LpSolit) → review?(justdave)
Assignee: bugreport → jjclark1982
Status: ASSIGNED → NEW
Attachment #300114 - Flags: review?(justdave) → review?(mkanat)
Flags: blocking3.2?
Not a blocker--this isn't a regression. It could still make it into 3.2, though--it's a bug, not an enhancement.
Flags: blocking3.2? → blocking3.2-
Actually, this might be a good time to move ALL search function patterns into a constant, not just the directly indexed ones. However it would require special attention for runtime-only patterns like multiselects.
Comment on attachment 300114 [details] [diff] [review] v9 Okay, why not just remove the whole idea of using commas in the regexes? Couldn't you do two regex matches, one for the type and another for the field, if that particular field has an override for that type? Also, it looks like there's a lot of cleanup in this patch. Could we move that cleanup to a different patch, since this one is already large enough without me having to figure out what's cleanup and what's not?
Attachment #211546 - Attachment is obsolete: true
Comment on attachment 300114 [details] [diff] [review] v9 Jesse and I talked this over on IRC a while back, and came up with some alternate methods that would be better architecturally.
Attachment #300114 - Flags: review?(mkanat) → review-
Status: NEW → ASSIGNED
Target Milestone: Bugzilla 3.2 → Bugzilla 4.0
I'd be interested in an ETA for this. I just happened to hit this, and we've been using ',' in the component names for l10n quite extensively, sadly. See https://bugzilla.mozilla.org/describecomponents.cgi?product=Mozilla%20Localizations.
Whiteboard: [wanted-bmo]
The Bugzilla component reorg (http://www.gerv.net/temp/bmo-reorg.html) eliminates commas from l10n for exactly this reason. Gerv
Severity: major → normal
This is definitely (at least) major severity considering we (bmo) had to explicitly work around this issue by renaming components that had commas in them. Not sure how you can say otherwise.
Severity: normal → major
I'm with Reed. This is something I need to tell every new product admin explicitly. And it's being forgotten just as quickly, because a short list of things is handy -- so I need to re-tell every time somebody puts up a new component with commas (or renames an existing one to have commas).
Confirmed in Bugzilla 3.2.2
Jesse, any progress?
Attached patch v10 (obsolete) (deleted) — Splinter Review
Here is a minimal patch to address the comma issue on the head. Select fields work automatically, and commas can be escaped in the boolean charts using a common format. I take extra care to make this format appear verbatim on the search results page. Older branches will need to handle some fields a little differently, such as keywords and content.
Attachment #300114 - Attachment is obsolete: true
Attachment #438023 - Flags: review?(LpSolit)
(In reply to comment #67) > Older branches will need to handle some fields a little differently, such as > keywords and content. No need to worry about older branches. We will only fix it on HEAD (and 3.6, if the patch applies cleanly).
Target Milestone: Bugzilla 4.0 → Bugzilla 3.8
Attachment #438023 - Flags: review?(mkanat)
Attachment #438023 - Flags: review?(LpSolit)
Attachment #438023 - Flags: review+
Comment on attachment 438023 [details] [diff] [review] v10 >=== modified file 'Bugzilla/Search.pm' >+ my @words = &quotewords('[,\s]+', 0, $strs) >+ or ThrowUserError("misquoted_words", {str => $strs}); I was unable to trigger this error. Is this doable by the user, or only if something went internally wrong? In the last case, it should be a code error rather than a user error. >+ ThrowUserError("misquoted_words", {str => $strs}) unless scalar @list; Why not returning an empty array instead of throwing an error? Otherwise looks good and works fine. Boolean charts were already working fine, AFAICS, and now select fields are working correctly too. r=LpSolit I will let mkanat have a look at the patch too, in case I missed something.
(In reply to comment #69) > I was unable to trigger this error. Is this doable by the user, or only if > something went internally wrong? In the last case, it should be a code error > rather than a user error. If you enter an unparseable string into a boolean chart, you will get this error. For example: field: product type: anyexact value: "Product One", "Products Two, Three"\ will fail to parse because of the trailing backslash. Since this is user-entered data, an error message is appropriate. > Why not returning an empty array instead of throwing an error? Good idea, if something goes wrong with a select field, it was probably a code error or an HTTP injection, so a user error is not helpful.
Attached patch v11 (obsolete) (deleted) — Splinter Review
Updates per Comment 70. Also removed some whitespace-only changes, to make review easier.
Attachment #438023 - Attachment is obsolete: true
Attachment #438173 - Flags: review?(mkanat)
Attachment #438023 - Flags: review?(mkanat)
Just a comment here so that everybody knows--Jesse and I talked about this on IM, and we decided that an architecturally-better solution for HEAD would be to use arrayrefs instead of comma-separated strings, thus generally eliminating this problem. For the branch (if we decide to take this for the branch), the quotewords solution is more likely to be taken, though.
Comment on attachment 438173 [details] [diff] [review] v11 I'm going to cancel review on this until we have a patch for HEAD.
Attachment #438173 - Flags: review?(mkanat)
Jesse, could we get some traction on this bug, please? :)
(In reply to comment #74) > Jesse, could we get some traction on this bug, please? :) FWIW, this bug will be solved auto-magically by some refactoring I am planning to do on 4.2.
So why not take his patch v11 for 4.0 only?
(In reply to comment #76) > So why not take his patch v11 for 4.0 only? I'd consider that, sure. Once I have the 4.2 patch ready, I'll also review the 4.0 patch.
Attachment #438173 - Flags: review?(mkanat)
I'm going to work on this now for 4.2, and then once that is done, I'll review the 4.0 patch.
Assignee: jjclark1982 → mkanat
Attached patch Work In Progress (trunk) (obsolete) (deleted) — Splinter Review
Here's a work-in-progress patch for trunk. However, currently this patch has a bunch of extra stuff that's unrelated to this bug in it (other improvements to Search.pm). I'm going to strip out the unrelated stuff, fix any remaining bugs, and then post the finalized patch, later.
Attached patch Trunk Refactor, v1 (deleted) — Splinter Review
All right, this fixes it for trunk by passing an array of values to the _anyexact operator instead of passing a string. (Strings are still allowed, however, for use in the boolean charts.) Standard parameters (not boolean chart parameters) from the URL are always passed as arrays, so we preserve the functionality even if there is just a single value passed.
Attachment #457802 - Attachment is obsolete: true
Attachment #457989 - Flags: review+
This checkin fixes this bug on trunk, and now I'm going to try the 4.0 patch out. Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/trunk/ modified Bugzilla/Search.pm Committed revision 7380.
Comment on attachment 438173 [details] [diff] [review] v11 Interestingly, this patch still applies just fine on 4.0. The new xt/search.t test says that this breaks anyexact searches for dates, like creation_ts and delta_ts, and for the longdesc field. I think it's because you split on spaces and commas both. I think it might be possible to simply hack in an array-based solution for the anyexact charts only, for 4.0, which I'm going to try.
Attachment #438173 - Flags: review?(mkanat) → review-
Attached patch v12 (4.0) (obsolete) (deleted) — Splinter Review
Okay, this fixes the problem on 4.0, and passes xt/search.t. It's a specific hack for the anyexact fields that appear as multi-selects on query.cgi.
Attachment #438173 - Attachment is obsolete: true
Attachment #458003 - Flags: review?(LpSolit)
Comment on attachment 458003 [details] [diff] [review] v12 (4.0) This fixes the problem, and the buglist now contains expected bugs, but the data displayed before the buglist header, where arguments passed to buglist.cgi are displayed, are all empty. This only affects this patch. Bugzilla 4.1 is fine.
Attachment #458003 - Flags: review?(LpSolit) → review-
Do not forget to grant approval for trunk, as you already committed the patch. ;) We are so close to get it work on 4.0 that I will block 4.0 on it. Do you plan to backport it to 3.6.x too?
Flags: blocking4.0+
Flags: approval?
(In reply to comment #85) > Do not forget to grant approval for trunk, as you already committed the patch. > ;) Ah, thanks, yeah. :-) > We are so close to get it work on 4.0 that I will block 4.0 on it. Sounds reasonable. > Do you plan to backport it to 3.6.x too? No--I checked if the 4.0 patch applies, and it doesn't, so I'd have to write a different patch that I'm not sure would work. In particular, 3.6 still uses commas in func_defs, and I'm not sure how that would affect things (and it would be a lot more complex I suspect). I think it's probably better to keep 3.6 as it is than to risk any serious search regression on a stable branch.
Flags: approval? → approval+
(In reply to comment #86) > I think it's probably better to keep > 3.6 as it is than to risk any serious search regression on a stable branch. Yeah, I have the same feeling about it. :)
Attached patch v13 (4.0) (deleted) — Splinter Review
Ah, thanks for catching that. :-) This fixes the search description.
Attachment #458003 - Attachment is obsolete: true
Attachment #459198 - Flags: review?(LpSolit)
Comment on attachment 459198 [details] [diff] [review] v13 (4.0) Works fine, thanks. r=LpSolit
Attachment #459198 - Flags: review?(LpSolit) → review+
Flags: approval4.0+
Great, thanks for the review. :-) Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/4.0/ modified Bugzilla/Search.pm Committed revision 7338.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 581327
Keywords: relnote
Added to the release notes in bug 604256.
Keywords: relnote
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: