Closed Bug 146945 Opened 23 years ago Closed 22 years ago

clean up format ambiguities

Categories

(Bugzilla :: Bugzilla-General, defect, P2)

2.15
defect

Tracking

()

RESOLVED FIXED
Bugzilla 2.18

People

(Reporter: myk, Assigned: gerv)

References

Details

Attachments

(2 files, 2 obsolete files)

The concept of formats is ambiguous because it represents two kinds of variations in appearance: structural (f.e. classic and modern versions of the query form) and representational (f.e. HTML and RDF versions of bug lists). All structural variations *have* a representation (i.e. content type), but all representational variations *are* a representation, which leads to confusion about how users should request variations. Structural and representational variations should be disambiguated from each other by distinguishing between the two in filenames and URL/form parameters via the following scheme: Variations in structure are called "formats." Each format has a unique name, and the name of a template file implementing a format consists of the name of the script (or equivalent identifier) followed by a dash, the format name, a period, the filename extension representing the format's content type, another period, and the "tmpl" extension, f.e.: foo-classic.html.tmpl foo-modern.html.tmpl Formats may be specified by users via the "format" URL/form parameter. Variations in representation are called "content types." Each content type has a unique extension, and the name of a template file implementing a content type consists of the name of the script (or equivalent identifier), followed by a period, the extension, another period, and the "tmpl" extension, f.e.: bar.html.tmpl bar.xml.tmpl Content types may be specified by users via the "ctype" URL/form parameter. Scripts may support both kinds of variations, in which case the "format" parameter takes precedence over the "ctype" parameter. Scripts may not implement both kinds of variations in a single file, at least not until someone implements a data structure for storing variations that indexes them by both format name and extension (there is no need for this now, and it's unclear that there ever will be, but the scheme above does not do anything to prohibit its implementation if anyone ever discovers a reason to do it). Bugzilla should fail gracefully when it cannot find the requested variation. Bugzilla should have ways for installations and users to specify default variations.
Attached patch work-in-progress: complete but untested (obsolete) (deleted) — Splinter Review
Here's a patch implementing the scheme suggested by this bug report.
Priority: -- → P2
Target Milestone: --- → Bugzilla 2.18
Blocks: 132181
> Scripts may support both kinds of variations, in which case the "format" > parameter takes precedence over the "ctype" parameter. I don't understand what you mean by "take precedence". If I ask for fred.cgi?format=foo&ctype=rdf I should get the template fred-foo.rdf.tmpl or an error, if it doesn't exist. Also, what impact does this have on 2.16? If we are going to rename lots of templates (e.g. buglist-rdf.rdf.tmpl -> buglist.rdf.tmpl) then we should discuss this idea and, if we decide it's a good one, do it now rather than later... The default format should be "" (as now; i.e. fred.ctype.tmpl), and the default ctype should be defined in the CGI script itself. Gerv Gerv
>I don't understand what you mean by "take precedence". If I ask for >fred.cgi?format=foo&ctype=rdf >I should get the template >fred-foo.rdf.tmpl >or an error, if it doesn't exist. If someone asks for a template that doesn't exist, but half of their request refers to a template that does exist, it's a good bet they wanted the existing template, so it makes a lot more sense (because it makes our users happier more of the time) to give them what they probably wanted then throw an error message. >Also, what impact does this have on 2.16? If we are going to rename lots of >templates (e.g. buglist-rdf.rdf.tmpl -> buglist.rdf.tmpl) then we should >discuss this idea and, if we decide it's a good one, do it now rather than >later... It has no impact on 2.16, because 2.16 needs to ship and this is too big a change to go in at the last minute. It's a shame, but it's better if we ship and then make big changes in 2.18 than hold 2.16 for another month or more while we work these changes out. Also, the idea that we can get all the big changes into 2.16 if we just hold out a little longer is silly. We're going to make big changes in 2.18 no matter how long we hold 2.16. >The default format should be "" (as now; i.e. fred.ctype.tmpl), and the default >ctype should be defined in the CGI script itself. This code doesn't specifically address default formats (in fact it specifically bypasses that issue). Let's go argue about that elsewhere. :-)
No longer blocks: 132181
Blocks: 150049
> If someone asks for a template that doesn't exist, but half of their request > refers to a template that does exist, it's a good bet they wanted the existing > template, so it makes a lot more sense (because it makes our users happier more > of the time) to give them what they probably wanted then throw an error message. This causes problems, because Template Toolkit looks for the given filename along a variable path. We'd have to make a call to try one filename and, if that failed, try another one. That's horrible and icky. Why can't it be simple? foo.cgi <nothing> -> foo.html.tmpl format=simple -> foo-simple.html.tmpl ctype=xml -> foo.xml.tmpl format=simple&ctype=xml -> foo-simple.xml.tmpl As Bugzilla is primarily a web-based application, it makes sense for <nothing> to map to ctype=html and format="". This means we can get rid of all the GetOutputFormats() and ValidateOutputFormats() stuff, and just construct the filename we look for along the simple lines listed above. It separates structural and representational variations, easily allows both at once (a simple RDF version, or whatever), and is generally easy. We stop the user requesting invalid combinations using the UI. What's wrong with it? :-) I'm sure there's something... Gerv
Blocks: 162151
I like Gerv's idea and only ./list/list-rdf.rdf.tmpl ./bug/choose-xml.html.tmpl need to be changed. > the default ctype should be defined in the CGI script itself Yes :-) This is e.g. needed for the emails which use the template system.
> > the default ctype should be defined in the CGI script itself > Yes :-) This is e.g. needed for the emails which use the template system. It's not, you know. :-) Those templates would just be called (as they, in fact, are) someemail.txt.tmpl . And remember, using the formats system is optional for any given call to the Template Toolkit - it's still perfectly possible to hard-code template names (such as someemail.txt.tmpl). Gerv
I'm uneasy about Gerv's proposal, but I can't put my finger on what exactly is the matter with it. As it stands, it sounds good, but something is nagging me about it.
Attached patch Patch v.1 (obsolete) (deleted) — Splinter Review
OK, here's an initial patch. In addition to this patch, list/list-rdf.rdf.tmpl would need renaming to list/list.rdf.tmpl, removing the redundancy. It would then be requested with "ctype=rdf". This is the only file whose name doesn't fit the new mechanism. Bug 150049 is a meta-bug for template format improvements. It has five dependencies, including this bug. Here is the impact on the other four: Bug 140513 - fixed (pass "txt" as the third parameter) Bug 143604 - fixed (ctype=html does a no-op) Bug 148133 - fixed (it now returns text/plain if the ctype is not found, which seems reasonable) Bug 149061 - not fixed. It's outside the scope. Gerv
Comment on attachment 98718 [details] [diff] [review] Patch v.1 >Index: globals.pl >@@ -1515,119 +1515,27 @@ + my ($templatename, $format, $ctype) = @_; Nit: "name" is redundant in "templatename". >+ return >+ { >+ 'template' => $templatename , >+ 'extension' => $ctype , >+ 'contenttype' => $::contenttypes->{$ctype} || "text/plain" , Nit: It would make sense to change the name of the "contenttype" property to "ctype" since that's what we now use in the URL and in this function. >@@ -202,12 +202,14 @@ >+my $format = GetFormat("reports/duplicates", >+ $::FORM{'format'}, >+ $::FORM{'ctype'}); Nit: better as my $format = GetFormat("reports/duplicates", $::FORM{'format'}, $::FORM{'ctype'}); >Index: enter_bug.cgi >@@ -382,8 +382,10 @@ >+my $format = GetFormat("bug/create/create", >+ $::FORM{'format'}, >+ $::FORM{'ctype'}); See previous nit. >Index: query.cgi >@@ -363,7 +363,7 @@ +print "$format->{'contenttype'}\n\n"; -> print "Content-Type: $format->{'contenttype'}\n\n"; Are you going to create a separate patch to convert list-rdf.rdf.tmpl?
Attachment #98718 - Flags: review-
So what if someone passes 'format=blah' into the script, where blahis not a valid format?
-> me. I haven't beeng getting mail on this bug :-( Gerv
Assignee: justdave → gerv
Attached patch Patch v.2 (deleted) — Splinter Review
Here's version 2, with Myk's comments addressed. > So what if someone passes 'format=blah' into the script, where blahis not a > valid format? You get a standard "template not found" error. But I think this is reasonable - the UI should only allow people to choose valid formats. Of course you can define an invalid one by editing URLs, but you can create errors by editing URLs all over Bugzilla. Nothing nasty or insecure happens. Gerv
Attachment #84999 - Attachment is obsolete: true
Attachment #98718 - Attachment is obsolete: true
Yes, but currently you get told that, rather than the ThrowTemplateError that you'll now get. If ctype/format are wrong, then you should throw an error/die rather than silently accept it, I think
> If ctype/format are wrong, then you should throw an error/die rather than > silently accept it, I think The current patch lets Template Toolkit look for the file which ctype and/or format defines, and throws an error if it's not found. The only other thing we could do would be what happened previously - i.e. to replicate that process, and throw an error if it's not found before we get to TT. I don't think we gain much by throwing the error earlier, apart from the fact that we say "dodgy format/ctype" instead of "template not found". And it means we have to duplicate the code TT has for tracking down template files. Gerv
Hmm. I guess I'm convinced, as long as the template error message mentions the template/format/etc name somewhere in there, so that an admin can diagnose it. I think it will, since its actually a template name by the time TT gets to it. What about the silent detainting?
> What about the silent detainting? I don't quite understand... The ctype and format parameters are checked: + # Security - allow letters and a hyphen only + $ctype =~ s/[^a-zA-Z\-]//g; + $format =~ s/[^a-zA-Z\-]//g; Is that what you mean? Gerv
Right, they're checked, but failures are silently corrected. Why should a format of 'html*' become 'html' ? ThrowUserError for non-valid stuff.
Comment on attachment 99441 [details] [diff] [review] Patch v.2 Why should we throw a user error because of a typo? If we probably know what the user wanted, we should give it to them. That behavior is better in more situations, and actual errors (where the user mistypes one format name and gets another) are exceedingly rare and easily spotted and corrected. >+my $format = >+ GetFormat("reports/duplicates", $::FORM{'format'}, $::FORM{'ctype'}); Nit: Other parts of our code use half-level (two space) indentation rather than right-justification in these situations, and I think it's more consistent, readable, and maintainable: my $format = GetFormat("reports/duplicates", $::FORM{'format'}, $::FORM{'ctype'}); r=myk This is a pretty obvious fix and doesn't need second review, but you do need to fix existing template files (i.e. list-rdf.rdf.tmpl) and figure out how not to break existing applications using format=rdf.
Attachment #99441 - Flags: review+
> Why should we throw a user error because of a typo? Well, users shouldn't be typing format names - they should be manipulating forms. But I agree - the current code is more friendly. I'll just re-checkin list-rdf.rdf.tmpl as list.rdf.tmpl. It's only had a couple of minor tweaks since it was moved last anyway. > You need to... figure out how not to break existing applications using > format=rdf. That's more of a problem. Are there already many of these applications? Presumably mozbot is one. Can we just notify everyone and put this one down to experience, or are we going to have to add a hack which converts format=rdf to ctype=rdf on that CGI? Gerv
Fixed. Checking in globals.pl; /cvsroot/mozilla/webtools/bugzilla/globals.pl,v <-- globals.pl new revision: 1.200; previous revision: 1.199 done Checking in buglist.cgi; /cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v <-- buglist.cgi new revision: 1.193; previous revision: 1.192 done Checking in duplicates.cgi; /cvsroot/mozilla/webtools/bugzilla/duplicates.cgi,v <-- duplicates.cgi new revision: 1.26; previous revision: 1.25 done Checking in enter_bug.cgi; /cvsroot/mozilla/webtools/bugzilla/enter_bug.cgi,v <-- enter_bug.cgi new revision: 1.72; previous revision: 1.71 done Checking in query.cgi; /cvsroot/mozilla/webtools/bugzilla/query.cgi,v <-- query.cgi new revision: 1.104; previous revision: 1.103 done Checking in post_bug.cgi; /cvsroot/mozilla/webtools/bugzilla/post_bug.cgi,v <-- post_bug.cgi new revision: 1.65; previous revision: 1.64 done Removing list-rdf.rdf.tmpl; /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list-rdf.rdf.tmpl,v <-- list-rdf.rdf.tmpl new revision: delete; previous revision: 1.3 done RCS file: /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.rdf.tmpl,v done Checking in list.rdf.tmpl; /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.rdf.tmpl,v <-- list.rdf.tmpl initial revision: 1.1 done Removing list-rdf.rdf.tmpl; /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list-rdf.rdf.tmpl,v <-- list-rdf.rdf.tmpl new revision: delete; previous revision: 1.3 done RCS file: /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.rdf.tmpl,v done Checking in list.rdf.tmpl; /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.rdf.tmpl,v <-- list.rdf.tmpl initial revision: 1.1 done Gerv
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Attached patch patch v3: hack (deleted) — Splinter Review
I think we should hack it, and here it is.
Comment on attachment 99716 [details] [diff] [review] patch v3: hack r=gerv - seems OK code. I sort of disagree, though, because this seems like the sort of nipped-it-in-the-bud situation where we can avoid having to carry a hack for the next X years. But, I suppose we did do a release with the old way... We should at least deprecate this in the 2.18 release notes. Do we have a relnote keyword? Gerv
Attachment #99716 - Flags: review+
We could just upgrade botbot - hixie hasn't applied that patch to the sources yet.
I know of two tools using format=rdf (not including mozbot), and there are undoubtedly others. I don't think we can say we nipped it in the bud, especially when some installations (b.m.o notwithstanding) don't upgrade regularly (and thus may not upgrade from 2.16 to a later release for a few years).
OK, the hack it is, then. Unless we provide them a one-liner they can apply to _their_ Bugzillas: $::FORM{'format'} = "rdf" if $::FORM{'ctype'} = "rdf"; and then get them to upgrade their client software. Well, I can hope. :-) Gerv
Presumably you meant the reverse, but that won't work (whether in the tree or distributed on demand) because it causes Bugzilla to look for "list-rdf.rdf.tmpl" instead of "list.rdf.tmpl". Hack checked in: Checking in buglist.cgi; /cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v <-- buglist.cgi new revision: 1.194; previous revision: 1.193 done
QA Contact: matty_is_a_geek → default-qa
(In reply to Gervase Markham [:gerv] from comment #12) > the UI should only allow people to choose valid formats. Of course you can > define an invalid one by editing URLs, but you can create errors by editing > URLs all over Bugzilla. Nothing nasty or insecure happens. This mostly means "why should I bother about security here?". Your assumption is wrong, and bug 842038 is a perfect example of why invalid parameters must be rejected, not silently sanitized. bbaetz in comment 13 and comment 17 was right to worry about such partial validations. FYI, Bugzilla 4.4 and newer now loudly reject invalid format and ctype values.
Bug 842038, AIUI, is a case of using untrusted data before it's been sanitized, which is never a good idea. (It's somewhat hard to discuss it here as that bug is not yet open.) Gerv
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: