Closed Bug 1295948 Opened 8 years ago Closed 8 years ago

query_build_dir_url() doesn't support encoded characters in build URL

Tracking

(firefox53 fixed)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

firefox53

---

fixed

People

(Reporter: whimboo, Assigned: svezauzeto12, Mentored)

Details

(Whiteboard: [lang=py])

Attachments

(1 file, 1 obsolete file)

Bug 1295948 - fixing method so it works with encoded URLs 8 years ago Tomislav Jurin (deleted), text/x-review-board-request	mozilla : review+	Details
Bug 1295948 - fixing method so it works with encoded URLs, changed return method 8 years ago Tomislav Jurin (deleted), text/x-review-board-request	mozilla : review-	Details

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Description

•

8 years ago

query_build_dir_url() [1] is doing some magic to find test packages as referenced in the test_packages.json file. It searches for the last '/' and replaces the remaining string with the appropriate test file. As I just noticed this doesn't work if the '/' is HTML encoded. Here an example: https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public%2Fbuild%2Ftarget.test_packages.json With this above logic we end up with a common.tests.zip file as: https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/common.tests.zip It strips off the 'public/build/' path which forces TC to ask for permissions to download the file, which ends-up with a 403 forbidden failure. We should hardening the method so it can also cope with encoded slashes. [1] https://dxr.mozilla.org/mozilla-central/rev/fe895421dfbe1f1f8f1fc6a39bb20774423a6d74/testing/mozharness/mozharness/mozilla/testing/testbase.py#150

Aki Sasaki (not active)

Comment 1

•

8 years ago

I'm guessing that instead of looking for '/' in the url, we should use something that already parses urls like urlparse, and urldecode the path if we want to allow for encoded slashes.

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 2

•

8 years ago

Sounds like a good idea, and I feel it would be great for a mentored project. Do you want to take that Aki?

Aki Sasaki (not active)

Comment 3

•

8 years ago

Sure.

Mentor: aki

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Updated

•

8 years ago

Whiteboard: [lang=py]

Tomislav Jurin

Assignee

Comment 4

•

8 years ago

Can I work on this ? Thanks

Aki Sasaki (not active)

Comment 5

•

8 years ago

Sure. It looks like you've already contributed patches; let me know if you have any questions or need any guidance to write this patch.

Tomislav Jurin

Assignee

Comment 6

•

8 years ago

If I understood correctly I should just decode url so it goes from : https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public%2Fbuild%2Ftarget.test_packages.json to : https://queue.taskcluster.net/v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build/target.test_packages.json than can be done ( as you mentioned above ) with urllib2.unquote(our_url) after that I should parse url and return url path, which in our case should be: /v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build/target.test_packages.json and then I should strip "target.test_packages.json" so filename can be added. We should get this: /v1/task/PnwWXZOuT8CFZ8TkL0G0CA/runs/0/artifacts/public/build (/filename is added after build ) Is that correct ? If it is please review this commit. It is possible that I have made error even if the above assumption is correct, I have never worked on this large project before. I have added comments in code so you follow my thoughts-slow more easily, we can just remove them later when you are satisfied with commit. Thanks.

Tomislav Jurin

Assignee

Comment 7

•

8 years ago

error when trying to push commit: error publishing review request 95750: Error publishing: Bugzilla error: Aki Sasaki [:aki] (PTO, back Nov28) <aki@mozilla.com> is not currently accepting 'review' requests. (HTTP 500, API Error 225) (review requests not published; visit review url to attempt publishing there)

Aki Sasaki (not active)

Comment 8

•

8 years ago

I just unblocked review requests. Try again?

Comment hidden (mozreview-request)

Tomislav Jurin

Assignee

Comment 10

•

8 years ago

commited, thanks :)

Aki Sasaki (not active)

Comment 11

•

8 years ago

mozreview-review

Comment on attachment 8814566 [details] Bug 1295948 - fixing method so it works with encoded URLs https://reviewboard.mozilla.org/r/95752/#review95792 r- because this patch changes the return value from a url to a path. In the review comments, I've shown a couple ways we can do this. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:169 (Diff revision 1) > > + #first we decode url > + reference_url = urllib2.unquote(reference_url) > + > + #now we parse url so we can extract path > + reference_url = urlparse(reference_url) Here you overwrite the reference_url string with a `urlparse.ParseResult`. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:172 (Diff revision 1) > + > + #now we parse url so we can extract path > + reference_url = urlparse(reference_url) > + > + #now we should extract path to string so we can find last slash > + reference_url = reference_url.path Here you overwrite `reference_url` with just the path. So if `reference_url` was once `https://server.name/path/to/artifact`, it is now `/path/to/artifact`. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:177 (Diff revision 1) > + reference_url = reference_url.path > + > + # and now this should work even with encoded urls ( which are not encoded anymore > + # when they come by the time they come here > last_slash = reference_url.rfind('/') > base_url = reference_url[:last_slash] Since `reference_url` no longer has the scheme or netloc (or any other url information, just the path), `base_url` will also be just a path. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:179 (Diff revision 1) > + # and now this should work even with encoded urls ( which are not encoded anymore > + # when they come by the time they come here > last_slash = reference_url.rfind('/') > base_url = reference_url[:last_slash] > > return '%s/%s' % (base_url, file_name) Now, since `base_url` is just a path, we're returning a path instead of a url. I can think of two things that might work: 1. remove the urlparse lines (lines 168-172). That means we'll just unquote and then remove everything in the url from the rightmost `/` onwards as the `base_url`, and add the filename. This is not proper behavior, because a url can have lots of info post-filename (e.g., https://server.name/path/to/artifact?query=string&query2=string2#anchor) but I don't think we use queries or anchors in the artifact urls, so this will probably be sufficient. 2. Instead of saying `reference_url = urlparse(reference_url)` and `reference_url = reference_url.path`, you can do something like ```python from urlparse import urlparse, ParseResult reference_url = "https://server/path/to/artifact?query=string#anchor" file_name = "new_artifact" # parts will be a list from the ParseResult: # ['https', 'server', '/path/to/artifact', '', 'query=string', 'anchor'] parts = list(urlparse(reference_url)) last_slash = parts[2].rfind('/') parts[2] = '/'.join([parts[2][:last_slash], file_name]) print ParseResult(*parts).geturl() ``` which changes `https://server/path/to/artifact?query=string#anchor` to `https://server/path/to/new_artifact?query=string#anchor` and is the proper behavior. Does that help?

Attachment #8814566 - Flags: review?(aki) → review-

Aki Sasaki (not active)

Updated

•

8 years ago

Assignee: nobody → svezauzeto12

Tomislav Jurin

Assignee

Comment 12

•

8 years ago

I have made new commit based on what we discussed. Hope it is working now, I have made few small tests in my own script but I am still not 100% sure that it is correct. I have made changes according to your 2. method which you described above. Thanks

Comment hidden (mozreview-request)

Aki Sasaki (not active)

Comment 14

•

8 years ago

Thanks! There's at least one nit. I triggered https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b2c13ed853747fdf554fb0a2cddd200825ff2de to verify this works.

Aki Sasaki (not active)

Comment 15

•

8 years ago

mozreview-review

Comment on attachment 8814732 [details] Bug 1295948 - fixing method so it works with encoded URLs, changed return method https://reviewboard.mozilla.org/r/95826/#review96166 One small nit, below. I'd like to see a try run finish with a unit test and a talos test successfully passing this part. I triggered a new one because the old one wasn't cooperating for some reason: https://treeherder.mozilla.org/#/jobs?repo=try&revision=63db69e1f268f3ad22f99d17a3a75244f3a3953e Using a small python script for testing was a good approach! I'm pretty sure this is good, but I'd rather be on the safe side. ::: testing/mozharness/mozharness/mozilla/testing/testbase.py:174 (Diff revision 1) > - # and now this should work even with encoded urls ( which are not encoded anymore > - # when they come by the time they come here > - last_slash = reference_url.rfind('/') > - base_url = reference_url[:last_slash] > > - return '%s/%s' % (base_url, file_name) > + return '%s' %url You only need to `return url` :) The '%s' is string replacement. It was there to join the `base_url` and `file_name` earlier, but we don't need that.

Attachment #8814732 - Flags: review?(aki) → review-

Aki Sasaki (not active)

Comment 16

•

8 years ago

Looks like the try run looks good, so the return statement is the only thing that needs fixing. Thank you!

Tomislav Jurin

Assignee

Comment 17

•

8 years ago

I think this should be it. Thanks for mentoring.

Comment hidden (mozreview-request)

Tomislav Jurin

Assignee

Updated

•

8 years ago

Attachment #8814732 - Attachment is obsolete: true

Aki Sasaki (not active)

Comment 19

•

8 years ago

mozreview-review

Comment on attachment 8814566 [details] Bug 1295948 - fixing method so it works with encoded URLs https://reviewboard.mozilla.org/r/95752/#review96524 Thanks!

Attachment #8814566 - Flags: review?(aki) → review+

Pulsebot

Comment 20

•

8 years ago

Pushed by asasaki@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/46a0a73fde9a fixing method so it works with encoded URLs r=aki

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 21

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/46a0a73fde9a

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox53: --- → fixed

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.