Closed
Bug 1150304
Opened 10 years ago
Closed 10 years ago
index: Restrict characters allowed in namespace (or encode them)
Categories
(Taskcluster :: General, defect)
Taskcluster
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jonasfj, Assigned: jonasfj)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
The current index API looks as follows:
GET /task/<namespace> (get task from index)
PUT /task/<namespace> (put task into index)
See: http://docs.taskcluster.net/services/index/
It would be nice if we could have end-points like:
GET /task/<namespace>/artifacts/<artifactName>
Which redirects to the queue, which redirects to the artifact...
(Might need special tricks for secret artifacts).
Anyways, we can't have that because current <namespace> format in the URI
doesn't encode / or any other special characters. Doing so is obviously
a breaking change. But I don't think there're many if any of the current
namespaces that would change if we required that the argument be encoded
by encodeURIComponent.
Ie. in the future we should always do:
task.routes = [
'index.' + encodeURIComponent('my.namespace.something')
]
And:
var index = new taskcluster.Index(...);
index.findTask(encodeURIComponent('my.namespace.something'))
The only real change is that we just reduce the number if legal characters
and require that illegal characters be encoded. Implementation-wise there is
no difference in the index service, except we validate and reject namespace
names that aren't URI encoded.
IMO, We don't have to decode before storing the database.
Is this crazy, will I break anything... If now shall we speed this up :)
This would reduce allowed characters in namespace names to:
[a-zA-Z0-9_.!~*'()-]
With "." still carrying special meaning.
taskcluster-index side I would change:
route: '/task/:namespace(*)',
Into:
route: '/task/:namespace',
At: https://github.com/taskcluster/taskcluster-index/blob/master/routes/api/v1.js#L119
Assignee | ||
Comment 1•10 years ago
|
||
@mshal, @jlal, @auswerk, Are you cool with this change?
And do you know of anyone else using index who would be in trouble by this change?
Note, we would only do:
index.findTask(encodeURIComponent('my.namespace.something'))
If the namespace we really really wanted to use contained a character not in
the set of allowed characters: [a-zA-Z0-9_.!~*'()%-]
(Sorry, I forgot to add % before, obviously it's allowed otherwise URI component encoding won't work).
For most namespace name we would refrain from using illegal characters, so no encoding is necessary.
(From a brief look at the inspector I could see any with "/" or other special characters in use)
Flags: needinfo?(mshal)
Flags: needinfo?(jlal)
Flags: needinfo?(aus)
Comment 2•10 years ago
|
||
I don't think that restricting the namespaces to that allowed set of characters would cause problems. AFAIK the only place we use characters outside of the set is in the artifact names, not the namespace (eg: a release build package may have spaces in the filename).
Flags: needinfo?(mshal)
Assignee | ||
Comment 3•10 years ago
|
||
Artifacts names also contains slashes "/", example: "public/logs/live.log" is a good example.
Note, for the HTTP API using illegal characters in namespace will return some 4xx error.
For indexing using routing key a message will be printed in log. At the moment we do allow
indexing with routing keys like: "index.my/key.otherkey"
After this, that wouldn't be allowed. But I think the concept of illegal values already exists
in some form, AMQP allows for routing keys as bytes, so you can have invalid unicode in there.
I suspect using that would cause internal errors in various places of the index :)
So maybe limiting allow characters would harden the index too.
Comment 4•10 years ago
|
||
Oh right, I was only thinking of the final pathname of the artifact for some reason.
Comment 5•10 years ago
|
||
That seems fine to me. I don't actually need to change anything but, to be safe, I would have one small minor update to make to the taskcluster-npm-cache creation process to encodeURIComponent(namespace) but it's not a necessity before this change is committed.
Flags: needinfo?(aus)
Assignee | ||
Comment 6•10 years ago
|
||
This also upgrades TC-base, so we keep azure connections alive, and upgrade to node 0.12, this improves latency a lot...
@jlal, I can't assign your for review... Anyways, I'm not rolling this out before you clear the NI. I don't see any conflicts as all existing namespaces are within the allowed character set.
This requires that task are indexed under string on the form:
/^([a-zA-Z0-9_!~*'()%-]+\.)*[a-zA-Z0-9_!~*'()%-]+$/
So no empty strings between the dots, ie. "my.namespace..test" is not valid.
Comment 7•10 years ago
|
||
Comment on attachment 8596914 [details]
Github PR
r+ with a couple of comments in GH.
Attachment #8596914 -
Flags: review?(garndt) → review+
Updated•10 years ago
|
Flags: needinfo?(jlal)
Assignee | ||
Comment 8•10 years ago
|
||
Rolled out the patch...
Allowed characters are: a-zA-Z0-9_!~*'()%-
Please, let me know if this causes problems anywhere...
@mshal, @jlal, you might want keep an eye on the things you index to make sure they are still being indexed...
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•9 years ago
|
Component: TaskCluster → General
Product: Testing → Taskcluster
Target Milestone: --- → mozilla41
Version: unspecified → Trunk
Comment 9•9 years ago
|
||
Resetting Version and Target Milestone that accidentally got changed...
Target Milestone: mozilla41 → ---
Version: Trunk → unspecified
Blocks: 1680064
You need to log in
before you can comment on or make changes to this bug.
Description
•