Closed
Bug 921169
Opened 11 years ago
Closed 11 years ago
Implement secure DOM localization
Categories
(L20n :: HTML Bindings, defect, P1)
L20n
HTML Bindings
Tracking
(Not tracked)
RESOLVED
FIXED
1.0
People
(Reporter: stas, Assigned: stas)
References
Details
Attachments
(2 files, 2 obsolete files)
(deleted),
patch
|
zbraniecki
:
review+
mgol
:
feedback+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
We should make our DOM localization logic secure and sound for the 1.0 release. The initial discussion took place in the mailing list:
https://groups.google.com/d/msg/mozilla.tools.l10n/9JvxgTUwIrk/70ja-BE55kQJ
Goals
-----
1. Make it possible to localize content in HTML elements and
attributes without forcing developers to split strings into pre-
and post- parts (definitely a bad practice). For instance, it
should be possible for an <a> element to be a part of the
translation.
2. Make it possible for localizers to apply text-level semantics to
the translations and make use of HTML entities. For instance,
it should be possible for a localizer to use an <sup> element in
"M<sup>me</me>" (an abbreviation of French "Madame").
Constraints
-----------
1. Make the whole system secure and don't trust translations by
default allowing a safe set of attributes on HTML elements. For
instance, the localizer should not be able to add an onclick
handler or overwrite the target of href or src without the
developer or the localization engineer knowingly allowing it.
2. Don't break the Web; in particular, until we get more feedback
and gather more data, we should not break any two-way bindings
the third-party libraries might have set up on existing DOM
nodes.
Assignee | ||
Comment 2•11 years ago
|
||
This is an early work-in-progress implementation. It:
- allows safe elements in translations,
- keeps original elements instead of replacing them,
- uses a whitelist for elements,
- doesn't check attributes against whitelist yet,
- doesn't copy/overlay attributes yet,
- doesn't support ITS translateRules for extending the whitelists,
- currently breaks language switching,
- probably doesn't handle errors well yet.
Gandalf, can you take a look and see if you like the general direction?
Michał, would you mind applying this locally and verifying that it works well
with Angular? It should keep the nodes for which Angular creates two-way
bindings.
Attachment #810757 -
Flags: feedback?(m.goleb+mozilla)
Assignee | ||
Updated•11 years ago
|
Attachment #810757 -
Flags: feedback?(gandalf)
Assignee | ||
Updated•11 years ago
|
Attachment #810757 -
Attachment is obsolete: true
Attachment #810757 -
Flags: feedback?(m.goleb+mozilla)
Attachment #810757 -
Flags: feedback?(gandalf)
Assignee | ||
Comment 3•11 years ago
|
||
I'm quite happy with what I ended up today:
- allows safe elements in translations,
- keeps original elements instead of replacing them,
- uses a whitelist for elements,
- uses a whitelist for attributes,
- copies and overlays attributes,
- works with lang switching and retranslation,
- should handle corner cases well.
- doesn't support ITS translateRules for extending the whitelists (I'd like to
fix this in a follow-up bug to not increase the complexity of this patch).
One important change is that I removed the whole l20nSourceNode approach, since
it's not trivial to make it work with the paradigm of keeping original nodes in
the DOM. However, I don't think this is bad. The patch doesn't support
reordering anyways, so retranslation based on the actual node instead of
L20nSourceNode can't break too much. The only case I can think of is when
language 1 adds a title attribute, and language 2 doesn't. Then the title attr
will stay on the node and will be present in the language 2 localization. I'd
prefer to fix this in a follow-up if it turns out to be a problem. I do think
it will be quite rare an edge-case.
Attachment #811184 -
Flags: review?(gandalf)
Attachment #811184 -
Flags: feedback?(m.goleb+mozilla)
Assignee | ||
Updated•11 years ago
|
Attachment #811184 -
Attachment is obsolete: true
Attachment #811184 -
Flags: review?(gandalf)
Attachment #811184 -
Flags: feedback?(m.goleb+mozilla)
Assignee | ||
Comment 4•11 years ago
|
||
I realized one thing: creating a dettached div or body and assigning the
unsanitized translation to innerHTML may trigger HTTP requests or even scripts
to run. Sanitizing only after the HTML was parsed is too late.
For instance, the translation might look like this:
<img src="http://evil.com/track.png">
Scripts in <script> tags are not executed, but you can easily run arbitrary
code otherwise. For instance:
<img src="missing.png" onerror="alert('pwnd!');">
The overlayElement will correclty sanitize these translations and the <img> nor
the 'onerror' callback won't be present in the final result, but they will
actually be parsed and run the moment when the unsanitized translation is
assigned to innerHTML.
To combat this, we can 1) use DOMParser and create a new document our of every
translation (overkill; also 10x slower) or 2) take advantage of the <template>
element, which uses DocumentFragments internally. It parses the HTML but keeps
it inert, so no requests are made and no scripts are run when it's created.
Because it uses DocumentFragments, I can use XPath any more to match elements
in translation to elements in the source HTML (see bug 402129). I tried using
CSS selectors instead, but unfortunately I'm only interested in immediate
children (deeper descendants are taken care of recursively), and CSS doesn't
make it easy (you'll recall you can't just do querySelector('> span'); you need
the LHS of the child operator). There is the Selector level 4 spec in the
works with the :scope pseudo-class [1], but the support is scarce (Chrome
+ Firefox but only on the testign channels) and from my testing :scope also
doesn't work well with DocumentFragments.
We could go the Sizzle route, which is to assign a temporary id to the context
node and use querySelector on it [2]. But again, DocumentFragments can't have
attributes.
So in the end I went for a simple manual approach where I calculate the index
among siblings of the same type myself and then use it in reverse to match an
element in the translation. This works well.
I'd love to get some feedback on the whole approach. I still need to make
necessary changes to the documentation. I wish we had a client-side test
harness, but I don't want to block on this.
[1] http://www.w3.org/TR/2012/ED-selectors-api2-20121115/
[2] https://github.com/jquery/sizzle/blob/c50c710b6411db2cc6d77d3b982ac6625605ea63/src/sizzle.js#L248-L268
Attachment #811518 -
Flags: review?(gandalf)
Attachment #811518 -
Flags: feedback?(m.goleb+mozilla)
Comment 5•11 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #4)
> Created attachment 811518 [details] [diff] [review]
> Use template elements to force inertness of innerHTML
Great, great work! I've just tested it (sorry for the delay, busy Friday at work) and it works with Angular perfectly! :)
Updated•11 years ago
|
Attachment #811518 -
Flags: feedback?(m.goleb+mozilla) → feedback+
Comment 6•11 years ago
|
||
I'm not too optimistic about it being fixed, though, docco issues seems to linger without attention. We might have to fix it ourselves. (and docco is written in Literate CoffeScript, what a pain)
Comment 7•11 years ago
|
||
Sorry for the comment, wrong ticket. :)
Assignee | ||
Comment 8•11 years ago
|
||
This is the same patch as before, but applied to both bindings/l20n/html.js and
bindings/l20n/gaia.js.
Assignee | ||
Comment 9•11 years ago
|
||
I ran b2gperf tests on Gandalf's l20n-master with and without the patch (settings app)
without patch: median: 1213, mean: 1221, std: 35
with patch: median: 1220, mean: 1229, std: 29
The localization files don't have much HTML in them, so there isn't much difference. I added <em>Foo</em> to all 500 entities to see how the performance changes.
without patch: no change, but broken; <em> is visible in textContent
with patch: median: 1283, mean: 1286, std: 33
Comment 10•11 years ago
|
||
Comment on attachment 811518 [details] [diff] [review]
Use template elements to force inertness of innerHTML
Review of attachment 811518 [details] [diff] [review]:
-----------------------------------------------------------------
It looks good. It's pretty complex, I'd love to make it very clear for the reader of html.js which part does he need to understand without dom overlays, but it works and performance is ok.
median of 30 iterations on Settings app:
without patch: 1068
with patch: 1076
Attachment #811518 -
Flags: review?(gandalf) → review+
Comment 11•11 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #4)
> To combat this, we can 1) use DOMParser and create a new document our of
> every
> translation (overkill; also 10x slower) or 2) take advantage of the
> <template>
> element, which uses DocumentFragments internally. It parses the HTML but
> keeps
> it inert, so no requests are made and no scripts are run when it's created.
The template element has poor browser support for now so I don't believe this is a viable solution. How about a sort-of polyfill for that, i.e. script element with custom type?
Assignee | ||
Comment 12•11 years ago
|
||
Landed in https://github.com/l20n/l20n.js/commit/892b1507a446080e80efebd24534744174561671.
I added docs to https://github.com/l20n/l20n.js/blob/892b1507a446080e80efebd24534744174561671/docs/html.md#making-html-elements-localizable
I also filed three follow-ups: bug 922573, bug 922576 and bug 922577.
(In reply to Michał Gołębiowski from comment #11)
> The template element has poor browser support for now so I don't believe
> this is a viable solution. How about a sort-of polyfill for that, i.e.
> script element with custom type?
Would you mind filing follow-up bug about this?
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 13•11 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #12)
> (In reply to Michał Gołębiowski from comment #11)
> > The template element has poor browser support for now so I don't believe
> > this is a viable solution. How about a sort-of polyfill for that, i.e.
> > script element with custom type?
>
> Would you mind filing follow-up bug about this?
Sorry, I totally forgot about that until I tested in IE recently... :/ I reported bug 942594 about that.
You need to log in
before you can comment on or make changes to this bug.
Description
•