Open Bug 1392241 Opened 7 years ago Updated 1 year ago

[meta] Align with Fetch on data: URLs

Categories

(Core :: DOM: Core & HTML, enhancement, P3)

People

(Reporter: annevk, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: meta)

In https://github.com/whatwg/fetch/pull/579 I'm working on a revised standard for data: URLs, to put all issues related to them, in Firefox and across browsers, to bed for good. There are corresponding tests at https://github.com/w3c/web-platform-tests/pull/6890. Both are currently somewhat blocked because it's not clear to me what the best strategy for MIME types is. The RFC definition doesn't work: neither "text/html;" nor "text/html;unknown" can be treated as an error, but exactly how we should preserve missing or invalid parameters is unclear. Ideas on that are very welcome in https://github.com/whatwg/mimesniff/issues/30. I'm going to mark all data: URL bugs that are blocked on a better processing definition as blocking this bug, to make sure the solution covers all of them. I think we should start fixing them one by one even if the standard hasn't landed yet, as there are clear improvements we could make over the status quo.
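The parameter problem can be made concrete with a small sketch. The Rust code below is hypothetical: the names `LenientMime` and `parse_lenient` do not come from any specification or library. It assumes one possible answer to the open question, namely that empty parameters and parameters without `=` are silently dropped rather than treated as errors. It is deliberately simplified, with no token validation and no quoted-string handling.

```rust
/// A parsed MIME type: essence ("type/subtype") plus well-formed parameters.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
struct LenientMime {
    essence: String,
    parameters: Vec<(String, String)>,
}

/// Hypothetical lenient parser: a bad parameter never causes failure.
/// Assumption: empty parameters and parameters without '=' are dropped.
fn parse_lenient(input: &str) -> Option<LenientMime> {
    let mut parts = input.split(';');
    // `split` always yields at least one item, so this is the essence.
    let essence = parts.next()?.trim().to_ascii_lowercase();
    // Require a non-empty type and subtype around the first '/'.
    let (ty, subtype) = essence.split_once('/')?;
    if ty.is_empty() || subtype.is_empty() {
        return None;
    }
    let mut parameters = Vec::new();
    for part in parts {
        // Skips both "text/html;" (empty part) and "text/html;unknown" (no '=').
        if let Some((name, value)) = part.split_once('=') {
            let name = name.trim().to_ascii_lowercase();
            if !name.is_empty() {
                parameters.push((name, value.trim().to_string()));
            }
        }
    }
    Some(LenientMime { essence, parameters })
}
```

Under that assumption, both "text/html;" and "text/html;unknown" parse to the essence text/html with no parameters, while "text/html;charset=utf-8" keeps its charset parameter. Whether dropped parameters should instead be preserved in some form is exactly the open question.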
Sorry, should already have marked these meta bugs as P3.
Priority: -- → P3
Depends on: 908413
Tests have landed and the Fetch Standard has been updated: https://fetch.spec.whatwg.org/#data-urls
Anne, can you link to any tests we fail? It's not obvious to me where the tests are.
Thanks. I guess wpt.fyi is not updated yet. I don't see them there.
I’ve implemented this in Rust (https://github.com/servo/rust-url/tree/master/data-url), though that code takes &str (UTF-8) as input, whereas Gecko might want something that works on &[u16] directly, to avoid converting and copying a potentially long string.
Gecko's URIs are stored as UTF-8 strings, I believe. They're definitely stored as 1-byte strings. What they're _not_, at least in the data: case, is stored as a _single_ string. Right now Gecko does parse the full string, with a resulting extra copy, but bug 1333899 was aiming to stop doing that, and it would be good to plan for it. Ideally, there would be an API that takes the substring starting right after ':' and going up to (but not including) the '#' and parses that.
There isn’t exactly that public API in the code linked above, but it could be added.
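A minimal sketch of what such an entry point might look like, assuming the input is already a UTF-8 string slice. Both function names are hypothetical and are not part of the linked data-url code. `split_payload` takes the substring between ':' and '#' and borrows from it without copying. `data_url_payload` is a convenience for callers that only have the full URL string.

```rust
/// Hypothetical entry point: takes the part of a data: URL after the
/// scheme's ':' and before any '#', and splits it at the first ','.
/// Returns (header, body) as borrowed slices, or None if there is no ','.
fn split_payload(payload: &str) -> Option<(&str, &str)> {
    payload.split_once(',')
}

/// Hypothetical helper for callers holding the whole URL: borrows the
/// substring after "data:" up to (but not including) the first '#'.
fn data_url_payload(url: &str) -> Option<&str> {
    // `get` returns None if the string is too short or if byte 5 is not
    // on a character boundary, so this cannot panic.
    let scheme = url.get(..5)?;
    // URL schemes are matched ASCII case-insensitively.
    if !scheme.eq_ignore_ascii_case("data:") {
        return None;
    }
    let rest = &url[5..];
    // Exclude the fragment, if any.
    Some(match rest.find('#') {
        Some(i) => &rest[..i],
        None => rest,
    })
}
```

Because both functions return slices of their input, a caller that already stores the payload separately can call `split_payload` directly and avoid the extra copy of the full string.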
Component: DOM → DOM: Core & HTML
Summary: Align with Fetch on data: URLs → [meta] Align with Fetch on data: URLs
Severity: normal → S3