Closed Bug 1454590 Opened 6 years ago Closed 6 years ago

Align overrideMIMEType with the XMLHttpRequest Standard

Categories

(Core :: DOM: Core & HTML, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: annevk, Assigned: wisniewskit, NeedInfo)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

Priority: -- → P3
anne, baku, I'm wondering if you feel it's worth trying the mimesniffing spec on overrideMimeType first in this bug, before we go further with using it in other places like bug 1423877 and bug 1454325?

Anne, note that there is an unexpectedly failing WPT due to the spec changes (https://github.com/web-platform-tests/wpt/issues/12289). I'm not sure if my "fix" for that is appropriate, given the name of the test implies it should work in the unsent state, not just in the opened state, but the test is definitely at-odds with the spec now, so we'd need to figure that out.
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review268414


Code analysis found 2 defects in this patch:
 - 2 defects found by clang-tidy

You can run this analysis locally with:
 - `./mach static-analysis check path/to/file.cpp` (C/C++)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: dom/xhr/XMLHttpRequestMainThread.h:182
(Diff revision 1)
> +
> +  explicit MimeType(const nsAString& aType, const nsAString& aSubtype)
> +  : mType(aType), mSubtype(aSubtype)
> +  {}
> +
> +  ~MimeType() {}

Warning: Use '= default' to define a trivial destructor [clang-tidy: modernize-use-equals-default]

  ~MimeType() {}
  ^           ~~
              = default;

::: dom/xhr/XMLHttpRequestMainThread.cpp:3275
(Diff revision 1)
> +                }
> +                paramValue.Append(*pos);
> +              }
> +              ++pos;
> +              continue;
> +            } else {

Warning: Do not use 'else' after 'continue' [clang-tidy: readability-else-after-return]

            } else {
              ^~~~~~
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review268516

r- for testing. We need gtest for this parser.

::: dom/xhr/XMLHttpRequestMainThread.h:160
(Diff revision 1)
>    void GetCORSUnsafeHeaders(nsTArray<nsCString>& aArray) const;
>  };
>  
>  class nsXHRParseEndListener;
>  
> +class MimeType final {

{ in a new line

::: dom/xhr/XMLHttpRequestMainThread.h:164
(Diff revision 1)
>  
> +class MimeType final {
> +  NS_INLINE_DECL_REFCOUNTING(MimeType)
> +
> +private:
> +  class ParameterValue : public nsString {

here as well

::: dom/xhr/XMLHttpRequestMainThread.h:169
(Diff revision 1)
> +  class ParameterValue : public nsString {
> +  public:
> +    bool mRequiresQuoting;
> +
> +    ParameterValue()
> +    : nsString(), mRequiresQuoting(false)

<space><space>: mRequiresQuoting(false)

no need to use nsString CTOR.

::: dom/xhr/XMLHttpRequestMainThread.h:178
(Diff revision 1)
> +  nsString mType;
> +  nsString mSubtype;
> +  nsDataHashtable<nsStringHashKey, ParameterValue> mParameters;
> +  nsTArray<nsString> mParameterNames;
> +
> +  explicit MimeType(const nsAString& aType, const nsAString& aSubtype)

no explicit for more than 1 parameters.

::: dom/xhr/XMLHttpRequestMainThread.h:179
(Diff revision 1)
> +  nsString mSubtype;
> +  nsDataHashtable<nsStringHashKey, ParameterValue> mParameters;
> +  nsTArray<nsString> mParameterNames;
> +
> +  explicit MimeType(const nsAString& aType, const nsAString& aSubtype)
> +  : mType(aType), mSubtype(aSubtype)

<space><space>:...

::: dom/xhr/XMLHttpRequestMainThread.cpp:3107
(Diff revision 1)
>  {
>    return mState;
>  }
>  
> +namespace {
> +  static inline bool IsASCIIWhitespace(const char16_t c) {

Use mfbt/TextUtils.h instead of this.

::: dom/xhr/XMLHttpRequestMainThread.cpp:3111
(Diff revision 1)
> +namespace {
> +  static inline bool IsASCIIWhitespace(const char16_t c) {
> +    return c == 0x9 || c == 0xA || c == 0xC || c == 0xD || c == 0x20;
> +  }
> +
> +  static inline bool IsASCIIAlphanumeric(const char16_t c) {

and this

::: dom/xhr/XMLHttpRequestMainThread.cpp:3129
(Diff revision 1)
> +    return c == 0x9 || (c >= ' ' && c <= '~') || (c >= 0x80 && c <= 0xFF);
> +  }
> +}
> +
> +/* static */ already_AddRefed<MimeType>
> +MimeType::Parse(const nsAString& aMimeType)

This is complex and it should not be in this file.
I pref to have a separate .h/.cpp file and a gtest to cover all the possible input/output.

::: dom/xhr/XMLHttpRequestMainThread.cpp:3139
(Diff revision 1)
> +
> +  // Step 1, skip leading and trailing whitespace
> +  while (pos <= end && IsASCIIWhitespace(*pos)) {
> +    ++pos;
> +  }
> +  if (pos > end) {

pos >= ?

::: dom/xhr/XMLHttpRequestMainThread.cpp:3145
(Diff revision 1)
> +    return nullptr;
> +  }
> +  while (end >= pos && IsASCIIWhitespace(*end)) {
> +    --end;
> +  }
> +  if (pos > end) {

same here
Attachment #8997621 - Flags: review?(amarchesini) → review-
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review268516

> Use mfbt/TextUtils.h instead of this.

I see no "is ascii whitespace" function in that header. I'm guessing you meant NS_IsAsciiWhitespace, from xpcom/base/nsCRTGlue.h?

> pos >= ?

Not in this case, as "end" is really the last character (End - 1), not past the last character. I wioll change its name to "lastChar" to make that clearer.
Alright, I've updated the patchset and addressed the feedback. I added gtests that are largely based on the relevant WPT, but also include extra test-cases which helped improve the parser implementation.

A fresh try run seems fine to me: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5aad300fe50b6a177dc22a9158bd2df09274e295&selectedJob=193047155

My remaining concern at this point is the spec issue, but that part of the patch can be easily split off if necessary if we'd like to land the rest separately. (The spec issue is https://github.com/web-platform-tests/wpt/issues/12289)

Anne, do you think you'll have time to consider that anytime soon? I'll be away on PTO for a few weeks starting next week, and baku is away as well, so if you'd like to try landing this without that change I can try to find another reviewer for the rest before I leave.
Flags: needinfo?(annevk)
I can look into it first thing tomorrow. I'll leave the needinfo in place so I don't forget.
I suggest we do not reset override MIME type when open() is invoked:

  https://github.com/whatwg/xhr/pull/218
  https://github.com/web-platform-tests/wpt/pull/12404

Hope that helps in landing this before you go away, but otherwise it'll have to wait a bit which doesn't seem too bad?

(Also, many thanks for taking the number of outstanding XMLHttpRequest bugs down again!)
Flags: needinfo?(annevk)
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review269214

I'd suggest we leave xhr/overridemimetype-unsent-state-force-shiftjis.htm unchanged as well.
Attachment #8997621 - Flags: review?(annevk) → review-
Alright, I've removed the bits pertaining to that WPT, Anne. New try-run here, just in case: https://treeherder.mozilla.org/#/jobs?repo=try&revision=39a172991ace343c1c453e77b2a012f41c24c70e

Henri, mind reviewing this one? (baku is away on PTO, but did a partial review in comment 4).
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review269322

::: dom/base/MimeType.h:16
(Diff revision 2)
> +#include "nsDataHashtable.h"
> +#include "nsTArray.h"
> +
> +class MimeType final
> +{
> +  NS_INLINE_DECL_REFCOUNTING(MimeType)

There doesn't seem to be a reason in this patch why this type needs to be refcounted. Please remove the refcount and use `UniquePtr` instead of `RefPtr`.

::: dom/base/MimeType.h:19
(Diff revision 2)
> +class MimeType final
> +{
> +  NS_INLINE_DECL_REFCOUNTING(MimeType)
> +
> +private:
> +  class ParameterValue : public nsString

This works, but it's not clear to me if we're supposed to subclass our string classes like this. I'll need to ask dbaron.

::: dom/base/MimeType.cpp:24
(Diff revision 2)
> +{
> +  // See https://mimesniff.spec.whatwg.org/#parsing-a-mime-type
> +
> +  // Steps 1-2
> +  const char16_t* pos = aMimeType.BeginReading();
> +  const char16_t* lastChar = aMimeType.EndReading() - 1;

Please call this `end`, remove the `- 1` and change the loop conditions from `<=` to `<` (and the early return conditions from `>` to `==`), since that's how `EndReading()` is supposed to be used.

::: dom/base/MimeType.cpp:25
(Diff revision 2)
> +  // See https://mimesniff.spec.whatwg.org/#parsing-a-mime-type
> +
> +  // Steps 1-2
> +  const char16_t* pos = aMimeType.BeginReading();
> +  const char16_t* lastChar = aMimeType.EndReading() - 1;
> +  while (pos <= lastChar && NS_IsAsciiWhitespace(*pos)) {

Sadly, `NS_IsAsciiWhitespace` doesn't match what the Infra Standard requires when it comes to Form Feed, so we probably need a new function in `mfbt/TextUtils.h`.

::: dom/base/MimeType.cpp:31
(Diff revision 2)
> +    ++pos;
> +  }
> +  if (pos > lastChar) {
> +    return nullptr;
> +  }
> +  while (lastChar >= pos && NS_IsAsciiWhitespace(*lastChar)) {

Then `*lastChar` here needs to become `*(end - 1)` which is a bit unfortunate, but I still think it's better to use the idiomatic "pointer past end" formulation.

::: dom/base/MimeType.cpp:34
(Diff revision 2)
> +    return nullptr;
> +  }
> +  while (lastChar >= pos && NS_IsAsciiWhitespace(*lastChar)) {
> +    --lastChar;
> +  }
> +  if (pos > lastChar) {

This `if` condition can never evaluate to true, right? That is, if the string was entirely whitespace, we'd have returned earlier already.

::: dom/base/MimeType.cpp:46
(Diff revision 2)
> +    if (!IsHTTPTokenPoint(*pos)) {
> +      return nullptr;
> +    }
> +    ++pos;
> +  }
> +  const char16_t* typeEnd = pos - 1;

Likewise, please remove `- 1` here and change the later comparison from `<=` to `<`.

::: dom/base/MimeType.cpp:67
(Diff revision 2)
> +  while (pos <= lastChar && *pos != ';') {
> +    if (!IsHTTPTokenPoint(*pos)) {
> +      // If we hit a whitespace, check that the rest of
> +      // the subtype is whitespace, otherwise fail.
> +      if (NS_IsAsciiWhitespace(*pos)) {
> +        subtypeEnd = pos - 1;

And here.

::: dom/base/MimeType.cpp:83
(Diff revision 2)
> +      }
> +    }
> +    ++pos;
> +  }
> +  if (subtypeEnd == nullptr) {
> +    subtypeEnd = pos - 1;

Again here.

::: dom/base/MimeType.cpp:93
(Diff revision 2)
> +
> +  // Step 10
> +  nsString type;
> +  nsString subtype;
> +  for (const char16_t* c=typeStart; c<=typeEnd; ++c) {
> +    type.Append(ToLowerCase(*c));

`ToLowerCase` does single-code-point Unicode lower casing instead of ASCII lower casing. Probably the right way to go is to add
`char ToLowerCaseASCII(char aChar)`,
`char16_t ToLowerCaseASCII(char16_t aChar)` and
`char32_t ToLowerCaseASCII(char32_t aChar)` in nsUnicharUtils.h. These shouldn't use the gASCIIToLower lookup table, because a branch is needed to check if the index in within the length of the lookup table, so that branch might as well be used to check if the code unit is in the ASCII range.

::: dom/base/MimeType.cpp:114
(Diff revision 2)
> +
> +    // Steps 11.3 and 11.4
> +    nsString paramName;
> +    bool paramNameHadInvalidChars = false;
> +    while (pos <= lastChar && *pos != ';' && *pos != '=') {
> +      if (!paramNameHadInvalidChars) {

Assuming that parameters are most often valid, this optimization is probably a pessimization. Please remove this check.

::: dom/base/MimeType.cpp:118
(Diff revision 2)
> +    while (pos <= lastChar && *pos != ';' && *pos != '=') {
> +      if (!paramNameHadInvalidChars) {
> +        if (!IsHTTPTokenPoint(*pos)) {
> +          paramNameHadInvalidChars = true;
> +        }
> +        paramName.Append(ToLowerCase(*pos));

`ToLowerCase` again.

::: dom/base/MimeType.cpp:149
(Diff revision 2)
> +        // Step 11.7.1.2
> +        while (true) {
> +
> +          // Step 11.7.1.2.1
> +          while (pos <= lastChar && *pos != '"' && *pos != '\\') {
> +            if (!paramValueHadInvalidChars) {

Again, this optimization is a pessimization assuming that most often the input is valid.

::: dom/base/MimeType.cpp:168
(Diff revision 2)
> +            // Step 11.7.1.2.2.1
> +            ++pos;
> +
> +            // Step 11.7.1.2.2.2
> +            if (pos <= lastChar) {
> +              if (!paramValueHadInvalidChars) {

And here.

::: dom/base/MimeType.cpp:182
(Diff revision 2)
> +              ++pos;
> +              continue;
> +            }
> +
> +            // Step 11.7.1.2.2.3
> +            if (!paramValueHadInvalidChars) {

And here.

::: dom/base/MimeType.cpp:204
(Diff revision 2)
> +
> +        const char16_t* paramValueStart = pos;
> +
> +        // Step 11.7.2
> +        while (pos <= lastChar && *pos != ';') {
> +          if (!paramValueHadInvalidChars) {

And here.

::: dom/base/MimeType.cpp:217
(Diff revision 2)
> +          ++pos;
> +        }
> +
> +        // Step 11.7.3
> +        const char16_t* paramValueEnd = pos - 1;
> +        if (!paramValueHadInvalidChars) {

And here.
dbaron, is it considered OK to subclass nsString as seen in this patch?
Flags: needinfo?(dbaron)
Alright, I've addressed the feedback aside from the nsString subclassing. If we'd like to land this while I'm away on PTO, I might be able to find some time to re-work the patch to not do the subclassing tomorrow before I head out.
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review269408

Thanks. I guess we can deal with the subclassing issue as a follow-up if necessary. r+ provided that the comments here are addressed.

::: dom/base/MimeType.cpp:95
(Diff revision 3)
> +    type.Append(ToLowerCaseASCII(*c));
> +  }
> +  for (const char16_t* c = subtypeStart; c < subtypeEnd; ++c) {
> +    subtype.Append(ToLowerCaseASCII(*c));
> +  }
> +  mozilla::UniquePtr<MimeType> mimeType(new MimeType(type, subtype));

Please use `MakeUnique` instead of `new` here.

::: dom/xhr/XMLHttpRequestMainThread.cpp:2174
(Diff revision 3)
>        !waitingForBlobCreation) {
>      // Smaller files may be written in cache map instead of separate files.
>      // Also, no-store response cannot be written in persistent cache.
>      nsAutoCString contentType;
> +    if (!mOverrideMimeType.IsEmpty()) {
> +      contentType.Assign(NS_ConvertUTF16toUTF8(mOverrideMimeType));

My understanding is that nsACString Necko mime types are Latin1 and not UTF-8. In any case, there is no need to use `NS_Convert`... and `Copy`... is more efficient. Please replace this line with `LossyCopyUTF16toASCII(mOverrideMimeType, contentType);` (note that "ASCII" really means "Latin1" here and MimeType::Parse followed by Serialize sohuld ensure the UTF-16 data is within the Latin1 code point range).

::: intl/unicharutil/util/nsUnicharUtils.cpp:65
(Diff revision 3)
>  }
>  
> +char
> +ToLowerCaseASCII(char aChar)
> +{
> +  return mozilla::unicode::GetLowercase(aChar);

This does Unicode lowercasing instead of ASCII lowercasing. Please write the implementation of this function as:

```
if (aChar >= 'A' && aChar <= 'Z') {
  return aChar + 0x20;
}
return aChar;

```

And the same implementation for the other two functions.

::: mfbt/TextUtils.h:50
(Diff revision 3)
>    using UnsignedChar = typename detail::MakeUnsignedChar<Char>::Type;
>    auto uc = static_cast<UnsignedChar>(aChar);
>    return uc < 0x80;
>  }
>  
> +template<typename Char>

Please add a comment saying that this is intended to match the definition from the Infra Standard.
Attachment #8997621 - Flags: review?(hsivonen) → review+
Henri, it appears that your suggestion to use LossyCopyUTF16toASCII isn't going to work as expected, as I've documented in the review request. Other than that, I've adjusted the patch to match your other feedback.

If you feel it's alright without that change, I can request check-in before I leave (otherwise, we'll have to wait a few weeks until I return, or someone else will have to check what's up here and land the patch).

I'm doing a fresh try-run here just in case anything else pops up: https://treeherder.mozilla.org/#/jobs?repo=try&revision=390f78fa3280fa31733d4629c64f24bc64051723
Flags: needinfo?(hsivonen)
(In reply to Thomas Wisniewski [:twisniewski] (PTO Aug 15- Sep 7) from comment #18)
> Henri, it appears that your suggestion to use LossyCopyUTF16toASCII isn't
> going to work as expected, as I've documented in the review request.

I can't find such a comment on MozReview, but if it doesn't work, let's go with the conversion to UTF-8 instead.

> Other
> than that, I've adjusted the patch to match your other feedback.

Thank you.
Flags: needinfo?(hsivonen)
Comment on attachment 8997621 [details]
Bug 1454590 - Align overrideMIMEType with the XMLHttpRequest Standard.

https://reviewboard.mozilla.org/r/261326/#review269408

> My understanding is that nsACString Necko mime types are Latin1 and not UTF-8. In any case, there is no need to use `NS_Convert`... and `Copy`... is more efficient. Please replace this line with `LossyCopyUTF16toASCII(mOverrideMimeType, contentType);` (note that "ASCII" really means "Latin1" here and MimeType::Parse followed by Serialize sohuld ensure the UTF-16 data is within the Latin1 code point range).

Actually, if I use that suggestion, then two WPTs end up failing. With my original code, it works fine.

The WPTs that fail are 38 and 43 in xhr/overridemimetype-blob.html, such as "text/html;test=ÿ;charset=gbk" (which should quote the non-ASCII part, "text/html;test="ÿ";charset=gbk", and fails with these assertions appearing in the console output (leading to xhr.response.type being unset, thus failing the test):

[Child 6634, Main Thread] ###!!! ASSERTION: Not a UTF-8 string. This code should only be used for converting from known UTF-8 strings.: 'Error', file /media/ssd/mozilla/central3/xpcom/string/nsUTF8Utils.h, line 463
[Child 6634, Main Thread] ###!!! ASSERTION: Not a UTF-8 string. This code should only be used for converting from known UTF-8 strings.: 'Error', file /media/ssd/mozilla/central3/xpcom/string/nsUTF8Utils.h, line 121
[Child 6634, Main Thread] ###!!! ASSERTION: Input wasn't UTF8 or incorrect length was calculated: 'Error', file /media/ssd/mozilla/central3/xpcom/string/nsReadableUtils.cpp, line 380

Is it alright with you if I simply revert this fix, or are we potentially missing something here?
Thanks for the follow-up. Sorry, in my haste I forgot to hit "publish" on that comment (busy day). I've published it now, just in case. It does make sense to me to use the original line.

At any rate, I've just done one final rebase, so I'll request check-in.
Keywords: checkin-needed
Thank you! I pressed the autoland button on MozReview.
Keywords: checkin-needed
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/55a9d10ee432
Align overrideMIMEType with the XMLHttpRequest Standard. r=hsivonen
https://hg.mozilla.org/mozilla-central/rev/55a9d10ee432
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Assignee: nobody → wisniewskit
Anybody else following that, like me, didn't understand c8, I think what Anne meant was https://github.com/whatwg/xhr/pull/218
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: