Closed
Bug 866431
Opened 12 years ago
Closed 12 years ago
XHR arraybuffer response type uses too much memory
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla24
People
(Reporter: vlad, Assigned: vlad)
References
Details
Attachments
(5 files)
(deleted),
text/html
|
Details | |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sfink
:
review+
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
(deleted),
image/jpeg
|
Details | |
(deleted),
patch
|
sfink
:
review+
|
Details | Diff | Splinter Review |
Right now, XHR with an arraybuffer response type will read the incoming data into a nsCString, which dynamically grows by doubling. Then, when .response is fetched, it will create the ArrayBuffer, but not free the nsCString until the XHR itself goes away, even though it will never be touched again. This means that getting a 260MB data file can use up 512MB (string, doubled) + 260MB (ArrayBuffer) = 772MB at its peak.
The first simple fix is to just truncate the response string when we fetch the arraybuffer, since we keep track of the arraybuffer already.
The followup fix is to read the data into an arraybuffer, reallocating as we go as needed.
Assignee | ||
Comment 1•12 years ago
|
||
Here's a testcase that XHR's a 260MB file from my server (gzip compressed, so only 260k downloaded).
After clicking Start, we run the XHR until we get a readyStateChange == 4. At that point, about:memory shows something like 460M to event-targets (no idea why it's not 512mb; this number seems to vary between 460 and 500).
After clicking Get Response, we just call xhr.response and save it to a global variable. 460M of event-targets still, but now 260M JS objects (as expected).
After clicking null out xhr and doing a bunch of GCs, or clicking abort xhr (which calls xhr.abort()), memory usage just shows the 260M JS objects.
Assignee | ||
Comment 2•12 years ago
|
||
Here's part 1, where we just truncate mResponseBody after fetching the arraybuffer response. This doesn't do anything about our peak memory usage (we still have the peak while we do the copy), but it does mean that we free the larger string much more aggressively than waiting for the xhr to be GC'd or aborted.
Assignee: nobody → vladimir
Attachment #742732 -
Flags: review?(bzbarsky)
Comment 3•12 years ago
|
||
Comment on attachment 742732 [details] [diff] [review]
Part 1, truncate mResponseBody when we fetch arraybuffer
r=me
Attachment #742732 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 4•12 years ago
|
||
Here's a better fix than just truncating. This adds JS_ReallocateArrayBufferContents (we could drop JS_Allocate* now, since this does the same thing if the given contents are null), and a js::ArrayBufferBuilder helper class. Then XHR uses that to build up the AB directly.
The next step would be to trust Content-Length as a first guess at needed capacity. Step after -that- would be to add a X-Content-Decoded-Length header that would state the final size post-Content-Encoding decoding.
Attachment #744906 -
Flags: review?(sphink)
Attachment #744906 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 5•12 years ago
|
||
Not a perfect comparison -- left side is pre-patch, 32-bit nightly. Right side is post-patch 64-bit debug build. But the deck is stacked against the 64-bit debug build and it still peaks way lower.
Yellow is "private data", orange is "heap". I'm guessing 32/64 is the reason for the different categorization.
Comment 6•12 years ago
|
||
Comment on attachment 744906 [details] [diff] [review]
better fix - introduce ArrayBufferBuilder, use it
Review of attachment 744906 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good to me, but if you end up deciding to change the growth calculation, I'd like to take another look.
::: content/base/src/nsXMLHttpRequest.cpp
@@ +1747,5 @@
> xmlHttpRequest->mResponseBlob = nullptr;
> }
> + } else if (xmlHttpRequest->mResponseType == XML_HTTP_RESPONSE_TYPE_ARRAYBUFFER ||
> + xmlHttpRequest->mResponseType == XML_HTTP_RESPONSE_TYPE_CHUNKED_ARRAYBUFFER) {
> + // get the initial capacty to something reasonable to avoid a bunch of reallocs right
*capacity (but I'm not really reviewing this file)
::: js/src/jsapi.h
@@ +3673,5 @@
> + * as appropriate. The new data pointer will be returned in data. If *contents is NULL,
> + * behaves like JS_AllocateArrayBufferContents.
> + */
> +extern JS_PUBLIC_API(JSBool)
> +JS_ReallocateArrayBufferContents(JSContext *cx, uint32_t nbytes, void **contents, uint8_t **data);
I'm fine with removing JS_AllocateArrayBufferContents, but I don't care either way. I want all of this to die when we can create ArrayBuffers using external memory.
::: js/src/jsfriendapi.h
@@ +962,5 @@
> + capacity_ = length_ = 0;
> + }
> +
> + // will truncate if newcap is < length()
> + JSBool setCapacity(uint32_t newcap) {
JSBool is so... uh, 2011 or so. Just use bool now. (I know, there's still a ton of JSBools in JSAPI. But the new stuff is just bool.)
@@ +964,5 @@
> +
> + // will truncate if newcap is < length()
> + JSBool setCapacity(uint32_t newcap) {
> + JSBool ok = JS_ReallocateArrayBufferContents(NULL, newcap, &rawcontents_, &dataptr_);
> + if (ok) {
House style seems to be more like
if (!JS_foo())
return false;
do stuff;
@@ +981,5 @@
> + if (length_ + datalen > capacity_) {
> + uint32_t newcap;
> + // double while under maxgrowth or if not specified
> + if (!maxgrowth || length_ < maxgrowth) {
> + newcap = length_ * 2;
SM style nit: no curlies for single-line consequents.
Also, is doubling the length what you want? I don't know what datalens will be passed into append(), but if they're all kinds of weird sizes, then it seems like this won't be anywhere near a power of 2 (or even a multiple of pagesize), and will therefore waste memory on fragmentation in the allocator.
Similarly, you're comparing length_ to maxgrowth. Isn't the "growth" the change in capacity, not the change in length? I was assuming that maxgrowth was meant to be the max change in memory requested from the allocator; this seems to make it the max change in (length + memory requested) or something.
@@ +983,5 @@
> + // double while under maxgrowth or if not specified
> + if (!maxgrowth || length_ < maxgrowth) {
> + newcap = length_ * 2;
> + } else {
> + newcap = length_ + maxgrowth;
Same here; I'd expect this to be |newcap = capacity_ + maxgrowth|.
@@ +998,5 @@
> + if (!setCapacity(newcap))
> + return JS_FALSE;
> + }
> +
> + memcpy(dataptr_ + length_, data, datalen);
Can this be either memmove or an assert that they're non-overlapping? (I don't think overlapping with uninitialized memory is useful, so an assert's fine.)
::: js/src/jstypedarray.cpp
@@ +229,5 @@
> +
> + if (initdata)
> + memcpy(newheader->elements(), initdata, nbytes);
> +
> + // we rely on this being correct
? I don't understand the comment.
@@ +3951,5 @@
> JS_PUBLIC_API(JSBool)
> +JS_ReallocateArrayBufferContents(JSContext *cx, uint32_t nbytes, void **contents, uint8_t **data)
> +{
> + if (!*contents)
> + return JS_AllocateArrayBufferContents(cx, nbytes, contents, data);
I don't see why this is necessary, but I guess it's fine either way.
::: js/src/jstypedarray.h
@@ +139,4 @@
> uint8_t **data);
>
> static inline void setElementsHeader(js::ObjectElements *header, uint32_t bytes);
> + static inline uint32_t getElementsHeaderIntializedLength(const js::ObjectElements *header);
getElementsHeaderInitializedLength
Attachment #744906 -
Flags: review?(sphink) → review+
Comment 7•12 years ago
|
||
Comment on attachment 744906 [details] [diff] [review]
better fix - introduce ArrayBufferBuilder, use it
>+#define XML_HTTP_REQUEST_ARRAYBUFFER_MIN_SIZE (64*1024)
That's fairly noticeable... I guess people don't do much with small arraybuffers over XHR, though?
>+ if (!mResultArrayBuffer) {
> return JSVAL_NULL;
Shouldn't this still throw on aRv?
>+ if (!mArrayBufferBuilder.setCapacity(mArrayBufferBuilder.length())) {
Why do we need that, exactly? Is it to realloc to a smaller size in case we overallocated when doubling?
If so, please explicitly document that!
r=me on the content/ parts.
Attachment #744906 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 8•12 years ago
|
||
sfink wanted to see another version of the patch with doubling mechanisms and other stuff fixed; attached here!
Attachment #745588 -
Flags: review?(sphink)
Comment 9•12 years ago
|
||
Comment on attachment 745588 [details] [diff] [review]
ArrayBufferBuilder & co updated and fixed
Review of attachment 745588 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jsfriendapi.h
@@ +1007,5 @@
> + // assert that the region isn't overlapping so we can memcpy;
> + // this is max(start1, start2) >= min(end1, end2)
> + JS_ASSERT(std::max<const uint8_t *>(data, dataptr_ + length_) >=
> + std::min<const uint8_t *>(data + datalen, dataptr_ + length_ + datalen));
> +
some whitespace snuck in
Attachment #745588 -
Flags: review?(sphink) → review+
Assignee | ||
Comment 10•12 years ago
|
||
So I'm trying to tryserver this, and am baffled:
https://tbpl.mozilla.org/?tree=Try&rev=8ca20a4c88ba
The gcc platforms (Linux and OSX, basically) are all saying:
nsKeygenHandler.cpp:727:5: error: 'free' is not a member of 'nsCRT'
(Earlier one of them was erroring out with my "data" function arg shadowing a member function called data(), but I fixed that.)
I'm totally baffled, because free is absolutely a member of nsCRT. I wonder if this is because jsfriendapi.h is now doing #include <algorithm>, and that's screwing some very vintage code over... somehow?
Assignee | ||
Comment 11•12 years ago
|
||
So dholbert looked into this. It is gross.
Our mozalloc stuff does a #define free moz_free. So nsCRT actually gets a moz_free() member function when it's included. But, <cstdlib> does a '#undef free' (and all associated things). So when we #include <algorithm> (or anything that'll pull in <cstdlib>) after something that included our moz alloc #defines, things break.
I don't want to fix that problem, so I'm just going to not use std::min/max, but ugh.
Assignee | ||
Comment 12•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/b569b790a109
with the following changes:
- stopped using std::min/std::max, instead just wrote a helper overlapping function that did the ?: directly
- renamed 'data' arg to not shadow 'data' function (come on!)
- changed min size to 32k from 64k (per bz's feedback)
- throw OOM if we fail to create the arraybuffer object (per bz's feedback)
- added comment about reallocing down (per bz's feedback)
Assignee | ||
Updated•12 years ago
|
OS: Windows 8 → All
Hardware: x86_64 → All
Comment 13•12 years ago
|
||
PR_MAX? :/
Assignee | ||
Comment 14•12 years ago
|
||
Well, does std::max work instead of the PR_MAX? The <algorithm> issue will be everywhere..
Comment 15•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Updated•9 years ago
|
Depends on: CVE-2015-2739
You need to log in
before you can comment on or make changes to this bug.
Description
•