Closed
Bug 964561
Opened 11 years ago
Closed 8 years ago
Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem
Categories
(Core :: Storage: IndexedDB, defect)
Tracking
Status: RESOLVED FIXED
Target milestone: mozilla52
firefox52: fixed
People
(Reporter: bent.mozilla, Assigned: janv)
References
Details
(Whiteboard: [games])
Attachments
(11 files, 8 obsolete files)
(deleted), patch | asuth: review+ | Details | Diff | Splinter Review (attachments 1–9)
(deleted), patch | mak: review+ | Details | Diff | Splinter Review (attachment 10)
(deleted), patch | asuth: review+ | Details | Diff | Splinter Review (attachment 11)
Currently all structured clone data (including ArrayBuffer contents) is stored in the database while all Blob/File objects are always stored as actual files in the filesystem. Neither decision takes the data size into account so in the worst case we can have very large values in the database and many tiny files in the filesystem. We should be smarter about this and consistently store small values in the database and larger values in the filesystem.
The size threshold will probably need to be tweakable via pref so that mobile/desktop can have different values.
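The size-vs-threshold decision sketched above can be expressed in a few lines. This is an illustrative fragment, not Gecko code: the names are hypothetical, and the default values mirror the 100 KB (mobile) / 1 MB (desktop) figures discussed later in this bug.

```cpp
#include <cstdint>

// Hypothetical sketch of the size-based storage decision; constant names and
// defaults are assumptions, not actual Gecko pref machinery.
enum class StorageLocation { Database, Filesystem };

// Default thresholds; mobile gets a lower cutoff than desktop.
constexpr int64_t kDesktopThresholdBytes = 1 * 1024 * 1024;  // 1 MB
constexpr int64_t kMobileThresholdBytes = 100 * 1024;        // 100 KB

StorageLocation ChooseStorage(int64_t aDataSize, int64_t aThresholdBytes) {
  // Small values stay in the database; larger ones go to separate files.
  return aDataSize <= aThresholdBytes ? StorageLocation::Database
                                      : StorageLocation::Filesystem;
}
```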
Comment 1 • 11 years ago
In particular, this would greatly help in avoiding the general problem described in bug 933398 comment 0. We'd still have the 1 big copy from the ArrayBuffer (holding the VFS) to the structured clone buffer, but at least we'd avoid the subsequent copy when compressing and again later when storing in sqlite. (To avoid the structured clone buffer copy, we discussed an orthogonal optimization.)
Blocks: 933398
Whiteboard: [games]
Comment 2 • 11 years ago (Assignee)
I did a little investigation of Epic's Citadel demo.
<profile>/storage/persistent/http+++www.unrealengine.com/idb/1873003896EEMH_CPARCE_LDOA
- empty directory (no stored files, obviously)
sqlite3 <profile>/storage/persistent/http+++www.unrealengine.com/idb/1873003896EEMH_CPARCE_LDOA.sqlite
sqlite> pragma page_size;
32768
sqlite> select length(data) from object_data;
102690803
105
So, just 2 rows, one of them almost 100 MB (compressed).
There are published performance numbers on whether to store blobs in the database or in the filesystem; the crossover point appears to be around 100 KB.
We can make this number dependent on the platform, and maybe even configurable from JS, e.g.
var request = indexedDB.open("myDb", 1, { maxInternalBlobSize: 102400 });
Comment 3 • 11 years ago (Assignee)
We need this to avoid creating an extra buffer in IndexedDB implementation.
Assignee: nobody → Jan.Varga
Attachment #8366754 -
Flags: review?(luke)
Comment 4 • 11 years ago
Comment on attachment 8366754 [details] [diff] [review]
Add JSAutoStructuredCloneBuffer::Copy() which takes a PRFileDesc
Review of attachment 8366754 [details] [diff] [review]:
-----------------------------------------------------------------
Thus far we've tried to keep our dependency on NSPR quite limited. Do you suppose you could change the API to take an 'int fd' instead so we can just use plain posix freads instead? MSDN suggests this will just work on MSVC.
Comment 5 • 11 years ago (Assignee)
ah, ok, I'll rework it using plain posix freads.
Comment 6 • 11 years ago (Assignee)
Attachment #8366754 -
Attachment is obsolete: true
Attachment #8366754 -
Flags: review?(luke)
Attachment #8366857 -
Flags: review?(luke)
Comment 7 • 11 years ago
Comment on attachment 8366857 [details] [diff] [review]
patch v2
Review of attachment 8366857 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/vm/StructuredClone.cpp
@@ +44,5 @@
>
> #include "jscntxtinlines.h"
> #include "jsobjinlines.h"
>
> +#define FILE_COPY_BUFFER_SIZE 32768
Could you move this to a static const unsigned inside JSAutoStructuredCloneBuffer::copy?
@@ +1688,5 @@
> +bool
> +JSAutoStructuredCloneBuffer::copy(FILE *fd, size_t size, uint32_t version)
> +{
> + uint64_t *newData = static_cast<uint64_t *>(js_malloc(size));
> + if (!newData)
Could you use a ScopedJSFreePtr<uint64_t>?
@@ +1707,5 @@
> + }
> +
> + js_memcpy(pos, copyBuffer, bytesRead);
> + pos += bytesRead;
> + } while (true);
Could you add a JS_ASSERT((char*)newData + size == (char*)pos) at the end of the loop?
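For illustration, here is a minimal standalone sketch of the copy loop under review with the suggested changes applied (buffer-size constant moved inside the function, end-of-copy assertion). Plain `malloc`/`free` and `assert` stand in for `js_malloc`, `ScopedJSFreePtr`, and `JS_ASSERT`; this is a sketch, not the actual SpiderMonkey patch.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Chunked copy from a stdio FILE into a freshly allocated buffer of exactly
// aSize bytes. Returns false (and frees the buffer) on allocation or read
// failure; on success the caller owns *aOutData and must free() it.
bool CopyFromFile(FILE* aFile, size_t aSize, char** aOutData) {
  static const size_t kFileCopyBufferSize = 32768;

  char* newData = static_cast<char*>(std::malloc(aSize));
  if (!newData) {
    return false;
  }

  char copyBuffer[kFileCopyBufferSize];
  char* pos = newData;
  size_t remaining = aSize;
  while (remaining > 0) {
    size_t toRead =
        remaining < kFileCopyBufferSize ? remaining : kFileCopyBufferSize;
    size_t bytesRead = std::fread(copyBuffer, 1, toRead, aFile);
    if (bytesRead == 0) {
      std::free(newData);
      return false;
    }
    std::memcpy(pos, copyBuffer, bytesRead);
    pos += bytesRead;
    remaining -= bytesRead;
  }

  // The reviewer's suggested sanity check: the cursor must land exactly at
  // the end of the allocation.
  assert(newData + aSize == pos);

  *aOutData = newData;
  return true;
}
```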
Attachment #8366857 -
Flags: review?(luke) → review+
I think storing Blobs in the database if they are small enough makes a lot of sense.
However, I don't think we should be in a hurry to add the complexity involved in storing large structured clone data outside of the database until we have data showing that there are significant performance improvements.
Storing large data such that it doesn't get in the way of database lookup performance seems like the job of SQLite. Do we know what it does as the data gets large?
Storing values on the order of 100 MB is never going to be fast. No matter where the data is stored, we're going to have to read 100 MB into memory. This is hardly a common case.
The fact that the games code is doing it right now is hopefully temporary. As I understand it, it's a workaround since they need synchronous access to a filesystem, so they keep the whole filesystem in memory. Once we have something like sync IDB or the sync message channel, that will be solved in much better ways by only loading the data that's needed into memory.
Comment 9 • 11 years ago (Assignee)
Well, it seems that collapsing small Blobs is harder than storing structured clones outside of the database.
I have the latter mostly implemented.
I agree that, long term, they need real sync access to a filesystem, but in order to do that they also need WebGL on workers (that's what Luke told me), so it will take some time. For now, we could at least fix the crashes by lowering the number of allocated buffers and storing large structured clones externally.
Comment 10 • 11 years ago
(In reply to Jonas Sicking (:sicking) from comment #8)
> Storing large data in such that it doesn't get in the way of performance of
> database lookups seems like the job of SQLite. Do we know what it does as
> the data gets large?
Copies happen even before it gets into the hands of SQLite. It's copies all the way down :)
> Storing values of the size of 100MB is never going to be fast. No matter
> where the data is stored we're going to have to read 100MB into memory. This
> is hardly a common case.
It doesn't need to be fast; it needs to avoid keeping multiple full-size copies of the structured clone buffer in memory simultaneously. It seems like this would be really important for FxOS as well, where the memory spike from much smaller payloads would be unacceptable. Gregor was telling me yesterday that he has seen this be a problem.
> The fact that the games code is doing it right now is hopefully temporary.
Do you expect that *everything* a game needs will be available in Web Workers in less than a year?
Comment 11 • 11 years ago (Assignee)
btw, https://www.sqlite.org/intern-v-extern-blob.html, that's the page where I found the 100k boundary.
Comment 12 • 11 years ago (Assignee)
Attachment #8366857 -
Attachment is obsolete: true
Attachment #8367607 -
Flags: review?(luke)
Updated • 11 years ago
Attachment #8367607 -
Flags: review?(luke) → review+
Comment 13 • 11 years ago (Assignee)
I posted all the patches on try:
https://tbpl.mozilla.org/?tree=Try&rev=43244dabf359
and also with enforced external storage of structured clones:
https://tbpl.mozilla.org/?tree=Try&rev=2ec7c5676cf4
I will need some help here to do some better testing on mobile before landing.
Comment 14 • 11 years ago (Assignee)
try run with default "max internal blob size" (100 KB on android, 1 MB on other platforms):
https://tbpl.mozilla.org/?tree=Try&rev=a449f823796e
try run with "max internal blob size" set to 0 (every structured clone stored externally):
https://tbpl.mozilla.org/?tree=Try&rev=0bc8c6d7c317
The bc failures are caused by mochitest timeouts; it's interesting that Windows 7 and Windows 8 in particular are very slow when storing a lot of tiny files.
Comment 15 • 11 years ago (Assignee)
ok, this is the main patch
Attachment #8368504 -
Flags: review?(bent.mozilla)
Comment 16 • 11 years ago
Random comment on HN complaining about the specific issue this bug solves:
https://news.ycombinator.com/item?id=7068171
Comment 17 • 11 years ago (Assignee)
Ok, so who is going to make the final decision whether to take this patch or not?
It's unclear to me where this patch is attempting to prevent copies, i.e. which copies made by the current code this patch attempts to eliminate. The fact that we're writing to sqlite rather than a separate file doesn't seem to be what creates more copies.
Also, do we really need a new column here? Couldn't we simply write the file-id in the existing file-ids column and leave the data empty to indicate that the data lives in a file rather than the database?
But to more concretely answer your question, I think it's Ben's decision what to do here.
Comment 20 • 11 years ago (Assignee)
(In reply to Jonas Sicking (:sicking) from comment #18)
> It's unclear to me where this patch is attempting to prevent copies? I.e.
> what copies that the current code does is this patch attempting to
> eliminate? The fact that we're writing to sqlite rather than a separate file
> doesn't seem to be what creates more copies.
I think mozStorage or sqlite creates a copy, but there's also the compression which is done before writing to sqlite, so it seems that this patch eliminates 2 copies.
> Also, do we really need a new column here? Couldn't we simply write the
> file-id in the existing file-ids column and leave the data empty to indicate
> that the data lives in a file rather than the database?
Yes, that might be possible, good catch.
> I think mozStorage or sqlite creates a copy,
Ben thought otherwise, but I'll leave it to him.
> but there's also the compression which is done before writing to sqlite,
Compression seems orthogonal to where we store things. If we want to eliminate the copy that compression creates, we can do that while still writing into the database, no?
Comment 22 • 11 years ago (Assignee)
(In reply to Jonas Sicking (:sicking) from comment #21)
> > I think mozStorage or sqlite creates a copy,
>
> Ben thought otherwise, but I'll leave it to him.
We have this comment in IDBObjectStore.cpp:
// This will hold our compressed data until the end of the method. The
// BindBlobByName function will copy it.
but it seems it's not correct, hm
>
> > but there's also the compression which is done before writing to sqlite,
>
> Compression seems orthogonal to where we store things. If want to eliminate
> the copy that compression creates, we can do that while still writing into
> the database, no?
Well, that's possible, but snappy doesn't support compression in chunks :(
and switching to some other library doesn't sound like a good solution.
Anyway, I still think that we should support storing big structured clones in separate files. We can set the limit high, like several megabytes. There will always be people who store big things in the database and then complain about performance.
Comment 23 • 11 years ago (Assignee)
I don't know all the internals of sqlite, but I have a feeling that even when we eliminate the extra copy for compression, it's better to not store huge blobs in the database.
I think we should also ask someone from sqlite devs.
Flags: needinfo?(drh)
Storing things outside of the database for performance reasons seems like a good idea. It just sounded like we were storing it outside of the database in order to eliminate copies which didn't seem accurate. We can choose to store both compressed and uncompressed data both inside and outside the database.
Comment 25 • 11 years ago
(In reply to Jan Varga [:janv] from comment #23)
> I don't know all the internals of sqlite, but I have a feeling that even
> when we eliminate the extra copy for compression, it's better to not store
> huge blobs in the database.
>
> I think we should also ask someone from sqlite devs.
For "smaller" blobs, it is faster to store them in the SQLite database. SQLite is slower to write content to disk (because of the extra formatting it has to do, and because of transaction control) but storing in the database eliminates an open()/close() operation, which can be expensive.
The cutoff for what constitutes a "small" blob varies by OS and hardware. For tests we ran on Linux, the cutoff point was about 100KB. In other words, for blobs smaller than about 100KB, it is faster to store them as fields in an SQLite database than it is to write them into separate files. See http://www.sqlite.org/intern-v-extern-blob.html for details.
Note that SQLite rewrites an entire row whenever you change any column in that row. So if you do put massive blobs in a database, it is best to put them in their own table that assigns each blob an integer tracking number:
CREATE TABLE somekindablob(id INTEGER PRIMARY KEY, content BLOB);
Then use the integer tracking number in the table that contains all the other properties of the blob. Or if you really, really must put the large blob in the same table as other fields, at least put it at the very end, so that the other fields can be read without having to seek past the entire blob.
Flags: needinfo?(drh)
Comment 26 • 11 years ago (Assignee)
(In reply to D. Richard Hipp from comment #25)
> Or if you really, really must put the large
> blob in the same table with other fields, at least put the large blob at the
> very end, so that the other fields can be read without having to seek past
> the entire blob.
bent already did this optimization
Comment 27 • 11 years ago (Reporter)
mozStorage does make a copy of the blob before we write it to SQLite. This is the call order:
1. http://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBObjectStore.cpp?from=idbobjectstore.cpp#3204
2. http://dxr.mozilla.org/mozilla-central/source/storage/src/mozStorageBindingParams.cpp#351
3. http://dxr.mozilla.org/mozilla-central/source/storage/src/Variant.h#320
4. http://dxr.mozilla.org/mozilla-central/source/storage/src/Variant.h#245
This is horrible, and we should fix it.
Comment 28 • 11 years ago (Reporter)
1. I think we should pick a threshold, maybe 50k based on the chart drh provided in comment 25.
2. We should ask snappy how much space it needs to store our blob or structured clone data.
3. Anything smaller than the threshold should be compressed and stored in the database. I don't think we need to change our schema at all for this.
4. Anything larger should be stored outside the database.
5. I think it makes sense to store blobs uncompressed (they're likely in a format that is already compressed anyway).
6. Structured clone data maybe should always be compressed since our format is not very efficient (lots of 0s last I looked). However, this requires increased memory to give snappy a buffer to work with. We could try tricks like stitching together chunked snappy buffers but we'd need to see if that costs more performance than the speedup we get from writing less data.
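The plan in steps 1–6 above can be sketched as a small policy function. This is a hypothetical illustration, not the shipped implementation: the names are invented, the 50 KB figure comes from step 1, and the compression rule is simplified to "compress structured clones, not blobs" (steps 5 and 6).

```cpp
#include <cstdint>

// Hypothetical sketch of the proposed storage policy; names and exact rules
// are illustrative assumptions, not Gecko code.
enum class Kind { Blob, StructuredClone };

struct StoragePlan {
  bool external;    // stored as a file outside the database?
  bool compressed;  // run through snappy before writing?
};

StoragePlan PlanStorage(Kind aKind, uint64_t aSize) {
  const uint64_t kThresholdBytes = 50 * 1024;  // step 1: proposed 50k cutoff

  StoragePlan plan;
  // Steps 3/4: small payloads go into the database, large ones outside.
  plan.external = aSize >= kThresholdBytes;
  // Steps 5/6: blobs are likely already compressed; structured clone data
  // compresses well, so only the latter is compressed in this sketch.
  plan.compressed = (aKind == Kind::StructuredClone);
  return plan;
}
```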
I spun off comment 27 as bug 979445.
Comment 30 • 11 years ago (Reporter)
Comment on attachment 8368504 [details] [diff] [review]
patch v1
Canceling review until we get a new patch that avoids the extra column.
Attachment #8368504 -
Flags: review?(bent.mozilla)
Comment 31 • 11 years ago (Assignee)
When I force all structured clones to be stored externally, all tests pass except browser_quotaPromptDelete.js and browser_quotaPromptDeny.js:
https://tbpl.mozilla.org/?tree=Try&rev=1daa655e88d4
It's not a big deal, this is just an edge case that shouldn't happen during normal browser-chrome-mochitest runs, but it would be better to fix it.
However, it seems that I need to fix bug 856921 first.
Depends on: 856921
Comment 32 • 11 years ago (Assignee)
Attachment #8368504 -
Attachment is obsolete: true
Attachment #8400666 -
Flags: review?(bent.mozilla)
Comment 33 • 11 years ago (Assignee)
Attachment #8400667 -
Flags: review?(bent.mozilla)
Comment 34 • 11 years ago (Assignee)
Comment 35 • 11 years ago (Assignee)
Replaced custom FileAutoCloser with ScopedCloseFile from FileUtils.h
Attachment #8401073 -
Flags: review?(bent.mozilla)
Updated • 11 years ago (Assignee)
Attachment #8400666 -
Attachment is obsolete: true
Attachment #8400666 -
Flags: review?(bent.mozilla)
Comment 36 • 11 years ago
When the structured clone data is stored as a file, will it also be compressed?
Comment 37 • 11 years ago (Assignee)
(In reply to Luke Wagner [:luke] from comment #36)
> When the structured clone data is stored as a file, will it also be
> compressed?
The current patch doesn't compress it, to avoid an additional memory copy, since snappy doesn't support streaming at the moment. We could do the trick Ben mentioned, but that can be done later.
Hm, this reminds me that I wanted to add a flag for this, so it would be easy to add such support in the future without converting/compressing all stored files. Thanks for reminding me :)
Updated • 11 years ago (Assignee)
Attachment #8401073 -
Flags: review?(bent.mozilla)
I didn't look too closely at the patch, but do make sure that we properly handle things like
oS.put({ a: reallyHugeArrayBuffer, b: blob1, c: blob2 });
Also make sure to test this of course.
Comment 39 • 11 years ago (Assignee)
(In reply to Jonas Sicking (:sicking) from comment #38)
> I didn't look too closely at the patch, but do make sure that we properly
> handle things like
>
> oS.put({ a: reallyHugeArrayBuffer, b: blob1, c: blob2 });
>
> Also make sure to test this of course.
That should work correctly, but I'll add a test for it.
I canceled the review because I forgot to add a flag for the compression of externally stored structured clone data.
Comment 40 • 11 years ago (Assignee)
Attachment #8401073 -
Attachment is obsolete: true
Attachment #8406806 -
Flags: review?(bent.mozilla)
Comment 41 • 11 years ago (Assignee)
I'll attach the test separately.
Comment 42 • 10 years ago
Jan, is this urgent or can we get to this after Jamun is done? Thanks!
Flags: needinfo?(Jan.Varga)
Comment 44 • 10 years ago
I don't know of any super-pressing needs for this feature, just that we've hit this (comment 1, and again more recently) and had to tell people to change their code to store incrementally in smaller ArrayBuffers.
Flags: needinfo?(luke)
Comment 45 • 10 years ago (Assignee)
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #42)
> Jan, is this urgent or can we get to this after Jamun is done? Thanks!
Yeah, we'll get to it after Jamun is done.
Flags: needinfo?(Jan.Varga)
Comment 46 • 10 years ago (Assignee)
Comment on attachment 8400667 [details] [diff] [review]
Part 3: Fix tests
Clearing review request for now.
Attachment #8400667 -
Flags: review?(bent.mozilla)
Updated • 10 years ago (Assignee)
Attachment #8406806 -
Flags: review?(bent.mozilla)
Comment 47 • 10 years ago (Assignee)
I'm going to try to resurrect this patch.
Comment 48 • 10 years ago (Reporter)
IDB code should be pretty stable now; the only other relevant big change pending is locale-aware indexes (bug 871846).
Comment 49 • 8 years ago (Assignee)
I have a patch queue for this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dfb8873f8462600cd2bdcb6f1240fe15144dfcd1
It's a prerequisite for WASM in IDB.
This version also supports compression of externally stored structured clone data without creating flat buffers, so there are no big buffer copies. I created an input stream wrapper for structured clone data, and compression is done using the snappy compression output stream.
I also upgraded the snappy library to the latest version, which gives a better compression ratio and is also slightly faster.
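The input stream wrapper mentioned above can be pictured as follows: a stream-style `Read()` over the segmented structured clone buffers, so a compressing output stream can consume them chunk by chunk with no single flat copy. This is a rough, self-contained illustration with invented types, not the actual Gecko `nsIInputStream` wrapper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy stand-in for segmented structured clone buffers exposed as a stream.
class SegmentedInputStream {
 public:
  explicit SegmentedInputStream(std::vector<std::vector<char>> aSegments)
      : mSegments(std::move(aSegments)) {}

  // Reads up to aCount bytes, crossing segment boundaries as needed.
  // Returns the number of bytes actually read (0 at end of stream).
  size_t Read(char* aBuf, size_t aCount) {
    size_t total = 0;
    while (aCount > 0 && mSegmentIndex < mSegments.size()) {
      const std::vector<char>& seg = mSegments[mSegmentIndex];
      size_t avail = seg.size() - mOffset;
      size_t n = std::min(aCount, avail);
      std::memcpy(aBuf + total, seg.data() + mOffset, n);
      total += n;
      mOffset += n;
      aCount -= n;
      if (mOffset == seg.size()) {
        mSegmentIndex++;
        mOffset = 0;
      }
    }
    return total;
  }

 private:
  std::vector<std::vector<char>> mSegments;
  size_t mSegmentIndex = 0;
  size_t mOffset = 0;
};
```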
Updated • 8 years ago (Assignee)
Updated • 8 years ago (Assignee)
Attachment #8367607 -
Attachment is obsolete: true
Updated • 8 years ago (Assignee)
Attachment #8400667 -
Attachment is obsolete: true
Updated • 8 years ago (Assignee)
Attachment #8406806 -
Attachment is obsolete: true
Updated • 8 years ago (Assignee)
Status: NEW → ASSIGNED
Comment 50 • 8 years ago (Assignee)
Attachment #8802722 -
Flags: review?(bugmail)
Comment 51 • 8 years ago (Assignee)
Attachment #8802723 -
Flags: review?(bugmail)
Comment 52 • 8 years ago (Assignee)
Attachment #8802724 -
Flags: review?(bugmail)
Comment 53 • 8 years ago (Assignee)
Attachment #8802725 -
Flags: review?(bugmail)
Comment 54 • 8 years ago (Assignee)
Attachment #8802726 -
Flags: review?(bugmail)
Comment 55 • 8 years ago (Assignee)
Attachment #8802727 -
Flags: review?(bugmail)
Comment 56 • 8 years ago (Assignee)
Attachment #8802728 -
Flags: review?(bugmail)
Comment 57 • 8 years ago (Assignee)
Attachment #8802729 -
Flags: review?(bugmail)
Comment 58 • 8 years ago (Assignee)
Attachment #8802730 -
Flags: review?(bugmail)
Comment 59 • 8 years ago (Assignee)
Updated • 8 years ago (Assignee)
Attachment #8802729 -
Attachment description: Part 8: Disable externally stored structured clone dat a in file tests; r=asuth → Part 8: Disable externally stored structured clone data in file tests; r=asuth
Updated • 8 years ago (Assignee)
Attachment #8802730 -
Attachment description: Part 9: Add a new test for externally stored structure d clone data; r=asuth → Part 9: Add a new test for externally stored structured clone data; r=asuth
Comment 60 • 8 years ago (Assignee)
Comment on attachment 8802725 [details] [diff] [review]
Part 4: Update keys directly in the structured clone buffer; r=asuth
Review of attachment 8802725 [details] [diff] [review]:
-----------------------------------------------------------------
This is needed because when we compress externally stored structured clone data we don't create one big flat buffer. We read from the structured clone buffers directly, so the key must be updated there. The writing is a bit complicated since, in theory, a buffer boundary can fall after any character of the key.
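The tricky part described above, overwriting a key's bytes in place when a segment boundary can fall after any character, can be illustrated with a toy version. Plain nested vectors stand in for the real structured clone buffer segments; the function name and signature are invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Overwrite aLen bytes starting at logical position aPos across segmented
// buffers, copying one byte at a time since each byte may land in a
// different segment. Assumes aPos + aLen fits within the total size.
void WriteAcrossSegments(std::vector<std::vector<char>>& aSegments,
                         size_t aPos, const char* aKey, size_t aLen) {
  size_t segIdx = 0;
  // Locate the segment containing aPos.
  while (aPos >= aSegments[segIdx].size()) {
    aPos -= aSegments[segIdx].size();
    segIdx++;
  }
  for (size_t i = 0; i < aLen; i++) {
    aSegments[segIdx][aPos++] = aKey[i];
    if (aPos == aSegments[segIdx].size()) {
      segIdx++;
      aPos = 0;
    }
  }
}
```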
Comment 61 • 8 years ago (Assignee)
Comment on attachment 8802726 [details] [diff] [review]
Part 5: Add a data threshold pref; r=asuth
Review of attachment 8802726 [details] [diff] [review]:
-----------------------------------------------------------------
I decided to have a 1 MB threshold for now. We can lower it once we are confident there are no bugs related to this new feature.
Comment 62 • 8 years ago (Assignee)
Comment on attachment 8802727 [details] [diff] [review]
Part 6: Core changes for storing structured clone data outside of the database; r=asuth
Review of attachment 8802727 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/indexedDB/ActorsParent.cpp
@@ +25726,5 @@
> + if (mDataOverThreshold) {
> + // Higher 32 bits reserved for flags like compressed/uncompressed. For now,
> + // we don't compress externally stored structured clone data since snappy
> + // doesn't support streaming yet and we don't want to allocate another
> + // large memory buffer for that.
This comment is updated in the following patch for compression.
Comment 63 • 8 years ago (Assignee)
Comment on attachment 8802729 [details] [diff] [review]
Part 8: Disable externally stored structured clone data in file tests; r=asuth
Review of attachment 8802729 [details] [diff] [review]:
-----------------------------------------------------------------
We need this just in case someone runs all the tests with the threshold set to 0.
In that case all structured clones need to allocate a file id.
However, file tests have static assumptions about blob file ids, so file tests don't pass.
It would be really hard and messy to fix that.
Comment 64 • 8 years ago
Comment on attachment 8802722 [details] [diff] [review]
Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth
Review of attachment 8802722 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/indexedDB/ActorsParent.cpp
@@ +8162,5 @@
> };
>
> struct ObjectStoreAddOrPutRequestOp::StoredFileInfo final
> {
> + enum Type
note to other readers: Happily this duplicated enum type is removed in bug 1311466's part 1 patch.
Attachment #8802722 -
Flags: review?(bugmail) → review+
Updated • 8 years ago
Attachment #8802723 -
Flags: review?(bugmail) → review+
Updated • 8 years ago
Attachment #8802724 -
Flags: review?(bugmail) → review+
Updated • 8 years ago
Attachment #8802725 -
Flags: review?(bugmail) → review+
Comment 65 • 8 years ago
Comment on attachment 8802726 [details] [diff] [review]
Part 5: Add a data threshold pref; r=asuth
Review of attachment 8802726 [details] [diff] [review]:
-----------------------------------------------------------------
Looking forward to having the threshold lowered! I expect the target would want to be something in the neighborhood of PAGE_SIZE/20 (per https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid) since object_data is WITHOUT ROWID. Of course, depending on per-file overhead, it may make sense to adjust the value upwards and/or consider introducing an additional file type for sizes from PAGE_SIZE/20 up to 4*PAGE_SIZE where we use a dedicated blob table in the database. Until/unless we have a blob pool database, a value around PAGE_SIZE might be least bad.
Specifically, when referring to per-file overhead, I mean:
- file system overhead for opening a new file, especially:
- if we continue to use a flat directory structure and run the risk of file-systems not being happy about us accumulating tens of thousands of files in a single directory and encountering scale problems there.
- if we find that antivirus software is increasing latencies. We definitely would want to benchmark with a few AV solutions.
- disk waste/storage bloat if the file-system requires that at least one disk sector is used per file, etc.
::: dom/indexedDB/IndexedDatabaseManager.cpp
@@ +398,5 @@
> #endif
> Preferences::RegisterCallbackAndCall(LoggingModePrefChangedCallback,
> kPrefLoggingEnabled);
>
> + Preferences::RegisterCallbackAndCall(DataThresholdPrefChangedCallback,
note for other readers: I had a question about using the callback idiom versus adding more atomic-capable cache variants like AddAtomicUintVarCache but part 8 adds logic that upgrades -1 to INT32_MAX which seems like an appropriate use for continuing the existing pattern. (The AtomicBoolPrefChangedCallback could use a new atomic *Cache variant if the atomic bit is really necessary, but the churn doesn't seem worth it anytime soon.)
Attachment #8802726 -
Flags: review?(bugmail) → review+
Comment 66 • 8 years ago
Comment on attachment 8802727 [details] [diff] [review]
Part 6: Core changes for storing structured clone data outside of the database; r=asuth
Review of attachment 8802727 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/indexedDB/ActorsParent.cpp
@@ +4118,5 @@
> nsresult
> +UpgradeSchemaFrom24_0To25_0(mozIStorageConnection* aConnection)
> +{
> + // The only change between 24 and 25 was a different file_ids format,
> + // but it's backwards-compatible.
Note for other readers: The "data" column is created with BLOB type affinity which means no conversion/coercion is performed by SQLite, so there is no schema change to worry about.
@@ +9586,5 @@
> int32_t id;
> StructuredCloneFile::Type type;
>
> + bool isDot = false;
> + if ((isDot = aText.First() == '.') ||
Note for other readers: This lint-nit gets mooted in bug 1311466's part 6.
@@ +25748,3 @@
> size_t compressedLength = snappy::MaxCompressedLength(uncompressedLength);
>
> + // We don't have a smart pointer class that calls free, so we need to
UniquePtr supports destructor policies and UniquePtrExtensions.h defines a FreePolicy that can be used. I don't think this comment/code got mooted in subsequent patches, but I could be wrong.
::: dom/indexedDB/PBackgroundIDBSharedTypes.ipdlh
@@ +53,5 @@
> };
>
> union BlobOrMutableFile
> {
> + null_t;
Note for other readers: The ambiguity with NullableMutableFile is eliminated in bug 1311466's Part 2 which eliminates NullableMutableFile by extracting PBackgroundMutableFile up into the BlobOrMutableFile union.
Attachment #8802727 -
Flags: review?(bugmail) → review+
Updated • 8 years ago
Attachment #8802728 -
Flags: review?(bugmail) → review+
Comment 67 • 8 years ago
Comment on attachment 8802729 [details] [diff] [review]
Part 8: Disable externally stored structured clone data in file tests; r=asuth
Review of attachment 8802729 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/indexedDB/IndexedDatabaseManager.cpp
@@ +262,2 @@
> Preferences::GetInt(aPrefName, kDefaultDataThresholdBytes);
> + if (dataThresholdBytes == -1) {
This could benefit from a simple comment like your comment 63. Ex:
// The magic -1 is for use only by tests that depend on stable blob file id's.
Attachment #8802729 -
Flags: review?(bugmail) → review+
Updated • 8 years ago
Attachment #8802730 -
Flags: review?(bugmail) → review+
Comment 68 • 8 years ago (Assignee)
(In reply to Andrew Sutherland [:asuth] from comment #65)
> Comment on attachment 8802726 [details] [diff] [review]
> Part 5: Add a data threshold pref; r=asuth
>
> Review of attachment 8802726 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Looking forward to having the threshold lowered! I expect the target would
> want to be something in the neighborhood of PAGE_SIZE/20 (per
> https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid) since
> object_data is WITHOUT ROWID. Of course, depending on per-file overhead, it
> may make sense to adjust the value upwards and/or consider introducing an
> additional file type for sizes from PAGE/20 up to 4*PAGE_SIZE where we use a
> dedicated blob table in the database. Until/unless we have a blob pool
> database, a value around PAGE_SIZE might be least bad.
>
> Specifically, when referring to per-file overhead, I mean:
> - file system overhead for opening a new file, especially:
> - if we continue to use a flat directory structure and run the risk of
> file-systems not being happy about us accumulating tens of thousands of
> files in a single directory and encountering scale problems there.
> - if we find that antivirus software is increasing latencies. We
> definitely would want to benchmark with a few AV solutions.
> - disk waste/storage bloat if the file-system requires that at least one
> disk sector is used per file, etc.
Yeah, these are excellent points. However, implementing a blob pool is a complex task. Changing the flat directory structure doesn't look so complex and could be done sooner. But as you said, many small files can slow things down significantly, so we should be careful about lowering the threshold.
Anyway, I think we will be in much better shape with the 1 MB limit, since it removes the need for big flat buffers and the SQLite database won't be stressed as much.
Comment 69 • 8 years ago (Assignee)
I think these methods are expected to free the value even when an error occurs. The AdoptedBlobVariant does the job, but NS_ENSURE_ARG_MAX can trigger an early return (before the AdoptedBlobVariant is created).
Attachment #8803847 -
Flags: review?(mak77)
Assignee
Comment 70•8 years ago
(In reply to Andrew Sutherland [:asuth] from comment #66)
> @@ +25748,3 @@
> > size_t compressedLength = snappy::MaxCompressedLength(uncompressedLength);
> >
> > + // We don't have a smart pointer class that calls free, so we need to
>
> UniquePtr supports destructor policies and UniquePtrExtensions.h defines a
> FreePolicy that can be used. I don't think this comment/code got mooted in
> subsequent patches, but I could be wrong.
Ok, fixed. I also found a possible leak in mozStorage code. See patch Part 10.
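A standalone approximation of the UniquePtrExtensions.h pattern the review suggests (the deleter is modeled here with the standard library, not the actual Gecko header; `UniqueFreePtr` and `AllocateCompressionBuffer` are illustrative names):

```cpp
#include <cstdlib>
#include <memory>

// Modeled after mozilla::UniquePtr with the FreePolicy deleter from
// UniquePtrExtensions.h: a deleter struct that calls free(), so malloc'd
// buffers are released automatically instead of via manual free() calls
// on every exit path.
struct FreePolicy {
  void operator()(void* aPtr) const { std::free(aPtr); }
};

template <typename T>
using UniqueFreePtr = std::unique_ptr<T, FreePolicy>;

// Example: reserve a worst-case compression output buffer (as
// snappy::MaxCompressedLength would size it) with no explicit free()
// needed on any return path.
UniqueFreePtr<char> AllocateCompressionBuffer(std::size_t aMaxLength) {
  return UniqueFreePtr<char>(static_cast<char*>(std::malloc(aMaxLength)));
}
```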
Assignee
Comment 71•8 years ago
(In reply to Andrew Sutherland [:asuth] from comment #67)
> @@ +262,2 @@
> > Preferences::GetInt(aPrefName, kDefaultDataThresholdBytes);
> > + if (dataThresholdBytes == -1) {
>
> This could benefit from a simple comment like your comment 63. Ex:
> // The magic -1 is for use only by tests that depend on stable blob file
> id's.
Ok, added.
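The sentinel handling being commented can be sketched standalone (an assumption-laden illustration, not the actual Gecko Preferences code; in this sketch -1 disables external storage by making the threshold effectively infinite, and `ResolveDataThreshold` is a hypothetical name):

```cpp
#include <cstdint>

// Illustrative default mirroring the patch's kDefaultDataThresholdBytes;
// the real value is read via the Gecko pref system.
constexpr std::int32_t kDefaultDataThresholdBytes = 1024 * 1024;

// Interpret the pref value read for the data threshold.
// The magic -1 is for use only by tests that depend on stable blob file
// ids: it keeps all structured clone data in the database.
std::int32_t ResolveDataThreshold(std::int32_t aPrefValue) {
  if (aPrefValue == -1) {
    return INT32_MAX;  // never store structured clone data externally
  }
  return aPrefValue;
}
```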
Comment 72•8 years ago
Comment on attachment 8803847 [details] [diff] [review]
Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak
Review of attachment 8803847 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM. fwiw, Andrew can review any change to Storage as well (and I honestly think you should act as a peer as well!)
Attachment #8803847 -
Flags: review?(mak77) → review+
Assignee
Comment 73•8 years ago
(In reply to Marco Bonardo [::mak] from comment #72)
> Comment on attachment 8803847 [details] [diff] [review]
> Part 10: Fix a possible leak in
> BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak
>
> Review of attachment 8803847 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> LGTM. fwiw, Andrew can review any change to Storage as well (and I honestly
> think you should act as a peer as well!)
Oh, ok, will do next time :)
Assignee
Comment 74•8 years ago
Ok, this is ready, but I want to wait for bug 1311466 and land them together.
The other bug extends the format of the file_ids column too, so landing them together avoids adding another schema upgrade of IDB databases.
Assignee
Comment 75•8 years ago
Heads-up: Bug 1302036 just landed and increased the IDB schema version. I updated the appropriate patch for this bug to use the same schema version, since both land on the same day. If there are any backouts I'll fix things to get it right.
Comment 76•8 years ago
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/12213ad49dc0
Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e2166a89ee40
Part 2: Refactor file ids handling for better expandability; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/a1fcb8001d2a
Part 3: Implement an input stream wrapper around structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/97e29df357f7
Part 4: Update keys directly in the structured clone buffer; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/94f3aca1fe73
Part 5: Add a data threshold pref; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/34a3a5034620
Part 6: Core changes for storing structured clone data outside of the database; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/34b5720fa068
Part 7: Compress externally stored structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/9113736a76ce
Part 8: Disable externally stored structured clone data in file tests; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e2c6afa8a5e
Part 9: Add a new test for externally stored structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1c649956724
Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak
Assignee
Comment 77•8 years ago
Attachment #8804440 -
Flags: review?(bugmail)
Updated•8 years ago
Attachment #8804440 -
Flags: review?(bugmail) → review+
Comment 78•8 years ago
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/97d5eb039b32
A follow-up fix. Pass correct buffer size to CopyFileData(); r=asuth
Comment 79•8 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/12213ad49dc0
https://hg.mozilla.org/mozilla-central/rev/e2166a89ee40
https://hg.mozilla.org/mozilla-central/rev/a1fcb8001d2a
https://hg.mozilla.org/mozilla-central/rev/97e29df357f7
https://hg.mozilla.org/mozilla-central/rev/94f3aca1fe73
https://hg.mozilla.org/mozilla-central/rev/34a3a5034620
https://hg.mozilla.org/mozilla-central/rev/34b5720fa068
https://hg.mozilla.org/mozilla-central/rev/9113736a76ce
https://hg.mozilla.org/mozilla-central/rev/1e2c6afa8a5e
https://hg.mozilla.org/mozilla-central/rev/e1c649956724
https://hg.mozilla.org/mozilla-central/rev/97d5eb039b32
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52