Closed
Bug 1250747
Opened 9 years ago
Closed 2 years ago
Raise per-post item limit from 100 to something more reasonable
Categories
(Cloud Services Graveyard :: Server: Sync, defect)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: rnewman, Assigned: autrilla)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [sync-data-integrity])
Attachments
(1 file)
We introduced this in Bug 692372. I don't think we need it: the backend splits up requests and runs them in chunks, and we limit overall size.
I'd like to put as many BSOs in a single request as possible -- for bookmarks that's over 1500. That helps us avoid partial writes (Bug 1250189).
Flags: needinfo?(bobm)
Reporter
Comment 1•9 years ago
I propose a sanity limit more like 4000. At 1MB for the whole request, that allows about 250 bytes per record.
Reporter
Updated•9 years ago
Comment 2•9 years ago
Do you still want/need this even if Bug 1250189 goes ahead?
> the backend splits up requests and runs them in chunks
No it doesn't, and even if we did, we'd have to write out all chunks in a single db transaction in order to preserve X-I-U-S and related timestamp semantics.
It's probably safe to go slightly bigger than 100 though.
Comment 3•9 years ago
(In reply to Ryan Kelly [:rfkelly] from comment #2)
> It's probably safe to go slightly bigger than 100 though.
This sounds good to me, pending some load tests specific to larger batch sizes. Presumably we'd just need to tweak this: https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py#L183
Moving it out to the config might make it easier for QA.
Flags: needinfo?(bobm)
Reporter
Comment 4•9 years ago
(In reply to Bob Micheletto [:bobm] from comment #3)
> This sounds good to me pending some larger batch size specific load tests.
Note in particular that clients have a fixed number of records to upload -- if the per-request limit is 100, a client with 1,500 bookmarks sends fifteen requests; if the limit is >=1500, it sends one. We have to make sure that load tests take that into account.
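A minimal sketch of that arithmetic (illustrative only, not actual client code):

def chunk(records, limit):
    # Split a fixed set of records into per-POST batches of at most `limit` items.
    return [records[i:i + limit] for i in range(0, len(records), limit)]

bookmarks = ["bso-%d" % i for i in range(1500)]
print(len(chunk(bookmarks, 100)))   # 15 POSTs at the current limit
print(len(chunk(bookmarks, 1500)))  # 1 POST once the limit is >= 1500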
(In reply to Ryan Kelly [:rfkelly] from comment #2)
> > the backend splits up requests and runs them in chunks
>
> No it doesn't…
My bad, I was going off the comments in the bug where the limit was introduced:
"So, if you submit 500 valid items, we batch them up into 5 batches of 100 and do those queries. "
"note also that you can configure the batch_size and batch_max_count independently. So e.g. we could process a maximum of 500 objects but save them in batches of 100"
Perhaps I misunderstood!
> … we'd have to write out all chunks in a
> single db transaction in order to preserve X-I-U-S and related timestamp
> semantics.
>
> It's probably safe to go slightly bigger than 100 though.
Yeah. We regularly write out chunks of (1000 / number of variables per record) in SQLite on cellphone hardware, limited only by the MAX_VARIABLE_NUMBER built into SQLite, so I hope we can do better on big AWS iron!
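For illustration, the client-side chunk size works out roughly like this (a sketch; the 999 default for SQLITE_MAX_VARIABLE_NUMBER and the per-record column count are assumptions):

MAX_VARIABLES = 999       # assumed default SQLITE_MAX_VARIABLE_NUMBER in the builds we ship
VARIABLES_PER_RECORD = 5  # hypothetical number of bound variables per record

# Each multi-row INSERT can carry at most this many records:
records_per_insert = MAX_VARIABLES // VARIABLES_PER_RECORD
print(records_per_insert)  # 199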
Comment 5•9 years ago
> "So, if you submit 500 valid items, we batch them up into 5 batches of 100 and do those queries. "
> "note also that you can configure the batch_size and batch_max_count independently.
> So e.g. we could process a maximum of 500 objects but save them in batches of 100"
>
> Perhaps I misunderstood!
FTR, those comments apply to the old sync, where we'd happily give items from the same upload different timestamps. For Sync 1.5 we use a single transaction to ensure consistency/monotonicity of the timestamps.
Comment 6•9 years ago
Here's a quick patch that makes this limit configurable.
Attachment #8725523 - Flags: review?(dcoates)
Attachment #8725523 - Flags: feedback?(bobm)
Comment 7•9 years ago
Comment on attachment 8725523 [details]
make per-batch item limit configurable
LGTM
Attachment #8725523 - Flags: feedback?(bobm) → feedback+
Comment 8•9 years ago
:kthiessen, this makes the batch size configurable in load tests for Sync: https://github.com/mozilla-services/server-syncstorage/blob/configurable-max-batch-size/syncstorage/tests/tests.ini#L18
We'll need that for testing Bug 1250189 as well as this bug.
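For reference, the kind of setting being described would look something like this in the ini file (section and key name are illustrative; check the linked branch for the actual names):

[storage]
# Hypothetical key -- the patch makes the per-POST item limit configurable.
max_post_records = 1000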
Updated•9 years ago
Attachment #8725523 - Flags: review?(dcoates) → review+
Comment 9•9 years ago
Committed at https://github.com/mozilla-services/server-syncstorage/commit/d8f87bbbaa6d07b341265ca865d6d386b26e35f5
Note that the default remains 100. I think we should work on QAing it up towards a larger size, which I'll leave this bug open to track. We should also use the /info/configuration endpoint from Bug 1250189 to expose the server's setting to clients, so they can take advantage of limit increases if and when we make them.
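A rough sketch of how a client could consume that endpoint, falling back to the current default of 100 (the exact field name is an assumption based on the discussion here):

import requests  # assumption: using the requests library for brevity

def max_post_records(storage_endpoint, auth):
    # Ask the server for its advertised limits; keep today's behaviour if absent.
    try:
        resp = requests.get(storage_endpoint + "/info/configuration", auth=auth, timeout=30)
        resp.raise_for_status()
        return resp.json().get("max_post_records", 100)
    except requests.RequestException:
        return 100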
Reporter
Comment 10•9 years ago
Filed Bug 1254402 to adapt on iOS.
Updated•8 years ago
Whiteboard: [sync-data-integrity]
Updated•8 years ago
Priority: P3 → --
Reporter
Comment 11•8 years ago
Ryan, what's the status on rolling this out, complete with configurability?
My iPhone wants to upload 80,000 history records, which at this rate will take 800 POSTs… less than ideal.
Flags: needinfo?(rfkelly)
Comment 12•8 years ago
This flag, and the corresponding /info/configuration entry, should now be live on all machines in production. The next step will be to QA an increase in the limit, and trying your proposed value of 4000 seems like as good a place to start as any.
Bob/Karl have you guys done any experiments with increasing the batch size already?
Flags: needinfo?(rfkelly)
Updated•8 years ago
Flags: needinfo?(kthiessen)
Comment 13•8 years ago
We haven't yet, but I know it's on the list of things to do. Is this something that we can do early next year, or does it need to be done ASAP?
Flags: needinfo?(kthiessen)
QA Contact: kthiessen
Reporter
Comment 14•8 years ago
Client code doesn't know to look for it yet, so it's not a rush job. (But now we know we can build the client bits!)
Comment 16•8 years ago
IIUC, this is specifically talking about per-payload limits. The short answer is "yes, please!". Android clients have been aware of the limits defined in info/configuration since Firefox 51, and have been using them for both batching and non-batching uploads (if the endpoint isn't there, the defaults defined in [0] apply).
From a mobile client's perspective, minimizing the number of requests it needs to make is very important, both for data consistency and for resource efficiency.
As for specific numbers... until the limits are lifted in Bug 1291821, Android clients are capped at 5000 bookmark records. Can we get close to that with the per-payload limit (maxPost*), and, once the limits are lifted, go much higher than that with the batch limit (maxTotal*)?
[0] https://dxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/sync/InfoConfiguration.java?q=InfoConfiguration.java&redirect_type=direct#41-45
Flags: needinfo?(gkruglov)
Reporter
Comment 17•8 years ago
Ryan, Bob, Karl: can we get this into production?
Flags: needinfo?(rfkelly)
Flags: needinfo?(kthiessen)
Flags: needinfo?(bobm)
Comment 18•8 years ago
I'll defer to Bob, but I see no objection in principle.
Flags: needinfo?(kthiessen)
Comment 19•8 years ago
(In reply to Karl Thiessen [:kthiessen] from comment #18)
> I'll defer to Bob, but I see no objection in principle
Sounds good. I'll set a stage node to an unlimited batch size (or as big as it'll go). Let's run a series of load tests that are identical except for increasing (doubling?) batch sizes, per comment 9, and then compare the results. We should focus less on requests per second than on records per second, which I think we'll have to guesstimate for now. We'll also want to keep an eye on slow MySQL queries and upstream query time.
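A back-of-the-envelope sketch of the comparison being proposed (the throughput numbers are placeholders, not measurements):

def records_per_second(posts_per_second, records_per_post):
    # Two runs can differ wildly in request rate while moving a similar volume of records.
    return posts_per_second * records_per_post

print(records_per_second(150, 100))   # e.g. 15,000 records/s at batch size 100
print(records_per_second(20, 1000))   # e.g. 20,000 records/s at batch size 1000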
Flags: needinfo?(bobm)
Comment 20•8 years ago
To clarify, are you going to be testing higher batch _and_ payload limits? Higher batch limits are particularly important in light of Bug 1336001: they're currently at 10k, and it would be great if we could at least triple that. Comment 9 talks about payload limits; those are currently at 100 and also need to go way up.
Flags: needinfo?(bobm)
Comment 21•8 years ago
Let's increase max_post_records to 1000. Most batch=true POSTs are less than 100KB [1]; I bet that's because they're hitting the max_post_records limit. Increasing it 10x should still keep requests under the 2MB limit while letting the client send more BSOs per request.
FWIW, about 88% of POST requests are under 100 BSOs and won't be affected by this change.
[1] batch creation requests
# NumSamples = 3943; Min = 16384.00; Max = 248148.00
# each ∎ represents a count of 21
16384 - 39560 [ 335]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (8.50%)
39560 - 62736 [ 1631]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (41.36%)
62736 - 85913 [ 1572]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (39.87%)
85913 - 109089 [ 262]: ∎∎∎∎∎∎∎∎∎∎∎∎ (6.64%)
109089 - 132266 [ 50]: ∎∎ (1.27%)
132266 - 155442 [ 81]: ∎∎∎ (2.05%)
155442 - 178618 [ 5]: (0.13%)
178618 - 201795 [ 2]: (0.05%)
201795 - 224971 [ 1]: (0.03%)
224971 - 248148 [ 4]: (0.10%)
Reporter
Comment 22•8 years ago
(In reply to Benson Wong [:mostlygeek] from comment #21)
> Let's increase max_post_records to 1000. Most batch=true POSTs are less than
> 100KB [1]. I bet it is because they are hitting the max_post_records limit.
> Increasing it 10x should still be less than the 2MB request limit while
> letting the client send BSOs per request.
Is there any reason to significantly limit the number of records per POST?
Clients respect both payload size and record count limits, so there should be no reason to limit the number of records artificially to approximate a payload size limit.
(And at a 2MB request limit and 250 bytes per record, we could safely raise the POST limit to 8000. Even with the stats you mention -- 1KB per record max -- we could easily accept 2000 records per POST.)
The higher the limit, the better the chance we avoid using a batch at all, which presumably makes the storage server much happier -- if the limit is 2MB and 2000 records, the vast majority of users would be able to upload all of their bookmarks in a single request.
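The arithmetic behind those figures, spelled out (assuming the 2MB request limit mentioned above):

REQUEST_LIMIT_BYTES = 2 * 1024 * 1024  # the 2MB per-request limit

print(REQUEST_LIMIT_BYTES // 250)   # 8388 records at 250 bytes each, so a limit of 8000 is safe
print(REQUEST_LIMIT_BYTES // 1024)  # 2048 records at 1KB each, so 2000 per POST is comfortable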
Comment 23•8 years ago
(In reply to :Grisha Kruglov from comment #20)
> To clarify, are you going to be testing higher batch _and_ payload limits?
> Higher batch limits are particularly important in light of Bug 1336001.
> Currently they're at 10k, and it'll be great if we can triple that at least.
> Comment 9 talks about payload limits; those are currently at 100, which also
> needs to go way up.
I'm all for testing it with both.
Flags: needinfo?(bobm)
Comment 24•8 years ago
> Is there any reason to significantly limit the number of records per POST?
I don't think so; it's only that the server parses them all into memory at once. I don't think there are, e.g., any O(n^2) operations on the list of records, or other places where we might get into trouble by having too many of them at once.
Flags: needinfo?(rfkelly)
Comment 25•8 years ago
Any progress on bumping these limits further, testing or otherwise? I'd love to one day be able to put "atomic" somewhere into our uploaders' names ;-)
Flags: needinfo?(rfkelly)
Flags: needinfo?(bobm)
Comment 26•8 years ago
I'm not aware of any updates, but it also doesn't appear to be blocked on anything other than :bobm bandwidth. We may have to wait until the successful re-rollout of the batch upload stuff in order to try it in prod at all.
Flags: needinfo?(rfkelly)
Comment 27•7 years ago
Re-titling this bug to be just about the per-individual-POST item limit, and Bug 1378569 is about the overall limit on the number of items in a batch.
That said, I want to get batch upload API out and stable before we do any further fiddling with the config here, so I'm going to make that bug block this one.
Summary: Raise per-batch item limit from 100 to something more reasonable → Raise per-post item limit from 100 to something more reasonable
Comment 28•7 years ago
(In reply to Ryan Kelly [:rfkelly] from comment #26)
> I'm not aware of any updates, but it also doesn't appear to be blocked on
> anything other than :bobm bandwidth. We may have to wait until the
> successful re-rollout of the batch upload stuff in order to try it in prod
> at all.
Should we re-visit this?
Flags: needinfo?(bobm)
Comment 29•7 years ago
I think we are finally free to experiment with raising this limit :-)
Comment 30•7 years ago
And in fact, this is the final blocking bug for Bug 1250189
Comment 31•6 years ago
I'm taking this out of Bug 1250189's dependency tree, since the larger batch-upload effort has completed. But this one still feels worth doing.
No longer blocks: 1250189
Comment 32•6 years ago
:autrilla I think you should do this on the new server type.
Flags: needinfo?(autrilla)
Assignee
Updated•6 years ago
Assignee: nobody → autrilla
Flags: needinfo?(autrilla)
Updated•6 years ago
QA Contact: kthiessen → nobody
Comment 33•2 years ago
This is no longer valid: we've moved to Spanner, which originally limited our max_total_records to 1664. Spanner recently increased its internal limit, so we can probably raise this 1664 value roughly twofold. This is covered here: https://mozilla-hub.atlassian.net/browse/SYNC-3433
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INVALID
Updated•2 years ago
Product: Cloud Services → Cloud Services Graveyard