Closed Bug 730142 Opened 13 years ago Closed 8 years ago

Download batching

Categories

(Firefox for Android Graveyard :: Android Sync, defect, P1)

All
Android
defect

Tracking

(blocking-fennec1.0 -, firefox52 fixed)

RESOLVED FIXED
Firefox 52
Tracking Status
blocking-fennec1.0 --- -
firefox52 --- fixed

People

(Reporter: rnewman, Assigned: dlim, Mentored)

References

Details

(Whiteboard: [MobileAS] )

Attachments

(2 files, 2 obsolete files)

Looks like Fennec isn't going to impose tiny limits on history, so we can download more and provide a richer experience. But we also need to bear in mind resource constraints, and unlike XUL Fennec we also have a hard limit of 5 minutes for a Sync. My proposal is something like this: * First sync? Do an IDs-only fetch. This'll return tons and tons of IDs. (Limit to 5000?) Persist these. * Do a limited sortindex-ordered fetch of full records. This gives us something to work with. (This is what we do now.) * On every subsequent sync, fetch another batch of records by ID until the collection of backlog IDs is empty. We can even use a timer of our own to figure out how long we have left, and fetch a few more batches if we have time. This gives us a snapshot up-front, and then gradually fills in history until we've got a full copy, but without ever prolonging a sync past a threshold.
See also Bug 721898.
Priority: -- → P2
Blocks: 722379
Blocks: 713939
Priority: P2 → P1
I fear the experience for users on older phones and a large set of history to sync from desktop, will get a lousy experience on first sync. Can affect performance of app, battery heating, and high CPU usage. nom'ing for mobile triage.
blocking-fennec1.0: --- → ?
Priority: P1 → P2
I may have mid-air'd with rnewman. After group discussion, i believe he was bumping priority to P1.
Priority: P2 → P1
blocking-fennec1.0: ? → -
sync retriage: demoted unless teh beta period proves this is really important
Priority: P1 → P2
Status: NEW → ASSIGNED
Truth in advertising: I am not actively working on these tickets.
Assignee: nalexander → nobody
Blocks: 814801
While I'm here: depending on the collection (particularly history) we need a channel for Fennec to warn us that we already have enough history to trigger pruning, so we should download no more. That doesn't apply to structure-critical collections like bookmarks.
Status: ASSIGNED → NEW
Product: Mozilla Services → Android Background Services
Priority: P2 → --
Hardware: ARM → All
Blocks: 1291821
No longer blocks: 814801
Assignee: nobody → dlim
Whiteboard: [MobileAS s1.2]
Status: NEW → ASSIGNED
Attached patch bug-730142.patch (deleted) — Splinter Review
First run at this, hoping to get a quick feedback to see if I'm on the right track before going further.
(In reply to Daniel Lim [:dlim] from comment #8) > Created attachment 8779141 [details] [diff] [review] > bug-730142.patch > > First run at this, hoping to get a quick feedback to see if I'm on the right > track before going further.
Flags: needinfo?(rnewman)
Flags: needinfo?(gkruglov)
Comment on attachment 8779141 [details] [diff] [review] bug-730142.patch Review of attachment 8779141 [details] [diff] [review]: ----------------------------------------------------------------- I think this is generally on the right track, although you're missing some pieces (xius), and downloading logic is leaking throughout the delegate chain. Additionally, similar to iOS, see if it's feasible to keep track of the last-modified timestamp of the latest-fetched record per-session (if we're fetching records in DESC), and use that as a "since" parameter when appropriate (ensuring that at worst we'll fetch records that we already have). Right now we're only keeping track of the "last-synced" timestamp. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java @@ +121,5 @@ > + */ > + public int weaveOffset() { > + int offset; > + try { > + offset = this.getIntegerHeader("x-weave-next-offset"); From the docs (https://docs.services.mozilla.com/storage/apis-1.5.html): "The value of this header will always be a string of characters from the urlsafe-base64 alphabet. The specific contents of the string are an implementation detail of the server, so clients should treat it as an opaque token." ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/RepositorySession.java @@ +85,5 @@ > } > > public abstract void guidsSince(long timestamp, RepositorySessionGuidsSinceDelegate delegate); > public abstract void fetchSince(long timestamp, RepositorySessionFetchRecordsDelegate delegate); > + public abstract void fetchSince(long timestamp, int offset, RepositorySessionFetchRecordsDelegate delegate); This is not necessary. "offset" logic should be contained within the downloader. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java @@ +123,4 @@ > public Server11RepositorySession(Repository repository) { > super(repository); > serverRepository = (Server11Repository) repository; > + this.downloader = new BatchingDownloader(serverRepository, pending); "pending" set of requests is currently used for fetch tasks only, so your downloader should own it entirely and expose an abort method for cleanup. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java @@ +21,5 @@ > + > +/** > + * Download items in batch from server > + */ > +public class BatchingDownloader extends RecordDownloader{ This needs to own a X-If-Unmodified-Since value, and pass it along in headers with your consequent requests. Otherwise we don't know if remote data has been modified while we're fetching different pages of it! @@ +137,5 @@ > + boolean hasMoreItemsToFetch = offset != -1; > + > + if (hasMoreItemsToFetch) { > + Logger.debug(LOG_TAG, "There are more items, offset at " + offset); > + delegate.onFetchMoreRecord(offset); Keep offset logic contained to the downloader, no need to leak it out. @@ +153,5 @@ > + } > + > + @Override > + public void handleRequestFailure(SyncStorageResponse response) { > + // TODO: ensure that delegate methods don't get called more than once. Make sure to handle 412 concurrent modification errors here explicitly once you're passing along x-i-u-s header. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/RecordDownloader.java @@ +8,5 @@ > +import org.mozilla.gecko.sync.net.SyncStorageResponse; > +import org.mozilla.gecko.sync.repositories.Server11Repository; > +import org.mozilla.gecko.sync.repositories.delegates.RepositorySessionFetchRecordsDelegate; > + > +public abstract class RecordDownloader { Same comment as for delegate - roll into one download class unless there's a good reason not to. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/RecordDownloaderDelegate.java @@ +7,5 @@ > +import org.mozilla.gecko.sync.net.WBOCollectionRequestDelegate; > + > +import java.util.Set; > + > +public abstract class RecordDownloaderDelegate extends WBOCollectionRequestDelegate { Unless you're going to base other work on this, just roll everything into one delegate class. @@ +20,5 @@ > + > + public void setRequest(SyncStorageCollectionRequest request) { > + this.request = request; > + } > + void removeRequestFromPending() { Your downloader is probably better suited as an owner of any cleanup logic that needs to happen.
Attachment #8779141 - Flags: feedback+
Flags: needinfo?(gkruglov)
Flags: needinfo?(rnewman)
Attachment #8780287 - Flags: review?(rnewman)
Comment on attachment 8780288 [details] [WIP] Bug 730142 - Download batching. Please re-flag when Grisha's satisfied.
Attachment #8780288 - Flags: review?(rnewman)
Assignee: daniellimpersonal → gkruglov
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review69528 Let's combine the two commits into one; I don't think refactor as a separate commit buys us enough clarity in this case to bother. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/MozResponse.java:152 (Diff revision 1) > > public MozResponse(HttpResponse res) { > response = res; > } > > - private String getNonMissingHeader(String h) { > + String getNonMissingHeader(String h) { protected? ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java:137 (Diff revision 1) > } > > @Override > public void fetchAll(RepositorySessionFetchRecordsDelegate delegate) { > - this.fetchSince(-1, delegate); > + long timestamp = -1; > + this.fetchSince(timestamp, delegate); nit: just passing -1 will do. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:23 (Diff revision 1) > +import java.util.Collections; > +import java.util.HashSet; > +import java.util.Set; > + > +/** > + * Download items in batch from server Technically incorrect for this Pre commit ;) I think in this case you might as well just combine the two commits into one. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:24 (Diff revision 1) > +import java.util.HashSet; > +import java.util.Set; > + > +/** > + * Download items in batch from server > + */ Once you combine them, make sure you describe what download batching is, mechanics of it, etc - feel free to be verbose and detailed, but aim for clarity. Imagine you're reading this a year from now and it's 3am - the comment should still make sense and logic easy to follow. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:43 (Diff revision 1) > + this.repository = repository; > + this.repositorySession = repositorySession; > + } > + > + public void setSessionFetchRecordsDelegate(RepositorySessionFetchRecordsDelegate delegate) { > + this.sessionFetchRecordsDelegate = delegate; I don't think this is going to do what you want. Imagine multiple consumers running bunch of fetch calls with different delegates - you'll have unpredictable results. Instead, do what the class used to do - pass the RepositorySessionFetchRecordsDelegate into your new BatchingDownloaderDelegate, and do more work there. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:63 (Diff revision 1) > + * @return Normalized timestamp in milliseconds. > + */ > + public static long getNormalizedTimestamp(SyncStorageResponse response) { > + long normalizedTimestamp = -1; > + try { > + normalizedTimestamp = response.normalizedWeaveTimestamp(); Let's use X-Last-Modified here instead. Logically, we want this timestmap to represent "when was the collection modified last". X-Weave-Timestamp is server's current time, so case of a GET they're the same, but semantically X-Last-Modified makes much more sense and is explicitely what we want. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:74 (Diff revision 1) > + normalizedTimestamp = System.currentTimeMillis(); > + } > + return normalizedTimestamp; > + } > + > + protected String flattenIDs(String[] guids) { could be a static method; could have tests... ;-) ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:100 (Diff revision 1) > + String ids, > + BatchingDownloaderDelegate delegate) > + throws URISyntaxException { > + > + URI collectionURI = repository.collectionURI(full, newer, limit, sort, ids); > + SyncStorageCollectionRequest request = new SyncStorageCollectionRequest(collectionURI); I'd rather you pass the request object down into the delegate, and have it pass it back up to any success/failure methods for clean up and `this.pending` bookkeeping if necessary. The less global state like this we have the better. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:122 (Diff revision 1) > + } > + > + > + public void fetchSince(long timestamp) { > + try { > + long limit = repository.getDefaultFetchLimit(); Don't bother with these variables. IDE will tell you what method parameters are if you just hover over the said method ;-) Same comment for other fetch methods. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:126 (Diff revision 1) > + try { > + long limit = repository.getDefaultFetchLimit(); > + String sort = repository.getDefaultSort(); > + boolean full = true; > + String ids = null; > + this.fetchWithParameters(timestamp, limit, full, sort, ids, new BatchingDownloaderDelegate(this)); This might throw, so just wrap this in try/catch. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:133 (Diff revision 1) > + this.sessionFetchRecordsDelegate.onFetchFailed(e, null); > + } > + } > + > + public void fetch(String[] guids) { > + // TODO: watch out for URL length limits! File a follow up bug, and/or explain in a comment why this could be a problem, and should we worry about it. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:140 (Diff revision 1) > + long timestamp = -1; > + long limit = -1; > + boolean full = true; > + String sort = "index"; > + String ids = flattenIDs(guids); > + this.fetchWithParameters(timestamp, limit, full, sort, ids, new BatchingDownloaderDelegate(this)); Only wrap in try/catch what you need (if feasible). ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:192 (Diff revision 1) > + } finally { > + workTracker.decrementOutstanding(); > + } > + } > + > + private void removeRequestFromPending() { This should die if you make the request changes. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:33 (Diff revision 1) > + return this.downloader.getServerRepository().getAuthHeaderProvider(); > + } > + > + @Override > + public String ifUnmodifiedSince() { > + return null; Let's not do this; we can use collection's last-modified timestamp that we have available via the repository, so let's use that to protect ourselves from concurrent modifications by other clients.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review69550 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:33 (Diff revision 1) > + return this.downloader.getServerRepository().getAuthHeaderProvider(); > + } > + > + @Override > + public String ifUnmodifiedSince() { > + return null; Ah, nevermind - I'm confusing downloading/uploading. Passing a your collection's L-M timestamp as a X-I-U-S header is akin to saying you don't want any new items :-0 However, you want to ensure that once you start downloading things in a batch, the L-M header doesn't change.
Comment on attachment 8780288 [details] [WIP] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70986/#review69558 Getting close! ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:114 (Diff revision 1) > public String weaveAlert() { > return this.getNonMissingHeader("x-weave-alert"); > } > + > + /** > + * The offset parameter returned from the server as urlsafe-base64 alphabet docs say it best: `This header may be sent back with multi-record responses where the request included a limit parameter. Its presence indicates that the number of available records exceeded the given limit. The value from this header can be passed back in the offset parameter to retrieve additional records. The value of this header will always be a string of characters from the urlsafe-base64 alphabet. The specific contents of the string are an implementation detail of the server, so clients should treat it as an opaque token. ` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:116 (Diff revision 1) > } > + > + /** > + * The offset parameter returned from the server as urlsafe-base64 alphabet > + * if there was a limit in request and there are more items to be fetched > + * * @return the offset parameter formatting ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:123 (Diff revision 1) > + public String weaveOffset() { > + return this.getNonMissingHeader("x-weave-next-offset"); > + } > + > + /** > + * The last modified time for a response returned from the server as decimal timestamp string Docs are clearer: `This header gives the last-modified time of the target resource as seen during processing of the request, and will be included in all success responses (200, 201, 204). When given in response to a write request, this will be equal to the server’s current time and to the new last-modified time of any BSOs created or changed by the request. It is similar to the standard HTTP Last-Modified header, but the value is a decimal timestamp rather than a HTTP-format date.` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:36 (Diff revision 1) > // So that we can clean up. > private SyncStorageCollectionRequest request; > private final DelayedWorkTracker workTracker = new DelayedWorkTracker(); > //Used to track outstanding requests, so that we can abort them as needed. > private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > + private long newerTimestamp; I think we won't use the same instance of a downloader to process multiple simultaneous fetch requests with different parameters. If that's the case, this is probably OK. Otherwise, this won't work. Please double check this. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:37 (Diff revision 1) > private SyncStorageCollectionRequest request; > private final DelayedWorkTracker workTracker = new DelayedWorkTracker(); > //Used to track outstanding requests, so that we can abort them as needed. > private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > + private long newerTimestamp; > + private String lastModifiedTimestamp; Will these be accessed by more than one thread? ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:39 (Diff revision 1) > //Used to track outstanding requests, so that we can abort them as needed. > private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > + private long newerTimestamp; > + private String lastModifiedTimestamp; > > public BatchingDownloader(final Server11Repository repository, final Server11RepositorySession repositorySession) { This file along with your new delegate file need to be added to `mobile/android/base/android-services.mozbuild` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:103 (Diff revision 1) > + String offset, > BatchingDownloaderDelegate delegate) > throws URISyntaxException { > > - URI collectionURI = repository.collectionURI(full, newer, limit, sort, ids); > + URI collectionURI = repository.collectionURI(full, newer, limit, sort, ids, offset); > + System.out.println(LOG_TAG + collectionURI.toString()); Use Logger ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:129 (Diff revision 1) > } > > > public void fetchSince(long timestamp) { > + String offset = null; > + this.fetchSince(timestamp, offset); Just do `this.fetchSince(timestamp, null);` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:166 (Diff revision 1) > } > > public void onFetchCompleted(SyncStorageResponse response) { > removeRequestFromPending(); > > + this.lastModifiedTimestamp = response.lastModifiedTimestamp(); We expect that once we obtain Last-Modified, it will not change throughout our download requests. Yes, server should return 412 if we're passing Last-Modified along in XIUS, but you could also throw here if timestamps don't match for some reason. At that point I'd consider a batching download to be a failure. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:173 (Diff revision 1) > + > + String offset = response.weaveOffset(); > + if (offset != null) { > + // TODO: we want the last item's modified time instead of newertimestamp, > + // can't do that right now because the response here is encrypted, > + // will have to be passed in when buffer work is done This comment is not correct. From the perspective of the downloader, you want the "since" and limit parameters to remain the same, while your offset changes letting you page through data. Last item's last-modified comes into play outside of the downloader; if we've paged through a bunch of data, but not all of it (network dropped, never got to offset == null), we can imagine restaring from where we left off. Also, it's worth noting that this only makes sense if we're receiving data sorted by last-modified. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:174 (Diff revision 1) > + String offset = response.weaveOffset(); > + if (offset != null) { > + // TODO: we want the last item's modified time instead of newertimestamp, > + // can't do that right now because the response here is encrypted, > + // will have to be passed in when buffer work is done > + this.fetchSince(this.newerTimestamp, offset); Prefer to return here, and un-indent below.
Attachment #8780288 - Attachment is obsolete: true
Attachment #8780288 - Flags: review?(gkruglov)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review69528 > Let's use X-Last-Modified here instead. > > Logically, we want this timestmap to represent "when was the collection modified last". X-Weave-Timestamp is server's current time, so case of a GET they're the same, but semantically X-Last-Modified makes much more sense and is explicitely what we want. Updated to use X-Last-Modified that falls back to X-Weave-Timestamp when not present > File a follow up bug, and/or explain in a comment why this could be a problem, and should we worry about it. Belongs to server11repositorysession, removed.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review69550 > Ah, nevermind - I'm confusing downloading/uploading. Passing a your collection's L-M timestamp as a X-I-U-S header is akin to saying you don't want any new items :-0 > > However, you want to ensure that once you start downloading things in a batch, the L-M header doesn't change. My understanding is that passing XIUS with last modified time will return 412 code if there were concurrent modification before the download is complete. Also, added a check to compare current last modified time and previous, and abort when detect the time has changed.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review70360 Getting closer! ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:133 (Diff revision 2) > + * When given in response to a write request, this will be equal to the server’s current time and to the new last-modified time of any BSOs created or changed by the request. > + * It is similar to the standard HTTP Last-Modified header, but the value is a decimal timestamp rather than a HTTP-format date > + * > + * @return the last modified header in milliseconds > + */ > + public long lastModifiedTimestamp() { You'll need to rebase once atomic uploads lands. See relevant changes in https://reviewboard.mozilla.org/r/68122/diff/6#3 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java:107 (Diff revision 2) > return; > } > } > } > > + private BatchingDownloader downloader; final ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:23 (Diff revision 2) > +import java.util.Collections; > +import java.util.HashSet; > +import java.util.Set; > + > +/** > + * Download items of collections in batch from server using offset token if it is present in the response header Good start, but please use grammatically correct sentences, and be clearer in your description. Feel free to base it on the sync docs (https://docs.services.mozilla.com/storage/apis-1.5.html#general-info), but it also needs to stand on its own as a solid reference for the implementation that follows. For an example of this, see commentary for the BatchingUploader: https://reviewboard.mozilla.org/r/68122/diff/6#8 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:30 (Diff revision 2) > + * If detected the collection has been modified and the batch isn't completed, abort batch download > + */ > +public class BatchingDownloader { > + public static final String LOG_TAG = "BatchingDownloader"; > + > + protected Server11Repository repository; final ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:31 (Diff revision 2) > + */ > +public class BatchingDownloader { > + public static final String LOG_TAG = "BatchingDownloader"; > + > + protected Server11Repository repository; > + private Server11RepositorySession repositorySession; final ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:39 (Diff revision 2) > + private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > + > + public BatchingDownloader(final Server11Repository repository, final Server11RepositorySession repositorySession) { > + this.repository = repository; > + this.repositorySession = repositorySession; > + Logger.shouldLogVerbose(LOG_TAG); Useful for debugging, but don't leave this in the actual code. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:57 (Diff revision 2) > + * > + * @param response > + * The <code>SyncStorageResponse</code> to interrogate. > + * @return Normalized timestamp in milliseconds. > + */ > + public static long getNormalizedTimestamp(SyncStorageResponse response) { We can assume that X-Last-Modified header will be present for all successful responses from the Sync server, so this can be simplified. Also, see changes in https://reviewboard.mozilla.org/r/68122/diff/6#8 - once that lands, you should be able to remove this entirely. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:94 (Diff revision 2) > + b.append(","); > + } > + return b.substring(0, b.length() - 1); > + } > + > + protected void fetchWithParameters(long newer, BatchingDownloader should "own" lastModified. You're always passing in "-1" as a Last-Modified value, which is a tell-sign that your caller doesn't know/care what that value is; Last-Modified will start off as null, and will be assigned when first GET request completes. Either BatchingDownloader (preferred) or your delegate should own logic for ensuring that Last-Modified doesn't changed during the download. L-M should always be either null (you don't know yet), or a valid value returned from the server. As your delegate is processing a success response, it should either validate L-M, or pass it off to BatchingDownloader for explicit validation. And you should write unit tests ensuring this stuff works as intended. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:108 (Diff revision 2) > + > + URI collectionURI = repository.collectionURI(full, newer, limit, sort, ids, offset); > + Logger.debug(LOG_TAG, collectionURI.toString()); > + > + SyncStorageCollectionRequest request = new SyncStorageCollectionRequest(collectionURI); > + new BatchingDownloaderDelegate(this, fetchRecordsDelegate, request, newer, lastModifiedTimestamp); I don't think you meant to commit this ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:158 (Diff revision 2) > + removeRequestFromPending(request); > + > + long lastModifiedTimestamp = response.lastModifiedTimestamp(); > + Logger.debug(LOG_TAG, "Last modified timestamp " + lastModifiedTimestamp); > + > + boolean hasTimestampChanged = prevLastModifiedTimestamp != -1 As noted above, this validation is likely misplaced. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:165 (Diff revision 2) > + if (hasTimestampChanged) { > + this.abort(); > + return; > + } > + > + String offset = response.weaveOffset(); While this works, you might elect to keep offset around as a state variable for the duration of the download, and perform sanity checks around its lifetime as your requests succeed. e.g. it starts off as null; if you're dealing with a trivial case (everything fits into one request), it'll be null on the response. otherwise, it'll be non-null, non-empty token for all responses but the very last one, at which point it'll be null again. But again, this works and is probably enough. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:219 (Diff revision 2) > + this.pending.remove(request); > + } > + > + public void abort() { > + this.repositorySession.abort(); > + for (SyncStorageCollectionRequest request : pending) { I believe you need to synchronize on `pending` as you're iterating over it, since it's a synchronizedSet ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:35 (Diff revision 2) > + this.downloader = downloader; > + this.fetchRecordsDelegate = fetchRecordsDelegate; > + this.request = request; > + this.request.delegate = this; > + this.newerTimestamp = newerTimestamp; > + this.lastModifiedTimestamp = lastModifiedTimestamp; Don't bother passing it in, expose it on the downloader; Probably want to make it volatile. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:45 (Diff revision 2) > + return this.downloader.getServerRepository().getAuthHeaderProvider(); > + } > + > + @Override > + public String ifUnmodifiedSince() { > + if (this.lastModifiedTimestamp == -1) { Should probably just get this value from the downloader. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:53 (Diff revision 2) > + return Long.toString(this.lastModifiedTimestamp); > + } > + > + @Override > + public void handleRequestSuccess(SyncStorageResponse response) { > + Logger.debug(LOG_TAG, "Fetch done."); Generally it's good to defensively validate any assumptions you're making about responses you get back. Main thing is validating presence of any headers we need. L-M, what else..?
Whiteboard: [MobileAS s1.2] → [MobileAS s1.3]
Priority: -- → P2
Whiteboard: [MobileAS s1.3] → [MobileAS]
Priority: P2 → P1
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review70360 New revision partially addresses issues from last review. Unit test is WIP and will be in next commit. > You'll need to rebase once atomic uploads lands. See relevant changes in https://reviewboard.mozilla.org/r/68122/diff/6#3 Kept this unchanged, will update once am able to rebase (ie when atomic upload lands) > We can assume that X-Last-Modified header will be present for all successful responses from the Sync server, so this can be simplified. > > Also, see changes in https://reviewboard.mozilla.org/r/68122/diff/6#8 - once that lands, you should be able to remove this entirely. Question: Doesn't old sync api (1.0, 1.1) not support X-Last-Modified header? Would it still be valid to assume the header will always be present for successful responses?
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review72504 Still need to start writing tests for this - I find it very helpful to do that as soon as possible. For inspiration, check out tests for the atomic uploader: https://reviewboard.mozilla.org/r/68122/diff/17#24 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java:88 (Diff revision 3) > - * Used to track outstanding requests, so that we can abort them as needed. > - */ > - private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > - > @Override > public void abort() { Don't need to @Override anymore! ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:75 (Diff revision 3) > + Logger.debug(LOG_TAG, collectionURI.toString()); > + > + SyncStorageCollectionRequest request = new SyncStorageCollectionRequest(collectionURI); > + request.delegate = new BatchingDownloaderDelegate(this, fetchRecordsDelegate, request, newer); > + > + synchronized (pending) { `pending` is already a synchronized set, so you don't need to synchronize here for operations like "add". You do need to synchronize on `pending` while iterating over it though - or, synchronize all access to it via some other method. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:124 (Diff revision 3) > + final SyncStorageCollectionRequest request, long newerTimestamp) { > + removeRequestFromPending(request); > + > + String currentLastModifiedTimestamp = response.lastModified(); > + Logger.debug(LOG_TAG, "Last modified timestamp " + currentLastModifiedTimestamp); > + Server should be guarding against this already [0], but I suppose it doesn't hurt to double check yourself. I'd add a clarifying comment explaining what you're doing here. [0] When we process our first request, we get back a X-Last-Modified header indicating when collection was modified last. We pass it to the server with every subsequent request (if we need to make more) in the X-If-Unmodified-Since, and server is supposed to ensure that this pre-condition is met, and fail our request with a 412 error code otherwise. So, if all of this happens, these checks should never fail. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:125 (Diff revision 3) > + removeRequestFromPending(request); > + > + String currentLastModifiedTimestamp = response.lastModified(); > + Logger.debug(LOG_TAG, "Last modified timestamp " + currentLastModifiedTimestamp); > + > + if (this.lastModified == null && currentLastModifiedTimestamp != null) { You've already checked if response.lastModified() isn't null in the delegate. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:129 (Diff revision 3) > + > + if (this.lastModified == null && currentLastModifiedTimestamp != null) { > + // first time seeing last modified timestamp > + this.lastModified = currentLastModifiedTimestamp; > + } else { > + boolean hasTimestampChanged = !Objects.equals(this.lastModified, currentLastModifiedTimestamp); Since you know `lastModified` isn't null, just do: ``` if (!lastModified.equals(currentLastModifiedTimestamp)) {...} ``` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:131 (Diff revision 3) > + // first time seeing last modified timestamp > + this.lastModified = currentLastModifiedTimestamp; > + } else { > + boolean hasTimestampChanged = !Objects.equals(this.lastModified, currentLastModifiedTimestamp); > + if (hasTimestampChanged) { > + this.abort(); We do want to know that this happened (server didn't do its job correctly? or did we somehow have two concurrent fetches, and remote collection was modified inbetween?), so at the very least lets log an error here. Investigate passing an exception into delegate's onFetchFailed method as well, perhaps? But we do want to abort at this point, since results will be potentially very inconsistent. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:150 (Diff revision 3) > + // When we're done processing other events, finish. > + this.workTracker.delayWorkItem(new Runnable() { > + @Override > + public void run() { > + Logger.debug(LOG_TAG, "Delayed onFetchCompleted running."); > + // TODO: verify number of returned records. Don't leave TODOs like this floating around. Either create a bug to track this if it's important, or address this, or leave a comment explaining anything of interest to maintainers. In this case you probably can just drop the TODO. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:185 (Diff revision 3) > + > + private void removeRequestFromPending(SyncStorageCollectionRequest request) { > + if (request == null) { > + return; > + } > + synchronized (pending) { Again, don't need to synchronize here. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:192 (Diff revision 3) > + } > + } > + > + public void abort() { > + this.repositorySession.abort(); > + synchronized (pending) { nit: stick with the same style, s/`pending`/`this.pending`. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:199 (Diff revision 3) > + request.abort(); > + } > + this.pending.clear(); > + } > + } > + I don't know what this is, but it looks like a menacing red bar in reviewboard! ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:54 (Diff revision 3) > + > + @Override > + public void handleRequestSuccess(SyncStorageResponse response) { > + Logger.debug(LOG_TAG, "Fetch done."); > + if (response.lastModified() == null) { > + this.downloader.onFetchFailed(new Exception("Missing last modified header from response"), this.fetchRecordsDelegate, this.request); Use IllegalStateException. Also, style nit: break this down into multiple lines (try to stay under/around 80 chars for improved readability, your IDE should have a handy dotted vertical line for this).
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review70360 > While this works, you might elect to keep offset around as a state variable for the duration of the download, and perform sanity checks around its lifetime as your requests succeed. > > e.g. it starts off as null; if you're dealing with a trivial case (everything fits into one request), it'll be null on the response. otherwise, it'll be non-null, non-empty token for all responses but the very last one, at which point it'll be null again. > > But again, this works and is probably enough. Doing this in the test instead.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review72504 > I don't know what this is, but it looks like a menacing red bar in reviewboard! It does!
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review73388 This is starting to look pretty good! Let's re-ping Richard and see what he thinks. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:42 (Diff revision 4) > + public BatchingDownloader(final Server11Repository repository, final Server11RepositorySession repositorySession) { > + this.repository = repository; > + this.repositorySession = repositorySession; > + } > + > + public static String flattenIDs(String[] guids) { @VisibleForTesting ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:62 (Diff revision 4) > + } > + > + protected void fetchWithParameters(SyncStorageCollectionRequest request, > + long newer, > + RepositorySessionFetchRecordsDelegate fetchRecordsDelegate) > + throws URISyntaxException { This method won't throw URISyntaxException anymore. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderTest.java:144 (Diff revision 4) > + String offsetHeader; > + offsetHeader = "100"; > + SyncStorageResponse response; > + response = makeSyncStorageResponse(200, lmHeader, offsetHeader); > + SyncStorageCollectionRequest request; > + request = new SyncStorageCollectionRequest(new URI("http://dummy.url")); collapse these ^^
Attachment #8780287 - Flags: review?(gkruglov) → review+
Attachment #8780287 - Flags: review?(rnewman)
Still to-do: - re-base and update as necessary once atomic uploads lands (sometime later today) - clean up any remaining nits - ensure tests are in good shape, I haven't looked at them too closely yet - write proper commit summary/details - address any rnewman's feedback
Assignee: gkruglov → daniellimpersonal
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review73714 There's a missing change or two here, I think. Previously, we would configure a constrained repo with a limit. That limit was how many records in total we'd download in a sync; e.g., HISTORY_REQUEST_LIMIT was set to a mere 250, FORM_HISTORY to a sanity limit of 5000. After this patch, that limit is how many records we'll download _in a single request_ -- because the download batch API returns that many results and a continuation token, and we just go right ahead and start the next batch -- and there is no limit to how many records we'll download in a sync! Both of those things need to be addressed. We need some kind of sanity limit for how many records to download in a request for a particular data type; that limit is probably not the same as the limit we're using today. Then we need an optional limit for how many records we'll download in a sync. And there might be another limit on top -- we can't do anything until we download _all_ of your bookmarks, so there's no limit per se, but we might not want to download 50,000 of them before we move on to trying to sync some of your history. (This is why on iOS we call this an "incremental" mirrorer.) Left as-is, we will download 250 history records, chunk by chunk, until we've got them all. My history collection has 62,802 records. Chances of successfully completing 252 HTTP requests in a row? None. Assuming 250 bytes each, that's 15MB. That doesn't bother iOS so much -- iOS also implements high-water-mark download batching, so if we get a concurrent modification error, we can pick up close to where we left off. And it implements a 'green light', so we'll never spend 40 minutes trying to sync. You're in a dangerous place if you haven't addressed those cases. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:129 (Diff revision 4) > + } > + > + /** > + * This header gives the last-modified time of the target resource as seen during processing of the request, and will be included in all success responses (200, 201, 204). > + * When given in response to a write request, this will be equal to the server’s current time and to the new last-modified time of any BSOs created or changed by the request. > + * It is similar to the standard HTTP Last-Modified header, but the value is a decimal timestamp rather than a HTTP-format date Nit: final '.'. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:138 (Diff revision 4) > + public String lastModified() { > + return this.getNonMissingHeader("x-last-modified"); > + } > + > + public long lastModifiedTimestamp() { > + if (this.lastModified() == null) { Don't call `lastModified` twice: ``` final String lm = this.lastModified(); if (lm == null) { return -1; } return Utils.decimalSecondsToMilliseconds(lm); ``` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11Repository.java:85 (Diff revision 4) > } > if (ids != null) { > params.add("ids=" + ids); // We trust these values. > } > - > + if (offset != null) { > + params.add("offset=" + offset); At some point -- either here, or in callers -- you need to URI-escape the offset string. Unlike 'sort' and 'ids', which are computed by us, 'offset' comes straight out of HTTP headers. Document what happens in the docstring for this method. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:25 (Diff revision 4) > +import java.util.Set; > + > +/** > + * Download items of collections in batch from server using offset token if it is present in the response header > + * to fetch the next batch, otherwise download is complete > + * If detected the collection has been modified and the batch isn't completed, abort batch download Periods and editing, plz. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:33 (Diff revision 4) > + public static final String LOG_TAG = "BatchingDownloader"; > + > + protected final Server11Repository repository; > + private final Server11RepositorySession repositorySession; > + private final DelayedWorkTracker workTracker = new DelayedWorkTracker(); > + //Used to track outstanding requests, so that we can abort them as needed. Nit: space after `//`. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:135 (Diff revision 4) > + // However, we also track this header in client side, and can defensively validate against it here as well. > + String currentLastModifiedTimestamp = response.lastModified(); > + Logger.debug(LOG_TAG, "Last modified timestamp " + currentLastModifiedTimestamp); > + > + if (this.lastModified == null) { > + // first time seeing last modified timestamp Nit: sentences with a capital letter and punctuation, throughout, please. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:138 (Diff revision 4) > + > + if (this.lastModified == null) { > + // first time seeing last modified timestamp > + this.lastModified = currentLastModifiedTimestamp; > + } else { > + if (!this.lastModified.equals(currentLastModifiedTimestamp)) { Rephrase this method with early returns. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:156 (Diff revision 4) > + } > + } > + > + String offset = response.weaveOffset(); > + if (offset != null) { > + this.fetchSince(newerTimestamp, offset, fetchRecordsDelegate); I'm pretty confused here. The batch-download API is supposed to work like this: * You make a request, which might include 'newer' or 'limit', but also includes a number of other parameters like 'ids' or 'sort'. * You get back an `X-Weave-Offset` header and an `X-Last-Modified` header. * You submit _an identical request_ except with `offset=` and `X-If-Unmodified-Since` set to the values from the response. Setting `this.lastModified` appears to result in `XIUS` thanks to the shower of delegate calls. `offset` is handled here. But you're calling `fetchSince` rather than `fetchWithParameters`, so we'll potentially be switching limit/sort/ids etc. for the second batch. What am I missing? ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:4 (Diff revision 4) > +/* This Source Code Form is subject to the terms of the Mozilla Public > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > +package org.mozilla.gecko.sync.repositories.downloaders; Newline before `package`. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:18 (Diff revision 4) > +import org.mozilla.gecko.sync.net.WBOCollectionRequestDelegate; > +import org.mozilla.gecko.sync.repositories.delegates.RepositorySessionFetchRecordsDelegate; > + > + > +/** > + * Delegate that gets passed into fetch methods to handle server response from fetch `.` ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:51 (Diff revision 4) > + } > + > + @Override > + public void handleRequestSuccess(SyncStorageResponse response) { > + Logger.debug(LOG_TAG, "Fetch done."); > + if (response.lastModified() == null) { Reverse conditional, early return. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegateTest.java:4 (Diff revision 4) > +/* This Source Code Form is subject to the terms of the Mozilla Public > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > +package org.mozilla.gecko.sync.repositories.downloaders; Newline. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderTest.java:4 (Diff revision 4) > +/* This Source Code Form is subject to the terms of the Mozilla Public > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > +package org.mozilla.gecko.sync.repositories.downloaders; Newline.
Attachment #8780287 - Flags: review?(rnewman) → review-
Daniel, do you have time to address the feedback, or should I take over?
Flags: needinfo?(daniellimpersonal)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review73714 > I'm pretty confused here. > > The batch-download API is supposed to work like this: > > * You make a request, which might include 'newer' or 'limit', but also includes a number of other parameters like 'ids' or 'sort'. > * You get back an `X-Weave-Offset` header and an `X-Last-Modified` header. > * You submit _an identical request_ except with `offset=` and `X-If-Unmodified-Since` set to the values from the response. > > Setting `this.lastModified` appears to result in `XIUS` thanks to the shower of delegate calls. > > `offset` is handled here. > > But you're calling `fetchSince` rather than `fetchWithParameters`, so we'll potentially be switching limit/sort/ids etc. for the second batch. > > What am I missing? Nothing, I made a mistake and should be calling fetchWithParameters instead!
Priority: P1 → P2
Priority: P2 → P1
Priority: P1 → P2
(In reply to :Grisha Kruglov from comment #30) > Daniel, do you have time to address the feedback, or should I take over? Discussed with Grisha on IRC, clearing NI for now, Grisha will decide on next steps.
Flags: needinfo?(daniellimpersonal)
Assignee: daniellimpersonal → gkruglov
TLDR: next steps: - implement a time-based "green light" function, limiting each (download portion of) sync stage to some maximum allowed time. - split "limit" in the ConstrainedServer11Repository into per-batch and total. Pick good values for these for each record type, considering average size of a record. - for unstructured record types, adjust Downloader to stop (and call onFetchCompleted) if either is hit. - for structured record types, continue using SafeConstrainedServer11Repository with some "sanity limit", but consider not restricting their download otherwise once sanity limit is satisfied during "first sync" (local known collection l-m timestamp=0). - additionally, consider implementing high-water-mark downloading of records on sync-stage level for unstructured records, i.e. restarting downloads not from collection's L-M timestamp, but from L-M timestamp of the last record we've fetched. Richard, does this make sense? More detailed thoughts: Current patch will not stop downloading until either process is killed by the OS, or until they the whole collection (>since) has been downloaded. For large collections this isn't feasible, as Richard pointed out in Comment 29, and might not be likely to succeed at all. So, we need to stop the downloader at some point. For unstructured records, we have three options: 1) introduce a total limit for number of records that the downloader is allowed to fetch. This limit could be part of the ConstrainedServer11Repository, and thus configured separately per record-type. 2) introduce a green light function, which will keep track of how much time we have been syncing for, and stop the downloader once it determines that sync needs to move onto the next step. 3) introduce both of these, and move sync forward when we hit either limit. Either way, we also need to define our per-batch limit - how many records we're downloading in each batch. This is what ConstrainedServer11Repository's limit is in the current version of the patch. For structured records we need to download everything before we can say we've succeeded, so we use SafeConstrainedServer11Repository, which checks if server record count is below our "sanity limit" during the first sync, and skips sync of the stage otherwise. We should consider not restricting downloading of these records otherwise. Here's why: - "Total limit" mentioned previously does not apply - we need all of them, or none of them. - Green light might or might not apply. Since we don't have a local buffer/mirror currently, we store downloaded records right away. If our download fails, we won't upload our partial local state, so this is somewhat safe. We will attempt to "repair"/"fill out" our partial state during next sync, and if we succeed, we will then upload. Having local device data in a partially-complete state is very undesirable, so we should consider not limiting downloads of structured record types at all once they pass our "sanity limit", and prioritize them over other sync stages in terms of allotted time. This will matter most during the first sync, since consequent syncs are incremental. Once a local "incremental mirror" is added, we might consider limiting these downloads with a green light function, to balance things out. We should also be mindful that other clients might upload large additional record sets after our first sync (accounts being merged). Back to unstructured records. It seems that how many records device will be able to download for each record type will largely depend on two factors: - how quickly it's able to download individual batches - how quickly it's able to process individual batches Modern devices have quick-enough storage (but we also run on older devices), so let's say we're primarily concerned with how fast and reliable our network is; that is, how quickly can we download batches and how likely are they all to succeed. Currently there's no retry mechanics for failed batches. Server here is assumed to be opaquely reliable and fast to respond, which of course might not always be the case. Quality of the network is the defining factor here, and that is going to be widely variable (wifi vs 2g vs 3g vs LTE vs spotty congested public wifi on a ferry, etc, all in one day!). If we place total limits with slow networks in mind, we're unnecessarily limiting devices on fast networks, and vice versa. Thus, it seems that following a time-based green light function is more flexible. Devices on fast networks (with fast storage) will be able to fetch more in each sync than devices on slow networks, without hard-coded totals. Lack of "high water mark" here is a problem, since we might quickly download a dozen of batches, and then our network drops off - we consider our sync a failure, and restart from the same timestamp next time, possibly re-downloading many batches worth of records. Atomic uploads gives us a margin of safety here for the other side of sync. If we do have a lot of records locally we are guaranteed to upload & commit all of the records safely. However, it's possible that the collection is very large, and atomic upload might struggle to complete all of the payloads (say, we've been synced, got a lot of records, disconnected from an account, and reconnected to another one - we now need to upload every record!), and thus we might struggle to complete sync on a (poor) network. This scenario could be helped by the mentioned "total limit" for how many records we're willing to download at all. However, that still doesn't guarantee success - just makes it more likely in poor conditions. Another concern here is our local storage limitations - with large collections to sync and small amount of free storage space on the device, we might be in trouble. "Total limit" is helpful here. Bandwidth consumed by sync is also a potential concern, for which "total limit" might be helpful. So to summarize, Option 3 with a per-record-type-configurable total limit is a good way to go. However, we need to be careful in choosing this limit for each record type. It's a balancing act of trying to not penalize fast devices on fast networks vs ensuring that we can sync potentially large collections in a variety of network conditions without hitting device limitations. Another matter here is what we do during a consequent sync. Currently we fetch records with a "since" value that is equal to X-Weave-Timestamp returned with the fetch request. Current patch changes this to use X-Last-Modified value returned with the last (and every, due to X-I-U-S) downloaded batch. If downloading was cut by a green-light or a total limit before we've fetched all of the records, this value is our next "since". But, this means that it's possible to skip over sets of records. |-> next "since" (response.x-last-modified) |--↓ |... gap of records that are potentially skipped |--↑ |-> download cut; onFetchCompleted(response.x-last-modified) | | |-> fetch(since) iOS uses per-record high-water-mark, and restarts downloading from where it left of in case of download failure (412? network error?). We need to implement something similar if we want to avoid these gaps and if we don't want to redownload the same record batches, and certainly once we introduce an incremental local buffer/mirror. This is not part of the Downloader itself - it doesn't have access to decrypted records, so doesn't know their timestamps. So this is likely job of the Sync stage. Look to SyncClientsEngineStage's ClientDownloadDelegate for inspiration.
Flags: needinfo?(rnewman)
(In reply to :Grisha Kruglov from comment #34) Excellent summary. Inline replies: > - for unstructured record types, adjust Downloader to stop (and call > onFetchCompleted) if either is hit. This requires a little nuance. Per-batch is hit each time you grab a chunk; if it's *not* hit, you stop. Total limit is where we actively want to stop. The calling code will want to know if we exhausted the contents of the server, if we failed to grab a batch, or if we hit our own maximum (e.g., 5000 items). We don't call the same onFetchCompleted flow for each situation. > - for structured record types, continue using > SafeConstrainedServer11Repository with some "sanity limit", but consider not > restricting their download otherwise once sanity limit is satisfied during > "first sync" (local known collection l-m timestamp=0). Yup. * If we check info/collection_counts and you have five million bookmarks, we probably shouldn't even bother. * If you have 20,000 bookmarks, we can sync them, but we might do it in four separate syncs, and then apply the downloaded records in the fourth or fifth. In each sync we'll download 5,000 bookmarks, 100 or 200 at a time. So there are several numbers here: * A sanity max that is checked in info/collection_counts. (Currently that's 5000.) * A maximum number of records to download in a sync operation. * A maximum number of records to download in a GET. > - additionally, consider implementing high-water-mark downloading of records > on sync-stage level for unstructured records, i.e. restarting downloads not > from collection's L-M timestamp, but from L-M timestamp of the last record > we've fetched. On all records -- structured records benefit from this, too. This is a little more important than just a "consider" -- batching is great, but batching that fails near the end and starts again from the beginning (which is the only other option if you get a 412) is painful. > 3) introduce both of these, and move sync forward when we hit either limit. Typically one doesn't 'continue with' (proceed to apply then upload) a sync for anything until you've downloaded everything you want to download. I agree that you want to stop for both time and quantity reasons. I think what you're saying is that, for unstructured records, when we've downloaded lots of things we can go ahead and apply them. What happens if you want the user's most recent 10,000 history items, but you don't want to download more than 2,000 in a sync? You can't grab 2,000 and continue (i.e., apply them, then upload yours) because you'll see your own records four syncs later, or you'll miss the next 8,000 remote records. Instead, you need to buffer those 2,000 records until you've got the rest, or you need to apply them and *not upload* any changes, which complicates matters. See discussion of download-only syncs here: https://github.com/mozilla/fxa-auth-server/issues/1316#issuecomment-231064542 > - Green light might or might not apply. Since we don't have a local > buffer/mirror currently, we store downloaded records right away. You will need persistent storage for the buffer. There is no sane approach that allows you to download 10,000 bookmark records without writing them to somewhere other than the main bookmarks table. You can't keep them in memory, and you can't afford to download the same chunks over and over. > Back to unstructured records. It seems that how many records device will be > able to download for each record type will largely depend on two factors … bear in mind that you can also schedule additional syncs. If every engine buffers downloads, you can skip through them quickly then ask Android to keep syncing you again and again -- indeed, some return values from the syncing API will cause Android to immediately ask you to sync again! The reason not to do that is the risk of your batch tokens being invalidated. So you want to grab as many records as you dare from each collection, hoping to exhaust the remainder. A very optimized syncer would figure out which engines contain few records, grab all of those, and aim for completeness as soon as possible. We are not that :D > Lack of "high water mark" here is a problem, since we might quickly download > a dozen of batches, and then our network drops off - we consider our sync a > failure, and restart from the same timestamp next time, possibly > re-downloading many batches worth of records. See note above, yes. You should track the token and a high watermark: if the token is invalidated, you will only redownload a handful of records. And because you're buffering them, redownload doesn't complicate your life. > However, it's possible that the collection is > very large, and atomic upload might struggle to complete all of the payloads > (say, we've been synced, got a lot of records, disconnected from an account, > and reconnected to another one - we now need to upload every record!), and > thus we might struggle to complete sync on a (poor) network. Cross-sync upload is another decent-sized can of worms. It's doable, but let's not do it in this bug. It's best to think of sync as two almost entirely separate stages: download + fast-forward timestamp; upload + fast-forward timestamp. > Another concern here is our local storage limitations - with large > collections to sync and small amount of free storage space on the device, we > might be in trouble. "Total limit" is helpful here. You could file a follow-up to check free space versus expected download size (indeed, you can check the size of the header in each HTTP response). Buffering downloaded records on disk before application will use more space, but gives you an excellent escape hatch if you won't be able to store everything.
Flags: needinfo?(rnewman)
In case I forget: batching is slightly incompatible with any sort option other than sort=oldest. I expect many of the Android engines are set up to use sort=index, which tries to grab most valuable records first (e.g., folders, high-frecency history). You cannot use timestamp-based high-watermark batching recovery unless you're fetching oldest to newest. You *can* use the server-supported batch mechanism, but if you're interrupted it might be because another client added a really high-value record that you have effectively already skipped; you can't just pick up where you left off.
Thank you for the feedback, Richard. If I'm reading your notes correctly, it seems that while it's possible to land batching downloader work without a persistent buffer backing it, with some limitations in place, in the cases where batching would matter most we could really use a buffer, and consequently for the cases where we don't need the buffer as much, batching has quite a bit less impact. Which leaves us at a "let's do them together" point, together with a high water mark downloads, either expanding scope of this bug or placing a hard dependency on a separate chunk of work. This buffer could be a fairly light weight sort of a thing to begin with (mirror of the remote state, essentially, modeled as a repository). If we did want to land download batching by itself, then the thing to do seems to be: - set total sync and per-batch limits close together (if not the same), and probably not much higher than what we have now, so that we're not really "batching", but just running through the downloader in a "one batch only" mode. - adjust total sync limit upwards once buffer and high water mark stuff is in place This gets us to land code, gain some momentum, and provide incentive/some groundwork for continuing towards a brighter future :-) sidenote: if we're buffering everything and applying once we're "done" (which could span multiple syncs), that will have UX implications, especially compared to the current "snappy" (yet prone to error) experience. It seems that the trade-off is worth it, especially considering that UX impact will be rare and mostly around points of large syncs (albeit those points in time might also be pretty important in terms of user perception - first sync, etc). The alternative, as you point out, is the download-only sync. Or if I say it quickly we could have a "read-only" view for when we haven't quite synced everything yet but have stuff in a buffer, a la iOS bookmarks, but that really seems like a can of worms.
Flags: needinfo?(rnewman)
(In reply to :Grisha Kruglov from comment #37) > Thank you for the feedback, Richard. If I'm reading your notes correctly, it > seems that while it's possible to land batching downloader work without a > persistent buffer backing it, with some limitations in place You could do it without buffering if: * You don't mind apply records piecemeal. * Ideally, you fetch with sort=oldest and implement high-watermarking. If you don't, then you'll always restart from the beginning when interrupted, which ain't great, but isn't far from what we have now. That's fine for e.g., passwords. It's flaky for stuff like form history (we want to grab your 5000 most recent!). There are two main purposes for batching downloads: * Downloading large collections: efficient use of time and minimal retries. For this it's helpful if we can resume if interrupted, but that's not compatible with sortindex. Efficiency requires buffering. * Reliably mirroring the server prior to a sync. That's what we need for bookmarks, and that implies buffering. As well as being necessary for that, buffering itself gives an opportunity for validation. > Which leaves us at a "let's do them together" point, together with a high > water mark downloads, either expanding scope of this bug or placing a hard > dependency on a separate chunk of work. This buffer could be a fairly light > weight sort of a thing to begin with (mirror of the remote state, > essentially, modeled as a repository). Yup. > If we did want to land download batching by itself, then the thing to do > seems to be: > - set total sync and per-batch limits close together (if not the same), and > probably not much higher than what we have now, so that we're not really > "batching", but just running through the downloader in a "one batch only" > mode. Yes. Landing batching with max=per-batch (and any sort) and immediately applying downloaded records is almost equivalent to what we do now, but with multiple HTTP requests instead of one big GET. However, the 'almost' is good or bad: right now we don't start *applying* records until `handleHttpResponse` is called. If we split this one request into more than one, we will store and apply some records before the download is complete, and are thus easy to interrupt. (In a sense we already buffer: in the HTTP layer, in memory.) Applying records as we go improves timeliness, but it also costs safety, and shouldn't be done with bookmarks at all for obvious reasons. You can get exactly the same behavior we have now by temporarily storing all of the downloaded records in memory, with about the same lifecycle as the batch token, then walking them calling the real delegate. That wouldn't be much work. > sidenote: if we're buffering everything and applying once we're "done" > (which could span multiple syncs), that will have UX implications, > especially compared to the current "snappy" (yet prone to error) experience. As noted above, we already wait until the entire request is complete before applying any records -- you shouldn't see any e.g., history appear in the DB until we get a 200 for the GET. So the only additional delay would be if we span across multiple syncs. Even then, the total delay should be small (trigger another sync!), the total delay for users who repeatedly fail to complete a download is infinitely better, and the duration until we've fetched some of every collection is bounded, which is great. > Or if I say it quickly we could have a "read-only" view for when we > haven't quite synced everything yet but have stuff in a buffer, a la iOS > bookmarks, but that really seems like a can of worms. It's a can of worms if you want to include it in a tree view. It's pretty easy if you want to do link highlighting, awesomebar searchability, and so on -- just add a `UNION ALL SELECT FROM buffer` bit to those queries.
Flags: needinfo?(rnewman)
Priority: P2 → P3
Priority: P3 → P2
Assignee: gkruglov → daniellimpersonal
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78528 Pass on the latest changes. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/ConstrainedServer11Repository.java:14 (Diff revisions 5 - 6) > import org.mozilla.gecko.sync.InfoCollections; > import org.mozilla.gecko.sync.InfoConfiguration; > import org.mozilla.gecko.sync.net.AuthHeaderProvider; > > /** > - * A kind of Server11Repository that supports explicit setting of limit and sort on operations. > + * A kind of Server11Repository that supports explicit setting of batchLimit and sort on operations. "A kind of Server11Repository that supports explicit setting of total fetch limit, per-batch fetch limit, and a sort order." ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/ConstrainedServer11Repository.java:22 (Diff revisions 5 - 6) > * > */ > public class ConstrainedServer11Repository extends Server11Repository { > > private String sort = null; > - private long limit = -1; > + private long batchLimit = -1; These can be final, and you don't need to set them to -1 (since values are private and are set in the constructor). ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/ConstrainedServer11Repository.java:48 (Diff revisions 5 - 6) > - public long getDefaultFetchLimit() { > - return limit; > + public long getDefaultBatchLimit() { > + return batchLimit; > } > + > + @Override > + public long getDefaultTotalLimit() { return totalLimit; } Style nit: Break this up into multiple lines. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11Repository.java:124 (Diff revisions 5 - 6) > @SuppressWarnings("static-method") > public String getDefaultSort() { > return null; > } > > + public long getDefaultTotalLimit() { return -1; } Style nit: break this up into multiple lines (follow file style). ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java:48 (Diff revisions 5 - 6) > RepositorySessionGuidsSinceDelegate delegate) { > // TODO Auto-generated method stub > > } > > - public void fetchSince(long timestamp, long limit, String sort, RepositorySessionFetchRecordsDelegate delegate) { > + public void fetchSince(long timestamp, long batchLimit, String sort, RepositorySessionFetchRecordsDelegate delegate) { I think we can kill this method entirely. 1) no one actually uses it, as far as I can tell 2) limit and sort are already configured in stage's repository ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:51 (Diff revisions 5 - 6) > private final Server11RepositorySession repositorySession; > private final DelayedWorkTracker workTracker = new DelayedWorkTracker(); > // Used to track outstanding requests, so that we can abort them as needed. > private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > private volatile String lastModified; > + private volatile long numRecords = 0; As mentioned below, `volatile` isn't good enough here, and you'll want to synchronize access to this variable. As you do so, document your access lock inline here. Something like // @AccessLock("this") ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:77 (Diff revisions 5 - 6) > } > return b.substring(0, b.length() - 1); > } > > @VisibleForTesting > public void fetchWithParameters(long newer, Your public interface as far as fetch goes is: 1) fetch(guids, delegate) 2) fetchSince(timestamp, delegate) everything else is private - unless you have to expose it for testing... but I think you should be able to get around that easily enough by testing against other `fetch` methods. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:114 (Diff revisions 5 - 6) > Logger.debug(LOG_TAG, collectionURI.toString()); > > return new SyncStorageCollectionRequest(collectionURI); > } > > - public void fetchSince(long timestamp, long limit, String sort, > + public void fetchSince(long timestamp, long batchLimit, String sort, This is private ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:127 (Diff revisions 5 - 6) > > public void fetchSince(long timestamp, RepositorySessionFetchRecordsDelegate fetchRecordsDelegate) { > this.fetchSince(timestamp, null, fetchRecordsDelegate); > } > > public void fetchSince(long timestamp, String offset, This should be private ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:185 (Diff revisions 5 - 6) > return; > } > > + this.numRecords += response.weaveRecords(); > String offset = response.weaveOffset(); > - if (offset != null) { > + if (offset != null && this.numRecords < repository.getDefaultTotalLimit()) { numRecords is `volatile`, and I don't think volatile's memory safety guarantees are strong enough to support both modifying a variable, and conditionally doing something right afterwards based on the new value. If that is what you want to do, and `numRecords` might be accessed from different threads (I think that's the case here?), you'll need to synchronize access to it. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/stage/AndroidBrowserBookmarksServerSyncStage.java:25 (Diff revision 6) > > // Eventually this kind of sync stage will be data-driven, > // and all this hard-coding can go away. > private static final String BOOKMARKS_SORT = "index"; > - private static final long BOOKMARKS_REQUEST_LIMIT = 5000; // Sanity limit. > + // Sanity limit. Batch and total limit are the same for now, > + // and will be adjusted once buffer and high water mark is in place. Here and elsewhere this comment appears, add a reference to Bug 730142. Also, grammar: "... and high water mark _are_ in place."
Attachment #8780287 - Flags: review+ → review?(gkruglov)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78528 > Your public interface as far as fetch goes is: > 1) fetch(guids, delegate) > 2) fetchSince(timestamp, delegate) > > everything else is private - unless you have to expose it for testing... but I think you should be able to get around that easily enough by testing against other `fetch` methods. I made it public for the reason that in some cases it doesn't go through other fetch methods (eg. When we call it again in onFetchComplete after getting an offset token).
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78830 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:173 (Diff revisions 6 - 7) > } > }); > return; > } > > + boolean hasReachedLimit; Either rename this variable (`hasNotReachedLimit`), or reverse the conditional (`this.numRecords >= repository.getDefaultTotalLimit()`). Also, `final`.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78528 > I made it public for the reason that in some cases it doesn't go through other fetch methods (eg. When we call it again in onFetchComplete after getting an offset token). onFetchComplete is within the same class, so making it private is fine. Your delegate only calls onFetchComplete, onFetchFailed, and onFetchedRecord methods. These should be public, along with the two fetch methods I noted above, called from the session. Everything else is private (@VisibleForTesting might be an exception).
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78870 I think this is very close. With the minor improvements and nits addressed, I think we're at the "no-op" batching state, and should be good to land to get it working in nightly. Re-flag me once you address these! ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:151 (Diff revision 7) > + * It is similar to the standard HTTP Last-Modified header, > + * but the value is a decimal timestamp rather than a HTTP-format date. > + * > + * @return the last modified header > + */ > + public String lastModified() { @Nullable ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/SyncResponse.java:155 (Diff revision 7) > + */ > + public String lastModified() { > + return this.getNonMissingHeader(X_LAST_MODIFIED); > + } > + > + public long lastModifiedTimestamp() { This method is not needed. There is already `normalizedTimestampForHeader`, so just use that. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/ConstrainedServer11Repository.java:21 (Diff revision 7) > * @author rnewman > * > */ > public class ConstrainedServer11Repository extends Server11Repository { > > private String sort = null; final ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:50 (Diff revision 7) > + protected final Server11Repository repository; > + private final Server11RepositorySession repositorySession; > + private final DelayedWorkTracker workTracker = new DelayedWorkTracker(); > + // Used to track outstanding requests, so that we can abort them as needed. > + private final Set<SyncStorageCollectionRequest> pending = Collections.synchronizedSet(new HashSet<SyncStorageCollectionRequest>()); > + private volatile String lastModified; I think lastModified might have the same concurrency issues as numRecords. (you're assigning it a value and branching on its value all at the same time). Synchronize? ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:76 (Diff revision 7) > + b.append(","); > + } > + return b.substring(0, b.length() - 1); > + } > + > + public void fetchWithParameters(long newer, Since you are using this in tests in an extending mock class, mark this as `@VisibleForTesting protected` (that is, most private level you can get away with) ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:94 (Diff revision 7) > + this.pending.add(request); > + request.get(); > + } > + > + @VisibleForTesting > + public String encodeParam(String param) throws UnsupportedEncodingException { @Nullable ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:148 (Diff revision 7) > + final SyncStorageCollectionRequest request, long newer, > + long limit, boolean full, String sort, String ids) { > + removeRequestFromPending(request); > + > + // When we process our first request, we get back a X-Last-Modified header indicating when collection was modified last. > + // We pass it to the server with every subsequent request (if we need to make more) in the X-If-Unmodified-Since, "as the X-If-Unmodified-Since header" ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:152 (Diff revision 7) > + // When we process our first request, we get back a X-Last-Modified header indicating when collection was modified last. > + // We pass it to the server with every subsequent request (if we need to make more) in the X-If-Unmodified-Since, > + // and server is supposed to ensure that this pre-condition is met, and fail our request with a 412 error code otherwise. > + // So, if all of this happens, these checks should never fail. > + // However, we also track this header in client side, and can defensively validate against it here as well. > + String currentLastModifiedTimestamp = response.lastModified(); `response.lastModified()` could be null. You check for this in the delegate, but IDE won't know that and will see unchecked use of a @Nullable method return value and (rightfully) complain; probably just add a sanity null check here. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:173 (Diff revision 7) > + } > + }); > + return; > + } > + > + boolean hasReachedLimit; nit: here and elsewhere: follow your own style; if you're going to final stuff when possible (good idea), do it everywhere. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:240 (Diff revision 7) > + return; > + } > + this.pending.remove(request); > + } > + > + public void abort() { private; you're only calling this from onFetchCompleted if L-M timestamp changed; ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegate.java:87 (Diff revision 7) > + @Override > + public void handleWBO(CryptoRecord record) { > + this.downloader.onFetchedRecord(record, this.fetchRecordsDelegate); > + } > + > + // TODO: this implies that we've screwed up our inheritance chain somehow. You can just drop this comment at this point.
Attachment #8780287 - Flags: review?(gkruglov)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78528 > onFetchComplete is within the same class, so making it private is fine. Your delegate only calls onFetchComplete, onFetchFailed, and onFetchedRecord methods. These should be public, along with the two fetch methods I noted above, called from the session. Everything else is private (@VisibleForTesting might be an exception). Ugh, I meant to start that sentence with `fetchWithParameters`, not `onFetchComplete`. (but see my comment RE `@VisibleForTesting protected`).
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78870 > final Its value is set in the constructor, so it should't be final ? > private; you're only calling this from onFetchCompleted if L-M timestamp changed; Making it protected since it's extended in mock class for testing
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review78870 > Its value is set in the constructor, so it should't be final ? `sort` is an instance variable, and since it's of "reference" type, it'll get initialized to `null` automatically. So no need to do that explicitely, and thus you can mark it as final. See https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.12.3 and https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.12.5.
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review79258 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:160 (Diff revisions 7 - 9) > // However, we also track this header in client side, and can defensively validate against it here as well. > - String currentLastModifiedTimestamp = response.lastModified(); > + final String currentLastModifiedTimestamp = response.lastModified(); > Logger.debug(LOG_TAG, "Last modified timestamp " + currentLastModifiedTimestamp); > > - if (this.lastModified == null) { > + // Sanity check. We also did a null check in delegate before passing it into here. > + if (currentLastModifiedTimestamp == null) { You can drop your getters/setters, and just do this: ``` [...] if (currentLastModifiedTimestamp == null) { abort(fetchRecordsDelegate, "Last modified timestamp is missing"); return; } final boolean lastModifiedChanged; synchronized(this) { if (this.lastModified == null) { this.lastModified = currentLastModifiedTimestamp; } lastModifiedChanged = !this.lastModified.equals(currentLastModifiedTimestamp); } if (lastModifiedChanged) { abort(fetchRecordsDelegate, "Last modified timestamp has changed unexpectedly"); return; } [...] ``` Also, do the same thing for `numRecords` (as you had before), no need for getters/setters. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:193 (Diff revisions 7 - 9) > } > }); > return; > } > > - boolean hasReachedLimit; > + this.addNumRecords(response.weaveRecords()); You're acquiring, then releasing, then acquiring the lock. Just do what you did before, and drop setters/getters. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:269 (Diff revisions 7 - 9) > this.pending.clear(); > } > } > > - public String getLastModified() { > + @Nullable > + protected String getLastModified() { You will be deleting this anyway, but FYI you can just synchronize method itself, it's the same thing pretty much.
Attachment #8780287 - Flags: review?(gkruglov)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review79286 Looking good! One more fix and let's add some more tests for this (see below), and I'll do some local testing of this stuff tomorrow - and I recommend you do the same ;) ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:247 (Diff revision 10) > + } > + this.pending.remove(request); > + } > + > + @VisibleForTesting > + protected void abort() { Either rename this to `abortRequests` or something similar (and it would be great to have unit tests that actually exercise this stuff), or combine with the other `abort` method. Renaming this method (instead of combining them) will make writing unit tests for this easier.
Attachment #8780287 - Flags: review?(gkruglov) → review+
I'm seeing some failing unit tests. Daniel, can you take a look? Relevant failures: org.mozilla.android.sync.test.TestServer11RepositorySession > testStorePostFailure FAILED java.lang.AssertionError at TestServer11RepositorySession.java:171 org.mozilla.android.sync.test.TestServer11RepositorySession > testStorePostSuccessWithFailingRecords FAILED java.lang.AssertionError at TestServer11RepositorySession.java:153 org.mozilla.android.sync.test.TestRecordsChannel > testFetchFail FAILED java.lang.AssertionError at TestRecordsChannel.java:191
Mentor: gkruglov
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review79538 ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:262 (Diff revision 11) > + @Nullable > + protected synchronized String getLastModified() { > + return this.lastModified; > + } > + > + private void abortRequests(final RepositorySessionFetchRecordsDelegate delegate, final String msg) { My apologies, I've placed my "rename this" comment on the wrong method. The other abort method actually "aborts" the requests - so you should swap the names, otherwise they don't make sense. Also, add a unit test for `abortRequests` to verify that you actually aborted them, and cleared the queue. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegateTest.java:32 (Diff revision 11) > +import ch.boye.httpclientandroidlib.message.BasicStatusLine; > + > +import static org.junit.Assert.*; > + > +@RunWith(TestRunner.class) > +public class BatchingDownloaderDelegateTest { Add a unit test for ifUnmodifiedSince ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderDelegateTest.java:81 (Diff revision 11) > + infoConfigurationJSON.put(InfoConfiguration.MAX_TOTAL_BYTES, maxTotalBytes); > + infoConfigurationJSON.put(InfoConfiguration.MAX_TOTAL_RECORDS, maxTotalRecords); > + infoConfigurationJSON.put(InfoConfiguration.MAX_POST_RECORDS, maxPostRecords); > + infoConfigurationJSON.put(InfoConfiguration.MAX_POST_BYTES, maxPostBytes); > + infoConfigurationJSON.put(InfoConfiguration.MAX_REQUEST_BYTES, maxRequestBytes); > + InfoConfiguration infoConfiguration = new InfoConfiguration(infoConfigurationJSON); InfoConfiguration shouldn't matter to you - BatchingDownloader doesn't use it, it's only used by BatchingUploader at the moment. There's a InfoConfiguration() constructor which should be sufficient. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderTest.java:37 (Diff revision 11) > + > +import static org.junit.Assert.*; > +import static org.junit.Assert.assertEquals; > + > +@RunWith(TestRunner.class) > +public class BatchingDownloaderTest { These tests are a good start, but you're not really testing batching aspects of the download batcher. You should have tests that cover a variety of limit scenarios, some quick ones that might help expose batching bugs: - total and per-batch limits are the same - per-batch limit is just a bit lower than total - per-batch limit is around half of total - per-batch limit is half of total - per-batch limit is a small fraction of the total - etc... The point is to exercise the (albeit simple) batching logic. You'll need to override/mock repository's getDefaultTotalLimit and getDefaultBatchLimit, and do so depending on what scenario you're testing. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderTest.java:128 (Diff revision 11) > + long maxTotalBytes = 1024; > + long maxTotalRecords = 1024; > + long maxPostRecords = 2; > + long maxPostBytes = 4096; > + long maxRequestBytes = 4; > + infoConfigurationJSON.put(InfoConfiguration.MAX_TOTAL_BYTES, maxTotalBytes); InfoConfiguration shouldn't matter to you - BatchingDownloader doesn't use it, it's only used by BatchingUploader at the moment. There's a InfoConfiguration() constructor which should be sufficient. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloaderTest.java:148 (Diff revision 11) > + } > + > + @Test > + public void testFlattenId() { > + String[] emptyGuid = new String[]{}; > + String flatten; Here and throughout your tests: just combine the two lines (declaration and assignment).
Another interesting side-effect to consider when increasing total limit to be above per-batch limit: unless total is an even multiple of per-batch, we are likely to blow through the limit (i.e. download more than total specifies). I ran this test against history sync stage: per-batch=250, total=5113, and got total fetched=5250. Completely expected result (and batching worked correctly), but just something to be aware of. If we want total limit to be a hard one we'll need to discard already-downloaded records, which doesn't seem necessary.
(In reply to :Grisha Kruglov from comment #57) > I'm seeing some failing unit tests. Daniel, can you take a look? > > Relevant failures: > org.mozilla.android.sync.test.TestServer11RepositorySession > > testStorePostFailure FAILED > java.lang.AssertionError at TestServer11RepositorySession.java:171 > > org.mozilla.android.sync.test.TestServer11RepositorySession > > testStorePostSuccessWithFailingRecords FAILED > java.lang.AssertionError at TestServer11RepositorySession.java:153 > > org.mozilla.android.sync.test.TestRecordsChannel > testFetchFail FAILED > java.lang.AssertionError at TestRecordsChannel.java:191 Because the mock server used for testing is missing the L-M header :) I will update this on the next commit.
(In reply to :Grisha Kruglov from comment #59) > Another interesting side-effect to consider when increasing total limit to > be above per-batch limit: unless total is an even multiple of per-batch, we > are likely to blow through the limit (i.e. download more than total > specifies). > > I ran this test against history sync stage: per-batch=250, total=5113, and > got total fetched=5250. Completely expected result (and batching worked > correctly), but just something to be aware of. > > If we want total limit to be a hard one we'll need to discard > already-downloaded records, which doesn't seem necessary. On that note, how about when per-batch limit somehow exceeds total limit (due to programming error) ? It would only fetch once (albeit on an exceeded capacity) since we validate if limit is reached in onFetchCompleted. Should we throw an error in this case? (If we add a 0 after the limit it's now 10x the original limit we intended)
(In reply to Daniel Lim [:dlim] from comment #61) > (In reply to :Grisha Kruglov from comment #59) > > Another interesting side-effect to consider when increasing total limit to > > be above per-batch limit: unless total is an even multiple of per-batch, we > > are likely to blow through the limit (i.e. download more than total > > specifies). > > > > I ran this test against history sync stage: per-batch=250, total=5113, and > > got total fetched=5250. Completely expected result (and batching worked > > correctly), but just something to be aware of. > > > > If we want total limit to be a hard one we'll need to discard > > already-downloaded records, which doesn't seem necessary. > > On that note, how about when per-batch limit somehow exceeds total limit > (due to programming error) ? > It would only fetch once (albeit on an exceeded capacity) since we validate > if limit is reached in onFetchCompleted. Should we throw an error in this > case? (If we add a 0 after the limit it's now 10x the original limit we > intended) You shouldn't proceed with the download at all if that's the case. `fetch` method which actually does the work (fetchWithParameters?) is probably a good place for this sanity check - and throw something like an IllegalArgumentException.
Blocks: 814801
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review80842 If this works fine for the non-limited collections (e.g., passwords), and you've verified that it works sanely with the two limits pegged together, then I'm fine with this landing. See nits. ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:33 (Diff revision 12) > +/** > + * Downloader which implements batching introduced in Sync 1.5. > + * > + * Download items of collections in batch from server supporting batch downloading, > + * which uses offset token for subsequent fetches if it is present in the response after the first fetch. > + * We continue to fetch until the response no longer return an offset token, at which point signalling the end of batch downloading. s/return/returns ::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/downloaders/BatchingDownloader.java:71 (Diff revision 12) > + return ""; > + } > + if (guids.length == 1) { > + return guids[0]; > + } > + StringBuilder b = new StringBuilder(); Assuming 12-char GUIDs: `new StringBuilder(guids.length * 12 + guids.length);` There should be a -1 in there, but we accumulate one comma too many.
Attachment #8780287 - Flags: review+
Attachment #8796363 - Attachment is obsolete: true
Attachment #8796363 - Flags: review?(gkruglov)
Comment on attachment 8780287 [details] Bug 730142 - Download batching. https://reviewboard.mozilla.org/r/70984/#review80842 > s/return/returns Daniel, can you please update the docstring to: /** * Batching Downloader, which implements batching protocol as supported by Sync 1.5. * * Downloader's batching behaviour is configured via two parameters, obtained from the repository: * - Per-batch limit, which specified how many records may be fetched in an individual GET request. * - Total limit, which controls number of batch GET requests we will make. * * * Batching is implemented via specifying a 'limit' GET parameter, and looking for an 'offset' token * in the response. If offset token is present, this indicates that there are more records than what * we've received so far, and we perform an additional fetch. Batching stops when either we hit a total * limit, or offset token is no longer present (indicating that we're done). * * For unlimited repositories (such as passwords), both of these value will be -1. Downloader will not * specify a limit parameter in this case, and the response will contain every record available and no * offset token, thus fully completing in one go. * * In between batches, we maintain a Last-Modified timestamp, based off the value return in the header * of the first response. Every response will have a Last-Modified header, indicating when the collection * was modified last. We pass along this header in our subsequent requests in a X-If-Unmodified-Since * header. Server will ensure that our collection did not change while we are batching, if it did it will * fail our fetch with a 412 (Consequent Modification) error. Additionally, we perform the same checks * locally. */
With the two issues resolved, you should be able to auto-land it from mozreview, I think!
Flags: needinfo?(daniellimpersonal)
Priority: P2 → P1
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
Flags: needinfo?(daniellimpersonal)
Iteration: --- → 1.5
Product: Android Background Services → Firefox for Android
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: