Bug 615284 (Closed) • Opened 14 years ago • Closed 14 years ago
download chunking needs to be more resilient against app shutdowns
Categories: Firefox :: Sync, defect
Status: RESOLVED FIXED
People: mconnor (Reporter), philikon (Assignee)
Whiteboard: [softblocker][has patch][has review]
Attachments: 1 file, 1 obsolete file
patch (deleted) • mconnor: review+
Reporter | Description • 14 years ago
I've had an Android device running Fennec for four days, and haven't finished a sync yet.
Android kills background processes, and we may or may not be hitting OOM due to lack of GC inside of the sync call (some conflicting reports).
We need to:
* sync more records per GET (I have 17k history records, so it's 340 gets to sync everything, as it stands).
* periodically add remaining IDs to the tracker so we don't start from scratch on interrupt.
* Consider only fetching some cap of records on mobile (i.e. limit the query to 5k records, same as the initial 5k upload cap?)
The first two changes should help us catch up much more quickly, without really hurting our disk I/O story.
Reporter | Comment 1 • 14 years ago
We need this ASAP.
blocking2.0: --- → beta8+
tracking-fennec: --- → 2.0b3+
Comment 2 • 14 years ago
The same thing has happened on one of my test accounts on the Nexus One. It syncs fine on a desktop build, but once I switch to mobile it remains "in progress" indefinitely.
Assignee | Comment 3 • 14 years ago
(In reply to comment #0)
> * sync more records per GET (I have 17k history records, so it's 340 gets to
> sync everything, as it stands).
Yes, that's painful. We need to watch memory here, though (see bug 615167).
> * periodically add remaining IDs to the tracker so we don't start from scratch
> on interrupt.
I think we should probably do it the other way around: get the changed IDs from the engine (which probably uses the tracker, though it might not in the future) and add them to the tracker at the beginning of the sync. Then, as we fetch items in batches, we can remove successfully fetched items from the tracker (in batches, too).
Reporter | Updated • 14 years ago
Assignee: philipp → rnewman
Reporter | Updated • 14 years ago
blocking2.0: beta8+ → beta9+
tracking-fennec: 2.0b3+ → 2.0b4+
Comment 4 • 14 years ago
Dropping this from me, as I'm not actively working on the contents of this bug.
Assignee: rnewman → nobody
Assignee | Comment 5 • 14 years ago
(In reply to comment #4)
> Dropping this from me, as I'm not actively working on the contents of this bug.
No probs, I already started messing around with this anyway (see comment 3).
Assignee: nobody → philipp
Comment 6 • 14 years ago
As per today's meeting, beta 9 will be a time-based release. Marking these all betaN+. Please move it back to beta9+ if you believe it MUST be in the next beta (i.e. trunk is in an unshippable state without this).
blocking2.0: beta9+ → betaN+
Comment 7 • 14 years ago
Fixing fields my automated script accidentally blanked. Apologies for the bugspam.
Updated • 14 years ago
Whiteboard: [softblocker]
Assignee | Updated • 14 years ago
Summary: sync chunking needs to be more resilient against app shutdowns → Upload chunking needs to be more resilient against app shutdowns
Updated • 14 years ago
tracking-fennec: 2.0b4+ → 2.0+
Comment 8 • 14 years ago
mconnor, can you comment on what the plan for this bug is? From what I read on IRC, this wasn't needing beta testers, and there are other bugs tracking perf fixes that are more important for testing. Also, tying bug dependencies to those bugs would be helpful. Thanks.
Reporter | Comment 9 • 14 years ago
Effectively, this is about backing off of one of the changes in bug 609398, which stopped writing the IDs we're fetching to disk. However, instead of wholesale reverting it, we'll just periodically flush the remaining IDs to the toFetch file so that we don't have to redownload and dupe-check everything from scratch.
This is 100% unrelated to other perf work.
Summary: Upload chunking needs to be more resilient against app shutdowns → download chunking needs to be more resilient against app shutdowns
Assignee | Comment 10 • 14 years ago
Oh, this is about *down*load chunking? Your mentioning the tracker in comment 0 threw me off and I thought this was about upload (separate issue, but my comment 3 about upload still holds for that case.)
(In reply to comment #9)
> Effectively, this is about backing off of one of the changes in bug
> 609398, which stopped writing the IDs we're fetching to disk. However,
> instead of wholesale reverting it, we'll just periodically flush the
> remaining IDs to the toFetch file so that we don't have to redownload
> and dupe-check everything from scratch.
Bringing back toFetch would also help us with bug 622762.
Then again, we might not need to bring back toFetch at all. We could sync items down in order of their timestamp and update our lastSync timestamp as we process individual WBOs; that way we could restart exactly where we left off last time. This might put a bit more stress on the server, though the records should mostly already be in input order, so the sort shouldn't be too expensive.
Thoughts?
Reporter | Comment 11 • 14 years ago
Not using the sort order means we lose the advantage of syncing folders first, which lets us avoid reparenting on a large sync.
Assignee | Comment 12 • 14 years ago
(In reply to comment #11)
> Not using the sort order means we lose the advantage of syncing folders
> first, which lets us avoid reparenting on a large sync.
Good point! (Not that we do this right now on the desktop... bug 625295)
Assignee | Comment 13 • 14 years ago
Bring back the toFetch file that was removed in bug 609398. We still sync all batches in one sync on mobile, so bug 609398 wasn't backed out entirely.
Unlike before, toFetch is now saved on the next event-loop tick so that we don't do synchronous I/O while syncing stuff down. We still have bug 608757 for making Utils.json{Load|Save} async. I have a WIP for that, so if we decide the synchronous I/O (now at least not interfering with Places I/O and the like) is bad on mobile, I can always resurrect that patch.
Attachment #506862 - Flags: review?(mconnor)
Assignee | Updated • 14 years ago
Whiteboard: [softblocker] → [softblocker][has patch][needs review mconnor]
Assignee | Comment 14 • 14 years ago
Comment on attachment 506862 [details] [diff] [review]
wip v1
Hmm, we don't update lastSync until the end of processIncoming, when all records have been downloaded. So if the app gets shut down in the middle of processIncoming, we still refetch all records.
So once we're in the batching mode and save to toFetch, we should update lastSync and not lastModified.
Attachment #506862 - Attachment description: v1 → wip v1
Attachment #506862 - Flags: review?(mconnor)
Assignee | Comment 15 • 14 years ago
This patch now also ensures that lastSync is fast-forwarded when in batch mode. That way on the next sync we resume where we left off rather than refetching everything.
Attachment #506862 - Attachment is obsolete: true
Attachment #506915 - Flags: review?(mconnor)
Comment 16 • 14 years ago
Sorry to rain on this parade, but loadToFetch calls Utils.jsonLoad, which does sync disk I/O. If we are doing all this work to avoid synchronous disk I/O in the first place, it seems rather silly to end up adding synchronous disk I/O.
Comment 17 • 14 years ago
Although it shouldn't be hard to make Utils.jsonLoad async. It would probably be a lot like bug 627772.
Assignee | Comment 18 • 14 years ago
(In reply to comment #16)
> Sorry to rain on this parade, but loadToFetch calls Utils.jsonLoad, which does
> sync disk I/O. If we are doing all this work to avoid synchronous disk I/O in
> the first place, it seems rather silly to end up adding synchronous disk I/O.
We're talking about very different scales here, though. Syncing down 1000 history items in batches of 50 will have us do a tiny bit of synchronous I/O 20 times, rather than N*1000 times (for some N) like now.
(In reply to comment #17)
> Although it shouldn't be hard to make Utils.jsonLoad async.
It's not, it just hasn't been a priority up until now. As I said, we could decide to nominate bug 608757 if it turns out to be a problem. It's a small fix that can land between beta and RC.
Comment 19 • 14 years ago
(In reply to comment #18)
> We're talking about very different scales here, though. Syncing down
> 1000 history items in batches of 50 will have us do a tiny bit of
> synchronous I/O 20 times, rather than N*1000 times (for some N) like
> now.
Unfortunately, it looks like any synchronous I/O on mobile ends up causing fairly bad pauses (for example, bug 627635). This is probably partly because on Android prior to Gingerbread (which uses ext4), the file system serializes all access to itself. If we are already doing I/O on a background thread, the main thread cannot do anything because it's blocked by the background thread.
Assignee | Comment 20 • 14 years ago
(In reply to comment #19)
> fairly bad pauses (for example, bug 627635). This is probably partly
> because on Android prior to Gingerbread (which uses ext4), the file
> system serializes all access to itself. If we are already doing I/O on
> a background thread, the main thread cannot do anything because it's
> blocked by the background thread.
F-I-N-E. Have it your way! :p
I uploaded patches for bug 608757 just now. Pls to be stopping with the rain on my parade nao. kthxbai.
Reporter | Comment 21 • 14 years ago
Comment on attachment 506915 [details] [diff] [review]
v2
Looks good here, woo.
Attachment #506915 - Flags: review?(mconnor) → review+
Assignee | Updated • 14 years ago
Whiteboard: [softblocker][has patch][needs review mconnor] → [softblocker][has patch][has review]
Assignee | Comment 22 • 14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated • 6 years ago
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox