Closed Bug 808716 Opened 12 years ago Closed 12 years ago

[email] Be more intelligent about getting bodies

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect, P1)

defect

Tracking

(blocking-b2g:tef+, blocking-basecamp:-, b2g18 fixed, b2g18-v1.0.1 verified)

VERIFIED FIXED
blocking-b2g tef+
blocking-basecamp -
Tracking Status
b2g18 --- fixed
b2g18-v1.0.1 --- verified

People

(Reporter: squib, Assigned: jlal)

References

Details

(Whiteboard: QARegressExclude)

Attachments

(1 file)

This is mostly applicable to ActiveSync, but IMAP may benefit as well. Let's talk about ActiveSync: we're currently getting the bodies for *all* the messages in our sync range. This is really slow, for obvious reasons. We should determine a good strategy for getting bodies. For instance, we could get the most recent 5-10 unread bodies automatically, and then only get older ones as the user opens the message.
Andreas, I believe you mentioned wanting to test this out on your email account. I have a test branch[1] that truncates the bodies to a maximum of 16 bytes so we can see how much faster that is for you. If that branch makes things a lot faster, then we should definitely work on making our body fetching more intelligent. If it doesn't help that much, then we should focus on bug 808711 first, since that will reduce bandwidth usage by a factor of three for you. [1] https://github.com/mozsquib/gaia/tree/email-truncate-bodies
blocking-basecamp: ? → +
Priority: -- → P1
Assignee: nobody → squibblyflabbetydoo
One possibility here is to keep it simple and do the following: 1) get all the headers 2) get the bodies in descending order by date 3) as we get each body, show the message in the message list This would, at least, make it quicker to get at the most recent messages, since we wouldn't be forced to wait until the old bodies were downloaded.
Milestoning for C2 (deadline of 12/10), as this meets the criteria of "remaining P1 bugs not already milestoned for C1".
Target Milestone: --- → B2G C2 (20nov-10dec)
With bug 808711 fixed, this bug isn't as urgent as it once was. It's also moderately complex, depending on how clever we decide we should be. Assuming we still want this, it would be extremely useful for me to have access to an Exchange account or two (multiple versions would be helpful) to test the changes against them. I can make this work on Hotmail, but the changes I have in mind might break on Exchange, since I'm relying on stuff that doesn't quite follow the ActiveSync spec. What's the best way to get an Exchange account to test with?
In terms of getting an Exchange account: Microsoft offers hosted exchange accounts at $4/month with "Exchange Online": http://www.microsoft.com/exchange/en-us/exchange-online-hosted-email.aspx Allegedly, this is the same as a real Exchange server: http://en.wikipedia.org/wiki/Microsoft_Exchange_Server#Exchange_Online This might be a good first step, since I think standing up a full Exchange server will involve more manual effort. The easiest thing may be to find someone with a corporate card and have them sign up for the account for you. A more comprehensive approach would probably be to either purchase licenses for current and older versions, or, since older versions may not be readily available that avenue, procure a rather expensive Visual Studio premium account at $6,119. This would net the developer (I believe MSDN is licensed per-developer) of the account Exchange 2013, 2010, 2007, and 2003: http://msdn.microsoft.com/en-us/subscriptions/downloads/#searchTerm=exchange&ProductFamilyId=0&Languages=en&PageSize=10&PageIndex=0&FileId=0 It's conceivable MoCo may have some type of volume licensing deal for MS products in general or MSDN that could provide this to us. Maybe file an IT service request to find out?
In terms of whether we want to do this, I think it would be desirable to establish some a baseline performance metric for time-to-first-sync and time-for-incremental-sync-with-new-messages. I would suggest that a reasonable way to do this for first sync would be to: 1) Use willkg's in-process integration test baseline work to automate a test that creates an ActiveSync account and times the key timeline points of: 1a) Time to create an account. This is the time from hitting next to having the account say it was created. 1b) Time to initially sync an account. This is the time from hitting 'yeah, I love this account, let's go to there' and having stuff show up in the Inbox. 2) Run this under simulated sucky bandwidth conditions. Because we are 1337 linux users, we have the mighty power of "tc" to reliably simulate a sucky network connection. See http://superuser.com/questions/147156/simulating-a-low-bandwidth-high-latency-network-connection-on-linux as a starting google point. The incremental thing is a little annoying because of the whole SyncKey thing. It might be easiest to write a script that generates a bunch of mail traffic or plays games with moving messages into and out of folders using a different device account. Once we have these numbers for ActiveSync, we can also then do the low-ish bar thing for running them against IMAP for gmail and yahoo and seeing how those shape up. Then we ask product which of those to focus on.
Discussed this one with Jim today. The potential benefit of this change is smaller now that bug 808711 is in place. Changing the body handling like this is also somewhat risky this late in the game. It's still a nice optimizations but we suggest moving this out of v1. Re-noming to confirm with triage team.
blocking-basecamp: + → ?
Following Jim recommendation
blocking-basecamp: ? → -
Target Milestone: B2G C2 (20nov-10dec) → ---
I would love to approve a patch for this. I am using imap and it fetches megabytes of bodies for no reason.
For IMAP, we should only be downloading bodies for a handful of messages at a time. I believe we shoot for about 15?
The problem is that IMAP's minimum synchronization granularity is currently 1 day, so when the the number of messages per day is greater than our sync target, the sync target becomes meaningless although we do at least shrink our range down to a single day. Andreas' logcat indicated an inbox with 1000 messages in it where 268 of them were new to us. In conjunction with our lack of IMAP body fetch pipelining at the protocol layer, this can take a while to sync. (Also, I had punted on checkpointing database state partway through a sync for complexity/test complexity reasons, so when he kept killing the app no state was ever persisted.) The question of when to show headers/when to fetch bodies implied by this kind of throughput is that we should: - show headers as soon as we fetch them even though we will initially lack a snippet - do not have the slice 'grow' requests block on retrieval of bodies. - fetch bodies based on what is currently visible and allowing that the user may manage to scroll way faster than we can fetch bodies. Since the UI buffers potentially many screens of messages for responsiveness reasons, the notion of visibility needs to be extended to more than just what is present in the MailSlice. This would probably want to be implemented as the UI periodically sending updates of the visible index range on scroll. The central MailSlice logic would then maintain M batches of N body requests in flight, prioritizing from the center of the range outwards. Each time a batch completes, we look at the current state of the MailSlice so we never get bogged down downloading messages that are long moot. This would allow for 2 potential bandwidth reduction features as follow-ons: - never fetch bodies until the user clicks on the message - only fetch bodies after the user lingers on a set of messages without scrolling past them for some amount of time
(In reply to Andrew Sutherland (:asuth) from comment #11) > Andreas' logcat indicated an inbox with 1000 messages in it where 268 of > them were new to us. Mis-phrased. His Inbox had 1000 messages from December 18th at 4pm PST on that day (today), 268 of which were new. Andreas estimates a new message arrives every minute or two, so let's say 1200 a day. This is about 4 times Chris Lee's inbox rate, for those keeping count.
Blocks: 852226
Nominating for tef? -- this is one of the bugs we're currently working on for 1.0.1 perf improvements (bug 852226).
Assignee: squibblyflabbetydoo → jlal
blocking-b2g: --- → tef?
(tef- for now, as it looks like we are just getting started here so this feels very much like it's not going to make the boat, so lets instead focus on the email improvements that are possible in the days we have left for v1.0.1!)
blocking-b2g: tef? → -
Hey Michael, I have been working on this for a few days now and I am confident we will have this in by the end of the week (we are incrementally reviewing the patch to make this as smooth as possible to land). We expect some very significant bandwidth savings and the time for messages to appear should also be reduced dramatically by not waiting for the entire email body to show something... Not sure if that is the best argument for landing this in v1.0.1 but I suspect we need this in to make the most of the other improvements...
m1, we'll land this on master shortly and then re-nominate for tef when it's ready to go.
Pointer to Github pull-request
Early version, still need to address some review comments from the backend changes... Will post performance/bandwidth differences tomorrow.
Comment on attachment 728601 [details] Pointer to Github pull request: https://github.com/mozilla-b2g/gaia/pull/8775 This should be ready for a formal (and hopefully final) review... asuth/jrburke have reviewed some parts... (this request is for backend/frontend)
Attachment #728601 - Flags: review?(squibblyflabbetydoo)
Attachment #728601 - Flags: review?(squibblyflabbetydoo) → review+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Re-nominating for tef -- this is now landed in master.
blocking-b2g: - → tef?
+clee I wanted to respond to your mailing list comment here: This patch does two big things- - decouples body downloads from sync operations - provides hook for partial body downloads for snippets This has the great side effect of reducing bandwidth consumption but it also lets us show messages _much_ faster. Once we show messages I think its perfectly reasonable to begin the process of fetching bodies (partially full or otherwise) but now we can do that in parallel with showing the user content. We are not stopping here this just a good place to start optimizing for both responsiveness and bandwidth consumption.
(In reply to Dylan Oliver [:doliver] from comment #21) > Re-nominating for tef -- this is now landed in master. Can you all provide a risk evaluation for the tef? triage? That should play a major role in the final decision.
(In reply to Alex Keybl [:akeybl] from comment #23) > (In reply to Dylan Oliver [:doliver] from comment #21) > > Re-nominating for tef -- this is now landed in master. > > Can you all provide a risk evaluation for the tef? triage? That should play > a major role in the final decision. If worker-threads or any other non-trivial performance work that happens chronologically after this bug is desired, this bug must be uplifted because there are deep semantic changes that are not viably separable. This will be the case for most performance fixes from here on out. Simpler/orthogonal bugs like issuing fetches in parallel can be uplifted without this bug, but will have an increased uplift cost and an increased probability of regression.
(In reply to Alex Keybl [:akeybl] from comment #23) > (In reply to Dylan Oliver [:doliver] from comment #21) > > Re-nominating for tef -- this is now landed in master. > > Can you all provide a risk evaluation for the tef? triage? That should play > a major role in the final decision. There is risk in this patch as it makes significant changes in the way we retrieve email. However, we also know that current email performance on the 1.0.1 branch is not acceptable and this bug is one of those that will be required to reach minimum performance targets. See also comment #24 that other critical performance enhancements like bug 814257 are being developed on top of this one so at this point we need to get this in place on 1.0.1 to make any further progress.
blocking-b2g: tef? → tef+
The issue does not need a test case.
Flags: in-moztrap-
Can you please provide steps to verify this fix - as we will blackbox test from the UI?
v1.0.1: 8256375bf7f83c66d1946247f4ce337569a32b4a This did not uplift to v1-train cleanly. You might be able to get it to do that by doing: cd gaia git checkout v1-train git cherry-pick 8256375bf7f83c66d1946247f4ce337569a32b4a <resolve merge conflict> git add apps/email/js/message-cards.js git commit
Flags: needinfo?(jlal)
v1-train: ec3e03370072f8340b729dab0d2354fd2b6ef1bc
Flags: needinfo?(jlal)
Marking as QARegressexclude to make sure the right person gets this, to do a in depth check on this issue.
Whiteboard: QARegressExclude
I set up a few different email accounts on 1.0.1 and this seems snappy enough. It's purely subjective. But unless someone wants to go to the trouble of setting up a performance test to establish base lines and measure improvements, we'll have to go will user perception as verification for now.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: