Closed Bug 1056204 Opened 10 years ago Closed 10 years ago

[E-mail] Sync stops working after some time

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(b2g-v2.0 affected)

RESOLVED WORKSFORME
Tracking Status
b2g-v2.0 --- affected

People

(Reporter: mikeh, Unassigned)

References

Details

Attachments

(1 file)

Attached image Screenshot showing stuck sync (deleted) —
I saw this happen back in July while on a work-week in PDX; I saw it again this past weekend while driving through NY/PA/MD en route to DC from ON. Both times I was using a T-Mo cellular data connection with a $3/day SIM acquired in PDX. Basically, after successfully receiving email for a while, the Email app stops syncing. The "Last sync" status bar goes from showing a few minutes (I have my mozilla.com account set to sync every 30 minutes) to several hours. I'm not sure what causes this and really only notice it after the fact. Tapping the sync icon makes it spin but doesn't fix the problem. Force-closing the Email app and reopening it seems to clear the stuckness, and sync works again.
Running:
- gonk: v123
- gecko: b2g32_v2_0/20140815160200
- gaia: v2.0/e0de536a
Keywords: steps-wanted
The best STR I can come up with are:
1. use TWQA flash tool to flash a recent non-m-c (e.g. for last weekend, 2.0) build onto my Flame
2. configure WiFi
3. open the email app and add my mozilla.com account (auto-setup works fine)
4. select a sync interval, e.g. every 30 minutes
5. travel to the US, enable cellular radio with T-Mo SIM card
6. send and receive email as normal while driving/walking
7. get busy, stop paying attention to Flame
8. check Flame later on, open email app, notice that "Last Sync" says "X hours ago"
9. tap on the sync icon, observe it spinning, and that sync doesn't happen
10. press and hold the Home button to bring up the task switcher, then kill the Email app
11. reopen the email app, email syncs
Some related info, not really addressing the problem, just providing extra context: Bug 1018828 is a meta bug we had for tracking some sync failure fixes, and the important bits of the dependent bugs seem to have landed before v2.0/e0de536a, which is dated Fri Aug 15. Some of the fixes attached to the meta bug were likely applied after the July PDX trip, but v2.0/e0de536a should have all of them.
A further data point: while sync was failing, I tried the Browser app and was able to successfully load web pages.
I've seen this too. I have a device in this state right now, in fact.
I have some theories I'm running down in my head right now, but in the meantime, especially if you can get a logcat of the broken device trying to cronsync, it would be invaluable. Like if your sync interval is 5 minutes, just run logcat for ~5 minutes to make sure you catch it, or if you see it go by, that's really what we want/need. See https://wiki.mozilla.org/Gaia/Email/RequiredBugInfo for example commands, but since you're experts, really all we need is the "Gecko" and "GeckoDump" categories; so it's okay to grep on them or explicitly filter, etc. Also potentially helpful is turning on secret debug mode which enables a circular log buffer that can then be dumped to the sdcard on demand. This lets logs be gathered in the field. https://wiki.mozilla.org/Gaia/Email/SecretDebugMode
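For anyone who has already captured a full logcat and wants to trim it down to the two categories requested above, here is a minimal sketch in Python. It assumes the capture was taken with `logcat -v time`, so each line looks roughly like `MM-DD HH:MM:SS.mmm P/Tag( PID): message`; the function name `keep_email_lines` and the sample lines are made up for illustration and are not real output from a device.

```python
import re

# Assumed shape of a `logcat -v time` line:
#   "MM-DD HH:MM:SS.mmm P/Tag( PID): message"
# where P is the one-letter priority (V, D, I, W, E, F).
LINE_RE = re.compile(
    r"^\d\d-\d\d \d\d:\d\d:\d\d\.\d+ +[VDIWEF]/(?P<tag>[^(]*?)\s*\(\s*\d+\):"
)

# The two log categories asked for in the comment above.
WANTED_TAGS = {"Gecko", "GeckoDump"}


def keep_email_lines(lines, wanted=WANTED_TAGS):
    """Return only the lines whose logcat tag is in `wanted`."""
    kept = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("tag") in wanted:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Illustrative lines only, not captured from a real device.
    sample = [
        "08-15 16:02:00.123 I/Gecko   ( 1234): hypothetical sync message",
        "08-15 16:02:01.000 E/GeckoDump(  321): hypothetical dump message",
        "08-15 16:02:02.000 D/wpa_supplicant(   99): unrelated line",
    ]
    for line in keep_email_lines(sample):
        print(line)
```

Filtering on the device with a logcat filter spec gives the same result without the extra step; this is only useful when the full log has already been saved.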
Okay, so if closing the email app and reopening it fixes it, this suggests that the failsafe close is not working. And if the failsafe close is not working, this implies one of:
A) We're never requesting our wake-locks. If mozAlarm.data is still busted, this would explain this, but then cronsync would never work.
B) We request the locks, we go to force-close ourselves, but !document.hidden, so we don't close. If we're living in someone's pocket and the lock screen is enabled, this should definitely not be the case.
C) We are requesting the wake-locks, but we're also releasing them when we get our "oncronsyncstop" notification.
Scenario C likely implies that the sync is straight-up failing, which in turn implies that this is not a situation where something broke and permanently ate the mutex, because in that case the failsafe close should trigger when the sync gets stuck in the mutex queue. Before we fixed the TCP socket closing issue (present in the Aug 15 v2.0 build), this could and would happen for me at a 1-minute sync interval where we were leaking connections and eventually the server would start refusing to let us open new connections. It's conceivable we're looking at a variant of this; maybe resource exhaustion at a Gecko level somehow?
In any event, any type of logcat or circular buffer dumps around the time of a failure would be invaluable. It may also be possible to use logcat's own disk-based circular buffers on the device, but I think you somehow need to leave logcat running in the background, and I'm not really sure how to nohup/daemonize it with our setup. This is along the lines of the command you'd want to run:
logcat -v time -s -f /storage/sdcard0/logcat -n 16 -r 4096 Gecko:V GeckoDump:V
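To make the rotation settings in that command easier to reason about, here is a small Python sketch that rebuilds the same argument list and estimates how much sdcard space the rotated logs could take. It relies on the usual logcat meanings of the flags (`-r` is the rotation size in kilobytes, `-n` is the number of rotated logs kept, `-s` silences every tag not listed). The helper names `logcat_args` and `approx_rotated_mib` are made up for illustration, and the estimate ignores the file currently being written, which may add up to one more rotation size on top.

```python
def logcat_args(out_path, rotate_kb, rotated_count, tags):
    """Build a logcat argument list like the one suggested above.

    -v time : timestamped output format
    -s      : default every tag to silent, so only listed tags are kept
    -f      : write to a file instead of stdout
    -n      : number of rotated log files to keep
    -r      : rotate the file every `rotate_kb` kilobytes
    """
    args = ["logcat", "-v", "time", "-s", "-f", out_path,
            "-n", str(rotated_count), "-r", str(rotate_kb)]
    # "Tag:V" asks for everything at verbose priority and above for that tag.
    args += [f"{tag}:V" for tag in tags]
    return args


def approx_rotated_mib(rotate_kb, rotated_count):
    """Rough disk footprint of the rotated files, in MiB."""
    return rotate_kb * rotated_count / 1024


if __name__ == "__main__":
    args = logcat_args("/storage/sdcard0/logcat", 4096, 16,
                       ["Gecko", "GeckoDump"])
    print(" ".join(args))
    print(approx_rotated_mib(4096, 16), "MiB")
```

With the values from the command above (16 rotated files of 4096 KB each), the rotated logs would top out at roughly 64 MiB, so the capture should fit comfortably on a typical sdcard while still covering a long stretch of sync attempts.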
comment 1 satisfied the steps-wanted request - removing stale keyword
Keywords: steps-wanted
(In reply to Joshua Mitchell [:Joshua_M] from comment #7)
> comment 1 satisfied the steps-wanted request - removing stale keyword
Not really - the STR is a bit long, so I think we need to reduce the STR here a bit.
Keywords: steps-wanted
Mike - have you seen this anytime recently? Especially with the new gonk / firmware changes?
Flags: needinfo?(mhabicher)
Keywords: steps-wanted
Joshua_M, I recently ran with:
Gaia-Rev f8d3bf44029e0afc0124600a4bb34dba8fc1ad21
Gecko-Rev https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/f70a67a7f846
Build-ID 20141120001207
Version 34.0
Device-Name flame
FW-Release 4.4.2
FW-Incremental 40
FW-Date Tue Oct 21 15:59:42 CST 2014
Bootloader L1TC10011880
...and didn't see this issue anymore, so I'm going to mark this closed.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(mhabicher)
Resolution: --- → WORKSFORME