Closed Bug 1073225 Opened 10 years ago Closed 10 years ago

[Marketplace] Marketplace launch is slow with two white screens visible before Content Visible stage

Categories

(Marketplace Graveyard :: Consumer Pages, defect, P1)

Avenir
ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:2.1+)

VERIFIED FIXED
blocking-b2g 2.1+

People

(Reporter: bzumwalt, Unassigned)

References

()

Details

(Keywords: perf, regression, Whiteboard: [2.1-exploratory-2][NPOTB])

Attachments

(1 file)

Attached file Logcat (deleted) —
Description:
Marketplace launch latency is high and visible to user. Two blank white screens can be seen as content loads, one immediately after initial splash screen, the second after animated loading icon disappears. 

Manual timing shows 5000ms from launch to interaction ready. Time to load tool in Developer HUD shows a cold launch of 3291ms from launch to first paint.

Repro Steps:
1) Update a Flame device to BuildID: 20140925000204
2) Launch Marketplace app

Actual:
Launch latency for Marketplace app is high with two white screens visible for 5000ms load time.
  
Expected: 
Marketplace launches in ideal of 1000ms from icon touch to content visible.
  
Environmental Variables:
Device: Flame 2.1 KK (319mb)
BuildID: 20140925000204
Gaia: 8061ab487d42cbc49b329fd68b9ca90e0fe477e6
Gecko: e970bc96f8b5
Version: 34.0a2 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
  
Repro frequency: 3/3, 100%
See attached: Youtube video clip & logcat
Youtube link: http://youtu.be/TIWLV70gPro
Issue DOES occur on 2.2 Flame KK (512mb & 319mb), and 2.1 Flame KK (512mb)

Flame 2.2 KitKat Base (512mb)
Device: Flame 2.2 Master
BuildID: 20140925040206
Gaia: c5d2e2f4ebf5f370d6003517057dcd47493dec90
Gecko: 32acbe1d64dc
Version: 35.0a1 (2.2 Master)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.2 KitKat Base (319mb)
Device: Flame 2.2 Master
BuildID: 20140925040206
Gaia: c5d2e2f4ebf5f370d6003517057dcd47493dec90
Gecko: 32acbe1d64dc
Version: 35.0a1 (2.2 Master)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.1 KitKat Base (512mb)
Device: Flame 2.1
BuildID: 20140925000204
Gaia: 8061ab487d42cbc49b329fd68b9ca90e0fe477e6
Gecko: e970bc96f8b5
Version: 34.0a2 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Actual Results: Launch latency for Marketplace app is high with two white screens visible for 5000ms load time.


Issue does NOT occur on 2.0 Flame KK (512mb & 319mb)

Flame 2.0 KitKat Base (512mb)
Device: Flame 2.0
BuildID: 20140924183011
Gaia: 87ee41fcb3f9a14d7a8bb67f1dd7fd95a6bcd0f0
Gecko: b1cb27078909
Version: 32.0 (2.0)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Flame 2.0 KitKat Base (319mb)
Device: Flame 2.0
BuildID: 20140924183011
Gaia: 87ee41fcb3f9a14d7a8bb67f1dd7fd95a6bcd0f0
Gecko: b1cb27078909
Version: 32.0 (2.0)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Actual Results: Launch time is consistantly 1 second or less. White screens barely/not noticable.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(dharris)
Blocks: 897156
Priority: -- → P1
[Blocking Requested - why for this release]:

This is a very long load time, and a regression. Nominating this as a blocker
blocking-b2g: --- → 2.1?
Flags: needinfo?(dharris)
Keywords: perf
QA Contact: pcheng
I was unable to find a 'working' build here.

Testing method: I used a stopwatch to time from the time I tap on Marketplace icon to contents loaded on Marketplace.

v180 base-only: Cold launch ~7 seconds; warm launch less than 1 second
v180 + latest 2.2 master: Cold launch ~8 seconds; warm launch less than 1 second
Latest 2.2 master full flash: Cold launch ~8 seconds, warm launch less than 1 second

Removing regression keyword because this is not a regression.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
This is most likely related to new feed changes. Or it could be related to mozMobileConnections. CCing some folks who might be able to investigate more closely.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
I just had a test on my Firefox OS Flames on 1.3 and 2.1.

For some reason, even a warm, cached load takes ~5-6 seconds on Mozilla WiFi (a 11 megabyte/s connection) before the splash screen goes away. (Cold, uncached takes ~8-10 seconds.)
Assignee: nobody → mpillard
I'll run more tests, but on my 2.2 Flame (v180 base, going to update shortly to KK) I'm seeing:
- 6s till hiding splash screen (cold launch)
- 3s till hiding splash screen (warm launch)

I've confirmed that everything is cached correctly - on warm launch, the only network requests that are made are for Google Analytics and New Relic.
It looks like https://github.com/mozilla/fireplace/commit/86f767f7 fixed the API URLs in the offline cache whitelist. This would suggest that, at the time of this bug's filing, the feed API response was previously *not* being stored in nor retrieved from Fireplace's localStorage offline cache system.

That would explain the slowness.

I hooked up a Flame on 2.1 to my computer, using the remote debugger to look at the Network tab. On the high-speed Mozilla wifi (a 11 megabyte/s connection), the splash screen disappeared in 1.3–2.7 seconds - which is not perfect but close to acceptable. We should be striving for under a second.

YMMV on slower connections.

There's this performance game plan <https://wiki.mozilla.org/Marketplace/Performance_optimization>, but IMO these are five things that could improve perf the most (ordered by impact, not easiness):

1. figure out how to get rid of consumer info or not block on it or speed it up (bug 1038936 should help)
2. speed up the Feed API URL (make it smaller, get Zamboni to serve it faster, etc.)
3. break up CSS/JS bundles, inline the critical CSS/JS, and lazy-load the rest
4. use SPDY (CloudFlare's SPDY is fantastic; but bug 987895 determined that IT doesn't want to deploy it)
5. compress things that aren't compressed already (SVGs in bug 1080098, locales in bug 1079894)

Below are a few questions/findings surrounding the Feed API URL, which is one of the two blockers (consumer info being the other) of rendering the homepage:

––

* Why does the Feed API URL have a max-age `180` but `settings.js` says `60 * 30` (`1800`)? [1]

https://github.com/mozilla/fireplace/blob/924b23b/src/media/js/settings.js#L52

>        api_cdn_whitelist: {
>            '/api/v2/fireplace/search/': 60 * 3,  // 3 minutes
>            '/api/v2/feed/get/': 60 * 30,  // 30 minutes
>        },

Keep in mind that these max-ages aren't actually used yet. Zamboni is the one that sets them. But with bug 957383, Fireplace should be the one sending the `Cache-Control` request headers, and then Zamboni looks at them and adds them as response headers, which the CDN then respects.

––

* Why is the Feed API URL's TTL only one day? IMO it should be a week - or even longer. Even if the TTL is far ahead, all that means is that the Marketplace won't block on that API request. That API request will still be updated immediately in the background (sometime after the page has been rendered) - which means when the user navigates to, say for example, an app detail page, the new homepage will be shown when the user presses the back button in the header.

https://github.com/mozilla/fireplace/blob/924b23b/src/media/js/settings.js#L77
       
>        offline_cache_whitelist: {
>            '/api/v2/fireplace/consumer-info/': 60 * 60 * 24 * 7,  // 1 week
>            '/api/v2/feed/get/': 60 * 60 * 24 * 1  // 1 day
>        },

––

* Why are we setting `must-revalidate` in the `Cache-Control` request header (which the CDN consequently sets in the `Cache-Control` response header)? [3] [4]

> Cache-Control: must-revalidate, max-age=122

https://www.dropbox.com/s/dgdmcginydtvlwb/Screenshot%202014-10-08%2014.04.17.png?dl=0

https://marketplace.firefox.com/api/v2/feed/get/?cache=1&lang=en-US&limit=10&region=restofworld&vary=0

https://marketplace.cdn.mozilla.net/api/v2/feed/get/?cache=1&lang=en-US&limit=10&region=restofworld&vary=0

––

* Upon forcing revalidation (hard refresh), why is the `max-age=122`? [5] [6]

> Cache-Control: must-revalidate, max-age=122

https://www.dropbox.com/s/19kimu96qqwimlu/Screenshot%202014-10-08%2014.08.27.png?dl=0

https://marketplace.firefox.com/api/v2/feed/get/?cache=1&lang=en-US&limit=10&region=restofworld&vary=0

––
With 5 bars on 3G AT&T network in Mountain View, California ...

Takes ~3s for splash screen to disappear with cold browser cache to load Marketplace:
https://www.dropbox.com/s/8tqrymw9lsqr898/Screenshot%202014-10-08%2014.49.34.png?dl=0
https://www.dropbox.com/s/al9c7mhvvq6x29w/Screenshot%202014-10-08%2014.53.58.png?dl=0
https://www.dropbox.com/s/oe9r4stewl9e365/Screenshot%202014-10-08%2014.50.43.png?dl=0

Takes ~4.5s (~1.5 seconds slower) with Fireplace's localStorage offline cache disabled:
https://www.dropbox.com/s/e7qar113fnle1cx/Screenshot%202014-10-08%2014.58.26.png?dl=0

Note: These benchmarks aren't perfect. All other assets are requested after `DOMContentLoaded` (the API ones in the background).

We should have a tool that records these over time (I'm building one <https://github.com/cvan/arewefast>, but it's not done yet).
1. consumer_info isn't actually blocking as it is stored offline.
2. My current Feed is 24KB served by the CDN. That's 50% smaller than the New/Popular pages. It was implemented with performance in mind.

Though good point that I should increase the Feed TTL.

Retrieving data from disk might also be something we want to benchmark.
(In reply to Kevin Ngo [:ngoke] from comment #9)
> 1. consumer_info isn't actually blocking as it is stored offline.

That's correct, if I've visited the Marketplace within the last 7 days.

https://github.com/mozilla/fireplace/blob/01c81dea/src/media/js/settings.js#L76

Using Flue's LATENCY option, I was able to verify locally that the offline cache still works correctly and renders the page without blocking on the requests. So that's good :)

> 2. My current Feed is 24KB served by the CDN. That's 50% smaller than the
> New/Popular pages. It was implemented with performance in mind.

Size isn't what I'm concerned about.

Taking 1.96 seconds to get the API response back from Zamboni is what I'm concerned about:

https://www.dropbox.com/s/lzf0t9c3kv3ejvz/Screenshot%202014-10-08%2016.18.28.png?dl=0

I'm in California closer to our data centres and on a faster connection than most of our users would ever dream of.

Even though we use the CDN for search and feed API endpoints, not all endpoints are served from the CDN (e.g., apps, ratings, consumer info.). And for that first person every 3 minutes who gets the CDN cache miss, it'll take Zamboni time + CDN time to get a response back. Luckily, other subsequent users within that 3-minute window have to wait only CDN time for the response.

And now that the app detail page can't be loaded from the homepage model data anymore, the user has to wait for another XHR for each app detail page to complete. This makes the homepage faster, but at a cost. Bug 991955 should definitely get addressed soon.

Luckily, consumer info. appears fast now (at least for me):

https://www.dropbox.com/s/sx6jh4kil5wm8gx/Screenshot%202014-10-08%2016.24.57.png?dl=0

No more than a few weeks ago, from my apartment Internet, when the TTL expired for consumer info.'s offline cache TTL, it was taking 3-5 seconds to load just the consumer info. endpoint. I'll file a bug if I see that again.

––

From Chrome's Timeline profiler, it's a bit hard to cross-reference everything. But there are 100s of milliseconds here and there spent parsing HTML (presumably for nunjucks templates) after XHR responses are finished and then a bunch of `setTimeout`s within the `require.js` code in `include.js`.

> [utils_local] Checking online state with socket include.js?b=1412710964149:8

Do we need to check online state on page load?  Why not check it on failure of an XHR or upon `window` `focus` (which covers the case of when the app was in the background on Firefox OS but is now in the foreground)?

––

According to Chrome's Heap Snapshots from its Profiler, Persona seems to be our biggest hog of memory. Might be something worth investigating.

––
blocking-b2g: 2.1? → 2.1+
Keywords: regression
Blocks: marketplace-perf
No longer blocks: 897156
From the looks of it, the only things we have discrete control over would be:
* bug 991955 (Put fireplace app API behind CDN)
* bug 1079894 (PR is in, thanks mat)
* bug 1080098 (SVGs served from CDN are not gzip'd)
Those are the quickest wins, yes. I'd add this one to the list:

* bug 1038936 (Remove waffle switches from consumer info)

Soon we ought to spend time to speed up the Feed API URL (reduce the size, profile the time spent in Zamboni), since it's arguably the most important endpoint.

Sometime sooner than later, I really think we should reopen bug 987895 to use SPDY - or file another bug to do so.

If we don't decide to use SPDY, when HTTP2 gets adoption, what will our excuse be then?
Adding qawanted to check if the issue of the two white screens are shown before content visible change with the 2.1 on top of v188.
Keywords: qawanted
QA Contact: pcheng
(In reply to Peter Bylenga [:PBylenga] from comment #13)
> Adding qawanted to check if the issue of the two white screens are shown
> before content visible change with the 2.1 on top of v188.

On the video attached at comment 0 there isn't actually a 2nd white screen visible. I consulted the reporter and he says that the 2nd white screen would appear BEFORE marketplace elements displayed, and it's NOT referring to the 'top of marketplace contents loaded but actual contents below need a split second to load' behavior (which is shown in the video). Note that I have NOT seen this 2nd white screen, even on the reported 2.1 build on v180 base.

The 2 white screens issue is NOT reproducible on v188 with 2.1 full flash 319MB mem. I only see the 1st white screen appear before the animated loading icon. Cold launch time is still ~6 seconds.

Tested on
Device: Flame (full flash, 319MB mem)
BuildID: 20141022081833
Gaia: 734d3547fb6c65e8bc4dd1a52b26f70bdfee7474
Gecko: 9646e94a041c
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 34.0 (2.1)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(jmercado)
Keywords: qawanted
Flags: needinfo?(jmercado) → needinfo?(ktucker)
Assignee: mpillard → nobody
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
According to comment 14 there is no second white screen on v188 w/ 2.1 (and the video in comment 0 never shows a second white screen anyway).  As far as this bug, as filed, it's either FIXED or INVALID.

This bug also mentions three performance related bugs which have been closed: bug 1038936, bug 1080098, bug 1079894.  General slowness and performance of the Marketplace is being tracked in bug 1075278.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Verifying bug as fixed as per Comment 15.

Actual Results: 

First white screen still present, second white screen no longer reproduces. Launch latency still appears to be long at 6.9 seconds on average (1.5 to 3 seconds time to load).

Cold start time measured by time to load in Developer menu:
1585, 2323, 2058, 1151, 2752

Cold start time measured by hand using stopwatch (stopping clock at start of content load following animated splash screen):
7.2 seconds, 7.6 seconds, 5.2 seconds, 7.9 seconds, 6.5 seconds


Device: Flame 2.1 (319mb)(Kitkat Base)(Shallow Flash)
BuildID: 20141105001204
Gaia: 154da5e17029a51002d5d9b7df39563d509edde6
Gecko: 3b0c3580a58d
Version: 34.0 (2.1)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Device: Flame 2.2(319mb)(KitKat)(Shallow Flash)
BuildID: 20141105040206
Gaia: 7c9e7cabbde941b976e0e40a3a1d94e21aa9c5e9
Gecko: 62990ec7ad78
Version: 36.0a1 (2.2 Master)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Whiteboard: [2.1-exploratory-2] → [2.1-exploratory-2][NPOTB]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: