Closed Bug 869715 Opened 12 years ago Closed 11 years ago

Marketplace API performance needs improvement

Categories

(Marketplace Graveyard :: API, defect, P4)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 926640

People

(Reporter: basta, Unassigned)

References

Details

(Keywords: perf)

Some unscientific benchmarks gathered from `time curl <url>`: Homepage: real 0m1.283s user 0m0.009s sys 0m0.011s Search (twitter): real 0m1.159s user 0m0.007s sys 0m0.008s Twitter's rating list, page 1: real 0m0.514s user 0m0.004s sys 0m0.004s Every hundred millisecond matters.
Blocks: 855908
Note that Firefox OS introduces additional latency, which means that search results can take over five seconds to load, changing sort can take upwards of three to four seconds to load.
Please include the URL you use, and ideally the response headers. So we can figure out servers and if things like geo-encoding occurred etc.
%% curl -s -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} n" -o /dev/null 'https://marketplace-altdev.allizom.org/api/v1/apps/category/?dev=desktop&format=json&lang=en-US&pro=3040fe.32.1&region=es' Connect: 0.037 TTFB: 0.276 Total time: 0.276 n% %% curl -s -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} n" -o /dev/null 'https://marketplace-altdev.allizom.org/api/v1/apps/category/?dev=desktop&format=json&lang=en-US&pro=3040fe.32.1&region=es' Connect: 0.042 TTFB: 2.031 Total time: 2.031 n% %% time curl -I -X GET 'https://marketplace-altdev.allizom.org/api/v1/apps/category/?dev=desktop&format=json&lang=en-US&pro=3040fe.32.1&region=es' HTTP/1.1 200 OK Server: gunicorn/0.17.4 Vary: API-Filter, X-Requested-With, Accept-Language, Cookie, X-Mobile, User-Agent X-Backend-Server: dev1.addons.phx1.mozilla.com Content-Type: application/json; charset=utf-8 Access-Control-Expose-Headers: API-Filter, API-Status, API-Version Strict-Transport-Security: max-age=2592000 Date: Wed, 08 May 2013 03:10:15 GMT Transfer-Encoding: chunked Access-Control-Allow-Origin: * X-Content-Security-Policy-Report-Only: policy-uri /services/csp/policy?build=a1a1 Via: Moz-pp-zlb09 Connection: keep-alive API-Filter: carrier=&device=tablet&lang=en-US&region=es Access-Control-Allow-Methods: GET, OPTIONS Access-Control-Allow-Headers: X-HTTP-Method-Override, Content-Type API-Version: 1 curl -I -X GET 0.01s user 0.00s system 3% cpu 0.236 total %% time curl -I -X GET 'https://marketplace-altdev.allizom.org/api/v1/apps/category/?dev=desktop&format=json&lang=en-US&pro=3040fe.32.1&region=es' HTTP/1.1 200 OK Server: gunicorn/0.17.4 Vary: API-Filter, X-Requested-With, Accept-Language, Cookie, X-Mobile, User-Agent X-Backend-Server: dev2.addons.phx1.mozilla.com Content-Type: application/json; charset=utf-8 Access-Control-Expose-Headers: API-Filter, API-Status, API-Version Strict-Transport-Security: max-age=2592000 Date: Wed, 08 May 2013 03:10:18 GMT Transfer-Encoding: chunked Access-Control-Allow-Origin: * X-Content-Security-Policy-Report-Only: policy-uri /services/csp/policy?build=a1a1 Via: Moz-pp-zlb09 Connection: keep-alive API-Filter: carrier=&device=tablet&lang=en-US&region=es Access-Control-Allow-Methods: GET, OPTIONS Access-Control-Allow-Headers: X-HTTP-Method-Override, Content-Type API-Version: 1 curl -I -X GET 0.01s user 0.00s system 0% cpu 2.064 total Most of the time from `curl` I get 0.2-0.8s. In the browser, that's a different story. Homepage: http://f.cl.ly/items/001v2V0Q1t041j3N1Y1J/Screen%20Shot%202013-05-07%20at%208.15.47%20PM.png Blank search: http://f.cl.ly/items/2u0y2x2u1t39072R3J0a/Screen%20Shot%202013-05-07%20at%208.16.36%20PM.png Search for "apple": http://f.cl.ly/items/3O3B2W2t1y0d1r362r0J/Screen%20Shot%202013-05-07%20at%208.16.18%20PM.png Notice how fast the featured apps response is. I wouldn't be surprised if TastyPie -> MySQL -> ES -> MySQL -> Tastypie / Django is slowing down search. The awesome news is that Rob Hudson is fixing all of this so we never hit MySQL for searches. (But if ES is what's slow we might be in trouble.) If anyone (e.g., Basta) has any more observations, throw them in this bug. I'm very curious, but all I know is that on the Unagi phones the categories and search are pitifully slow.
Flags: needinfo?(mattbasta)
Note that your benchmarks are wrong because you're using -I, which only does a HEAD request.
Flags: needinfo?(mattbasta)
The endpoints that are particularly critical (i.e.: the ones that Krupa has been complaining about) are: /api/v1/apps/search/ - Any search seems to take at least a second when I curl, even with just a ?q=. Some take two and up. /api/v1/apps/search/featured/ - Each load of one of these pages with no particular args or headers exhibits the same as noted in comment #0. /api/v1/apps/rating/?app=... - Any list of ratings with ?app= with at least one rating seems to take a half second or more.
I've got a local mostly working ES-only search API working. Here's the difference locally (using cvan's fancy curl output): The ES -> MySQL currently: Connect: 0.002 TTFB: 0.559 Total time: 0.560 The ES only of the future: Connect: 0.002 TTFB: 0.075 Total time: 0.075
Thanks for data. That helps.
Priority: -- → P4
FYI as part of bug 866942, I added in some statsd API logging and pages. Now we need some data (and probably to start refining our pages). http://localhost:5000/graphite-api?site=marketplace-altdev&graph=apps
Depends on: 870557
Behold: http://addonsadm.private.phx1.mozilla.com/dashboard/graphite-api?site=marketplace&graph=apps 750ms to 2 seconds is not good. We can get that faster.
This is likely because of elasticsearch but if you want to see something painful watch http://screencast.com/t/NtTRsFacLC and jump to the parts where Victor is waiting for the Marketplace consumer pages to load (i.e., the API endpoints to load).
Assignee: nobody → amckay
Target Milestone: --- → 2013-05-30
The request middleware takes 30ms locally. 28ms of that is in one middleware: 09:42:08 statsd:INFO Timing: base.request.LocaleMiddleware, 28, 1 :/Users/andy/sandboxes/zamboni/django_statsd/clients/log.py:13
That's pretty much all in tower.activate
(In reply to Andy McKay [:andym] from comment #14) > We should start with search: > > https://rpm.newrelic.com/accounts/315282/applications/2787793/ > transactions#id=281592532 I can re-enable the 'search-api-es' waffle on -dev and compare to that. I'm hoping we can move to that soon.
Depends on: 874588
For other people looking at optimising: here is a page that pulls in the API and then shows it as HTML. Which then gives you DJDT on the request. https://github.com/andymckay/zamboni/commit/472eb2
Depends on: 876523
Target Milestone: 2013-05-30 → ---
Bugs are spawning off this, keeping this in the milestone to stay on my radar.
Target Milestone: --- → 2013-06-06
Depends on: 877714
Target Milestone: 2013-06-06 → 2013-06-13
Target Milestone: 2013-06-13 → 2013-06-20
I think the only slow API left from a timing point of view (well at least anything I'm worrying about) is search: http://addonsadm.private.phx1.mozilla.com/dashboard/graphite-api?site=marketplace&graph=apps.search Which I know robhudson is working.
The featured API is probably a bad one. And it's on the homepage too.
Target Milestone: 2013-06-20 → 2013-06-27
Yay @robhudson and his magic search fixes: http://cl.ly/image/2d1k2k3H2o1q
Target Milestone: 2013-06-27 → 2013-07-04
Assignee: amckay → nobody
So this is pretty open-ended right now, so here's some specific goals that I think we can hit that would satisfy the purposes of this bug: Category API https://marketplace-dev.allizom.org/api/v1/apps/category/ Currently takes about 0.4-0.6s [1] to complete. Since it's cacheable by region and carrier, this could imaginably be sped up very significantly. On the front end, we could remove the feature profile from requests to this API and the back-end could set cache control headers to get Zeus to cache this (it should never change with user state). I think we could easily get this one down to 0.1-0.3s. Search Feature API https://marketplace-dev.allizom.org/api/v1/fireplace/search/featured/ Currently takes a consistent 1.3s [1] to complete. Can we store user objects in memcached? The object that tells whether the user has installed/developed/purchased an app could be memoized and stored in memcached and fetched with a get_multi. Busting the cache for that object would only need to be done in the receipt endpoint(s). I think we can realistically get this API to complete in under a second. [1] All tests were carried out with `time curl <url>` on a VPS in Los Angeles. Times were adjusted to represent waiting time between confirmation of HTTP request headers and first byte.
I'm going to take this to wrap some explicit caching around the featured API. I can look at the category API and caching as well.
Assignee: nobody → robhudson.mozbugs
Target Milestone: 2013-07-04 → 2013-07-11
Basta: Can we draw the screen before the featured apps API returns? I.e. draw everything else but leave the featured apps area with a spinner? Why do we need to block on it? Likewise with the categories... they aren't shown unless the user clicks.
https://github.com/mozilla/zamboni/commit/94b8704 adds caching around /search/featured.
(In reply to Rob Hudson [:robhudson] from comment #23) > Basta: Can we draw the screen before the featured apps API returns? I.e. > draw everything else but leave the featured apps area with a spinner? We used to have that, but you end up with weirdness: 1. There's twice as many requests for the homepage, which means that on the Unagi, things load twice as slowly 2. If the featured apps load first, you're left with a spinner for the body 3. If there's no featured apps, the box and the spinner randomly disappear > Likewise with the categories... they aren't shown > unless the user clicks. Categories don't block the homepage from rendering, but we need to fetch them on initial load for a few reasons: 1. We use that data to get the names of the categories, so if you're deeplinked to a category we know what the name of it is (we only have the slug from the URL) 2. If the user clicked the category dropdown and we showed a spinner for a few seconds, that's pretty crappy UX since we could have been loading that asynchronously (which we currently do) 3. Something something operator shelf SIM card
Target Milestone: 2013-07-11 → ---
Assignee: robhudson.mozbugs → nobody
Closing since we have been working on more specific bugs. Marking as duplicate of the main API tracking bug.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.