Closed Bug 1522838 Opened 6 years ago Closed 5 years ago

High input latency on Google Docs

Categories

(Core :: Performance, defect)

Platform: ARM64 Windows 10
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED DUPLICATE of bug 1488435
Performance Impact: high

People

(Reporter: overholt, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: perf:responsiveness)

I was typing into a Google Doc (https://docs.google.com/document/d/1QjoOgl5vkt2zNd0xg2MVr2Gv_l7riViSpzyNKIvKdfo/edit#) and hit a lot of input latency. This is with today's Nightly on a Lenovo C630 ARM64 Windows 10 laptop. Here's a profile: http://bit.ly/2Wfzrtb.

On a non-lorem-ipsum document (where I first noticed the high latency), I got this profile: http://bit.ly/2WgQPgV

Here's another profile: http://bit.ly/2WlsaYN

Whiteboard: [qf]
Flags: needinfo?(dpalmeiro)

I thought this might be related to Ion support, but looking at this x86 profile with the tracelogger enabled, http://bit.ly/2GacQbR, it appears that most of the time is actually spent in Baseline and there is barely any execution in Ion at all. It seems something else is going on; I will try to collect the same info for ARM64.

Whiteboard: [qf] → [qf:p1:responsiveness]

(In reply to Denis Palmeiro [:denispal] from comment #3)

> I thought this might be related to Ion support, but looking at this x86 profile with the tracelogger enabled, http://bit.ly/2GacQbR, it appears that most of the time is actually spent in Baseline and there is barely any execution in Ion at all. It seems something else is going on; I will try to collect the same info for ARM64.

I attempted to reproduce this on x86-64 Linux using Nightly and had no success. The profile generally matched the ones attached to this bug, but there was no delay. If you're interested, I'd love to help you investigate this.

FWIW, I am able to reproduce what Andrew can see on ARM. I also was able to capture this with the tracelogger on ARM64 and most of our time is also spent in Baseline, which is good. However, I want to confirm again that the Ion events on x86 are not the main reason it's faster there. I also discussed the profile at the time with Bas, and he believes it's part of our implementation of sparse array dictionary objects which could be improved. I also tested this on Edge and they blow us out of the water when it comes to gdocs latency on ARM64. There's definitely room for improvement.

Flags: needinfo?(dpalmeiro)

(In reply to Denis Palmeiro [:denispal] from comment #5)

> FWIW, I am able to reproduce what Andrew can see on ARM. I also was able to capture this with the tracelogger on ARM64 and most of our time is also spent in Baseline, which is good. However, I want to confirm again that the Ion events on x86 are not the main reason it's faster there. I also discussed the profile at the time with Bas, and he believes it's part of our implementation of sparse array dictionary objects which could be improved. I also tested this on Edge and they blow us out of the water when it comes to gdocs latency on ARM64. There's definitely room for improvement.

As I said, I am very interested in working with you on getting to the bottom of this, especially if Bas continues to think that there is room for improvement. I was able to get a profile on x86 on Ubuntu with a build from moz-central that contains jstrace results. Despite not knowing exactly what I am looking at, it seems that most of the time is spent in baseline but eventually some of the code ends up executing in Ion as we move further along.

For https://perfht.ml/2WSIekM, I turned the timing granularity higher than normal, which is why (I think) there are so many calls to get the clock time (etc.) and the overall lagginess. Again, running this test without the profiler shows no input latency at all.

HTH.

(In reply to Will Hawkins from comment #6)

> As I said, I am very interested in working with you on getting to the bottom of this, especially if Bas continues to think that there is room for improvement. I was able to get a profile on x86 on Ubuntu with a build from moz-central that contains jstrace results. Despite not knowing exactly what I am looking at, it seems that most of the time is spent in baseline but eventually some of the code ends up executing in Ion as we move further along.

One thing we can try is profiling this with Edge on ARM64 to see if there are any major gaps we can observe. If we can isolate the bad performance to a few JS functions, that could be quite useful.

> For https://perfht.ml/2WSIekM, I turned the timing granularity higher than normal, which is why (I think) there are so many calls to get the clock time (etc.) and the overall lagginess. Again, running this test without the profiler shows no input latency at all.

> HTH.

The timing calls are mostly coming from the trace logger. We take a timestamp using mozilla::TimeStamp (which isn't ideal for this) at the beginning and end of every event, and many of these events are only 50 microseconds or shorter in duration. I have a todo to improve this in https://bugzilla.mozilla.org/show_bug.cgi?id=1503959.
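
To illustrate why that adds up, here is a standalone sketch of the per-event timestamping pattern (plain std::chrono standing in for mozilla::TimeStamp; this is not Gecko code): two clock reads bracket every traced event, so for events in the ~50 microsecond range the clock reads themselves become a noticeable fraction of what is being measured.

// Standalone illustration, not Gecko code: every traced event records a
// timestamp on entry and on exit, as the tracelogger does. With events only
// tens of microseconds long, the two clock reads per event add real overhead.
#include <chrono>
#include <cstdio>
#include <vector>

struct Event {
  std::chrono::steady_clock::time_point start;
  std::chrono::steady_clock::time_point stop;
};

int main() {
  constexpr int kEvents = 100000;
  std::vector<Event> log(kEvents);

  auto begin = std::chrono::steady_clock::now();
  for (int i = 0; i < kEvents; ++i) {
    log[i].start = std::chrono::steady_clock::now();  // entry timestamp
    // ... the traced work would run here (often ~50 microseconds or less) ...
    log[i].stop = std::chrono::steady_clock::now();   // exit timestamp
  }
  auto end = std::chrono::steady_clock::now();

  auto total =
      std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
  std::printf("%d empty events cost %lld us of pure timestamping (%.3f us each)\n",
              kEvents, static_cast<long long>(total),
              static_cast<double>(total) / kEvents);
}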

The profile in comment 2 hints that we process several keypress events without painting in between.
Bug 1506376 hopefully helps with that quite a bit (that bug is waiting for a broken test, bug 1489259, to get fixed). We basically don't paint if we're still waiting for the notification from the compositor that it actually processed the previous paint, and currently that notification goes to the normal priority queue.
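
To make that concrete, here is a rough standalone sketch of the throttling behaviour described above (names such as mWaitingForCompositorAck are hypothetical; this is not the actual refresh-driver or event-queue code): while the previous paint has not been acknowledged by the compositor we skip painting, and if that acknowledgement sits in a normal-priority queue behind pending keypresses, several keypresses are processed with no paint in between.

// Hypothetical sketch of the paint-throttling logic, not real Gecko code.
#include <functional>
#include <queue>

struct RefreshDriver {
  bool mWaitingForCompositorAck = false;  // set while a paint awaits confirmation

  void Tick() {
    if (mWaitingForCompositorAck) {
      return;  // throttle: don't paint until the previous paint is confirmed
    }
    Paint();
  }

  void Paint() { mWaitingForCompositorAck = true; /* hand work to the compositor */ }

  // Runs when the "did composite" notification is finally dequeued. If that
  // notification is queued at normal priority, every already-queued keypress
  // is handled before it.
  void OnDidComposite() { mWaitingForCompositorAck = false; }
};

int main() {
  RefreshDriver rd;
  std::queue<std::function<void()>> normalPriorityQueue;

  rd.Tick();  // first paint goes out; now waiting for the compositor ack
  for (int i = 0; i < 5; ++i) {
    // Keypress events queued ahead of the ack: when they run, Tick() paints nothing.
    normalPriorityQueue.push([&] { /* handle keypress */ rd.Tick(); });
  }
  normalPriorityQueue.push([&] { rd.OnDidComposite(); });  // ack arrives last

  while (!normalPriorityQueue.empty()) {
    normalPriorityQueue.front()();
    normalPriorityQueue.pop();
  }
  rd.Tick();  // only now can the next paint happen
  return 0;
}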

(In reply to Denis Palmeiro [:denispal] from comment #7)

> (In reply to Will Hawkins from comment #6)

> > As I said, I am very interested in working with you on getting to the bottom of this, especially if Bas continues to think that there is room for improvement. I was able to get a profile on x86 on Ubuntu with a build from moz-central that contains jstrace results. Despite not knowing exactly what I am looking at, it seems that most of the time is spent in baseline but eventually some of the code ends up executing in Ion as we move further along.

> One thing we can try is profiling this with Edge on ARM64 to see if there are any major gaps we can observe. If we can isolate the bad performance to a few JS functions, that could be quite useful.

I'd love to do that, but do not have ready access to any ARM hardware. Please let me know if there are community devices in the office and I'd be happy to help!

> > For https://perfht.ml/2WSIekM, I turned the timing granularity higher than normal, which is why (I think) there are so many calls to get the clock time (etc.) and the overall lagginess. Again, running this test without the profiler shows no input latency at all.

> > HTH.

> The timing calls are mostly coming from the trace logger. We take a timestamp using mozilla::TimeStamp (which isn't ideal for this) at the beginning and end of every event, and many of these events are only 50 microseconds or shorter in duration. I have a todo to improve this in https://bugzilla.mozilla.org/show_bug.cgi?id=1503959.

That's exactly what I thought -- thanks for confirming! I'm starting to learn something :-)

(In reply to Denis Palmeiro [:denispal] from comment #5)

> FWIW, I am able to reproduce what Andrew can see on ARM. I also was able to capture this with the tracelogger on ARM64 and most of our time is also spent in Baseline, which is good. However, I want to confirm again that the Ion events on x86 are not the main reason it's faster there. I also discussed the profile at the time with Bas, and he believes it's part of our implementation of sparse array dictionary objects which could be improved. I also tested this on Edge and they blow us out of the water when it comes to gdocs latency on ARM64. There's definitely room for improvement.

The other related bugs contain comments about the implementation of "sparse array dictionary objects" as well, and I was wondering if/how those experts were able to tell that this was a potential cause of the problem?

Flags: needinfo?(dpalmeiro)
Flags: needinfo?(bas)
Flags: needinfo?(dpalmeiro)

I was testing this today, and typing into Google Docs didn't feel too bad once the whole document had loaded. However, while the Google doc (not the UI, just the doc) is still loading, the jank is pretty bad.
But I'd say the build from https://treeherder.mozilla.org/#/jobs?repo=try&revision=fa6322ab9316cfa44fdc11d428db54ed7241751c makes it feel better.
Does anyone want to do a quick test?
https://treeherder.mozilla.org/logviewer.html#?job_id=231109132&repo=try
Scroll the top-left area down and look for target.zip; that should work on Windows/ARM64.

(In reply to Olli Pettay [:smaug] from comment #11)

> Does anyone want to do a quick test?

It's still pretty awful :( I wonder if I'm just not waiting long enough for the doc to fully load ...

I'm happy to do some sort of videoconferencing demo or record my typing if you want to witness it, Olli.

With both builds I had a similar experience to comment 11: janky typing while the page is loading, but a good experience once loaded.
Having experienced severe input latency on lower-end hardware (a Mac Mini with a platter hard disk) and huge Google Docs, I made a copy of this document and copy/pasted it until it reached 320 pages: https://docs.google.com/document/d/18uckjFF6gNIQVVPJUO9fPnb2ebqAouQ69mqYjl5rdAw/edit#

On the ARM64 laptop, typing is surprisingly responsive.

Oops, I was reproducing bug 1488435, not this one. With the build from comment 11, things seem pretty good.

I tested with the Edge version provided on the Lenovo C630, and the jank is even worse than in Firefox: no characters appear for tens of seconds, and then they are streamed into the document.

I made a profile at a normal typing rate, after the document load ended, which highlights that 12% of the non-idle time is spent in IonBuilder (bug 1490798), which is already blocking bug 1488435.

Testing on Linux x64, I did notice the same document-load typing latency, but not for as long as on ARM64. I do not think this issue is specific to ARM64; it's just more noticeable there because documents take longer to load.

I'm going to close this as a dup of 1488435.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Flags: needinfo?(bas)
Performance Impact: --- → P1
Whiteboard: [qf:p1:responsiveness]