Open Bug 1845775 Opened 1 year ago Updated 1 year ago

Google blog page loads very slowly and often fails to fully load

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

ASSIGNED
Performance Impact low

People

(Reporter: mccr8, Assigned: iain)

References

(Blocks 2 open bugs, )

Details

(Keywords: webcompat:needs-diagnosis)

Attachments

(2 files)

This web page doesn't fully load for me in desktop Firefox: https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html

I see the header and the footer, but nothing else. Reloading doesn't help.

In another profile, I had these errors after a few reloads:

TypeError: can't access property "GetAvailRect", screen is nullPanelMultiView.sys.mjs:1189:5
    _calculateMaxHeight resource:///modules/PanelMultiView.sys.mjs:1189
    handleEvent resource:///modules/PanelMultiView.sys.mjs:1259

Request to access cookie or storage on “https://www.youtube.com/subscribe_embed? etc"
was blocked because it came from a tracker and content blocking is enabled.
(twice)

But after reloading 3 or so times it eventually worked. In my regular profile, it still is not working.

In my main profile, I'm getting this error message:

Exception trying to format object for log message: SyntaxError: JSON.parse: unexpected end of data at line 1 column 1 of the JSON data(resource://services-sync/resource.sys.mjs:222:21) JS Stack trace: _processResponse/<@resource.sys.mjs:222:21
format@Log.sys.mjs:639:19
formatText@Log.sys.mjs:568:44
format@Log.sys.mjs:586:12
append@Log.sys.mjs:679:37
append@logmanager.sys.mjs:137:44
log@Log.sys.mjs:376:16
warn@Log.sys.mjs:387:10
_doQuickWrite@tabs.sys.mjs:308:17

and then:

Uncaught InternalError: too much recursion
    initContent https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html:3882
    jQuery 2
    initContent https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html:3875
    process https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html:3932
    <anonymous> https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html:3973
    jQuery 11

I'm seeing the JSON.parse message even without the page loaded, so it is probably unrelated.

We also take about twice as long to show content on this page as Chrome. Looking at a profile, approximately 100% of the 9 seconds it takes to show anything is being spent in is being spent in the JS engine's regexp code, so maybe something is going weird there? https://share.firefox.dev/45mtSfh

I'll move this over to the JS engine component then.

Iain, is there somebody who might be able to look at irregexp performance issues? Thanks.

Component: Desktop → JavaScript Engine
Flags: needinfo?(iireland)
Product: Web Compatibility → Core
Summary: https://security.googleblog.com/2023/07/the-ups-and-downs-of-0-days-year-in.html doesn't fully load → Google blog page loads very slowly and often fails to fully load
Performance Impact: --- → ?
Webcompat Priority: --- → ?

Profiling Chrome on the same site, it takes several seconds to load and spends 94% of the total time in the same regexp (/([\s\S]+?)<div data-is-preview.+?>([\s\S\]+)<\/div>/). It looks like it is stuck in a pathological backtracking case, although not for as long as we are.

I don't know why Chrome gets through the pathological regexp faster than we do. We use the same regexp engine as V8, so we should be doing basically the same work. We have our own codegen layer, so maybe there's an issue there.

Should we reach out to the blog author and let them know that their regexp is bad?

Flags: needinfo?(iireland)

I posted about it on Twitter in reply to this tweet.

They're looking into it.

Performance Impact: ? → low
Webcompat Priority: ? → ---

Somebody else filed a webcompat issue on this same site.

Attached file reduced-test-case.html (deleted) —

Locally, d8 takes about 3.4s and SM takes about 4 seconds when running a reduced testcase in the shell. In SM, roughly 98% of all samples are in this instruction sequence in jitted regexp code:

cmp rdx, -0x7
jge $+0x28
movzx ecx, byte [rax + rdx * 1 + 0x7]
mov rsi, 0x7fbc96c40c14
mov edi, 0x7f
and edi, ecx
movzx esi, byte [rsi + rdi * 1]
test esi, esi
jnz $+0x6
add rdx, 0x8
jmp $-0x32
cmp rdx, -0x1c
jge $+0x418
mov ecx, dword [rax + rdx * 1]
cmp ecx, 0x7669643c
jz $+0x6
add rdx, 0x1
jmp $-0x51

In d8, we get the following:

cmpl rdi,0xf9		
jge 0x34dc000840d4  <+0x94>
movzxbl rdx,[rsi+rdi*1+0x7]
REX.W movq rax,0x34dc082124
REX.W movq rbx,rdx		
REX.W andq rbx,0x7f	
cmpb [rax+rbx*1+0x7],0x0	
jnz 0x34dc000840d4  <+0x94>
REX.W addq rdi,0x8		
jmp 0x34dc000840a4  <+0x64>
cmpl rdi,0xe4		
jge 0x34dc00084501  <+0x4c1
movl rdx,[rsi+rdi*1]	
cmpl rdx,0x7669643c	
jz 0x34dc000840f2  <+0xb2>	
REX.W addq rdi,0x1		
jmp 0x34dc000840a4  <+0x64>

Setting aside differences in how things are formatted, the generated code is mostly identical except for this chunk:

          ***SM***                             ***V8***
movzx esi, byte [rsi + rdi * 1]		 cmpb [rax+rbx*1+0x7],0x0	
test esi, esi				 jnz 0x34dc000840d4  <+0x94>
jnz $+0x6				 	

We're doing a load and a test; V8 is doing a comparison directly in memory. The test instruction is the single hottest instruction in the benchmark, with more than 1/3 of all samples (31K/89K), so it seems plausible that fixing this would help. It's generated by this code here. The macroassembler doesn't currently support branch8 with a BaseIndex and an immediate operand, but I don't see any reason we can't add it.

I wrote a patch that generates code matching V8:

cmp byte [rsi + rdi * 1], 0x0
jnz $+0x6

Unfortunately, at least on my machine, it slows things down: d8 is still around 3.4s, but my modified version of SM has regressed from 4s to nearly 5s. In cases like this, where there's a tiny amount of hot code and small changes that look like they should be improvements regress performance, I start to suspect that we're getting unlucky with eg cache line alignment: maybe V8's regexp prologue is a slightly different length, code is aligned in slightly different places, and since we spend a huge amount of time in a tiny number of instructions, even a small hiccup in the hardware can lead to large performance differences. If that's what's going on here, my experience is that it's difficult/impossible to modify the compiler to avoid ever hitting this kind of problem, so you should just generate the code that should be better and assume that it will be a win overall.

I'm not sure if that's an argument for landing my patch or dropping it. I'll put it up as a WIP for now. If anybody can measure a real improvement from it, we can consider landing it.

Why do we throw InternalError: too much recursion and Chrome doesn't?

Flags: needinfo?(iireland)
Assignee: nobody → iireland
Severity: -- → S4
Status: NEW → ASSIGNED
Priority: -- → P1

Hmm, interesting question. "too much recursion" is basically our default error when things go wrong in a regexp. (We should maybe fix that at some point.) I was assuming it was this code, which limits the number of times we retry after being interrupted. I didn't actually verify that, though, and now that I think about it, it doesn't make much sense: the slow script dialog doesn't appear here.

Flags: needinfo?(iireland)

Actually, scratch that: I tested it out in the browser, and we are indeed triggering interrupts via the watchdog thread. I guess we count the number of times we're interrupted, not the number of times we show the dialog.

Long story short: the watchdog thread interrupted us four times, so we gave up and threw an error. This also explains why Chrome is significantly faster here: when interrupted, we stop the regexp, handle the interrupt, then restart regexp execution from the beginning, whereas V8 does scary code crimes to keep the regexp running despite the possibility of triggering a GC with unrooted values on the stack. (The words "manually relocate unhandlified references" are involved.)

anecdotal info: pages may load for folks with more RAM -- I'm on win64 arm64 with 16GBi ram and always fails, others with 64GBi ram get it rendered.

"Perf" key word?

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: