Closed Bug 1770089 Opened 3 years ago Closed 2 years ago

High CPU usage and memory allocation in mozglue!BaseAllocator and xul!js::ErrorObject::getStack

Categories

(Core :: JavaScript Engine, defect)

Firefox 100
defect

Tracking

()

RESOLVED INCOMPLETE
Performance Impact ?

People

(Reporter: thisismyidashish, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

Attached image cpu_high_usage.PNG (deleted) —

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0

Steps to reproduce:

OS: Windows 10
I had few tabs open. Then suddently, CPU usage of one of firefox.exe went high to 2 core 100% usage and memory became ~2 GB for this firefox.exe which is ~10x higher than other firefox.exe processes.

Actual results:

Memory usage for one of firefox.exe went ~2GB and CPU consumption was 100% for 2 cores. This happened continuously for ~10 mins until i crashed the process. I took a dump before crashing.

I also have Perfview profiler cpu profile but somehow bugzilla is not able to upload 32 MB file. I have attached snippet view which shows offending stack. Please ping me for perfview profile.

Expected results:

I have taken PerfView profiler trace and it shows high CPU usage because of critical section entries from inside mem allocator. This explains both symptoms.

This looks like a bug. Thread 3900 seems to go into a infinite loop allocating and freeing memory.

2 such high cpu and memory consumption stacks are shown below.

I have a dump of process but i am not sure if it contains any of my secrets/personal information.

Name
OTHER <<vcruntime140!memset_repmovs>>

  • mozglue!BaseAllocator::realloc
    |+ mozglue!moz_arena_realloc
    | + xul!js::JSStringBuilder::finishString
    | |+ xul!JS::BuildStackString
    | | + xul!js::ErrorObject::getStack_impl
    | | + xul!js::ErrorObject::getStack
    | | + OTHER <<?!?>>
    | | + BROKEN
    | | + Thread (3900) CPU=8016ms (MainThread)

Name
OTHER <<ntdll!RtlEnterCriticalSection>>

  • mozglue!BaseAllocator::realloc
    |+ mozglue!moz_arena_realloc
    | + xul!mozilla::Vector::growStorageBy
    | |+ xul!js::StringBuffer::append
    | ||+ xul!JS::BuildStackString
    | || + xul!js::ErrorObject::getStack_impl
    | || + xul!js::ErrorObject::getStack
    | || + OTHER <<?!?>>
    | || + BROKEN
    | || + Thread (3900) CPU=8016ms (MainThread)

The Bugbug bot thinks this bug should belong to the 'Core::JavaScript Engine' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → JavaScript Engine
Product: Firefox → Core
Attached file windbg_lm.txt (deleted) —
> Why is thread 3900 call stack look weird just before `::getStack` ? Does that mean it is some coming form some extension instead of firefox's code ? 00007ffea4fd0734 ntdll!NtWaitForAlertByThreadId+0x14 00007ffea4f9379d ntdll!RtlpWaitOnAddressWithTimeout+0x81 00007ffea4f93652 ntdll!RtlpWaitOnAddress+0xae 00007ffea4f9346d ntdll!RtlpWaitOnCriticalSection+0xfd 00007ffea4f5fcb4 ntdll!RtlpEnterCriticalSectionContended+0x1c4 00007ffea4f5fae2 ntdll!RtlEnterCriticalSection+0x42 (Inline) mozglue!Mutex::Lock+0x10 (/builds/worker/checkouts/gecko/memory/build/Mutex.h @ 61) (Inline) mozglue!AutoLock<Mutex>::AutoLock+0x10 (/builds/worker/checkouts/gecko/memory/build/Mutex.h @ 115) (Inline) mozglue!arena_dalloc+0x1f (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3513) (Inline) mozglue!idalloc+0x2f (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3535) (Inline) mozglue!arena_t::RallocSmallOrLarge+0x1af (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3632) (Inline) mozglue!arena_t::Ralloc+0x1bc (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3641) 00007ffe404053ce mozglue!BaseAllocator::realloc+0x27e (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 4268) (Inline) mozglue!Allocator<MozJemallocBase>::moz_arena_realloc+0x7a (/builds/worker/checkouts/gecko/memory/build/malloc_decls.h @ 53) 00007ffe40405108 mozglue!moz_arena_realloc+0x98 (/builds/worker/checkouts/gecko/memory/build/malloc_decls.h @ 138) (Inline) xul!js_arena_realloc+0xf (/builds/worker/workspace/obj-build/dist/include/js/Utility.h @ 402) ... (Inline) xul!JS::CallNonGenericMethod+0x15 (/builds/worker/workspace/obj-build/dist/include/js/CallNonGenericMethod.h @ 103) 0000012e88e69e33 xul!js::ErrorObject::getStack+0x63 (/builds/worker/checkouts/gecko/js/src/vm/ErrorObject.cpp @ 688) 000002289192e4c5 0x2289192e4c5 0000000bcf02fb20 0xbcf02fb20 0000012e9381d118 0x12e9381d118 0000000bcf02f968 0xbcf02f968 000004ab897e4358 0x4ab897e4358 0000b9ad9441aa0b 0xb9ad9441aa0b 0000000bcf02f908 0xbcf02f908 00000228918ea40f 0x228918ea40f 0000000000000005 0x5 000002289192dbe8 0x2289192dbe8 000000000000b020 0xb020 00000e906c99cf88 0xe906c99cf88
Attached file windbg_lm (deleted) —
> Why is thread 3900 call stack look weird just before `::getStack` ? Does that mean it is some coming form some extension instead of firefox's code ? 00007ffea4fd0734 ntdll!NtWaitForAlertByThreadId+0x14 00007ffea4f9379d ntdll!RtlpWaitOnAddressWithTimeout+0x81 00007ffea4f93652 ntdll!RtlpWaitOnAddress+0xae 00007ffea4f9346d ntdll!RtlpWaitOnCriticalSection+0xfd 00007ffea4f5fcb4 ntdll!RtlpEnterCriticalSectionContended+0x1c4 00007ffea4f5fae2 ntdll!RtlEnterCriticalSection+0x42 (Inline) mozglue!Mutex::Lock+0x10 (/builds/worker/checkouts/gecko/memory/build/Mutex.h @ 61) (Inline) mozglue!AutoLock<Mutex>::AutoLock+0x10 (/builds/worker/checkouts/gecko/memory/build/Mutex.h @ 115) (Inline) mozglue!arena_dalloc+0x1f (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3513) (Inline) mozglue!idalloc+0x2f (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3535) (Inline) mozglue!arena_t::RallocSmallOrLarge+0x1af (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3632) (Inline) mozglue!arena_t::Ralloc+0x1bc (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 3641) 00007ffe404053ce mozglue!BaseAllocator::realloc+0x27e (/builds/worker/checkouts/gecko/memory/build/mozjemalloc.cpp @ 4268) (Inline) mozglue!Allocator<MozJemallocBase>::moz_arena_realloc+0x7a (/builds/worker/checkouts/gecko/memory/build/malloc_decls.h @ 53) 00007ffe40405108 mozglue!moz_arena_realloc+0x98 (/builds/worker/checkouts/gecko/memory/build/malloc_decls.h @ 138) (Inline) xul!js_arena_realloc+0xf (/builds/worker/workspace/obj-build/dist/include/js/Utility.h @ 402) ... (Inline) xul!JS::CallNonGenericMethod+0x15 (/builds/worker/workspace/obj-build/dist/include/js/CallNonGenericMethod.h @ 103) 0000012e88e69e33 xul!js::ErrorObject::getStack+0x63 (/builds/worker/checkouts/gecko/js/src/vm/ErrorObject.cpp @ 688) 000002289192e4c5 0x2289192e4c5 0000000bcf02fb20 0xbcf02fb20 0000012e9381d118 0x12e9381d118 0000000bcf02f968 0xbcf02f968 000004ab897e4358 0x4ab897e4358 0000b9ad9441aa0b 0xb9ad9441aa0b 0000000bcf02f908 0xbcf02f908 00000228918ea40f 0x228918ea40f 0000000000000005 0x5 000002289192dbe8 0x2289192dbe8 000000000000b020 0xb020 00000e906c99cf88 0xe906c99cf88
Performance Impact: --- → ?

The profile suggests that many calls to ErrorObject::getStack are made in a repeated manner.
So, some code is likely going in a loop making allocations, and re-attempting to make more allocations as it fails to complete some initialization.

To find more about this issue, I recommend:

  • To check about:performance to know which website/process is causing this issue.
  • To reproduce this issue and to use the Firefox profiler to resolve the OTHER and BROKEN symbols, as Windows profiler does not know how to iterate over JIT stacks.
Flags: needinfo?(thisismyidashish)

(In reply to Nicolas B. Pierron [:nbp] {pto until 30/05} from comment #5)

The profile suggests that many calls to ErrorObject::getStack are made in a repeated manner.
So, some code is likely going in a loop making allocations, and re-attempting to make more allocations as it fails to complete some initialization.

To find more about this issue, I recommend:

  • To check about:performance to know which website/process is causing this issue.
  • To reproduce this issue and to use the Firefox profiler to resolve the OTHER and BROKEN symbols, as Windows profiler does not know how to iterate over JIT stacks.

I can't seem to reproduce the issue as i don't know which website was causing issue. I restarted firefox.
I will take a look at profiler next time it happens.

I have dump of process and i can run specific commands that can provide that info though.
How do you resolve the symbols in a dump ?

Flags: needinfo?(thisismyidashish)

(In reply to AshishNegi from comment #6)

I have dump of process and i can run specific commands that can provide that info though.
How do you resolve the symbols in a dump ?

I'm not sure but you can try the instructions on this page: https://github.com/rust-minidump/rust-minidump/tree/main/minidump-stackwalk#analyzing-firefox-minidumps
But even the resolved symbols are unlikely to tell us which page or which JS function was responsible, because they'll only resolve C++ functions.
We have bug 1530552 for making Perfview able to walk JS / JIT stacks.

(In reply to AshishNegi from comment #6)

I can't seem to reproduce the issue as i don't know which website was causing issue. I restarted firefox.

Thanks for trying!
I will mark this bug as incomplete in the mean time. Feel free to re-open this bug if you have more information.

Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: