Closed Bug 1823634 Opened 2 years ago Closed 2 years ago

Reduce the CPU usage of antivirus software on Windows when Firefox is running

Categories

(External Software Affecting Firefox :: Other, enhancement, P3)

All
Windows
enhancement

Tracking

(firefox114 affected)

RESOLVED FIXED
114 Branch
Tracking Status
firefox114 --- affected

People

(Reporter: yannis, Assigned: yannis)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [win:stability])

Attachments

(4 files)

Bug 1441918 comment 90 has highlighted that Firefox currently generates many more events (potentially 7x or more) on the Microsoft-Windows-Threat-Intelligence ETW provider than competitors do. Antivirus software products, including but not limited to Windows Defender, listen to this ETW provider (and others) to monitor system activity and detect malicious behavior.

Generating events on ETW providers that AV products listen to indirectly increases power usage, because antivirus code runs to analyze each event. It can lead to catastrophic situations if the antivirus itself is not optimized for power usage; see bug 1441918 comment 87 for an example where Windows Defender was using a classic Windows pattern which unfortunately led to a lot of wasted CPU time. Firefox was more affected by this performance issue than competitors, because of the number of events we generate.

The attached text file lists all the events that Firefox Nightly Debug currently generates on Microsoft-Windows-Threat-Intelligence during a casual browsing session on my machine (browsing and watching a video on YouTube, then browsing Reddit, Wikipedia, and some news websites, all in one session), grouped by call stack with event count and percentage of total. Using a Debug build allowed me to get more detailed stacks.

To summarize, in my casual browsing session:

  • 95.5% of the events originate from calls to VirtualProtect(Ex);
  • 1.8% from calls to WriteProcessMemory;
  • 1.6% from calls to VirtualAlloc(Ex);
  • 1.0% from calls to Resume/CreateThread;
  • 0.1% from calls to MapViewOfFile.

Calls to js::jit::ReprotectRegion account for 92.9% of all events, originating in particular from js::jit::BaselineCacheIRCompiler::compile (39.8%), js::jit::BaselineCompiler::compile (27.3%), js::jit::IonCacheIRCompiler::compile (13.0%), js::jit::CodeGenerator::link (4.7%), v8::internal::SMRegExpMacroAssembler::GetCode (3.6%), js::jit::ExecutableAllocator::poisonCode (3.4%).
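As a quick sanity check on the breakdown above, here is a minimal Python sketch that only re-adds the percentages quoted in this comment. It assumes (this is not stated explicitly above) that the js::jit::ReprotectRegion events all fall in the VirtualProtect(Ex) bucket; the variable names are illustrative.

```python
# Shares of all events, copied from the summary in this comment.
api_share = {
    "VirtualProtect(Ex)": 95.5,
    "WriteProcessMemory": 1.8,
    "VirtualAlloc(Ex)": 1.6,
    "Resume/CreateThread": 1.0,
    "MapViewOfFile": 0.1,
}

# Share of all events attributed to js::jit::ReprotectRegion, and its listed callers.
reprotect_total = 92.9
reprotect_callers = {
    "js::jit::BaselineCacheIRCompiler::compile": 39.8,
    "js::jit::BaselineCompiler::compile": 27.3,
    "js::jit::IonCacheIRCompiler::compile": 13.0,
    "js::jit::CodeGenerator::link": 4.7,
    "v8::internal::SMRegExpMacroAssembler::GetCode": 3.6,
    "js::jit::ExecutableAllocator::poisonCode": 3.4,
}

# The per-API shares should cover all events.
total = round(sum(api_share.values()), 1)

# Callers named above, and what is left for callers that are not named.
listed = round(sum(reprotect_callers.values()), 1)
unlisted = round(reprotect_total - listed, 1)

# Assumption: ReprotectRegion events are a subset of the VirtualProtect(Ex) events.
share_of_virtualprotect = round(
    100 * reprotect_total / api_share["VirtualProtect(Ex)"], 1
)

print(total, listed, unlisted, share_of_virtualprotect)
```

Under that assumption, the JIT reprotection path accounts for nearly all VirtualProtect(Ex) events, which is why it is the natural first target.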

We should try to reduce the number of events that Firefox generates, which will reduce the CPU usage from AV software. Bug 1822650 should help address this for the JIT-related part.

We should also measure CPU usage from major AV products to see whether some of them have poorly optimized event analysis code, as this would currently impact our users more than users of competitor browsers, similar to bug 1441918 for Windows Defender.

Attached file stacks_firefox_nightly_113.0.a1.txt (deleted)
Depends on: 1822650
Severity: -- → S3
Priority: -- → P3
Assignee: nobody → yjuglaret

As explained in bug 1822650 comment 6, V8 uses a trick to achieve the effect of VirtualProtect through VirtualAlloc. This seems to result in lighter analysis from antivirus software. I suggest we push this to Nightly so that we can estimate the impact on official builds.
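For readers unfamiliar with the trick, here is a minimal sketch of the general idea, written in Python with ctypes for brevity. It is not the Firefox patch and not V8's code. It assumes that calling VirtualAlloc with MEM_COMMIT on pages that are already committed applies the new protection, which is the behavior the trick relies on. The helper name reprotect_with_virtualalloc is hypothetical, and the function does nothing on non-Windows platforms.

```python
import ctypes
import sys

# Win32 constants from the Windows SDK headers.
MEM_COMMIT = 0x1000
MEM_RESERVE = 0x2000
MEM_RELEASE = 0x8000
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20


class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    # PartitionId is omitted on purpose: ctypes alignment padding gives the
    # same layout on 64-bit, and the field is absent on 32-bit.
    _fields_ = [
        ("BaseAddress", ctypes.c_void_p),
        ("AllocationBase", ctypes.c_void_p),
        ("AllocationProtect", ctypes.c_uint32),
        ("RegionSize", ctypes.c_size_t),
        ("State", ctypes.c_uint32),
        ("Protect", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
    ]


def reprotect_with_virtualalloc(size=4096):
    """Hypothetical helper: change page protection without VirtualProtect.

    Returns the protection reported by VirtualQuery afterwards, or None when
    not running on Windows.
    """
    if sys.platform != "win32":
        return None  # Win32-only API

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.VirtualAlloc.restype = ctypes.c_void_p
    k32.VirtualAlloc.argtypes = [
        ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint32, ctypes.c_uint32,
    ]
    k32.VirtualQuery.restype = ctypes.c_size_t
    k32.VirtualQuery.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(MEMORY_BASIC_INFORMATION), ctypes.c_size_t,
    ]
    k32.VirtualFree.restype = ctypes.c_int
    k32.VirtualFree.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint32]

    # Reserve and commit a read-write region, as a code allocator might.
    base = k32.VirtualAlloc(None, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE)
    if not base:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # The trick: "re-commit" the already committed pages with the new
        # protection instead of calling VirtualProtect.
        if not k32.VirtualAlloc(base, size, MEM_COMMIT, PAGE_EXECUTE_READ):
            raise ctypes.WinError(ctypes.get_last_error())
        info = MEMORY_BASIC_INFORMATION()
        if not k32.VirtualQuery(base, ctypes.byref(info), ctypes.sizeof(info)):
            raise ctypes.WinError(ctypes.get_last_error())
        return info.Protect
    finally:
        k32.VirtualFree(base, 0, MEM_RELEASE)


print(reprotect_with_virtualalloc())
```

On Windows, the reported protection is expected to be PAGE_EXECUTE_READ if the assumption above holds. One difference from VirtualProtect worth noting is that this path does not return the previous protection.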

Pushed by yjuglaret@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9a2405915e12 Use VirtualAlloc to achieve VirtualProtect's purpose on the most impactful path. r=gstoll
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch

Keeping this open until confirmation of the impact.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

The following consecutive official Nightly builds can be used to estimate the impact of doing VirtualProtect through VirtualAlloc:

Attached image msmpeng-comparison-i9.png (deleted)

The results I previously observed with custom builds (in bug 1822650 comment 6) reproduce with the official Nightly builds from comment 7.

Attached is a comparison when running Speedometer 2.1 before patch (top) and after patch (bottom) on a good laptop machine with an i9-12900HK CPU, with lightweight recording instrumentation. There is no significant difference in Speedometer results (maybe very slightly better; certainly not worse), but MsMpEng.exe CPU usage drops by ~62%! I obtained these results with mpengine.dll version 1.1.20200.4, so these numbers show additional, cumulative improvements over what was previously described in bug 1441918 comment 91. On this machine, there should therefore currently be ~90% less CPU usage from MsMpEng.exe with Firefox Nightly compared to before the investigation in bug 1441918.

Attached image msmpeng-comparison-i5.png (deleted)

Attached is a comparison when running Speedometer 2.1 before patch (top) and after patch (bottom) on an older laptop machine with an i5-6200U CPU, with lightweight recording instrumentation. Averaged Speedometer results improved from 81.1 to 82.9 (~2.2% improvement). MsMpEng.exe CPU usage drops by ~47%. MsMpEng.exe is no longer the next top-consuming process after firefox.exe during a Speedometer run! I obtained these results with mpengine.dll version 1.1.20200.4, so these numbers show additional, cumulative improvements over what was previously described in bug 1441918 comment 91. On this machine, there should therefore currently be ~87% less CPU usage from MsMpEng.exe with Firefox Nightly compared to before the investigation in bug 1441918.
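For anyone checking the arithmetic in the two comments above: successive reductions multiply rather than add. The sketch below assumes the ~62% and ~47% drops apply on top of an earlier reduction and derives what that earlier reduction would need to be for the ~90% and ~87% totals to hold. The function names are illustrative, and the implied earlier figures are derived here, not quoted from bug 1441918.

```python
def combined_reduction(*reductions):
    """Total reduction after applying each fractional reduction in turn."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining


def implied_prior_reduction(total, latest):
    """Earlier reduction needed so that `latest` on top of it gives `total`."""
    return 1.0 - (1.0 - total) / (1.0 - latest)


def relative_change(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# i9-12900HK: ~62% drop from this patch, ~90% overall.
prior_i9 = implied_prior_reduction(0.90, 0.62)
# i5-6200U: ~47% drop from this patch, ~87% overall.
prior_i5 = implied_prior_reduction(0.87, 0.47)
# Speedometer 2.1 average on the i5 machine: 81.1 -> 82.9.
speedometer_gain = relative_change(81.1, 82.9)

print(round(prior_i9, 3), round(prior_i5, 3), round(100 * speedometer_gain, 1))
```

Both machines imply an earlier reduction of roughly three quarters, so the two overall figures are consistent with each other.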

Whiteboard: [win:stability]

Can this be closed now?

QA has started working on collecting CPU usage data for more AV products. I will update the status of this bug when we have more results from this study.

[:danibodea] and Oana Botisan have collected CPU usage data when running the Speedometer 2.1 test, using various AV products, with the two official Nightly builds listed in comment 7 (note that AV products may interact differently with Nightly builds compared to Release builds). I have analyzed the data in order to estimate the impact that AV products have on CPU usage, by splitting it across two different potential impacts:

  • out-of-process impact: this measures the additional CPU usage that results from computation happening in the AV products' own processes (e.g., MsMpEng.exe);
  • in-process impact: this measures the additional CPU usage that results from computation happening within firefox.exe processes, typically because of code injection from AV products.
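One way to picture this split is the sketch below. It is an assumption about how such a split could be computed, not the analysis actually used in the study: it compares per-process CPU time for the same workload with and without an AV product installed. The function name and the sample numbers are hypothetical and do not come from the collected data.

```python
def split_av_impact(baseline, with_av, browser="firefox.exe"):
    """Split extra CPU time into (in_process, out_of_process).

    `baseline` and `with_av` map a process name to the CPU seconds it used
    for the same workload, without and with the AV product installed.
    """
    # Extra CPU time spent inside the browser's own processes.
    in_process = with_av.get(browser, 0.0) - baseline.get(browser, 0.0)
    # Extra CPU time spent in every other process (e.g. the AV service).
    out_of_process = sum(
        cpu - baseline.get(name, 0.0)
        for name, cpu in with_av.items()
        if name != browser
    )
    return in_process, out_of_process


# Hypothetical numbers, for illustration only.
baseline = {"firefox.exe": 100.0}
with_av = {"firefox.exe": 104.0, "MsMpEng.exe": 12.0}

in_process, out_of_process = split_av_impact(baseline, with_av)
print(in_process, out_of_process)
```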

From that data, we can conclude that we can keep the patch for 114 Release, since the patch seems to:

  • significantly reduce CPU usage resulting from Microsoft Defender (by ~50%), allowing Defender to now score among the lowest CPU usage impacts when using Firefox;
  • reduce CPU usage resulting from ESET and Trend Micro (by ~17% for both);
  • have no significant impact on CPU usage from Avast, Norton, McAfee.

Another interesting insight from this study is that in-process impact can reach abnormally high levels too. This is concerning because, while an abnormally high out-of-process CPU usage impact would easily be attributed to an AV product (like in bug 1441918), a user who experiences abnormally high in-process CPU usage would likely attribute it to Firefox.

I am closing this bug, but will follow up with a replacement meta bug (which this bug should have been originally) tracking attempts to measure and improve the CPU usage impact of AV products on Firefox. We will likely start to monitor AV CPU usage on a regular basis as well.

Status: REOPENED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Blocks: 1835323