Closed Bug 1618354 Opened 5 years ago Closed 2 years ago

Headless firefox is very slow to start up/shut down when another firefox is open

Categories

(Firefox :: Headless, defect, P3)

73 Branch
defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: joel.einbinder, Unassigned)

References

(Depends on 1 open bug)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36

Steps to reproduce:

  1. Open normal firefox
  2. Run the command
time firefox -headless -screenshot="a.png" about:blank

Actual results:

firefox -headless -screenshot="a.png" about:blank 0.09s user 0.04s system 2% cpu 5.156 total

5 seconds!

Expected results:

If there was not another instance of firefox open:

firefox -headless -screenshot="a.png" about:blank 0.96s user 0.39s system 196% cpu 0.691 total

0.7 seconds!

Compare this to chrome:

google-chrome --headless --screenshot="a.png" about:blank 0.14s user 0.12s system 99% cpu 0.262 total

0.26 seconds!

Summary: Headless firefox is very slow to start up when another firefox is open → Headless firefox is very slow to start up/shut down when another firefox is open
Component: Untriaged → Headless

I tried with a new profile:

firefox -P asdf

and then in a new terminal, the same command as before

time firefox -headless -screenshot="a.png" about:blank

With the same results, 5 seconds

firefox -headless -screenshot="a.png" about:blank 0.87s user 0.35s system 196% cpu 0.622 total

I pasted the wrong time into the last comment, and am not sure how to edit it. It was indeed 5 seconds:

firefox -headless -screenshot="a.png" about:blank 0.08s user 0.04s system 2% cpu 5.157 total

Update: Repro'd on a mac as well.

I can definitely reproduce as described on Linux. The slow case also says:

Firefox is already running, but is not responding. To open a new window, you must first close the existing Firefox process, or restart your system.

and does not actually create the screenshot file. If I explicitly run the headless Firefox against a different profile like so:

% firefox &
% mkdir /tmp/temp-profile
% time firefox -headless -screenshot="a.png" -profile /tmp/temp-profile about:blank

then the execution is fast and the png is created. So this looks like us trying to acquire the profile lock on an already-in-use profile and failing.

I'm not sure what the expected behavior is for headless execution and profiles; were you expecting it to use a different profile automatically?

Flags: needinfo?(joel.einbinder)
Status: UNCONFIRMED → NEW
Ever confirmed: true

There's been some discussion around this in bug 1382015.

Interesting that this is only a slow startup compared to a complete hang when having multiple profiles listed in the profiles.ini (bug 1563725).

Joel, can you please check if the workaround Boris listed above makes it work for you?

Whiteboard: [qf]
Whiteboard: [qf]

Ok, I noticed the exact same behavior now while working on bug 1623128. There we want to get the Puppeteer tests running in our new Ubuntu 18.04 docker image. Interestingly with the former Ubuntu 16.04 based docker image we haven't seen this problem. Which means something changed for 18.04, which doesn't play that well with Firefox in headless mode.

Changing the jobs to run non-headless fixes the problem.

Maybe I can find some way to get a start-up profile for such a run in CI. Problem is that puppeteer kills the browser at the end. But maybe I can replicate it with a headless Mn job.

Ok, my problem might probably be unrelated to the one from Joel because in our CI jobs we have only a single instance of Firefox open. So I filed my observations as bug 1633101.

So Joel send two profiles via Riot. One for a cold (fresh) profile, and another one for a warm profile:

For the warm profile it takes 0.8s until the web content process shows the first activity, while for the cold one it takes 3.5s.

One thing I noticed is that we actually spend a full second in starting the Addons manager. And here specifically in processFileChanges.

Andrew or Mike could one of you please have a look at that? Note that the profile is from a release build, so not sure if something has already been changed. Thanks a lot!

Flags: needinfo?(mconley)
Flags: needinfo?(andrew.swan)

In general, the addon manager does a lot of work to try to make startup in an existing profile as fast as possible. But little attention has been paid to optimizing the first startup in a new profile. The meta bug for this is bug 1547139. In the cold profile from comment 10, a long time is spent initializing the default system addons in the new profile, the work already identified in bug 1547139 would help with that...

Flags: needinfo?(andrew.swan)

Andrew, I assume disabling these system add-ons would help here? What's the best way to disable all extensions beside those ones we ship with Firefox? Also interesting is that the profile doesn't show any extensions to be installed. So I wonder which those are.

Flags: needinfo?(mconley) → needinfo?(andrew.swan)

For Puppeteer we actually use:
https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/remote/test/puppeteer/lib/Launcher.js#622-623

      'extensions.autoDisableScopes': 0,
      'extensions.enabledScopes': 5,

Joel, do you have those set also for Playwright?

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #12)

Andrew, I assume disabling these system add-ons would help here? What's the best way to disable all extensions beside those ones we ship with Firefox? Also interesting is that the profile doesn't show any extensions to be installed. So I wonder which those are.

I think maybe there's some confusion about terminology? System add-ons are the ones were ship with Firefox. These days, they are doh-rollout, formautofill, screenshots, webcompat, and webcompat-reporter. Some of these can be disabled with prefs, but if the scenario here is creating a brand new profile, there's a bootstrapping problem -- there's no profile in which to customize prefs.

Flags: needinfo?(andrew.swan)

Thanks for the clarification. It was not clear. So it might be worth to check which of those can be disabled. We might also want to have that for the Remote agent. So I filed bug 1634147 to take care of it.

Note that we stick a pre-filled user.js / prefs.js file into the empty profile directory. So disabling them via a preference will work.

(In reply to joel.einbinder from comment #0)

time firefox -headless -screenshot="a.png" about:blank

So for the case as filed in comment 0 we will have to wait for bug 1547139.

Sorry for the late response on this.

I accidentally went down a bad path with this bug report. I noticed that firefox was slow to start up in my tests, and went to headless shell to confirmed. When it was also slow, I assumed they were the same issue. However the headless command is slow because its competing for the profile directory, which you all figured out, and I should have noticed from the helpful message:

Firefox is already running, but is not responding. To open a new window, you must first close the existing Firefox process, or restart your system.

That brings me back to my original issue, which is firefox was slow to start up/shut down in my tests. I can confirm this via

time firefox -profile ./fftmp -headless -screenshot about:blank : 1.5s
time google-chrome --user-data-dir=./crtmp --headless --screenshot about:blank: 0.14s

A 10x faster turnaround time for Chrome.

In Playwright/Puppeteer, we delete the firefox profile dir and start fresh for every time. I noticed that there was something called startupCache there, so I wondered if using an existing profile dir would help. It did!

time firefox -profile ./fftmp -headless -screenshot about:blank : 0.6s

Not quite as fast as Chrome, but still almost 3x faster.

The profile dir contains a lot of things we don't want to save, like cookies. I tried deleting each file individually in the profile dir, and timed startup to see how it was effected.

File Deleted Time in ms
prefs.js 731
places.sqlite 698
favicons.sqlite 640
storage 615
permissions.sqlite 601
compatibility.ini 569
startupCache 559
extensions.json 525
webappsstore.sqlite 498
storage.sqlite 497
minidumps 470
times.json 460
extension-preferences.json 456
datareporting 455
extensions 455
addonStartup.json.lz4 453
crashes 446
cookies.sqlite 424
.parentlock 422
addons.json 416
cache2 410
bookmarkbackups 407
sessionCheckpoints.json 398
shield-preference-experiments.json 397
saved-telemetry-pings 363

The big losses when deleting prefs.js was interesting. Deleting individual prefs revealed a slowdown caused by the extension blocklist. Disabling it (extensions.blocklist.enabled) wins back ~400ms.

That brings the turnaround time to 1.1 seconds. It looks like better management of the profile dir could yield a another ~800ms speedup. places.sqlite looks suspicious.

Flags: needinfo?(joel.einbinder)

(In reply to joel.einbinder from comment #17)

Thank you very much for the detailed investigation!

The profile dir contains a lot of things we don't want to save, like cookies. I tried deleting each file individually in the profile dir, and timed startup to see how it was effected.

File Deleted Time in ms
prefs.js 731
<snip table>

Sorry, I'm a little confused as to what the times represent. I'm assuming they're relative speed-ups / slowdowns - but it's not clear to me what the baseline of the comparison is, and if it's for an otherwise intact profile where you delete just that file but keep all the other ones, or if it's the inverse, ie an empty profile and the difference that having / not having that one file makes?

Can you elaborate a bit on your testing methodology? Like, is the 731ms the difference between running the pre-populated profile with/without prefs.js, or running an empty profile dir with/without prefs.js, or something else?

The big losses when deleting prefs.js was interesting. Deleting individual prefs revealed a slowdown caused by the extension blocklist. Disabling it (extensions.blocklist.enabled) wins back ~400ms.

What version of Firefox did you test with? I'd expect bug 1625181 to fix the blocklist slowdown on nightly - but if it didn't, that'd also be useful data.

I'm also curious what prefs you're setting from the puppeteer/playwright side (e.g. extensions.blocklist.enabled isn't normally set to false (~edit: fixed the value...)). Are you setting it during startup, but perhaps "after it's too late" and the blocklist cost has already been incurred? And does playwright set prefs in addition to the "default" puppeteer ones?

That brings the turnaround time to 1.1 seconds. It looks like better management of the profile dir could yield a another ~800ms speedup. places.sqlite looks suspicious.

This is also a bit confusing - are the times additive? I mean, I don't think so, because otherwise we'd start in negative time, but I'm not sure how else they relate...

Flags: needinfo?(joel.einbinder)

Joel, if you could find the time to reply to Gijs questions, we would appreciate. Thanks!

Depends on: 1692875
No longer depends on: 1547139

A needinfo is requested from the reporter, however, the reporter is inactive on Bugzilla. Closing the bug as incomplete.

For more information, please visit auto_nag documentation.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(joel.einbinder)
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.