Closed Bug 1730852 Opened 3 years ago Closed 3 years ago

Service Worker slows down resource load times (may involve tracking protection)

Categories

(Core :: Networking, defect, P2)

Firefox 92
defect

Tracking

VERIFIED FIXED
98 Branch
Performance Impact high
Tracking Status
firefox98 --- verified

People

(Reporter: petras.vilkelis, Assigned: manuel, Mentored)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf:pageload, Whiteboard: [necko-triaged])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36

Steps to reproduce:

We found a performance issue for resources coming from specific domains when the service worker is present and has a fetch handler (even if it's just an empty function). To reproduce this you can open this page (make sure to disable Cache in Network tab and refresh once SW is installed):

https://ultimate-fairy-77.app.baqend.com

You will see printed resource timings of two scripts. One is affected by the bug (dev.visualwebsiteoptimizer), the other one (cdn.jsdelivr) is not.

By default, the SW is enabled. To disable the Worker, add ?sw=false to the URL and refresh twice to see the effect. The two scripts load at roughly the same speed when the SW is not present.

We found several scripts that have this issue. All of them seem to be detected as trackers in the Network tab, but disabling Tracking Protection did not help.

We also tested this on Nightly and Beta with no success.

Actual results:

The affected request often takes over 1 second with SW, and under 100ms without.

Expected results:

We would expect the request not to be affected by the Worker (~100ms).

The Bugbug bot thinks this bug should belong to the 'Core::DOM: Service Workers' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → DOM: Service Workers
Product: Firefox → Core
Whiteboard: [qf]

The severity field is not set for this bug.
:asuth, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bugmail)
Severity: -- → S3
Depends on: 1736264
Flags: needinfo?(bugmail)
Priority: -- → P3
Summary: Service Worker slows down resource load times → Service Worker slows down resource load times (may involve tracking protection)

Hi Petras,
Thank you for filing this and providing the clear test case.

I've reproduced your findings and captured a profile, but I can't see what the root cause of the delay is.
https://share.firefox.dev/3nwA8g0

I can also see that the slower resource is flagged as a tracker.
So let me find someone more familiar with the area.

Tim, would you be able to find someone with tracking-protection understanding to help discern if that is the root cause of this performance issue?

Flags: needinfo?(tihuang)

Setting qf:p1:pageload because a 10x slowdown from a no-op service worker sounds quite bad.

Status: UNCONFIRMED → NEW
Ever confirmed: true
Whiteboard: [qf] → [qf:p1:pageload]

I think the performance issue lives in the SW intercept code in Necko. I tried adding the tracking resource to the tracking annotation skip list, and the loading time became normal. I also tried disabling ETP, but that doesn't really help. So the issue has nothing to do with content blocking, but is more likely in how the SW intercepts tracking resources.

As seen here, we mark the channel with nsIClassOfService:Tail when the channel is a third-party tracking resource. IIUC, this delays the loading of the tracking resource until other loads have finished. That may explain why the loading is so slow.

Dragana, I believe this is intended behavior. What do you think?
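The tailing behavior described in this comment can be sketched as a toy model. This is only an illustration of the "tracking resources wait until other loads are finished" idea, not Necko's scheduler: the Request struct and the DispatchOrder function are hypothetical names, and real tailing involves more than a simple reordering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pending load; not a Necko type.
struct Request {
  std::string url;
  bool tail;  // models a channel marked with the Tail class-of-service flag
};

// Returns the order in which the toy scheduler would start the loads:
// every non-tailed request first, then the tailed ones, each group in
// its original relative order.
std::vector<std::string> DispatchOrder(const std::vector<Request>& pending) {
  std::vector<std::string> order;
  order.reserve(pending.size());
  for (const Request& r : pending) {
    if (!r.tail) order.push_back(r.url);
  }
  for (const Request& r : pending) {
    if (r.tail) order.push_back(r.url);
  }
  assert(order.size() == pending.size());
  return order;
}
```

In this model a tracker script requested first still starts last, which is the kind of delay the Tail flag is meant to produce.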

Flags: needinfo?(tihuang) → needinfo?(dd.mozilla)

nsIClassOfService:Tail was made exactly for third-party tracking resources.

We may revisit this decision if needed, but I need to understand more here.

Flags: needinfo?(dd.mozilla)

moving to Core::Networking (see comment 7)

Component: DOM: Service Workers → Networking
Attached file log.txt-main.61182.moz_log (deleted)

I've tried to reproduce this locally and the log is uploaded.
The log does show that the tracking resource was delayed because of tailing, but it's unclear why this doesn't happen when no service worker is used.

Blocks: necko-perf
Whiteboard: [qf:p1:pageload] → [qf:p1:pageload][necko-triaged]

I'll have a look.

Assignee: nobody → kershaw
Priority: P3 → P2

Hi Manuel,
I think I know what the problem is here, and if you want, I can guide you through fixing this bug.
Feel free to say no if you are not interested.
Thanks.

Flags: needinfo?(mbucher)

Thanks, I'm interested to look at this with your guidance.

Assignee: kershaw → mbucher
Mentor: kershaw
Status: NEW → ASSIGNED
Flags: needinfo?(mbucher)

nsCOMPtr<nsIClassOfService> cos(do_QueryInterface(newChannel)); was
previously returning a nullptr for objects of the class InterceptedHttpChannel.
Therefore the classOfService flags weren't set after the redirect to the
InterceptedChannel.
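
The failure mode in this commit message can be reproduced in a small standalone model. This is a sketch under stated assumptions, not the actual patch: dynamic_cast stands in for do_QueryInterface, all type and function names below are hypothetical simplifications rather than the real XPCOM interfaces, and the "After" class only assumes that the fixed channel now answers the interface query.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; not the real XPCOM interfaces.
struct Channel {
  virtual ~Channel() = default;
};

struct ClassOfService {
  virtual ~ClassOfService() = default;
  virtual void SetClassFlags(std::uint32_t flags) = 0;
  virtual std::uint32_t GetClassFlags() const = 0;
};

constexpr std::uint32_t kTail = 1u << 0;  // illustrative value only

// Models the channel before the fix: it does not expose ClassOfService,
// so the interface query below yields nullptr.
struct InterceptedChannelBefore : Channel {};

// Assumed shape after the fix: the channel answers the interface query.
struct InterceptedChannelAfter : Channel, ClassOfService {
  std::uint32_t flags = 0;
  void SetClassFlags(std::uint32_t f) override { flags = f; }
  std::uint32_t GetClassFlags() const override { return flags; }
};

// Copies the old channel's flags to the redirect target.
// Returns false when the target does not expose ClassOfService,
// in which case the flags are silently dropped.
bool PropagateClassFlags(std::uint32_t oldFlags, Channel* newChannel) {
  assert(newChannel != nullptr);
  // dynamic_cast plays the role of do_QueryInterface in this model.
  auto* cos = dynamic_cast<ClassOfService*>(newChannel);
  if (!cos) {
    return false;
  }
  cos->SetClassFlags(oldFlags);
  return true;
}
```

Calling PropagateClassFlags with an InterceptedChannelBefore returns false and leaves no flags behind, while the same call with an InterceptedChannelAfter stores them on the new channel.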

Pushed by mbucher@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3ed55e55a478 Fix classOfService information being lost when redirecting to InterceptedHttpChannel r=necko-reviewers,valentin
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch
Flags: qe-verify+
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [qf:p1:pageload][necko-triaged] → [necko-triaged]

Reproduced the issue on Firefox 94.0a1 (2021-09-15) under macOS 11.6.4 by following the STR from Comment 0.

The issue is fixed on Firefox 98.0. Tests were performed on macOS 11.6.4, Ubuntu 20.04 and Windows 11.

Status: RESOLVED → VERIFIED
Flags: qe-verify+