Closed
Bug 1401666
Opened 7 years ago
Closed 7 years ago
poor webgl performance
Categories
(Core :: Security: Process Sandboxing, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | unaffected |
firefox57 | --- | fixed |
firefox58 | --- | verified |
People
(Reporter: hotsauce2998, Assigned: jld)
References
Details
(Keywords: regression, Whiteboard: [gfx-noted][sb+])
Attachments
(6 files)
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
Build ID: 20170914224058
Steps to reproduce:
visit
http://alteredqualia.com/three/examples/webgl_pasta.html
or
http://madebyevan.com/webgl-water/
Actual results:
Poor performance, webgl is slow and choppy. Only happens in Firefox 57. Works great in Firefox 55
Expected results:
webgl rendering should be smooth
Thanks for the bug report!
Would you have time to give us a bit more information? I'm having trouble reproducing the issue on my hardware, and this is most likely very hardware specific (e.g., I see a difference between Intel and Nvidia graphics card on the same computer, but not between 55 and 57.)
1. The graphics section of your about:support would help us see what kind of hardware configuration you have.
2. What kind of frame rate are you getting in 57 vs. 55? Perhaps a video recording of the good/bad that we can look at? If you set layers.acceleration.draw-fps in about:config, you will be able to see the frame rate and the GPU execution time.
3. If you're showing a regression that can be evaluated quickly, it would really help us to find which particular change caused this problem. Running mozregression tool (http://mozilla.github.io/mozregression/quickstart.html) would give us that information - could you run it?
status-firefox56:
--- → affected
status-firefox57:
--- → affected
Priority: -- → P2
Whiteboard: [gfx-noted]
Hey Milan I've attached a screenshot of showing the frame rate of Firefox 55 and 57 side by side. Also attached the troubleshooting info from about:support.
Heres the bisect info from the mozregression tool:
17:44.81 INFO: Last good revision: 8e1e06adf80f82d3d5cf08eadaf569a107bd1ecf
17:44.81 INFO: First bad revision: b5fa08551d6e74d8300fa94f3161afdffd867764
17:44.81 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=8e1e06adf80f82d3d5cf08eadaf569a107bd1ecf&tochange=b5fa08551d6e74d8300fa94f3161afdffd867764
Hope that helps
(In reply to Milan Sreckovic [:milan] from comment #1)
> Thanks for the bug report!
> Would you have time to give us a bit more information? I'm having trouble
> reproducing the issue on my hardware, and this is most likely very hardware
> specific (e.g., I see a difference between Intel and Nvidia graphics card on
> the same computer, but not between 55 and 57.)
>
> 1. The graphics section of your about:support would help us see what kind of
> hardware configuration you have.
>
> 2. What kind of frame rate are you getting in 57 vs. 55? Perhaps a video
> recording of the good/bad that we can look at? If you set
> layers.acceleration.draw-fps in about:config, you will be able to see the
> frame rate and the GPU execution time.
>
> 3. If you're showing a regression that can be evaluated quickly, it would
> really help us to find which particular change caused this problem. Running
> mozregression tool (http://mozilla.github.io/mozregression/quickstart.html)
> would give us that information - could you run it?
Gian-Carlo, is there a way to run time "disable" some of the changes in bug 1308400 (e.g., by dialing down some sandbox setting) so that Colin can see if this improves the performance back? That's a 3x performance hit we seeing, so we want to figure out what we can do about it.
Jeff any more information we can get from Colin that would help here?
Blocks: 1308400
Flags: needinfo?(jmathies)
Flags: needinfo?(jgilbert)
Flags: needinfo?(gpascutto)
Priority: P2 → P1
Updated•7 years ago
|
OS: Unspecified → Linux
Comment 6•7 years ago
|
||
Run with MOZ_SANDBOX_LOGGING=1 and see what gets blocked. Might want to attach the log to this bug.
https://wiki.mozilla.org/Security/Sandbox#Customization_Settings can whitelist specific files/devices
security.sandbox.content.level can turns down the whole sandbox
Exactly what version of Ubuntu is this, and exactly what video hardware and drivers do you have on there? I only see a basic iGPU in the about:support log, but maybe we end up using that because we fail to reach the real GPU.
Flags: needinfo?(gpascutto)
Vendor ID: Intel Open Source Technology Center
Device ID: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2)
Driver Version: 3.0 Mesa 12.0.6
Comment 8•7 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #7)
> Vendor ID: Intel Open Source Technology Center
> Device ID: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2)
> Driver Version: 3.0 Mesa 12.0.6
I saw that, Milan, (as I already pointed out), but there's no guarantee he doesn't have *another* card that the sandbox is masking.
Assignee | ||
Comment 9•7 years ago
|
||
GPU identification seems to be done by the parent process, in which case it's not affected by sandboxing. Also, it uses a subprocess forked very early in startup (bug 639842), so even if I'm wrong and the probing is redone in content processes, it still shouldn't be affected by sandboxing. (If it's trying to fork in a sandboxed content process, then that would've already been broken in 55, so that's not it.)
Comment 10•7 years ago
|
||
Ok, then we're probably blocking something Mesa is expecting to be there. MOZ_SANDBOX_LOGGING=1 should reveal what.
Colin, does setting security.sandbox.content.level in about:config to 2 (or even 1), from the default 3, restarting and running again make any difference? What's the frame rate/exec time then?
If this makes a difference, then setting the environment variable from comment 10 and running again would be awesome. Actually, to be safe, probably would be useful either way.
If you have time for either or both of these, that's be great. If you only have time for one, MOZ_SANDBOX_LOGGING=1 first please :)
Comment 12•7 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #5)
> Gian-Carlo, is there a way to run time "disable" some of the changes in bug
> 1308400 (e.g., by dialing down some sandbox setting) so that Colin can see
> if this improves the performance back? That's a 3x performance hit we
> seeing, so we want to figure out what we can do about it.
>
> Jeff any more information we can get from Colin that would help here?
I don't have anything to add here. This is worrisome for graphics though, since we need to support multiple different drivers. Breaking one driver means we're at-risk for breaking others, and this isn't something one individual can test alone.
Flags: needinfo?(jgilbert)
Reporter | ||
Comment 13•7 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #11)
> Colin, does setting security.sandbox.content.level in about:config to 2 (or
> even 1), from the default 3, restarting and running again make any
> difference? What's the frame rate/exec time then?
>
> If this makes a difference, then setting the environment variable from
> comment 10 and running again would be awesome. Actually, to be safe,
> probably would be useful either way.
>
> If you have time for either or both of these, that's be great. If you only
> have time for one, MOZ_SANDBOX_LOGGING=1 first please :)
I'm getting about 10-11 FPS with security.sandbox.content.level set to 1 and 2,
about 1-2 FPS with security.sandbox.content.level = 3,
and about 20 FPS on firefox 55
I don't know how to set MOZ_SANDBOX_LOGGING=1. Is this the same thing as security.sandbox.logging.enabled in about:config?
Updated•7 years ago
|
Flags: needinfo?(jmathies)
It seems unclear how 56 is affected but we can keep this open and take a look next week. Right now, this isn't actionable for 56.
Comment 15•7 years ago
|
||
(In reply to Colin from comment #13)
> I'm getting about 10-11 FPS with security.sandbox.content.level set to 1 and
> 2,
> about 1-2 FPS with security.sandbox.content.level = 3,
> and about 20 FPS on firefox 55
>
> I don't know how to set MOZ_SANDBOX_LOGGING=1. Is this the same thing as
> security.sandbox.logging.enabled in about:config?
You start Firefox with that flag set, i.e. in a terminal you'd type
MOZ_SANDBOX_LOGGING=1 firefox
and you should get logging in the console window.
Comment 16•7 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #15)
> You start Firefox with that flag set, i.e. in a terminal you'd type
>
> MOZ_SANDBOX_LOGGING=1 firefox
>
> and you should get logging in the console window.
Firefox shouldn't be running when you do this, BTW. Shut it down first, then do the above.
Reporter | ||
Comment 17•7 years ago
|
||
Reporter | ||
Comment 18•7 years ago
|
||
Reporter | ||
Comment 19•7 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #16)
> (In reply to Gian-Carlo Pascutto [:gcp] from comment #15)
> > You start Firefox with that flag set, i.e. in a terminal you'd type
> >
> > MOZ_SANDBOX_LOGGING=1 firefox
> >
> > and you should get logging in the console window.
>
> Firefox shouldn't be running when you do this, BTW. Shut it down first, then
> do the above.
Ok thanks. I attached logs for security.sandbox.content.level=2 and security.sandbox.content.level=3.
This time I got around 20-22 FPS at security.sandbox.content.level=2, which is about what I get in Firefox 55.
Still getting 1-2 FPS when security.sandbox.content.level=3
Reporter | ||
Comment 20•7 years ago
|
||
When security.sandbox.content.level=3:
libGL error: unable to load driver: i915_bpo_dri.so
libGL error: driver pointer missing
libGL error: failed to load driver: i915_bpo
Can't load the driver without read access?
Assignee | ||
Comment 21•7 years ago
|
||
So these are actual ENOENTs — the broker allowed access, but the file doesn't exist:
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/x86_64-linux-gnu/dri/tls/i915_bpo_dri.so
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/x86_64-linux-gnu/dri/i915_bpo_dri.so
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/x86_64-linux-gnu/mesa/dri/tls/i915_bpo_dri.so
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/x86_64-linux-gnu/mesa/dri/i915_bpo_dri.so
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/dri/tls/i915_bpo_dri.so
Sandbox: Failed errno -2 op 0 flags 02000000 path /usr/lib/dri/i915_bpo_dri.so
Maybe i915_bpo_dri.so isn't the right driver and it normally loads something else?
The lines above that about sysfs, leading up to “MESA-LOADER: could not get parent device”, aren't quite what they look like — it should fall back other methods of detecting the GPU's PCI ID[*]. But I'm wondering if those might not all agree in this case.
So here's something to try — corresponding to, if I'm following the library code correctly, the PCI ID detection that's failing (libudev) and the next fallback (sysfs), which should work:
grep PCI_ID /sys/dev/char/226:0/../../uevent
cat /sys/dev/char/226:0/device/{vendor,device}
Those should give the same pair of hexadecimal numbers.
Another thing to try is to run with MOZ_PERMISSIVE_CONTENT_SANDBOX=1 MOZ_SANDBOX_LOGGING=1 which will allow all file acccesses and log if they would have been denied. Trying that with Ubuntu 14.04 in a VM it looks like it would normally be reading from file in /run/udev/data, but that seems to just have description strings meant for humans and probably wouldn't be overriding the PCI ID or anything like that.
[*] This is, basically, the code removed in https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f9317d42ee07f0f51179c8b3d5b22 — Mesa 13 completely changed how this works.
Assignee | ||
Comment 22•7 years ago
|
||
Here's another hypothesis: suppose this Mesa is built with HAVE_LIBUDEV and HAVE_LIBDRM but not HAVE_SYSFS, and drmGetVersion returns "i915_bpo" as the device name for some reason.
Without sandboxing, libudev obtains the PCI ID, which is looked up in pci_id_driver_map.h by loader_get_driver_for_fd to find the driver name "i915", and i915_dri.so is loaded.
With sandboxing, libudev fails because we're not allowing access to arbitrary things in /sys/devices, then the Mesa 12 version of drm_get_pci_id_for_fd also fails because "i915_bpo" isn't in the hardcoded list of names it understands (Mesa 13 fixes this), so loader_get_pci_id_for_fd fails completely and — this is the important part — loader_get_driver_for_fd falls back on using the name from libdrm as-is. And so it looks for i915_bpo_dri.so, which doesn't exist.
So, drmGetVersion in libdrm is DRM_IOCTL_VERSION which winds up at drm_version() in the kernel, and that reads the drm_driver::name field. And indeed, in Ubuntu's kernel source, there's this:
ubuntu/i915/i915_drv.c: .name = DRIVER_NAME,
ubuntu/i915/i915_drv.h:#define DRIVER_NAME "i915_bpo"
Also, the configure script in Mesa 12.0.6 says this:
--enable-sysfs enable /sys PCI identification [default=disabled]
I don't see Ubuntu's package setup (debian/rules and the debian/* directory in general) explicitly using --enable-sysfs, so that would explain why it would be disabled. (Which is a shame, because it ought to work perfectly with the sandbox policy as it stands.)
In conclusion: ignore comment #21; I think I know what's going on here now.
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → jld
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Assignee | ||
Updated•7 years ago
|
Component: Canvas: WebGL → Security: Process Sandboxing
Updated•7 years ago
|
Updated•7 years ago
|
Whiteboard: [gfx-noted] → [gfx-noted][sb+]
Comment hidden (mozreview-request) |
Comment 24•7 years ago
|
||
mozreview-review |
Comment on attachment 8914490 [details]
Bug 1401666 - Adjust sandbox policy to allow Mesa 12 to use libudev for device identification.
https://reviewboard.mozilla.org/r/185818/#review191038
::: security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp:131
(Diff revision 1)
> policy->AddDir(rdonly, "/nix/store");
> policy->AddDir(rdonly, "/run/host/fonts");
> policy->AddDir(rdonly, "/run/host/user-fonts");
>
> // Bug 1384178: Mesa driver loader
> policy->AddPrefix(rdonly, "/sys/dev/char/226:");
Is one of the reasons that this is failing that the driver is realpath-ing behind our back? (Thus, we can't resolve the symlink back to a subdir of this one?)
::: security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp:133
(Diff revision 1)
> policy->AddDir(rdonly, "/run/host/user-fonts");
>
> // Bug 1384178: Mesa driver loader
> policy->AddPrefix(rdonly, "/sys/dev/char/226:");
>
> + // Bug 1401666: Mesa driver loader part 2: Mesa <= 12 using libudev
There's enough logic here that this can perhaps go in a helper function. (I know I'm not without fault myself in this function!)
Attachment #8914490 -
Flags: review?(gpascutto) → review+
Comment hidden (mozreview-request) |
Assignee | ||
Comment 26•7 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #24)
> ::: security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp:131
> (Diff revision 1)
> > policy->AddDir(rdonly, "/nix/store");
> > policy->AddDir(rdonly, "/run/host/fonts");
> > policy->AddDir(rdonly, "/run/host/user-fonts");
> >
> > // Bug 1384178: Mesa driver loader
> > policy->AddPrefix(rdonly, "/sys/dev/char/226:");
>
> Is one of the reasons that this is failing that the driver is realpath-ing
> behind our back? (Thus, we can't resolve the symlink back to a subdir of
> this one?)
Yes, but it's not quite realpath(), and it's also looking at paths that aren't descendents of what it readlink()ed. From attachment 8912235 [details]:
Sandbox: Recording mapping /sys/devices/pci0000:00/0000:00:02.0/drm/card0 -> /sys/dev/char/226:0
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys/devices/pci0000:00/0000:00:02.0/drm/card0 for pid=6218
Sandbox: Failed errno -13 op 2 flags 0400000 path /sys/dev/char/../../devices/pci0000:00/0000:00:02.0/drm/card0
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices/pci0000:00/0000:00:02.0 for pid=6218
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:02.0
I've adjusted one of the comments in the patch to match reality a little better.
Comment 27•7 years ago
|
||
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e5c54805bcc1
Adjust sandbox policy to allow Mesa 12 to use libudev for device identification. r=gcp
Comment 28•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Reporter | ||
Comment 29•7 years ago
|
||
Can confirm this bug no longer occurs for me as of 58.0a1 (2017-10-04)
Comment 31•7 years ago
|
||
It appears that I suddenly cannot build moz-central on my Gentoo box after this change:
> 8:09.11 /media/ssd/mozilla/central/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp: In function ‘void mozilla::AddMesaSysfsPaths(mozilla::SandboxBroker::Policy*)’:
> 8:09.11 /media/ssd/mozilla/central/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp:67:53: error: ‘major’ was not declared in this scope
> 8:09.11 major(sb.st_rdev),
> 8:09.11 ^
> 8:09.12 /media/ssd/mozilla/central/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp:68:53: error: ‘minor’ was not declared in this scope
> 8:09.12 minor(sb.st_rdev),
> 8:09.12 ^
> 8:09.91 gmake[5]: *** [/media/ssd/mozilla/central/config/rules.mk:1077: SandboxBrokerPolicyFactory.o] Error 1
> 8:09.91 gmake[4]: *** [/media/ssd/mozilla/central/config/recurse.mk:73: security/sandbox/linux/broker/target] Error 2
Any hints?
Comment 32•7 years ago
|
||
Hmm, it seems as though adding "#include <sys/sysmacros.h>" to SandboxBrokerPolicyFactory.cpp does the trick for me.
Is that something we should add to this patch, jld?
Assignee | ||
Comment 33•7 years ago
|
||
Does #include <sys/types.h> work?
Flags: needinfo?(jld) → needinfo?(wisniewskit)
Comment 34•7 years ago
|
||
No, that was already being included by the file.
Flags: needinfo?(wisniewskit)
Assignee | ||
Comment 35•7 years ago
|
||
(In reply to Thomas Wisniewski from comment #34)
> No, that was already being included by the file.
So it is; I was looking at the wrong version.
In any case, it turns out that we've had this problem before: bug 1329798.
Comment 36•7 years ago
|
||
Looks like this bug was verified by the original reporter. Please request Beta approval on this when you get a chance.
Comment 37•7 years ago
|
||
If we uplift it, we will have to take bug 1406845 to fix a resource leak.
Assignee | ||
Comment 38•7 years ago
|
||
Comment on attachment 8914490 [details]
Bug 1401666 - Adjust sandbox policy to allow Mesa 12 to use libudev for device identification.
Approval Request Comment
[Feature/Bug causing the regression]: Bug 1308400 (Linux content sandbox read restrictions)
[User impact if declined]: Slow WebGL performance in some configurations
[Is this code covered by automated tests?]: Somewhat. The code is run, but the sandbox policy changes aren't relevant unless WebGL is used with a real GPU (including the VMware virtual GPU, but excluding pure software rendering).
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: Maybe. The fix has been verified on nightly by the reporter. I'll post the STR I used as a separate comment, because I need to attach a file.
[List of other uplifts needed for the feature/fix]: bug 1406845 (resource leak) and bug 1406233 (downstream build breakage)
[Is the change risky?]: Low risk
[Why is the change risky/not risky?]: It allows read access to files that otherwise wouldn't be allowed, and which were readable in 56.
[String changes made/needed]: None
Flags: needinfo?(jld)
Attachment #8914490 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 39•7 years ago
|
||
To reproduce this locally, because I don't have a Skylake I can spare for testing, I used a small preloaded shim library to replace the reported driver name with 'X's, to simulate a kernel driver name that doesn't exactly match the userland driver name.
Build this with something like `cc -O1 -Wall -fPIC -shared -o drmxxx.so drmxxx.c -I/usr/include/drm -ldl` and use it by setting LD_PRELOAD=/absolute/path/to/drmxxx.so in Firefox's environment.
As written this needs the libdrm-dev package (or equivalent for the distro; it's probably part of Firefox's build dependencies in any case). Changing the "libdrm.so" to "libdrm.so.2" in the source might also work (untested).
It also requires Mesa 12 or older, because newer versions aren't affected by this bug; Ubuntu 14.04 works, and CentOS 7 might also work (untested). It also requires a physical GPU (where VMware counts as "physical" but other VM hosts may not), and not using the Nvidia proprietary drivers (which replace Mesa).
Comment on attachment 8914490 [details]
Bug 1401666 - Adjust sandbox policy to allow Mesa 12 to use libudev for device identification.
Recent regression in webgl performance, beta57+
Attachment #8914490 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Hi Jed, I nominated the two dependent patches for uplift. In future, please nominate the dependencies in the relevant bugs yourself.
Flags: needinfo?(jld)
Comment 42•7 years ago
|
||
bugherder uplift |
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(jld)
You need to log in
before you can comment on or make changes to this bug.
Description
•