Bad remote symbol unwinding in Mali OpenGL driver
Categories
(Core :: Gecko Profiler, defect)
Tracking
()
People
(Reporter: bholley, Unassigned)
Details
I can get a profile with good libXUL symbols for a local build on an HD8 tablet, but the samples on the Renderer thread that land in the GL driver are wacky: http://bit.ly/2VvLktR
Reporter | ||
Comment 1•6 years ago
|
||
Jed, see the private link to the library pulled from the phone. It would greatly help graphics performance work if we could get these symbols working reliably.
Comment 2•6 years ago
|
||
There's a large range that maps to EXIDX_CANTUNWIND
:
(gdb) x/3xw 0x15d90e4
0x15d90e4: 0x7ed72e44 0x00000001 0x7f1456a4
(gdb) p/x 0x15d90e4 + *(long*)0x15d90e4 - (1<<31)
$23 = 0x34bf28
(gdb) p/x 0x15d90ec + *(long*)0x15d90ec - (1<<31)
$24 = 0x71e790
Judging by the symbol table, that range includes the entire OpenGL implementation. There is also no DWARF unwind info (contrast bug 1529108).
Theoretically it would be possible to implement something like Breakpad's stack scanning, which sweeps stack memory looking for plausible return addresses, but in practice this often produces false positives from old frames that already returned and were only partially overwritten.
It might also be possible to, as a special case for dealing with GL libraries, store enough information to restart stack unwinding for our code before calling into the library. (Didn't we handle unwinding across jitcode like that at one point?) I think we have a wrapper layer around GL calls to deal with library loading, in which case there would at least be an existing place to put code to do that, but someone would have to write it.
Reporter | ||
Comment 3•6 years ago
|
||
(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #3)
There's a large range that maps to
EXIDX_CANTUNWIND
:(gdb) x/3xw 0x15d90e4 0x15d90e4: 0x7ed72e44 0x00000001 0x7f1456a4 (gdb) p/x 0x15d90e4 + *(long*)0x15d90e4 - (1<<31) $23 = 0x34bf28 (gdb) p/x 0x15d90ec + *(long*)0x15d90ec - (1<<31) $24 = 0x71e790
Judging by the symbol table, that range includes the entire OpenGL implementation. There is also no DWARF unwind info (contrast bug 1529108).
Ah yeah, that seems tricky then. I guess it's a driver-by-driver thing? I wonder how likely this is to reoccur on other drivers (it obviously works for Adrenos, per bug 1529108).
It might also be possible to, as a special case for dealing with GL libraries, store enough information to restart stack unwinding for our code before calling into the library. (Didn't we handle unwinding across jitcode like that at one point?) I think we have a wrapper layer around GL calls to deal with library loading, in which case there would at least be an existing place to put code to do that, but someone would have to write it.
That's an interesting idea - we have auto-generated bindings via gleam, and it would be fairly straightforward to put this code in if the overhead were low. Is there something existing to crib from that already works correctly with the stackwalker?
Reporter | ||
Comment 4•6 years ago
|
||
Markus pointed me to the implementation of the JIT trick [1]. Unfortunately, this relies on the JIT trampoline having stashed all the general-purpose registers in memory, and so that they can be restored (ARM unwinding can depend on arbitrary GPRs). We already do that for the JIT so the only added overhead is during unwinding, but it's probably not tenable to add that for every GL call.
I think a more tractable solution might be to interface more directly with the profiler: time every GL call, and if it's above some threshold (say, 1ms), add a marker to the timeline. I'll give this a try when I get a minute.
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 5•6 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #5)
I think a more tractable solution might be to interface more directly with the profiler: time every GL call, and if it's above some threshold (say, 1ms), add a marker to the timeline. I'll give this a try when I get a minute.
That landed in bug 1532810. WONTFIXing this one, since we don't have another realistic path forward.
Description
•