Closed Bug 880108 Opened 11 years ago Closed 11 years ago

SkiaGL crashes as it tries to MakeCurrent a generic GLContext, which is pure virtual there

Categories

(Core :: Graphics: Canvas2D, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

Running today's 'graphics' branch in Valgrind, I got crashes with this Valgrind message: "pure virtual method called". Running in GDB, I get this: Program received signal SIGSEGV, Segmentation fault. mozilla::gl::GLContext::MakeCurrent (this=0x7fff8e52a000, aForce=false) at ../../../dist/include/GLContext.h:189 189 return MakeCurrentImpl(aForce); (gdb) bt #0 mozilla::gl::GLContext::MakeCurrent (this=0x7fff8e52a000, aForce=false) at ../../../dist/include/GLContext.h:189 #1 0x00007ffff4822d53 in GrGpuGL::HWGeometryState::setVertexBufferID (this=0x7fff8dafd630, gpu=0x7fff8dafd000, id=273) at /hack/mozilla-graphics/gfx/skia/src/gpu/gl/GrGpuGL.h:343 #2 0x00007ffff4823222 in GrGLBufferImpl::updateData (this=0x7fff8edfdd90, gpu=0x7fff8dafd000, src=0x7fff8d73f000, srcSizeInBytes=96) at /hack/mozilla-graphics/gfx/skia/src/gpu/gl/GrGLBufferImpl.cpp:117 #3 0x00007ffff47f4179 in GrBufferAllocPool::flushCpuData (this=0x7fff8ea9dd80, buffer=0x7fff8edfdd30, flushSize=96) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrBufferAllocPool.cpp:363 #4 0x00007ffff47f427f in GrBufferAllocPool::unlock (this=0x7fff8ea9dd80) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrBufferAllocPool.cpp:113 #5 0x00007ffff480bae4 in GrInOrderDrawBuffer::flush (this=0x7fff8d73c000) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrInOrderDrawBuffer.cpp:441 #6 0x00007ffff47fbc57 in GrContext::flush (this=0x7fff8ea9dc40, flagsBitfield=0) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrContext.cpp:1080 #7 0x00007ffff47fd51f in GrContext::readRenderTargetPixels (this=0x7fff8ea9dc40, target=0x7fff8d7145e0, left=0, top=0, width=30, height=50, dstConfig= kBGRA_8888_GrPixelConfig, buffer=0x7fff8eae1000, rowBytes=120, flags=0) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrContext.cpp:1203 #8 0x00007ffff4811f74 in GrRenderTarget::readPixels (this=0x7fff8d7145e0, left=0, top=0, width=30, height=<optimized out>, config=<optimized out>, buffer= 0x7fff8eae1000, rowBytes=120, pixelOpsFlags=0) at /hack/mozilla-graphics/gfx/skia/src/gpu/GrRenderTarget.cpp:31 #9 0x00007ffff48223cf in SkGrPixelRef::onReadPixels (this=<optimized out>, dst=0x7fff8ea9df10, subset=<optimized out>) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGrPixelRef.cpp:178 #10 0x00007ffff48221fc in SkROLockPixelsPixelRef::onLockPixels (this=0x7fff8ea9dec0, ctable=<optimized out>) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGrPixelRef.cpp:33 #11 0x00007ffff48a9949 in SkPixelRef::lockPixels (this=0x7fff8ea9dec0) at /hack/mozilla-graphics/gfx/skia/src/core/SkPixelRef.cpp:116 #12 0x00007ffff48465f3 in SkBitmap::lockPixels (this=0x7fffbe870648) at /hack/mozilla-graphics/gfx/skia/src/core/SkBitmap.cpp:346 #13 0x00007ffff4821135 in sk_gr_create_bitmap_texture (ctx=0x7fff8cb222e0, cache=true, params=0x7fffffff987c, origBitmap=...) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGr.cpp:91 #14 0x00007ffff48214d2 in GrLockAndRefCachedBitmapTexture (ctx=0x7fff8cb222e0, bitmap=..., params=0x7fffffff987c) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGr.cpp:189 #15 0x00007ffff484a9bf in SkBitmapProcShader::asNewEffect (this=0x7fffbe8705e0, context=0x7fff8cb222e0, paint=...) at /hack/mozilla-graphics/gfx/skia/src/core/SkBitmapProcShader.cpp:356 #16 0x00007ffff481e38f in (anonymous namespace)::skPaint2GrPaintShader (dev=0x7fff8d715240, skPaint=..., constantColor=true, grPaint=0x7fffffff99e0) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGpuDevice.cpp:572 #17 0x00007ffff481e7a3 in SkGpuDevice::drawRect (this=0x7fff8d715240, draw=..., rect=..., paint=...) at /hack/mozilla-graphics/gfx/skia/src/gpu/SkGpuDevice.cpp:697 #18 0x00007ffff486f064 in SkCanvas::drawRect (this=0x7fff9ff6a1a0, r=..., paint=...) at /hack/mozilla-graphics/gfx/skia/src/core/SkCanvas.cpp:1546 #19 0x00007ffff47ce77b in mozilla::gfx::DrawTargetSkia::FillRect (this=0x7fff8eada380, aRect=..., aPattern=..., aOptions=...) at /hack/mozilla-graphics/gfx/2d/DrawTargetSkia.cpp:369 #20 0x00007ffff366e889 in mozilla::dom::CanvasRenderingContext2D::FillRect (this=0x7fff8e6edc00, x=0, y=0, w=100, h=50) at /hack/mozilla-graphics/content/canvas/src/CanvasRenderingContext2D.cpp:1567 #21 0x00007ffff41d60f4 in mozilla::dom::CanvasRenderingContext2DBinding::fillRect (cx=0x7fffca875860, obj=..., self=0x7fff8e6edc00, argc=<optimized out>, vp= 0x7fffe32fe128) at /hack/mozilla-graphics/obj-firefox-valgrind/dom/bindings/CanvasRenderingContext2DBinding.cpp:1676 #22 0x00007ffff41d6b2f in mozilla::dom::CanvasRenderingContext2DBinding::genericMethod (cx=0x7fffca875860, argc=4, vp=0x7fffe32fe128) at /hack/mozilla-graphics/obj-firefox-valgrind/dom/bindings/CanvasRenderingContext2DBinding.cpp:4129 #23 0x00007ffff4afca29 in js::CallJSNative (cx=0x7fffca875860, native= 0x7ffff41d6a30 <mozilla::dom::CanvasRenderingContext2DBinding::genericMethod(JSContext*, unsigned int, JS::Value*)>, args=...) at /hack/mozilla-graphics/js/src/jscntxtinlines.h:349 #24 0x00007ffff4b08fe6 in js::Invoke (cx=0x7fffca875860, args=..., construct=js::NO_CONSTRUCT) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:395 ---Type <return> to continue, or q <return> to quit---q Quit (gdb) l 184 // support contexts used on multiple threads. 185 NS_ASSERTION(IsOwningThreadCurrent(), 186 "MakeCurrent() called on different thread than this context was created on!"); 187 #endif 188 #endif 189 return MakeCurrentImpl(aForce); 190 } 191 192 virtual bool IsCurrent() = 0; 193 (gdb) p this $1 = (mozilla::gl::GLContext * const) 0x7fff8e52a000 See how |this| is a GLContext*, not a GLContextGLX* (I am on Linux) or anything concrete. So MakeCurrentImpl is a pure virtual method, which explains the crash. How did Skia/GL get hold of a generic GLContext?
I think gdb just shows you whatever the local is declared as. It has no way of inspecting the actual instance to know what it is. I think we are just calling MakeCurrent() on a dead instance. I added some logging: I/Gecko (11290): SNORP: MakeCurrent() on = 0x75f68c00 <snip some more MakeCurrent() calls> I/Gecko (11290): SNORP: destroying GLContext 0x75f68c00 I/Gecko (11290): SNORP: fCallbackData = 0x75fb5000 I/Gecko (11290): SNORP: MakeCurrent() on = 0x75fb5000 <snip some more MakeCurrent() calls> I/Gecko (11290): SNORP: MakeCurrent() on = 0x75f68c00 F/libc (11290): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 11401 (Gecko) So you can see we're calling MakeCurrent() on 0x75f68c00, which was destroyed earlier. AFAICT, a new instance was not created at that address. I'm not sure why the address always appears to be null. Maybe the destructor zeroes the vtable? That seems strange, though.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #2) > I think gdb just shows you whatever the local is declared as. It has no way > of inspecting the actual instance to know what it is. No no, it doesn't depend on how the local pointer is declared, it gets actual type information from the vtable pointer and DWARF. > I think we are just > calling MakeCurrent() on a dead instance. That is very possible indeed. > I added some logging: > > I/Gecko (11290): SNORP: MakeCurrent() on = 0x75f68c00 > <snip some more MakeCurrent() calls> > I/Gecko (11290): SNORP: destroying GLContext 0x75f68c00 > I/Gecko (11290): SNORP: fCallbackData = 0x75fb5000 > I/Gecko (11290): SNORP: MakeCurrent() on = 0x75fb5000 > <snip some more MakeCurrent() calls> > I/Gecko (11290): SNORP: MakeCurrent() on = 0x75f68c00 > F/libc (11290): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread > 11401 (Gecko) > > So you can see we're calling MakeCurrent() on 0x75f68c00, which was > destroyed earlier. AFAICT, a new instance was not created at that address. > I'm not sure why the address always appears to be null. Maybe the destructor > zeroes the vtable? That seems strange, though. Interesting, I have no idea at this point.
So, Skia shouldn't ever be destroying the GLContext object, as the only place that knows about it is GLContextSkia.cpp inside gfx/gl, and that's not held in a smart pointer, nor is it ever deleted. All signs point to this being an issue inside CanvasRenderingContext2D, which manages the GLContext object's lifetime.
Here is the end of a GDB session where I was logging stacks to all ~GLContext() calls. Breakpoint 1, mozilla::gl::GLContext::~GLContext (this=0x7fffb93b9800, __in_chrg=<optimized out>) at /hack/mozilla-graphics/gfx/gl/GLContext.h:135 135 virtual ~GLContext() { #0 mozilla::gl::GLContext::~GLContext (this=0x7fffb93b9800, __in_chrg=<optimized out>) at /hack/mozilla-graphics/gfx/gl/GLContext.h:135 #1 0x00007ffff45985e5 in mozilla::gl::GLContextGLX::~GLContextGLX (this=0x7fffb93b9800, __in_chrg=<optimized out>) at /hack/mozilla-graphics/gfx/gl/GLContextProviderGLX.cpp:854 #2 0x00007ffff3668b35 in mozilla::gl::GLContext::Release (this=0x7fffb93b9800) at ../../../dist/include/GLContext.h:87 #3 0x00007ffff366a8fa in nsRefPtr<mozilla::gl::GLContext>::operator=<mozilla::gl::GLContext> (this=0x7fffb91c3ca0, rhs=...) at ../../../dist/include/nsAutoPtr.h:945 #4 0x00007ffff366aa5f in mozilla::dom::CanvasRenderingContext2D::EnsureTarget (this=0x7fffb91c3c00) at /hack/mozilla-graphics/content/canvas/src/CanvasRenderingContext2D.cpp:793 #5 0x00007ffff366e6fc in mozilla::dom::CanvasRenderingContext2D::FillRect (this=0x7fffb91c3c00, x=0, y=0, w=100, h=50) at /hack/mozilla-graphics/content/canvas/src/CanvasRenderingContext2D.cpp:1558 #6 0x00007ffff41d60f4 in mozilla::dom::CanvasRenderingContext2DBinding::fillRect (cx=0x7fffcbdf5470, obj=..., self=0x7fffb91c3c00, argc=<optimized out>, vp=0x7fffe32fe128) at /hack/mozilla-graphics/obj-firefox-valgrind/dom/bindings/CanvasRenderingContext2DBinding.cpp:1676 #7 0x00007ffff41d6b2f in mozilla::dom::CanvasRenderingContext2DBinding::genericMethod (cx=0x7fffcbdf5470, argc=4, vp=0x7fffe32fe128) at /hack/mozilla-graphics/obj-firefox-valgrind/dom/bindings/CanvasRenderingContext2DBinding.cpp:4129 #8 0x00007ffff4afca29 in js::CallJSNative (cx=0x7fffcbdf5470, native=0x7ffff41d6a30 <mozilla::dom::CanvasRenderingContext2DBinding::genericMethod(JSContext*, unsigned int, JS::Value*)>, args=...) at /hack/mozilla-graphics/js/src/jscntxtinlines.h:349 #9 0x00007ffff4b08fe6 in js::Invoke (cx=0x7fffcbdf5470, args=..., construct=js::NO_CONSTRUCT) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:395 #10 0x00007ffff4b05cee in js::Interpret (cx=0x7fffcbdf5470, entryFrame=0x7fffe32fe040, interpMode=js::JSINTERP_NORMAL, useNewType=<optimized out>) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:2218 #11 0x00007ffff4b08a80 in js::RunScript (cx=0x7fffcbdf5470, fp=0x7fffe32fe040) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:352 #12 0x00007ffff4b09058 in js::Invoke (cx=0x7fffcbdf5470, args=..., construct=js::NO_CONSTRUCT) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:408 #13 0x00007ffff4b09574 in js::Invoke (cx=0x7fffcbdf5470, thisv=..., fval=..., argc=1, argv=0x7fffffffb488, rval=0x7fffffffb3f8) at /hack/mozilla-graphics/js/src/vm/Interpreter.cpp:441 #14 0x00007ffff4b7e517 in JS_CallFunctionValue (cx=0x7fffcbdf5470, objArg=<optimized out>, fval=$jsval((JSObject *) 0x7fffbc4b3ec0 [object Function "window.onload"]), argc=1, argv=0x7fffffffb488, rval=<optimized out>) at /hack/mozilla-graphics/js/src/jsapi.cpp:5883 #15 0x00007ffff4229b01 in mozilla::dom::EventHandlerNonNull::Call (this=<optimized out>, cx=0x7fffcbdf5470, aThisObj=(JSObject * const) 0x7fffc7025a80 [object Proxy], event=..., aRv=...) at /hack/mozilla-graphics/obj-firefox-valgrind/dom/bindings/EventHandlerBinding.cpp:41 Program received signal SIGSEGV, Segmentation fault. mozilla::gl::GLContext::MakeCurrent (this=0x7fffb93b9800, aForce=false) at ../../../dist/include/GLContext.h:189 189 return MakeCurrentImpl(aForce); So it's CanvasRenderingContext2D::EnsureTarget that is fiddling with the nsRefPtr that is keeping the GLContext alive.
Right here: if (layerManager) { #ifdef USE_SKIA_GPU if (gfxPlatform::GetPlatform()->UseAcceleratedSkiaCanvas()) { SurfaceCaps caps = SurfaceCaps::ForRGBA(); caps.preserve = true; mGLContext = mozilla::gl::GLContextProvider::CreateOffscreen(gfxIntSize(size.width, size.height), caps, mozilla::gl::GLContext::ContextFlagsNone); EnsureTarget is re-creating everything because mTarget was null... how do we end up in this situation where mTarget is null even though we do have a GLContext?
Looking through, Reset() is one of the places that can destroy mTarget. We definitely want to also destroy mGLContext there as well, whether that fixes this particular issue or not.
Confirmed that Reset() is the place that clears the mTarget, just before the crash. While running Canvas 2D tests, that Reset() call is called from ClearTarget(), called from SetDimensions(), triggered by a change of the canvas' width/height attributes.
So the mystery, of course, is why the clearing of mGLContext (I am using the latest graphics branch code) does not avoid the crash. looking.
OK, the answer is that Skia/GL is holding a raw pointer to the GLContext, so it keeps referencing it after CanvasRenderingContext2D::Reset() dropped its reference to it, causing it to go away. CanvasRenderingContext2D::Reset() needs to inform SkiaGL that it shouldn't be talking to that old GLContext anymore.
So, how do you inform Skia, or moz2d, or whatever, of the life and death of the OpenGL context that it should be using?
Flags: needinfo?(gwright)
Hah, I get it now. This raw pointer to GLContext is in GLContextSkia.cpp: static GLContext* sGLContext; extern "C" { void EnsureGLContext(const GrGLInterface* interface) { sGLContext = (GLContext*)(interface->fCallbackData); sGLContext->MakeCurrent(); // <--- this is the MakeCurrent that crashes } I would like to understand the design there. Are we multiplexing a single GLContext across all SkiaGL canvases? If yes, why does CanvasRenderingContext2D have a mGLContext?
OK, so this EnsureGLContext is just an abstraction around the concept of 'current GL context' with the approximation that thread==process. It's clear how we can fix this crash now by clearing interface->fCallbackData when the GLContext goes away, and checking for null pointer in EnsureGLContext. What I really don't know is what would "good design" be around this.
There is a separate GrGLInterface per GrContext (and thus per GLContext) object, which holds a pointer to the GLContext object. EnsureGLContext() is a hack to allow for the GrGLInterface to work with GLContext class methods, because GrGLInterface expects to be given C function pointers for each GL function. If the GLContext goes away, we should ensure that the GrContext (and by extension, the GrGLInterface) is destroyed. This was the original motivation behind setting a destructor callback on the GLContext object so that the GrContext gets properly destroyed when the GLContext goes away. Why are we losing the GLContext, though? It seems to me like that's something that we don't want to happen.
Flags: needinfo?(gwright)
(In reply to George Wright (:gw280) from comment #14) > Why are we losing the GLContext, though? It seems to me like that's > something that we don't want to happen. Indeed we shouldn't. Note that Reset is called everytime the canvas is resized (comment 8). So now that explains why Reset was only clearing mTarget and not nGLContext. I shouldn't have suggested to add a mGLContext=null there. Instead, the bug is that EnsureTarget recreates the mGLContext whenever it recreates the target. In WebGL, resizing the canvas causes the FBO (which plays the role of the drawtarget) to be resized but the GLcontext is obviously not recreated. This should be the same.
Assignee: nobody → bjacob
This was fixed by Benoit a while back
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
You need to log in before you can comment on or make changes to this bug.