Closed Bug 988160 Opened 11 years ago Closed 8 years ago

Triple-buffering FrameBufferSurface

Categories

(Core Graveyard :: Widget: Gonk, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: vlin, Assigned: jerry)

References

Details

Attachments

(3 files, 1 obsolete file)

Currently, the FrameBuffer is double-buffering. http://dxr.mozilla.org/mozilla-central/source/widget/gonk/libdisplay/FramebufferSurface.cpp#41 That's bad in cases when Compositor takes time near 16ms or less. Compositor may miss the time to swap buffer because driver periodically reads the buffer with vsync. Once compositor misses the time to swap buffer, it has to wait for the next vsync[1]. Here we have profile[2] on swipe up&down in Settings. There're so many blockers('WaitForever') in several composition periods which makes janks. While we make it triple-buffering[3], those blockers disappear. The idea of triple-buffer is here. At least, we may need a preference for it. This bug doesn't consider buffer for content drawing being discussed in bug 846535, neither the case of HW composition. [1] Waiting for available buffer https://drive.google.com/file/d/0B4nXGzdjFCO7MTVRcVp4SnhOVjQ/edit?usp=sharing [2] Double-buffering https://drive.google.com/file/d/0B4nXGzdjFCO7WG8wUUtlYkl1cU0/edit?usp=sharing [3] Tripe-buffering https://drive.google.com/file/d/0B4nXGzdjFCO7TkJZLVMtNlE0UzQ/edit?usp=sharing
Component: Performance → Widget: Gonk
Product: Firefox OS → Core
(In reply to Vincent Lin[:vilin] from comment #0) > > The idea of triple-buffer is here. At least, we may need a preference for it. > http://www.androidpolice.com/2012/07/12/getting-to-know-android-4-1-part-3-project-butter-how-it-works-and-what-it-added/
Attached patch bug-988160-fix.patch (deleted) — Splinter Review
Attachment #8398392 - Flags: review?(mwu)
Attachment #8398392 - Flags: review?(mwu)
Attachment #8398392 - Flags: review+
Attachment #8398392 - Flags: feedback?(mwu)
Attachment #8398392 - Flags: review+
Attachment #8398392 - Flags: feedback?(mwu)
(In reply to Michael Wu [:mwu] from comment #3) > Android doesn't use 3 here - how come we need three? > http://androidxref.com/4.4.2_r1/xref/frameworks/native/services/ > surfaceflinger/DisplayHardware/FramebufferSurface.cpp Android did use 2 only by default. Actually, as I know, ODMs or chip vendors always modify it to 3. That highly depends on the capability of GPU or HW Overlay performance. The reason was described at the beginning of this bug.
Flags: needinfo?(sushilchauhan)
(In reply to Vincent Lin[:vilin] from comment #4) > Android did use 2 only by default. Actually, as I know, ODMs or chip vendors > always modify it to 3. That highly depends on the capability of GPU or HW > Overlay performance. But this is what's used in AOSP. It doesn't make sense for Google to not use the optimal configuration on their own Nexus devices. Additionally, the code aurora fork doesn't do this.
(In reply to Vincent Lin[:vilin] from comment #4) > Android did use 2 only by default. Actually, as I know, ODMs or chip vendors > always modify it to 3. That highly depends on the capability of GPU or HW > Overlay performance. > > The reason was described at the beginning of this bug. I believe it is needed for low end devices where GPU is not fast enough to complete Rendering + Composition within 1 vsync (16.66 msec). So instead of modifying FrameBufferSurface.cpp, I think it is better to set NUM_FRAMEBUFFER_SURFACE_BUFFERS = 3 in the Board config file of the device on which it is needed.
Flags: needinfo?(sushilchauhan)
Vincent, can you define NUM_FRAMEBUFFER_SURFACE_BUFFERS in board config file of device and confirm that it reduces the janks, mentioned in Comment 0 ?
Assignee: vlin → hshih
Blocks: 1035076
No longer blocks: Silk
We should check the profile result to see if we need triple-buffer or not.
I've been bugged by sporadic yet frequent jitter on all my Xperia devices since we started sony-aosp-l builds, noticeable mainly while scrolling. I recently enabled the FxOS boot animation (I had been building without it before) and noticed it also jitters; it seemed to be showing the frames in the wrong order. https://www.youtube.com/watch?v=hCV6a3UcyxQ The video above was recorded at a high frame rate and then slowed down to clearly show how the jitter is actually frames being shown in the wrong order. This is what led me to suspect bad buffering. We already have triple buffering enabled in the device repos but it is not being picked up by Gecko at build time. (https://github.com/mozilla-b2g/device-sony-yukon/blob/master/BoardConfig.mk#L63) Forcing triple buffers via this patch completely solves all jitter on all Xperia devices.
Attached patch test flickering (deleted) — Splinter Review
Hi Adam, With triple-buffer, I still can see the flickering. Here is my test patch attachment 8650323 [details] [diff] [review]. The device will only show the bootAnim forever. So it might be another problem. I try to use nexus5-l with the same gecko version, but I don't see the jittering bootanim at nexus5-l. Maybe we should create another bug for tracing.
Flags: needinfo?(adam)
I gave a look at the KK port for Z3/Z3c, and indeed we have |NUM_FRAMEBUFFER_SURFACE_BUFFERS := 3| that is not picked up from Gecko's build. It is being set in device/qcom/msm8974/BoardConfig.mk
@jerry Just by looking at hammerhead's BoardConfig.mk I don't see NUM_FRAMEBUFFER_SURFACE_BUFFERS defined so it's not doing triple buffering. Have you tried defining it there, as well as forcing it in Gecko? CAF sets triple buffering in all of their device repos (here's just a random sample): https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/msm8960/tree/BoardConfig.mk?id=LA.AF.1.1.1-02910-8064.0#n27 https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/msm8226/tree/BoardConfig.mk?h=LA.BF.1.1.1.c1_rb1&id=LA.BF.1.1.1.c5-00400-8x26.0#n24 https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/plutonium/tree/BoardConfig.mk?id=LA.BF64.1.2.1.c1-02610-8x94.0#n27 From my limited knowledge of B2G development, I understand it is generally build with CAF in mind. Perhaps the CAF code has some other changes in recent branches that assume triple buffering is enabled, and Gecko was written assuming the CAF changes were valid for AOSP too. That's just speculation, but it could be why (only?) non CAF based devices see the problem? I don't have an actual Nexus device to test on. Hopefully triple buffering doesn't depend on any kernel patches and you can fix it by simply enabling it in the device repo and forcing it in Gecko.
Flags: needinfo?(adam) → needinfo?(hshih)
Attached patch framebuffer_num.patch (obsolete) (deleted) — Splinter Review
export the framebuffer_num from boardconfig.mk to gecko. Some devices already define the "NUM_FRAMEBUFFER_SURFACE_BUFFERS", but gecko can't get that value. Gecko always use our default NUM_FRAMEBUFFER_SURFACE_BUFFERS setting[1]. This patch tries to export this value to gecko. [1] http://hg.mozilla.org/mozilla-central/annotate/29b2df16e961/widget/gonk/libdisplay/FramebufferSurface.cpp#l41
Attachment #8650406 - Flags: review?(mwu)
Hi Adam, I have already built the aries-l with triple-buffer(I force to use triple-buffer at [1]), but I still see the flickering animation. Why we need to turn on triple-buffer if it doesn't solve the flickering problem? I create the attachment 8650406 [details] [diff] [review]. If we have NUM_FRAMEBUFFER_SURFACE_BUFFERS in boardconfig.mk, use it. Otherwise, use our default value. [1] http://hg.mozilla.org/mozilla-central/annotate/29b2df16e961/widget/gonk/libdisplay/FramebufferSurface.cpp#l41
Flags: needinfo?(hshih) → needinfo?(adam)
All I know is that all CAF devices use it, AOSP devices don't seem to use it, and this appears to have solved the flickering problem for me on all my devices. Perhaps your flickering is caused by something unrelated?
Flags: needinfo?(adam)
Comment on attachment 8650406 [details] [diff] [review] framebuffer_num.patch Can you resubmit as a PR?
Attachment #8650406 - Flags: review?(mwu)
Attachment #8650406 - Attachment is obsolete: true
Attachment #8651075 - Flags: review?(mwu)
number of buffers should not related to the flickering. Somehow fencing seems not work correctly.
(In reply to Sotaro Ikeda [:sotaro] from comment #19) > number of buffers should not related to the flickering. Somehow fencing > seems not work correctly flickering needs to be addressed unrelated to number of buffers.
About boot animation case, fence handle is not set to correct DisplaySurface. During bootanimation, it should be set to mBootAnimDispSurface. It seems regression of Bug 116520.
(In reply to Sotaro Ikeda [:sotaro] from comment #21) > It seems regression of Bug 116520. Correction Bug 1165200.
flickering during scrolling seems to be caused by different problem. I suspect around fence delivery from Host side to client side.
The flickering problem for bootAnim case will be fixed at Bug 1197968.
(In reply to Sotaro Ikeda [:sotaro] from comment #23) > flickering during scrolling seems to be caused by different problem. I > suspect around fence delivery from Host side to client side. If the flickering happens scrolling area. It might to be related to TiledPaintedLayer. Tiled leayer's implementation was refactored several times occasionally.
Hi Adam, For bootAnim, we find the gfx bug and be resolved at Bug 1197968. As comment 23, I think the scrolling problem is not related to 3-buffer config here. We should create another bug to check the scrolling flickering. What's your testing environment? Do you have the high frame rate video for this scrolling issue?
Flags: needinfo?(adam)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #26) > Hi Adam, > > For bootAnim, we find the gfx bug and be resolved at Bug 1197968. As comment > 23, I think the scrolling problem is not related to 3-buffer config here. We > should create another bug to check the scrolling flickering. What's your > testing environment? Do you have the high frame rate video for this > scrolling issue? I can test only on Xperia devices, sirius-l, tianchi-l and flamingo-l. Sirius-l is virtually identical to leo-l.
Flags: needinfo?(adam)
I only have xperia z3c now, and I will try to test with z3c-l(aries-l). Do you have the high frame rate video to show the scroll flickering problem?
Flags: needinfo?(adam)
I can help with eagle-l and scorpion-l.
Comment on attachment 8651075 [details] PR: export framebuffer num to gecko. v1 Clearing review - it sounds like we don't need this anymore, and avoiding triple buffering will save memory if we're not taking advantage of it. Let me know if you still want it.
Attachment #8651075 - Flags: review?(mwu)
Hi Michael, If the NUM_FRAMEBUFFER_SURFACE_BUFFERS in boardconfig.mk is 3, should we still use our gecko setting value 2? The problem is that NUM_FRAMEBUFFER_SURFACE_BUFFERS is never defined even though we have that setting in boardconfig.mk. We always use this line[1]. The attachment 8651075 [details] tries to export it from boardconfig.mk. [1] https://hg.mozilla.org/mozilla-central/annotate/74fbd245369c474beaa7f2b1959570243e3dafaa/widget/gonk/libdisplay/FramebufferSurface.cpp#l41
Flags: needinfo?(mwu)
I'd really appreciate the NUM_FRAMEBUFFER_SURFACE_BUFFERS patch being merged, it drastically improves performance on my devices.
Adam, what kind of improvement do you see?
Enabling it eliminates a whole bunch of flickering and jank and stutter and makes it feel like a real OS, not just a slideshow. This difference really is like night and day.
Flags: needinfo?(adam)
(In reply to Adam Farden from comment #34) > Enabling it eliminates a whole bunch of flickering flickering seems to be caused by a bug of fence handling.
Bug 1172719 is fence handling regression of WebGL and 2d Canvas.
Michael I finally realised why I've have such a problem with this, but everyone else doesn't. I mostly test on tianchi which is a phablet device. It has the same pixel density as the flamingo but it is 6 inches instead of 4.5 inches. It is 720x1280, just like aries, but the bigger physical size seems to be causing issues. I've been building with tripple framebuffers for a long time but due to Bug 1206485 I did some reverts to have a stable build. Here's a video comparing the scrolling on flamingo, a normal device, to tianchi, a phablet. Notice the home button and notification bar are physically the same size, on tianchi it is just wider and has room for more icons. https://www.youtube.com/watch?v=a2BbR-6VQRA
Jerry, I think we should reconsider landing this?
Flags: needinfo?(hshih)
Comment on attachment 8651075 [details] PR: export framebuffer num to gecko. v1 How about comment 37 and comment 38?
Flags: needinfo?(hshih)
Attachment #8651075 - Flags: review?(mwu)
Comment on attachment 8651075 [details] PR: export framebuffer num to gecko. v1 I think Sotaro is best qualified to decide on this.
Flags: needinfo?(mwu)
Attachment #8651075 - Flags: review?(mwu) → review?(sotaro.ikeda.g)
(In reply to Adam Farden [:adfad666] from comment #37) > Here's a video comparing the scrolling on flamingo, a normal device, to > tianchi, a phablet. Notice the home button and notification bar are > physically the same size, on tianchi it is just wider and has room for more > icons. > > https://www.youtube.com/watch?v=a2BbR-6VQRA The video looks a bit wired that just scrolling on homescreen cause such jitter. It seems like that OpenGL module of the device does syncing at wrong timing(during composition).
On current gecko, just scrolling on homescreen seems not cause that heavy jitter if lower library works correctly.
IIRC, on JB devices, codeaurora's GL module did fence sync during EGLImage binding. It might affect to it.
(In reply to Sotaro Ikeda [:sotaro] from comment #44) > IIRC, on JB devices, codeaurora's GL module did fence sync during EGLImage > binding. It might affect to it. The GL module might be basically same even on L.
(In reply to Sotaro Ikeda [:sotaro] from comment #45) > (In reply to Sotaro Ikeda [:sotaro] from comment #44) > > IIRC, on JB devices, codeaurora's GL module did fence sync during EGLImage > > binding. It might affect to it. > > The GL module might be basically same even on L. Even on nexus devices, GL module works same way between different OS version.
(In reply to Adam Farden [:adfad666] from comment #37) > I mostly test on tianchi which is a phablet device. It has the same pixel > density as the flamingo but it is 6 inches instead of 4.5 inches. It is > 720x1280, just like aries, but the bigger physical size seems to be causing > issues. I do not think the size difference make the problem, since Composition is done by GPU. tianchi(Xperia T2 Ultra) was shipped originally as JB. flamingo(Xperia E3) was shipped originally as KK. codeaurara's GPU module works totally differently between JB and KK. As in Comment 44, in JB's gpu module, fence was called during composition. But in KK's GPU module, the problem was addressed. It works more async way.
Fence is implemented on android since JB. Then codeaurora seemed to fail to handle Fence correctly.
From Comment 41 to Comment 48 are just my assumptions. If it is correct, making triple buffers is just handle defect that exist only on GL module that created for JB.
Firefox OS assumes at least android KK to work it correctly.
The following is a profile got from nexus-5-l during homescreen scrolling. Compositor spent majority of time to wait event. http://people.mozilla.org/~bgirard/cleopatra/#report=9bcd78d5f4612c8d7bc131e1263900619590b67e&selection=0,1,2,3
Adam, can you take a profile when the problem happens? Performance profile can be started and stopped with the following command. You need to build and flash ROM to get correct native stack. ./profile.sh start -p b2g -t Compositor -f stackwalk,threads -i 1 ./profile.sh capture The following has more explanation about profiling. https://developer.mozilla.org/ja/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
Flags: needinfo?(afarden)
All Sony devices are built on L. The video above shows tianchi and flamingo built with the same kernel as aries-l (all sony-l devices build on L with the same kernel.) I'll try the profile after the holidays (I'll keep NI here to remind me).
Attachment #8651075 - Flags: review?(sotaro.ikeda.g)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Flags: needinfo?(afarden)
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: