Closed Bug 864210 Opened 12 years ago Closed 11 years ago

Camera preview cause high CPU usage in Compositor thread on Unagi

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: chiajung, Unassigned)

Details

(Keywords: perf, Whiteboard: c= s=2013.05.31 ,)

Attachments

(4 files, 1 obsolete file)

Attached file perf output (deleted) —
The camera preview currently causes very high CPU usage. Here is sample top output:
User 65%, System 11%, IOW 22%, IRQ 0%
User 202 + Nice 5 + Sys 37 + Idle 0 + IOW 72 + IRQ 0 + SIRQ 0 = 316
PID TID PR CPU% S VSS RSS PCY UID Thread Proc
3086 3108 0 57% R 184892K 70108K fg root Compositor /system/b2g/b2g
3086 3086 0 3% S 184892K 70108K fg root b2g /system/b2g/b2g
3300 3300 0 2% R 1116K 476K fg root top top
3221 3221 0 2% R 109204K 27316K fg app_3221 Camera /system/b2g/plugin-container
3086 3093 0 1% S 184892K 70108K fg root Gecko_IOThread /system/b2g/b2g
A perf report is attached (generated with perf record -a -g). Since perf cannot generate a stack for memcpy, and this code path should not call memcpy, I commented out pieces of code to narrow it down. As a result, I found that the call fEGLImageTargetTexture2D(LOCAL_GL_TEXTURE_EXTERNAL, image); in GLContextProviderEGL.cpp causes the high CPU usage.
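For reference, the percentage line in the top output above can be reproduced from the tick counts on the line that follows it. A minimal sketch, assuming (inferred from the samples in this bug, not checked against the top source) that Nice is counted as User, SIRQ as IRQ, and that percentages are truncated; the function name is hypothetical:

```python
def summary_percentages(user, nice, system, idle, iow, irq, sirq):
    """Recompute the 'User x%, System y%, IOW z%, IRQ w%' line from tick counts."""
    total = user + nice + system + idle + iow + irq + sirq
    # Assumptions inferred from the figures in this bug: Nice is folded into
    # User and the result is truncated. IRQ and SIRQ are 0 in every sample
    # here, so folding SIRQ into IRQ cannot be checked against the thread.
    return {
        "total": total,
        "user": (user + nice) * 100 // total,
        "system": system * 100 // total,
        "iow": iow * 100 // total,
        "irq": (irq + sirq) * 100 // total,
    }


# Tick counts from the first sample in this bug (r126237).
sample = summary_percentages(202, 5, 37, 0, 72, 0, 0)
print(sample)
```

Idle is 0 in this sample, so the CPU was fully busy for the whole sampling interval.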
Blocks: 860441
Summary: Camera preview cause high CPU usage in Compositor thread → Camera preview cause high CPU usage in Compositor thread on Unagi
The data above was collected on mozilla-central r126237. On r129442, the camera preview jitters and the CPU usage looks like this:
User 41%, System 10%, IOW 9%, IRQ 0%
User 125 + Nice 3 + Sys 31 + Idle 122 + IOW 28 + IRQ 0 + SIRQ 0 = 309
PID TID PR CPU% S VSS RSS PCY UID Thread Proc
5402 5425 0 36% S 177056K 64784K fg root Compositor /system/b2g/b2g
5506 5599 0 2% S 102020K 28584K fg app_5506 Camera /system/b2g/plugin-container
5630 5630 0 2% R 1108K 460K fg root top top
5402 5402 0 1% S 177056K 64784K fg root b2g /system/b2g/b2g
5506 5506 0 1% S 102020K 28584K fg app_5506 Camera /system/b2g/plugin-container
Attached file perf output (r129442) (obsolete) (deleted) —
Here is a new perf report, tested against r129442. By the way, it seems the camera preview sometimes shows an old frame, which makes the preview look jumpy.
For the jitter/jumpy part, I think the problem is IPC related. I added some logging to GrallocTextureHostOGL:
04-22 17:07:47.313 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd104
04-22 17:07:47.504 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
04-22 17:07:47.594 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd104
04-22 17:07:47.784 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
04-22 17:07:47.864 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd784
04-22 17:07:48.074 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46d4ff84
04-22 17:07:48.184 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb04
04-22 17:07:48.394 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4560ba84
04-22 17:07:48.474 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
The buffer update pattern looks strange.
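The irregularity in the log above is easier to see as intervals between consecutive updates. A small sketch that parses the logcat timestamps quoted in this comment (the helper name is hypothetical, and it assumes all lines fall within the same day):

```python
LOG = """\
04-22 17:07:47.313 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd104
04-22 17:07:47.504 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
04-22 17:07:47.594 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd104
04-22 17:07:47.784 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
04-22 17:07:47.864 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46ddd784
04-22 17:07:48.074 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46d4ff84
04-22 17:07:48.184 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb04
04-22 17:07:48.394 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4560ba84
04-22 17:07:48.474 5950 5973 I GrallocTextureHostOGL: Update new graphicBuffer: 0x46dddb84
"""


def clock_to_ms(clock):
    """Convert 'HH:MM:SS.mmm' to milliseconds since midnight."""
    hours, minutes, rest = clock.split(":")
    seconds, millis = rest.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def update_intervals_ms(log_text):
    """Return the gaps, in ms, between consecutive 'Update new graphicBuffer' lines."""
    stamps = [
        clock_to_ms(line.split()[1])
        for line in log_text.splitlines()
        if "Update new graphicBuffer" in line
    ]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


intervals = update_intervals_ms(LOG)
print(intervals)
print(sum(intervals) / len(intervals))
```

The gaps alternate between roughly 80 to 110 ms and 190 to 210 ms, averaging about 145 ms per update, which is about 7 updates per second.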
I added some logging to CameraPreviewMediaStream as well. The camera produces several frames before the b2g process picks one up. I think this is the source of the jitter.
04-22 17:41:22.816 6124 6147 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4757ab84
04-22 17:41:22.836 6261 6264 I CameraPreviewMediaStream: New buffer: 0x43ea4e84
04-22 17:41:22.886 6261 6307 I CameraPreviewMediaStream: New buffer: 0x44237784
04-22 17:41:22.916 6261 6262 I CameraPreviewMediaStream: New buffer: 0x43ea4404
04-22 17:41:22.946 6261 6264 I CameraPreviewMediaStream: New buffer: 0x43ea4484
04-22 17:41:22.986 6261 6307 I CameraPreviewMediaStream: New buffer: 0x43ea5104
04-22 17:41:23.026 6261 6262 I CameraPreviewMediaStream: New buffer: 0x43ea5304
04-22 17:41:23.036 6261 6264 I CameraPreviewMediaStream: New buffer: 0x43ea4f84
04-22 17:41:23.076 6261 6307 I CameraPreviewMediaStream: New buffer: 0x43ea5504
04-22 17:41:23.076 6124 6147 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4757a804
04-22 17:41:23.116 6261 6262 I CameraPreviewMediaStream: New buffer: 0x43ea4e84
04-22 17:41:23.126 6124 6147 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4757a804
04-22 17:41:23.136 6261 6264 I CameraPreviewMediaStream: New buffer: 0x44237784
04-22 17:41:23.176 6261 6307 I CameraPreviewMediaStream: New buffer: 0x43ea4404
04-22 17:41:23.226 6261 6262 I CameraPreviewMediaStream: New buffer: 0x43ea4484
04-22 17:41:23.236 6261 6264 I CameraPreviewMediaStream: New buffer: 0x43ea5104
04-22 17:41:23.276 6261 6307 I CameraPreviewMediaStream: New buffer: 0x43ea5304
04-22 17:41:23.326 6261 6262 I CameraPreviewMediaStream: New buffer: 0x43ea4f84
04-22 17:41:23.346 6261 6264 I CameraPreviewMediaStream: New buffer: 0x43ea5504
04-22 17:41:23.376 6124 6147 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4757b204
04-22 17:41:23.376 6261 6307 I CameraPreviewMediaStream: New buffer: 0x43ea4e84
04-22 17:41:23.416 6261 6262 I CameraPreviewMediaStream: New buffer: 0x44237784
04-22 17:41:23.426 6124 6147 I GrallocTextureHostOGL: Update new graphicBuffer: 0x4757b204
We should fix this problem first, then look into why CPU consumption is high.
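To quantify how many camera frames are produced per compositor-side update in the log above, the events can simply be counted. A sketch with the log reduced to timestamp and tag (the helper name is hypothetical):

```python
EVENTS = """
17:41:22.816 GrallocTextureHostOGL
17:41:22.836 CameraPreviewMediaStream
17:41:22.886 CameraPreviewMediaStream
17:41:22.916 CameraPreviewMediaStream
17:41:22.946 CameraPreviewMediaStream
17:41:22.986 CameraPreviewMediaStream
17:41:23.026 CameraPreviewMediaStream
17:41:23.036 CameraPreviewMediaStream
17:41:23.076 CameraPreviewMediaStream
17:41:23.076 GrallocTextureHostOGL
17:41:23.116 CameraPreviewMediaStream
17:41:23.126 GrallocTextureHostOGL
17:41:23.136 CameraPreviewMediaStream
17:41:23.176 CameraPreviewMediaStream
17:41:23.226 CameraPreviewMediaStream
17:41:23.236 CameraPreviewMediaStream
17:41:23.276 CameraPreviewMediaStream
17:41:23.326 CameraPreviewMediaStream
17:41:23.346 CameraPreviewMediaStream
17:41:23.376 GrallocTextureHostOGL
17:41:23.376 CameraPreviewMediaStream
17:41:23.416 CameraPreviewMediaStream
17:41:23.426 GrallocTextureHostOGL
"""


def frames_per_update(events):
    """Count camera 'New buffer' lines between consecutive texture updates."""
    counts = []
    current = None  # stays None until the first texture update is seen
    for line in events.strip().splitlines():
        _clock, tag = line.split()
        if tag == "GrallocTextureHostOGL":
            if current is not None:
                counts.append(current)
            current = 0
        elif tag == "CameraPreviewMediaStream" and current is not None:
            current += 1
    return counts


counts = frames_per_update(EVENTS)
print(counts, sum(counts))
```

In this capture, 18 camera frames were produced while only 4 further texture updates arrived, in an uneven 8/1/7/2 pattern.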
Attached patch test patch (deleted) — Splinter Review
This hacky patch makes the camera preview smooth, which makes it easy to see the high CPU consumption in the Compositor thread. The camera preview jitters because the ImageBridge thread may block on IPC while the camera preview thread keeps generating new tasks. When the ImageBridge thread completes the previous blocking operation, it finds that most tasks in the queue are out of date. As a result, most frames are skipped and the preview jitters. This patch makes the camera preview thread block on IPC itself, which prevents the jitter.
Attached file perf output (r129442) after the patch (deleted) —
After applying the patch, the top result becomes:
User 56%, System 14%, IOW 29%, IRQ 0%
User 175 + Nice 2 + Sys 46 + Idle 0 + IOW 92 + IRQ 0 + SIRQ 0 = 315
PID TID PR CPU% S VSS RSS PCY UID Thread Proc
1476 1498 0 54% S 177956K 67960K fg root Compositor /system/b2g/b2g
1476 1476 0 2% S 177956K 67960K fg root b2g /system/b2g/b2g
1353 1353 0 2% S 0K 0K fg root kworker/0:0
1635 1635 0 2% R 1108K 464K fg root top top
1476 1491 0 0% S 177956K 67960K fg root Timer /system/b2g/b2g
1572 1577 0 0% S 87748K 28564K fg app_1572 Chrome_ChildThr /system/b2g/plugin-container
118 1603 0 0% S 2220K 508K fg root akmd8962_new /system/bin/akmd8962_new
873 1605 0 0% S 35920K 6028K fg media mediaserver /system/bin/mediaserver
1572 1576 0 0% S 87748K 28564K fg app_1572 Binder Thread # /system/b2g/plugin-container
1476 1513 0 0% S 177956K 67960K fg root GL updater /system/b2g/b2g
The perf data is attached.
Attachment #740189 - Attachment is obsolete: true
QA Contact: milan
Here is perf data with a naive memcpy implementation in BionicGlue. The caller of memcpy is ioctl_kgsl_sharedmem_write (in libgsl), which may explain the experiment result in the bug description. @Diego, can you comment on why libgsl causes a memcpy when binding an external texture?
Flags: needinfo?(dwilson)
Jeff, let's take a look at this as a priority today.
Assignee: nobody → jmuizelaar
I think the root of this issue may be bug 864017. Dup?
Flags: needinfo?(dwilson)
This could be different from bug 864017, which is about GRALLOC_PLANAR_YCBCR. Camera preview in b2g18 uses GONK_IO_SURFACE.
Interesting. Using the fix in bug 862952 to enable HWComposer, the frame rate goes up to 30 fps and the CPU usage goes down to ~20%. That means the problem in GPU composition is very likely in the way GONK_IO_SURFACE is bound in the new compositor.
As comment 10 says, this is different from bug 864017. Bug 864017 is about implementing the GRALLOC_PLANAR_YCBCR image format for software-decoded images. Camera preview uses GONK_IO_SURFACE, which regressed after LayerRefactoring and was fixed in bug 860441. This problem can be seen before LayerRefactoring (the first perf data, tested on r126237). Because LayerRefactoring introduced another problem that makes the camera preview jitter, this one became hard to notice. So if you want to investigate this problem after LayerRefactoring, you can apply my patch.
Product phones all enable HwComposer; only Mozilla's ROMs do not use it. I heard from mwu that he is going to enable HwComposer for new devices, but not on unagi.
I will try to find an Inari and enable HWComposer to test it later. However, I think this may still be a bug, since camera preview frames can be rendered to canvas. If we want to render the camera preview to canvas via a similar implementation to take advantage of hardware resources, we may still have to solve it.
I cannot enable HWComposer just by applying the patch in bug 862952. It seems more patches need to be applied before I can enable it :S
(In reply to Chiajung Hung [:chiajung] from comment #14)
> However, I think this may still a bug. Since Camera preview frame can be
> render to canvas. If we want to render camera preview to canvas via similar
> implementation to take advantage of hardware resource, we may still have to
> solve it.
It is a different problem. When Gecko uses the GPU for rendering, it uses more CPU time than HwComposer does, and HwComposer mitigates this. Canvas rendering does not use the GPU right now; rendering to canvas uses only the CPU and needs even more CPU time. Even if Gecko used the GPU for rendering to canvas, CPU usage would still be greater than with HwComposer.
Bug 845200, Bug 827229 are related to GPU-rendered canvas.
FYI both GPU composition and HWC composition access the same surfaces backing the layers, including for canvas layer. The layers themselves render their content to said surface in the exact same way for both. So the issue here is most likely that the surface binding during GPU composition has a bug that causes it to mem copy. I actually agree with both Sotaro and Chiajung. Yes, HWC will be used in commercial devices for camera preview. However, even in those commercial devices there are many use cases where we fall back to GPU rendering. So we still want to knock out this bug!
(In reply to Chiajung Hung [:chiajung] from comment #12)
> This problem can be seen before LayerRefactoring (the first perf data,
> tested on r126237).
chiajung, I do not understand the above comment. Are you saying the preview's performance problem is also present on b2g18?
(In reply to Diego Wilson [:diego] from comment #18)
> here is most likely that the surface binding during GPU composition has a
> bug that causes it to mem copy.
diego, is it a bug in qcom's code? The compositor does not call mem copy.
Flags: needinfo?(dwilson)
Assignee: jmuizelaar → sotaro.ikeda.g
(In reply to Sotaro Ikeda [:sotaro] from comment #20)
> (In reply to Diego Wilson [:diego] from comment #18)
> > here is most likely that the surface binding during GPU composition has a
> > bug that causes it to mem copy.
>
> diego, is it a bug of qcom's code? compositor does not call mem copy.
Most likely it's a problem in the way the camera frame surface is provided to GLES in the GPU composition. I think all other layers in B2G (eg ShadowThebesLayer) are backed by a surface and if that binding caused a mem copy we would see terrible performance in the homescreen too. I just tried it out. This camera issue is reproducible in b2g18 when I disable HWC composition.
Flags: needinfo?(dwilson)
Inder, Do you remember what fixed the camera preview performance? Maybe something in the HWC composition was patched that wasn't patched in the GPU composition.
Flags: needinfo?(ikumar)
Bug 832100 tracks enabling HwComposer in mozbuild.
Almost, bug 828876 is the full HWC enabling bug
Heh, that was sotaro's patch in bug 844248 :) That was not an HWC specific patch. Oh well... My guess is the fix will be somewhere in ShadowImageLayer, which is the one in charge of binding the camera frame surface.
Flags: needinfo?(ikumar)
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
> (In reply to Chiajung Hung [:chiajung] from comment #12)
> > This problem can be seen before LayerRefactoring (the first perf data,
> > tested on r126237).
>
> chiajung, I do not understand the above comment. Are you saying preview's
> performance problem is present also on b2g18?
I tested it on m-c only, but I think the code paths for camera preview rendering are similar on b2g18 and on m-c before LayerRefactoring. And as comment 21 said, this is reproducible in b2g18. One more detail: video playback for 3gp/mp4 does not have a similar problem, so I thought the YUV format might be related. I changed the value at http://mxr.mozilla.org/mozilla-central/source/dom/camera/GonkCameraControl.cpp#58 to 0, but the result is the same.
If it is a color format problem, it is a problem in qcom's platform. Camera preview and video playback use the same code for rendering.
I agree this should be a qcom platform issue :)
> Camera preview and video playback uses same code for rendering
Exactly! Does video playback also have the same problem?
I cannot observe the same high CPU usage when playing video, but I have only tested a small set of videos. If you need the top/perf data for MP4/3GP video playback, I can provide it later.
I checked some video clips and also cannot observe the problem. Though it could depend on video size and rendering scaling.
Hmm... that does sound suspect. I agree that both video playback and camera should follow the same rendering path. I will have to compare the video and camera frame surfaces. Could be that GL is deciding to convert the camera frame to another format in a slow and painful way.
On the buri device with HW composer disabled, CPU usage of camera preview is not so high. The Compositor thread uses 10% of CPU.
I just checked with chiajung: we did not see the high CPU usage on the Leo device, but we did on inari and unagi. We also found that the EGL libraries differ. Could that be the root cause?
//Leo
-rw-r--r-- root root 26 2013-03-06 08:00 egl.cfg
-rw-r--r-- root root 30456 2013-03-06 08:00 eglsubAndroid.so
-rw-r--r-- root root 134156 2013-03-06 08:00 libEGL_adreno200.so
-rw-r--r-- root root 81520 2013-03-06 08:22 libGLES_android.so
-rw-r--r-- root root 200980 2013-03-06 08:00 libGLESv1_CM_adreno200.so
-rw-r--r-- root root 720128 2013-03-06 08:00 libGLESv2_adreno200.so
-rw-r--r-- root root 379140 2013-03-06 08:00 libq3dtools_adreno200.so
//inari
-rw-r--r-- root root 26 2013-04-18 14:20 egl.cfg
-rw-r--r-- root root 22160 2013-04-18 14:20 eglsubAndroid.so
-rw-r--r-- root root 130008 2013-04-18 14:20 libEGL_adreno200.so
-rw-r--r-- root root 81520 2013-04-18 14:21 libGLES_android.so
-rw-r--r-- root root 196852 2013-04-18 14:20 libGLESv1_CM_adreno200.so
-rw-r--r-- root root 575252 2013-04-18 14:20 libGLESv2_adreno200.so
-rw-r--r-- root root 211040 2013-04-18 14:20 libq3dtools_adreno200.so
buri and leo devices use ics_strawberry. unagi and inari use ics_chocolate. That might affect this problem.
(In reply to pchang from comment #34)
> Also we found the egl libraries were different. Was it the root cause?
As in comment #35, leo and inari use different code bases, so the EGL libraries could also differ. I also suspect this could be the root cause.
If the bug is an ics_chocolate-specific issue, it could be a tef bug, I think.
After the fix for bug 862324 landed I see the FPS for GPU rendered camera frames goes up to 30 FPS. Maybe this is fixed now?
Status: NEW → ASSIGNED
Whiteboard: c=performance
(In reply to Diego Wilson [:diego] from comment #38)
> After the fix for bug 862324 landed I see the FPS for GPU rendered camera
> frames goes up to 30 FPS. Maybe this is fixed now?
Chiajung, can you answer the question?
Flags: needinfo?(chung)
We tried with the 5/3 codebase on Unagi, and the problem is still present. Diego, which device did you test?
Flags: needinfo?(chung) → needinfo?(dwilson)
I tested on the leo device. It runs at around 30 fps; the CPU usage is pasted at the bottom. What is your CPU usage target?
User 49%, System 17%, IOW 23%, IRQ 0%
User 130 + Nice 24 + Sys 55 + Idle 30 + IOW 72 + IRQ 0 + SIRQ 0 = 311
PID PR CPU% S #THR VSS RSS PCY UID Name
153 0 43% S 50 212864K 56744K fg root /system/b2g/b2g
172 0 6% S 6 5908K 468K fg root /system/bin/sensord
170 0 6% S 4 8060K 1352K fg system /system/bin/mm-qcamera-daemon
496 0 5% S 19 81620K 23760K fg app_496 /system/b2g/plugin-container
Flags: needinfo?(dwilson)
Well, as comment 34 says, Leo devices have no such problem.
We should not bother trying to debug Unagi, the software on that device is a random collection of semi-related bits that nobody other than Mozilla really supports. The more interesting questions are (1) Does this manifest on the *vendor* Inari build, and if so (2) Does the vendor consider this to be blocking.
(In reply to Michael Vines [:m1] [:evilmachines] from comment #43)
> (1) Does this manifest on the *vendor* Inari build, and if so
Chiajung, can you confirm that? I do not have a *vendor*-built Inari ROM. Also, I think the Inari device enables HW composer, so there seems to be no use case where it always renders video frames with OpenGL.
Flags: needinfo?(chung)
(In reply to Sotaro Ikeda [:sotaro] from comment #44)
> And I think Inari device enables HW composer. It seems that there are no use
> case that it always renders video frame by using OpenGL.
That is about the v1.0.1 *vendor*-built Inari ROM.
(In reply to Sotaro Ikeda [:sotaro] from comment #45)
> (In reply to Sotaro Ikeda [:sotaro] from comment #44)
> > And I think Inari device enables HW composer. It seems that there are no use
> > case that it always renders video frame by using OpenGL.
>
> It is about v1.0.1 *vendor* Inari built ROM.
FYI we are very close to enabling HW composer by default on Mozilla builds as well. See bug 828876
Whiteboard: c=performance → c=
top result of the vendor-built ROM on Inari during camera preview. It seems that HW composer is used.
User 28%, System 21%, IOW 0%, IRQ 0%
User 43 + Nice 16 + Sys 45 + Idle 101 + IOW 2 + IRQ 0 + SIRQ 0 = 207
PID PR CPU% S #THR VSS RSS PCY UID Name
114 0 20% S 39 200752K 54028K fg root /system/b2g/b2g
488 0 6% S 18 77996K 23648K fg app_488 /system/b2g/plugin-container
513 0 2% R 1 1064K 420K fg shell top
17 0 0% S 1 0K 0K fg root kworker/0:1
258 0 0% S 1 780K 360K fg root logcat
132 0 0% S 1 868K 424K fg root /system/bin/getlogtofile
Unassigning myself. There is nothing more for me to do.
Assignee: sotaro.ikeda.g → nobody
Hardware composer is now on by default, and that fixes the problem, so closing this for now.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(chung)
Resolution: --- → INVALID
Keywords: perf
Whiteboard: c= → c= s=2013.05.31 ,
No longer blocks: 860441