Closed Bug 1389805 Opened 7 years ago Closed 6 years ago

no symbols for Android xpcshell crashes

Categories

(Firefox Build System :: Android Studio and Gradle Integration, defect)

defect
Not set
major

Tracking

(firefox59 wontfix, firefox60 fixed, firefox61 fixed)

RESOLVED FIXED
mozilla61

People

(Reporter: aryx, Assigned: gbrown)

Details

(Keywords: regression, Whiteboard: [stockwell fixed:other])

Attachments

(2 files)

https://treeherder.mozilla.org/logviewer.html#?job_id=122759797&repo=autoland

seems to have no symbols:

[task 2017-08-12T15:08:48.330199Z] 15:08:48     INFO -  TEST-START | netwerk/test/unit/test_bug650995.js
[task 2017-08-12T15:08:56.386652Z] 15:08:56  WARNING -  TEST-UNEXPECTED-FAIL | netwerk/test/unit/test_bug650995.js | xpcshell return code: 139
[task 2017-08-12T15:08:56.387422Z] 15:08:56     INFO -  TEST-INFO took 8057ms
[task 2017-08-12T15:08:56.388045Z] 15:08:56     INFO -  >>>>>>>
[task 2017-08-12T15:08:56.388132Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | xpcw: cd /storage/sdcard/tests/xpc/netwerk/test/unit
[task 2017-08-12T15:08:56.389071Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | xpcw: xpcshell -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/target.apk -m -s -e const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js"; -e const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json"; -e const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m"; -f /storage/sdcard/tests/xpc/head.js -e const _SERVER_ADDR = "localhost" -e const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"]; -e const _JSDEBUGGER_PORT = 0; -e const _TEST_FILE = ["test_bug650995.js"]; -e const _TEST_NAME = "netwerk/test/unit/test_bug650995.js" -e _execute_test(); quit(0);
[task 2017-08-12T15:08:56.392062Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | [1115] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /home/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 171
[task 2017-08-12T15:08:56.392139Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | Warning: MOZILLA_FIVE_HOME not set.
[task 2017-08-12T15:08:56.392251Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | [1115] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /home/worker/workspace/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2866
[task 2017-08-12T15:08:56.392302Z] 15:08:56     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2017-08-12T15:08:56.392344Z] 15:08:56     INFO -  (xpcshell/head.js) | test pending (2)
[task 2017-08-12T15:08:56.392388Z] 15:08:56     INFO -  (xpcshell/head.js) | test pending (3)
[task 2017-08-12T15:08:56.392441Z] 15:08:56     INFO -  (xpcshell/head.js) | test MAIN run_test finished (3)
[task 2017-08-12T15:08:56.392488Z] 15:08:56     INFO -  running event loop
[task 2017-08-12T15:08:56.392576Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | JavaScript strict warning: test_bug650995.js, line 81: ReferenceError: assignment to undeclared variable cap
[task 2017-08-12T15:08:56.392632Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | Segmentation fault
[task 2017-08-12T15:08:56.392679Z] 15:08:56     INFO -  netwerk/test/unit/test_bug650995.js | 13
[task 2017-08-12T15:08:56.392712Z] 15:08:56     INFO -  <<<<<<<
[task 2017-08-12T15:08:56.876037Z] 15:08:56     INFO -  mozcrash Copy/paste: /home/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmp4RR9W4/4869656f-542d-c673-7046-0c67e1672464.dmp /home/worker/workspace/build/symbols
[task 2017-08-12T15:08:56.987426Z] 15:08:56     INFO -  mozcrash Saved minidump as /home/worker/workspace/build/blobber_upload_dir/4869656f-542d-c673-7046-0c67e1672464.dmp
[task 2017-08-12T15:08:56.993190Z] 15:08:56     INFO -  mozcrash Saved app info as /home/worker/workspace/build/blobber_upload_dir/4869656f-542d-c673-7046-0c67e1672464.extra
[task 2017-08-12T15:08:56.993281Z] 15:08:56  WARNING -  PROCESS-CRASH | netwerk/test/unit/test_bug650995.js | application crashed [@ libxul.so + 0x5903c]
[task 2017-08-12T15:08:56.993347Z] 15:08:56     INFO -  Crash dump filename: /tmp/tmp4RR9W4/4869656f-542d-c673-7046-0c67e1672464.dmp
[task 2017-08-12T15:08:56.993393Z] 15:08:56     INFO -  Operating system: Android
[task 2017-08-12T15:08:56.993454Z] 15:08:56     INFO -                    0.0.0 Linux 2.6.29-gea477bb #1 Wed Sep 26 11:04:45 PDT 2012 armv7l
[task 2017-08-12T15:08:56.993493Z] 15:08:56     INFO -  CPU: arm
[task 2017-08-12T15:08:56.993553Z] 15:08:56     INFO -       ARMv7 ARM Cortex-A8 features: swp,half,thumb,fastmult,vfpv2,edsp,neon,vfpv3
[task 2017-08-12T15:08:56.993592Z] 15:08:56     INFO -       1 CPU
[task 2017-08-12T15:08:56.993628Z] 15:08:56     INFO -  GPU: UNKNOWN
[task 2017-08-12T15:08:56.993670Z] 15:08:56     INFO -  Crash reason:  SIGSEGV
[task 2017-08-12T15:08:56.993708Z] 15:08:56     INFO -  Crash address: 0x0
[task 2017-08-12T15:08:56.993750Z] 15:08:56     INFO -  Process uptime: not available
[task 2017-08-12T15:08:56.993789Z] 15:08:56     INFO -  Thread 14 (crashed)
[task 2017-08-12T15:08:56.993831Z] 15:08:56     INFO -   0  libxul.so + 0x5903c
[task 2017-08-12T15:08:56.993897Z] 15:08:56     INFO -       r0 = 0x00000086    r1 = 0xe49c05a6    r2 = 0xe49c05a6    r3 = 0x00000000
[task 2017-08-12T15:08:56.993960Z] 15:08:56     INFO -       r4 = 0x45c0b530    r5 = 0x45c0b514    r6 = 0x529ff5b4    r7 = 0x45c0b500
[task 2017-08-12T15:08:56.994022Z] 15:08:56     INFO -       r8 = 0x529ff5b4    r9 = 0x434d7b54   r10 = 0x4305dff6   r12 = 0x00000003
[task 2017-08-12T15:08:56.994081Z] 15:08:56     INFO -       fp = 0x529ff5fc    sp = 0x529ff588    lr = 0x40715b43    pc = 0x4071703c
[task 2017-08-12T15:08:56.995471Z] 15:08:56     INFO -      Found by: given as instruction pointer in context
[task 2017-08-12T15:08:56.995522Z] 15:08:56     INFO -   1  libxul.so + 0x5add7
[task 2017-08-12T15:08:56.995580Z] 15:08:56     INFO -       sp = 0x529ff590    pc = 0x40718dd9
[task 2017-08-12T15:08:56.995633Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.995683Z] 15:08:56     INFO -   2  libxul.so + 0x70f53
[task 2017-08-12T15:08:56.995737Z] 15:08:56     INFO -       sp = 0x529ff5a0    pc = 0x4072ef55
[task 2017-08-12T15:08:56.995779Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.995827Z] 15:08:56     INFO -   3  libxul.so + 0x2997228
[task 2017-08-12T15:08:56.995876Z] 15:08:56     INFO -       sp = 0x529ff5a4    pc = 0x4305522a
[task 2017-08-12T15:08:56.995927Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.995977Z] 15:08:56     INFO -   4  libxul.so + 0x299fff4
[task 2017-08-12T15:08:56.996032Z] 15:08:56     INFO -       sp = 0x529ff5b8    pc = 0x4305dff6
[task 2017-08-12T15:08:56.996079Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.996119Z] 15:08:56     INFO -   5  libxul.so + 0x299fff4
[task 2017-08-12T15:08:56.996165Z] 15:08:56     INFO -       sp = 0x529ff5e4    pc = 0x4305dff6
[task 2017-08-12T15:08:56.996333Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.996562Z] 15:08:56     INFO -   6  libxul.so + 0x7109f
[task 2017-08-12T15:08:56.996801Z] 15:08:56     INFO -       sp = 0x529ff5f0    pc = 0x4072f0a1
[task 2017-08-12T15:08:56.997026Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.997280Z] 15:08:56     INFO -   7  libxul.so + 0x1be1f03
[task 2017-08-12T15:08:56.997512Z] 15:08:56     INFO -       sp = 0x529ff5f8    pc = 0x4229ff05
[task 2017-08-12T15:08:56.997740Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.997962Z] 15:08:56     INFO -   8  libxul.so + 0x2997228
[task 2017-08-12T15:08:56.998194Z] 15:08:56     INFO -       sp = 0x529ff604    pc = 0x4305522a
[task 2017-08-12T15:08:56.998443Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.998671Z] 15:08:56     INFO -   9  libxul.so + 0x1be2193
[task 2017-08-12T15:08:56.998902Z] 15:08:56     INFO -       sp = 0x529ff628    pc = 0x422a0195
[task 2017-08-12T15:08:56.999131Z] 15:08:56     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:56.999350Z] 15:08:56     INFO -  Thread 0
[task 2017-08-12T15:08:56.999578Z] 15:08:56     INFO -   0  libc.so + 0x1c5a8
[task 2017-08-12T15:08:56.999838Z] 15:08:56     INFO -       r0 = 0xfffffe00    r1 = 0x00000080    r2 = 0x00000002    r3 = 0x00000000
[task 2017-08-12T15:08:57.000080Z] 15:08:56     INFO -       r4 = 0x4462175c    r5 = 0x00000002    r6 = 0x00000000    r7 = 0x000000f0
[task 2017-08-12T15:08:57.000318Z] 15:08:57     INFO -       r8 = 0xc444bf54    r9 = 0x5202dfb0   r10 = 0x43128307   r12 = 0x00000000
[task 2017-08-12T15:08:57.000562Z] 15:08:57     INFO -       fp = 0xbeb473c8    sp = 0x45c61c40    lr = 0x4026ae8c    pc = 0x402795a8
[task 2017-08-12T15:08:57.000788Z] 15:08:57     INFO -      Found by: given as instruction pointer in context
[task 2017-08-12T15:08:57.001006Z] 15:08:57     INFO -   1  libxul.so + 0x1cbbdcb
[task 2017-08-12T15:08:57.001251Z] 15:08:57     INFO -       sp = 0x45c61c60    pc = 0x42379dcd
[task 2017-08-12T15:08:57.001476Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.001688Z] 15:08:57     INFO -  Thread 1
[task 2017-08-12T15:08:57.001912Z] 15:08:57     INFO -   0  libc.so + 0x1c3dc
[task 2017-08-12T15:08:57.002161Z] 15:08:57     INFO -       r0 = 0xfffffffc    r1 = 0x45c0d800    r2 = 0x00000020    r3 = 0xffffffff
[task 2017-08-12T15:08:57.002399Z] 15:08:57     INFO -       r4 = 0xffffffff    r5 = 0x45c05390    r6 = 0x45c7b180    r7 = 0x000000fc
[task 2017-08-12T15:08:57.002665Z] 15:08:57     INFO -       r8 = 0x00000000    r9 = 0x45c05398   r10 = 0x0000000c   r12 = 0x45c0d800
[task 2017-08-12T15:08:57.002907Z] 15:08:57     INFO -       fp = 0x2a012bc8    sp = 0x45dffd20    lr = 0x409cc76b    pc = 0x402793dc
[task 2017-08-12T15:08:57.003136Z] 15:08:57     INFO -      Found by: given as instruction pointer in context
[task 2017-08-12T15:08:57.003357Z] 15:08:57     INFO -   1  libxul.so + 0x30ff11
[task 2017-08-12T15:08:57.003589Z] 15:08:57     INFO -       sp = 0x45dffd48    pc = 0x409cdf13
[task 2017-08-12T15:08:57.003813Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.004038Z] 15:08:57     INFO -   2  libxul.so + 0x2a1dd76
[task 2017-08-12T15:08:57.004274Z] 15:08:57     INFO -       sp = 0x45dffd5c    pc = 0x430dbd78
[task 2017-08-12T15:08:57.004500Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.004727Z] 15:08:57     INFO -   3  libxul.so + 0x2a1ddea
[task 2017-08-12T15:08:57.004960Z] 15:08:57     INFO -       sp = 0x45dffd60    pc = 0x430dbdec
[task 2017-08-12T15:08:57.005198Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.005430Z] 15:08:57     INFO -   4  libxul.so + 0x306761
[task 2017-08-12T15:08:57.005666Z] 15:08:57     INFO -       sp = 0x45dffd68    pc = 0x409c4763
[task 2017-08-12T15:08:57.005891Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.006111Z] 15:08:57     INFO -   5  libc.so + 0xf46
[task 2017-08-12T15:08:57.006341Z] 15:08:57     INFO -       sp = 0x45dffd70    pc = 0x4025df48
[task 2017-08-12T15:08:57.006564Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.006790Z] 15:08:57     INFO -   6  libxul.so + 0x301d25
[task 2017-08-12T15:08:57.007026Z] 15:08:57     INFO -       sp = 0x45dffd98    pc = 0x409bfd27
[task 2017-08-12T15:08:57.007250Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.007477Z] 15:08:57     INFO -   7  libmozglue.so + 0xcc47
[task 2017-08-12T15:08:57.007709Z] 15:08:57     INFO -       sp = 0x45dffdb8    pc = 0x401b3c49
[task 2017-08-12T15:08:57.007934Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.008155Z] 15:08:57     INFO -   8  libxul.so + 0x30309d
[task 2017-08-12T15:08:57.008390Z] 15:08:57     INFO -       sp = 0x45dffdd0    pc = 0x409c109f
[task 2017-08-12T15:08:57.008617Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.008844Z] 15:08:57     INFO -   9  libxul.so + 0x3030bb
[task 2017-08-12T15:08:57.009080Z] 15:08:57     INFO -       sp = 0x45dffde8    pc = 0x409c10bd
[task 2017-08-12T15:08:57.009388Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.009701Z] 15:08:57     INFO -  10  libxul.so + 0x3137fb
[task 2017-08-12T15:08:57.010001Z] 15:08:57     INFO -       sp = 0x45dffdfc    pc = 0x409d17fd
[task 2017-08-12T15:08:57.010358Z] 15:08:57     INFO -      Found by: stack scanning
[task 2017-08-12T15:08:57.010513Z] 15:08:57     INFO -  11  libxul.so + 0x309345
[task 2017-08-12T15:08:57.010765Z] 15:08:57     INFO -       sp = 0x45dffe08    pc = 0x409c7347
[task 2017-08-12T15:08:57.011081Z] 15:08:57     INFO -      Found by: stack scanning
James, with nalexander on leave, do you know who can take a look at this Android issue?

More crashes without symbols:
https://treeherder.mozilla.org/logviewer.html#?job_id=122873609&repo=autoland
https://treeherder.mozilla.org/logviewer.html#?job_id=122873696&repo=autoland
Flags: needinfo?(snorp)
Summary: no crash symbols on Android? → no crash symbols on Android
I really have no idea how this works in automation. It would need to symbolicate the crash stack somehow, and I'm not sure how that's supposed to happen. Geoff, do you know? Ted?
Flags: needinfo?(ted)
Flags: needinfo?(snorp)
Flags: needinfo?(gbrown)
This is a debug build, so we download the symbols.zip at the start of the run (so we can use them to symbolicate assertion stacks). In the log you can see:
[task 2017-08-12T15:05:19.039308Z] 15:05:19     INFO - Downloading and extracting to /home/worker/workspace/build/symbols these dirs * from https://queue.taskcluster.net/v1/task/EwiG5BF3SlqZSsdc8yuGqg/artifacts/public/build/target.crashreporter-symbols.zip

Then when it runs minidump_stackwalk it does point it at that directory:
[task 2017-08-12T15:08:56.876037Z] 15:08:56     INFO -  mozcrash Copy/paste: /home/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmp4RR9W4/4869656f-542d-c673-7046-0c67e1672464.dmp /home/worker/workspace/build/symbols

I'll download the minidump and symbols and try running it locally.
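
For anyone repeating that local run, the mozcrash "Copy/paste" line above amounts to calling minidump_stackwalk with the dump path followed by the symbols directory. A minimal Python sketch, assuming the binary, dump and symbols are already downloaded; the helper names are hypothetical and not part of mozcrash:

```python
import subprocess


def stackwalk_command(stackwalk_binary, minidump_path, symbols_dir):
    """Build the argv list in the same order as the mozcrash Copy/paste line."""
    return [stackwalk_binary, minidump_path, symbols_dir]


def run_stackwalk(stackwalk_binary, minidump_path, symbols_dir):
    """Run the stackwalker and return (stdout, stderr) as text.

    Assumption: the symbol-lookup log (the "No symbol file at ..." lines)
    is written to stderr, while the stack itself goes to stdout.
    """
    proc = subprocess.run(
        stackwalk_command(stackwalk_binary, minidump_path, symbols_dir),
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if the tool exits non-zero
    )
    return proc.stdout, proc.stderr
```

Keeping the command construction separate from the call makes it easy to print the exact command before running it.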
Flags: needinfo?(ted)
Ugh, from the minidump_stackwalk log:
2017-08-17 09:49:41: simple_symbol_supplier.cc:196: INFO: No symbol file at /tmp/libxul.so/000000000000000000000000000000000/libxul.so.sym

and sure enough (from minidump_dump):
module[10]
MDRawModule
  base_of_image                   = 0x402d3000
  size_of_image                   = 0x16d000
  checksum                        = 0x0
  time_date_stamp                 = 0x0 1970-01-01 00:00:00
  module_name_rva                 = 0x199b8
  version_info.signature          = 0x0
  version_info.struct_version     = 0x0
  version_info.file_version       = 0x0:0x0
  version_info.product_version    = 0x0:0x0
  version_info.file_flags_mask    = 0x0
  version_info.file_flags         = 0x0
  version_info.file_os            = 0x0
  version_info.file_type          = 0x0
  version_info.file_subtype       = 0x0
  version_info.file_date          = 0x0:0x0
  cv_record.data_size             = 24
  cv_record.rva                   = 0x199a0
  misc_record.data_size           = 0
  misc_record.rva                 = 0x0
  (code_file)                     = "/data/local/xpcb/libxul.so"
  (code_identifier)               = "45ab4745ff68dfe2b9d1089701cf96d8f0756d7a"
  (cv_record).cv_signature        = 0x4270454c
  (cv_record).build_id            = 45ab4745ff68dfe2b9d1089701cf96d8f0756d7a
  (misc_record)                   = (null)
  (debug_file)                    = "/data/local/xpcb/libxul.so"
  (debug_identifier)              = "4547AB4568FFE2DFB9D1089701CF96D80"
  (version)                       = ""

module[11]
MDRawModule
  base_of_image                   = 0x406be000
  size_of_image                   = 0x3241000
  checksum                        = 0x0
  time_date_stamp                 = 0x0 1970-01-01 00:00:00
  module_name_rva                 = 0x19a10
  version_info.signature          = 0x0
  version_info.struct_version     = 0x0
  version_info.file_version       = 0x0:0x0
  version_info.product_version    = 0x0:0x0
  version_info.file_flags_mask    = 0x0
  version_info.file_flags         = 0x0
  version_info.file_os            = 0x0
  version_info.file_type          = 0x0
  version_info.file_subtype       = 0x0
  version_info.file_date          = 0x0:0x0
  cv_record.data_size             = 20
  cv_record.rva                   = 0x199f8
  misc_record.data_size           = 0
  misc_record.rva                 = 0x0
  (code_file)                     = "/data/local/xpcb/libxul.so"
  (code_identifier)               = "00000000000000000000000000000000"
  (cv_record).cv_signature        = 0x4270454c
  (cv_record).build_id            = 00000000000000000000000000000000
  (misc_record)                   = (null)
  (debug_file)                    = "/data/local/xpcb/libxul.so"
  (debug_identifier)              = "000000000000000000000000000000000"
  (version)                       = ""

libxul.so got two module entries, and the second one doesn't have a valid debug_identifier so we can't find symbols for it. I feel like we've had this issue before with shared libraries winding up with multiple entries in the modules list and breaking things.
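
To make the lookup failure concrete, here is a small Python sketch of how the build ID turns into the debug identifier and then into the symbol file path. It reproduces the values shown for module[10] and the path from the minidump_stackwalk log; the function names are hypothetical, and the byte-swapping rule is inferred from those values rather than taken from the Breakpad source:

```python
import posixpath


def build_id_to_debug_id(build_id_hex):
    """Convert an ELF build ID (hex string) to a Breakpad-style debug identifier."""
    # Only the first 16 bytes are used; shorter IDs are zero-padded.
    raw = bytes.fromhex(build_id_hex)[:16].ljust(16, b"\0")
    # Treat the bytes as a GUID: one 4-byte and two 2-byte fields are
    # byte-swapped, the remaining 8 bytes keep their order.
    guid = raw[3::-1] + raw[5:3:-1] + raw[7:5:-1] + raw[8:]
    # The trailing "0" fills the age slot of the identifier.
    return guid.hex().upper() + "0"


def symbol_path(symbols_dir, debug_file, debug_id):
    """Path probed for the symbol file: <dir>/<name>/<debug id>/<name>.sym."""
    name = posixpath.basename(debug_file)
    return posixpath.join(symbols_dir, name, debug_id, name + ".sym")


good = build_id_to_debug_id("45ab4745ff68dfe2b9d1089701cf96d8f0756d7a")
bad = build_id_to_debug_id("00000000000000000000000000000000")
print(good)  # 4547AB4568FFE2DFB9D1089701CF96D80
print(symbol_path("/tmp", "/data/local/xpcb/libxul.so", bad))
```

With an all-zero build ID the probed directory is all zeroes too, so no directory in symbols.zip can match it.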

The /proc/self/maps entries (also in the output of minidump_dump) show:
402d3000-40440000 r-xp 00000000 1f:01 666        /data/local/xpcb/libxul.so
40440000-406be000 ---p 40440000 00:00 0 
406be000-438ff000 r-xp 0016c000 1f:01 666        /data/local/xpcb/libxul.so
438ff000-43900000 ---p 438ff000 00:00 0 
43900000-43b31000 r--p 033ad000 1f:01 666        /data/local/xpcb/libxul.so
43b31000-445d3000 rw-p 035de000 1f:01 666        /data/local/xpcb/libxul.so

I assume whatever that memory region is in the middle of the two executable regions is what's breaking Breakpad here. snorp: any idea what that is?
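
The relationship between the maps entries, the two module records and the crashing frame can be checked with a few lines of Python. This is only a sketch over the data quoted in this bug (the maps lines above and the crashing pc 0x4071703c from the original log); the helper names are hypothetical:

```python
MAPS = """\
402d3000-40440000 r-xp 00000000 1f:01 666        /data/local/xpcb/libxul.so
40440000-406be000 ---p 40440000 00:00 0
406be000-438ff000 r-xp 0016c000 1f:01 666        /data/local/xpcb/libxul.so
438ff000-43900000 ---p 438ff000 00:00 0
43900000-43b31000 r--p 033ad000 1f:01 666        /data/local/xpcb/libxul.so
43b31000-445d3000 rw-p 035de000 1f:01 666        /data/local/xpcb/libxul.so
"""


def executable_mappings(maps_text, path):
    """Return (start, end) for every executable mapping of `path`."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Anonymous mappings have no path field; skip them and other files.
        if len(fields) < 6 or fields[5] != path or "x" not in fields[1]:
            continue
        start, end = (int(value, 16) for value in fields[0].split("-"))
        regions.append((start, end))
    return regions


def locate(pc, regions):
    """Return (region index, offset from region start) for pc, or None."""
    for index, (start, end) in enumerate(regions):
        if start <= pc < end:
            return index, pc - start
    return None


regions = executable_mappings(MAPS, "/data/local/xpcb/libxul.so")
index, offset = locate(0x4071703C, regions)
print(index, hex(offset))  # 1 0x5903c
```

The two executable regions have sizes 0x16d000 and 0x3241000, the same as size_of_image for module[10] and module[11], and the crashing pc lands in the second one at offset 0x5903c. That is the "libxul.so + 0x5903c" frame, so the offset is relative to the second mapping, the one whose module record has the all-zero identifier.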
Flags: needinfo?(gbrown) → needinfo?(snorp)
Attached file stack (deleted) —
I cheated and copied the libxul symbol file to a path with the all-zeroes debug identifier in it so that minidump_stackwalk could find it, and got this as output. It's not incredibly satisfying, the top frame doesn't have a function name, and the stack looks pretty wonky.
Hmm. Dunno? Eugen, glandium, any ideas?
Flags: needinfo?(snorp)
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(esawin)
elfhack! But the crash reporter is supposed to fill the module list with the full range for our libs, from what is filled in with CrashReporter::AddLibraryMapping (called from nsAndroidStartup.cpp), where the info itself comes from report_mapping in APKOpen.cpp (called from our linker when libs are loaded).
Flags: needinfo?(mh+mozilla)
bug 635961 effectively reenabled elfhack on android, and some more recent changes might have split the library differently from how it used to be split, but it shouldn't have made any difference to what comes from CrashReporter::AddLibraryMapping.
Blocks: 635961
In bug 1392985 there are lots of additional examples/logs.

It looks like all such examples are Android xpcshell. Meanwhile, browser tests (robocop, mochitest, etc) continue to provide symbolicated crash reports, as in bug 1394428.
Summary: no crash symbols on Android → no symbols for Android xpcshell crashes
Blocks: 1398312
Component: General Automation → Build Config & IDE Support
Product: Release Engineering → Firefox for Android
QA Contact: catlee
Flags: needinfo?(esawin)
I haven't noticed any new failures lately, but as far as I know, this is still a problem.

:jchen -- snorp suggested you might be able to look at this?
Flags: needinfo?(nchen)
Looks like an elfhack issue? I don't think xpcshell goes through nsAndroidStartup.cpp/APKOpen.cpp, if that's the only thing telling Breakpad about our elfhack mappings.
Blocks: 1378986
Flags: needinfo?(nchen) → needinfo?(mh+mozilla)
ah indeed, xpcshell doesn't use the custom linker, which is what ends up telling breakpad... I guess it's about time to fix bug 689178...
Flags: needinfo?(mh+mozilla)
(In reply to Geoff Brown [:gbrown] from comment #11)
> I haven't noticed any new failures lately, but as far as I know, this is
> still a problem.

If the silly OF bot would comment in non-intermittent-failure bugs, you would, since I started starring them as this bug instead of starring them as "expected fail" a while back. https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1389805 - frequency isn't all that high, since I can't get to every one of them before someone files a useless bug or stars it in some blowing-it-off way, but there are still a sufficient number. And rather oddly, it seems to also include Android mochitest-chrome, though I haven't been starring those.
We have lots of examples now - great!

I see some mis-stars for x86 mochitest-chrome. A lot of those look like they have no symbols at first glance because the stacks have lots of libc frames -- but it's normal and expected to not have symbols for frames in Android system libs.

There is an on-going, consistent problem with Android Debug xpcshell tests (why are there no Android Opt examples??) not having symbols for libxul and other Mozilla libs, and that's a real concern because we can't effectively follow up on those crashes.
Blocks: android-139
Between this bug and bug 1399773, it is apparent that we are seeing quite a few Android xpcshell crashes, and they are all being ignored because of the lack of symbols.
(In reply to Mike Hommey [:glandium] from comment #13)
> ah indeed, xpcshell doesn't use the custom linker, which is what ends up
> telling breakpad... I guess it's about time to fix bug 689178...

So, is fixing bug 689178 the next step here? Is that on your radar?
Flags: needinfo?(mh+mozilla)
Not on my radar. But the patches are there, they "just" need to be refreshed for the last few years of breakpad changes, and reviewed, which is what they've been stuck on for the past few years, so even if that /was/ on my radar, it would also need to be on ted's radar.
Flags: needinfo?(mh+mozilla)
:snorp, do you know anyone who could push this forward? Thank you.
Flags: needinfo?(snorp)
Whiteboard: [stockwell needswork]
Unfortunately I think there are several things that need to happen (see comment #23), and none of those seem likely in the near future. Our options for now are to either live with the broken crash data for xpcshell tests or stop using elfhack in Fennec. I guess we could also just disable xpcshell tests...
Flags: needinfo?(snorp)
One thing to note is that in the intervening 5 years we have forked the Breakpad client code, so we would not have to land these patches *upstream*, just in m-c.
Starting with 14 Nov I see a decrease in failure numbers.
According to Neglected Oranges there are 30 failures in the past week.
Looking in OrangeFactor there are only 7 failures in the last week, occurring on android-4-3-armv7-api16 debug.
Here is a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-beta&job_id=145651056&lineNumber=1753

[task 2017-11-17T17:26:58.764Z] 17:26:58     INFO -  TEST-START | netwerk/test/unit/test_cache_jar.js
[task 2017-11-17T17:27:07.022Z] 17:27:07  WARNING -  TEST-UNEXPECTED-FAIL | netwerk/test/unit/test_cache_jar.js | xpcshell return code: 139
[task 2017-11-17T17:27:07.025Z] 17:27:07     INFO -  TEST-INFO took 8258ms
[task 2017-11-17T17:27:07.025Z] 17:27:07     INFO -  >>>>>>>
[task 2017-11-17T17:27:07.026Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | xpcw: cd /storage/sdcard/tests/xpc/netwerk/test/unit
[task 2017-11-17T17:27:07.027Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | xpcw: xpcshell -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/target.apk -m -s -e const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js"; -e const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json"; -e const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m"; -f /storage/sdcard/tests/xpc/head.js -e const _SERVER_ADDR = "localhost" -e const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"]; -e const _JSDEBUGGER_PORT = 0; -e const _TEST_FILE = ["test_cache_jar.js"]; -e const _TEST_NAME = "netwerk/test/unit/test_cache_jar.js" -e _execute_test(); quit(0);
[task 2017-11-17T17:27:07.028Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | [2353, Unnamed thread 4620f080] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 171
[task 2017-11-17T17:27:07.028Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | [2353, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /builds/worker/workspace/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2762
[task 2017-11-17T17:27:07.028Z] 17:27:07     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2017-11-17T17:27:07.028Z] 17:27:07     INFO -  (xpcshell/head.js) | test pending (2)
[task 2017-11-17T17:27:07.029Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | [2353, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004002: file /builds/worker/workspace/build/src/toolkit/components/resistfingerprinting/nsRFPService.cpp, line 182
[task 2017-11-17T17:27:07.029Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | Segmentation fault
[task 2017-11-17T17:27:07.030Z] 17:27:07     INFO -  netwerk/test/unit/test_cache_jar.js | 13
Flags: needinfo?(ted)
I'm not sure if that needinfo had a specific question attached? The log in comment 31 still shows the same problem that this bug is filed about--the stack trace doesn't have symbols:
[task 2017-11-17T17:27:08.037Z] 17:27:08     INFO -  Thread 13 (crashed)
[task 2017-11-17T17:27:08.037Z] 17:27:08     INFO -   0  libxul.so + 0x81dae
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -       r0 = 0x00000000    r1 = 0x9345bfd8    r2 = 0x438a1416    r3 = 0x00000076
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -       r4 = 0x00000076    r5 = 0x4a2ffaa0    r6 = 0x46218394    r7 = 0x4a2ffa50
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -       r8 = 0x4a2ffad8    r9 = 0x42807474   r10 = 0x43c44afc   r12 = 0x00000003
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -       fp = 0x46218380    sp = 0x4a2ffa48    lr = 0x40740415    pc = 0x407a3dae
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -      Found by: given as instruction pointer in context
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -   1  libxul.so + 0x81639
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -       sp = 0x4a2ffa58    pc = 0x407a363b
[task 2017-11-17T17:27:08.038Z] 17:27:08     INFO -      Found by: stack scanning

The frequency of crashes is not actually relevant to this specific bug. The crashes themselves are presumably just individual bugs we're hitting.
Flags: needinfo?(ted)
I would vote for disabling xpcshell tests on android as :glandium, :ted, and :snorp are not interested in seeing this issue get fixed.  :snorp, do you have any concerns if we were to disable xpcshell tests on android emulators?
Flags: needinfo?(snorp)
Hmm, it looks like there are quite a few tests running, so I'm hesitant to turn them off without a little more effort.

I wonder if we could run xpcshell tests under an Android service instead of the standalone executable we have now. That way all the normal linker stuff would be in effect. A brief look suggests it might Just Work if we run main() (or similar) from xpcshell.cpp[0]. Of course we'd need changes to the harness too in order to activate the service.

It looks like elfhack only saved about 70k or so in APK size[1] and there was not really a detectable startup performance improvement[2]. My recommendation is to disable elfhack until we get a solution for this bug. Glandium, do you have any objection?

[0] https://dxr.mozilla.org/mozilla-central/source/js/xpconnect/shell/xpcshell.cpp#36
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=635961#c31
[2] https://mzl.la/2AtegMd
Flags: needinfo?(snorp) → needinfo?(mh+mozilla)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #34)
> I would vote for disabling xpcshell tests on android as :glandium, :ted, and
> :snorp are not interested in seeing this issue get fixed.  :snorp, do you
> have any concerns if we were to disable xpcshell tests on android emulators?

It's my opinion that we want _more_ xpcshell tests, not less, since they're much more like unit tests than our very heavy mochitests.  I'm aware that xpcshell tests are a tremendous pile of hacks and lots of things don't make sense on Desktop (let alone Android), but saying that xpcshell tests aren't available on Android is such a disincentive to writing them that I think we shouldn't do it.

I think it makes more sense to disable elfhack if we can't invest in fixing the real cruftiness of xpcshell.
elfhack saves space on flash and in memory, though.

How about de-elfhacking the libs before running xpcshell? I completely forgot about that, but that's actually something elfhack can do. It doesn't actually remove the elfhack code, but expands the binary in a way that makes it not have multiple split PT_LOADs. Use elfhack -r *.so.
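
As a rough illustration of what that could look like in a test harness, here is a Python sketch that runs `elfhack -r` on each library in a directory before the libs are pushed to the device. Only the `-r` flag comes from this bug; the helper names, the default binary name and the directory layout are assumptions:

```python
import glob
import os
import subprocess


def elfhack_revert_commands(lib_dir, elfhack="elfhack"):
    """Return one `elfhack -r <lib>` argv list per .so file in lib_dir."""
    libs = sorted(glob.glob(os.path.join(lib_dir, "*.so")))
    return [[elfhack, "-r", lib] for lib in libs]


def revert_all(lib_dir, elfhack="elfhack"):
    """Run elfhack -r on every library, leaving non-elfhacked ones untouched."""
    for argv in elfhack_revert_commands(lib_dir, elfhack):
        # check=False: keep going if elfhack exits non-zero for one file.
        subprocess.run(argv, check=False)
```

Running the tool once per file rather than once with a glob keeps each library's result separate in the log.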
Flags: needinfo?(mh+mozilla)
Blocks: 1422110
Blocks: 1422209
Blocks: 1419605
Blocks: 1422441
Blocks: 1423222
Blocks: 1423227
Glandium, I see a variety of errors when I try running `elfhack -r` on our libs. It either says it's not elfhacked or isn't arranged appropriately. Can you make sure this is working as expected?
Flags: needinfo?(mh+mozilla)
For the record:
$ wget http://ftp.mozilla.org/pub/mobile/nightly/latest-mozilla-central-android-api-16/fennec-59.0a1.multi.android-arm.apk
(...)
$ unzip fennec-59.0a1.multi.android-arm.apk assets/armeabi-v7a/libxul.so
Archive:  fennec-59.0a1.multi.android-arm.apk
 extracting: assets/armeabi-v7a/libxul.so  
$ mv assets/armeabi-v7a/libxul.so{,.xz}
$ xz -d assets/armeabi-v7a/libxul.so.xz 
$ elfhack -r assets/armeabi-v7a/libxul.so  
assets/armeabi-v7a/libxul.so: .elfhack.data.v0 section not following .elfhack.text.v0. Skipping

This would be a regression from bug 1385783, which would warrant its own bug.
Flags: needinfo?(mh+mozilla)
Blocks: 1423027
Blocks: 1423022
Blocks: 1423019
Blocks: 1422556
Blocks: 1412158
Attached patch apply elfhack -r to libs (deleted) — Splinter Review
I'm finally following up on comment 37: elfhack -r *.so

I have not verified that this fixes the crash report problems we have seen, but you can see elfhack in action, and tests continue to run:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ad05b541f55ff043b590daee57dc48b58e70dd6a

https://treeherder.mozilla.org/logviewer.html#?job_id=172760500&repo=try&lineNumber=1493

[task 2018-04-09T22:22:57.160Z] 22:22:57     INFO -  Pushing assets/armeabi-v7a/libfreebl3.so..
[task 2018-04-09T22:22:57.180Z] 22:22:57     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libfreebl3.so: Not elfhacked. Skipping
[task 2018-04-09T22:22:57.384Z] 22:22:57     INFO -  Pushing assets/armeabi-v7a/liblgpllibs.so..
[task 2018-04-09T22:22:57.393Z] 22:22:57     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/liblgpllibs.so: Not elfhacked. Skipping
[task 2018-04-09T22:22:57.496Z] 22:22:57     INFO -  Pushing assets/armeabi-v7a/libmozavcodec.so..
[task 2018-04-09T22:22:57.508Z] 22:22:57     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libmozavcodec.so: Grown by 4096 bytes
[task 2018-04-09T22:22:57.612Z] 22:22:57     INFO -  Pushing assets/armeabi-v7a/libmozavutil.so..
[task 2018-04-09T22:22:57.625Z] 22:22:57     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libmozavutil.so: Not elfhacked. Skipping
[task 2018-04-09T22:22:57.728Z] 22:22:57     INFO -  Pushing assets/armeabi-v7a/libnss3.so..
[task 2018-04-09T22:22:57.792Z] 22:22:57     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libnss3.so: Grown by 16384 bytes
[task 2018-04-09T22:22:58.297Z] 22:22:58     INFO -  Pushing assets/armeabi-v7a/libnssckbi.so..
[task 2018-04-09T22:22:58.318Z] 22:22:58     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libnssckbi.so: Not elfhacked. Skipping
[task 2018-04-09T22:22:58.521Z] 22:22:58     INFO -  Pushing assets/armeabi-v7a/libsoftokn3.so..
[task 2018-04-09T22:22:58.535Z] 22:22:58     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libsoftokn3.so: Not elfhacked. Skipping
[task 2018-04-09T22:22:58.638Z] 22:22:58     INFO -  Pushing assets/armeabi-v7a/libxul.so..
[task 2018-04-09T22:23:00.566Z] 22:23:00     INFO -  /tmp/tmps7T7Ph/assets/armeabi-v7a/libxul.so: Grown by 2392064 bytes
[task 2018-04-09T22:23:23.859Z] 22:23:23     INFO -  Pushing lib/armeabi-v7a/libmozglue.so..
[task 2018-04-09T22:23:24.474Z] 22:23:24     INFO -  Pushing lib/armeabi-v7a/libplugin-container.so..
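
The step visible in the log above can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: the function name `unhack_libs`, the `elfhack` parameter, and the `run` hook are hypothetical. Only the `elfhack -r <file>` invocation and the two message shapes ("Not elfhacked. Skipping" and "Grown by N bytes") come from this bug.

```python
import re
import subprocess
from pathlib import Path

# Matches the "Grown by N bytes" message seen in the log.
GROWN = re.compile(r"Grown by (\d+) bytes")


def _run_elfhack(elfhack, lib):
    """Run `elfhack -r <lib>` and return its combined output as text."""
    proc = subprocess.run(
        [elfhack, "-r", str(lib)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def unhack_libs(lib_dir, elfhack="elfhack", run=_run_elfhack):
    """Apply `elfhack -r` to every .so file under lib_dir.

    Returns {library name: bytes grown}, with 0 for libraries that were
    skipped or whose output did not contain a "Grown by" message.
    """
    results = {}
    for lib in sorted(Path(lib_dir).rglob("*.so")):
        output = run(elfhack, lib)
        match = GROWN.search(output)
        results[lib.name] = int(match.group(1)) if match else 0
    return results
```

The `run` hook exists only so the loop can be exercised without a real `elfhack` binary; in a harness the default would be used, with `elfhack` pointing at the copy shipped in host-utils.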
Assignee: nobody → gbrown
Attachment #8966409 - Flags: review?(jmaher)
Comment on attachment 8966409 [details] [diff] [review]
apply elfhack -r to libs

Review of attachment 8966409 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like a step in the right direction!
Attachment #8966409 - Flags: review?(jmaher) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f7f72ec4d044
Update linux64 host-utils with elfhack; r=me,a=test-only
https://hg.mozilla.org/integration/mozilla-inbound/rev/f36acf30a00d
Apply elfhack -r to Android xpcshell libs; r=jmaher
Let's leave this open until we verify some xpcshell crash reports.
Keywords: leave-open
Blocks: 1451930
It worked!

https://treeherder.mozilla.org/logviewer.html#?job_id=173202208&repo=autoland&lineNumber=2286-2312

[task 2018-04-12T04:36:17.134Z] 04:36:17     INFO -   3  libxul.so!mozilla::ProcessHangStackRunnable::Run [HangDetails.cpp:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 383 + 0x8]
[task 2018-04-12T04:36:17.134Z] 04:36:17     INFO -      eip = 0xb50d0d66   esp = 0xae4bf820   ebp = 0xae4bf838
[task 2018-04-12T04:36:17.134Z] 04:36:17     INFO -      Found by: stack scanning
[task 2018-04-12T04:36:17.134Z] 04:36:17     INFO -   4  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 1096 + 0x8]
[task 2018-04-12T04:36:17.135Z] 04:36:17     INFO -      eip = 0xb30321e0   esp = 0xae4bf840   ebp = 0xae4bfd98   ebx = 0xb7a9eff4
[task 2018-04-12T04:36:17.135Z] 04:36:17     INFO -      esi = 0x00000000   edi = 0xaf12ee80
[task 2018-04-12T04:36:17.135Z] 04:36:17     INFO -      Found by: call frame info
[task 2018-04-12T04:36:17.136Z] 04:36:17     INFO -   5  libxul.so!NS_ProcessNextEvent [nsThreadUtils.cpp:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 519 + 0x11]
[task 2018-04-12T04:36:17.136Z] 04:36:17     INFO -      eip = 0xb3039d69   esp = 0xae4bfda0   ebp = 0xae4bfdc8   ebx = 0xb7a9eff4
[task 2018-04-12T04:36:17.136Z] 04:36:17     INFO -      esi = 0xae4bfdbf   edi = 0xaf11cdf0
[task 2018-04-12T04:36:17.136Z] 04:36:17     INFO -      Found by: call frame info
[task 2018-04-12T04:36:17.137Z] 04:36:17     INFO -   6  libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run [MessagePump.cpp:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 364 + 0x13]
[task 2018-04-12T04:36:17.137Z] 04:36:17     INFO -      eip = 0xb3384f68   esp = 0xae4bfdd0   ebp = 0xae4bfdf8   ebx = 0xb7a9eff4
[task 2018-04-12T04:36:17.138Z] 04:36:17     INFO -      esi = 0xaf115970   edi = 0xaf11cdf0
[task 2018-04-12T04:36:17.139Z] 04:36:17     INFO -      Found by: call frame info
[task 2018-04-12T04:36:17.139Z] 04:36:17     INFO -   7  libxul.so!MessageLoop::Run [message_loop.cc:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 326 + 0x9]
[task 2018-04-12T04:36:17.139Z] 04:36:17     INFO -      eip = 0xb335dc1b   esp = 0xae4bfe00   ebp = 0xae4bfe28   ebx = 0xb7a9eff4
[task 2018-04-12T04:36:17.140Z] 04:36:17     INFO -      esi = 0xae4bfe10   edi = 0xaf12ee80
[task 2018-04-12T04:36:17.141Z] 04:36:17     INFO -      Found by: call frame info
[task 2018-04-12T04:36:17.141Z] 04:36:17     INFO -   8  libxul.so!nsThread::ThreadFunc [nsThread.cpp:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 425 + 0x8]
[task 2018-04-12T04:36:17.143Z] 04:36:17     INFO -      eip = 0xb3030a53   esp = 0xae4bfe30   ebp = 0xae4bfe68   ebx = 0xb7a9eff4
[task 2018-04-12T04:36:17.143Z] 04:36:17     INFO -      esi = 0xaf11cdf0   edi = 0xaf12ee80
[task 2018-04-12T04:36:17.144Z] 04:36:17     INFO -      Found by: call frame info
[task 2018-04-12T04:36:17.144Z] 04:36:17     INFO -   9  libnss3.so!_pt_root [ptthread.c:83c1d17f2d85b89bc3429a92d1aa88006b093292 : 201 + 0x6]
[task 2018-04-12T04:36:17.145Z] 04:36:17     INFO -      eip = 0xb7e3bd97   esp = 0xae4bfe70   ebp = 0xae4bfe98   ebx = 0xb7f22ff4
[task 2018-04-12T04:36:17.146Z] 04:36:17     INFO -      esi = 0xb7f270c4   edi = 0xaf15b400
[task 2018-04-12T04:36:17.147Z] 04:36:17     INFO -      Found by: call frame info
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [stockwell unknown] → [stockwell fixed:other]
Whiteboard: [stockwell fixed:other] → [stockwell fixed:other][checkin-needed-beta]
Target Milestone: --- → Firefox 61
https://hg.mozilla.org/releases/mozilla-beta/rev/5b90d07c0f0f
https://hg.mozilla.org/releases/mozilla-beta/rev/bb13bf12b0a3
Whiteboard: [stockwell fixed:other][checkin-needed-beta] → [stockwell fixed:other]
Blocks: 1499915
Product: Firefox for Android → Firefox Build System
Target Milestone: Firefox 61 → mozilla61