Closed Bug 1476248 Opened 6 years ago Closed 6 years ago

Crash in _chkstk | static void lowbd_inv_txfm2d_add_no_identity_ssse3

Categories

(Core :: Audio/Video: Playback, defect, P2)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED DUPLICATE of bug 1474684
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox61 --- unaffected
firefox62 --- unaffected
firefox63 --- fixed

People

(Reporter: calixte, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is report bp-6f639690-a85e-4506-95ab-b42730180717.
=============================================================
Top 2 frames of crashing thread:

0 xul.dll _chkstk
1 xul.dll static void lowbd_inv_txfm2d_add_no_identity_ssse3 third_party/aom/av1/common/x86/av1_inv_txfm_ssse3.c:2503

=============================================================

There is 1 crash in nightly 63 with buildid 20180716221418. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1445683.

[1] https://hg.mozilla.org/mozilla-central/rev?node=0f16daade35d
Flags: needinfo?(dminor)
Thanks! We filed Bug 1474684 to track stack overflows after the AV1 update. :dmajor, here's a Windows AV1 stack overflow. I'm not sure there is anything useful here, though; the call stack is suspiciously short.
Blocks: 1474684
No longer blocks: 1445683
Rank: 15
Flags: needinfo?(dminor) → needinfo?(dmajor)
Priority: -- → P2
Thanks! I don't mind the short call stack; I bet Socorro is just getting tripped up. Could someone with the right access please send me the minidump?
Flags: needinfo?(dmajor)
0b 00000004`31e446f0 00007ffe`a2523556 xul!__chkstk+0x38
0c 00000004`31e44708 00007ffe`a251b99d xul!lowbd_inv_txfm2d_add_no_identity_ssse3+0x16
0d 00000004`31e44750 00007ffe`a250086c xul!av1_lowbd_inv_txfm2d_add_ssse3+0xe2d
0e 00000004`31e45160 00007ffe`a2503270 xul!av1_lowbd_inv_txfm2d_add_avx2+0xcc
0f 00000004`31e4dc80 00007ffe`a24e4554 xul!av1_inv_txfm_add_avx2+0x30
10 00000004`31e4dcc0 00007ffe`a256fcc3 xul!av1_inverse_transform_block+0x114
11 00000004`31e4dd10 00007ffe`a256d486 xul!decode_block+0x1e73
12 00000004`31e7e1e0 00007ffe`a256d544 xul!decode_partition+0xaf6
13 00000004`31e7e340 00007ffe`a256d544 xul!decode_partition+0xbb4
14 00000004`31e7e4a0 00007ffe`a256c89d xul!decode_partition+0xbb4
15 00000004`31e7e600 00007ffe`a256b008 xul!decode_tile+0x34d
16 00000004`31e7e720 00007ffe`a258284e xul!av1_decode_tg_tiles_and_wrapup+0x1498
17 00000004`31e7e8c0 00007ffe`a257d76a xul!aom_decode_frame_from_obus+0xdfe
18 00000004`31e7eb00 00007ffe`a24a84e2 xul!av1_receive_compressed_data+0x4ba
19 00000004`31e7eb80 00007ffe`a240811b xul!frame_worker_hook+0x32
1a 00000004`31e7ebd0 00007ffe`a24a6fcb xul!execute+0x1b
1b 00000004`31e7ec00 00007ffe`a243cdcf xul!decoder_decode+0x6ab
1c 00000004`31e7ecc0 00007ffe`a1b78fda xul!aom_codec_decode+0x3f
1d 00000004`31e7ecf0 00007ffe`a1b8810f xul!mozilla::AOMDecoder::ProcessDecode+0x5a
1e 00000004`31e7ef10 00007ffe`a033d970 xul!mozilla::detail::ProxyRunnable<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >,mozilla::MediaResult,1>,RefPtr<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >,mozilla::MediaResult,1> > (mozilla::VPXDecoder::*)(mozilla::MediaRawData *),mozilla::VPXDecoder,mozilla::MediaRawData *>::Run+0x2f
1f 00000004`31e7ef60 00007ffe`9fca4cb3 xul!mozilla::TaskQueue::Runner::Run+0x160
20 00000004`31e7f020 00007ffe`9fc0dd16 xul!nsThreadPool::Run+0x553
21 00000004`31e7f170 00007ffe`9fc0d6c2 xul!nsThread::ProcessNextEvent+0x626
22 00000004`31e7f700 00007ffe`9fc0d4da xul!NS_ProcessNextEvent+0x42
23 00000004`31e7f750 00007ffe`9fbf1ea9 xul!mozilla::ipc::MessagePumpForNonMainThreads::Run+0xaa
24 00000004`31e7f7b0 00007ffe`9fc0d3f8 xul!MessageLoop::RunHandler+0x49
25 00000004`31e7f800 00007ffe`9fbf2e35 xul!MessageLoop::Run+0x58
26 00000004`31e7f850 00007ffe`c4005196 xul!nsThread::ThreadFunc+0x145
27 00000004`31e7f8c0 00007ffe`c3ff4a2a nss3!_PR_NativeRunThread+0x156
28 00000004`31e7f930 00007ffe`f6fddc05 nss3!pr_root+0xa
29 00000004`31e7f960 00007ffe`f9091fe4 ucrtbase!thread_start<unsigned int (__cdecl*)(void * __ptr64)>+0x35
2a 00000004`31e7f990 00007ffe`ddd0367b kernel32!BaseThreadInitThunk+0x14
2b 00000004`31e7f9c0 00007ffe`fa73cb31 mozglue!patched_BaseThreadInitThunk+0xbb
2c 00000004`31e7fa40 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Looking at the change in Child-SP (second column), the biggest delta by far is between:

11 00000004`31e4dd10 00007ffe`a256d486 xul!decode_block+0x1e73
12 00000004`31e7e1e0 00007ffe`a256d544 xul!decode_partition+0xaf6

And indeed, xul!decode_block has...

00007ffe`a256de5c b888040300 mov eax,30488h
00007ffe`a256de61 e8ea616701 call xul!__chkstk (00007ffe`a3be4050)
00007ffe`a256de66 4829c4 sub rsp,rax

0:045> .formats 30488h
Evaluate expression:
  Hex: 00000000`00030488
  Decimal: 197768

...a 197k stack frame.
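For reference, the Child-SP delta between frames 11 and 12 can be checked with a few lines of C. This is only a sketch of the arithmetic; the helper names `frame_span` and `decode_block_span` are made up for illustration, and the two addresses are copied from the stack above.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes between two Child-SP values. The stack grows downward on x64, so the
 * caller's Child-SP is the numerically larger of the two. */
uint64_t frame_span(uint64_t caller_sp, uint64_t callee_sp) {
  assert(caller_sp >= callee_sp);
  return caller_sp - callee_sp;
}

/* Child-SP of decode_partition (frame 12) minus Child-SP of decode_block
 * (frame 11), both taken from the debugger output. */
uint64_t decode_block_span(void) {
  return frame_span(0x0000000431e7e1e0ULL, 0x0000000431e4dd10ULL);
}
```

The span comes to 0x304d0 (197840) bytes, which is 72 bytes more than the 0x30488 passed to __chkstk. That remainder would be consistent with the return address plus a handful of pushed registers, though I have not checked the prologue to confirm it.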
Generally the compiler reserves stack space for all of a function's variables, even if they are scoped. So if you have code like:

void callee() {
  if (condition) {
    int array1[BIG];
    ...do some work...
  } else {
    int array2[BIG];
    ...do some work...
  }
}

void caller() {
  if (condition) {
    int array3[BIG];
    ...do some work...
  }
  callee();
}

Then `callee` would have a stack frame of size 2*BIG. Worse, if `callee` gets inlined, `caller` would have a stack frame of size 3*BIG. Splitting out scopes that declare large arrays into their own MOZ_NEVER_INLINE functions would make sure that you only use one "BIG" at a time.
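A minimal sketch of that split, in standalone C. Outside the Mozilla tree there is no MOZ_NEVER_INLINE, so this uses a stand-in `NEVER_INLINE` macro built on the compiler attributes; `BIG`, `work_a` and `work_b` are hypothetical names, and the loops are placeholders for "do some work".

```c
#include <stddef.h>

/* Stand-in for MOZ_NEVER_INLINE (an assumption for this standalone sketch). */
#if defined(_MSC_VER)
#define NEVER_INLINE __declspec(noinline)
#else
#define NEVER_INLINE __attribute__((noinline))
#endif

enum { BIG = 16384 };

/* Each helper owns exactly one large array, so its frame holds one BIG. */
NEVER_INLINE int work_a(int seed) {
  int array1[BIG];
  for (size_t i = 0; i < BIG; i++) array1[i] = seed + (int)i;
  return array1[BIG - 1];
}

NEVER_INLINE int work_b(int seed) {
  int array2[BIG];
  for (size_t i = 0; i < BIG; i++) array2[i] = seed - (int)i;
  return array2[BIG - 1];
}

/* callee declares no large arrays itself; only one helper frame is live at a
 * time, so the peak is one BIG rather than two. */
int callee(int condition, int seed) {
  return condition ? work_a(seed) : work_b(seed);
}
```

The helper frames are released on return, so a caller that also needs its own large array can put that in a third never-inline helper and still peak at one BIG.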
These are likely to be the most dangerous calls, because MAX_SB_SIZE is so big:

$ grep -r DECLARE_ALIGNED.*MAX_SB_SQUARE .
./common/blockd.h: DECLARE_ALIGNED(32, tran_low_t, dqcoeff[MAX_MB_PLANE][MAX_SB_SQUARE]);
./common/blockd.h: DECLARE_ALIGNED(16, uint8_t, color_index_map[2][MAX_SB_SQUARE]);
./common/blockd.h: DECLARE_ALIGNED(16, uint8_t, seg_mask[2 * MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint8_t, tmp_buf1[2 * MAX_MB_PLANE * MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint8_t, tmp_buf2[2 * MAX_MB_PLANE * MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint16_t, intrapredictor[MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint8_t, intrapredictor[MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint16_t, uintrapredictor[MAX_SB_SQUARE]);
./common/reconinter.c: DECLARE_ALIGNED(16, uint8_t, uintrapredictor[MAX_SB_SQUARE]);
./decoder/decodeframe.c: DECLARE_ALIGNED(16, uint8_t, tmp_buf1[2 * MAX_MB_PLANE * MAX_SB_SQUARE]);
./decoder/decodeframe.c: DECLARE_ALIGNED(16, uint8_t, tmp_buf2[2 * MAX_MB_PLANE * MAX_SB_SQUARE]);
./encoder/block.h: DECLARE_ALIGNED(16, int16_t, src_diff[MAX_SB_SQUARE]);
./encoder/block.h: DECLARE_ALIGNED(16, int16_t, pred_luma[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint16_t, comp_pred16[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint8_t, comp_pred[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint16_t, pred16[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint8_t, pred[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint16_t, pred16[MAX_SB_SQUARE]);
./encoder/mcomp.c: DECLARE_ALIGNED(16, uint8_t, pred[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, tran_low_t, this_dqcoeff[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(16, uint16_t, pred16[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(16, uint16_t, second_pred_alloc_16[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(16, uint16_t, second_pred_alloc_16[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, r0[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, r1[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, d10[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, ds[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, r1[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, d10[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, r0[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, r1[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, int16_t, d10[MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(32, uint8_t, tmp_buf_[2 * MAX_MB_PLANE * MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(16, uint8_t, pred0[2 * MAX_SB_SQUARE]);
./encoder/rdopt.c: DECLARE_ALIGNED(16, uint8_t, pred1[2 * MAX_SB_SQUARE]);

$ grep -r DECLARE_ALIGNED.*MAX_SB_SIZE.*MAX_SB_SIZE .
./common/reconinter.c: DECLARE_ALIGNED(32, uint16_t, tmp_dst[MAX_SB_SIZE * MAX_SB_SIZE]);
./decoder/decodeframe.c: DECLARE_ALIGNED(32, uint16_t, tmp_dst[MAX_SB_SIZE * MAX_SB_SIZE]);

It looks like this list is too long for any surgical fixes; huge stack buffers are deeply woven into this code. I give up: increase the media thread stack size if you must. :/
Crash Signature: [@ _chkstk | static void lowbd_inv_txfm2d_add_no_identity_ssse3] → [@ _chkstk | static void lowbd_inv_txfm2d_add_no_identity_ssse3] [@ _chkstk | static void av1_combine_interintra]
Yeah, I think there is no way around at least a small (maximum) stack size increase. That said, the 197k stack frame does seem excessive - a single 16 bit superblock is "only" 32k. What is the impact of increasing max stack size if it's only committed when used, anyway?
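To put numbers on the buffer sizes, here is a rough sketch of the arithmetic. The constants are assumptions: MAX_SB_SIZE = 128 and MAX_MB_PLANE = 3 are what I recall from the aom headers and should be checked against the tree. `sb_buffer_bytes` is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; verify against the aom headers in the tree. */
enum {
  MAX_SB_SIZE = 128,
  MAX_SB_SQUARE = MAX_SB_SIZE * MAX_SB_SIZE,
  MAX_MB_PLANE = 3
};

/* Size in bytes of a buffer holding `superblocks` superblocks' worth of
 * samples at `bytes_per_sample` bytes each. */
size_t sb_buffer_bytes(size_t superblocks, size_t bytes_per_sample) {
  assert(bytes_per_sample > 0);
  return superblocks * (size_t)MAX_SB_SQUARE * bytes_per_sample;
}
```

Under those assumptions, `sb_buffer_bytes(1, 2)` gives the 32k for a single 16 bit superblock, and one `uint8_t` buffer of `2 * MAX_MB_PLANE * MAX_SB_SQUARE` is `sb_buffer_bytes(2 * MAX_MB_PLANE, 1)` = 98304 bytes. Two of those would be 196608 bytes, which is close to the 197768 reserved in decode_block. That is suggestive of a tmp_buf1/tmp_buf2 pair being inlined into it, but it is not confirmed.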
This is hopefully fixed by Bug 1474684. We can re-open if it reappears.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Crash Signature: [@ _chkstk | static void lowbd_inv_txfm2d_add_no_identity_ssse3] [@ _chkstk | static void av1_combine_interintra] → [@ _chkstk | static void lowbd_inv_txfm2d_add_no_identity_ssse3] [@ _chkstk | static void av1_combine_interintra] [@ mozilla::EventQueue::GetEvent ]