Intermittent LeakSanitizer | leak at mozilla::NotNull, RacyRegisteredThread, RegisteredThread::RegisteredThread, mozilla::detail::UniqueSelector
Categories
(Core :: Gecko Profiler, defect, P5)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: egao)
References
(Blocks 1 open bug)
Details
(Keywords: intermittent-failure)
Attachments
(1 file: text/x-phabricator-request, deleted)
Filed by: nerli [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=280435304&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JL3W6jibRL23XK3tRkyPSA/runs/0/artifacts/public/logs/live_backing.log
[task 2019-12-10T07:25:22.316Z] 07:25:22 INFO - GECKO(4052) | SUMMARY: AddressSanitizer: 64 byte(s) leaked in 2 allocation(s).
[task 2019-12-10T07:25:22.392Z] 07:25:22 INFO - TEST-INFO | Main app process: exit 0
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - TEST-INFO | LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - TEST-INFO | LeakSanitizer | This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2019-12-10T07:25:22.393Z] 07:25:22 ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::NotNull, RacyRegisteredThread, RegisteredThread::RegisteredThread, mozilla::detail::UniqueSelector
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - runtests.py | Application ran for: 0:00:19.358857
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - zombiecheck | Reading PID log: /tmp/tmpT2Dxgvpidlog
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - ==> process 4052 launched child process 4065
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - ==> process 4052 launched child process 4105
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - ==> process 4052 launched child process 4118
[task 2019-12-10T07:25:22.393Z] 07:25:22 INFO - ==> process 4052 launched child process 4185
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - ==> process 4052 launched child process 4218
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - ==> process 4052 launched child process 4249
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4065
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4105
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4118
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4185
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4249
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - zombiecheck | Checking for orphan process with PID: 4218
[task 2019-12-10T07:25:22.394Z] 07:25:22 INFO - Stopping web server
[task 2019-12-10T07:25:22.409Z] 07:25:22 INFO - Stopping web socket server
[task 2019-12-10T07:25:22.413Z] 07:25:22 INFO - Stopping ssltunnel
[task 2019-12-10T07:25:22.432Z] 07:25:22 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
Comment 2 • 5 years ago (Assignee)
:gerald - this is an issue that I've been observing on almost all chunks that have failures in linux64-asan/opt when I use the new ubuntu1804 test image.
I noticed that this particular failure was seen on mozilla-central (ubuntu1604) tests on December 16 but since December 17 it is no longer reported. However, on my latest mozilla-central pull, I am still able to reproduce this issue with ubuntu1804.
Here are some try runs:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&tier=1%2C2%2C3&revision=55d21a101684af73b0c7faa74ab912ff7d80596a&searchStr=browser-chrome
example log: https://firefoxci.taskcluster-artifacts.net/KSgSfGjHS7iZps2yTEdTFw/0/public/logs/live_backing.log
Would you be able to take a look, or pass the ni to someone that can comment on why this is still reproducible on ubuntu1804? Thanks!
To run tests against ubuntu1804, please use ./mach try fuzzy --ubuntu-bionic and select linux64 tasks as normal.
Comment 3 • 5 years ago
It looks like the leaks are all happening with cubeb in the stack, if that indicates anything.
Thank you Edwin and Andrew for all this good information.
I see that when the "AudioIPC Client RPC" thread is created, the thread function registers the thread, but then it's never de-registered.
And since bug 1445822, the information for still-registered threads is not scrapped when the profiler shuts down (in case the thread still needs access to it to work on labels).
So we need that thread to de-register itself when it ends -- assuming it ever ends? I'm afraid it may be one of those never-ending Rayon threads, see bug 1445822 comment 50.
Paul, I see you wrote this register_thread. Would it be possible to write and call a corresponding deregister_thread? And would it actually be called?!
(The C++ callback is there.)
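One way to picture the missing de-registration is an RAII guard, so that the thread function cannot return without undoing its registration. The sketch below is only an illustration under assumptions: `ThreadRegistry`, `AutoThreadRegistration` and `RunThreadFunction` are hypothetical names standing in for the profiler's bookkeeping, not the real Gecko or AudioIPC APIs, and the thread function is called directly on the current thread to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical stand-in for the profiler's list of registered threads.
class ThreadRegistry {
 public:
  void Register(std::thread::id aId) {
    std::lock_guard<std::mutex> lock(mMutex);
    mThreads.insert(aId);
  }

  void Unregister(std::thread::id aId) {
    std::lock_guard<std::mutex> lock(mMutex);
    const std::size_t removed = mThreads.erase(aId);
    assert(removed == 1 && "thread was not registered");
    (void)removed;  // Avoids an unused-variable warning when NDEBUG is set.
  }

  // Number of threads that are still registered.
  std::size_t Count() const {
    std::lock_guard<std::mutex> lock(mMutex);
    return mThreads.size();
  }

 private:
  mutable std::mutex mMutex;
  std::set<std::thread::id> mThreads;
};

// Hypothetical guard: registers the current thread on construction and
// de-registers it on destruction, whichever way the scope is left.
class AutoThreadRegistration {
 public:
  explicit AutoThreadRegistration(ThreadRegistry& aRegistry)
      : mRegistry(aRegistry) {
    mRegistry.Register(std::this_thread::get_id());
  }
  ~AutoThreadRegistration() {
    mRegistry.Unregister(std::this_thread::get_id());
  }
  AutoThreadRegistration(const AutoThreadRegistration&) = delete;
  AutoThreadRegistration& operator=(const AutoThreadRegistration&) = delete;

 private:
  ThreadRegistry& mRegistry;
};

// Simulates one thread function and returns how many registrations are
// left behind once it has finished.
std::size_t RunThreadFunction(bool aUseGuard) {
  ThreadRegistry registry;
  if (aUseGuard) {
    AutoThreadRegistration registration(registry);
    // ... the thread's real work would happen here ...
  } else {
    registry.Register(std::this_thread::get_id());
    // ... work happens, but nothing ever calls Unregister() ...
  }
  return registry.Count();
}
```

In this model the unguarded path leaves one entry behind, which corresponds to the object LeakSanitizer reports. A guard like this only helps if the thread function actually returns, which is the open question for never-ending threads.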
Comment 9 • 5 years ago (Assignee)
Failure is still reproducible on current mozilla-central as of 2020/01/06:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&tier=1%2C2%2C3&revision=351c5423c62b726096049d19bc4c10849be3fcb9&searchStr=asan%2Cbrowser-chrome&selectedJob=283661039
Refer to test-linux64-asan/opt-mochitest-browser-chrome-fis-e10s-8 M-fis(bc8) for the instance of this failure.
Some possible alternatives:
- The thread could store its required data on the stack, and the profiler would store what it needs separately; so the profiler could destroy what it owns (like before) while the thread would still be able to do its work after that.
- Tell LeakSanitizer not to worry (if that's possible?)
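The first alternative above could look roughly like the following. This is a sketch under stated assumptions, not the profiler's actual design: `ThreadOwnedData`, `ProfilerOwnedRecord`, `SketchProfiler` and `ThreadFunction` are hypothetical names, and everything runs on one thread for simplicity. The point is only that the two sides own separate data, so the profiler can destroy its part at shutdown without invalidating what the thread still uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical data that only the thread itself needs (e.g. for labels).
struct ThreadOwnedData {
  std::string mCurrentLabel;
};

// Hypothetical data that only the profiler needs about a thread.
struct ProfilerOwnedRecord {
  std::string mThreadName;
};

// Hypothetical profiler that owns its records outright.
class SketchProfiler {
 public:
  void RegisterThread(const std::string& aName) {
    mRecords.push_back(ProfilerOwnedRecord{aName});
  }

  // Destroys everything the profiler owns, as it did before bug 1445822.
  void Shutdown() { mRecords.clear(); }

  std::size_t RecordCount() const { return mRecords.size(); }

 private:
  std::vector<ProfilerOwnedRecord> mRecords;
};

// Simulated thread function that outlives the profiler's shutdown.
std::string ThreadFunction(SketchProfiler& aProfiler) {
  ThreadOwnedData local;  // Lives on this thread's stack.
  aProfiler.RegisterThread("AudioIPC Client RPC");
  local.mCurrentLabel = "working";

  aProfiler.Shutdown();  // The profiler frees its own record.
  assert(aProfiler.RecordCount() == 0);

  // The thread's own data is untouched by the shutdown.
  local.mCurrentLabel = "still working after profiler shutdown";
  return local.mCurrentLabel;
}
```

Nothing is left on the heap for LeakSanitizer to report here, because the profiler's record is freed at shutdown and the thread's data is released when its stack unwinds.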
Comment 11 • 5 years ago
(In reply to Gerald Squelart [:gerald] (he/him) from comment #10)
- Tell LeakSanitizer not to worry (if that's possible?)
You can use MOZ_LSAN_INTENTIONALLY_LEAK_OBJECT to tell LSan to ignore the leak of a specific object. Obviously, it should be used with great care.
Comment 12 • 5 years ago
You can also whitelist via the allocation stack, but in this case, where there are just one or two specific objects, MOZ_LSAN_INTENTIONALLY_LEAK_OBJECT might be better.
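For illustration, the per-object approach could be sketched as below. This is an assumption-laden example, not Mozilla's actual macro definition: `SKETCH_INTENTIONALLY_LEAK_OBJECT`, `SketchThreadInfo` and `RegisterForeverThread` are hypothetical names. The only real API used is `__lsan_ignore_object` from `<sanitizer/lsan_interface.h>`, and the macro falls back to a no-op when the build is not detected as an AddressSanitizer build.

```cpp
#include <cassert>
#include <string>

// Detect an AddressSanitizer build (Clang via __has_feature, GCC via
// __SANITIZE_ADDRESS__). LeakSanitizer normally ships with ASan on Linux.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_LSAN 1
#  endif
#endif
#if !defined(SKETCH_HAS_LSAN) && defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_LSAN 1
#endif

#if defined(SKETCH_HAS_LSAN)
#  include <sanitizer/lsan_interface.h>
// Hypothetical macro: asks LSan not to report this one object.
#  define SKETCH_INTENTIONALLY_LEAK_OBJECT(aPtr) __lsan_ignore_object(aPtr)
#else
// Without a sanitizer the macro does nothing.
#  define SKETCH_INTENTIONALLY_LEAK_OBJECT(aPtr) ((void)(aPtr))
#endif

// Hypothetical per-thread record that is never freed on purpose.
struct SketchThreadInfo {
  std::string mName;
};

// Allocates a record for a thread that never de-registers and marks the
// allocation as an intentional leak.
const SketchThreadInfo* RegisterForeverThread(const char* aName) {
  assert(aName != nullptr);
  SketchThreadInfo* info = new SketchThreadInfo{aName};
  SKETCH_INTENTIONALLY_LEAK_OBJECT(info);
  return info;
}
```

Marking the object where it is allocated keeps the exemption narrow, which matches the advice above to use it with great care.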
Comment 14 • 5 years ago (Assignee)
Checking to see if there's any movement on this.
This issue is one of a couple that are preventing mochitest-browser-chrome from running under linux1804.
Comment 16 • 5 years ago (Assignee)
:gerald - thanks for the initial triage of this bug back in comment 2.
There has been no movement on this bug, and it is the last item holding back the mochitest-browser-chrome suite from moving to linux1804. It is also a failure that I can't disable or annotate, which means I am blocked in my migration.
My hope was to have the mochitest-browser-chrome suite migrated over before the Berlin All Hands as I am going on parental leave after that time.
Comment 17 • 5 years ago (Assignee)
For reference, this is the most recent try push showing the failures:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&tier=1%2C2%2C3&revision=84e3929daec47302129549699274491943bb1ad4&searchStr=asan%2Cbrowser-chrome&selectedJob=285706918
To run pushes against ubuntu1804, please use mach try fuzzy --ubuntu-bionic and select linux64 jobs as normal.
Comment 19 • 5 years ago (Assignee)
Comment 20 • 5 years ago
Comment 21 • 5 years ago (Assignee)
I've added the leave-open flag; this issue still needs to be addressed as soon as possible.
I'm on PTO for one week, then half-PTO for 2 weeks, so "ASAP" for me may be a few weeks away...
Comment 23 • 5 years ago
bugherder
Comment 24 • 5 years ago
I think we can arrange to call PROFILER_UNREGISTER_THREAD() when the AudioIPC client threads shut down, I'll try that when I have a chance once I get home from Berlin.
Comment 26 • 5 years ago
(In reply to Matthew Gregan [:kinetik] from comment #24)
I think we can arrange to call PROFILER_UNREGISTER_THREAD() when the AudioIPC client threads shut down, I'll try that when I have a chance once I get home from Berlin.
I'll land this change via bug 1614547.
Comment 27 • 5 years ago
Worth noting that bug 1610640 may hide this, since AudioIPC is (temporarily) disabled in mochitest-browser-chrome.
Much appreciated, Matthew. ⭐️
I'll keep an eye on leaks here, in case there are other long-living threads...
RegisteredThread was removed in bug 1722261, so I'll call this bug effectively fixed as of 93.0a1 / 20210824094724.