Closed Bug 1126614 Opened 10 years ago Closed 10 years ago

"Non-fatal assertion" is an oxymoron

Categories

(Core :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 279923

People

(Reporter: n.nethercote, Unassigned)

Details

(Keywords: meta)

NS_ASSERTION is non-fatal. MOZ_ASSERT is fatal. I propose converting as many of the former as possible to the latter. Some background reading. - Brendan argued that assertions should be fatal back in 2005: http://brendaneich.com/2005/01/assertions-should-be-fatal/ - I argued similarly in 2011: http://blog.mozilla.org/nnethercote/2011/06/07/what-is-the-point-of-non-fatal-assertions/ (But note that I conflated NS_ASSERTION with NS_ENSURE_* which are quite a different thing, so that post and its comments makes for somewhat confusing reading.) - Bob Clary carried on my argument in 2011 on dev-platform: https://groups.google.com/forum/#!topic/mozilla.dev.platform/UZ7c2EnBzdI There seem to be two objections to this proposal. - Some NS_ASSERTIONs are unreliable, and it would be hard to work out which ones are. But I've done some preliminary testing and things look pretty reasonable. It looks like only a small number of NS_ASSERTIONs fail regularly in tests. We can leave those alone (or file bugs to get the fixed or removed) and eventually rename NS_ASSERTION as NS_WARN_IF_FALSE, or similar. - Some reliable NS_ASSERTIONS indicate things that shouldn't necessarily be fatal. I think we can again leave these and eventually rename them as NS_WARN_IF_FALSE. Finally, my most telling argument is this: it's what we're doing anyway. Here are occurrence counts for various assertion kinds in the codebase: Non-fatal: - NS_ASSERTION 8767 - NS_PRECONDITION 1836 Fatal: - MOZ_ASSERT 29121 - NS_ABORT_IF_FALSE 1786 Fatal assertions outnumber non-fatal assertions by ~3x. glandium did this lovely measurement of the popularity of MOZ_ASSERT through various Firefox versions since the dev-platform conversation occurred: > version num > 6 6 > 7 12 > 8 16 > 9 37 > 10 69 > 11 129 > 12 213 > 13 388 > 14 649 > 15 1036 > 16 1330 > 17 1805 > 18 2578 > 19 2777 > 20 3090 > 21 3616 > 22 4237 > 23 4802 > 24 5555 > 25 6045 > 26 6813 > 27 7856 > 28 8571 > 29 9689 > 30 11078 > 31 11952 > 32 13907 > 33 14597 > 34 15776 > 35 26712 The corresponding numbers for NS_ASSERTION: > version NS_ASSERTION_count > 6 9971 > 7 9932 > 8 9739 > 9 10057 > 10 9752 > 11 9851 > 12 9889 > 13 9831 > 14 9837 > 15 10301 > 16 10335 > 17 10555 > 18 10758 > 19 10686 > 20 10722 > 21 10643 > 22 10640 > 23 10796 > 24 10669 > 25 10650 > 26 10395 > 27 10367 > 28 10166 > 29 10027 > 30 9909 > 31 9866 > 32 9852 > 33 9784 > 34 9773 > 35 8966 I just want to greatly accelerate this trend :)
For what it's worth, this is the trend of JS_ASSERT for the same versions since the dev-platform thread, which are MOZ_ASSERT's ancestor, then were just #define JS_ASSERT MOZ_ASSERT and are now gone: 6 4492 7 4443 8 4680 9 5287 10 5396 11 4959 12 4920 13 5016 14 4950 15 4972 16 4929 17 5283 18 6839 19 6914 20 7210 21 7059 22 7465 23 8251 24 7473 25 7495 26 7758 27 7946 28 8069 29 8228 30 8173 31 8198 32 8226 33 8373 34 8824 35 16 (those last 16 are actually JS_ASSERT_* stuff)
Duplicate of bug 279923? There are some decent arguments for allowing things that do not lead to a crash to be non-fatal assertions. For example, if a security bug comes in, with a good testcase, and the problem is clearly pointed out by the second or third assertion that fires (something that's not all that unusual with our existing NS_ASSERTIONs and incoming security bugs, actually), having those assertions be fatal would make it harder to find the underlying problem in the security bug, because we'd be forced to debug the wrong assertion first, rather than seeing the full context. See also bug 279923 comment 17 regarding test failures: it's often useful to know the full set of test failures. Also, I think the trend towards fatal assertions has been making debug builds substantially less usable: it's pretty common for me to have to comment out fatal assertions in order to just use a debug build to browse the Web. This problem is concentrated in some parts of the code: I think over the past few years it's mostly been in the JS engine and Necko, but recently a bit in ImageLib as well. It's been getting close to the point of my being unable to use debug builds for regular browsing -- not because of speed -- but because of too many fatal assertions that people don't fix. (It is relatively easy to work around by commenting them out locally, which is what I do. If it weren't for the easy workaround; I'd have given up on running debug builds long ago.) If we're going to push people to make more assertions fatal, I think we should push people to take usability of debug builds more seriously as well. Finally, I think we can use our existing test harnesses to keep the counts of NS_ASSERTIONs down by making unexpected NS_ASSERTIONS cause test failures and annotating the known cases as expecting a certain number of assertions. I've done that for reftest and most mochitests (modulo bug 847275), but it's at risk of losing effectiveness because I don't think anybody has had the chance to update it for e10s (although I could be wrong; I don't see an open bug right now). This has been quite effective in stopping regressions of NS_ASSERTIONs in the areas that it covers, though nobody has prioritized fixing the known assertions.
I did a try push where I converted all NS_ASSERTIONs under dom/ to MOZ_ASSERT. I got some failures in https://treeherder.mozilla.org/logviewer.html#?job_id=4484815&repo=try (Mochi-3, dom/media/test/test_seek-4.html). dbaron, does Mochi-3 have the NS_ASSERTION failure detection disabled?
Flags: needinfo?(dbaron)
> There are some decent arguments for allowing things that do not lead to a crash to be non-fatal > assertions. Given that 75% of our assertions are fatal and the number is rising, it sounds like it would be better to make MOZ_ASSERT fatal *by default* but provide an outlet, such as setting an environment variable, to allow it to be non-fatal. Someone will probably now say "but some assertions should always be fatal because they indicate memory corruption or something else catastrophic whereas others are less dangerous and things can continue reasonably". In response to which I would argue that this fine level of distinction has zero chance of being observed consistently in a codebase with 40,000+ assertions, and therefore an all-fatal-or-none-fatal switch is enough. As for test failures, I can live for now with the assertions that cause test failures staying as NS_ASSERTION. I expect the number of these to be small, e.g. probably measured in the dozens.
(In reply to Nicholas Nethercote [:njn] from comment #5) > Given that 75% of our assertions are fatal and the number is rising, it > sounds like it would be better to make MOZ_ASSERT fatal *by default* but > provide an outlet, such as setting an environment variable, to allow it to > be non-fatal. That wouldn't help if you were hitting one assertion while trying to get a stack for another - by the time you crash you might be somewhere else entirely. But I do think you could argue that if you hit an assertion in testing, it should be (1) fixed, (2) turned into a warning, or (3) backed out, as would happen with failures in automation. This still puts the onus on the tester to file a bug, but being stricter about the 'back out' option might incentivize people to fix things that we don't have test cases for in automation (assuming that this is a problem in practice). For that matter, I would argue that if you're really serious about an assertion, it should be a release assert so that it will show up in crash reports. Even a Nightly assert would receive far more testing than automation could ever offer (but those were removed when MOZ_ASSUME_UNREACHABLE was renamed). I guess that's a different level of commitment though.
Fwiw, bug 1080457 proposes making *warnings* count as failures, but only in regression tests (not in fuzzing or normal browsing).
(In reply to Nicholas Nethercote [:njn] from comment #5) > ... it sounds like it would be better to make MOZ_ASSERT fatal *by default* but > provide an outlet, such as setting an environment variable, to allow it to > be non-fatal. Suppose you make MOZ_ASSERT fatal "by default". I'd expect regression test suites, fuzzers, and people who browse with debug builds to all set the environment variable to make them non-fatal. Then what have you accomplished, other than disrupting the project to make a WP:POINT? If you want assertion bugs to get fixed faster, there are better ways to argue for that.
Assignee: n.nethercote → nobody
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
> Suppose you make MOZ_ASSERT fatal "by default". I'd expect regression test > suites, fuzzers, and people who browse with debug builds to all set the > environment variable to make them non-fatal. Then what have you > accomplished, other than disrupting the project to make a WP:POINT? What do these people do with MOZ_ASSERT as it currently stands, which doesn't allow non-fatalness?
In the case of fuzzers, stop fuzzing entire areas of functionality for months at a time until someone gets around to fixing the fatal assert that triggers all the time as soon as you start fuzzing. In the case of test suites, disable chunks of test suites (often for months at a time, if not forever) until someone gets around to fixing the fatal assert. In the case of people who browse with debug builds, see comment 2, though mostly fatal assertions that trigger on the web and remain unfixed forever (esp. in SpiderMonkey) have driven people away from even pretending to dogfood debug builds.
You need to log in before you can comment on or make changes to this bug.