881389 - (OneLogger) [meta] Improve Gecko's logging story

Reporter

Description

•

11 years ago

Gecko's B2G logging story is a mess. I know this might not seem like an important issue to tackle, but I think it is. Bear with me. Right now we log messages in one of a few ways: 1) Unconditionally print to logcat with __android_log_print. This is common in gonk-only code. 2) Use prlog, which prints to logcat if PR_LOGGING is enabled at build time (or FORCE_PR_LOGGING is defined in the given file), and if an environment variable is defined in b2g.sh. 3) Use hand-rolled logging macros (e.g. in ProcessPriorityManager), which usually require recompiling to enable. Option (1) is fine for Gonk-only code, but if we over-use this approach, we can clutter up the logs and potentially slow down the phone. I also think I've seen us using this idiom in non-Gonk-only code (because it's so convenient), but that's very bad, because then there's no way to get those log messages on desktop, short of editing the code. Option (2) requires editing b2g.sh on the device and, if we don't use FORCE_PR_LOGGING, also requires re-compiling. Option (3) requires re-compiling. Re-compiling is usually fine for developers, but it's not fine for many of the people doing QA. It's also not an option for a user who wants to submit a bug report. I'd also prefer not to make users or QA people edit b2g.sh. Indeed, that probably requires root. There's a lot of data that's hidden in logs that use option (2) or (3) above, and right now every time we want someone from QA to get that information, it's difficult for everyone involved. Asking a user to get this information is going to be a non-starter for all but the most technically-inclined users. I'd like to simplify and unify our logging story. I'd like a solution that: 1) Doesn't require recompiling to enable logging of any module. 2) Works using the same code on dekstop and mobile. 3) Has convenient syntax (so developers don't have to work around it today, as we currently have to work around prlog). prlog completely disables itself in release builds, because it doesn't want to add overhead. This new logging module is going to have the opposite default, that log messages are (usually) infrequent enough that checking whether logging is enabled isn't a big deal.

Mike Habicher [:mikeh] (high bugzilla latency)

Comment 1

•

11 years ago

Suggestion: 4) Be able to enable logging without rebooting the phone or restarting processes, to help diagnose hard-to-reproduce bugs.

Dave Hylands [:dhylands]

Comment 2

•

11 years ago

We should look at dynamic debug, a feature that exists in the kernel. There was a really good paper written on this here: https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf

Mike Habicher [:mikeh] (high bugzilla latency)

Comment 3

•

11 years ago

Which reminds me: can we include getting kernel messages inline with logcat somehow?

Justin Lebar (not reading bugmail)

Reporter

Comment 4

•

11 years ago

I would really like at the very least to annotate crashes with "OOM", etc in logcat, because asking people to look at dmesg is too complicated. But I think this is orthogonal to what I'm trying to accomplish here.

Dave Hylands [:dhylands]

Comment 5

•

11 years ago

I filed bug 805476 to get the OOM messages into logcat...

Justin Lebar (not reading bugmail)

Reporter

Comment 6

•

11 years ago

(In reply to Mike Habicher [:mikeh] from comment #1) > 4) Be able to enable logging without rebooting the phone or restarting > processes, to help diagnose hard-to-reproduce bugs. That would be really nice. I'll see what I can do!

Justin Lebar (not reading bugmail)

Reporter

Comment 7

•

11 years ago

Hm, compile-time hashing of the filename is going to be fun...

Justin Lebar (not reading bugmail)

Reporter

Comment 8

•

11 years ago

I have an in-progress patch for this. It works pretty okay, and I'm hopeful the overhead will be very low.

Assignee: nobody → justin.lebar+bug

Justin Lebar (not reading bugmail)

Reporter

Updated

•

11 years ago

Alias: OneLogger

Justin Lebar (not reading bugmail)

Reporter

Updated

•

11 years ago

Summary: Improve B2G's logging story → Improve Gecko's logging story

Justin Lebar (not reading bugmail)

Reporter

Comment 9

•

11 years ago

Attached patch Part 1, WIP: One logger to rule them all (obsolete) (deleted) — Details — Splinter Review

The main thing that isn't complete here is Android/B2G integration. I plan to write code which will watch a file on the FS and use its contents to determine which logging statements to enable. I'm also still tweaking the interface. I've been going through and converting PR_LOG's and __android_log_print statements to the new interface in order to test that it's the right interface. I think I'm mostly happy with it now. I still need to measure the effect of these log statements on code size and speed. I'm hoping it's negligible, or if not can easily be made negligible. Anyway, there's a long comment in Logging.h that explains what's going on.

Justin Lebar (not reading bugmail)

Reporter

Comment 10

•

11 years ago

Attached patch Part 2, WIP: Convert content/base/src/{nsDocument,nsCSPService}.cpp to use the One Logger. (obsolete) (deleted) — Details — Splinter Review

This is an illustrative patch, if you just want to see the interface.

Justin Lebar (not reading bugmail)

Reporter

Comment 11

•

11 years ago

Attached patch Part 3, WIP: Convert volume handling over to the One Logger (obsolete) (deleted) — Details — Splinter Review

Another illustrative patch. I have a bunch more; pipe up if you want to see 'em.

Honza Bambas (:mayhemer)

Comment 12

•

11 years ago

Can this also incorporate thread names (as bug 885952 does) and timestamp to the log output?

Justin Lebar (not reading bugmail)

Reporter

Comment 13

•

11 years ago

(In reply to Honza Bambas (:mayhemer) from comment #12) > Can this also incorporate thread names (as bug 885952 does) and timestamp to > the log output? Sure, that's a good idea. These log messages are getting a lot of metadata: - log module name - filename + lineno of the log message - optionally "CRITICAL" - thread name - pid, if running in multiprocess mode - process name ("calculator", etc), if running in multiprocess mode I need to spend some time figuring out how not to overload us with all this info, but that's on me.

Mike Habicher [:mikeh] (high bugzilla latency)

Comment 14

•

11 years ago

(In reply to Justin Lebar [:jlebar] from comment #13) > > - pid, if running in multiprocess mode |logcat -v thread| or |logcat -v threadtime| will already show the pid and tid that generated the logcat entry, so no need to duplicate that on supporting platforms.

Justin Lebar (not reading bugmail)

Reporter

Comment 15

•

11 years ago

Except adb logcat --help doesn't list any of these things, so it might be more user-friendly to display them ourselves.

Gregory Szorc [:gps]

Comment 16

•

11 years ago

See bug 451283 and bug 884397 for tentative plans on the JS side of things. Not sure how much we care about having C++ and JS play together at this juncture.

Justin Lebar (not reading bugmail)

Reporter

Comment 17

•

11 years ago

I'd been explicitly not considering structured logging here; I figured that we cared only about consumers with heartbeats. Should I care, or do we want structured logging only for test output?

Gregory Szorc [:gps]

Comment 18

•

11 years ago

I can't think of an adequate use case for structured logging in C++. Now, having a unified manner for quickly changing log emission settings that affects both JS and C++, that would be interesting.

Justin Lebar (not reading bugmail)

Reporter

Comment 19

•

11 years ago

I had an interesting conversation with dveditz and Lucas on IRC, in which we identified a few security issues. For example, it would be bad if an app process could turn on logging and direct it to a file. Then that app process could (a) fill up the phone's storage, and (b) if the file is globally-readable (which it probably is), the app could read sensitive data (e.g. URLs) out of the logs. I think these are all moot if, on B2G, we don't allow logging to files. jcranmer convinced me that we want log-to-file on desktop, but I'm not convinced we need it on B2G. Everything goes through logcat right now anyway. Using logcat exclusively also solves problems around interleaving messages between processes.

Honza Bambas (:mayhemer)

Comment 20

•

11 years ago

(In reply to Justin Lebar [:jlebar] from comment #19) > jcranmer convinced me that we want log-to-file on desktop, If desktop means b2g desktop builds, then +1. We have issues when other processes are logging using NSPR now. E.g. a flash plugin can just overwrite the file start on windows - really crappy. Also, when a child process is spawned, it would be great to have the logs in one file. Or, at least somehow nicely and automatically separated to multiple files. I'm terrified a day I'll have to debug a deep problem on b2g directly in the phone w/o being able to get logs :(

Justin Lebar (not reading bugmail)

Reporter

Comment 21

•

11 years ago

As far as this bug is concerned, B2G desktop should behave just like Firefox desktop.

Joshua Cranmer [:jcranmer]

Comment 22

•

11 years ago

I'll be frank and point out that I mostly care about logging in terms of the logging used in mailnews, which is predominantly the protocol logs (this is basically the only way we have a hope of QAing many problems). This means my basic desiderata are few: 1. Ability to log to a file. 2. Log in both debug and release builds. 3. Ability to turn on logging via UI somehow. 4. Ability to give loggers simple names (i.e., IMAP, NNTP, etc.) 5. Turn on logging via an env variable (like NSPR_LOG_MODULES=IMAP:5 right now)--helps when debugging xpcshell test failures. 6. Logging from multiple threads needs to work (IMAP runs on its own thread). I don't particularly care about structured logging, nor do I feel strongly about JS integration at this point (since you can work around them).

Justin Lebar (not reading bugmail)

Reporter

Comment 23

•

11 years ago

What I'm designing should meet all of those criteria. I'm probably not going to move a ton of existing NSPR loggers over to this new thing, but it should be easy to do so for anything you care about.

Benoit Girard (:BenWa)

Updated

•

11 years ago

Blocks: 887971

Benoit Girard (:BenWa)

Comment 24

•

11 years ago

I think we should support C++ stream as well. C++ stream are useful when you're not sure which type flags you should use for printf like typdefs, uint64_t and objects. The primary concern is being able to lazy evaluate while using C++ stream. Perhaps we can have something like: LOG("Count: " << myInt);

Justin Lebar (not reading bugmail)

Reporter

Comment 25

•

11 years ago

C++ streams would also let us do nsIURI* uri; LOG("URI: " << uri); instead of nsIURI* uri; LOG("URI: %s", LOG_AS_STR(uri)); I think the former is clearly better. It's harder to do field width specifiers with <<, but those are going to be pretty uncommon with log messages, I think. So I guess I'll figure out if I can convert everything to using streams.

Benjamin Smedberg

Comment 26

•

11 years ago

Have you looked at the prior art in the chromium loggers? In particular their API has LOG (always) and DLOG (debug-only): http://src.chromium.org/svn/trunk/src/third_party/cld/base/logging.h The client code is written as: LOG(ERROR) << "Pickle error decoding Histogram: " << histogram_name; There is enough magic to avoid evaluating some of the arguments if the logging of that severity is disabled.

Benoit Girard (:BenWa)

Comment 27

•

11 years ago

Does it prevent something like 'ExpensiveFunction()' in: LOG(DEBUG) << "Tree Size: " << ExpensiveFunction(); from being evaluated?

Justin Lebar (not reading bugmail)

Reporter

Comment 28

•

11 years ago

(In reply to Benoit Girard (:BenWa) from comment #27) > Does it prevent something like 'ExpensiveFunction()' in: > LOG(DEBUG) << "Tree Size: " << ExpensiveFunction(); > from being evaluated? It doesn't look like it to me, but I could be misreading it...

Dave Hylands [:dhylands]

Comment 29

•

11 years ago

If LOG(ERROR) were defined as something along the lines of: #define LOG(level) if (LoggingEnabled(level)) logstream then ExpensiveFunction() would be inside an if and would only be evaluated if the if passes.

Justin Lebar (not reading bugmail)

Reporter

Comment 30

•

11 years ago

> #define LOG(level) if (LoggingEnabled(level)) logstream > > then ExpensiveFunction() would be inside an if and would only be evaluated if the if > passes. How would that work? Suppose I did > LOG(DEBUG) << ExpensiveFunction(); in the #define above, LOG(DEBUG) doesn't evaluate to anything, regardless of whether LoggingEnabled(DEBUG). But even if we made it evaluate to something, I don't see how that fixes things.

Dave Hylands [:dhylands]

Comment 31

•

11 years ago

I'm not following. Using my somewhat simple example if you had #define LOG(level) if (LoggingEnabled(level)) logstream then LOG(DEBUG) << ExpensiveFunction(); would become if (LoggingEnabled(DEBUG)) logstream << ExpensiveFunction(); And ExpensiveFunction() would only be called if LoggingEnabled(DEBUG) returned true. If you got more exotic, you could make the LoggingEnabled portion actually be a function based on the debug level, so that in release builds, debug stuff gets optimized out. if (0) logstream << ExpensiveFunction(); should be eliminated entirely.

Justin Lebar (not reading bugmail)

Reporter

Comment 32

•

11 years ago

Ah, I see what you mean. That might work...

Justin Lebar (not reading bugmail)

Reporter

Comment 33

•

11 years ago

Attached patch WIP rewrite of part 1 (the main event): Add the One Logger. (obsolete) (deleted) — Details — Splinter Review

Sadly I didn't have a chance to finish this bug before leaving Mozilla. In case someone wants to pick this up, here's the status: I had something which was mostly complete, but jcranmer convinced me that an iostream-style interface (i.e., using "<<") instead of a printf-style interface would be better. I never had a chance to do the conversion. This patch builds a system where it's easy to tell, in a threadsafe and hopefully efficient manner, whether logging for a file/module is enabled. So the remaining work is just converting the interface. Hopefully it's clear how all of this stuff fits together.

Attachment #766994 - Attachment is obsolete: true

Attachment #766995 - Attachment is obsolete: true

Attachment #766996 - Attachment is obsolete: true

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 34

•

11 years ago

I'd hate for this to fall through the cracks.

Dave Hylands [:dhylands]

Comment 35

•

11 years ago

(In reply to ben turner [:bent] (needinfo? encouraged) from comment #34) > I'd hate for this to fall through the cracks. I can assign this to me, and then at least I'll be reminded of it. It may be several weeks before I can get to it... If somebody else wants to take this on, let me know.

Assignee: justin.lebar+bug → dhylands

Dave Hylands [:dhylands]

Comment 37

•

11 years ago

Added [fxos:media] whiteboard for bugs assigned to me so that they can be triaged/prioritized, etc.

Whiteboard: [fxos:media]

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

10 years ago

Blocks: 1052804

Part 1, WIP: One logger to rule them all 11 years ago Justin Lebar (not reading bugmail) (deleted), patch		Details \| Diff \| Splinter Review
Part 2, WIP: Convert content/base/src/{nsDocument,nsCSPService}.cpp to use the One Logger. 11 years ago Justin Lebar (not reading bugmail) (deleted), patch		Details \| Diff \| Splinter Review
Part 3, WIP: Convert volume handling over to the One Logger 11 years ago Justin Lebar (not reading bugmail) (deleted), patch		Details \| Diff \| Splinter Review
WIP rewrite of part 1 (the main event): Add the One Logger. 11 years ago Justin Lebar (not reading bugmail) (deleted), patch		Details \| Diff \| Splinter Review