Closed Bug 1545187 Opened 6 years ago Closed 5 years ago

Consider making gPSMutex a re-entrant mutex (recursive lock, mozilla::ReentrantMonitor)

Categories

(Core :: Gecko Profiler, defect, P3)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox68 --- affected

People

(Reporter: mstange, Unassigned)

Details

We have many workarounds throughout the profiler code in order to avoid re-entrant locking of gPSMutex (the central "profiler state" mutex).

I don't remember why we chose this approach over a re-entrant monitor. It might be time to reconsider this choice.

Thank you for the suggestion: I agree that a RE mutex could prevent some annoying issues and their awkward workaround code, especially as recently experienced with the profiler doing logging or file IOs that are intercepted and fed back into the profiler, e.g.: bug 1540340.

And if we go ahead with a RE mutex, I think we'd still want to keep the current function-call architecture in place, with "locked_..." functions taking a proof-of-lock reference (instead of trying to take the lock themselves); this saves a few locking attempts while still ensuring that the lock is held as expected.

To be complete, here are some "con"s I can think of:

  • RE mutexes potentially slower (to be measured),
  • May make us lazy and use more locking than needed in future code,
  • Easier to get into bad situations, e.g.: one function takes the mutex and modifies some complex structures, but in the middle (when invariants are broken) another function (or the same function!) is called and takes the mutex recursively, incorrectly thinking that expected invariants currenly hold.

This last one is the scariest, so I think we should think carefully before changing gPSMutex.
Note that the current non-RE mutex does tell us about re-locking in DEBUG mode, so it's not too hard to miss during tests.

As a possible alternative before changing gPSMutex, to help us catch recursive locking now and in the future, we could add something like MOZ_RELEASE_ASSERT(profiler_this_thread_does_not_currently_hold_lock()) in the known places where we potentially re-enter the profiler (logging and IOs at the moment).

Blocks: 1540340
Priority: -- → P3

mstange, I recall you expressed strong distaste for recursive mutexes when I first was reworking the profiler's global state. But I don't recall any specifics about why you didn't like them.

I don't recall the reasons for my strong distaste either; I think at the time I only had a diffuse feeling that it's hard to use them correctly.

But Gerald's point 3 is actually very convincing! I'm resolving this s wontfix.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

Note for future readers: Bug 1540340 makes it possible to check if gPSMutex is already held on the current thread, to avoid double-locking. (I see I suggested that at the end of comment 1 -- I had forgotten about it 😅)

You need to log in before you can comment on or make changes to this bug.