Closed Bug 866675 Opened 12 years ago Closed 11 years ago

Implement a WASAPI backend for cubeb

Categories

(Core :: Audio/Video, defect)

All
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: padenot, Assigned: padenot)

References

Details

Attachments

(2 files, 1 obsolete file)

As part of the ongoing work to reduce latency when playing back sound in Gecko, we need a new Windows backend based on WASAPI instead of WinMM to improve our latency on Windows Vista+ machines. This will bring our latency down to the 30ms region (instead of the 100-150ms we have at the moment). This backend will be very impractical to write in plain C, because the libraries it relies on are making heavy use of inheritance, vtables, COM and stuff.
Use C++ if it's easier, there's no hard requirement on C for libcubeb, it's just nicer to be C-only when it's not inconvenient. The old DirectSound backend was C++ for the same reason.
So, while WASAPI will give us way better performance, it has its own set of problems: - the user of the API is expected to handle resampling/sample format conversion himself (i.e., the samplerate/format is fixed, and the application has to do the conversion in its pipeline, unlike every other audio backend out there). Doing the sample format conversion is trivial. I plan to use the speex's resampler to do the resampling. - we cannot have a single thread for every streams, like we do for every other platforms, WASAPI expecting each stream to run on its own thread. We could end up with a large number of thread, but well, it is how the API works.
Blocks: 785584
Update on this, it's mostly working, but I have an intermittent crash happening somewhere in the test suite (and no valgrind), that I'm tracking down. Also, the clock is a bit weird in wasapi, and I fail test_playback_rate.html.
Attached patch v1: add a wasapi backend to cubeb (obsolete) (deleted) — Splinter Review
This patch uses the shared mode. If needed (if we feel that we need to go even lower in terms of latency), we could implement a exclusive mode. I plan to put a copy of the speex resampler in the PR that goes in the github repo, but we can just use the in-tree copy, here. As we discussed during the work week, this backend does not prefill the buffer on stream start (call the callback during init). We can trivially add it if needed. Also, I put in functions that I need to add to the cubeb API (max channel count, latency reporting), but commented them in the vtable. Depending on the order of landing for the related patches, I'll strip them out.
Attachment #755445 - Flags: review?(kinetik)
Comment on attachment 755445 [details] [diff] [review] v1: add a wasapi backend to cubeb Review of attachment 755445 [details] [diff] [review]: ----------------------------------------------------------------- ::: media/libcubeb/src/cubeb_wasapi.cpp @@ +1,2 @@ > +/* > + * Copyright é 2013 Mozilla Foundation Charset funkiness. @@ +18,5 @@ > +#include "cubeb-speex-resampler.h" > +#include "cubeb-internal.h" > + > +#if 0 > +# define LOG(...) fprintf(stderr, __VA_ARGS__); fprintf(stderr, "\n"); Maybe make LOG a static function with the contents ifdefed out when not needed; otherwise at least wrap the macro contents in do { } while(0). @@ +23,5 @@ > +#else > +# define LOG(...) > +#endif > + > +/* Hundreads of nanoseconds per millisecond. */ Typo, repeated below: Hundreds. @@ +25,5 @@ > +#endif > + > +/* Hundreads of nanoseconds per millisecond. */ > +#define HNS_PER_MS 10000 > +#define MS_PER_S 1000 These can all be static const; or turn them into helper functions with the appropriate types so the math is centralized? @@ +31,5 @@ > +/* Hundreads of nanoseconds per second. */ > +#define HNS_PER_S (HNS_PER_MS * MS_PER_S) > + > +/* Hundreads of nanoseconds per microsecond. */ > +#define HNS_PER_US 1000 Unused? @@ +34,5 @@ > +/* Hundreads of nanoseconds per microsecond. */ > +#define HNS_PER_US 1000 > + > +/* Microseconds per second. */ > +#define US_PER_S 1000000 Unused? @@ +39,5 @@ > + > +#define ARRAY_LENGTH(array_) \ > + (sizeof(array_) / sizeof(array_[0])) > + > +extern "C" { You can move the extern "C" to the definition. @@ +44,5 @@ > + int wasapi_init(cubeb ** context, char const * context_name); > +}; > + > +typedef void (*refill_function)(cubeb_stream * stm, > + float * data, long frame_needed); frames_needed? @@ +46,5 @@ > + > +typedef void (*refill_function)(cubeb_stream * stm, > + float * data, long frame_needed); > + > +void SafeRelease(HANDLE handle) Newline between void and SafeRelease (apply this rule everywhere) @@ +54,5 @@ > + } > +} > + > +template <typename T> > +void SafeRelease(T ** ptr) Change this to take a T * and just call release, no need to null it. Then arrange for the callers to null only when necessary (e.g. shouldn't ever be needed when called in a _destroy function). Or remove this and use CComPtr everywhere instead? @@ +64,5 @@ > +} > + > +typedef HANDLE WINAPI set_mm_thread_characteristics_function( > + const char* TaskName, LPDWORD TaskIndex); > +typedef BOOL WINAPI revert_mm_thread_characteristics_function(HANDLE handle); refill_function is typedefed as a function pointer, these are typedefed as functions. Please change these to be function pointer typedefs for consistency. @@ +69,5 @@ > + > +struct cubeb > +{ > + cubeb_ops const * ops; > + IMMDevice* device; Space between IMMDevice and *. @@ +77,5 @@ > + set_mm_thread_characteristics_function * set_mm_thread_characteristics; > + revert_mm_thread_characteristics_function * revert_mm_thread_characteristics; > +}; > + > +extern cubeb_ops const wasapi_ops; This shouldn't need "extern"? @@ +116,5 @@ > + refill_function refill_function; > + /* Leftover frames handling, only used when resampling. */ > + uint32_t leftover_frame_count; > + uint32_t leftover_frame_size; > + float * leftover_frames; leftover_frames_buffer @@ +136,5 @@ > +{ > + return context->mmcss_module; > +} > + > +/* WASAPI in shared mode always support float32, so this is unused. Delete, or if defined(EXCLUSIVE_MODE) these out for now. @@ +176,5 @@ > +{ > + if (should_upmix(stm)) { > + return stm->bytes_per_frame / 2 * frames; > + } > + return stm->bytes_per_frame * frames; return stm->bytes_per_frame / should_upmix(stm) ? 2 : 1 * frames; @@ +182,5 @@ > + > +void > +refill_with_resampling(cubeb_stream * stm, float * data, long needed) > +{ > + long got; Declare this where it's first used. @@ +184,5 @@ > +refill_with_resampling(cubeb_stream * stm, float * data, long needed) > +{ > + long got; > + > + /* Use a bit more input frames that strictly needed, so in the worst case, "Use more input frames than strictly necessary" @@ +190,5 @@ > + * during the next iteration. */ > + float rate = > + static_cast<float>(stm->stream_params.rate) / stm->mix_params.rate; > + > + long before_resampling = static_cast<long>(ceilf(rate * needed) + 1); Abstract the double-to-long ceilf + 1 conversion out to an appropriately named function and use it everywhere this conversion is currently open coded. @@ +192,5 @@ > + static_cast<float>(stm->stream_params.rate) / stm->mix_params.rate; > + > + long before_resampling = static_cast<long>(ceilf(rate * needed) + 1); > + > + long frame_request = before_resampling - stm->leftover_frame_count; frames_requested? @@ +208,5 @@ > + if (got != frame_request) { > + stm->draining = true; > + } > + > + uint32_t in_frames_bck, out_frames_bck; I think it'd be clearer to remove the _bck variables and use "before_resampling" and "needed" below directly instead. @@ +280,5 @@ > + > + bool is_playing = true; > + HANDLE wait_array[2] = {stm->shutdown_event, stm->refill_event}; > + HANDLE mmcss_handle = NULL; > + DWORD mmcss_task_index = 0; Move this inside the can_use_mmcss block since it's not used anywhere else. @@ +316,5 @@ > + } > + case WAIT_OBJECT_0 + 1: { /* refill */ > + BYTE* data; > + UINT32 padding; > + long available; Declare all of these where they're first used. @@ +318,5 @@ > + BYTE* data; > + UINT32 padding; > + long available; > + > + hr = stm->client->GetCurrentPadding(&padding); Check hr here rather than after the stm->draining test? @@ +321,5 @@ > + > + hr = stm->client->GetCurrentPadding(&padding); > + > + if (stm->draining) { > + if(padding == 0) { Space between if and (. @@ +330,5 @@ > + } > + > + available = stm->buffer_frame_count - padding; > + > + if (SUCCEEDED(hr)) { Invert the logic in this entire block so there's less nesting for the success case. @@ +338,5 @@ > + hr = stm->render_client->GetBuffer(available, &data); > + float* data_f = reinterpret_cast<float *>(data); > + > + if (SUCCEEDED(hr)) { > + (*stm->refill_function)(stm, data_f, available); Please use the stm->refill_function(...) syntax for calling function pointers everywhere, it's simpler. Cast data_f directly here rather than introducing a new declaration since it's only used once. @@ +375,5 @@ > + return 0; > +} > + > +void wasapi_destroy(cubeb * context); > +} // namespace anonymous @@ +380,5 @@ > + > +int wasapi_init(cubeb ** context, char const * context_name) > +{ > + HRESULT hr; > + IMMDeviceEnumerator * enumerator = NULL; Can we use CComPtr for this to avoid manual management? @@ +383,5 @@ > + HRESULT hr; > + IMMDeviceEnumerator * enumerator = NULL; > + > + hr = CoInitializeEx(NULL, COINIT_MULTITHREADED); > + No newline. Applies for each case where hr is set and then tested. @@ +390,5 @@ > + wasapi_destroy(*context); > + return CUBEB_ERROR; > + } > + > + *context = (cubeb *)calloc(1, sizeof(cubeb)); If the context (or stream) pointer is used a lot in cases where the parameter is a ptr-to-ptr, declare a derefed version and use that everywhere for ease of reading. In the cases where you're initializing a new context/stream, just set the ptr-to-ptr outparam at the end of the function (on success). @@ +435,5 @@ > + > + (*context)->revert_mm_thread_characteristics = > + (revert_mm_thread_characteristics_function*) GetProcAddress( > + (*context)->mmcss_module, "AvRevertMmThreadCharacteristics"); > + if (!(*context)->revert_mm_thread_characteristics) { Fetch both proc addresses first, then merge the error handling since we're always going to fail unless we can get both functions. Actually, you could also implement dummy functions for the case when avrt.dll isn't available and set the function pointers to the dummy functions in that case, then call set and revert as necessary without needing to test for their presence. @@ +470,5 @@ > +int > +wasapi_get_max_channel_count(cubeb * ctx, uint32_t * max_channels) > +{ > + HRESULT hr; > + IAudioClient * client; Can we use CComPtr for this to avoid manual management? @@ +471,5 @@ > +wasapi_get_max_channel_count(cubeb * ctx, uint32_t * max_channels) > +{ > + HRESULT hr; > + IAudioClient * client; > + WAVEFORMATEX* mix_format; Can we use CComHeapPtr for this to avoid manual management? @@ +507,5 @@ > + unsigned int latency, cubeb_data_callback data_callback, > + cubeb_state_callback state_callback, void * user_ptr) > +{ > + HRESULT hr; > + WAVEFORMATEX* mix_format; Can we use CComHeapPtr for this to avoid manual management? @@ +516,5 @@ > + > + /* 30ms in shared mode is the minimum we can get when using WASAPI */ > + if (latency < 30) { > + LOG("Latency too low: got %u (30ms minimum)", latency); > + return CUBEB_ERROR; INVALID_PARAMETER @@ +545,5 @@ > + } > + > + (*stream)->refill_event = CreateEvent(NULL, NULL, 0, NULL); > + > + if (!(*stream)->shutdown_event) { This should be checking refill_event. You can probably merge the error handling for these two as we'll always fail unless both succeed. @@ +563,5 @@ > + wasapi_stream_destroy(*stream); > + return CUBEB_ERROR; > + } > + > + /* We have to dinstinguish between the format the mixer uses, distinguish @@ +580,5 @@ > + > + float resampling_rate = static_cast<float>((*stream)->stream_params.rate) / > + (*stream)->mix_params.rate; > + > + if (resampling_rate != 1.0) { Check stream_params.rate != mix_params.rate here. Then only calculation and use the resampling rate ratio inside the if block. @@ +675,5 @@ > + wasapi_stream_destroy(*stream); > + return CUBEB_ERROR; > + } > + > + hr = (*stream)->audio_clock->GetFrequency(&(*stream)->clock_freq); Is it safe to fetch this only once at setup? MSDN suggests the AudioClock's frequency never changes over the lifetime of a stream on Vista; I'm not sure if that means it can change on other Windows versions. @@ +720,5 @@ > + } > + > + if (stm->upmix_buffer) { > + free(stm->upmix_buffer); > + stm->resampling_src_buffer = NULL; Should be nulling upmix_buffer, Having said that, it's not necessary to null any of these as there shouldn't be a situation where we try to use a stream after it's destroyed. It's also safe to call free on a NULL pointer, so no need to test for null before calling free for any of these pointers. @@ +754,5 @@ > +int wasapi_stream_stop(cubeb_stream * stm) > +{ > + assert(stm); > + > + HRESULT hr; Declare where it's used. @@ +756,5 @@ > + assert(stm); > + > + HRESULT hr; > + > + if (stm->shutdown_event) { It shouldn't be possible to call stop without a shutdown event, so just use it (or assert rather than test). @@ +804,5 @@ > + /* The GetStreamLatency method only works if the > + * AudioClient has been initialized. */ > + if (!stm->client) { > + return CUBEB_ERROR; > + } else { Don't need an else after a return. @@ +814,5 @@ > + > + return CUBEB_OK; > +} > + > +} // namespace anonymous @@ +816,5 @@ > +} > + > +} > + > +cubeb_ops const wasapi_ops = { I think this can be hidden by the anonymous namespace, since it's exposed via a pointer in cubeb *.
+ stm->thread = CreateThread(NULL, 0, wasapi_stream_render_loop, stm, 0, NULL); This might need to use _beginthreadex instead. Also, limit the stack size for this thread (I think we use 128kB elsewhere?).
(In reply to Matthew Gregan [:kinetik] from comment #6) > + stm->thread = CreateThread(NULL, 0, wasapi_stream_render_loop, stm, 0, > NULL); > > This might need to use _beginthreadex instead. Also, limit the stack size > for this thread (I think we use 128kB elsewhere?). 64k in cubeb_winmm.c, the media test suite passes just fine if I set 64k as well in cubeb_wasapi.cpp.
> @@ +77,5 @@ > > + set_mm_thread_characteristics_function * set_mm_thread_characteristics; > > + revert_mm_thread_characteristics_function * revert_mm_thread_characteristics; > > +}; > > + > > +extern cubeb_ops const wasapi_ops; > > This shouldn't need "extern"? It does if you want to declare it, but not define it, so you can actually define it below all the function definitions. > @@ +675,5 @@ > > + wasapi_stream_destroy(*stream); > > + return CUBEB_ERROR; > > + } > > + > > + hr = (*stream)->audio_clock->GetFrequency(&(*stream)->clock_freq); > > Is it safe to fetch this only once at setup? MSDN suggests the AudioClock's > frequency never changes over the lifetime of a stream on Vista; I'm not sure > if that means it can change on other Windows versions. It seems fine from my testing on try/locally. Other software are doing that, from what I read.
Attached patch patch v2, rollup (deleted) — Splinter Review
Still green locally and on try.
Attachment #755445 - Attachment is obsolete: true
Attachment #755445 - Flags: review?(kinetik)
Attachment #764773 - Flags: review?(kinetik)
Comment on attachment 764773 [details] [diff] [review] patch v2, rollup Thanks!
Attachment #764773 - Flags: review?(kinetik) → review+
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Depends on: 893538
Depends on: 894801
Depends on: 893307
Depends on: 899050
Depends on: 900178
Depends on: 923992
Depends on: 961475
Depends on: 970073
Depends on: 1123768
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: