Closed Bug 495040 Opened 15 years ago Closed 12 years ago

Implement playbackRate and related bits

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: kinetik, Assigned: padenot)

References

(Depends on 1 open bug, )

Details

(Keywords: dev-doc-complete, Whiteboard: parity-webkit, parity-ie, [evang-wanted])

Attachments

(4 files, 14 obsolete files)

(deleted), patch
kinetik
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
kinetik
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
We don't seem to have a bug filed on this already, so here's one. Bug 462953 (mozilla-central changeset 250cef7cf0d6) removed the skeletal support for this we had. When implementing this, we need to make sure bug 450115 does not recur. May also need to consider bug 454686, but that may not be relevant if reimplementing this from scratch.
It'd be really awesome if our playbackRate support could do audio timescale-pitch modification, so that watching a presentation at 2x retained the original pitch. QuickTime can do this, so Safari's playbackRate already supports this. There are GPL/LGPL libraries available that we might be able to use (e.g. SoundTouch, RubberBand), since implementing this from scratch would be a decent chunk of work.
I agree with comment 1. Pitch correction would be great. I regularly watch videos with other players at faster speeds and it's a big help when the player supports pitch correction.
Attached patch wip patch v0 (obsolete) (deleted) — Splinter Review
Attached patch wip patch v1 (obsolete) (deleted) — Splinter Review
This works, but is pretty hacked up and buggy. The patch adds SoundTouch to media/, which is an LGPL library that provides audio timescale-pitch modification, allowing playback at non-unity rates to retain the original pitch. A/V sync breaks sometimes if you change the playback rate during playback (it's mostly fine if you set your desired playback rate before playing). Setting playbackRate particularly high or low can cause frame dropping or other weird behaviour. Doesn't really work properly with the Waveform backend for rates > 1.0. Negative rates are not supported at all. The way I've wired SoundTouch into the audio code is pretty dumb, but it was the simplest way to do it for now. There's some inherent latency in the processing SoundTouch does, plus there's latency caused by the mBufferOverflow buffering mechanism. This interacts badly with playbackRate changing, and with the A/V sync mechanism. FWIW, Chrome has a simpler BSD licensed algorithm here: http://src.chromium.org/viewvc/chrome/trunk/src/media/filters/audio_renderer_algorithm_ola.cc
Attachment #406137 - Attachment is obsolete: true
Also blocks ietestcenter.
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: parity-webkit, [evang-wanted]
What is stopping us from landing playbackRate without pitch correction, and then doing pitch correction in a follow-up bug? This is feeling like how we didn't implement the |loop| attribute for a long time because we wanted to hold off until we could get seamless looping.
I'd be fine with leaving pitch correction until later; there are use cases for playbackRate that don't require it.
Supposing my college class used <video> to stream sessions (I've taken some which posted video online and then watched them later, although prior to the advent of <video>), it would be entirely unhelpful to listen to a professor discussing (say) oncogenes and anti-oncogenes with the voice of a chipmunk. I don't think pitch correction is in the same quality-of-implementation league as seamless looping.
(In reply to Jeff Walden (remove +bmo to email) from comment #14) > Supposing my college class used <video> to stream sessions (I've taken some > which posted video online and then watched them later, although prior to the > advent of <video>), it would be entirely unhelpful to listen to a professor > discussing (say) oncogenes and anti-oncogenes with the voice of a chipmunk. > I don't think pitch correction is in the same quality-of-implementation > league as seamless looping. Yes, that is a good use case for pitch correction. But I don't think pitch correction needs to block this feature. We are missing out on other entirely valid use cases that don't require pitch correction. And many users may be fine with a chipmunk voice until they find what they were looking for and then changing it to playbackRate=1 at that time.
How do you propose a site detect, using the standards it already knows about (like playbackRate itself), that playbackRate only half-works due to its lacking pitch correction? That seems like a substantial problem with considering a half-implementation.
(In reply to Jeff Walden (remove +bmo to email) from comment #16) > How do you propose a site detect, using the standards it already knows about > (like playbackRate itself), that playbackRate only half-works due to its > lacking pitch correction? That seems like a substantial problem with You knoe exactly how they'll do that. UA sniffing.
Actually, I think it's reasonably likely they won't do that. Which is precisely the problem.
Pitch correction is not required by the spec. I have filed a follow-up bug to add pitch correction. Let's not let a good solution wait for a perfect solution.
Pitch correction is not required by the spec (http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#effective-playback-rate). Also, as I said above, there are use cases where pitch correction is not necessary and may not be desirable to the content author. WebKit solves this by exporting an additional (prefixed) attribute "preservesPitch" (proposed to the WhatWG here: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/021100.html), which exposes what the implementation defaults to and can allow the content author to specify which behaviour they prefer. Perhaps we can do the same, and default preservesPitch to false until we have implemented pitch correction.
Ah, preservesPitch seems reasonable. Although I worry about it not being in the spec yet, and video interface web developers not knowing of its existence due to insufficient browser testing...
YouTube is now offering variable playback speed in their player for browsers that support the feature. Matthew: Do you think you could work on this?
Assignee: nobody → chris
Attached patch updated WIP (obsolete) (deleted) — Splinter Review
I've updated the patch to fit the latest state of the tree, and removed the use of SoundTouch, as it can be implemented in another ticket. Gonna leave the actual implementation of adjusting the video's playback speed to someone who knows that area of the code better.
Assignee: chris → nobody
I'm going to work on that feature during my internship (starting on the second of april, 2012).
Attached patch another prototype (obsolete) (deleted) — Splinter Review
I've had this version of the patch kicking around for a while--it might be a useful base to start from. It uses the resampler from Speex to do non-pitch-preserving playback rate adjustments. The top of the patch has a list of what's still missing (lots).
Assignee: nobody → paul
Depends on: 743720
Attached patch Patch v0 : Feature implementation (obsolete) (deleted) — Splinter Review
Here is an implementation of the playbackRate property. Pretty much everything works, apart from the backward playback part. Tested on mobile, desktop (every OS), but not B2G (for the nsRemotedAudioStream part). The constants at the top of nsHTMLMediaElement.cpp still have to be discussed : - When do we want to mute the sound because of too fast / too slow playback ? - What is the maximum/minimum playback rate that we want to support ?
Attachment #406169 - Attachment is obsolete: true
Attachment #606446 - Attachment is obsolete: true
Attachment #607819 - Attachment is obsolete: true
Attachment #625273 - Flags: review?(kinetik)
Attached patch Patch v0 : tests for the playbackRate property. (obsolete) (deleted) — Splinter Review
I'm not happy about the |setTimeout| part, but I don't really know how to proceed without it. These tests are mostly green on try. The only problem that remains is : > ok(playtime <= mediaTime, "[playbackRate] Duration of the fast rate section > should be different from the actual play time." + playtime + " " + mediaTime + > " " + t.token); failing rarely (on every platform), because the events are delayed to much, I suppose. I'm still trying to find a way to test the audio/video synchronization (maybe with canvas + Audio Data API).
Attachment #625277 - Flags: review?(kinetik)
Attachment #625279 - Flags: review?(kinetik)
To test, Youtube now has controls to change the playback rate in the html5 trial (http://youtube.com/html5). I also set up a test page with a variety of video only, video+audio, audio only media, here : http://paul.cx/public/playbackrate/. Builds will be available in a couple of hours at the following url : http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/paul@paul.cx-ae39e6bee12e
Comment on attachment 625273 [details] [diff] [review] Patch v0 : Feature implementation I found a last bug in this feature on the cubeb backend. Will upload a revised version soon.
Attachment #625273 - Flags: review?(kinetik)
Attachment #625277 - Flags: review?(kinetik)
Attachment #625279 - Flags: review?(kinetik)
Comment on attachment 625277 [details] [diff] [review] Patch v0 : tests for the playbackRate property. While you're updating the patches: checkPlaybackRate in test_playback_rate.html doesn't seem to be used.
Attached patch 625273: Patch v0.1 : Feature implementation (obsolete) (deleted) — Splinter Review
I finally took the time to change the last bits. This is green on try [1], apart from what is mentioned in comment 27. [1]: https://tbpl.mozilla.org/?tree=Try&rev=1b1ec2cc48ed
Attachment #625273 - Attachment is obsolete: true
Attachment #628872 - Flags: review?(kinetik)
I also made the various threshold a bit more tight in these tests.
Attachment #625277 - Attachment is obsolete: true
Attachment #628873 - Flags: review?(kinetik)
Attachment #625279 - Flags: review?(kinetik)
Thanks, I'll review this early next week. If you need it urgently, please let me know.
Status: NEW → ASSIGNED
Whiteboard: parity-webkit, [evang-wanted] → parity-webkit, parity-ie, [evang-wanted]
Attachment #625279 - Flags: review?(kinetik) → review+
Comment on attachment 628872 [details] [diff] [review] 625273: Patch v0.1 : Feature implementation Review of attachment 628872 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/html/content/public/nsHTMLMediaElement.h @@ +609,5 @@ > > + /** > + * Clamp the playbackRate to an acceptable value. > + */ > + double ClampPlaybackRate(double aPlaybackRate); Make this a standalone static function in the source since it doesn't use any member variables of the class. ::: content/html/content/src/nsHTMLMediaElement.cpp @@ +114,5 @@ > +static const double MAX_PLAYBACKRATE = 5.0; > +// Threshold above which audio is muted > +static const double THRESHOLD_HIGH_PLAYBACKRATE_AUDIO = 4.0; > +// Threshold above which audio is muted > +static const double THRESHOLD_LOW_PLAYBACKRATE_AUDIO = 0.5; Add a comment explaining the reasoning behind these limits (if there is one, otherwise note that they're arbitrary). @@ +3450,5 @@ > + if (aPlaybackRate == 0.0) { > + return aPlaybackRate; > + } > + if (NS_ABS(aPlaybackRate) < MIN_PLAYBACKRATE) { > + aPlaybackRate < 0 ? aPlaybackRate = -MIN_PLAYBACKRATE : aPlaybackRate = MIN_PLAYBACKRATE; The usual style for a ternary would be: aPlaybackRate = aPlaybackRate < 0 ? -MIN_PLAYBACKRATE : MIN_PLAYBACKRATE; But in this case I'd just return early: return aPlaybackRate < 0 ? -MIN_PLAYBACKRATE : MIN_PLAYBACKRATE; ::: content/media/Makefile.in @@ +14,5 @@ > LIBXUL_LIBRARY = 1 > XPIDL_MODULE = content_media > > +DEFINES += -DOUTSIDE_SPEEX > +DEFINES += -DRANDOM_PREFIX=speex I'd rather we didn't define these for all of content/media, so just define them before including speex_resampler.h in nsAudioStream.cpp. ::: content/media/nsAudioStream.cpp @@ +50,5 @@ > > +#if defined(ANDROID) > +#define RESAMPLER_QUALITY 3 > +#else > +#define RESAMPLER_QUALITY 7 Add a comment describing how these values were chosen. @@ +100,4 @@ > double mVolume; > void* mAudioHandle; > + // Pointer to an instance of a Speex resampler > + SpeexResamplerState* mResamplerState; Wrap this in an nsAutoRef with a specialized trait that calls speex_resampler_destroy. See nsOggCodecState.h for an example. @@ +489,5 @@ > PR_LOG(gAudioStreamLog, PR_LOG_ERROR, ("nsNativeAudioStream: sa_stream_open error")); > return NS_ERROR_FAILURE; > } > + > + mResamplerState = speex_resampler_init(aNumChannels, Can this be initialized lazily? It'd be nice not to pay the memory cost for this when the common case doesn't need it. @@ +656,5 @@ > } > > PRInt64 nsNativeAudioStream::GetPosition() > { > + return mAudioClock->GetPosition(); GetPosition and GetPositionInFrames are supposed to return the same position in different units, but GetPosition is adjusted for playback rate and GetPositionInFrames is not. GetPositionInFrames needs to be split into an internal version that reports the audio backend's position (which AudioClock can use) and an external version that provides the same behaviour as GetPosition. @@ +1181,5 @@ > + } > + > + PRUint32 bytesToCopy = resampledFrames * mBytesPerFrame; > + mAudioClock->UpdateWritePosition(resampledFrames); > + Can we do the resampling in DataCallback? That way the latency of rate changes is restricted to cubeb's stream latency, rather than that plus the size of mBuffer. ::: content/media/nsAudioStream.h @@ +26,5 @@ > + // Set the playback rate. > + // Called on the audio thread. > + void SetPlaybackRate(double aPlaybackRate); > + private: > + nsAudioStream* mAudioStream; Add a comment describing the lifetime of this, i.e. that the nsAudioStream holds a strong owning reference to this AudioClock, so this pointer is guaranteed to always be valid. @@ +160,3 @@ > int mChannels; > SampleFormat mFormat; > + AudioClock* mAudioClock; Why do we need to allocate this dynamically at all? Include it directly and then there's no need to worry about managing its lifetime. ::: content/media/nsBuiltinDecoder.cpp @@ +1050,5 @@ > + > +nsresult nsBuiltinDecoder::SetPlaybackRate(double aPlaybackRate) > +{ > + if (aPlaybackRate == 0) { > + mPlaybackRateIsNull = true; Rename mPlaybackRateIsNull to something that indicates its purpose directly so that it's obvious this signals that the decoder was paused as a result of the playback rate being set to zero and will be resumed when a non-zero playback rate is set. @@ +1054,5 @@ > + mPlaybackRateIsNull = true; > + Pause(); > + return NS_OK; > + } else { > + if (mPlaybackRateIsNull) { This can be merged into a single else if. ::: content/media/nsBuiltinDecoderStateMachine.h @@ +503,5 @@ > // Accessed only via the state machine thread. > TimeStamp mPlayStartTime; > > + // When the playbackRate changes, and there is no audio clock, it is necessary > + // to reset the mPlayStartTime. This is done next time the clock is queries, Typo. ::: dom/interfaces/html/nsIDOMHTMLMediaElement.idl @@ +64,5 @@ > readonly attribute nsIDOMTimeRanges played; > readonly attribute nsIDOMTimeRanges seekable; > + attribute double defaultPlaybackRate; > + attribute double playbackRate; > + attribute boolean mozPreservesPitch; Minor, but move these above played (and below paused) so that they're in the same order as the W3C's IDL. ::: dom/ipc/AudioParent.cpp @@ +202,5 @@ > + mStream->SetPlaybackRate(aPlaybackRate); > + return true; > +} > + > + Remove extra newline.
> ::: content/html/content/src/nsHTMLMediaElement.cpp > @@ +114,5 @@ > > +static const double MAX_PLAYBACKRATE = 5.0; > > +// Threshold above which audio is muted > > +static const double THRESHOLD_HIGH_PLAYBACKRATE_AUDIO = 4.0; > > +// Threshold above which audio is muted > > +static const double THRESHOLD_LOW_PLAYBACKRATE_AUDIO = 0.5; The second comment here should probably be: > // Threshold below which audio is muted
> ::: content/html/content/src/nsHTMLMediaElement.cpp > @@ +114,5 @@ > > +static const double MAX_PLAYBACKRATE = 5.0; > > +// Threshold above which audio is muted > > +static const double THRESHOLD_HIGH_PLAYBACKRATE_AUDIO = 4.0; > > +// Threshold above which audio is muted > > +static const double THRESHOLD_LOW_PLAYBACKRATE_AUDIO = 0.5; > > Add a comment explaining the reasoning behind these limits (if there is one, > otherwise note that they're arbitrary). Those numbers depend on the buffer size of the buffers in the callback based backends (cubeb, macos sydneyaudio). fwiw, youtube currently proposes in its interface, to change the playback rate to 0.25, 0.5, 1.5 and 2. I'll have to have a quick chat with jmspeex, to make sure those limits are reasonable if we want to do pitch correction afterwards (i.e. at what value should we mute the audio because the pitch shift algorithm cannot produce acceptable results). If we choose to support those low playback rate, we will have to increase the buffer size in those backends (from 1 second to 4 second if we choose to support playing media at 0.25x). This ensure that the logic on top of the nsAudioStreams remain unchanged, but at the cost of higher memory usage. We could also resize those buffers dynamically in a lazy fashion. > ::: content/media/nsAudioStream.cpp > @@ +50,5 @@ > > > > +#if defined(ANDROID) > > +#define RESAMPLER_QUALITY 3 > > +#else > > +#define RESAMPLER_QUALITY 7 > > Add a comment describing how these values were chosen. I just did a quick profiling/listening test on Linux, and I need to investigate on how to be able to profile on Android. I lowered the quality here with battery life in mind, considering the resampling code is pretty heavy on the CPU. I'll come to that this week, most probably. > @@ +1181,5 @@ > > + } > > + > > + PRUint32 bytesToCopy = resampledFrames * mBytesPerFrame; > > + mAudioClock->UpdateWritePosition(resampledFrames); > + > > Can we do the resampling in DataCallback? That way the latency of rate changes > is restricted to cubeb's stream latency, rather than that plus the size of > mBuffer. I started to do that: requesting less or more frames from the queue (depending on the playback rate), resampling into the buffer handed by cubeb, then I noticed that we have no control on the size of the buffer passed to DataCallback, hence, if we resample in the callback, it would have to go like this : > // Change the number of frames we will pull from the queue > PRUint32 bytesWanted = static_cast<PRUint32>(aFrames / playbackRate) * mBytesPerFrame; > (...) > // Actually pop the frames from the queue > mBuffer.PopElements(bytesWanted, ...); > (...) > // Get the final length of the buffer, in frames, after resampling. > outputSize = bytesWanted * playbackRate Here, |output_size| may not necessarily be aligned on |mBytesPerFrame|, so we will end up adding silence or dropping a frame. We could possibly change subtly the playback rate to end up aligned to |mBytesPerFrame|, but I'm not sure if it is desirable, and could produce audio glitches I suppose. Maybe we can align the cubeb buffer on the lower mBytesPerFrames boundaries after resampling and returning a value not equal to aFrames in the callback, but I'm afraid that will be understood as a drain by cubeb.
Per discussion with jmspeex on irc, a quality of 3 for mobile and 5 for desktop are fine if we can afford the computational complexity (which seem to be ok, according to a few test I've performed. Nothing really scientific, though). He also suggested we could lower the quality of the resampling if the video decoding is slow (we could use our existing machinery [1] to do that), to give it air. It could be worth on mobile, but we need to profile first to see if it really is a problem. [1] : http://mxr.mozilla.org/mozilla-central/source/content/media/nsBuiltinDecoderStateMachine.cpp#885
Attached patch Implement playbackRate and related bits r= (obsolete) (deleted) — Splinter Review
I've addressed review comments, and tried to move the processing in the callback, but I faced quite a lot of issues on latter. The speex resampler often does not give us the amount of frames asked for, but a little less. Hence, we have to oversize the input and output buffers, at the risk of having to put the remaining data in spillbuffers. This works just fine. But then the problems arise. Sometimes when I run the tests, and the playback rate goes from, say, 0.25 to 5, the resampling function gives us _way_ less frames than asked for, despite having more than enough input frames. Therefore, there is a jitter in the actual playback rate resulting from the frames computed by speex: the ratio of the number of input frames by the number of output frames will not be 5.0, more like 5.6 or something. This kind of playback rate jitter only occurs when the playback rate changes, so I assume we can't really do someting about it (Or maybe a bug in speex, I've no idea). That provokes an underrun, of course. We have two solutions: - Add another layer of buffers on top of cubeb to handle the processing and to hand exactly the correct amount of samples to the backend: if we find that we lack frames just before returning from the callback, do another pass of resampling. This increases memory usage, I'm not sure by how much, but likely more that the two spillbuffer I have right now. - Accept the fact that we will have little underruns when we change the playback rate. We are missing at most 40 frames or something, from 5.0 to 0.5 (which is the largest drift). The attached patch does the second solution, but is work in progress. The tests are green, though, and for a day to day usage, it works quite fine. Any ideas, or obvious mistake or false assumptions I'm making on how the resampling process is supposed to work?
Attachment #628872 - Attachment is obsolete: true
Attachment #628872 - Flags: review?(kinetik)
A little update on this feature. I've switched from the Speex's resampler to using the time stretcher of SoundTouch[1], so we get pitch correction for free. I've got a patch to import the necessary parts of the library (I've stripped out the parts we don't need) in the tree, and another one to use it for playback rate instead of the Speex's resampler. The result are encouraging so far, I can have a video playing at various playback rates using time stretching. Still have a couple problems to address, though, but nothing major. [1]: http://www.surina.net/soundtouch/
SoundTouch appears to be LGPL so has some contraints on how we use it. See <https://groups.google.com/d/msg/mozilla.legal/NVb3IT-Megc/gSKYqEWsJt8J> and maybe contact licensing@mozilla.org for clarification. From that post it looks like it may need to be built as a shared library and linked to: ----8<---- Any version of the LGPL from 2.0 upwards - "Or later version" allowed - 3rd party code only - Clearly-demarcated library - Code must be compiled such that it is dynamically linked - Code to live wherever in the tree is most appropriate ----8<----
Chris, I've been in touch with Gerv and other legal folks regarding the import of such a library, in bug 764495. For now, the patch I wrote to import links soundtouch statically in gkmedia, but it's only temporary, as I wanted to make sure SoundTouch was suitable for our use case before wasting time learning how to create a .so in our build system.
Depends on: 775319
Attached patch Patch using SoundTouch. (obsolete) (deleted) — Splinter Review
Here is a new patch that uses SoundTouch, and address the review comments. Using that new library gives pitch preservation for free (kind of), so I went ahead and wired the mozPreservesPitch property. I also removed the const from the ns*AudioStream::Write methods to do volume processing in place in the common case, in an effort to reduce all the buffer copying/allocation we are doing in the media code. This patch applies on top of patches from bug 779997 and bug 775319, but I've pushed to try (builds if needed: [0]), and it's green. [0]: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/paul@paul.cx-7d3d979e3fcb/
Attachment #640838 - Attachment is obsolete: true
Attachment #649911 - Flags: review?(kinetik)
Attached patch Add test for playbackRate. (obsolete) (deleted) — Splinter Review
The tests have been adjusted to always pass on try, and to test the mozPreservesPitch property.
Attachment #628873 - Attachment is obsolete: true
Attachment #628873 - Flags: review?(kinetik)
Attachment #649915 - Flags: review?(kinetik)
Attached patch Patch using SoundTouch. (obsolete) (deleted) — Splinter Review
(I forgot to qref a tiny change).
Attachment #649911 - Attachment is obsolete: true
Attachment #649911 - Flags: review?(kinetik)
Attachment #649917 - Flags: review?(kinetik)
Attachment #649915 - Attachment description: Update tests. → Add test for playbackRate.
Comment on attachment 649917 [details] [diff] [review] Patch using SoundTouch. Review of attachment 649917 [details] [diff] [review]: ----------------------------------------------------------------- I've reviewed everything except the major changes in nsAudioStream.{cpp,h}. I'll review those once the removal of const Write() is fixed. ::: content/html/content/src/nsHTMLMediaElement.cpp @@ +116,5 @@ > +static const double MAX_PLAYBACKRATE = 5.0; > +// Threshold above which audio is muted > +static const double THRESHOLD_HIGH_PLAYBACKRATE_AUDIO = 4.0; > +// Threshold under which audio is muted > +static const double THRESHOLD_LOW_PLAYBACKRATE_AUDIO = 0.5; My last review asked for a comment explaining the choice of these constants or a note that they have been selected arbitrarily. Please do this. @@ +3574,5 @@ > + > + if (mPlaybackRate < 0 || > + mPlaybackRate > THRESHOLD_HIGH_PLAYBACKRATE_AUDIO || > + mPlaybackRate < THRESHOLD_LOW_PLAYBACKRATE_AUDIO) { > + SetMuted(true); It's not clear to me from "When the effective playback rate is so low or so high that the user agent cannot play audio usefully, the corresponding audio must also be muted." in the spec means muting the element in a way visible to the media API. It seems wrong that a user setting mute on an element explicitly and then setting the playbackRate to a valid value would cause the element to unmute. @@ +3589,5 @@ > + > +/* attribute bool mozPreservesPitch; */ > +NS_IMETHODIMP nsHTMLMediaElement::GetMozPreservesPitch(bool* aPreservesPitch) > +{ > + *aPreservesPitch = mPreservesPitch; I think I'd rather query the decoder for this than store the state in multiple places. ::: content/media/nsAudioStream.cpp @@ +869,5 @@ > } > > PRInt64 nsRemotedAudioStream::GetPosition() > { > + // The child does the playback rate ajustments. Typo here and in GetPositionInFrames. @@ +1050,5 @@ > nsAutoRef<cubeb_stream> mCubebStream; > > PRUint32 mBytesPerFrame; > > + inline PRInt32 BytesToFrames(PRInt32 aBytes) { This and FramesToBytes don't need inline. @@ +1051,5 @@ > > PRUint32 mBytesPerFrame; > > + inline PRInt32 BytesToFrames(PRInt32 aBytes) { > + NS_ASSERTION(!(aBytes % mBytesPerFrame), |aBytes % mBytesPerFrame == 0| is clearer. ::: content/media/nsAudioStream.h @@ +10,5 @@ > #include "nsISupportsImpl.h" > #include "nsIThread.h" > #include "nsAutoPtr.h" > +#include "nsAutoRef.h" > +#include "soundtouch/SoundTouch.h" Can we get away with forward declaring SoundTouch and only including this in the .cpp? @@ +38,5 @@ > + public: > + AudioClock(nsAudioStream* aStream); > + // Update the number of samples that has been written in the audio backend. > + // Called on the state machine thread. > + void UpdateWritePosition(const PRUint32 aCount); const is meaningless here. @@ +50,5 @@ > + // Called on the audio thread. > + void SetPlaybackRate(double aPlaybackRate); > + // Initialize the clock with the current AudioStream. Need to be called > + // before querying the clock. Called on the audio thread. > + void Init(); This should come directly after the constructor. @@ +71,5 @@ > + // time in the media, not wall clock position. > + PRInt64 mOldBasePosition; > + // Write position at which the playbackRate change occured. > + PRInt64 mPlaybackRateChangeOffset; > + // The previous position reached in the media, used whenc compensating Typo. @@ +144,5 @@ > // Write audio data to the audio hardware. aBuf is an array of frames in > // the format specified by mFormat of length aCount. If aFrames is larger > // than the result of Available(), the write will block until sufficient > // buffer space is available. > + virtual nsresult Write(void* aBuf, PRUint32 aFrames) = 0; > I also removed the const from the ns*AudioStream::Write methods to do volume processing in place in the common case, in an effort to reduce all the buffer copying/allocation we are doing in the media code. This isn't the right approach. The audio stream should not be modifying data owned by something else in an API that looks like a pure data consumer--especially when there's no documentation that this is happening. The queued audio is already used more than once before the decoder disposes of it (see nsBuiltinDecoderStateMachine::PlayFromAudioQueue, where it writes the data to the nsAudioStream, then hands it off to mEventManager for use in the AudioAvailable events). Just leave it as is for now and live with the extra copies when playbackRate is not 1.0. In the future, we can look at avoiding copies by using a scheme where the nsAudioStream can receive ownership of the audio data when the decoder knows it won't need it again. @@ +191,3 @@ > int GetChannels() { return mChannels; } > SampleFormat GetFormat() { return MOZ_AUDIO_DATA_FORMAT; } > + nsresult EnsureResamplerInitialized(); This isn't implemented anywhere. @@ +191,4 @@ > int GetChannels() { return mChannels; } > SampleFormat GetFormat() { return MOZ_AUDIO_DATA_FORMAT; } > + nsresult EnsureResamplerInitialized(); > + nsresult EnsureTimeStretcherInitialized(); Seems like this could just return a bool. @@ +205,5 @@ > + // Output rate in Hz (characteristic of the playback rate) > + int mOutRate; > + double mPlaybackRate; > + // True if the we are timestretching, false if we are resampling. > + bool mPreservesPitch; Seems like these are duplicated in mAudioClock. Rather than duplicating state, just store it in one place and let other users query it from that location. Maybe move it all into AudioClock? ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +837,5 @@ > + double playbackRate; > + { > + ReentrantMonitorAutoEnter mon(mDecoder->GetReentrantMonitor()); > + playbackRate = mPlaybackRate; > + } There's an assertion that we're already holding this monitor a few lines above this. @@ +903,5 @@ > ReentrantMonitorAutoExit exitMon(mDecoder->GetReentrantMonitor()); > TimeStamp start = TimeStamp::Now(); > videoPlaying = mReader->DecodeVideoFrame(skipToNextKeyframe, currentTime); > decodeTime = TimeStamp::Now() - start; > + playbackRate = mPlaybackRate; And here you're fetching it without the decoder monitor held at all. @@ +1043,5 @@ > + mAudioStream->SetPreservesPitch(preservesPitch); > + playbackRate = mPlaybackRate; > + if (playbackRate != 1.0) { > + NS_ASSERTION(playbackRate != 0, > + "Don't set the playbackRate to 0 in the nsAudioStreams"); "in the nsAudioStreams" (here and below) is slightly odd, maybe "on an nsAudioStream" instead? @@ +2263,2 @@ > { > + if (mPlayStartTime.IsNull()) { Is this ever called when this can be true? None of the callers are checking their results of calls to this for -1. @@ +2269,4 @@ > } > > + { > + ReentrantMonitorAutoEnter mon(mDecoder->GetReentrantMonitor()); Just assert that the monitor is already held at the top of this function. @@ +2285,5 @@ > + } > + return mPlayDuration + mStartTime; > +} > + > +PRInt64 nsBuiltinDecoderStateMachine::GetClock() { Needs to assert the appropriate monitor is held and is being called from the expected thread(s). @@ +2353,5 @@ > // Current frame has already been presented, wait until it's time to > // present the next frame. > if (frame && !currentFrame) { > + PRInt64 now = IsPlaying() ? clock_time : mPlayDuration; > + No newline. @@ +2358,5 @@ > remainingTime = frame->mTime - mStartTime - now; > } > } > > + No newline. @@ +2709,5 @@ > mDecoder->GetReentrantMonitor().AssertCurrentThreadIn(); > mEventManager.NotifyAudioAvailableListener(); > } > + > +nsresult nsBuiltinDecoderStateMachine::SetPlaybackRate(double aPlaybackRate) This doesn't return any error values, and none of the callers check the return value anyway. Even if they did, it seems like this could just return a bool. @@ +2735,5 @@ > + > + return NS_OK; > +} > + > +nsresult nsBuiltinDecoderStateMachine::SetPreservesPitch(bool aPreservesPitch) As above. ::: content/media/nsBuiltinDecoderStateMachine.h @@ +590,5 @@ > + > + // Get the video stream position, taking the |playbackRate| change into > + // account. This is a position in the media, not the duration of the playback > + // so far. > + PRInt64 GetVideoStreamPosition(); Most of the class declarations in this code separate the member functions from the member variables, so please do the same with this. @@ +593,5 @@ > + // so far. > + PRInt64 GetVideoStreamPosition(); > + > + // video position, in us > + PRInt64 mVideoPosition; Needs a proper comment, including thread safety. Actually, it looks like this is never initialized or used, so just remove it. ::: content/media/test/manifest.js @@ +367,5 @@ > > // Number of tests to run in parallel. Warning: Each media element requires > // at least 3 threads (4 on Linux), and on Linux each thread uses 10MB of > // virtual address space. Beware! > +var PARALLEL_TESTS = 1; Was this included accidentally? ::: dom/ipc/AudioParent.cpp @@ +42,5 @@ > } > > NS_IMETHOD Run() > { > + mOwner->Write(const_cast<char*>(mData.get()), mFrames); Casting away const makes awfully large assumptions about nsCString's implementation.
Attachment #649917 - Flags: review?(kinetik) → review-
Comment on attachment 649915 [details] [diff] [review] Add test for playbackRate. Review of attachment 649915 [details] [diff] [review]: ----------------------------------------------------------------- Add a test that "volumechange" is not fired when playbackRate is set to a value that causes audio to be muted. ::: content/media/test/manifest.js @@ +24,5 @@ > { name:"bogus.duh", type:"bogus/duh" } > ]; > > +// Used by test_playback_rate.html > +var gPlaybackRate = [ Can you use one of the existing lists? If not, add a comment explaining why. ::: content/media/test/test_playback_rate.html @@ +20,5 @@ > + return false; > +} > + > +function checkPlaybackRate(wallclock, media, expected, threshold) { > + var t = threshold || 0.35; t is never used. @@ +31,5 @@ > +let VERY_SLOW_RATE = 0.1, > + SLOW_RATE = 0.25, > + FAST_RATE = 5, > + VERY_FAST_RATE = 20, > + NULL_RATE = 0.0, Make a note that the values are expected to match those in nsHTMLMediaElement.cpp. @@ +32,5 @@ > + SLOW_RATE = 0.25, > + FAST_RATE = 5, > + VERY_FAST_RATE = 20, > + NULL_RATE = 0.0, > + THRESHOLD = 0.3; Declare threshold separately, it's not related to the rates. Also, I'm not sure I see the value of having threshold declared here, and then only used in one place... with a different default value in checkPlaybackRate and another different value passed into checkPlaybackRate from ontimeupdate. Can you use a common threshold in all of those places, or at least have a constant for each with an explanatory name?
Attached patch Implement playbackRate and related bits r= (obsolete) (deleted) — Splinter Review
This patch addresses all the comment, but see below: > @@ +3589,5 @@ > > + > > +/* attribute bool mozPreservesPitch; */ > > +NS_IMETHODIMP nsHTMLMediaElement::GetMozPreservesPitch(bool* aPreservesPitch) > > +{ > > + *aPreservesPitch = mPreservesPitch; > > I think I'd rather query the decoder for this than store the state in > multiple places. I'd prefer to do volume, muted, playback rate and preserves pitch the same way. I can file a bug to change all if you think it's better, but I like the fact that we can get the volume without having to grab a monitor in the state machine. > ::: content/media/nsAudioStream.h > @@ +10,5 @@ > > #include "nsISupportsImpl.h" > > #include "nsIThread.h" > > #include "nsAutoPtr.h" > > +#include "nsAutoRef.h" > > +#include "soundtouch/SoundTouch.h" > > Can we get away with forward declaring SoundTouch and only including this in > the .cpp? The definition is needed to use it as an AutoRefTrait. Also, I had no chance making nsAutoRef working without a definition.
Attachment #653827 - Flags: review?(kinetik)
Attachment #649917 - Attachment is obsolete: true
Updated tests, still green on all platforms.
Attachment #653832 - Flags: review?(kinetik)
Attachment #649915 - Attachment is obsolete: true
Attachment #649915 - Flags: review?(kinetik)
Comment on attachment 653832 [details] [diff] [review] Implement playbackRate and related bits (Tests) r= Review of attachment 653832 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/test/test_playback_rate.html @@ +46,5 @@ > + var delta = t.oldCurrentTime, > + delta_wallclock = (t.timestamp - t.startTimestamp - t.bufferingTime) / 1000; > + > + t.mozPreservesPitch = false; > + is(t.mozPreservesPitch, false, "We we disable the pitch preservation, it should appear as such."); "we" is doubled. @@ +134,5 @@ > + is(t.defaultPlaybackRate, SLOW_RATE, > + "The default playback rate shoud be "+SLOW_RATE+"." + t.token); > + ok(delta_wallclock > delta , "We are effectively slowing down playback. (" + delta_wallclock + ", " + delta + ")"); > + if (t.skippedFastPart) { > + is(t.ratechangecount, 7, "We should have received 8 \"ratechange\" events."); 7?
Attachment #653832 - Flags: review?(kinetik) → review+
Comment on attachment 653827 [details] [diff] [review] Implement playbackRate and related bits r= Review of attachment 653827 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, r+ with the following comments addressed. ::: content/html/content/src/nsHTMLMediaElement.cpp @@ +108,5 @@ > > // Number of milliseconds between timeupdate events as defined by spec > #define TIMEUPDATE_MS 250 > > +// Those constants are arbitrary These @@ +113,5 @@ > +// Minimum playbackRate for a media > +static const double MIN_PLAYBACKRATE = 0.25; > +// Maximum playbackRate for a media > +static const double MAX_PLAYBACKRATE = 5.0; > +// Thos are the limits beyonds which SoundTouch does not perform too well and when These @@ +711,5 @@ > return NS_OK; > SetPlayedOrSeeked(false); > mIsRunningLoadMethod = true; > AbortExistingLoads(); > + SetPlaybackRate(mDefaultPlaybackRate); Also need to call this in MozLoadFrom and possibly LoadWithChannel. ::: content/media/nsAudioStream.cpp @@ +150,5 @@ > PRUint32 aBytesPerFrame) > { > mAudioChild = aChild; > mBytesPerFrame = aBytesPerFrame; > + mBuffer.Assign((char*)aBuf, aNumberOfFrames * aBytesPerFrame); No need to cast const away now. @@ +438,5 @@ > NS_DispatchToMainThread(event); > } > } > > +bool nsAudioStream::EnsureTimeStretcherInitialized() It feels like the results returned from this are inverted. I'd expect EnsureTimeStretcherInitialized() returning true meant that it is initialized. @@ +448,5 @@ > + mTimeStretcher.own(state); > + } > + if (!mTimeStretcher) { > + return true; > + } Slightly simpler: if (!state) { return false; } mTimeStretcher.own(state); @@ +458,5 @@ > + > +nsresult nsAudioStream::SetPlaybackRate(double aPlaybackRate) > +{ > + NS_ASSERTION(aPlaybackRate > 0.0, > + "Can't handle negative or null playbackrate in the AudioStream."); Indent the string to the (, please. (This applies everywhere you've added an assertion.) @@ +462,5 @@ > + "Can't handle negative or null playbackrate in the AudioStream."); > + // Avoid instantiating the resampler if we are not changing the playback rate. > + if (aPlaybackRate == mAudioClock.GetPlaybackRate()) { > + return NS_OK; > + } Do we need a case here to handle playbackRate returning to 1.0 so that we release mTimeStretcher and avoid extra processing overhead? I guess this is tricky because you need to feed some samples into SoundTouch to cover the transition between rates? @@ +483,5 @@ > +{ > + // Avoid instantiating the timestretcher instance if not needed. > + if (aPreservesPitch == mAudioClock.GetPreservesPitch()) { > + return NS_OK; > + } Same question as above, but for toggling preservesPitch when playbackRate == 1.0. @@ +577,5 @@ > + 32767 : > + short(scaled_value); > + } > + } > +#else // Fixed point audio, no need to convert. Just say "Integer audio" here, fixed point usually refers to something else. @@ +645,2 @@ > } > + mTimeStretcher->putSamples(toWrite, aFrames); Does SoundTouch have a fixed size internal buffer for samples, such that putSamples will eventually block? Or will it buffer everything we feed it? Write() is expected to be blocking once the audio backend's buffers has filled. @@ +650,5 @@ > + PRUint32 framesAvailable = mTimeStretcher->receiveSamples(data, numFrames); > + NS_ASSERTION(mTimeStretcher->numSamples() == 0, > + "We did not get all the data from the SoundTouch pipeline."); > + // It is possible to have nothing to write: the data are in the processing > + // pipeline, and will be written to the backend next time. This ends up being quite painful, since it's possible there won't be another Write() for a while (if the audio loop sleeps, or if we're at the end of the stream), causing data to be truncated. The former case is difficult to handle, since the audio stream has no knowledge of Write() scheduling. The latter we can fix by flushing the remaining samples from SoundTouch inside Drain(). Given that we're trying to ditch this code and switch to nsBufferedAudioStream (which doesn't suffer these types of problems due to using a pull model), don't worry about fixing this if it's difficult, but please do file a followup bug so we've got a way to track it until the code is removed. @@ +1080,5 @@ > + > + PRInt32 FramesToBytes(PRInt32 aFrames) { > + return aFrames * mBytesPerFrame; > + } > + This should both use PRUint32. @@ +1197,5 @@ > const PRUint8* src = static_cast<const PRUint8*>(aBuf); > PRUint32 bytesToCopy = aFrames * mBytesPerFrame; > > while (bytesToCopy > 0) { > + PRInt32 available = NS_MIN(bytesToCopy, mBuffer.Available()); This should remain PRUint32. @@ +1361,5 @@ > +{ > + PRUint8* wpos = reinterpret_cast<PRUint8*>(aBuffer); > + > + // Flush the timestretcher pipeline, if we were playing using a playback rate > + // different of 1.0. "other than 1.0". @@ +1367,5 @@ > + if (mTimeStretcher && mTimeStretcher->numSamples()) { > + flushedFrames = mTimeStretcher->receiveSamples(reinterpret_cast<SampleType*>(wpos), aFrames); > + wpos += FramesToBytes(flushedFrames); > + } > + PRUint32 toPop_b = FramesToBytes(aFrames - flushedFrames); Call this toPopBytes. @@ +1383,5 @@ > +long > +nsBufferedAudioStream::GetTimeStretched(void* aBuffer, long aFrames) > +{ > + long processedFrames = 0; > + if(EnsureTimeStretcherInitialized()) { Space between if and (. @@ +1384,5 @@ > +nsBufferedAudioStream::GetTimeStretched(void* aBuffer, long aFrames) > +{ > + long processedFrames = 0; > + if(EnsureTimeStretcherInitialized()) { > + return 0; Return -1 to indicate an error. @@ +1388,5 @@ > + return 0; > + } > + PRUint8* wpos = reinterpret_cast<PRUint8*>(aBuffer); > + double playbackRate = static_cast<double>(mInRate) / mOutRate; > + PRUint32 toPop_b = FramesToBytes(ceil(aFrames / playbackRate)); Call this toPopBytes. @@ +1409,5 @@ > + } > + PRUint32 receivedFrames = mTimeStretcher->receiveSamples(reinterpret_cast<SampleType*>(wpos), aFrames - processedFrames); > + wpos += FramesToBytes(receivedFrames); > + processedFrames += receivedFrames; > + } while(processedFrames < aFrames && !lowOnBufferedData); Space between while and (. @@ +1441,2 @@ > > // Adjust bytesWanted to fit what is available in mBuffer. bytesWanted was removed, update the comment. @@ +1464,2 @@ > if (mState != DRAINING) { > + PRUint8* rpos = static_cast<PRUint8*>(aBuffer) + (aFrames - underrunFrames) * mBytesPerFrame; Use FramesToBytes helper here. And anywhere else in the class that's currently doing this conversion or the inverse manually. ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +900,5 @@ > ReentrantMonitorAutoExit exitMon(mDecoder->GetReentrantMonitor()); > TimeStamp start = TimeStamp::Now(); > videoPlaying = mReader->DecodeVideoFrame(skipToNextKeyframe, currentTime); > decodeTime = TimeStamp::Now() - start; > + playbackRate = mPlaybackRate; This is being read without the monitor held. It seems like you can get rid of the playbackRate decl and use mPlaybackRate directly everywhere in this function. @@ +1040,5 @@ > + mAudioStream->SetPreservesPitch(preservesPitch); > + playbackRate = mPlaybackRate; > + if (playbackRate != 1.0) { > + NS_ASSERTION(playbackRate != 0, > + "Don't set the playbackRate to 0 on an nsAudioStream."); Indent the string to the (, please. @@ +1100,5 @@ > mAudioStream->SetVolume(volume); > } > + if (setPlaybackRate) { > + NS_ASSERTION(playbackRate != 0, > + "Don't set the playbackRate to 0 in the nsAudioStreams"); Indent the string to the (, please. ::: dom/ipc/AudioChild.cpp @@ +20,5 @@ > mIPCOpen(true), > mDrained(false) > { > MOZ_COUNT_CTOR(AudioChild); > +} Keep the empty line. ::: dom/ipc/AudioParent.cpp @@ +42,5 @@ > } > > NS_IMETHOD Run() > { > + mOwner->Write(const_cast<char*>(mData.get()), mFrames); No need for this change now that the no-const Write() was reverted.
Attachment #653827 - Flags: review?(kinetik) → review+
What's the status of this? Is it being worked on?
Chris, it is. To be specific I've ran into very obscure test failures on try (the kind you can't reproduce locally). I now know what happens, but it was a long process (instrument a build, push to try, wait, etc.). From what I've gathered, it seems that the test (which is part of the webgl conformance test suite, uses a video, but does not change the playback rate) is wrong. If I fix it, it's green all the time. The patch is finished, periodically rebased against tip, and is not landed only because of that test failure. However, I want to double check everything, and I've been side tracked by other stuff. I've got a message in a tab that explains the problem and the solution, that I've been willing to finish for a couple days, but I keep getting distracted by other stuff. Also, we need the library to land, and for that khuey's r+, but it should not be too long.
(This is the message I mention in the previous comment). Update on this, I've been debugging a failing test since (in the webgl conformance test suite, the test right after the test tex-image-and-sub-image-2d-with-video.html, that has a video that loops and is never paused) the last weeks. What I found so far: - Without those patches, the video does not play on try (I've enabled the NSPR log on try, and I see no trace of AdvanceFrame setting a new image). I guess it successfully sets the first image (this is done on another code path), because it can upload the video element to the texture, read the pixels, that are correct. - With those patches, the video plays (I see a lot of message "Decoder playing video frame XXX"), the test in which the video is is successful, and then the next tests times out. Since the video is not stopped, it continue to play for a while, loops several time. - In both cases the element (+ state machine, decoder, and associated threads) are released _long_ after the test is finished, because the webgl test never pauses the video, that has the loop attribute. They are released approximately at the same time, around "conformance/textures/texture-size-cube-maps.html". I checked that with those patches, everything is released like it should be. If I force the release of everything just before the function that ends the test is called (by doing |videoElement.src = "";|), everything is green all the time. My best guess at that point is that those patches, by enabling the video to play instead of not playing (which is certainly not indented, I guess, because |loop| is set to make sure there always is a frame that shows the color when the frame is uploaded) starves the main thread that does not have time to execute the other test before the timeout. All this only happens on Linux, only on try server. I don't know if this totally makes sense yet. I want to double check the way this patch changes the clock, because it might be what causes the difference. IIRC, the video present in this test also has the characteristic of having an audio stream that is _way_ shorter that the video stream. That might cause the problem we see.
Thanks for the update!
Try changing the WebGL test video to a different video, say one with no audio track at all.
So for some reason this passes try 100% of the time, now, after a lot of rebasing and a month of vacation. This is rebased on top of eshan, cpearce and roc's refactoring of content/media. I've tested that it still works, using boths cubeb and sydneyaudio. The part that has changed the most is content/media/AudioStream.cpp. Everything else is just code moved around.
Attachment #682167 - Flags: review?(kinetik)
Attachment #653827 - Attachment is obsolete: true
Attachment #653832 - Attachment is obsolete: true
Comment on attachment 682167 [details] [diff] [review] Implement playbackRate and related bits Review of attachment 682167 [details] [diff] [review]: ----------------------------------------------------------------- You'll need to rebase again, I get a bunch of rejects due to bug 795237 landing. ::: content/media/AudioStream.h @@ +22,5 @@ > + static void Release(soundtouch::SoundTouch* resamplerState) { > + delete resamplerState; > + resamplerState = nullptr; > + } > +}; Since this only needs a simple |delete|, you can use nsAutoPtr instead and save a few lines of code.
Attachment #682167 - Flags: review?(kinetik) → review+
Depends on: 814532
Depends on: 814533
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Depends on: 814708
Depends on: 815107
In this day, Firefox nightly 20.0a1 (2012-11-26) have 42 test passed/47. Firefox fail for: * Seeking to unbuffered position with continuous playback after seeking (MP3) * Seeking to unbuffered position with continuous playback after seeking (AAC) * Supports MP3 format * Supports AAC format => Normal: problems with patent. It's work if you choose Ogg. * Event "stalled" I can't find a bug for it in bigzilla. Firefox 17.0 have 39/47. The difference it's for: * Property "playbackRate" * Property "defaultPlaybackRate" * Event "ratechange"
Depends on: 815106
Depends on: 819377
Depends on: 819828
Depends on: 826349
Depends on: 822952
Depends on: 846612
Depends on: 821737
Depends on: 1167260
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: