Closed Bug 570904
Opened 14 years ago • Closed 14 years ago
Implement buffered time ranges for WebM
Categories: Core :: Audio/Video, defect
Status: RESOLVED FIXED
Tracking: blocking2.0 betaN+
People: Reporter: kinetik; Assigned: kinetik
References: Blocks 1 open bug
Attachments (3 files, 9 obsolete files)
patch (deleted) | cajbir: review+
patch (deleted) | kinetik: review+
patch (deleted)
Spinning this off from bug 462957, which contains the general DOM support and implementations for Ogg and Wave.
Assignee
Comment 1 • 14 years ago
This works, but isn't complete:
1. Need to read the segment timecode scale at startup.
2. End time needs to be calculated using last cached block in end cluster.
3. Need to do more to verify parser sync.
4. Performance is unacceptable while the media cache is downloading.
Other issues (these also affect the Ogg backend, I think):
1. The local file media stream fires assertions when called on the main thread.
2. Will get horribly confused with a multi-segment file. Ogg has some protection via stream serial numbers.
The performance issue is strange and needs investigation. With every file tested so far, calculating the buffered ranges takes around 1500ms while the media cache is downloading, but drops to 30ms as soon as the download is suspended.
Assignee
Comment 2 • 14 years ago
The perf issue was simply the number of small cached reads performed while searching for parser sync. The implementation in the current patch searched byte by byte, which is quite expensive with our cached reader implementation, as it takes locks and issues syscalls to read from a file on disk.
Unlike Ogg pages (which can be no larger than ~64kB), WebM clusters can be almost any size; typical sizes are 200kB-500kB, but I have a test file with a 4MB cluster. The current patch was issuing around 100k reads before finding sync. Using an intermediate buffer, I've reduced this to 1-10 (larger) reads. Performance is typically < 0.5ms, but I'm still seeing up to 5ms sometimes when testing against YouTube.
Given that GetBuffered() may be called very frequently (e.g. it seems as if YouTube is calling it via timeupdate events, which we fire every frame), taking 5ms on the main thread to calculate the time ranges probably isn't acceptable.
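For illustration, a hedged sketch of the intermediate-buffer approach (not the actual patch: FindClusterSync, ReadFromCache, and the 16kB buffer size are invented for this example). Each locked, syscall-backed cache read now services thousands of byte comparisons while scanning for the Cluster ID:

// Hypothetical sketch: scan [aOffset, aEnd) for the WebM Cluster ID
// (0x1F43B675), reading from the media cache in large chunks so that
// one expensive read covers many bytes. Returns the offset of the
// first byte of the ID, or -1 if no sync point was found.
static const unsigned char CLUSTER_ID[] = { 0x1F, 0x43, 0xB6, 0x75 };

PRInt64 FindClusterSync(PRInt64 aOffset, PRInt64 aEnd)
{
  const PRUint32 BUFFER_SIZE = 16 * 1024;
  unsigned char buffer[BUFFER_SIZE];
  PRUint32 matched = 0;  // bytes of CLUSTER_ID matched so far

  while (aOffset < aEnd) {
    PRInt64 remaining = aEnd - aOffset;
    PRUint32 toRead = remaining < BUFFER_SIZE ? PRUint32(remaining)
                                              : BUFFER_SIZE;
    // ReadFromCache is a stand-in for the cached reader (which takes a
    // lock and may issue a syscall); it is now called once per chunk.
    PRUint32 read = ReadFromCache(buffer, aOffset, toRead);
    if (read == 0) {
      return -1;
    }
    for (PRUint32 i = 0; i < read; ++i) {
      // This simple fallback is valid because 0x1F occurs only at
      // position 0 of the ID.
      matched = (buffer[i] == CLUSTER_ID[matched]) ? matched + 1
              : (buffer[i] == CLUSTER_ID[0]) ? 1 : 0;
      if (matched == sizeof(CLUSTER_ID)) {
        // The ID ended at aOffset + i; it began 3 bytes earlier.
        return aOffset + i + 1 - PRInt64(sizeof(CLUSTER_ID));
      }
    }
    aOffset += read;
  }
  return -1;
}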
Updated • 14 years ago
Keywords: dev-doc-needed
Assignee
Comment 3 • 14 years ago
Adds a method to query the timecode scale from the nestegg context.
Attachment #450056 - Attachment is obsolete: true
Assignee
Comment 4 • 14 years ago
Implement buffered for WebM:
- Adds a data-discarded notification to the media cache and feeds it through to the decoder's reader.
- Fixes a hang in nsMediaFileStream when trying to read past EOF.
- Adds a hunk of ugly code to nsWebMReader to grovel through WebM files to fetch timecodes.
There are a bunch of optimizations that can be done to this, such as only recomputing the time ranges that are in byte ranges that have changed, but it's not yet clear how much optimization effort is worthwhile for this feature. On my current machine a typical call (that isn't returning a cached value) takes around 0.5ms.
I've enabled test_buffered in this patch, but it currently fails because the end buffered time is not identical to the duration. That's a real bug: the end time reported is actually the start time of the last frame. I need to think about the correct way to fix this, then I'll post an updated patch.
Comment 5 • 14 years ago
Removing dev-doc-needed, since this is already documented generally and isn't WebM-specific.
Keywords: dev-doc-needed
Assignee
Updated • 14 years ago
Attachment #466223 - Attachment is obsolete: true
Assignee
Updated • 14 years ago
Attachment #466225 - Attachment is obsolete: true
blocking2.0: --- → betaN+
Assignee
Comment 6 • 14 years ago
Assignee
Comment 7 • 14 years ago
Not quite ready for review, but this works.
Assignee
Comment 8 • 14 years ago
Attachment #472536 - Attachment is obsolete: true
Attachment #472950 - Flags: review?(roc)
Assignee
Updated • 14 years ago
Attachment #472535 - Flags: review?(chris.double)
Comment 9 • 14 years ago
+ virtual void NotifyDataArrived(const char *aBuffer, PRUint32 aLength, PRUint32 aOffset);
no space between char and *
+ // consumes blocks. A new parser is created for each distinct byte range
+ // in the media cache and begins parsing from the first WebM cluster
+ // within that range. Old parsers are destroyed when their range merges
Well, the byte range might not really be in the media cache at any given time. I'd just say "A new parser is created for each distinct byte range of data received"
+ void Run(const unsigned char* aBuffer, PRUint32 aLength,
+ nsTArray<TimeDataOffset>& aMapping);
I think "Append" would be a better name.
+ PRInt64 mOffset;
I think "mCurrentOffset"
+ enum State {
+ SKIP_DATA,
+ CLUSTER_SYNC,
+ READ_VINT,
+ READ_VINT_REST,
+ TIMECODE_SYNC,
+ READ_CLUSTER_TIMECODE,
+ ANY_BLOCK_SYNC,
+ BLOCK_SYNC,
+ READ_BLOCK,
+ READ_BLOCK_TIMECODE
+ };
Better document these a bit.
I think ClusterParser could be moved to its own file. It's nice and isolated.
+ PRUint32 mIDPos;
+
+ PRUint64 mVInt;
+ PRUint32 mVIntLength;
+ PRUint32 mVIntLeft;
+
+ PRUint64 mBlockSize;
+
+ PRUint64 mClusterTimecode;
+
+ PRInt64 mBlockOffset;
+ PRInt16 mBlockTimecode;
+ PRUint32 mBlockTimecodeLength;
+
+ PRUint32 mSkipBytes;
These all need a little bit of explanation, even if it's just a reference to a spec.
I think you could just overload operators instead of defining your own Comparators.
+ // Merge parsers with overlapping regions and clean up the dead remnants.
Maybe this would be a little simpler if you did
PRUint32 i = 0;
while (i + 1 < mRangeParser.Length()) {
  if (mRangeParser[i].mCurrentOffset >= mRangeParser[i + 1].mStartOffset) {
    mRangeParser[i + 1].mStartOffset = mRangeParser[i].mStartOffset;
    mRangeParser.RemoveElementAt(i);
  } else {
    ++i;
  }
}
+PRBool nsWebMReader::CalculateBufferedForRange(nsTimeRanges* aBuffered,
+ PRInt64 aStartOffset, PRInt64 aEndOffset)
Why return a value? You don't use it.
+ // Find the first data->time map entry at or after aStartOffset.
+ PRUint32 start;
+ mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aStartOffset, 0), start,
+ TimeDataOffsetComparator());
Aren't you finding the first entry at or before aStartOffset?
+ // Find the first data->time map entry at or after aEndOffset.
+ PRUint32 end;
+ mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aEndOffset, 0), end,
+ TimeDataOffsetComparator());
Ditto?
Assignee
Comment 10 • 14 years ago
(In reply to comment #9)
> + // Find the first data->time map entry at or after aStartOffset.
> + PRUint32 start;
> + mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aStartOffset, 0), start,
> + TimeDataOffsetComparator());
>
> Aren't you finding the first entry at or before aStartOffset?
It's returning an index at which inserting aStartOffset would leave the array sorted. I don't insert anything, so it ends up pointing to the element >= aStartOffset. Is it confusing because the method is named for one intended use (ordered insertion) and the comment is talking about a different use?
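To make that concrete, a hypothetical illustration (values invented for the example):

// Suppose mTimeMapping holds entries with offsets 100, 200, 300 (sorted).
// GreatestIndexLtEq(offset 250, idx, cmp) sets idx = 2: the insertion
//   point that keeps the array sorted, i.e. the first element >= 250.
// GreatestIndexLtEq(offset 200, idx, cmp) sets idx = 1: an exact match,
//   so idx points directly at that element.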
> + // Find the first data->time map entry at or after aEndOffset.
> + PRUint32 end;
> + mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aEndOffset, 0), end,
> + TimeDataOffsetComparator());
>
> Ditto?
The above comment for start applies, but then I decrement the index by one to make it the element <= aEndOffset. The comment is wrong; it should say "at or before", and if there's an exact match it shouldn't decrement end by 1. I've fixed both of those in my latest patch.
Comment 11 • 14 years ago
OK, I think the comment in nsTArray for GreatestIndexLtEq is just wrong.
+ // Find the first data->time map entry at or after aEndOffset.
+ PRUint32 end;
+ mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aEndOffset, 0), end,
+ TimeDataOffsetComparator());
+
+ // Adjust end to be the first entry before aEndOffset.
+ if (end > 0) {
+ end -= 1;
+ }
What if there is a timecode entry for aEndOffset? Seems like we shouldn't subtract 1 in that case; the end time would be the time for aEndOffset.
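A hedged sketch of the suggested adjustment (the mOffset field name and the bounds handling are assumptions, not the final patch):

PRUint32 end;
mTimeMapping.GreatestIndexLtEq(TimeDataOffset(aEndOffset, 0), end,
                               TimeDataOffsetComparator());

// Step back only when there is no entry exactly at aEndOffset; when
// there is one, its time is already the end time we want.
if (end > 0 &&
    (end == mTimeMapping.Length() ||
     mTimeMapping[end].mOffset != aEndOffset)) {
  end -= 1;
}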
+ TimeDataOffset entry(mBlockOffset , mClusterTimecode + mBlockTimecode);
Stray space before ,
I'm a bit concerned about the performance impact of using an array to store the timecode map. Usually we'll be inserting at the end of the array, but I suppose if you play some video late in the stream and then seek back to the beginning to play the rest, inserting the timecodes could be expensive. I think it would be premature to optimize it at this point, but you might want to try a pathological testcase to see how bad it can be.
Assignee
Comment 12 • 14 years ago
(In reply to comment #11)
> What if there is a timecode entry for aEndOffset? Seems like we shouldn't
> subtract 1 in that case; the end time would be the time for aEndOffset.
Yeah, I noticed that and mentioned it above. :-)
> I'm a bit concerned about the performance impact of using an array to store the
> timecode map. Usually we'll be inserting at the end of the array, but I suppose
> if you play some video late in the stream and then seek back to the beginning
> to play the rest, inserting the timecodes could be expensive. I think it would
> be premature to optimize it at this point, but you might want to try a
> pathological testcase to see how bad it can be.
Yeah, it needs to be optimized once this patch lands. I've filed bug 595071 to deal with this. Some numbers:
I timed each NotifyDataArrived call for a 1GB 720p file (preload="metadata") with 260930 blocks and a 4GB media cache, seeking to time=duration*0.3 and waiting for the rest of the file to download, then seeking to time=0 and waiting until the entire file had downloaded. There were 40788 calls: ~31000 were under 1ms (most around 0.01ms), ~9000 were between 1ms and 10ms, and ~120 were over 10ms. There are ~29000 blocks from duration*0.3 to duration, so almost all of the fast calls were the best case (appending at the end).
Assignee
Comment 13 • 14 years ago
Everything but FSM documentation.
- Moved most of the parsing stuff to nsWebMBufferedParser
- Removed BLOCK_SYNC state and merged it with ANY_BLOCK_SYNC
- Fixed a bug parsing BlockGroups/Blocks (mBlockOffset wasn't set)
- Added SKIP_ELEMENT state to skip contents of unused elements
I'll post another patch with that and then request review, but feel free to give feedback on this one too.
Attachment #472950 - Attachment is obsolete: true
Attachment #472950 - Flags: review?(roc)
Assignee
Comment 14 • 14 years ago
Adds comments documenting the parser's state machine. Also fixes a few Win32 warnings.
I'm working on an additional test to go with this (we already have test_buffered, but it only tests basic functionality); I'll attach it as a separate patch once it's ready.
Attachment #473959 - Attachment is obsolete: true
Attachment #474622 - Flags: review?(roc)
Comment 15 • 14 years ago
+ case TIMECODE_SYNC:
+ if (*p++ != TIMECODE_ID) {
+ mState = CLUSTER_SYNC;
+ break;
+ }
If *p is CLUSTER_ID[0], shouldn't we consume it in CLUSTER_SYNC instead of skipping it here? In other words, we should increment p and consume the character only if we successfully advance to READ_VINT.
But what if the first CLUSTER_SYNC was spurious, we start reading a VINT and inside the VINT we encounter a new, real CLUSTER_SYNC? Seems like we'll lose that here too.
It seems like we might need to spawn off a new parser every time we see a CLUSTER_SYNC bit pattern, or something like that. Maybe we shouldn't try to do that now.
Other than that one issue, the rest looks good.
Assignee
Comment 16 • 14 years ago
Fix the issue discussed below, and also treat variable-length integers smaller or larger than the spec permits as a sync error.
(In reply to comment #15)
> + case TIMECODE_SYNC:
> + if (*p++ != TIMECODE_ID) {
> + mState = CLUSTER_SYNC;
> + break;
> + }
>
> If *p is CLUSTER_ID[0], shouldn't we consume it in CLUSTER_SYNC instead of
> skipping it here? In other words, we should increment p and consume the
> character only if we successfully advance to READ_VINT.
>
> But what if the first CLUSTER_SYNC was spurious, we start reading a VINT and
> inside the VINT we encounter a new, real CLUSTER_SYNC? Seems like we'll lose
> that here too.
Ah, good catch. I handle this in another place by decrementing p before returning to CLUSTER_SYNC. That should be sufficient to handle this case.
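For illustration, one shape the fixed state could take (a sketch, not the exact patch): leave p pointing at the unexpected byte so CLUSTER_SYNC examines it on the next iteration instead of silently consuming it:

case TIMECODE_SYNC:
  if (*p != TIMECODE_ID) {
    // Don't consume this byte: it may be the first byte of a real
    // cluster ID, which CLUSTER_SYNC must see to regain sync.
    mState = CLUSTER_SYNC;
    break;
  }
  ++p;
  mState = READ_VINT;
  break;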
> It seems like we might need to spawn off a new parser every time we see a
> CLUSTER_SYNC bit pattern, or something like that. Maybe we shouldn't try to do
> that now.
Yeah, to handle this problem reliably for the general case of losing sync after we've been through multiple parse steps and consumed a bit of data (and can't necessarily back up far enough to recover) we need to do something like that.
The other change in this patch suffers from this issue: if we fail inside READ_VINT we can't easily get back to the next correct place to look for sync, so we'll just scan forward for a sync point and possibly miss some data.
We could also have the parser buffer data from potential sync points to allow it to back up, but that could end up consuming a lot of memory, so I prefer your idea. I'll file a follow-up bug for that.
Attachment #474622 -
Attachment is obsolete: true
Attachment #474639 -
Flags: review?(roc)
Attachment #474622 -
Flags: review?(roc)
Assignee
Comment 17 • 14 years ago
Gah, forgot to qrefresh.
Attachment #474639 - Attachment is obsolete: true
Attachment #474647 - Flags: review?(roc)
Attachment #474639 - Flags: review?(roc)
Comment 18 • 14 years ago
Comment on attachment 474647 [details] [diff] [review]
second attempt - v3.1
+ // Variable length integers must be 1 to 8 bytes long. If we got
+ // anything else we must've lost sync.
+ if (mVIntLength < 1 || mVIntLength > 8) {
There is no way VIntLength can return a value greater than 8.
Change the >8 check to an assertion, and r=me
Attachment #474647 - Flags: review?(roc) → review+
Assignee
Comment 19 • 14 years ago
Thanks, that addition was just wrong; it can't be less than 1, either. I've added an assertion to VIntLength to document that. Carrying the review forward, thanks!
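For illustration, a sketch of what such a helper and its assertion might look like (the signature in the actual patch may differ). The length of an EBML variable-length integer is encoded by the number of leading zero bits in its first byte, so a nonzero first byte always yields a length of 1 to 8:

// Returns the byte length of the EBML variable length integer whose
// first byte is aFirstByte: a leading 1 bit means 1 byte, 01 means
// 2 bytes, ..., 00000001 means 8 bytes. A zero first byte is invalid
// EBML and is assumed to be treated as lost sync before reaching here.
static PRUint32 VIntLength(unsigned char aFirstByte)
{
  PRUint32 count = 1;
  PRUint32 mask = 1 << 7;
  while (count < 8 && !(aFirstByte & mask)) {
    mask >>= 1;
    count += 1;
  }
  NS_ASSERTION(count >= 1 && count <= 8, "Insane VInt length.");
  return count;
}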
Attachment #474647 - Attachment is obsolete: true
Attachment #474651 - Flags: review+
Updated • 14 years ago
Attachment #472535 - Flags: review?(chris.double) → review+
Assignee
Comment 20 • 14 years ago
http://hg.mozilla.org/mozilla-central/rev/ffd7d74232e6
http://hg.mozilla.org/mozilla-central/rev/fd21140bec7b
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee
Comment 21 • 14 years ago
Caused bustage:
WINNT 5.2 mozilla-central debug test mochitests-1/5 on 2010/09/14 00:15:34
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284448534.1284450904.24147.gz#err1
NEXT ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_play_events.html | Exited with code -1073741819 during test run
INFO | automation.py | Application ran for: 0:31:46.500000
INFO | automation.py | Reading PID log: c:\docume~1\cltbld\locals~1\temp\tmpsftv3wpidlog
==> process 3628 launched child process 412
==> process 3628 launched child process 3356
INFO | automation.py | Checking for orphan process with PID: 412
INFO | automation.py | Checking for orphan process with PID: 3356
PROCESS-CRASH | /tests/content/media/test/test_play_events.html | application crashed (minidump found)
Operating system: Windows NT
5.2.3790 Service Pack 2
CPU: x86
GenuineIntel family 6 model 23 stepping 8
1 CPU
Crash reason: EXCEPTION_ACCESS_VIOLATION
Crash address: 0xb0
Thread 0 (crashed)
0 xul.dll!nestegg_duration [nestegg.c:c14c5fe155da : 1507 + 0x12]
eip = 0x10e337d8 esp = 0x0012c444 ebp = 0x0012c478 ebx = 0x03a300b8
esi = 0x15a3dc3c edi = 0xffff0007 eax = 0x0012c470 ecx = 0x000000b0
edx = 0x0012c444 efl = 0x00010206
Found by: given as instruction pointer in context
Almost certainly caused by mContext being null in nsWebMReader::GetBuffered() when nestegg_duration is called.
Pushed trivial fix: http://hg.mozilla.org/mozilla-central/rev/5588c9796f0b
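For reference, a minimal sketch of the kind of guard such a fix adds (the real change is the changeset linked above; the exact signature here is an assumption):

nsresult nsWebMReader::GetBuffered(nsTimeRanges* aBuffered, PRInt64 aStartTime)
{
  // mContext is only created once metadata has been read, so it can be
  // null if GetBuffered() runs very early or after a failed load.
  if (!mContext) {
    return NS_OK;  // nothing demuxed yet: report no buffered ranges
  }

  uint64_t duration = 0;
  if (nestegg_duration(mContext, &duration) == 0) {
    // ... compute the buffered ranges using the duration, as before ...
  }
  return NS_OK;
}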
Assignee
Comment 22 • 14 years ago