- This change:
1. Extracts HlsExtractor interface from TsExtractor.
2. Adds AdtsExtractor for AAC/ADTS streams, which turned out to be
really easy.
Selection of the ADTS extractor relies on seeing the .aac extension.
This is at least guaranteed not to break anything that works already
(since no-one is going to be using .aac as the extension for something
that's not elementary AAC/ADTS).
Issue: #209
I think this is the limit of how far we should be pushing complexity
v.s. efficiency. It's a little complicated to understand, but probably
worth it since the H264 bitstream is the majority of the data.
Issue: #278
Use of Sample objects was inefficient for several reasons:
- Lots of objects (1 per sample, obviously).
- When switching up bitrates, there was a tendency for all Sample
instances to need to expand, which effectively led to our whole
media buffer being GC'd as each Sample discarded its byte[] to
obtain a larger one.
- When a keyframe was encountered, the Sample would typically need
to expand to accommodate it. Over time, this would lead to a
gradual increase in the population of Samples that were sized to
accommodate keyframes. These Sample instances were then typically
underutilized whenever recycled to hold a non-keyframe, leading
to inefficient memory usage.
This CL introduces RollingBuffer, which tightly packs pending sample
data into a byte[]s obtained from an underlying BufferPool. Which
fixes all of the above. There is still an issue where the total
memory allocation may grow when switching up bitrate, but we can
easily fix that from this point, if we choose to restrict the buffer
based on allocation size rather than time.
Issue: #278
- Remove TsExtractor's knowledge of Sample.
- Push handling of Sample objects into SampleQueue as much
as possible. This is a precursor to replacing Sample objects
with a different type of backing memory. Ideally, the
individual readers shouldn't know how the sample data is
stored. This is true after this CL, with the except of the
TODO in H264Reader.
- Avoid double-scanning every H264 sample for NAL units, by
moving the scan for SEI units from SeiReader into H264Reader.
Issue: #278
The complexity around not enabling the video renderer before it
has a valid surface is because MediaCodecTrackRenderer supports
a "discard" mode where it pulls through and discards samples
without a decoder. This mode means that if the demo app were to
enable the renderer before supplying the surface, the renderer
could discard the first few frames prior to getting the surface,
meaning video rendering wouldn't happen until the following sync
frame.
To get a handle on complexity, I think we're better off just removing
support for this mode, which nicely decouples how the demo app
handles surfaces v.s. how it handles enabling/disabling renderers.
Reordering in the extractor isn't going to work well with the
optimizations I'm making there. This change moves sorting back
to the renderer, although keeps all of the renderer
simplifications. It's basically just moving where the sort
happens from one place to another.
I'm not really a fan of micro-optimizations, but given this method
scans through every H264 frame in the HLS case, it seems worthwhile.
The trick here is to examine the first 7 bits of the third byte
first. If they're not all 0s, then we know that we haven't found a
NAL unit, and also that we wont find one at the next two positions.
This allows the loop to increment 3 bytes at a time.
Speedup is around 60% on Art according to some ad-hoc benchmarking.
1. AdtsReader would previously copy all data through an intermediate
adtsBuffer. This change eliminates the additional copy step, and
instead copies directly into Sample objects.
2. PesReader would previously accumulate a whole packet by copying
multiple TS packets into an intermediate buffer. This change
eliminates this copy step. After the change, TS packet buffers
are propagated directly to PesPayloadReaders, which are required
to handle partial payload data correctly. The copy steps in the
extractor are simplified from:
DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder
To:
DataSource->Ts_BitArray->Sample->SampleHolder
Issue: #278
- TsExtractor is now based on ParsableByteArray rather than BitArray.
This makes is much clearer that, for the most part, data is byte
aligned. It will allow us to optimize TsExtractor without worrying
about arbitrary bit offsets.
- BitArray is renamed ParsableBitArray for consistency, and is now
exclusively for bit-stream level reading.
- There are some temporary methods in ParsableByteArray that should be
cleared up once the optimizations are in place.
Issue: #278
This is the start of a sequence of changes to fix the ref'd
github issue. Currently TsExtractor involves multiple memory
copy steps:
DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder
This is inefficient, but more importantly, the copy into
Sample is problematic, because Samples are of dynamically
varying size. The way we end up expanding Sample objects to
be large enough to hold the data being written means that we
end up gradually expanding all Sample objects in the pool
(which wastes memory), and that we generate a lot of GC churn,
particularly when switching to a higher quality which can
trigger all Sample objects to expand.
The fix will be to reduce the copy steps to:
DataSource->TsPacket->SampleHolder
We will track Pes and Sample data with lists of pointers into
TsPackets, rather than actually copying the data. We will
recycle these pointers.
The following steps are approximately how the refactor will
progress:
1. Start reducing use of BitArray. It's going to be way too
complicated to track bit-granularity offsets into multiple packets,
and allow reading across packet boundaries. In practice reads
from Ts packets are all byte aligned except for small sections,
so we'll move over to using ParsableByteArray instead, so we
only need to track byte offsets.
2. Move TsExtractor to use ParsableByteArray except for small
sections where we really need bit-granularity offsets.
3. Do the actual optimization.
Issue: #278
If the timesource track renderer ends, but other track renderers
haven't finished, the player would get stuck in a pending state.
This change enables automatic switching to the media clock in the
case that the timesource renderer has ended, which allows other
renderers to continue to play.
These may occur in VOD streams where a representation's data
is small enough not to require segmentation or an index. For
example subtitle files.
Issue: #268
SampleExtractor will initially only be implemented by FrameworkSampleExtractor
which delegates to a MediaExtractor, but eventually it will also be implemented
by additional extractors.
The sample extractor can be used as a source of samples via DefaultSampleSource.
Also added clamping to getSegmentNum in one case where
it was not already implemented, and defined this behavior
property in the getSegmentNum javadoc.
Issue: #262
Without this, the byte is cast as follows (in bits) if the top
byte is set:
10000010 -> 1000000000000000000000000000010
This works because we then always shift at least one bit left,
and only look at the bottom 8 bits of the result. It's confusing
though. It's clearer if the cast to int gives just adds zeros to
the front, like:
10000010 -> 0000000000000000000000010000010
- Workaround issue where video may freeze whilst audio continues
on some devices that have entered bad states.
- Fix wrap-around for playbacks lasting more than 27 hours.
- Target 4x the minimum specified by the framework.
- Impose a minimum duration (250ms).
- Impose a maximum duration (750ms, or the minimum
specified by the framework if that's larger).
I've removed the ability to specify the multiplication
factor, since the underlying implementation is getting more
complicated, and we should really be able to figure this out
internally.
* this fixes a bug when switching from HE-AAC 22050Hz to AAC 44100Hz (the AudioTrack was not reset and we were trying to send a bad number of bytes, triggering a "AudioTrack.write() called with invalid size" error)
* this also improves quality switches, making it almost seamless
Previously samples belonging to disabled tracks would just
accumulate in an arbitrarily long queue in TsExtractor. We
need to actively throw samples away from disabled tracks up
to the current playback position, so as to prevent this.
Issue: #174
I'm not sure exactly what the implications of this change are,
but I'd really hope that only one program in each stream is carrying
audio/video. For GoPro cameras, they expose the video stream in
the second program, for some reason.
Issue: #116
We've seen a few streams where this assertion fails. If you
just skip the packet, things appear to recover correctly in
all cases I've seen, so replacing failure with a warning.
- Handle read returning NOTHING_READ for AC-3 streams.
- Remove extra checks for the audio track being initialized.
- Call isInitialized() instead of checking audioTrack != null.
ac3Bitrate is set only after the first buffer is handled, which meant that
getting the playback position would cause a divide by zero before then.
When playing back AC-3 content, the ac3Bitrate will always be set after the
first buffer is handled, so return a 0 position if it is not set.
- Adds support for dash manifests that define SegmentTemplate
but no SegmentTimeline.
- Assumes that the device clock is correct when calculating which
segments to load. The final step here is to use the Utc timing
element in the DASH manifest to obtain an accurate client clock.
- Doesn't yet enforce that the client shouldn't load segments that
are in the future or behind the live window.
This fixes the referenced issue, except that the MPD parser
needs to actually parse out UUID and binary data for schemes
that we wish to support. Alternatively, it's easy to applications
to do this themselves by extending the parser and overriding
the parseContentProtection and buildContentProtection methods.
Github Issue: #119
It's cleaner to not inject data into the extractor only
so that it can be read out as though it were parsed from
the stream. This is also an incremental step towards
fixing Github issue #119.
Note: This adds support for the majority of DASH live streams,
however we do not yet correctly support live streams that rely
on UtcTimingElements in their manifests.
Issue: #52
Plus start to properly document the SmoothStreaming package.
Note that where the documentation is a little vague, this is
because the original SmoothStreaming documentation is equally
vague!
- Split sample pools for video/audio/misc, since the typical
required sample sizes are very different (and so it becomes
inefficient to use a sample sized for video to hold audio).
- Add TODO for further improvements.
Issue: #174
The timestamp scaling in SegmentBase.getSegmentTimeUs was
overflowing for some streams. Apply a similar trick to that
applied in the SmoothStreaming case to fix it.
- Move all three buffering constants to a single class (the
chunk source).
- Increase the target buffer to 40s for increased robustness
against temporary network blips.
- Make values configurable via the chunk source constructor.
- Treat captions as a text track for HLS. This allows them to
be enabled/disabled through the demo app UI.
Issue: #165
- Add options to switch abruptly at segment boundaries. Third
parties who guarantee keyframes at the start of segments will
want this, because it makes switching more efficient and hence
rebuffering less likely.
- Switch quality faster when performing a splicing switch (when
we detect that we need to switch variant, we now immediately
request the same segment as we did last time for the new variant,
rather than requesting one more segment for the old variant
before doing this.
1. Correctly replace the AES data source if IV changes.
2. Check the largest timestamp for being equal to MIN_VALUE, and
handle this case properly.
3. Clean up AES data source a little.
Issue: #162
The actual fix here is to not call discardExtractors in HlsSampleSource
whilst the loading thread that's pushing data into it is still running.
It's required to wait for that thread to have exited before doing this.
Issue: #159
Looking up a long in a HashSet<Long> auto boxes the long and leaves
it for the GC. As decodeOnly is relatively infrequent it's much
better to do a simple linear search in a List<Long>. That way
we can avoid boxing every incoming time stamp value. In the general
case this will be linear searching in an empty list, a very fast
operation.
Signed-off-by: Jonas Larsson <jonas@hallerud.se>
AudioTrack contains the portions of MediaCodecAudioTrackRenderer that handle the
platform AudioTrack instance, including synchronization (playback position
smoothing), non-blocking writes and releasing.
This refactoring should not affect the behavior of audio playback, and is in
preparation for adding an Ac3PassthroughAudioTrackRenderer that will use the
AudioTrack.
Passing uncropped dimensions to certain decoders will make them
output frames without proper cropping set.
Signed-off-by: Jonas Larsson <jonas@hallerud.se>
- Add a readBit method to BitsArray for reading a boolean flag.
- Make things accessed from inner classes package visibility to avoid
the compiler generating thunk methods.
- The HlsSampleSource now owns the extractor. TsChunk is more or less dumb.
The previous model was weird, because you'd end up "reading" samples from
TsChunk objects that were actually parsed from the previous chunk (due to
the way the extractor was shared and maintained internal queues).
- Split out consuming and reading in the extractor.
- Make it so we consume 5s ahead. This is a window we allow for uneven
interleaving, whilst preventing huge read-ahead (e.g. in the case of sparse
ID3 samples).
- Avoid flushing the extractor for a discontinuity until it has been fully
drained of previously parsed samples. This avoids skipping media shortly
before discontinuities.
- Also made start-up faster by avoiding double-loading the first segment.
Issue: #3
The key change here is that nextLoadPositionUs is set to -1
if we're not loading but don't have a next chunk ready to
load. This ensures that "missing chunks" in one stream don't
prevent chunks in another stream from loading. This occurs
in SmoothStreaming with TTML subtitles, where the chunks are
sparse.
Propagate elapsedRealtimeUs to the video renderer. This allows
the renderer to calculate and adjust for the elapsed time since
the start of the current rendering loop. Typically this is <2ms,
but there situations where it can go higher (normally when the
video renderer ends up processing more than 1 output buffer in
a single loop).
Also made variable naming more consistent throughout the package.
- Move parsing onto a background thread. This is analogous
to how frame decoding is pushed to MediaCodec, and should
prevent possible jank when new subtitle samples are parsed.
This is more important for out-of-band subtitles, which can
take a second or two to parse fully.
- Add Useful DataSpec method.
- Use native frame release timing in video renderer for
smoother video playback.
- Avoid unnecessary memory copy steps in audio renderer.
- Use non-blocking AudioTrack API.
Previously we'd end up blocking forever in this case, which
is the worst thing we could do :). We could either throw an
exception or just print a warning. Printing a warning is more
in line with what other methods do (Handler prints a "sending
message to dead thread" warning).
This allows ManifestFetcher to both execute the initial
manifest load and be plugged into an ExoPlayer ChunkSource,
where it can be used for repeated manfiest refreshes during
live playback.
This API wasn't particularly nice. Best to remove it whilst
hopefully no-one is using it. Leaving the ReadHead abstraction
in place, since it might well prove useful in the future.
2. Common interface for manifest parsers.
- This effectively moves the common interface from the Fetcher level
(i.e. ManifestFetcher) to the Parser level (i.e. ManifestParser).
- The motivation here is to allow the implementation of components that
can work with a generic ManifestParser implementation.
- Skips unrecognized elements rather than crashing.
- FourCC treated as required for video and optional elsewhere,
as per the SmoothStreaming spec.
- Only parse initData text when we're actually in the ProtectionHeader element
This means that after a decoder flush, the renderer will avoid
feeding non-keyframes into the decoder until it has received and
fed the first keyframe. The decoder has no way of correctly
decoding non-keyframes that arrive before a keyframe.
It looks like for the case of self-contained media segments,
it's possible to get stuck without failure in the case that
the load fails having loaded less than the length of the init
data.
Since we have a Format class as well, it's very confusing that
FormatHolder actually holds a MediaFormat. I think it's quite
likely that Format will need promoting into the root package as
part of the HLS work, which will make this even more confusing
(although it is possible that for HLS we'll define yet another
Format class, if it turns out we need significantly different
fields).
Note - I deliberately avoided renaming the formatHolder
args/params, because they're not particularly ambiguous and
because it introduces some ugly line breaks.
- Bring back requirement for the first video frame to be rendered
before isReady returns true, *unless* we've deduced that the
upstream source is serving multiple renderers.
- Ditto for requiring that the audio track has some buffered data.
- cache ref didn't work because it referred to a private variable
(which isn't documented) from a public interface definition
(which is). Meaning the Javadoc generator was trying to link
to documentation that didn't exist.
The equals check we perform needs to ignore the max dimensions.
This tended to work in practice because formats would be the
same object, but in the case where different format objects
are used, things can break.
- Add constants class. Currently housing a single lonely variable,
which is used generally throughout the library, and so no longer
nicely fits into a specific class.
- Rename a few other constants to add clear units.
- Made minor tweak to ExoPlayer documentation.
1. Use ints rather than longs.
2. Remove some counters that dont seem hugely useful.
3. Replace use of volatile with explicit method calls that
cause a memory barrier. This is a lot more efficient than
using volatile because it can be invoked only once per
doSomeWork.
- Make MediaCodecTrackRenderer.isReady more permissive.
This largely fixes#21
- Bring WebmExtractor closer to FragmentedMp4Extractor.
The two will probably be placed under a common interface
fairly soon, which will allow significant code
deduplication.
* Remove concept of being prepared by simply reporting if format
and/or cues are known.
* Allow replacement of format and/or cues later in the stream.
* Initialization and index segments can be parsed independently
of one another but must be in order due to internal WebM dependencies.
* Let seekTo() work even when cues are unknown.
This paves the way for SegmentTemplate and SegmentList based
mpds, which will implement DashSegmentIndex directly rather than
parsing an index from the media stream.
- Define DashSegmentIndex.
- Make use of DashSegmentIndex in chunk sources.
- Define an implementation of DashSegmentIndex that wraps a SegmentIndex.
- Add method that will allow Representations to return a DashSegmentIndex
directly in the future.
- Add support for non-contiguous index and initialization data in media streams.
For the Webm case this isn't enabled yet due to extractor limitations.
- Removed ability to fetch multiple chunks. This functionality does not extend
properly to SegmentList and SegmentTemplate variants of DASH.
Why: This was a bad initial choice. Manifests typically define bandwidth in
bits/sec. If you divide by 8 then you're throwing away information due to
rounding. Unfortunately it turns out that SegmentTemplate based manifests
require you to be able to recall the bitrate exactly (because it's substituted
in during segment URL construction).
Medium term: We should consider converting all our bandwidth estimation
over to bits/sec as well.
Note1: Also changed Period id to be a string, to match the mpd spec.
Note2: Made small optimization in FormatEvaluator to not consider discarding
the first chunk (durationBeforeThisSegmentUs will always be negative, and even
in the error case where it's not, removing the first thunk should be an error).
- Allow the content type of an adaptation set to be inferred
from the mimeTypes of the contained representations.
- Ensure the contained mimeTypes are consistent with one
another, and with the adaptation set.
Ref: Issue #2
- Add support for parsing avc3 boxes.
- Make workaround for signed sample offsets in trun files always enabled.
- Generalize remaining workaround into a flag, to make it easy to add additional workarounds going forward without changing the API.
- Fix DataSourceStream bug where read wouldn't return -1 having fully read segment whose spec length was unbounded.
This can help custom ChunkSource implementations to act on
this information. For example an adaptive implementation may
choose to blacklist a problematic format if loads of that
format keep failing.
AudioTrack time will go out of sync if the decodeOnly flag
is set of arbitrary samples (as opposed to just those following
a seek). It's a pretty obscure case and it would be weird for
anyone to do it, but we should be robust against it anyway.
1. Fix SimpleCache startReadWrite asymmetry. Allow more concurrency.
- startReadWrite does not have the concept of a read lock. Once
a cached span is returned, the caller can do whatever it likes
for as long as it wants to. This allows a read to be performed
in parallel with a write that starts after it.
- If there's an ongoing write, startReadWrite will block even if
the return operation will be a read. So there's a weird asymmetry
where reads can happen in parallel with writes, but only if the
reads were started first.
- This CL removes the asymmetry, by allowing a read to start even
if the write lock is held.
- Note that the reader needs to be prepared for the thing it's
reading to disappear, but this was already the case, and will
always be the case since the reader will need to handle disk
read failures anyway.
2. Add isCached method.