When transmuxing, the `EncodedSampleExporter` maintains a queue of input
buffers that get filled with encoded data by the asset loader. The number of
buffers was limited to avoid using more and more memory if producer (asset
loader) gets far ahead of the consumer (exporter).
Previously this limit was fixed at 10 buffers, but increasing the number of
buffers can make some transmux operations much faster. Allow allocating between
a min and max number of buffers, and also set a target allocation size beyond
which new buffers can't be allocated. This allows audio formats which require
many small buffers to be processed more quickly, while preventing allocating
too much memory for hypothetical very high bitrate formats.
'Remove video' edits on local videos in particular get much faster, because
audio buffers are very short and there are lots of them. With a sample 10
minute video, a 'remove video' edit took 2 seconds (36 seconds before this
change). With a sample 1 minute removing video took 0.25 seconds after this
change (2.5 seconds before).
The speed improvement is smaller for other types of edits that retain the video
track. Transmuxing a 10 minute video retaining the video track took 26 seconds
(40 seconds before).
PiperOrigin-RevId: 583390284
Modifying dumping to not required "released" to be called.
Track index is an arbitrary value based on the order of addTrack calls.
Samples are dumped by track (rather than as soon as they are written),
so it's preferable to use a value that provides more context.
By using the track type as a key, dump files will be more deterministic
and will have more similarities when branched.
PiperOrigin-RevId: 563700982
When generating silence for AudioProcessingPipeline, audio never
queued EOS downstream.
Linked to this, when silence followed an item with audio, the silence
was added to SilentAudioGenerator before the mediaItem reconfiguration
occurred. If the silence had effects, the APP would be flushed after
silence queued EOS, resetting APP.isEnded back to false, so AudioGraph
never ended.
Regression tests reproduce failure without fix, but pass with it.
PiperOrigin-RevId: 550853714
On a MediaItem change, the input Format (and Effects to apply) may be
different. Therefore the AudioProcessingPipeline must be reconfigured
to determine what processing is active, and what the AudioFormat of the
data output is. In the event that it is different, additional
AudioProcessor instances must be used to ensure the encoder will still
be able to accept the audio buffers.
PiperOrigin-RevId: 544338451
Goal of tests (SequenceExportTest) that use this media is for the
silence and the media to match exactly with audio format, however
`sample_with_increasing_timestamps.mp4` had a different sample rate.
testvid_1022ms.mp4: channel count = 2, sample rate = 44100.
PiperOrigin-RevId: 543458948
With the upcoming "handle format changes" CL, stereo -> mono audio
would add an AudioProcessor. Robolectric decodes output encoded data,
which crashes some AudioProcessors because the number of frames may not
be an integer.
PiperOrigin-RevId: 542568875
Audio only tests are now using RAW audio where possible, which is
passed through the Robolectric decoders/encoders, and can be handled by
the AudioProcessor instances accurately.
PiperOrigin-RevId: 541648853
Issue: When running the Transformer related test cases, the tests are flaky
because the order in which audio and video samples are interleaved seems to
differ in few instances.
Root cause: When running a transformation the sample producer (Asset loader)
and sample consumer (Sample pipeline) both runs on different thread and
theoretically there is no reason for behaviour to be deterministic because
the number of samples produced/written depends on how fast individual thread
works. So it is indeed surprising that test somehow worked deterministically in
majority of instances (may be something to do with Robolectric environment).
Solution: Since we don't expect the order of sample interleaving to be deterministic, make the dumping logic deterministic where all the video
samples will be collected and then dumped together (similarly for audio). This would mean we won't be able to see the interleaving so for that we need to
add separate test case verifying the interleaving logic only.
Pending: Test case for interleaving.
PiperOrigin-RevId: 540930871
Mp4Muxer already supports writing Mp4LocationData so added that
as supported Metadata entry.
Support for more Metadata entries will be added in upcoming CLs.
PiperOrigin-RevId: 534473866
Simplify the audio encoder input timestamp calculation. The new calculation
avoids drifting by tracking the total number of bytes encoded rather than
tracking the timestamp and remainder separately, and also makes the timestamps
match the decoder output buffer timestamps.
Also switch one of the export tests that was passing through AMR samples over
to using WAVE audio. The problem with using AMR is that the compressed samples
are not necessarily an integer number of audio frames and the shadow decoder
would pass them from input to output, so the audio encoder was receiving
non-integer numbers of audio frames.
Tested by logging the timestamps at the decoder output and encoder input with
forcing transcoding audio, and verifying that after this change the audio
timestamps are no longer off by one.
PiperOrigin-RevId: 523409869
The FakeExtractorOutput dumps data in more readable format
so using that where ever possible.
The MdtaMetadataEntry which contains key and value dumped only "key".
Added fix to dump "value" as well.
PiperOrigin-RevId: 517968996
- This is to make sure we know about all the tracks before initializing
the SamplePipelines. This allows to set the muxer and the fallback
listener track count before the SamplePipelines are built.
- As a result, the test files had to be updated because the order in
which the tracks are written has changed.
- The ImageAssetLoader also had to be updated to call onOutputFormat
repeatedly until it returns a non-null SampleConsumer.
- Also fix the trackCount sent to the muxer and fallback listener. The
correct track count can be computed now that we know about all the
tracks before building the SamplePipelines.
PiperOrigin-RevId: 514426123
`SilentAudioGenerator` could output a fractional audio frame, and this could cause downstream components to throw because of trying to read a complete audio frame but only seeing a partial one.
Calculate the output buffer size based on the frame size (which is a no-op for stereo 16-bit audio) and calculate a total number of frames to output then multiple by the frame size.
PiperOrigin-RevId: 494992941
This is necessary to move video decoding to the AssetLoader. Otherwise,
if the decoder max pending frame count is reached, the AssetLoader will
stop queuing frames to the pipeline, and process data will not be called
anymore.
PiperOrigin-RevId: 492392621