
Switch from output seeking to combined input seeking and output seeking to accelerate extracting small segments from huge files #729

Open
wants to merge 5 commits into
base: master


inatuwe

@inatuwe inatuwe commented May 14, 2023

Without this change, seeking inside a huge file gets slower the further you seek. For example, extracting a 100-second AudioSegment that starts at 7000 seconds may take up to 5 seconds. With this change, it takes less than a second!

I had the idea of trimming start and end positions with:

from pydub import AudioSegment

AudioSegment.from_file(
    file="video.mp4",
    start_second=7000,  # the keyword is start_second (singular), as in the test name below
    duration=100,
)

But this took surprisingly long!

Inspecting the generated command line, I realized that AudioSegment.from_file() specifies the input file first and the seek parameters after it:

'ffmpeg', '-y', '-i', 'video.mp4', '-ss', '7000', '-t', '100', ...

However, when reading about the seek parameter in the FFmpeg wiki (https://trac.ffmpeg.org/wiki/Seeking), I found the following:
As of FFmpeg 2.1, when transcoding with ffmpeg (i.e. not stream copying), -ss is now also "frame-accurate" even when used as an input option...

So I tried instead the following command:

'ffmpeg', '-y', '-ss', '7000', '-t', '100', '-i', 'video.mp4', ...

... and the video file was processed much faster!
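The difference between the two commands is only where `-ss` and `-t` sit relative to `-i`. A minimal sketch of that ordering follows; `build_seek_command` is a hypothetical helper written for illustration, not pydub's actual code, and it leaves out the trailing output arguments that are elided with "..." above.

```python
def build_seek_command(path, start, duration, input_seeking):
    """Return an ffmpeg argument list that cuts `duration` seconds from `start`.

    Hypothetical helper for illustration; trailing output options are omitted.
    """
    seek = ["-ss", str(start), "-t", str(duration)]
    if input_seeking:
        # Options placed before -i apply to the input:
        # ffmpeg seeks inside the file before it starts decoding.
        return ["ffmpeg", "-y", *seek, "-i", path]
    # Options placed after -i apply to the output:
    # ffmpeg decodes from the beginning and discards everything before `start`.
    return ["ffmpeg", "-y", "-i", path, *seek]


output_seek_cmd = build_seek_command("video.mp4", 7000, 100, input_seeking=False)
input_seek_cmd = build_seek_command("video.mp4", 7000, 100, input_seeking=True)
```

With output seeking, the work grows with the seek position because everything before it is decoded and thrown away, which matches the slowdown described at the top.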

When the pull request first ran through the tests, one with an mp3 file failed:
test_partial_load_start_second_and_duration_equals_cropped_mp3_audio_segment

The reason was that input seeking is not accurate for encoded streams, so I had to add some margin to the input seek. I tried to calculate the maximum required margin based on the properties of the stream. Let me know if this makes sense to you as well.
The approximate margin needed is ~144 ms.
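The margin calculation itself is not shown in this description. As a rough illustration of how frame size can bound the seek error: if an input seek can land up to one encoded frame away from the requested position, the worst case is the duration of one frame. The function name and the example values below are assumptions chosen for illustration, not the values used in this pull request.

```python
def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Duration of one encoded audio frame in milliseconds."""
    return 1000 * samples_per_frame / sample_rate_hz


# Assumed example values: a 1152-sample frame at two different sample rates.
margin_low_rate = frame_duration_ms(1152, 8000)
margin_cd_rate = frame_duration_ms(1152, 44100)
```

Lower sample rates give longer frames, so a margin sized for the slowest expected stream also covers faster ones.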
Finally, we still need to seek the output stream to trim the margin again, so the ffmpeg command becomes:

'ffmpeg', '-y', '-ss', '6999.856', '-t', '100.288', '-i', 'video.mp4', '-ss', '0.144', ...
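The numbers in this command can be derived as follows. This is a sketch under stated assumptions rather than the code in this pull request: `combined_seek_args` and `fmt_ms` are hypothetical names, the margin is fixed at the 144 ms quoted above, and times are kept in integer milliseconds to avoid floating-point rounding in the formatted values. Trailing output options are again omitted.

```python
def fmt_ms(ms):
    """Format integer milliseconds as seconds with three decimals."""
    return f"{ms // 1000}.{ms % 1000:03d}"


def combined_seek_args(path, start_ms, duration_ms, margin_ms=144):
    """Coarse input seek with a safety margin, then an exact output seek."""
    # Seek the input slightly early, but never before the start of the file.
    input_start_ms = max(0, start_ms - margin_ms)
    # The output seek trims whatever was actually added before the start.
    output_seek_ms = start_ms - input_start_ms
    # Read the leading part, the requested duration, and a trailing margin.
    input_duration_ms = output_seek_ms + duration_ms + margin_ms
    return [
        "ffmpeg", "-y",
        "-ss", fmt_ms(input_start_ms),
        "-t", fmt_ms(input_duration_ms),
        "-i", path,
        "-ss", fmt_ms(output_seek_ms),
    ]


args = combined_seek_args("video.mp4", start_ms=7_000_000, duration_ms=100_000)
# A start smaller than the margin is clamped to the beginning of the file.
early_args = combined_seek_args("video.mp4", start_ms=50, duration_ms=100_000)
```

The clamp covers the case where the start is smaller than the margin, so the part added before the start shrinks to whatever is available.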

Without this change, seeking inside a huge file gets slower the further you seek. For example, extracting a 10-second AudioSegment that starts at 6900 seconds may take up to 5 seconds.
With this change, it takes only a few milliseconds!
@inatuwe inatuwe marked this pull request as draft May 15, 2023 19:13
Make input sampling accurate by respecting the worst case expected
inaccuracy due to the frame sizes
Add test case for very small start_second, since the new logic must cover the case
where the additional part before the start_second would be truncated to zero.
@inatuwe inatuwe marked this pull request as ready for review May 15, 2023 22:07
@inatuwe inatuwe changed the title Switch from output seeking to input seeking Switch from output seeking to combined input seeking and output seeking to accelerate small segments from huge files May 15, 2023
@inatuwe inatuwe changed the title Switch from output seeking to combined input seeking and output seeking to accelerate small segments from huge files Switch from output seeking to combined input seeking and output seeking to accelerate extracting small segments from huge files May 15, 2023