Merge branch 'master' into patch-1

rfscordia · Apr 19, 2021 · ebdedc8
2 parents 5658398 + 72b474e
commit ebdedc8

Showing 14 changed files with 449 additions and 36 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -1,15 +1,16 @@
-sudo: required
-dist: bionic
+os: linux
+dist: bionic  # focal
 language: python
 before_install:
   - sudo apt-get update --fix-missing
 install:
   - sudo apt-get install -y ffmpeg libopus-dev python-scipy python3-scipy
 python:
   - "2.7"
-  - "3.5"
   - "3.6"
   - "3.7"
+  - "3.8"
+  - "3.9"
   - "pypy2"
   - "pypy3"
 script:

diff --git a/API.markdown b/API.markdown
@@ -10,7 +10,6 @@ Currently Undocumented:
 - Signal Processing (compression, EQ, normalize, speed change - `pydub.effects`, `pydub.scipy_effects`)
 - Signal generators (Sine, Square, Sawtooth, Whitenoise, etc - `pydub.generators`)
 - Effect registration system (basically the `pydub.utils.register_pydub_effect` decorator)
-- Silence utilities (detect silence, split on silence, etc - `pydub.silence`)
 
 
 ## AudioSegment()
@@ -91,14 +90,18 @@ The first argument is the path (as a string) of the file to read, **or** a file
 
 **Supported keyword arguments**:
 
-- `format` | example: `"aif"` | default: `"mp3"`
+- `format` | example: `"aif"` | default: autodetected
   Format of the output file. Supports `"wav"` and `"raw"` natively, requires ffmpeg for all other formats. `"raw"` files require 3 additional keyword arguments, `sample_width`, `frame_rate`, and `channels`, denoted below with: **`raw` only**. This extra info is required because raw audio files do not have headers to include this info in the file itself like wav files do.
 - `sample_width` | example: `2`
   **`raw` only** — Use `1` for 8-bit audio `2` for 16-bit (CD quality) and `4` for 32-bit. It’s the number of bytes per sample.
 - `channels` | example: `1`
   **`raw` only** — `1` for mono, `2` for stereo.
 - `frame_rate` | example: `44100`
   **`raw` only** — Also known as sample rate, common values are `44100` (44.1kHz - CD audio), and `48000` (48kHz - DVD audio)
+- `start_second` | example: `2.0` | default: `None`
+  Offset (in seconds) to start loading the audio file. If `None`, the audio will start loading from the beginning.
+- `duration` | example: `2.5` | default: `None`
+  Number of seconds to be loaded. If `None`, full audio will be loaded.
 
 
 ### AudioSegment(…).export()
@@ -553,6 +556,33 @@ shifted_samples_array = array.array(sound.array_type, shifted_samples)
 new_sound = sound._spawn(shifted_samples_array)
 ```
 
+Here's how to convert to a numpy float32 array:
+
+```python
+import numpy as np
+from pydub import AudioSegment
+
+sound = AudioSegment.from_file("sound1.wav")
+sound = sound.set_frame_rate(16000)
+channel_sounds = sound.split_to_mono()
+samples = [s.get_array_of_samples() for s in channel_sounds]
+
+fp_arr = np.array(samples).T.astype(np.float32)
+fp_arr /= np.iinfo(samples[0].typecode).max
+```
+
+And how to convert it back to an AudioSegment:
+
+```python
+import io
+import scipy.io.wavfile
+
+wav_io = io.BytesIO()
+scipy.io.wavfile.write(wav_io, 16000, fp_arr)
+wav_io.seek(0)
+sound = AudioSegment.from_wav(wav_io)  # AudioSegment as imported in the previous example
+```
+
 ### AudioSegment(…).get_dc_offset()
 
 Returns a value between -1.0 and 1.0 representing the DC offset of a channel. This is calculated using `audioop.avg()` and normalizing the result by samples max value.
@@ -581,3 +611,83 @@ Collection of DSP effects that are implemented by `AudioSegment` objects.
 ### AudioSegment(…).invert_phase()
 
 Make a copy of this `AudioSegment` and inverts the phase of the signal. Can generate anti-phase waves for noise suppression or cancellation.
+
+## Silence
+
+Various functions for finding/manipulating silence in AudioSegments. For creating silent AudioSegments, see AudioSegment.silent().
+
+### silence.detect_silence()
+
+Returns a list of all silent sections [start, end] in milliseconds of audio_segment. Inverse of detect_nonsilent(). Can be very slow since it has to iterate over the whole segment.
+
+```python
+from pydub import AudioSegment, silence
+
+print(silence.detect_silence(AudioSegment.silent(2000)))
+# [[0, 2000]]
+```
+
+**Supported keyword arguments**:
+
+- `min_silence_len` | example: `500` | default: 1000
+  The minimum length for silent sections in milliseconds. If it is greater than the length of the audio segment an empty list will be returned.
+
+- `silence_thresh` | example: `-20` | default: -16
+  The upper bound for how quiet is silent in dBFS.
+
+- `seek_step` | example: `5` | default: 1
+  Size of the step for checking for silence in milliseconds. Smaller is more precise. Must be a positive whole number.
+
+### silence.detect_nonsilent()
+
+Returns a list of all nonsilent sections [start, end] in milliseconds of audio_segment. Inverse of detect_silence() and has all the same arguments. Can be very slow since it has to iterate over the whole segment.
+
+**Supported keyword arguments**:
+
+- `min_silence_len` | example: `500` | default: 1000
+  The minimum length for silent sections in milliseconds. If it is greater than the length of the audio segment an empty list will be returned.
+
+- `silence_thresh` | example: `-20` | default: -16
+  The upper bound for how quiet is silent in dBFS.
+
+- `seek_step` | example: `5` | default: 1
+  Size of the step for checking for silence in milliseconds. Smaller is more precise. Must be a positive whole number.
+
+### silence.split_on_silence()
+
+Returns list of audio segments from splitting audio_segment on silent sections.
+
+**Supported keyword arguments**:
+
+- `min_silence_len` | example: `500` | default: 1000
+  The minimum length for silent sections in milliseconds. If it is greater than the length of the audio segment an empty list will be returned.
+
+- `silence_thresh` | example: `-20` | default: -16
+  The upper bound for how quiet is silent in dBFS.
+
+- `seek_step` | example: `5` | default: 1
+  Size of the step for checking for silence in milliseconds. Smaller is more precise. Must be a positive whole number.
+
+- `keep_silence` | example: `True` | default: 100
+  How much silence to keep, in milliseconds, or a bool. Leaves some silence at the beginning and end of each chunk, which keeps the sound from sounding like it is abruptly cut off.
+  When the length of the silence is less than the keep_silence duration, it is split evenly between the preceding and following non-silent segments.
+  If `True` is specified, all the silence is kept; if `False`, none is kept.
+
+### silence.detect_leading_silence()
+
+Returns the millisecond/index at which the leading silence ends. If the silence never ends, it returns the length of the audio_segment.
+
+```python
+from pydub import AudioSegment, silence
+
+print(silence.detect_leading_silence(AudioSegment.silent(2000)))
+# 2000
+```
+
+**Supported keyword arguments**:
+
+- `silence_thresh` | example: `-20` | default: -50
+  The upper bound for how quiet is silent in dBFS.
+
+- `chunk_size` | example: `5` | default: 10
+  Size of the step for checking for silence in milliseconds. Smaller is more precise. Must be a positive whole number.
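
The Silence section above describes `detect_nonsilent()` as the inverse of `detect_silence()`. The following is a minimal sketch of that relationship in plain Python, assuming the silent ranges are sorted and non-overlapping. `invert_ranges` is a hypothetical helper written for illustration; it is not part of pydub and may differ from how the library computes its result.

```python
def invert_ranges(silent_ranges, total_len_ms):
    """Return the [start, end] gaps between silent ranges (hypothetical helper)."""
    nonsilent = []
    prev_end = 0
    for start, end in silent_ranges:
        # Anything between the previous silent range and this one is nonsilent.
        if start > prev_end:
            nonsilent.append([prev_end, start])
        prev_end = end
    # Audio after the last silent range is nonsilent too.
    if prev_end < total_len_ms:
        nonsilent.append([prev_end, total_len_ms])
    return nonsilent


print(invert_ranges([[0, 500], [1500, 2000]], 3000))
# [[500, 1500], [2000, 3000]]
```

A fully silent segment, such as the `[[0, 2000]]` result shown for `AudioSegment.silent(2000)`, inverts to an empty list under this sketch.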
diff --git a/AUTHORS b/AUTHORS
@@ -84,3 +84,15 @@ Carlos del Castillo
 
 Yudong Sun
     github: sunjerry019
+
+Jorge Perianez
+    github: JPery
+
+Chendi Luo
+    github: Creonalia
+
+Daniel Lefevre
+    github: dplefevre
+
+Grzegorz Kotfis
+    github: gkotfis
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,15 @@
-# on master
+# v0.25.1
+- Fix crashing bug in new scipy-powered EQ effects
+
+# v0.25.0
 - Don't show a runtime warning about the optional ffplay dependency being missing until someone tries to use it
+- Documentation improvements
+- Python 3.9 support
+- Improved efficiency of loading wave files with `pydub.AudioSegment.from_file()`
+- Ensure `pydub.AudioSegment().export()` always returns files with a seek position at the beginning of the file
+- Added more EQ effects to `pydub.scipy_effects` (requires scipy to be installed)
+- Fix a packaging bug where the LICENSE file was not included in the source distribution
+- Add a way to instantiate a `pydub.AudioSegment()` with a portion of an audio file via `pydub.AudioSegment().from_file()`
 
 # v0.24.1
 - Fix bug where ffmpeg errors in Python 3 are illegible
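
The v0.25.0 entry about loading a portion of an audio file corresponds to the `start_second` and `duration` arguments. For the wav and raw code paths, the change selects one of four millisecond slices depending on which arguments are `None`. A minimal sketch of that mapping, assuming both values are given in seconds; `trim_slice_ms` is a hypothetical helper, not a pydub function:

```python
def trim_slice_ms(start_second=None, duration=None):
    """Map (start_second, duration) to a millisecond slice (hypothetical helper)."""
    # No start offset means "from the beginning".
    start_ms = None if start_second is None else start_second * 1000
    if duration is None:
        # No duration means "to the end".
        end_ms = None
    else:
        end_ms = ((start_second or 0) + duration) * 1000
    return slice(start_ms, end_ms)


print(trim_slice_ms(2.0, 2.5))
# slice(2000.0, 4500.0, None)
```

The bounds stay floats here, matching the `obj[start_second*1000:(start_second+duration)*1000]` expressions in the change; they would need converting to integers before slicing a plain list.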

diff --git a/MANIFEST.in b/MANIFEST.in
@@ -0,0 +1 @@
+include LICENSE
diff --git a/README.markdown b/README.markdown
@@ -210,12 +210,12 @@ Mac (using [homebrew](http://brew.sh)):
 
 ```bash
 # libav
-brew install libav --with-libvorbis --with-sdl --with-theora
+brew install libav
 
 ####    OR    #####
 
 # ffmpeg
-brew install ffmpeg --with-libvorbis --with-sdl2 --with-theora
+brew install ffmpeg
 ```
 
 Linux (using aptitude):
@@ -249,7 +249,7 @@ some of a number of potential codecs (see page 3 of the rfc) that can be used fo
 encapsulated data.
 
 When no codec is specified exporting to `ogg` will _default_ to using `vorbis`
-as a convinence. That is:
+as a convenience. That is:
 
 ```python
 from pydub import AudioSegment

diff --git a/appveyor.yml b/appveyor.yml
@@ -21,7 +21,7 @@ install:
   - "%PYTHON%/python.exe -m pip install wheel"
   - "%PYTHON%/python.exe -m pip install -e ."
   # Install ffmpeg
-  - ps: Start-FileDownload ('https://ffmpeg.zeranoe.com/builds/win64/shared/ffmpeg-' + $env:FFMPEG + '-win64-shared.zip') ffmpeg-shared.zip
+  - ps: Start-FileDownload ('https://github.com/advancedfx/ffmpeg.zeranoe.com-builds-mirror/releases/download/20200915/ffmpeg-' + $env:FFMPEG + '-win64-shared.zip') ffmpeg-shared.zip
   - 7z x ffmpeg-shared.zip > NULL
   - "SET PATH=%cd%\\ffmpeg-%FFMPEG%-win64-shared\\bin;%PATH%"
   # check ffmpeg installation (also shows version)

diff --git a/pydub/audio_segment.py b/pydub/audio_segment.py
@@ -503,7 +503,7 @@ def from_mono_audiosegments(cls, *mono_segments):
         )
 
     @classmethod
-    def from_file_using_temporary_files(cls, file, format=None, codec=None, parameters=None, **kwargs):
+    def from_file_using_temporary_files(cls, file, format=None, codec=None, parameters=None, start_second=None, duration=None, **kwargs):
         orig_file = file
         file, close_file = _fd_or_path_or_tempfile(file, 'rb', tempfile=False)
 
@@ -526,7 +526,14 @@ def is_format(f):
                 obj = cls._from_safe_wav(file)
                 if close_file:
                     file.close()
-                return obj
+                if start_second is None and duration is None:
+                    return obj
+                elif start_second is not None and duration is None:
+                    return obj[start_second*1000:]
+                elif start_second is None and duration is not None:
+                    return obj[:duration*1000]
+                else:
+                    return obj[start_second*1000:(start_second+duration)*1000]
             except:
                 file.seek(0)
         elif is_format("raw") or is_format("pcm"):
@@ -542,7 +549,14 @@ def is_format(f):
             obj = cls(data=file.read(), metadata=metadata)
             if close_file:
                 file.close()
-            return obj
+            if start_second is None and duration is None:
+                return obj
+            elif start_second is not None and duration is None:
+                return obj[start_second * 1000:]
+            elif start_second is None and duration is not None:
+                return obj[:duration * 1000]
+            else:
+                return obj[start_second * 1000:(start_second + duration) * 1000]
 
         input_file = NamedTemporaryFile(mode='wb', delete=False)
         try:
@@ -581,10 +595,17 @@ def is_format(f):
         conversion_command += [
             "-i", input_file.name,  # input_file options (filename last)
             "-vn",  # Drop any video streams if there are any
-            "-f", "wav",  # output options (filename last)
-            output.name
+            "-f", "wav"  # output options (filename last)
         ]
 
+        if start_second is not None:
+            conversion_command += ["-ss", str(start_second)]
+
+        if duration is not None:
+            conversion_command += ["-t", str(duration)]
+
+        conversion_command += [output.name]
+
         if parameters is not None:
             # extend arguments with arbitrary set
             conversion_command.extend(parameters)
@@ -610,10 +631,18 @@ def is_format(f):
             os.unlink(input_file.name)
             os.unlink(output.name)
 
-        return obj
+        if start_second is None and duration is None:
+            return obj
+        elif start_second is not None and duration is None:
+            return obj[0:]
+        elif start_second is None and duration is not None:
+            return obj[:duration * 1000]
+        else:
+            return obj[0:duration * 1000]
+
 
     @classmethod
-    def from_file(cls, file, format=None, codec=None, parameters=None, **kwargs):
+    def from_file(cls, file, format=None, codec=None, parameters=None, start_second=None, duration=None, **kwargs):
         orig_file = file
         try:
             filename = fsdecode(file)
@@ -637,7 +666,14 @@ def is_format(f):
 
         if is_format("wav"):
             try:
-                return cls._from_safe_wav(file)
+                if start_second is None and duration is None:
+                    return cls._from_safe_wav(file)
+                elif start_second is not None and duration is None:
+                    return cls._from_safe_wav(file)[start_second*1000:]
+                elif start_second is None and duration is not None:
+                    return cls._from_safe_wav(file)[:duration*1000]
+                else:
+                    return cls._from_safe_wav(file)[start_second*1000:(start_second+duration)*1000]
             except:
                 file.seek(0)
         elif is_format("raw") or is_format("pcm"):
@@ -650,7 +686,14 @@ def is_format(f):
                 'channels': channels,
                 'frame_width': channels * sample_width
             }
-            return cls(data=file.read(), metadata=metadata)
+            if start_second is None and duration is None:
+                return cls(data=file.read(), metadata=metadata)
+            elif start_second is not None and duration is None:
+                return cls(data=file.read(), metadata=metadata)[start_second*1000:]
+            elif start_second is None and duration is not None:
+                return cls(data=file.read(), metadata=metadata)[:duration*1000]
+            else:
+                return cls(data=file.read(), metadata=metadata)[start_second*1000:(start_second+duration)*1000]
 
         conversion_command = [cls.converter,
                               '-y',  # always overwrite existing files
@@ -703,10 +746,17 @@ def is_format(f):
 
         conversion_command += [
             "-vn",  # Drop any video streams if there are any
-            "-f", "wav",  # output options (filename last)
-            "-"
+            "-f", "wav"  # output options (filename last)
         ]
 
+        if start_second is not None:
+            conversion_command += ["-ss", str(start_second)]
+
+        if duration is not None:
+            conversion_command += ["-t", str(duration)]
+
+        conversion_command += ["-"]
+
         if parameters is not None:
             # extend arguments with arbitrary set
             conversion_command.extend(parameters)
@@ -726,12 +776,20 @@ def is_format(f):
 
         p_out = bytearray(p_out)
         fix_wav_headers(p_out)
-        obj = cls._from_safe_wav(BytesIO(p_out))
+        p_out = bytes(p_out)
+        obj = cls(p_out)
 
         if close_file:
             file.close()
 
-        return obj
+        if start_second is None and duration is None:
+            return obj
+        elif start_second is not None and duration is None:
+            return obj[0:]
+        elif start_second is None and duration is not None:
+            return obj[:duration * 1000]
+        else:
+            return obj[0:duration * 1000]
 
     @classmethod
     def from_mp3(cls, file, parameters=None):
@@ -839,6 +897,7 @@ def export(self, out_f=None, format='mp3', codec=None, bitrate=None, parameters=
 
         # for easy wav files, we're done (wav data is written directly to out_f)
         if easy_wav:
+            out_f.seek(0)
             return out_f
 
         output = NamedTemporaryFile(mode="w+b", delete=False)
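
For formats that go through ffmpeg, the hunks above append `-ss` and `-t` after the `-f wav` output options and only then add the output target. The sketch below reproduces that ordering under simplified assumptions: the converter name `ffmpeg` and the input `input.mp3` are placeholders, and `trim_args` is a hypothetical helper rather than pydub code.

```python
def trim_args(start_second=None, duration=None):
    """Return the extra ffmpeg output options for a partial load (hypothetical helper)."""
    args = []
    if start_second is not None:
        args += ["-ss", str(start_second)]  # offset to start from, in seconds
    if duration is not None:
        args += ["-t", str(duration)]  # number of seconds to keep
    return args


# Placeholder command; the real one is built from cls.converter and more options.
command = ["ffmpeg", "-y", "-i", "input.mp3", "-vn", "-f", "wav"]
command += trim_args(start_second=2.0, duration=2.5)
command += ["-"]  # output target goes last; "-" writes the WAV data to stdout
print(" ".join(command))
# ffmpeg -y -i input.mp3 -vn -f wav -ss 2.0 -t 2.5 -
```

Because ffmpeg has already applied the offset in this path, the final return in the change slices from `0` (for example `obj[0:duration * 1000]`) instead of from `start_second * 1000`.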