🔔 Pydiogment

Pydiogment aims to simplify audio augmentation. It generates multiple audio files from a single mono audio file, for example versions that are sped up, slowed down, or changed in tone.

📥 Installation

Dependencies

Pydiogment requires:

Python (>= 3.5)
NumPy (>= 1.17.2): pip install numpy
SciPy (>= 1.3.1): pip install scipy
FFmpeg: sudo apt install ffmpeg (on Debian/Ubuntu)
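
Before installing, it can help to confirm that the requirements listed above are actually reachable from your environment. The following is a minimal sketch using only the standard library; `missing_dependencies` is a hypothetical helper name, not part of Pydiogment, and it only checks for presence, not for the minimum NumPy/SciPy versions.

```python
import importlib.util
import shutil
import sys


def missing_dependencies():
    """Return the names of requirements that could not be found."""
    missing = []
    # Interpreter version required by the dependency list above.
    if sys.version_info < (3, 5):
        missing.append("Python >= 3.5")
    # Python packages: only checks that they are importable, not their versions.
    for module in ("numpy", "scipy"):
        if importlib.util.find_spec(module) is None:
            missing.append(module)
    # FFmpeg is an external executable, so look for it on the PATH.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    print("All dependencies found." if not problems else "Missing: " + ", ".join(problems))
```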

Installation

If you already have a working installation of NumPy and SciPy, you can install Pydiogment using pip:

pip install pydiogment

To update an existing version of Pydiogment, use:

pip install -U pydiogment

💡 How to use

Amplitude related augmentation

Apply a fade in and fade out effect

from pydiogment.auga import fade_in_and_out

test_file = "path/test.wav"
fade_in_and_out(test_file)
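
To illustrate the idea behind a fade effect, here is a minimal NumPy sketch that multiplies a signal by an envelope rising from 0 to 1 at the start and falling back to 0 at the end. It is not Pydiogment's implementation: the linear ramp shape, the `fade_len` parameter, and the name `linear_fade` are assumptions made for this example.

```python
import numpy as np


def linear_fade(signal, fade_len):
    """Apply a linear fade-in and fade-out of fade_len samples each."""
    signal = np.asarray(signal, dtype=np.float64)
    if fade_len < 1 or 2 * fade_len > signal.size:
        raise ValueError("fade_len must be between 1 and half the signal length")
    envelope = np.ones(signal.size)
    ramp = np.linspace(0.0, 1.0, fade_len)
    envelope[:fade_len] = ramp         # rise from silence to full level
    envelope[-fade_len:] = ramp[::-1]  # fall from full level back to silence
    return signal * envelope


# A constant signal makes the envelope directly visible in the output.
faded = linear_fade(np.ones(10), fade_len=4)
```

The first and last samples of `faded` are 0 and the middle samples are left unchanged at 1.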

Apply gain to file

from pydiogment.auga import apply_gain

test_file = "path/test.wav"
apply_gain(test_file, -100)
apply_gain(test_file, -50)
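
Assuming the second argument is a gain in decibels (the snippet above does not state the unit), the values map to linear amplitude factors through 10^(dB/20). The sketch below shows that conversion on an in-memory signal; `db_to_amplitude` and `scale_by_gain` are hypothetical names for this example, not Pydiogment functions.

```python
import numpy as np


def db_to_amplitude(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def scale_by_gain(signal, gain_db):
    """Scale the samples of a signal by a gain given in decibels."""
    return np.asarray(signal, dtype=np.float64) * db_to_amplitude(gain_db)


# -20 dB corresponds to multiplying the amplitude by 0.1.
quieter = scale_by_gain([1.0, -0.5, 0.25], -20)
```

Under this assumption, -100 dB corresponds to a factor of 1e-5 and -50 dB to roughly 0.0032, so both calls above would attenuate the signal heavily.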

Add random Gaussian noise to file based on SNR

from pydiogment.auga import add_noise

test_file = "path/test.wav"
add_noise(test_file, 10)
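
As a rough illustration of what SNR-based noise addition means, the sketch below scales white Gaussian noise so that the ratio of signal power to noise power matches a target SNR. It assumes the SNR is given in decibels and is a generic NumPy example, not Pydiogment's code; `add_gaussian_noise`, the 440 Hz test tone, and the 8 kHz sampling rate are choices made only for this example.

```python
import numpy as np


def add_gaussian_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise so the result has roughly the given SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


# Ten seconds of a 440 Hz tone sampled at 8 kHz (arbitrary example values).
rng = np.random.default_rng(0)
t = np.arange(80000) / 8000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = add_gaussian_noise(clean, snr_db=10, rng=rng)

# Measure the SNR that was actually achieved; it should be close to 10 dB.
measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```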

Frequency related augmentation

Change file tone

from pydiogment.augf import change_tone

test_file = "path/test.wav"
change_tone(test_file, 0.9)
change_tone(test_file, 1.1)

Time related augmentation

Slow down or speed up file

from pydiogment.augt import slowdown, speed

test_file = "path/test.wav"
slowdown(test_file, 0.8)
speed(test_file, 1.2)

Apply random cropping to the file

from pydiogment.augt import random_cropping

test_file = "path/test.wav"
random_cropping(test_file, 1)
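
Random cropping can be pictured as picking a random start position and keeping a fixed-length window from there. The sketch below does this on an in-memory array, assuming the crop length is given in seconds; `random_crop`, `fs`, and `duration_s` are hypothetical names for this example and do not describe Pydiogment's internals.

```python
import numpy as np


def random_crop(signal, fs, duration_s, rng=None):
    """Return a randomly positioned window of duration_s seconds from signal."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal)
    n = int(round(duration_s * fs))  # window length in samples
    if n < 1 or n > signal.size:
        raise ValueError("crop length must be between 1 sample and the signal length")
    # The upper bound is exclusive, so the window always fits inside the signal.
    start = int(rng.integers(0, signal.size - n + 1))
    return signal[start:start + n]


# Crop 1 second out of a 3 second ramp sampled at 8 kHz.
crop = random_crop(np.arange(24000), fs=8000, duration_s=1, rng=np.random.default_rng(0))
```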

Shift data on the time axis in a given direction

from pydiogment.augt import shift_time

test_file = "path/test.wav"
shift_time(test_file, 1, "right")
shift_time(test_file, 1, "left")
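
One simple way to think about a time shift is as a circular rotation of the samples. The sketch below uses `np.roll` for that, assuming the shift is given in seconds; whether Pydiogment wraps samples around or pads with silence is not stated here, so the wraparound behavior and the name `shift_with_wraparound` are assumptions for this example only.

```python
import numpy as np


def shift_with_wraparound(signal, fs, shift_s, direction):
    """Circularly shift a signal by shift_s seconds to the left or right."""
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    n = int(round(shift_s * fs))  # shift expressed in samples
    # Positive roll moves samples toward later times (right), negative toward earlier (left).
    return np.roll(np.asarray(signal), n if direction == "right" else -n)


# With fs=1, a 1 second shift moves the data by exactly one sample.
right = shift_with_wraparound([1, 2, 3, 4], fs=1, shift_s=1, direction="right")  # [4, 1, 2, 3]
left = shift_with_wraparound([1, 2, 3, 4], fs=1, shift_s=1, direction="left")    # [2, 3, 4, 1]
```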

📑 Documentation

Thorough documentation of the library is available at pydiogment.readthedocs.io.

👷 Contributing

Contributions are welcome and encouraged. To learn more about how to contribute to Pydiogment, please refer to the Contributing guidelines.

🎉 Acknowledgment and credits

The test file used in the pytests is OSR_us_000_0060_8k.wav from the Open Speech Repository.

BSD 3-Clause "New" or "Revised" License