A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
Updated Apr 25, 2025 · Python
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Soundscape analysis with BirdNET.
Experimental code: sound file preprocessing to improve Whisper transcriptions and avoid hallucinated text.
Turn an image into sound whose spectrogram looks like the image.
A simple yet effective audio-to-MIDI automatic piano transcription system.
Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
PyTorch implementation of our graph convolutional network (GCN) for human motion generation from music. Also with paired dance-music data for training!
📣 Python library for audio augmentation
Removes background noise from a sound file.
A video analysis tool built entirely in Python.
Using a Raspberry Pi, we listen to the coffee machine and count the number of coffees consumed.
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
logWMSE, an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source separation systems.
Sonification tool for turning scatter plots into perceptually uniform sound files for science and science access.
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
This project uses PANNs for audio tagging and sound event detection to obtain audio embeddings, then uses Milvus to search for similar audio items.
DNN-based hearing aid for real-time sound processing
🔊 Study of audio feature extraction (repo in French).
Encode an image to sound (WAV file) and view it as a spectrogram. Optimized Python 3 version.