Awesome-Audio

A curated list of awesome audio technology resources for developers

Code
Community
Education
Hardware
Industry
Research

Code

Software applications, tools, and APIs you can use to solve audio-related problems in your own awesome audio projects.

How-To Analyze Audio
How-To Edit Audio
How-To Playback Audio
How-To Read and Write Audio Files
How-To Record Audio
How-To Send Real-Time Audio
How-To Transcribe Audio into Text
How-To Turn Text into Voice and Speech
How-To Visualize Audio
Audio Plugin Development Tools

How-To Analyze Audio

APIs
- Dolby.io Media Analyze API - services to analyze an audio file to identify codec, clipping, loudness, sound classification, silence, etc. Also has options for Speech and Diagnostics.
Apps
- MATLAB DSP System Toolbox - application for designing, simulating, and analyzing signal processing systems
Python
- Librosa - Python package for music and audio analysis
- PyAudio Analysis - Python package for audio analysis and feature extraction
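
As a rough, dependency-free sketch of the kind of measurements mentioned above (clipping, level, silence), the Python below uses only the standard library. The helper name `analyze_samples` and its thresholds are illustrative assumptions, not part of any listed API or package, and a plain RMS level in dBFS is only a crude stand-in for standardized loudness measures.

```python
import math

def analyze_samples(samples, clip_level=0.999, silence_dbfs=-60.0):
    """Report simple level statistics for float samples in [-1.0, 1.0].

    Hypothetical helper for illustration; the thresholds are arbitrary defaults.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Convert linear RMS to decibels relative to full scale; guard log10(0).
    rms_dbfs = 20.0 * math.log10(rms) if rms > 0 else float("-inf")
    # Count samples at or above the (assumed) clipping threshold.
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    return {
        "peak": peak,
        "rms_dbfs": rms_dbfs,
        "clipped_samples": clipped,
        "is_silent": rms_dbfs < silence_dbfs,
    }

# One second of a 440 Hz sine at half amplitude, sampled at 8 kHz.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
report = analyze_samples(tone)
print(report)
```

For real projects, the packages listed above offer far richer features (spectral analysis, feature extraction, classification) than this sketch.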

How-To Edit Audio

APIs
- Dolby.io Media Enhance API - services to enhance media such as correcting audio impurities like noise, sibilance, equalization, tonality, loudness
- Dolby.io Media Transcode API - convert and assemble content that looks and sounds great no matter the device or where it’s viewed, with support for high resolution, high frame rates, and web and streaming formats
- Dolby.io Media Music Mastering API - get professional-sounding audio masters that keep your creative intent intact, built on thousands of hours of musical analysis
Apps
- Avid Pro Tools - music software for audio recording, composing, editing, and mastering
- iZotope - audio software for music production and post production, composing, editing, and mastering
- FL Studio - DAW for macOS and Windows
- Ableton Live - DAW for macOS and Windows
- Nuendo - DAW for macOS and Windows that has support for Dolby Atmos and other forms of spatial audio
- Logic Pro - Logic Pro is a digital audio workstation and MIDI sequencer software application for macOS
- GarageBand - Free tool for macOS users to record and edit audio
- Audacity - Audacity is a free and open-source digital audio editor and recording application software
- Reaper - Proprietary cross-platform DAW
- Bitwig Studio - Cross-platform DAW made by ex-Ableton employees
- Ardour - Ardour is a hard disk recorder and digital audio workstation application that runs on Linux, macOS, FreeBSD and Microsoft Windows.
- LMMS - free, open-source, cross-platform DAW

How-To Playback Audio

Android
- AudioTrack - Android class that streams PCM audio buffers to audio hardware for playback
- ExoPlayer - library for local or streaming playback of audio and video
- MediaPlayer - class for controlling playback of a pre-existing audio or video file
- Oboe - C++ library that wraps OpenSL ES and AAudio for high performance audio operations
JavaScript
- Cross-Browser Audio Basics - tutorial for creating an HTML5 audio player
Python
- PyAudio - Python bindings for PortAudio to interface with audio drivers to record or play back audio (Open-Source/MIT)

How-To Read and Write Audio Files

CLI
- ffmpeg - A complete, cross-platform solution to record, convert and stream audio and video.
- GStreamer - library for constructing graphs of media-handling components
- mpv - mpv is a free (as in freedom) media player for the command line. It supports a wide variety of media file formats, audio and video codecs, and subtitle types.
- VLC - VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.
- Handbrake - HandBrake is a tool for converting video from nearly any format to a selection of modern, widely supported codecs.
- Sound eXchange - SoX is billed as the Swiss Army knife of sound processing programs
Python
- PyAV - Python bindings for FFmpeg to access media via containers, streams, packets, codecs, and frames
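
For a quick experiment before reaching for the tools above, uncompressed WAV files can be written and read with Python's standard-library `wave` module. The sketch below is only an illustration: the helper names `write_tone` and `read_info`, the file name, and the tone parameters are assumptions made for this example, and it handles only mono 16-bit PCM.

```python
import math
import os
import struct
import tempfile
import wave

def write_tone(path, freq_hz=440.0, seconds=0.5, rate=44100):
    """Write a mono 16-bit PCM sine tone to `path` (hypothetical helper)."""
    n_frames = int(seconds * rate)
    frames = bytearray()
    for n in range(n_frames):
        sample = 0.5 * math.sin(2 * math.pi * freq_hz * n / rate)
        # Scale to signed 16-bit and pack as little-endian.
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # bytes per sample (16-bit)
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_frames

def read_info(path):
    """Return (channels, sample_width_bytes, rate, frame_count) of a WAV file."""
    with wave.open(path, "rb") as wav:
        return (wav.getnchannels(), wav.getsampwidth(),
                wav.getframerate(), wav.getnframes())

# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
written = write_tone(path)
info = read_info(path)
print(written, info)
```

Compressed or container formats (MP3, AAC, Ogg, and so on) are outside what `wave` supports; that is where ffmpeg, GStreamer, or PyAV come in.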

How-To Record Audio

Audio capture solutions...

Android
- AudioRecord - Android class for reading buffers of raw audio data from audio hardware
- MediaRecorder - records encoded audio or video and saves the recording to an output file
- Oboe - C++ library that wraps OpenSL ES and AAudio for high performance audio operations compatible across API levels
JavaScript
- MediaRecorder - Web API for processing a stream of media content such as audio tracks
- react-mic - JavaScript / React library to record audio cross-platform
Python
- PyAudio - Python bindings for PortAudio to interface with audio drivers to record or play back audio (Open-Source/MIT)
Swift
- AVFoundation - framework for audiovisual assets, control devices, audio processing, and system audio interactions
  - AVCapture - device, input, session, and output classes for a graph processing architecture allowing buffer analysis and processing (including video support)
- AVFAudio - foundation framework to play, record, and process audio and configure system behavior
  - AVAudioRecorder - class to record audio to a file and may be simplest when getting started
  - AVAudioEngine - group of audio nodes to generate and process audio signals for input and output; does not natively support video capture but highly configurable processing nodes
- Audio Toolbox - framework to record or play audio, convert formats, parse audio streams, and configure your audio session
Windows
- Core Audio APIs

How-To Send Real-Time Audio

Communications solutions...

APIs
- Dolby.io Communications API - services with SDKs for adding audio and video conferencing and communications
- Dolby.io Streaming API - Millicast, acquired by Dolby, is now part of the Dolby.io platform. Millicast is a WebRTC-based real-time streaming service that enables sub-second latency, broadcast-quality color and sound, global scale, and end-to-end encryption
JavaScript
- WebRTC API - capture and stream audio / video media between browsers without requiring an intermediary
- HLS Streaming - HLS lets you deploy content using ordinary web servers and content delivery networks. HLS is designed for reliability and dynamically adapts to network conditions by optimizing playback for the available speed of wired and wireless connections.
Local
- PulseAudio - PulseAudio is a sound server system for POSIX OSes
- JACK - JACK Audio Connection Kit is a professional sound server API and pair of daemon implementations to provide real-time, low-latency connections for both audio and MIDI data between applications
- Loopback - Cable-free audio routing for Mac
- Soundflower - macOS system extension that allows applications to pass audio to other applications.
- BlackHole - BlackHole is a modern macOS virtual audio driver that allows applications to pass audio to other applications with zero additional latency.

How-To Transcribe Audio into Text

Transcription solutions...

APIs
- AWS Transcribe - speech to text capabilities
- Azure Speech-to-Text - transcribe audio to text
- Google Speech-to-Text - convert speech into text
- Symbl.ai Transcription over WebSockets - speech to text
- Rev - convert audio and video to text, transcribed by humans
  - rev.ai - AI branch of Rev.com
- Speechmatics - The Most Accurate and Inclusive Speech Technology.
- Deepgram - Build better voice applications with faster, more accurate transcription through AI Speech Recognition.
- Picovoice - Picovoice is the end-to-end platform for adding voice to anything on your terms.
- sonix - Automated transcription in 35+ languages.
- fireflies.ai - Process audio, generate transcripts, and extract actionable data.
- AssemblyAI - Transcribe and understand audio with a single AI-powered API
Apps
- descript.com - use transcripts to cut and edit video
- Otter.ai - Generate rich notes for meetings, interviews, lectures, and other important voice conversations
Python
- PyKaldi - Python scripting layer for the Kaldi speech recognition toolkit.

How-To Turn Text into Voice and Speech

Speech synthesis solutions...

Aflorithmic API.audio - Simple APIs to transform text to speech, add sound design and make it sound beautiful at scale.
Amazon Polly - Turn text into lifelike speech using deep learning
Google Cloud Text to Speech - Convert text into natural-sounding speech using an API powered by the best of Google’s AI technologies.
Azure Text to Speech - A Speech service feature that converts text to lifelike speech
IBM Watson Text to Speech - Convert text into natural-sounding speech in a variety of languages and voices

How-To Visualize Audio

Apps
- headliner.app - create engaging social video with audio editing, transcription, and visualization
- getaudiogram.com - create engaging social video with audio visualizations
JavaScript
- Wavesurfer - a customizable audio waveform visualization built on Web Audio API; supporting spectrograms and other features

Audio Plugin Development Tools

JUCE - JUCE is an open-source cross-platform C++ application framework for desktop and mobile applications, including VST, VST3, AU, AUv3, RTAS and AAX audio plug-ins.
- react-juce - React-JUCE is a hybrid JavaScript/C++ framework that enables a React.js frontend for a JUCE application or plugin.
iPlug2 - iPlug 2 is a simple-to-use C++ framework for developing cross platform audio plug-ins/apps and targeting multiple plug-in APIs with the same minimalistic code.
AudioKit - AudioKit is an entire audio development ecosystem of code repositories, packages, libraries, algorithms, applications, playgrounds, tests, and scripts, built and used by a community of audio programmers, app developers, engineers, researchers, scientists, musicians, gamers, and people new to programming.
Plug'n Script - Blue Cat's Plug'n Script is an audio and MIDI scripting plug-in and application that can be programmed to build custom effects or virtual instruments, without quitting your favorite DAW software.
Faust - Faust (Functional Audio Stream) is a functional programming language for sound synthesis and audio processing with a strong focus on the design of synthesizers, musical instruments, audio effects, etc. Faust targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.
SOUL - The SOUL project is creating a new language and infrastructure for writing and deploying audio code. It aims to unlock improvements in latency, performance, portability and ease-of-development that aren't possible with the current mainstream techniques that are being used.
Max - Max is an infinitely flexible space to create your own interactive software

Community

Social media, discussion groups, events, and audio experiences you can seek out to increase your appreciation for awesome audio.

Awesome Lists
Collections
Conferences and Events
Experiences and Places
Groups
Podcasts
Social Forums
Social Networks
Video Channels

Awesome Lists

awesome-scientific-audio - Python for scientific audio
awesome-sound - curated list of delightful sound packages and resources
awesome-webaudio - curated list of awesome webaudio packages and resources
awesome-diarization - curated list of speaker diarization papers, libraries, datasets, and other resources

Collections

Internet Archive Audio Archive - over 14 million recordings of music, concerts, audiobooks, radio, etc.
Library of Congress Audio Recordings - over 20,000 audio recordings of historical or cultural significance

Conferences and Events

Audio Developers Conference - ADC is an annual event celebrating audio development technologies from music applications and game audio to audio processing and embedded systems. ADC's mission is to help attendees acquire and develop new skills.
Demuxed - video-tech community event for technical topics related to video technology
KrankyGeek - annual event for WebRTC technology used for real time communications in a web browser
Web Audio Conference - WAC is an international conference dedicated to web audio technologies and applications. The conference addresses academic research, artistic research, development, design, evaluation and standards concerned with emerging audio-related web technologies such as Web Audio API, WebRTC, WebSockets and JavaScript.

Experiences and Places

Audium - sound art event in a theatre of sound-sculpted space (San Francisco)
ASMR University - art & science of autonomous sensory meridian response
Exploratorium Listen Exhibit - making sense of sound (San Francisco)

Groups

Audio Engineering Society - AES is an international organization that unites audio engineers, creative artists, scientists, and students promoting advances in audio and disseminating new knowledge and research with many local communities
International Society for Music Information Retrieval - ISMIR is a non-profit seeking to advance access, organization, and understanding of music information
Women's Audio Mission - WAM is a non-profit built and run by women to inspire and educate on the subject of audio in music and media

Podcasts

Audio Programmer Podcast - all things audio programming, including DSP (digital signal processing), coding, and audio tech.
Dissect - Long form music analysis of albums that goes track by track discussing music theory and artist intention
Game Audio Podcast - aims to provide sound designers, composers, and everyone else interested in game audio a biweekly show
Song Exploder - music podcast where musicians take apart their songs and tell the story of how they were made
Twenty Thousand Hertz - the stories behind the world's most recognizable and interesting sounds

Social Forums

Music and Audio Professionals - LinkedIn group for audio engineers, music arrangers, music composers, etc.
r/audioengineering - products, practices, and stories about the profession or hobby of recording, editing, and producing audio
Signal Processing StackExchange - question and answer for practitioners of the art and science of signal, image, and video processing
The Audio Programmer Discord - We invite you to the Audio Programmer community, where you can connect with other audio programmers, ask questions about coding and choosing the right career path, find job opportunities and more!

Social Networks

Display - social platform for creators
Lava - social network for audio

Video Channels

The Audio Programmer - SOUL tutorials, JUCE tutorials, teaching audio programming for beginners, etc.

Education

Resources such as books, courses, tutorials, journals, and blogs that are worth checking out to become more awesome with audio yourself.

Books
Courses
Journals
Tutorials and Blogs

See something missing? View the contribute section and let us know.

Books

Corey, Jason. (2016). Audio Production and Critical Listening: Technical Ear Training. Focal Press.
Dittmar, Tim. (2017). Audio Engineering 101: A Beginner's Guide to Music Production. Routledge.
Watkinson, John. (2002). Introduction to Digital Audio. Focal Press.

Courses

Audio Signal Processing - audio signal processing methodologies for music. Topics include spectral processing techniques, sound transformations, and analyzing, synthesizing, and transforming audio signals with Python (Coursera)
Digital Media Foundations - Audio Made Simple. Topics include creating space with channels, measuring power of sound, capturing tone as frequency, phase. (LinkedIn Learning)
Communication Acoustics - a comprehensive course that starts from the basics (what sound is and how it propagates) and gradually builds up to the human auditory system, psychoacoustics (connecting the physical world to how we perceive sounds), speech acoustics (the human speech production system), and finally electroacoustics (the world of loudspeakers and microphones) (edX)
Fundamentals of Audio and Music Engineering - basic concepts of acoustics and electronics and how they can be applied to understanding musical sound and making music with electronic instruments. Topics include: sound waves, musical sound, basic electronics, and applications of these basic principles in amplifiers and speaker design (Coursera)

Journals

Computer Music Journal - a peer-reviewed academic journal that covers a wide range of topics related to digital audio signal processing and electroacoustic music
IEEE/ACM Transactions on Audio, Speech, and Language Processing - dedicated to innovative theory and methods for processing signals representing audio, speech and language, and their applications. This includes analysis, synthesis, enhancement, transformation, classification and interpretation of such signals as well as the design, development, and evaluation of associated signal processing systems
Journal of the Acoustical Society of America - a monthly peer-reviewed scientific journal covering aspects of acoustics
Journal of the Audio Engineering Society - peer-reviewed journal devoted to audio technology
SMPTE Motion Imaging Journal - the key publication of the Society, providing peer-reviewed articles on topics in 3D, imaging processing, display technologies, audio, compression, digital cinema, and much more

Tutorials and Blogs

Designing Sound - tutorials on the art & technique of sound design
ProAudioGirl - Amy Tucker's blog covering audio for filmmakers, dialog editing basics, hacks & tricks, etc.
The Ear Training Guide for Audio Producers - NPR training guide to help identify problematic audio and prevent most common problems
Using ffmpeg to manipulate audio and video files - How to tame the "Swiss army knife" of audio and video manipulation…

Hardware

Resources for hardware considerations for recording and listening to awesome audio.

View the contribute section and let us know what you think would be great resources for this section.

Industry

Domains and use-case specific resources such as broadcasting, communications, gaming, music, and the web where awesome audio is applied.

Standards

Standards

AES Standards - 2-channel digital audio, MADI, analog XLR pin-out, networked audio, etc.
ATSC A/85 - Advanced Television Systems Committee (ATSC) Techniques for establishing and maintaining audio loudness for digital television
EBU R.128 - European Broadcasting Union (EBU) loudness normalisation and permitted maximum level of audio signals
ITU-R BS.1770 - International Telecommunication Union (ITU) algorithms to measure audio programme loudness and true-peak audio level
ITU-R BS.2159-7 - International Telecommunication Union (ITU) multi-channel speaker configurations for home and broadcast applications
MPEG Advanced Audio Coding - AAC wideband perceptual audio coding algorithm that provides state of the art levels of compression for audio signals
SMPTE Audio Standards - collection of standards related to audio

Research

Areas of experimentation and exploration for awesome algorithms.

Data

AudioSet - large-scale dataset of manually annotated audio events with sound ontology
CSTR VCTK - speech data uttered by 110 English speakers with various accents reading about 400 sentences from newspapers
Freesound - Freesound is a collaborative database of Creative Commons Licensed sounds.
LibriSpeech - speech recognition training corpus with 1000 hours of English speech of read audiobooks from the LibriVox project
Mozilla Common Voice - open-source, multi-language dataset of voices to train speech-enabled applications with 68 validated hours and 18 languages
Netflix Open Content - test titles with documentary, live action, and animation films
Spoken Wikipedia Corpora - SWC is comprised of spoken articles in multiple languages
Voice Datasets - A comprehensive list of open source voice and music datasets.

Contribute

Contributions welcome! Read the contribution guidelines first.

DolbyIO / awesome-audio