Free Dataset
This is a dataset of various freely-distributed files, primarily intended for testing data compression.
Currently includes only audio.
Naming scheme
Audio
type-channels-samplerate-bitdepth-{description}.wav
- Type:
instruments
/music
/noises
/tones
/voice
- Channels:
mono
/stereo
/ number of channels - Samplerate: float value in kHz +
k
- Bitdepth: number of bits +
k
+f
for floating-point - Description:
{
+ name, notes, other info +}
Peg grammar for Nim pegs:
const DatasetNameRules = """
namingscheme <- {type} '-' {channels} '-' {samplerate} '-' {bitdepth} ('-' description)?
type <- 'instruments' / 'music' / 'noises' / 'tones' / 'voice'
channels <- \d+ / 'mono' / 'stereo'
samplerate <- \d+ ('.' \d+)? 'k'
bitdepth <- ('8' / '16' / '24' / '32' / '64') 'b' 'f'?
description <- '{' {@} '}'"""
Content information
audio/
The set includes files in a wide variety of PCM formats holding various kinds of sounds, including artificial test tones. This means, this group of files does not represent a common set of audio you would find in the wild, treating them jointly is not statistically representative.
File | Description | License |
---|---|---|
instruments-stereo-48k-24b-{Drum stem}.wav | Drum recording, overhead mics with minimal processing. Typical file for audio-engineering workflows. By @indiscipline. | CC BY 4.0 |
music-stereo-44k-16b-{Spitsbergen 3}.wav | Progressive electronica. A great track by Viktor Mayorov, this repo's author late friend, coincidentally released under CC Attribution 3.0 Unported . From the album "Мохообразные, лишайники и цианопрокариоты острова Шпицберген". |
CC BY 3.0 |
music-stereo-48k-24b-{Long Way to Tipperary}.wav | "It's A Long Way To Tipperary" by Judge & Williams, performed by Victor Novelty Band, 1930 (Victor 22487-B). | Public Domain |
music-stereo-96k-24b-{Drunken Sailor}.wav | "Drunken Sailor", trad. arranged by R.R. Terry, performed by John Goss and the Cathedral Male Voice Quartet, 1927 (His Master's Voice (B 2420)). | Public Domain |
noises-mono-48k-16bf.wav | Pink Noise, Binary White Noise, Brown Noise. One minute of each. | CC0 |
noises-stereo-96k-16b-{faux-stereo;dithered}.wav | Dithered faux-stereo. One minute of Pink, Binary White and Brown noises. Converted from a higher bitdepth to 16 bit with a conventional dithering noise added. This introduces small differences between otherwise identical interleaved stereo samples, which broadens the distribution of the sample values. For example, in case of Binary noise, you can see 4a0e 4c0e for left and right channels. |
CC0 |
noises-stereo-96k-32bf-{faux-stereo}.wav | Faux-stereo: both channels hold identical data, so each other sample is redundant. | CC0 |
tones-stereo-48k-32bf-{sweeps}.wav | Sweeps in the audible range, up to ~ 8kHz: 20s sine sweep, 10s triangle sweep, 10s square sweep, 20s of two close sines (beating), 20s of counterdirectional sines. | CC0 |
voice-mono-44.1k-16b-{upconvert}.wav | Narration of "The Art of War" by Sun Tzu performed by Moira Fogarty for LibriVox.org, 2006-11-02. Upconvert: Originally an mp3, decoded to WAV PCM, which should theoretically decrease the entropy. | Public Domain |