Achronferry / frechet-audio-distance

A lightweight library for Frechet Audio Distance calculation.

Frechet Audio Distance in PyTorch

A lightweight library for Frechet Audio Distance (FAD) calculation.

Currently, we support embeddings from VGGish and PANN.

Installation

pip install frechet_audio_distance

Demo

from frechet_audio_distance import FrechetAudioDistance

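# Pick one of the two embedding models below.

# Option 1: VGGish embeddings
frechet = FrechetAudioDistance(
    model_name="vggish",
    use_pca=False,
    use_activation=False,
    verbose=False
)

# Option 2: PANN embeddings (replaces the VGGish instance created above)
frechet = FrechetAudioDistance(
    model_name="pann",
    use_pca=False,
    use_activation=False,
    verbose=False
)

# Compute the FAD between the background (reference) set and the evaluation set
fad_score = frechet.score("/path/to/background/set", "/path/to/eval/set")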
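
To make the number returned by `score` easier to interpret, here is a minimal sketch of the Frechet distance itself: each set of embeddings is summarized by a Gaussian (mean and covariance), and the distance between the two Gaussians is ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)). This is not the library's internal code. The helper names `embedding_statistics` and `frechet_distance` are hypothetical, the embeddings are random stand-ins rather than VGGish or PANN outputs, and NumPy is assumed to be installed.

```python
import numpy as np


def embedding_statistics(embeddings):
    """Mean vector and covariance matrix of an (n_samples, n_dims) array."""
    mu = embeddings.mean(axis=0)
    sigma = np.cov(embeddings, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # positive semi-definite covariances (clipping guards against round-off).
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


# Random stand-ins for embeddings (hypothetical data, 8 dimensions).
rng = np.random.default_rng(0)
background = rng.normal(0.0, 1.0, size=(500, 8))
evaluation = rng.normal(0.5, 1.0, size=(500, 8))

bg_stats = embedding_statistics(background)
ev_stats = embedding_statistics(evaluation)

same = frechet_distance(*bg_stats, *bg_stats)     # identical sets: close to 0
shifted = frechet_distance(*bg_stats, *ev_stats)  # shifted mean: clearly above 0
print(same, shifted)
```

A lower score means the evaluation embeddings are statistically closer to the background embeddings; identical statistics give a score of zero up to floating-point error.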

Result validation

Test 1: Distorted sine waves on vggish (as provided here) [notes]

Comparison of FAD scores with the original implementation in google-research/frechet-audio-distance:

| Implementation | baseline vs test1 | baseline vs test2 |
|---|---|---|
| google-research | 12.4375 | 4.7680 |
| frechet_audio_distance | 12.7398 | 4.9815 |
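
As a quick way to read the Test 1 numbers, the sketch below computes the relative difference between the two implementations from the scores reported above. The dictionary and function names are only illustrative.

```python
# FAD scores reported above for Test 1 (VGGish).
reference = {"baseline vs test1": 12.4375, "baseline vs test2": 4.7680}
reimplementation = {"baseline vs test1": 12.7398, "baseline vs test2": 4.9815}


def relative_difference(ours, theirs):
    """Signed difference of `ours` from `theirs`, as a fraction of `theirs`."""
    return (ours - theirs) / theirs


rel = {
    name: relative_difference(reimplementation[name], reference[name])
    for name in reference
}
for name, value in rel.items():
    print(f"{name}: {value:+.2%}")
```

The scores from this library come out roughly 2.4% and 4.5% higher than the google-research values on the two comparisons.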

Test 2: Distorted sine waves on PANN

| Implementation | baseline vs test1 | baseline vs test2 |
|---|---|---|
| frechet_audio_distance | 0.000465 | 0.00008594 |
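
Both tests compare a clean set of sine waves against distorted versions. The exact distortions are not described here, so the following is only a sketch of how such a pair of directories could be produced with the Python standard library. The frequencies, the hard-clipping distortion, the 16 kHz sample rate, and all function and file names are assumptions for illustration, not the script used to obtain the numbers above.

```python
import math
import struct
import wave
from pathlib import Path

SAMPLE_RATE = 16000  # assumed sample rate in Hz


def sine_wave(freq_hz, seconds=1.0, amplitude=0.8):
    """Return a list of float samples in [-1, 1] for a pure sine tone."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def hard_clip(samples, limit=0.4):
    """Example distortion (assumed): clamp every sample to [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in samples]


def write_wav(path, samples):
    """Write float samples as a mono 16-bit PCM WAV file."""
    frames = struct.pack(f"<{len(samples)}h", *(int(s * 32767) for s in samples))
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


def make_sets(root, freqs=(220, 440, 880)):
    """Create a clean `background` directory and a clipped `eval` directory."""
    background = Path(root) / "background"
    distorted = Path(root) / "eval"
    for directory in (background, distorted):
        directory.mkdir(parents=True, exist_ok=True)
    for freq in freqs:
        clean = sine_wave(freq)
        write_wav(background / f"sin_{freq}.wav", clean)
        write_wav(distorted / f"sin_{freq}.wav", hard_clip(clean))
    return background, distorted
```

The two returned directories could then be passed as the background and evaluation arguments of `frechet.score`.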

References

VGGish in PyTorch: https://github.com/harritaylor/torchvggish

Frechet distance implementation: https://github.com/mseitzer/pytorch-fid

Frechet Audio Distance paper: https://arxiv.org/abs/1812.08466

PANN paper: https://arxiv.org/abs/1912.10211


License: MIT License

