Antonio Scaiella's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

core

Production-ready AI assistant framework

Language: Python · License: GPL-3.0 · Stargazers: 0 · Issues: 0 · Issues: 0

DeepSpeech-Italian-Model

Tooling for producing Italian model (public release available) for DeepSpeech and text corpus

Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 24

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
Language: Python · Stargazers: 0 · Issues: 1 · Issues: 0
Language: Python · Stargazers: 0 · Issues: 0 · Issues: 0

OmniFusion

OmniFusion — a multimodal model to communicate using text and images

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

squad-it

A large scale dataset for Question Answering in Italian

Stargazers: 0 · Issues: 0 · Issues: 0
Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

skynet

AI core services for Jitsi

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0