pawel-polyai's starred repositories

professional-programming

A collection of learning resources for curious software engineers

Language: Python | License: MIT | Stargazers: 46505 | Issues: 0

jetson-voice

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT

Language: Python | Stargazers: 183 | Issues: 0

gigaGPT

a small code base for training large models

Language: Python | License: Apache-2.0 | Stargazers: 262 | Issues: 0

uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Language: Python | License: Apache-2.0 | Stargazers: 1030 | Issues: 0

usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

Language: C++ | License: Apache-2.0 | Stargazers: 2150 | Issues: 0

act-plus-plus

Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN

Language: Python | License: MIT | Stargazers: 2975 | Issues: 0
Language: Python | License: MIT | Stargazers: 672 | Issues: 0
Language: Python | License: MIT | Stargazers: 1433 | Issues: 0

moondream

tiny vision language model

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4972 | Issues: 0

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | License: Apache-2.0 | Stargazers: 1154 | Issues: 0

datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Language: Python | License: Apache-2.0 | Stargazers: 1971 | Issues: 0

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | License: Apache-2.0 | Stargazers: 2012 | Issues: 0

SoundStorm

The reproduced code for Google's SoundStorm

Language: Python | Stargazers: 245 | Issues: 0

LLM-Finetuning-Toolkit

Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 768 | Issues: 0

swarm-jax

Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes

Language: Python | Stargazers: 236 | Issues: 0

Paddle

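PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)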

Language: C++ | License: Apache-2.0 | Stargazers: 22124 | Issues: 0

PySoundConcat

A Python project for generating concatenative synthesis driven representations of audio files based on audio database analysis.

Language: Python | Stargazers: 16 | Issues: 0

PhoneticMatching

A phonetic matching library. Includes text utilities to do string comparisons on phonemes (the sound of the string), as opposed to characters.

Language: C# | License: MIT | Stargazers: 154 | Issues: 0
Language: Python | License: BSD-3-Clause | Stargazers: 250 | Issues: 0

TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Language: Python | License: Apache-2.0 | Stargazers: 3812 | Issues: 0

wavegrad

A fast, high-quality neural vocoder.

Language: Python | License: Apache-2.0 | Stargazers: 273 | Issues: 0

cc_net

Tools to download and clean up Common Crawl data

Language: Python | License: MIT | Stargazers: 964 | Issues: 0
Language: Python | License: MIT | Stargazers: 69 | Issues: 0

goclassy

An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.

Language: Go | License: Apache-2.0 | Stargazers: 85 | Issues: 0

conversational-datasets

Large datasets for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 1286 | Issues: 0