andrewsofie

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models for text-to-image generation, image/video restoration/enhancement, etc.

Language: Jupyter Notebook · Apache-2.0 · 688500

flowtron

Flowtron is an auto-regressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer

Language: Jupyter Notebook · Apache-2.0 · 88800

ToolChanger

STPs / STLs / DXFs / PDFs

GPL-3.0 · 30100

OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Language: Python · Apache-2.0 · 154300

legacy-v1-python-example

Example script (supported) to help you integrate with our SaaS v1 API

Language: Python · 1400

noizeus_corpora

Speech corpora for the speech recognition evaluation system

1700

StoryTelling

A neural-network-based StoryTeller that outputs a short story from an input image

Language: Python · 1300

SPADE-Tensorflow

Simple TensorFlow implementation of "Semantic Image Synthesis with Spatially-Adaptive Normalization", a.k.a. GauGAN or SPADE (CVPR 2019 Oral)

Language: Python · MIT · 36500

pytorch_GAN_zoo

A mix of GAN implementations including progressive growing

Language: Python · BSD-3-Clause · 160700

TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Language: Jupyter Notebook · MPL-2.0 · 926300

SceneGraphParser

A Python toolkit for parsing natural-language captions into scene graphs (symbolic representations).

Language: Python · MIT · 53800

planercnn

PlaneRCNN detects and reconstructs piece-wise planar surfaces from a single RGB image

Language: Python · NOASSERTION · 55400

DEXTR-PyTorch

Deep Extreme Cut: http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr

Language: Python · GPL-3.0 · 84400

Phonetisaurus

Phonetisaurus G2P

Language: Shell · BSD-3-Clause · 44600

neural_renderer

A PyTorch port of the Neural 3D Mesh Renderer

Language: Python · NOASSERTION · 112900

voca

This codebase demonstrates how to synthesize realistic 3D character animations given an arbitrary speech signal and a static character mesh.

Language: Python · 114200

WER-in-python

This program calculates the word error rate of an ASR hypothesis and prints the aligned result.

Language: Python · MIT · 15200

free-spoken-digit-dataset

A free audio dataset of spoken digits. An audio version of MNIST.

Language: Python · 61800

text-to-ssml

Converts your text to AWS Polly's SSML.

Language: Rust · MIT · 1100