SamuelZeng's starred repositories

ADeus

An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own server. You can then chat with Adeus using the app, and it will have all the right context about what you want to talk about - a truly personalized, personal AI.

Language: TypeScript | License: NOASSERTION | Stargazers: 2874 | Issues: 0

VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

Language: Python | License: MIT | Stargazers: 605 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 0

chatbot-ui

AI chat for every model.

Language: TypeScript | License: MIT | Stargazers: 27864 | Issues: 0

AI-Employe

Create browser automation as if you were teaching a human using GPT-4 Vision.

Language: TypeScript | License: AGPL-3.0 | Stargazers: 546 | Issues: 0
Language: TypeScript | Stargazers: 507 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4379 | Issues: 0

OutfitAnyone

Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person

Stargazers: 5457 | Issues: 0

self-operating-computer

A framework to enable multimodal models to operate a computer.

Language: Python | License: MIT | Stargazers: 8475 | Issues: 0

neuroglancer

WebGL-based viewer for volumetric data

Language: TypeScript | License: Apache-2.0 | Stargazers: 1046 | Issues: 0

CBIM-Medical-Image-Segmentation

A PyTorch framework for medical image segmentation

Language: Python | License: Apache-2.0 | Stargazers: 256 | Issues: 0

vimGPT

Browse the web with GPT-4V and Vimium

Language: Python | License: MIT | Stargazers: 2582 | Issues: 0

HeyGenClone

A simple and open-source analogue of the HeyGen system

Language: Python | Stargazers: 861 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 1872 | Issues: 0

gsgen

[CVPR 2024] Text-to-3D using Gaussian Splatting

Language: Python | License: MIT | Stargazers: 749 | Issues: 0

GPT-4V-Act

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

Language: JavaScript | Stargazers: 943 | Issues: 0

pyvideotrans

Translate a video from one language to another and add dubbing.

Language: Python | License: GPL-3.0 | Stargazers: 8991 | Issues: 0

3d-to-photo

3D to Photo is an open-source package by Dabble that combines threeJS and Stable Diffusion to build a virtual photo studio for product photography. Load a 3D model into the browser and virtually shoot it in any kind of scene you can imagine.

Language: JavaScript | License: MIT | Stargazers: 434 | Issues: 0

Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)

Language: Jupyter Notebook | License: MIT | Stargazers: 2759 | Issues: 0

autogen-ui

Web UI for AutoGen (a framework for multi-agent LLM applications)

Language: TypeScript | License: MIT | Stargazers: 681 | Issues: 0

tryOnDiffusion

Implementation of the tryOnDiffusion paper

Language: Python | Stargazers: 19 | Issues: 0

RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Language: Python | License: MIT | Stargazers: 1459 | Issues: 0
Language: Python | Stargazers: 155 | Issues: 0

ai-town

An MIT-licensed, deployable starter kit for building and customizing your own version of AI town, a virtual town where AI characters live, chat and socialize.

Language: TypeScript | License: MIT | Stargazers: 7285 | Issues: 0

PromethAI-Mobile

PromethAI app

Language: Dart | License: NOASSERTION | Stargazers: 15 | Issues: 0

AgentSims

AgentSims is an easy-to-use infrastructure for researchers from all disciplines to test the specific capacities they are interested in.

Language: Python | License: MIT | Stargazers: 733 | Issues: 0

automa

A browser extension for automating your browser by connecting blocks

Language: Vue | License: NOASSERTION | Stargazers: 11105 | Issues: 0

Voyager

An Open-Ended Embodied Agent with Large Language Models

Language: JavaScript | License: MIT | Stargazers: 5429 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20433 | Issues: 0