MovLab2's repositories
Agent-0
This project is a **proof of concept** that aims to replicate the reasoning capabilities of OpenAI's newly released o1 model.
agent-zero
Agent Zero AI framework
agents
Build real-time multimodal AI applications 🤖🎙️📹
amica
Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.
Aria
Codebase for Aria - an Open Multimodal Native MoE
cognitive-services-speech-sdk
Sample code for the Microsoft Cognitive Services Speech SDK
DingoQuadruped
Base code for the Dingo quadruped, modified from the Stanford Pupper and Notspot repositories. Includes integration with ROS Noetic and a simulation of the Dingo.
EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Freenove_Robot_Dog_Kit_for_Raspberry_Pi
Applies to FNK0050
local-talking-llm
A talking LLM that runs on your own computer without needing the internet.
MemGPT
Create LLM agents with long-term memory and custom tools 📚🦙
Microsoft-Activation-Scripts
Open-source Windows and Office activator featuring HWID, Ohook, KMS38, and Online KMS activation methods, along with advanced troubleshooting.
mobile-aloha
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
open-webui
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
openedai-speech
An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
text-to-audio2face
Web interface to convert text to speech and route it to an Audio2Face streaming player.
UEVR
Universal Unreal Engine VR Mod (4.8 - 5.3)
voice-changer
Realtime Voice Changer
WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.