computer-use

There are 64 repositories under the computer-use topic.

bytedance / UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm
Language:TypeScript 19429
trycua / cua
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
agent ai-agent apple computer-use computer-use-agent containerization cua desktop-automation hacktoberfest lume macos manus operator swift virtualization virtualization-framework windows windows-sandbox
Language:Python 11202
web-infra-dev / midscene
Your AI Operator for Web, Android, Automation & Testing.
ai ai-test browser-use computer-use gpt-operator javascript phone-use testing
Language:TypeScript 10641
bytebot-ai / bytebot
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
agent agentic-ai agents ai ai-agents ai-tools anthropic automation bytebot computer-use computer-use-agent cua desktop desktop-automation docker gemini llm mcp openai
Language:TypeScript 9469
simular-ai / Agent-S
Agent S: an open agentic framework that uses computers like a human
agent-computer-interface ai-agents computer-automation gui-agents memory mllm planning retrieval-augmented-generation in-context-reinforcement-learning computer-use grounding computer-use-agent cua
Language:Python 8076
Upsonic / Upsonic
Agent Framework For Fintech
agent agent-framework claude computer-use llms mcp model-context-protocol openai rag reliability
Language:Python 7677
A9T9 / RPA
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
web-automation selenium-ide imacros browser-extension browser-automation data-driven-tests web-scraping anthropic anthropic-claude computer-use
Language:JavaScript 1766
e2b-dev / open-computer-use
AI computer use powered by open source LLMs and E2B Desktop Sandbox
agent ai anthropic claude computer-use llm
Language:Python 1636
showlab / ShowUI
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
agent computer-use gui-agent vision-language-action vision-language-model
Language:Python 1542
trycua / acu
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
ai ai-research awesome computer computer-use gui-agent ui-agent
1501
OpenAdaptAI / OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
process-automation python transformers large-language-models large-multimodal-models huggingface segment-anything large-action-model process-mining agents ai-agents ai-agents-framework anthropic google-gemini openai ultralytics generative-process-automation computer-use gpt4o omniparser
Language:Python 1418
zai-org / CogAgent
An open-sourced end-to-end VLM-based GUI Agent
gui-agent computer-use vlm agent glm
Language:Python 1081
deedy / mac_computer_use
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
ai anthropic claude computer-use
Language:Python 825
microsoft / WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
agentic ai ai-agent ai-research windows ai-benchmark desktop-agent computer computer-use
Language:Python 783
instavm / clickclickclick
A framework to enable autonomous Android and computer use with any LLM (local or remote)
android-automation computer-use gemini ollama openai agents ai-agents-framework antrophic framework generative-ai molmo python operator
Language:Python 512
suitedaces / computer-agent
Desktop app powered by Claude’s computer use capability to control your computer
ai ai-tools anthropic claude-3-5-sonnet computer-use gui pyqt pyqt6 python
Language:Python 506
baryhuang / mcp-remote-macos-use
The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop app
computer-use mcp-server macos-use claude-desktop macos general-agent
Language:Python 413
OS-Agent-Survey / OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
mllms survey agent llms gui gui-agent computer-use phone-use browser-agent web-agent os-agent computing-devices os-agent-survey computer-using-agent computer-using operator
364
BrowserOperator / browser-operator-core
Browser Operator - The AI browser with a built-in Multi-Agent platform! Open source alternative to ChatGPT Atlas, Perplexity Comet, Dia and Microsoft CoPilot Edge Browser
agent-workflow agentic-ai agents langgraph llamacpp llm mcp mcp-client ollama open-operator openai operator browser-operator computer-use agent-browser ai-browser agentic-browser agentic-framework agentic-workflow
Language:TypeScript 317
cyberdesk-hq / cyberdesk
Open source virtual desktops for AI agents
ai-agents computer-use fastapi hono kubernetes nextjs terraform virtual-machine
Language:JavaScript 290
LLmHub-dev / open-computer-use
The Open Framework for autonomous virtual computer agents at scale, fully open-source, safe, auditable, and production-ready.
agentic-framework ai ai-agents claude computer-use computer-use-agent llm openai cli-tools computer-control full-stack gui-automation llm-agents saas-application gemini-api xai agent anthropic google-api gemini-ai
Language:TypeScript 201
cuga-project / cuga-agent
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware features.
computer-use enterprise generalist-agent mcp
Language:Python 182
aditya-nadkarni / spongecake
Spongecake is the easiest way to launch computer use agents.
ai-agents ai-agents-framework automation computer-use docker llm openai python
Language:JavaScript 159
BIGPPWONG / EdgeBox
A fully-featured, GUI-powered local LLM Agent sandbox with complete MCP protocol support. Features both CLI and full desktop environment, enabling AI agents to operate browsers, terminal, and other desktop applications just like humans. Based on E2B oss code.
code-interpreter computer-use e2b llm-agent llm-sandbox mcp
Language:TypeScript 157
bilalonur / awesome-llm-os
A curated list of awesome resources, tools, research papers, and projects related to the concept of Large Language Model Operating Systems (LLM-OS).
large-language-models llm llmos natural-language-processing operating-systems llm-os large-language-model awesome awesome-list agent computer-use
137
777genius / os-ai-computer-use
AI controls your OS. OS AI Computer Use, OS and API agnostic. Currently supports the Anthropic (Claude) API. Desktop app ready.
agent ai ai-agents-framework anthropic browser-use claude computer-use computer-use-agent computer-vision macos os-ai-cross-platform-engineering python vision windows artificial-intelligence cli flutter gui desktop-automation
Language:Python 133
chatsci / Aeiva
A general AI agent framework that can be adapted to various tasks and environments.
agent ai ai4science computer-use llm multi-agent-system multimodal self-evolving-systems large-languge-models memory world-model computer-usage
Language:Python 102
jeffrey-zang / opus
On-device computer use agent that runs fully in the background 🪄
computer-use electron opus react tailwind agentic macos
Language:TypeScript 93
open-compass / MMBench-GUI
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android and Web.
benchmark-framework computer-use gui-agent vision-language-model
Language:Python 84
openmule / gacua
The World's First Out-of-the-Box Computer Use Agent Powered by Gemini-CLI @openmule
agent ai computer-use gacua
Language:TypeScript 84
TurixAI / TuriX-CUA
This is the official website for TuriX Computer-use-Agent
agent ai-agents computer-use-agent cua computer-automation mcp computer-use browser-use gui-agent gui-operator qwen3-vl
Language:Python 74
AB498 / computer-control-mcp
MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.
automation computer-use mcp
Language:Python 68
TongUI-agent / TongUI-agent
Release of code, datasets and model for our work TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
computer-use-agent vision-language-model tongui agent computer-use gui-agent vision-language-action vision-language-action-model
Language:HTML 55
lvqq / intelli-browser
✨ Use natural language to control your browser, powered by LLM and playwright
claude claude-3-5-sonnet e2e e2e-tests playwright anthropic computer-use
Language:TypeScript 48
presidio-oss / factif-ai
AI-powered computer control for automated testing. Factifai uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
automated-testing computer-use hai human-ai omniparser anthropic automation bedrock claude docker-vnc gpt-4o puppeteer testing factifai
Language:TypeScript 48
reidbarber / webmarker
Mark web pages for use with vision-language models
prompt prompt-engineering som vision-language-model set-of-mark claude gemini gpt4o gpt4v llms playwright qwen-vl operator computer-use computer-using-agent cua
Language:TypeScript 46
