Chen Wu (ChenWu98)

User data from GitHub: https://github.com/ChenWu98

Company: Carnegie Mellon University

Location: Pittsburgh, PA

Home Page: https://chenwu.io/

GitHub: @ChenWu98


Organizations
HKUNLP

Chen Wu's repositories

cycle-diffusion

[ICCV 2023] A latent space for stochastic diffusion models

Language: Python | License: NOASSERTION | Stargazers: 644 | Issues: 11 | Issues: 35

unified-generative-zoo

[ICCV 2023] https://arxiv.org/abs/2210.05559

Language: Python | License: NOASSERTION | Stargazers: 122 | Issues: 8 | Issues: 1

generative-visual-prompt

[NeurIPS 2022] (Amortized) distributional control for pre-trained generative models

Language: Python | License: NOASSERTION | Stargazers: 121 | Issues: 1 | Issues: 0

agent-attack

[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents

Language: Python | License: MIT | Stargazers: 108 | Issues: 3 | Issues: 1

algorithmic-creativity

[ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Language: Python | Stargazers: 72 | Issues: 0 | Issues: 0

Point-Then-Operate

Code for the ACL 2019 paper "A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer"

Language: Python | License: Apache-2.0 | Stargazers: 45 | Issues: 3 | Issues: 1

cliport-batchify

A batched version of CLIPort: What and Where Pathways for Robotic Manipulation

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 5 | Issues: 0 | Issues: 0

Coupled-VAE

Code for the ACL 2020 paper "On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond"

Language: Python | Stargazers: 5 | Issues: 1 | Issues: 0

visualwebarena

VisualWebArena is a benchmark for multimodal agents.

Language: HTML | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0