Tun-Hsiang Chou's repositories
ARLDM
Official PyTorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
av_hubert
A self-supervised learning framework for audio-visual speech
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
fastcomposer
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
FiD
Fusion-in-Decoder
image-classfication-segmentation
This repository implements image classification and segmentation.
llama-recipes
Examples and recipes for the Llama 2 model
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
MARLspectrumSharingV2X
Spectrum sharing in vehicular networks based on multi-agent reinforcement learning (IEEE Journal on Selected Areas in Communications)
multidoc2dial-mtl
MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
picpick
Azure Search Python sample code
s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
SeedSelect
Code for our papers: "Generating images of rare concepts using pre-trained diffusion models" (AAAI 24) and "Norm-guided latent space exploration for text-to-image generation" (NeurIPS 23)
SimCSE
EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
Yolov5_StrongSORT_OSNet
Real-time multi-camera multi-object tracker using YOLOv5 and StrongSORT with OSNet