Likun Cai's starred repositories

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python · License: Apache-2.0 · Stars: 225

ALIP

[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption

Language: Python · Stars: 87

clip-rocket

Code release for "Improved baselines for vision-language pre-training"

Language: Python · License: NOASSERTION · Stars: 53

textaugment

TextAugment: Text Augmentation Library

Language: Python · License: MIT · Stars: 387

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python · License: MIT · Stars: 8792
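
A minimal usage sketch of the kind of tokenizer minbpe provides. It assumes the repo's BasicTokenizer class with train/encode/decode methods, as shown in its README; verify the exact names against the repository before relying on them.

    # Hedged sketch: assumes minbpe's BasicTokenizer with train/encode/decode.
    from minbpe import BasicTokenizer

    text = open("corpus.txt", encoding="utf-8").read()   # any raw training text (placeholder path)
    tokenizer = BasicTokenizer()
    tokenizer.train(text, vocab_size=512)   # 256 raw byte tokens + 256 learned BPE merges

    ids = tokenizer.encode("hello world")   # text -> list of token ids
    assert tokenizer.decode(ids) == "hello world"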

llama3-from-scratch

llama3 implementation, one matrix multiplication at a time

Language: Jupyter Notebook · License: MIT · Stars: 11384

OpenAnnotate3D

OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal Data

Language: Jupyter Notebook · Stars: 71

perceiver-pytorch

Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch

Language: Python · License: MIT · Stars: 1064
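
A hedged usage sketch for this implementation; the constructor arguments and the channels-last input layout follow the repo's README example and should be checked there, since defaults may have changed.

    # Sketch assuming the perceiver-pytorch README interface (channels-last image input).
    import torch
    from perceiver_pytorch import Perceiver

    model = Perceiver(
        input_channels=3,     # RGB image
        input_axis=2,         # two spatial axes
        num_freq_bands=6,     # Fourier positional-encoding bands
        max_freq=10.0,
        depth=6,              # number of cross-attend + latent-transformer blocks
        num_latents=256,
        latent_dim=512,
        num_classes=1000,
    )

    img = torch.randn(1, 224, 224, 3)   # note: channels last
    logits = model(img)                 # -> (1, 1000)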

perceiver-io

A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training

Language: Python · License: Apache-2.0 · Stars: 419

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model; a minimal prompted-inference sketch follows below.

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 45734
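
A compact version of that inference flow: load a checkpoint, embed the image once, then prompt with points. The model type, checkpoint path, image file, and click coordinates below are placeholders.

    # SAM point-prompted inference; checkpoint path, image, and click are placeholders.
    import cv2
    import numpy as np
    from segment_anything import sam_model_registry, SamPredictor

    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
    predictor = SamPredictor(sam)

    image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)                 # compute the image embedding once

    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),   # one foreground click (x, y)
        point_labels=np.array([1]),            # 1 = foreground, 0 = background
        multimask_output=True,                 # return three candidate masks with scores
    )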

DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Language: Python · Stars: 621

grok-1

Grok open release

Language: Python · License: Apache-2.0 · Stars: 49195

siamese-triplet

Siamese and triplet networks with online pair/triplet mining in PyTorch

Language: Python · License: BSD-3-Clause · Stars: 3079

onlinetripletmining

Fast Online Triplet mining in Pytorch

Language: Python · Stars: 8
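
The two repositories above revolve around online (in-batch) triplet mining. The snippet below is a generic batch-hard sketch of that idea in plain PyTorch, written here for illustration rather than taken from either repo.

    # Generic batch-hard online triplet mining (illustration, not code from either repo).
    import torch
    import torch.nn.functional as F

    def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
        dist = torch.cdist(embeddings, embeddings, p=2)     # (B, B) pairwise distances
        same = labels.unsqueeze(0) == labels.unsqueeze(1)   # (B, B) same-label mask
        eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)

        # hardest positive: farthest same-label embedding (excluding self)
        hardest_pos = (dist * (same & ~eye).float()).max(dim=1).values
        # hardest negative: closest different-label embedding
        hardest_neg = dist.masked_fill(same, float("inf")).min(dim=1).values

        return F.relu(hardest_pos - hardest_neg + margin).mean()

    emb = F.normalize(torch.randn(32, 128), dim=1)   # stand-in batch of embeddings
    labels = torch.randint(0, 8, (32,))
    loss = batch_hard_triplet_loss(emb, labels)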

slot-attention

Implementation of Slot Attention from GoogleAI

Language: Python · License: MIT · Stars: 364
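
A hedged usage sketch following this repo's README: a SlotAttention module groups N input feature vectors into K slot vectors through a few iterations of attention. The argument names are taken from the README and should be verified there.

    # Sketch assuming the slot_attention package's SlotAttention(num_slots, dim, iters) interface.
    import torch
    from slot_attention import SlotAttention

    slot_attn = SlotAttention(num_slots=5, dim=512, iters=3)

    inputs = torch.randn(2, 1024, 512)   # (batch, num_inputs, feature_dim), e.g. flattened CNN features
    slots = slot_attn(inputs)            # -> (2, 5, 512): five slot vectors per sample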

detr

End-to-End Object Detection with Transformers

Language: Python · License: Apache-2.0 · Stars: 13167
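
DETR publishes Torch Hub entry points, so a minimal inference sketch needs no local clone; the model name and the 0.9 confidence threshold below are illustrative choices.

    # Minimal DETR inference via Torch Hub ('detr_resnet50' is one of the published entry points).
    import torch

    model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True).eval()

    img = torch.randn(1, 3, 800, 1066)   # stand-in for a normalized image batch
    with torch.no_grad():
        out = model(img)                 # dict with 'pred_logits' and 'pred_boxes'

    probs = out["pred_logits"].softmax(-1)[0, :, :-1]   # drop the "no object" class
    keep = probs.max(-1).values > 0.9                   # keep confident queries only
    boxes = out["pred_boxes"][0, keep]                  # (cx, cy, w, h), normalized to [0, 1]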

webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Language: Python · License: BSD-3-Clause · Stars: 2124
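
A minimal sketch of the tar-shard streaming pattern webdataset is built around; the shard URL pattern and the per-sample keys ("jpg", "cls") are placeholders for whatever a given dataset actually stores.

    # Sketch: stream (image, label) pairs from tar shards; shard pattern and keys are placeholders.
    import webdataset as wds
    import torchvision.transforms as T
    from torch.utils.data import DataLoader

    url = "data/shard-{000000..000099}.tar"   # brace-expanded list of tar shards
    preprocess = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])

    dataset = (
        wds.WebDataset(url)
        .shuffle(1000)                        # shuffle within a rolling in-memory buffer
        .decode("pil")                        # decode image bytes to PIL
        .to_tuple("jpg", "cls")               # pick the .jpg and .cls members of each sample
        .map_tuple(preprocess, lambda y: y)   # resize/crop images, leave labels untouched
    )

    loader = DataLoader(dataset, batch_size=64, num_workers=4)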

OTTER

This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described in the paper.

Language: Python · License: MIT · Stars: 64

img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine; a hedged usage sketch follows below.

Language: Python · License: MIT · Stars: 3471
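
The library exposes a Python entry point that mirrors its CLI; in the sketch below, the URL list, output folder, and sizes are placeholders.

    # Sketch of img2dataset's download() entry point; paths and sizes are placeholders.
    from img2dataset import download

    download(
        url_list="urls.txt",            # one image URL per line (csv/parquet also supported)
        output_folder="dataset",
        output_format="webdataset",     # write tar shards readable by webdataset
        image_size=256,                 # resize while downloading
        processes_count=8,
        thread_count=64,
    )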

DownloadConceptualCaptions

Reliably download millions of images efficiently

Language: Jupyter Notebook · License: MIT · Stars: 110

conceptual-captions

Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image captioning systems.

Language: Shell · License: NOASSERTION · Stars: 506

conceptual-12m

Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.

License: NOASSERTION · Stars: 345

CVPR-2023-24-Papers

CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!

Language: Python · License: MIT · Stars: 354

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 8469
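
The released backbones load via Torch Hub; a minimal feature-extraction sketch ("dinov2_vits14" is one of the published variants, and input sides must be multiples of its 14-pixel patch size):

    # Minimal DINOv2 feature extraction via Torch Hub.
    import torch

    model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

    img = torch.randn(1, 3, 224, 224)   # stand-in for a normalized image; 224 = 16 patches of 14 px
    with torch.no_grad():
        feats = model(img)              # global image embedding, (1, 384) for ViT-S/14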

dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Language: Python · License: Apache-2.0 · Stars: 6104

Awesome-Information-Bottleneck

A curated list for the Information Bottleneck Principle, in memory of Professor Naftali Tishby.

License: MIT · Stars: 294

Tree-Transformer

Implementation of the paper Tree Transformer

Language: Python · Stars: 208

UniCL

[CVPR 2022] Official code for "Unified Contrastive Learning in Image-Text-Label Space"

Language: Python · License: MIT · Stars: 378