tbergman's starred repositories

detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Language: Python · License: Apache-2.0 · Stargazers: 29101 · Issues: 384 · Issues: 3464

moviepy

Video editing with Python

Language: Python · License: MIT · Stargazers: 11974 · Issues: 253 · Issues: 1467

GroundingDINO

Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python · License: Apache-2.0 · Stargazers: 5306 · Issues: 36 · Issues: 274

layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

Language: Python · License: Apache-2.0 · Stargazers: 4573 · Issues: 71 · Issues: 145

notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.

Language: Jupyter Notebook · Stargazers: 4410 · Issues: 68 · Issues: 120

pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework

Language: Jupyter Notebook · License: MIT · Stargazers: 3140 · Issues: 63 · Issues: 316

DIG

A library for graph deep learning research

Language: Python · License: GPL-3.0 · Stargazers: 1804 · Issues: 30 · Issues: 204

awesome-openai-vision-api-experiments

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

OneFormer

OneFormer: One Transformer to Rule Universal Image Segmentation, arXiv 2022 / CVPR 2023

Language: Jupyter Notebook · License: MIT · Stargazers: 1365 · Issues: 20 · Issues: 105

multimodal-maestro

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

Language: Python · License: MIT · Stargazers: 974 · Issues: 14 · Issues: 7

GPT-4V-Act

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

lcnn

LCNN: End-to-End Wireframe Parsing

Language: Python · License: MIT · Stargazers: 481 · Issues: 17 · Issues: 72

examples

Example code and applications for machine learning on Graphcore IPUs

Language: Python · License: MIT · Stargazers: 313 · Issues: 44 · Issues: 3

grounded-segment-anything-colab

Grounding DINO with Segment Anything & Stable Diffusion colab

Language: Jupyter Notebook · License: Unlicense · Stargazers: 189 · Issues: 7 · Issues: 6

Anything2Image

Generate image from anything with ImageBind and Stable Diffusion

Language: Jupyter Notebook · Stargazers: 184 · Issues: 7 · Issues: 14

nx-guides

Examples and IPython Notebooks about NetworkX

Language: Python · License: CC0-1.0 · Stargazers: 180 · Issues: 24 · Issues: 35

ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Language: Python · License: Apache-2.0 · Stargazers: 179 · Issues: 5 · Issues: 16

neurvps

Neural Vanishing Point Scanning via Conic Convolution

Language: Python · License: MIT · Stargazers: 174 · Issues: 8 · Issues: 24

nerd

NeRD: Neural 3D Reflection Symmetry Detector

Language: Python · License: MIT · Stargazers: 99 · Issues: 7 · Issues: 10

holicity

HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures

Language: Python · License: NOASSERTION · Stargazers: 87 · Issues: 10 · Issues: 15

SoM

Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️

Language: Jupyter Notebook · Stargazers: 74 · Issues: 3 · Issues: 0

SoM-LLaVA

Empowering Multimodal LLMs with Set-of-Mark Prompting and Improved Visual Reasoning Ability.

Language: Python · Stargazers: 69 · Issues: 0 · Issues: 0

shapeunity

Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Language: Python · License: MIT · Stargazers: 67 · Issues: 4 · Issues: 11

vecui

Tiny, ergonomic and fun vector library for UI engineers.

GPT-4V-AD

Code for "Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection"

pointingqa

Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"

cookbooks

Templates for computer vision projects, referenced in Roboflow blog posts.

audio-retrieval-plugin

FiftyOne Plugin for searching images by audio clip using ImageBind and Qdrant

Language: TypeScript · Stargazers: 8 · Issues: 2 · Issues: 0

video-to-frames

Split videos into frames

Language: Python · License: MIT · Stargazers: 3 · Issues: 0 · Issues: 0