NielsRogge

Company: HuggingFace

Location: Belgium

Home Page: nielsrogge.github.io

Twitter: @NielsRogge

NielsRogge's repositories

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook | License: MIT | Stargazers: 9119 | Issues: 136 | Issues: 443

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 42 | Issues: 4 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 2 | Issues: 0

huggingface.js

Utilities to use the Hugging Face Hub API

Language: TypeScript | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

MeshAnythingV2

From anything to mesh like human artists. Official impl. of "MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization"

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

Show-o

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

GST

Official implementation of "GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers"

License: BSD-3-Clause | Stargazers: 1 | Issues: 0 | Issues: 0

ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

Language: Python | License: AGPL-3.0 | Stargazers: 1 | Issues: 0 | Issues: 0

1d-tokenizer

This repo contains the code for our paper "An Image is Worth 32 Tokens for Reconstruction and Generation".

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

AiM

Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba"

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Apollo

Music repair method to convert lossy MP3 compressed music to lossless music.

Stargazers: 0 | Issues: 0 | Issues: 0

co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

CoMAE

[AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets

Stargazers: 0 | Issues: 0 | Issues: 0

CounTR

CounTR: Transformer-based Generalised Visual Counting

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

doubletake

[ECCV 2024] DoubleTake: Geometry Guided Depth Estimation

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

EMA-VFI

[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

lerobot

🤗 LeRobot: End-to-end Learning for Real-World Robotics in PyTorch

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

PGTFormer

[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

StreamingT2V

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

VFIMamba

VFIMamba: Video Frame Interpolation with State Space Models

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vggsfm

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0