aihorse's repositories
Macaw-LLM-
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
sample-notebooks
AI reads the news
annotated_deep_learning_paper_implementations
🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
AutoDL
Automated deep learning without any human intervention. 1st-place solution for the AutoDL challenge @ NeurIPS.
AutoLabelMe
Uses a deep learning model to automatically label images.
Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 Latest papers, code, and datasets on Vid-LLMs.
ByteTrack
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
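ByteTrack's central idea is to associate *every* detection box with existing tracks, including low-confidence detections that other trackers discard, typically by IoU overlap in two stages. The sketch below illustrates that two-stage idea in plain Python with greedy matching; the function names and the greedy strategy are illustrative assumptions, not the official implementation (which uses Hungarian assignment and Kalman-predicted boxes):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2) in pixel coordinates.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associate(tracks, detections, score_thresh=0.5, iou_thresh=0.3):
    """Two-stage greedy association (illustrative): match high-score
    detections to tracks first, then let low-score detections claim
    the still-unmatched tracks instead of being thrown away."""
    high = [d for d in detections if d["score"] >= score_thresh]
    low = [d for d in detections if d["score"] < score_thresh]
    matches, unmatched = [], list(range(len(tracks)))
    for group in (high, low):
        for det in group:
            best, best_iou = None, iou_thresh
            for ti in unmatched:
                v = iou(tracks[ti]["box"], det["box"])
                if v > best_iou:
                    best, best_iou = ti, v
            if best is not None:
                matches.append((best, det))
                unmatched.remove(best)
    return matches, unmatched
```

Keeping the low-score detections in the second stage is what lets ByteTrack recover objects during occlusion or motion blur, when detector confidence drops.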
chatgpt-api
A wrapper around the latest web-version ChatGPT interface from OpenAI; no API key required, completely free.
deep-person-reid
Torchreid: Deep learning person re-identification in PyTorch.
deep_sort
Simple Online Realtime Tracking with a Deep Association Metric
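Deep SORT's "deep association metric" means each detection carries an appearance embedding from a re-identification network, and detections are matched to tracks by cosine distance between embeddings on top of SORT's motion model. A minimal NumPy sketch of that association step, assuming pre-computed embeddings (this is not the repo's API, and the greedy matcher stands in for the Hungarian assignment the paper uses):

```python
import numpy as np

def cosine_distance(track_feats, det_feats):
    """Pairwise cosine distance between two sets of embeddings.
    Rows of the result index tracks; columns index detections."""
    a = track_feats / np.linalg.norm(track_feats, axis=1, keepdims=True)
    b = det_feats / np.linalg.norm(det_feats, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def greedy_match(cost, max_dist=0.4):
    """Greedy stand-in for Hungarian assignment: take track/detection
    pairs in order of increasing cost, gating out distant pairs."""
    matches, used_t, used_d = [], set(), set()
    flat_order = np.unravel_index(np.argsort(cost, axis=None), cost.shape)
    for t, d in zip(*flat_order):
        if cost[t, d] > max_dist:
            break  # costs are sorted, so all remaining pairs fail the gate
        if t in used_t or d in used_d:
            continue
        matches.append((int(t), int(d)))
        used_t.add(t)
        used_d.add(d)
    return matches
```

In the full tracker this appearance cost is combined with a Mahalanobis gate from the Kalman filter state before assignment.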
Devon
Devon: An open-source pair programmer
HorseRacePrediction
Uses machine learning models to predict horse race outcomes, and runs backtests to see whether betting on the predictions can be profitable.
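The backtesting step described above can be sketched as a flat-stake simulation: bet only when the model's probability implies positive expected value against the offered decimal odds, then tally return on investment. The field names (`prob`, `odds`, `won`) and the edge threshold are illustrative assumptions, not the repo's actual schema:

```python
def backtest_flat_stake(races, stake=1.0, edge=0.05):
    """Flat-stake backtest over historical races.

    Bets one unit whenever the model's win probability implies positive
    expected value against the decimal odds: p * odds - 1 > edge.
    Each race dict uses hypothetical fields: 'prob' (model win
    probability), 'odds' (decimal odds), 'won' (actual outcome).
    """
    staked = returned = 0.0
    for race in races:
        expected_value = race["prob"] * race["odds"] - 1.0
        if expected_value <= edge:
            continue  # no perceived edge: skip the bet
        staked += stake
        if race["won"]:
            returned += stake * race["odds"]
    roi = (returned - staked) / staked if staked else 0.0
    return {"staked": staked, "returned": returned, "roi": roi}
```

Backtesting on the same data used to fit the model overstates profits, so in practice the simulation should run on a held-out time period with the odds available at bet time.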
IdealGPT-
Official Code of IdealGPT
MiniGPT4-video
Official code for MiniGPT4-video
mmdetection
OpenMMLab Detection Toolbox and Benchmark
mmyolo
OpenMMLab YOLO series toolbox and benchmark
PINTO_model_zoo-
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
Python-100-Days
Python - From Novice to Master in 100 Days
self_llm-automl
Quickly deploy open-source large language models on AutoDL; a deployment tutorial better suited for ** babies (beginners).
sleap
A Bonsai interface for real-time multi-animal pose tracking using SLEAP
sleap-
A deep learning framework for multi-animal pose tracking.
Video-ChatGPT
"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Video-LLaMA-
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Video_Point_Tracking_SIFT
Tracking keypoints in video using SIFT and OpenCV. Unfortunately, it only supports single-target tracking.
viper-Python-
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
VisionLLM-
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
visprog-
Official code for VisProg (CVPR 2023 Best Paper!)
yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information