nahidalam's repositories

MobiLlama

MobiLlama: Small Language Model tailored for edge devices

Apache-2.0

latent-scope

A scientific instrument for investigating latent spaces

MIT

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

NOASSERTION

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

BSD-3-Clause

torchdistill

A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 22 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs and configurations are available to ensure reproducibility and support benchmarking.

MIT

LURE

[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

vstar

PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"

MIT

awesome-ml

Curated list of useful LLM / Analytics / Datascience resources

MIT

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Apache-2.0

CogVLM

A state-of-the-art-level open visual language model | Multimodal pre-trained model

NOASSERTION

generative-ai-for-beginners

12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

MIT

semantic_video_search

GPL-3.0

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

BSD-3-Clause

gpt4-vision-plugin

Chat with your images using GPT-4 Vision!

YOLOv8-multi-task

AGPL-3.0

Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

NOASSERTION

InstructDiffusion

PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

NOASSERTION

Awesome-Optical-Flow

A list of awesome papers on optical flow and related work.

llm-finetune

LLM Finetune

Language: Python | Apache-2.0

WoodScape

The repository containing tools and information about the WoodScape dataset.

Language: Python

meru

Code for the paper "Hyperbolic Image-Text Representations", Desai et al., ICML 2023

NOASSERTION

DeepCamera

Open-Source AI Camera. Empower any camera/CCTV with state-of-the-art AI, including facial recognition, person recognition (RE-ID), car detection, fall detection, and more.

MIT

heim

Holistic Evaluation of Text-to-Image Models (HEIM), a fork of HELM to evaluate text-to-image models (paper coming soon).

Apache-2.0

GIST-image-text-fine-grained

Generating Image-Specific Text for Fine-grained Object Classification

MIT

lightly

A python library for self-supervised learning on images.

Language: Python | MIT

awesome-self-supervised-multimodal-learning

A curated list of self-supervised multimodal learning resources.
