Shoshin23

Karthik Kannan's starred repositories

cookbook

Examples and guides for using the Gemini API.

Language: Jupyter Notebook | Apache-2.0 | 455900

skyvern

Automate browser-based workflows with LLMs and Computer Vision

Language: Python | AGPL-3.0 | 563800

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | NOASSERTION | 457100

aifs

Local semantic search. Stupidly simple.

Language: Python | Apache-2.0 | 35700

MultiModalMamba

A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model. Powered by Zeta, the simplest AI framework ever.

Language: Python | MIT | 42600

TinyGPT-V

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Language: Python | BSD-3-Clause | 122900

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | MIT | 643300

Everything-LLMs-And-Robotics

The world's largest GitHub Repository for LLMs + Robotics

BSD-3-Clause | 74300

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | MIT | 1849200

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | MIT | 361400

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | MIT | 9090400

MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"

Language: Python | MIT | 433700

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C++ | MIT | 3386600

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | MIT | 6622200

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights: ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | Apache-2.0 | 3116500

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python | Apache-2.0 | 238400

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | BSD-3-Clause | 456800

NYU-DLSP21

NYU Deep Learning Spring 2021

Language: Jupyter Notebook | 154200

mmdetection

OpenMMLab Detection Toolbox and Benchmark

Language: Python | Apache-2.0 | 2887500

camerax-tflite

Language: Kotlin | Apache-2.0 | 5000

PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.

Language: Python | MIT | 346800