Karthik Kannan (Shoshin23)

Company: @Envision-AI

Location: Delft, NL

Home Page: www.karthikkannan.in

Karthik Kannan's starred repositories

cookbook

Examples and guides for using the Gemini API.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4559 | Issues: 0

skyvern

Automate browser-based workflows with LLMs and Computer Vision

Language: Python | License: AGPL-3.0 | Stargazers: 5638 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4571 | Issues: 0

aifs

Local semantic search. Stupidly simple.

Language: Python | License: Apache-2.0 | Stargazers: 357 | Issues: 0

MultiModalMamba

A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model. Powered by Zeta, the simplest AI framework ever.

Language: Python | License: MIT | Stargazers: 426 | Issues: 0

TinyGPT-V

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Language: Python | License: BSD-3-Clause | Stargazers: 1229 | Issues: 0

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6433 | Issues: 0

Everything-LLMs-And-Robotics

The world's largest GitHub Repository for LLMs + Robotics

License: BSD-3-Clause | Stargazers: 743 | Issues: 0

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 18492 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3614 | Issues: 0

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 90904 | Issues: 0

MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"

Language: Python | License: MIT | Stargazers: 4337 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C++ | License: MIT | Stargazers: 33866 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 66222 | Issues: 0

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | License: Apache-2.0 | Stargazers: 31165 | Issues: 0

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python | License: Apache-2.0 | Stargazers: 2384 | Issues: 0

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 4568 | Issues: 0

NYU-DLSP21

NYU Deep Learning Spring 2021

Language: Jupyter Notebook | Stargazers: 1542 | Issues: 0

mmdetection

OpenMMLab Detection Toolbox and Benchmark

Language: Python | License: Apache-2.0 | Stargazers: 28875 | Issues: 0

Language: Kotlin | License: Apache-2.0 | Stargazers: 50 | Issues: 0

PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.

Language: Python | License: MIT | Stargazers: 3468 | Issues: 0

Connectivity

🌐 Makes Internet connectivity detection more robust by detecting Wi-Fi networks without Internet access.

Language: Swift | License: MIT | Stargazers: 1640 | Issues: 0

CV-pretrained-model

A collection of computer vision pre-trained models.

License: MIT | Stargazers: 1290 | Issues: 0

A-Hackers-AI-Voice-Assistant

A hacker's AI voice assistant, built using Python and PyTorch.

Language: Python | License: MIT | Stargazers: 994 | Issues: 0

Dewarping-Document-Image-By-Displacement-Flow-Estimation

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

Language: Python | License: MIT | Stargazers: 158 | Issues: 0

layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

Language: Python | License: Apache-2.0 | Stargazers: 4728 | Issues: 0

sense

Enhance your application with the ability to see and interact with humans using any RGB camera.

Language: Python | License: MIT | Stargazers: 732 | Issues: 0

budgetml

Deploy an ML inference service on a budget in less than 10 lines of code.

Language: Python | License: Apache-2.0 | Stargazers: 1339 | Issues: 0

CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | Stargazers: 24252 | Issues: 0

yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Language: Python | License: AGPL-3.0 | Stargazers: 49230 | Issues: 0