Mohamed KARAA (mohamedkaraa)

Mohamed KARAA's starred repositories

QFormer

The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"

Language: Python | License: MIT | Stargazers: 167 | Issues: 0

TSP6K

The official PyTorch code for "Traffic Scene Parsing through the TSP6K Dataset".

Language: Python | License: Apache-2.0 | Stargazers: 21 | Issues: 0

Awesome-Prompting-on-Vision-Language-Model

This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.

Stargazers: 359 | Issues: 0

TrafficVLM

[CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of the AI City Challenge 2024 Track 2.

Language: Python | License: CC0-1.0 | Stargazers: 29 | Issues: 0

pytorch-spynet

A reimplementation of "Optical Flow Estimation using a Spatial Pyramid Network" in PyTorch

Language: Python | License: GPL-3.0 | Stargazers: 312 | Issues: 0

Image2Paragraph

[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.

Language: Python | License: Apache-2.0 | Stargazers: 789 | Issues: 0

moondream

tiny vision language model

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4978 | Issues: 0

ChatCaptioner

Official Repository of ChatCaptioner

Language: Jupyter Notebook | License: MIT | Stargazers: 450 | Issues: 0

VLog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.

Language: Python | License: MIT | Stargazers: 530 | Issues: 0

Surveillance-Video-Understanding

Official project page of the paper "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges" (Accepted by CVPR 2024)

Stargazers: 24 | Issues: 0

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 1175 | Issues: 0

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2739 | Issues: 0

Language: Python | Stargazers: 56 | Issues: 0

constriction

Entropy coders for research and production in Python and Rust.

Language: Rust | License: Apache-2.0 | Stargazers: 80 | Issues: 0

CompressAI

A PyTorch library and evaluation platform for end-to-end compression research

Language: Python | License: BSD-3-Clause-Clear | Stargazers: 1164 | Issues: 0

External-Attention-pytorch

🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization methods, and convolutions, intended to help readers understand the corresponding papers. ⭐⭐⭐

Language: Python | License: MIT | Stargazers: 11350 | Issues: 0

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0

developer-roadmap

Interactive roadmaps, guides and other educational content to help developers grow in their careers.

Language: TypeScript | License: NOASSERTION | Stargazers: 293974 | Issues: 0

gradient-checkpointing

Make huge neural nets fit in memory

Language: Python | License: MIT | Stargazers: 2707 | Issues: 0

applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

License: MIT | Stargazers: 27220 | Issues: 0

streamlit-folium

Streamlit Component for rendering Folium maps

Language: Python | License: MIT | Stargazers: 468 | Issues: 0

Stargazers: 784 | Issues: 0

Streamlit_DataScience_Apps

Streamlit Data Science and ML Apps in Python

Language: HTML | Stargazers: 593 | Issues: 0