Gueter Josmy Faure (joslefaure)

Company: National Taiwan University

Location: Taiwan

Home Page: https://joslefaure.github.io/

Gueter Josmy Faure's starred repositories

ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

Language: JavaScript | License: MIT | Stargazers: 23839 | Issues: 189 | Issues: 1562

devika

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.

Language: Python | License: MIT | Stargazers: 18344 | Issues: 203 | Issues: 386

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | Stargazers: 7700 | Issues: 108 | Issues: 156

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 6830 | Issues: 49 | Issues: 211

Awesome-Transformer-Attention

An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python | License: Apache-2.0 | Stargazers: 2881 | Issues: 28 | Issues: 179

Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

DemoFusion

Let us democratise high-resolution generation! (CVPR 2024)

Language: Jupyter Notebook | Stargazers: 1971 | Issues: 33 | Issues: 44

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1934 | Issues: 24 | Issues: 90

LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Language: Python | License: Apache-2.0 | Stargazers: 693 | Issues: 14 | Issues: 104

TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

Language: Python | License: Apache-2.0 | Stargazers: 587 | Issues: 13 | Issues: 109

DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Language: Python | License: NOASSERTION | Stargazers: 582 | Issues: 9 | Issues: 16

StreamMultiDiffusion

Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."

Language: Jupyter Notebook | License: MIT | Stargazers: 526 | Issues: 10 | Issues: 15

meshed-memory-transformer

Meshed-Memory Transformer for Image Captioning. CVPR 2020

Language: Python | License: BSD-3-Clause | Stargazers: 516 | Issues: 13 | Issues: 97

AlphAction

Spatio-Temporal Action Localization System

TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications

Language: Python | License: MIT | Stargazers: 360 | Issues: 9 | Issues: 45

unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Language: Python | License: MIT | Stargazers: 285 | Issues: 13 | Issues: 47

ReST

[ICCV 2023] ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking

Language: Python | License: MIT | Stargazers: 137 | Issues: 5 | Issues: 21

videoCC-data

VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automatic pipeline starting from the Conceptual Captions Image-Captioning Dataset.

HIT

Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”

Language: Python | License: BSD-3-Clause | Stargazers: 28 | Issues: 1 | Issues: 8

Tensorflow-JS-Projects

Web projects using Tensorflow JS, Plotly, D3, Echarts, NumJS, and NumericJS

Language: JavaScript | License: Apache-2.0 | Stargazers: 19 | Issues: 6 | Issues: 0

iCLIP

[ICCVW 2023] Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection