Zhenwei's repositories
LLaVA-UHD-Better
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
CosAttention2d
A 2D cosine attention module inspired by "cosFormer: Rethinking Softmax in Attention" (https://arxiv.org/abs/2202.08791)
SamaritanHDU
A roll-call system built on face recognition and the WeChat app platform.
Awesome-Multimodal-Large-Language-Models
Latest Papers and Datasets on Multimodal Large Language Models
ChuanhuChatGPT
GUI for ChatGPT API
cosformer-pytorch
Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".
fancy-and-tricky
remarkable snippets!
hexo-deploy-github-pages-action
🚀 GitHub action for deploying a Hexo project to GitHub pages.
image-processing-from-scratch
This project contains some interesting image processing algorithms written from scratch in Python and C++.
imp
Powerful multimodal small language models
LLaVA
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards multimodal GPT-4 level capabilities.
Phi3V-Finetuning
Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft.
PPOxFamily
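PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, straighten out the code logic, and master practical decision AI applications)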
shell_display.py
Display an image in the shell using 20 lines of Python code.
Sketch2Attributes
Predict the attributes of a sketch of a human
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.