WANGJUNJIE's starred repositories
paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
weiboSpider
Sina Weibo crawler that scrapes Sina Weibo data with Python
FileCentipede
Cross-platform internet upload/download manager for HTTP(S), FTP(S), SSH, magnet-link, BitTorrent, m3u8, ed2k, and online videos. WebDAV client, FTP client, SSH client.
awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
video-subtitle-extractor
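Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.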
Fengshenbang-LM
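Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.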
Pix2Text
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
Awesome-Multimodal-Research
A curated list of multimodal-related research.
FightingCV-Paper-Reading
⭐⭐⭐FightingCV Paper Reading, which helps you understand the most advanced research work in an easier way 🍀 🍀 🍀
BERT-whitening
Simple vector whitening to improve sentence embedding quality
RapidVideOCR
Extract video hard subtitles and automatically generate corresponding srt files.
MTO-Platform
Multitask Optimization Platform (MToP): A MATLAB Optimization Platform for Evolutionary Multitasking
stable-diffusion-webui
Stable Diffusion web UI
VQA_to_multimodal_survey
Update 2020