cydiachen's starred repositories
MediaCrawler
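Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments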
wechatDownload
Batch download tool for WeChat Official Account articles; supports downloading images and comments, and saving as HTML/MD/PDF/DOCX files
Qwen-Agent
Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
MetaTransformer
Meta-Transformer for Unified Multimodal Learning
word-to-markdown
A Ruby gem to liberate content from Microsoft Word documents
Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
poe-api-wrapper
👾 A Python API wrapper for Poe.com. With this, you will have free access to ChatGPT, Claude, Llama, Gemini, Google-PaLM and more! 🚀
protest-detection-violence-estimation
Implementation of the model used in the paper Protest Activity Detection and Perceived Violence Estimation from Social Media Images (ACM Multimedia 2017)
InstructDoc
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)
ExplainableVQA
[ACMMM Oral, 2023] "Towards Explainable In-the-wild Video Quality Assessment: A Database and a Language-Prompted Approach"
MLLM-protector
The official repository for the paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"
llm-vision-datasets
Collection of image and video datasets for generative AI and multimodal visual AI
MSWord_ChatGPT
How to use ChatGPT in MS Word