Hon-Wong

Company: Bytedance Inc.

Hon-Wong's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 64866 | Issues: 542 | Issues: 0

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49186 | Issues: 561 | Issues: 202

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33888 | Issues: 316 | Issues: 423

spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language: Python | License: MIT | Stargazers: 29328 | Issues: 558 | Issues: 5612

awesome-nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model

Language: Python | License: NOASSERTION | Stargazers: 15635 | Issues: 134 | Issues: 615

wechat-chatgpt

Use ChatGPT On Wechat via wechaty

Language: TypeScript | Stargazers: 13192 | Issues: 95 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

ChatRWKV

ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

Language: Python | License: Apache-2.0 | Stargazers: 9343 | Issues: 90 | Issues: 116

textract

extract text from any document. no muss. no fuss.

Language: HTML | License: MIT | Stargazers: 3833 | Issues: 82 | Issues: 241

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

Language: Python | License: NOASSERTION | Stargazers: 709 | Issues: 8 | Issues: 63

text-dedup

All-in-one text de-duplication

Language: Python | License: Apache-2.0 | Stargazers: 552 | Issues: 4 | Issues: 57

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 542 | Issues: 14 | Issues: 15

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 271 | Issues: 5 | Issues: 29

GroundingGPT

[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model

Language: Python | License: Apache-2.0 | Stargazers: 265 | Issues: 14 | Issues: 10

Multilingual-PR

Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three different self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022) pretrained on a corpus of English speech that we will use in various ways to perform phoneme recognition for different languages with a network trained with Connectionist Temporal Classification (CTC) algorithm.

FreestyleNet

[CVPR 2023 Highlight] Freestyle Layout-to-Image Synthesis

Language: Python | License: MIT | Stargazers: 138 | Issues: 5 | Issues: 15

orange3-text

🍊 📄 Text Mining add-on for Orange3

Language: Python | License: NOASSERTION | Stargazers: 125 | Issues: 20 | Issues: 356

Text2NeRF

Official implementation of 'Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields'

Language: Python | License: MIT | Stargazers: 111 | Issues: 15 | Issues: 14

Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

PTSEFormer

[ECCV2022] PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection

Language: Python | License: MIT | Stargazers: 28 | Issues: 2 | Issues: 14