zye1996

Company: GMU

Location: Fairfax, VA

Home Page: zye1996.github.io

zye1996's starred repositories

DocumentLayoutAnalysis

Document layout analysis resources and repositories for development with PdfPig.

Language: C# | Stargazers: 568 | Issues: 0

CnSTD

CnSTD: A Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection (STD), Mathematical Formula Detection (MFD), and Layout Analysis

Language: Python | License: Apache-2.0 | Stargazers: 667 | Issues: 0

PaddleOCR2Pytorch

PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)

Language: Python | License: Apache-2.0 | Stargazers: 843 | Issues: 0

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | License: Apache-2.0 | Stargazers: 1325 | Issues: 0

whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Language: Python | License: AGPL-3.0 | Stargazers: 1823 | Issues: 0

ai

Build AI-powered applications with React, Svelte, Vue, and Solid

Language: TypeScript | License: NOASSERTION | Stargazers: 9232 | Issues: 0

BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Language: Python | License: MIT | Stargazers: 1511 | Issues: 0

AutoPrompt

A framework for prompt tuning using Intent-based Prompt Calibration

Language: Python | License: Apache-2.0 | Stargazers: 2011 | Issues: 0

Chinese-medical-dialogue-data

Chinese medical dialogue dataset

Language: Python | License: MIT | Stargazers: 1124 | Issues: 0

vearch

Distributed vector search for AI-native applications

Language: Go | License: Apache-2.0 | Stargazers: 2015 | Issues: 0

LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Language: Jupyter Notebook | License: MIT | Stargazers: 319 | Issues: 0

magika

Detect file content types with deep learning

Language: Rust | License: Apache-2.0 | Stargazers: 7679 | Issues: 0

jieba

Jieba Chinese word segmentation

Language: Python | License: MIT | Stargazers: 33019 | Issues: 0

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8478 | Issues: 0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 11375 | Issues: 0

Language: Python | License: MIT | Stargazers: 58 | Issues: 0

gutenberg-dialog

Build a dialog dataset from online books in many languages

Language: Python | License: MIT | Stargazers: 71 | Issues: 0

ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Language: TypeScript | License: NOASSERTION | Stargazers: 2631 | Issues: 0

MuTual

A Dataset for Multi-Turn Dialogue Reasoning

Language: Python | Stargazers: 280 | Issues: 0

conversational-datasets

Large datasets for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 1275 | Issues: 0

tvsub

TVsub: DCU-Tencent Chinese-English Dialogue Corpus

Stargazers: 45 | Issues: 0

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 29733 | Issues: 0

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | Stargazers: 4311 | Issues: 0

clean-dialog

A framework for cleaning Chinese dialog data

Language: Python | Stargazers: 256 | Issues: 0

CDial-GPT

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

Language: Python | License: MIT | Stargazers: 1750 | Issues: 0

chinese-chatbot-corpus

Public Chinese chat corpus

Language: Python | License: Apache-2.0 | Stargazers: 3961 | Issues: 0

MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Language: Python | License: Apache-2.0 | Stargazers: 4677 | Issues: 0

ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Language: Python | License: MIT | Stargazers: 2814 | Issues: 0

Luotuo-Chinese-LLM

Luotuo (骆驼): Open-source Chinese language models. Developed by 陈启源 @ Central China Normal University & 李鲁鲁 @ SenseTime & 冷子昂 @ SenseTime

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3628 | Issues: 0

Luotuo-Text-Embedding

Luotuo Embedding (骆驼嵌入) is a text embedding model developed by 李鲁鲁, 冷子昂, 陈启源, 蒟蒻, and others.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 257 | Issues: 0