DUT-LiuYang

Location: Beijing

DUT-LiuYang's starred repositories

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python | License: MIT | Stargazers: 167164 | Issues: 1553 | Issues: 2689

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 92972 | Issues: 681 | Issues: 7660

Prompt-Engineering-Guide

🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19621 | Issues: 302 | Issues: 1359

ChineseBQB

🇨🇳 Chinese sticker pack, more joy / A museum of sticker packs, the most "toxic" (addictive) repository on GitHub, a big collection of sticker packs, bringing joy together~

ML-Papers-of-the-Week

🔥 Highlighting the top ML papers every week.

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 9572 | Issues: 74 | Issues: 1124

easy-rl

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 9185 | Issues: 79 | Issues: 143

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | Stargazers: 8226 | Issues: 72 | Issues: 407

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python | License: Apache-2.0 | Stargazers: 7764 | Issues: 97 | Issues: 1582

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Language: Python | License: MIT | Stargazers: 4758 | Issues: 91 | Issues: 12

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | Stargazers: 3831 | Issues: 23 | Issues: 520

MNBVC

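MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki pages, classical poems, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.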

Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets

pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2227 | Issues: 32 | Issues: 105

WebGLM

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

Language: Python | License: Apache-2.0 | Stargazers: 1557 | Issues: 25 | Issues: 70

GPT2-NewsTitle

Chinese news title generation project using GPT2, with extremely detailed code comments.

Language: Python | License: Apache-2.0 | Stargazers: 1094 | Issues: 10 | Issues: 42

DeepLearningBookQA_cn

Deep learning interview questions, with answers mapped to the corresponding page numbers in the Chinese edition of Deep Learning

wordninja

Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.

Language: Python | License: MIT | Stargazers: 802 | Issues: 10 | Issues: 21

MacBERT

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

BERT-whitening-pytorch

PyTorch version of BERT-whitening

Language: Python | License: MIT | Stargazers: 309 | Issues: 1 | Issues: 14

leetcode-java

🎓🎓🎓 LeetCode solutions in Java - 536/921 solved. https://leetcode.com/problemset/all/

Language: Java | Stargazers: 150 | Issues: 9 | Issues: 0

QuRating

[ICML 2024] Selecting High-Quality Data for Training Language Models

GEC-Info

Repository to collect and categorize Grammatical Error Correction papers.

BANG

BANG is a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generation can be uniformly regarded as to what extent previous tokens can be attended, and BANG bridges AR and NAR generation by designing a novel model structure for large-scale pretraining. The pretrained BANG model can simultaneously support AR, NAR and semi-NAR generation to meet different requirements.

Language: Python | License: MIT | Stargazers: 28 | Issues: 5 | Issues: 4

Meta-Curriculum

Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation (AAAI 2021)

Language: Python | License: NOASSERTION | Stargazers: 25 | Issues: 7 | Issues: 4
Language: Macaulay2 | Stargazers: 16 | Issues: 2 | Issues: 0

eracond

The first high-quality, fine-grained error-correction conversation dataset between English-as-a-second-language learners and an educational chatbot.

License: MIT | Stargazers: 12 | Issues: 2 | Issues: 0
Language: Python | License: MIT | Stargazers: 12 | Issues: 1 | Issues: 1

Prune-Tune

Official code repository for the AAAI 2021 paper "Finding Sparse Structures for Domain Specific Neural Machine Translation"