Qilong Zhang (qilong-zhang)

Company: ByteDance

Location: China

Qilong Zhang's starred repositories

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

awesome-public-datasets

A topic-centric list of HQ open datasets.

HanLP

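Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing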

Language: Python | License: Apache-2.0 | Stargazers: 33177 | Issues: 1142 | Issues: 1405

jieba

Jieba Chinese word segmentation

Language: Python | License: MIT | Stargazers: 32880 | Issues: 1282 | Issues: 847

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 12841 | Issues: 99 | Issues: 1032

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | Stargazers: 11326 | Issues: 167 | Issues: 224

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 6223 | Issues: 38 | Issues: 900

DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself

Language: Python | License: MIT | Stargazers: 6193 | Issues: 69 | Issues: 151

ChineseNlpCorpus

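Collects, organizes, and publishes Chinese natural language processing corpora and datasets, working with like-minded contributors to advance Chinese NLP.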

Language: Jupyter Notebook | Stargazers: 5734 | Issues: 118 | Issues: 24

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.

TextRank4ZH

🌳 Automatically extract keywords and summaries from Chinese text

Language: Python | License: MIT | Stargazers: 3248 | Issues: 103 | Issues: 34

sqlcoder

SoTA LLM for converting natural language questions to SQL queries

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3172 | Issues: 32 | Issues: 103

Chinese-ELECTRA

Pre-trained Chinese ELECTRA models

Language: Python | License: Apache-2.0 | Stargazers: 1388 | Issues: 26 | Issues: 85

DeepSeek-LLM

DeepSeek LLM: Let there be answers

Language: Makefile | License: MIT | Stargazers: 1349 | Issues: 23 | Issues: 32

chatgpt-comparison-detection

Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥

ProphetNet

A research project for natural language generation, containing the official implementations by MSRA NLC team.

Language: Python | License: MIT | Stargazers: 660 | Issues: 20 | Issues: 74

MacBERT

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

llm-books

Practical notes on building applications with LLMs

self-refine

LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively.

Language: Python | License: Apache-2.0 | Stargazers: 533 | Issues: 13 | Issues: 20

opencc-python

OpenCC made with Python

Language: Python | License: Apache-2.0 | Stargazers: 529 | Issues: 20 | Issues: 11

Metaphor

Metaphor - Stagefright with ASLR bypass

Language: Python | License: GPL-3.0 | Stargazers: 314 | Issues: 33 | Issues: 16

llms_paper

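This repository mainly records reading notes on top-conference papers relevant to LLM algorithm engineers (multimodality, PEFT, few-shot QA, RAG, LMMs interpretability, Agents, CoT)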

Awesome-Machine-Generated-Text

Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection.

chisp

Scripts and baselines for CSpider: Chinese semantic parsing and text-to-SQL challenge

stella

text embedding

Language: Python | License: Apache-2.0 | Stargazers: 129 | Issues: 2 | Issues: 14

AIGC_text_detector

The official codes of our work on AIGC detection: "Multiscale Positive-Unlabeled Detection of AI-Generated Texts" (ICLR'24 Spotlight)

Language: Python | License: Apache-2.0 | Stargazers: 87 | Issues: 1 | Issues: 7

MixSet

Official code repository for Mixset.

Language: Python | Stargazers: 15 | Issues: 0 | Issues: 0