Yiqing Huang (RubickH)

Company: Tsinghua University

Location: Beijing

Yiqing Huang's starred repositories

jieba

"Jieba" Chinese word segmentation

Language: Python | License: MIT | Stargazers: 32963 | Issues: 1282 | Issues: 848

Chinese-Names-Corpus

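Chinese names corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Useful for Chinese word segmentation and person-name entity recognition.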

wikiextractor

A tool for extracting plain text from Wikipedia dumps

Language: Python | License: AGPL-3.0 | Stargazers: 3712 | Issues: 74 | Issues: 242

line_profiler

Line-by-line profiling for Python

Language: Python | License: NOASSERTION | Stargazers: 2617 | Issues: 15 | Issues: 96

biobert

Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining

Language: Python | License: NOASSERTION | Stargazers: 1905 | Issues: 62 | Issues: 175

scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

Language: Python | License: Apache-2.0 | Stargazers: 1665 | Issues: 52 | Issues: 317

marian

Fast Neural Machine Translation in C++

Language: C++ | License: NOASSERTION | Stargazers: 1210 | Issues: 67 | Issues: 376

open-images-dataset

Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.

Language: Makefile | License: NOASSERTION | Stargazers: 792 | Issues: 22 | Issues: 35

fast_align

Simple, fast unsupervised word aligner

Language: C++ | License: Apache-2.0 | Stargazers: 729 | Issues: 25 | Issues: 38

bluebert

BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).

Language: Python | License: NOASSERTION | Stargazers: 546 | Issues: 23 | Issues: 36

ChineseBLUE

Chinese Biomedical Language Understanding Evaluation benchmark (ChineseBLUE)

Language: Python | License: Apache-2.0 | Stargazers: 520 | Issues: 12 | Issues: 18

awesome-align

A neural word aligner based on multilingual BERT

Language: Python | License: BSD-3-Clause | Stargazers: 318 | Issues: 11 | Issues: 47

bottom-up-attention.pytorch

A PyTorch reimplementation of bottom-up-attention models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 291 | Issues: 2 | Issues: 94

py-bottom-up-attention

PyTorch bottom-up attention with Detectron2

Language: Python | License: Apache-2.0 | Stargazers: 228 | Issues: 5 | Issues: 29

bert-vocab-builder

Builds a WordPiece (subword) vocabulary compatible with Google Research's BERT

RL4NMT

Reinforcement Learning for Neural Machine Translation

image-paragraph-captioning

[EMNLP 2018] Training for Diversity in Image Paragraph Captioning

RMN

IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning

wikipedia-parallel-titles

Tools for extracting parallel corpora from article titles across languages in Wikipedia

Language: Jupyter Notebook | License: MIT | Stargazers: 61 | Issues: 5 | Issues: 8

ParaMed

Chinese to English medical translation

Language: Python | Stargazers: 46 | Issues: 1 | Issues: 0

fairseq_mix

Code for ICML2020 "Sequence Generation with Mixed Representations"

Language: Python | License: NOASSERTION | Stargazers: 12 | Issues: 1 | Issues: 1

image-paragraph-captioning

PyTorch implementation of 'Text Embedding Bank Module for Detailed Image Paragraph Captioning'

Language: Python | License: CC0-1.0 | Stargazers: 6 | Issues: 1 | Issues: 1

Ngram_LG

My modifications for a research project based on GPT-2.

Language: Python | Stargazers: 2 | Issues: 2 | Issues: 0

zenme-whatsthatcalled

[experiment] Translate brand names between English and Chinese using Wikipedia; written with Svelte

Language: JavaScript | Stargazers: 1 | Issues: 2 | Issues: 0

Language: OpenEdge ABL | Stargazers: 1 | Issues: 3 | Issues: 1