JuqiangJ's repositories
funNLP
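Chinese and English sensitive words, language detection, domestic and international mobile/phone number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional-simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, company name collection, classical poetry lexicon, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, historical figures lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese question-answering dataset, collection of sentence similarity matching algorithms, BERT resources, text generation and summarization tools, cocoNLP information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University artificial intelligence technology series…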
ltp
Language Technology Platform
juqiangj.github.io
:sparkles: Build a beautiful and simple website in literally minutes. Demo at http://deanattali.com/beautiful-jekyll
lexical_analysis
A comparison of methods for the analysis of lexical proficiency in learner texts.
wordVectors
An R package for creating and exploring word2vec and other word embedding models
praat-scripts
Collection of custom Praat scripts
Multidimensional-Analysis-Tagger-of-Mandarin-Chinese
An open-source library in Python for analysing Chinese registers
imagetext
A simple implementation of a website that presents images and text side by side
Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
OpenHowNet
Core Data of HowNet and OpenHowNet Python API
zotero-scihub
A plugin that will automatically download PDFs of zotero items from sci-hub
DomainWordsDict
DomainWordsDict, a Chinese domain dictionary knowledge base covering 68 domains and 9.16 million words in total, which can be used for natural language processing applications such as text classification, knowledge enhancement, and domain vocabulary expansion.
praat
Praat scripts
Praat_Scripts
Some basic Praat scripts.
TidyTuesday
📊 My contributions to the #TidyTuesday challenge
lael
Supplemental information for the Automatically Assessing Lexical Features in Learner Corpora Workshop (LAEL 2020)
cheatsheets
RStudio Cheat Sheets
cv
My CV built using RMarkdown and the pagedown package.
showtext
Using Fonts More Easily in R Graphs
timevis
Create interactive timeline visualizations in R
TALPCo
TUFS Asian Language Parallel Corpus
bible-corpus
A multilingual parallel corpus created from translations of the Bible.
Halfrost-Field
✍️ This is the place for writing blog posts: Halfrost-Field 冰霜之地
chinese-poetry
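The most comprehensive database of Chinese poetry 🧶 It covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci by 1,564 ci poets of the Song period.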
markdown-cv
A simple template to write your CV in a readable Markdown file and use CSS to publish or print it.
bib2df
Parse a BibTeX file to a tibble
DH_eda
Exploratory data analysis