Xiao's starred repositories

tensorflow

An Open Source Machine Learning Framework for Everyone

Language: C++ | License: Apache-2.0 | Stargazers: 184189 | Issues: 7618 | Issues: 39512

HanLP

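Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing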

Language: Python | License: Apache-2.0 | Stargazers: 33106 | Issues: 1141 | Issues: 1405

jieba

"Jieba" Chinese word segmentation

Language: Python | License: MIT | Stargazers: 32843 | Issues: 1283 | Issues: 847

fastText

Library for fast text representation and classification.

LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

newspaper

newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:

Language: Python | License: MIT | Stargazers: 13916 | Issues: 385 | Issues: 673

tflearn

Deep learning library featuring a higher-level API for TensorFlow.

Language: Python | License: NOASSERTION | Stargazers: 9613 | Issues: 457 | Issues: 918

ansj_seg

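ansj word segmentation. A true Java implementation of ict; both segmentation quality and speed exceed those of the open-source version of ict. Chinese word segmentation, person name recognition, part-of-speech tagging, user-defined dictionaries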

Language: Java | License: Apache-2.0 | Stargazers: 6461 | Issues: 657 | Issues: 709

pytext

A natural language modeling framework based on PyTorch

Language: Python | License: NOASSERTION | Stargazers: 6343 | Issues: 169 | Issues: 135

goproxy

An HTTP proxy library for Go

Language: Go | License: BSD-3-Clause | Stargazers: 5916 | Issues: 159 | Issues: 347

mace

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

Language: C++ | License: Apache-2.0 | Stargazers: 4897 | Issues: 230 | Issues: 677

wechat-spider

WeChat official account crawler

zh-NER-TF

A very simple BiLSTM-CRF model for Chinese Named Entity Recognition (TensorFlow)

congress-legislators

Members of the United States Congress, 1789-Present, in YAML/JSON/CSV, as well as committees, presidents, and vice presidents.

Language: Python | License: CC0-1.0 | Stargazers: 2012 | Issues: 161 | Issues: 300

QuestionAnsweringSystem

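QuestionAnsweringSystem is a human-machine question answering system implemented in Java that automatically analyzes questions and returns candidate answers.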

Language: Java | License: Apache-2.0 | Stargazers: 1954 | Issues: 216 | Issues: 30

LightLDA

Scalable, fast, and lightweight system for large-scale topic modeling

Language: C++ | License: MIT | Stargazers: 840 | Issues: 94 | Issues: 71

NRE

Neural Relation Extraction, including CNN, PCNN, CNN+ATT, PCNN+ATT

Language: C++ | License: MIT | Stargazers: 809 | Issues: 79 | Issues: 34

dbpedia-spotlight

DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text.

Language: Scala | Stargazers: 752 | Issues: 94 | Issues: 0

fact-extractor

Fact Extraction from Wikipedia Text

sft_datasets

A curated collection of open-source SFT datasets, updated continually

govtrack.us-web

The Django source code for the GovTrack.us website.

clickmodels

ClickModels is a small set of Python scripts for the user click models initially developed at Yandex. A Click Model is a probabilistic graphical model used to predict search engine click data from past observations. This project is aimed to deal with click models used in Information Retrieval (see next README.md) and intended to be easy-to-read and easy-to-modify. If it's not, please let me know how to improve it :)

Language: Python | License: BSD-3-Clause | Stargazers: 234 | Issues: 21 | Issues: 9

chinastock

chinastock ** stock quotes and data

smpcup2016

5th Place Solution for smp cup competition

shakey

Automated stock analysis tool

Language: Java | License: Apache-2.0 | Stargazers: 72 | Issues: 18 | Issues: 46

cancer-deep-learning-model

Keras Deep Learning neural network model for University of Wisconsin Cancer data that uses the Integrated Variants library to explain predictions made by a trained model

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 18 | Issues: 0

Language: Python | License: GPL-2.0 | Stargazers: 36 | Issues: 9 | Issues: 0

WHOIS-history

historical whois information

Language: Ruby | Stargazers: 8 | Issues: 0 | Issues: 0