Vivien's starred repositories

awesome-public-datasets

A topic-centric list of high-quality open datasets.

chinese-poetry

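The most comprehensive database of classical Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Song period.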

Language: JavaScript | License: MIT | Stargazers: 47691 | Issues: 1156 | Issues: 203

matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!

Language: PHP | License: GPL-3.0 | Stargazers: 19486 | Issues: 419 | Issues: 13823

Marketing-for-Engineers

A curated collection of marketing articles & tools to grow your product.

svg.js

The lightweight library for manipulating and animating SVG

Language: JavaScript | License: NOASSERTION | Stargazers: 11007 | Issues: 272 | Issues: 1062

chinese-xinhua

📙 Chinese Xinhua Dictionary database, covering xiehouyu (two-part allegorical sayings), chengyu (idioms), words, and Chinese characters.

Language: Python | License: MIT | Stargazers: 10820 | Issues: 311 | Issues: 58

ltp

Language Technology Platform

lac

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance.

Language: C++ | License: Apache-2.0 | Stargazers: 3823 | Issues: 106 | Issues: 247

erxes

Source available experience management infrastructure. Pioneering the future of experiences with XOS (Experience Operating System). Hubspot + Qualtrics alternative

Language: TypeScript | License: NOASSERTION | Stargazers: 3481 | Issues: 103 | Issues: 2268

JioNLP

A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com

Language: Python | License: Apache-2.0 | Stargazers: 3218 | Issues: 35 | Issues: 200

datasets

🎁 5,400,000+ Unsplash images made available for research and machine learning

Language: Jupyter Notebook | Stargazers: 2373 | Issues: 64 | Issues: 37

Introduction-NLP

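Detailed notes on "Introduction to Natural Language Processing" (《自然语言处理入门》), the new book by the author of HanLP. A conscientious piece of work: rather than a dry list of formulas, the book explains algorithms and models in plain, accessible language. Starting from basic concepts, it works through the algorithmic principles and engineering implementations of several popular problems: Chinese word segmentation, part-of-speech tagging, named entity recognition, information extraction, text clustering, text classification, and syntactic parsing.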

Language: Python | License: Apache-2.0 | Stargazers: 2147 | Issues: 38 | Issues: 9

www.mlcompendium.com

The Machine Learning & Deep Learning Compendium began as a list of references in a single private document, which I curated to expand my knowledge. It is now an open knowledge-sharing project compiled using Gitbook.

THULAC-Python

An Efficient Lexical Analyzer for Chinese

Language: Python | License: MIT | Stargazers: 1999 | Issues: 79 | Issues: 113

ChineseNLP

Datasets and SOTA results for every field of Chinese NLP

animockup

Create animated mockups in the browser 🔥

Language: JavaScript | License: MIT | Stargazers: 1631 | Issues: 34 | Issues: 4

SentiBridge

SentiBridge: A Knowledge Base for Entity-Sentiment Representation

Language: Python | License: NOASSERTION | Stargazers: 632 | Issues: 37 | Issues: 7

socialreaper

Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

Language: Python | License: MIT | Stargazers: 540 | Issues: 31 | Issues: 1

Small-Chinese-Corpus

Some useful small Chinese corpus datasets

Chinese-ChatBot

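A Chinese chatbot trained on 100,000 dialogue pairs. It uses an attention mechanism and generates a meaningful reply to most general questions. The trained model has been uploaded and can be run directly.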

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 322 | Issues: 7 | Issues: 22

text_clustering

Text clustering (K-means, DBSCAN, LDA, Single-pass)

Language: Python | License: Apache-2.0 | Stargazers: 322 | Issues: 1 | Issues: 5

awesome-bootstrappers

👩‍🚀👨‍🚀 Must-read articles, videos and books for coders, marketers and bootstrappers.

Language: Shell | License: MIT | Stargazers: 231 | Issues: 20 | Issues: 0

THUCTC

An Efficient Chinese Text Classifier

Language: Java | License: MIT | Stargazers: 202 | Issues: 18 | Issues: 7

awesome-search-engine-optimization

A curated list of backlink, social signal opportunities, link building strategies and tactics, along with educational opportunities to help improve search engine results and ranking.

License: Apache-2.0 | Stargazers: 186 | Issues: 9 | Issues: 0

SinaWeibo-Emotion-Classification

Sina Weibo sentiment analysis application

Language: Python | Stargazers: 139 | Issues: 16 | Issues: 0

cn-text-classifier

Chinese text clustering

Language: Python | License: GPL-3.0 | Stargazers: 119 | Issues: 1 | Issues: 2

hack-the-paygap

PIF, U.S. Census + Council on Women and Girls project focused on hacking the gender pay gap. NOTE: THIS REPOSITORY IS NO LONGER BEING MAINTAINED.

Language: SCSS | License: NOASSERTION | Stargazers: 29 | Issues: 8 | Issues: 19

finvest-spider

Finance and Investment Info Spider Collections

Language: Python | License: MIT | Stargazers: 22 | Issues: 0 | Issues: 1