shaoliu08's repositories
ansj_seg
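ansj word segmentation. A true Java implementation of ICT. Its segmentation quality and speed both exceed the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries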
best-practices
Tidbits of developer best practices from around the web
brown-cluster
C++ implementation of the Brown word clustering algorithm.
corpus
Classical Chinese corpus
CRNN-Keras
CRNN (CNN+RNN) for OCR using Keras / License Plate Recognition
docker-spark-stand-alone
Spark 2.4.7 standalone Docker image
elasticsearch
ElasticSearch Dockerfile for trusted automated Docker builds.
elasticsearch-analysis-ik
The IK Analysis plugin integrates the Lucene IK analyzer into Elasticsearch and supports customized dictionaries.
elasticsearch-http-basic
HTTP Basic Authentication for Elasticsearch
FraudDetection
Data Anomaly and Fraud Detection with Python and R
jieba-analysis
Jieba word segmentation (Java version)
jpmml-sklearn
Java library and command-line application for converting Scikit-Learn pipelines to PMML
RolX
An alternative implementation of Recursive Feature and Role Extraction (KDD11 & KDD12)
spark
Apache Spark
spark-gotchas
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
tabula-java
Extract tables from PDF files
traprange
A Method to Extract Table Content in PDF Files (Java)