There are 191 repositories under the data-mining topic.
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
:memo: An awesome Data Science repository to learn and apply for real world problems.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Topic Modelling for Humans
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
The "Python Machine Learning (1st edition)" book code repository and info resource
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Anomaly detection related books, papers, videos, and toolboxes
A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
A unified framework for machine learning with time series
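An AI learning roadmap: nearly 200 hands-on cases and projects with free accompanying course materials, taking beginners from zero to job-ready practice! Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, tensorflow, machine-learning, deep-learning, data-analysis, data-mining, mathematics, data-science, artificial-intelligence, python, tensorflow, tensorflow2, caffe, keras, pytorch, algorithm, numpy, pandas, matplotlib, seaborn, nlp, cv, and more.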
:octocat: Machine Learning for Cyber Security
Declarative web scraping
A library of extension and helper modules for Python's data analysis and machine learning libraries.
🍊 :bar_chart: :bulb: Orange: Interactive data analysis
Curated list of Python resources for data science.
extract text from any document. no muss. no fuss.
Alink is a machine learning algorithm platform based on Flink, developed by the PAI team of the Alibaba computing platform.
A curated list of awesome machine learning interpretability resources.
🏅 Collection of Kaggle Solutions and Ideas 🏅
List of tools & datasets for anomaly detection on time-series data.
10x faster matrix and vector operations
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
HTML5 based online tool to extract numerical data from plot images.
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Novel deep learning research work with PaddlePaddle
An open source alternative to Tableau. Easily embedded as a component in web apps.
Extract structured data from PDF invoices
:memo: A collection of machine learning resources
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Multi-class confusion matrix library in Python
A curated list of data mining papers about fraud detection.
Dex: The Data Explorer -- a data visualization tool written in Java/Groovy/JavaFX, capable of powerful ETL and of publishing web visualizations.
AIL framework - Analysis Information Leak framework. Project moved to https://github.com/ail-project