There are 261 repositories under the data-mining topic.
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
:memo: An awesome Data Science repository to learn from and apply to real-world problems.
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
The "Python Machine Learning (1st edition)" book code repository and info resource
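An artificial intelligence learning roadmap that collects nearly 200 hands-on cases and projects, with free accompanying teaching materials, for beginners starting from zero and for job-oriented practice! Covers popular areas including: Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv, and more.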
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. 🔗https://aka.ms/RD-Agent-Tech-Report
Anomaly detection related books, papers, videos, and toolboxes
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
:octocat: Machine Learning for Cyber Security
🏅 Collection of Kaggle Solutions and Ideas 🏅
Declarative web scraping
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Curated list of Python resources for data science.
extract text from any document. no muss. no fuss.
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
List of tools & datasets for anomaly detection on time-series data.
An open source alternative to Tableau. Embeddable visual analytic
Computer vision assisted tool to extract numerical data from plot images.
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
novel deep learning research works with PaddlePaddle
A curated list of data mining papers about fraud detection.
Security scenarios, AI-based security algorithms, and industry practice in security data analysis
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Some awesome AI-related books and PDFs for learning and downloading, plus some playground models for learning