Py-Contributors / metrics

Machine/Deep Learning metrics implementation in Python


👉 Implementation of ML/DL Metrics in Python 👈


Implementation of various metrics for regression and classification problems. In Data Science and Machine Learning projects, a good understanding of the metrics used to evaluate model performance is essential. This repository implements those metrics from scratch in Python using NumPy, following the formulae given on the Wikipedia pages for the respective metrics, and makes them available as a Python package. The metrics are listed below in the following order; a short NumPy sketch follows each list.
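For orientation, here is a minimal usage sketch. The module path and function names are assumptions for illustration only and may differ from the package's actual public API:

```python
# Hypothetical usage -- module path and function names are assumed,
# not taken from the package's actual API.
from metrics import r2_score, mean_absolute_error

y_true = [3.0, -0.5, 2.0, 7.0]  # ground-truth targets
y_pred = [2.5, 0.0, 2.0, 8.0]   # model predictions

print(r2_score(y_true, y_pred))             # ~0.9486
print(mean_absolute_error(y_true, y_pred))  # 0.5
```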

Regression Metrics

  1. R2 Score

The R2 score, also known as the coefficient of determination (or the coefficient of multiple determination in multiple regression), is a statistical measure of how close the data are to the fitted regression line.

$$R^2 = 1 - \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\sum_{i=1}^n (y_i - \bar{y})^2}$$
  2. Mean Absolute Error
$$MAE = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y}_i|$$
  3. Mean Squared Error
$$MSE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2$$
  4. Root Mean Squared Error
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2}$$
  5. Mean Absolute Percentage Error
$$MAPE = \frac{100}{n} \sum_{i=1}^n \frac{|y_i - \hat{y}_i|}{|y_i|}$$
  6. Mean Squared Logarithmic Error
$$MSLE = \frac{1}{n} \sum_{i=1}^n (\log(y_i + 1) - \log(\hat{y}_i + 1))^2$$
  7. Median Absolute Error
$$MdAE = \text{median}(|y_i - \hat{y}_i|)$$
  8. Median Squared Error
$$MdSE = \text{median}((y_i - \hat{y}_i)^2)$$
  9. Median Absolute Percentage Error
$$MdAPE = \text{median}\left(\frac{|y_i - \hat{y}_i|}{|y_i|}\right)$$
  10. Median Squared Logarithmic Error
$$MdSLE = \text{median}((\log(y_i + 1) - \log(\hat{y}_i + 1))^2)$$
  11. Explained Variance Score
$$EV = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}$$
  12. Max Error
$$\text{MaxError} = \max_i |y_i - \hat{y}_i|$$
  13. Mean Bias Error
$$MBE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)$$
  14. Mean Percentage Error
$$MPE = \frac{100}{n} \sum_{i=1}^n \frac{y_i - \hat{y}_i}{y_i}$$
  15. Mean Squared Percentage Error
$$MSPE = \frac{100}{n} \sum_{i=1}^n \frac{(y_i - \hat{y}_i)^2}{y_i^2}$$
  16. Median Bias Error
$$MdBE = \text{median}(y_i - \hat{y}_i)$$
  17. Median Percentage Error
$$MdPE = \text{median}\left(\frac{y_i - \hat{y}_i}{y_i}\right)$$
  18. Median Squared Percentage Error
$$MdSPE = \text{median}\left(\frac{(y_i - \hat{y}_i)^2}{y_i^2}\right)$$
  19. Mean Absolute Scaled Error
$$MASE = \frac{1}{n} \sum_{i=1}^n \frac{|y_i - \hat{y}_i|}{\frac{1}{n-1} \sum_{j=2}^n |y_j - y_{j-1}|}$$

where the denominator is the mean absolute error of the naive one-step (lag-1) forecast on the same series.

  20. Mean Squared Scaled Error
$$MSSE = \frac{1}{n} \sum_{i=1}^n \frac{(y_i - \hat{y}_i)^2}{\frac{1}{n-1} \sum_{j=2}^n (y_j - y_{j-1})^2}$$
  21. Median Absolute Scaled Error
$$MdASE = \text{median}\left(\frac{|y_i - \hat{y}_i|}{\frac{1}{n-1} \sum_{j=2}^n |y_j - y_{j-1}|}\right)$$
  22. Median Squared Scaled Error
$$MdSSE = \text{median}\left(\frac{(y_i - \hat{y}_i)^2}{\frac{1}{n-1} \sum_{j=2}^n (y_j - y_{j-1})^2}\right)$$
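To make the formulae concrete, here is a minimal NumPy sketch of the first few regression metrics above (R2, MAE, MSE, RMSE), written from scratch in the same spirit as this repository; the function names are illustrative and may not match the package's actual exports.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between targets and predictions."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mean_squared_error(y_true, y_pred):
    """Average squared deviation between targets and predictions."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, in the units of the target."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(r2_score(y_true, y_pred))                 # ~0.9486
print(mean_absolute_error(y_true, y_pred))      # 0.5
print(root_mean_squared_error(y_true, y_pred))  # ~0.6124
```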

Classification Metrics

  1. Accuracy
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
  2. Precision
$$Precision = \frac{TP}{TP + FP}$$
  3. Recall
$$Recall = \frac{TP}{TP + FN}$$
  4. F1 Score
$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$
  5. Matthews Correlation Coefficient
$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
  6. Cohen's Kappa
$$Kappa = \frac{p_o - p_e}{1 - p_e}$$

where

$$p_o = \frac{TP + TN}{TP + TN + FP + FN}$$

$$p_e = \frac{TP + FP}{TP + TN + FP + FN} \times \frac{TP + FN}{TP + TN + FP + FN} + \frac{TN + FP}{TP + TN + FP + FN} \times \frac{TN + FN}{TP + TN + FP + FN}$$
  7. Area Under the Receiver Operating Characteristic Curve (ROC AUC), computed with the trapezoidal rule over points sorted by ascending FPR
$$\text{ROC AUC} = \frac{1}{2} \sum_{i=1}^{n-1} (FPR_{i+1} - FPR_i) \times (TPR_i + TPR_{i+1})$$
  8. Area Under the Precision-Recall Curve (PR AUC)
$$\text{PR AUC} = \frac{1}{2} \sum_{i=1}^{n-1} (Recall_{i+1} - Recall_i) \times (Precision_i + Precision_{i+1})$$
  9. Hamming Loss
$$\text{Hamming Loss} = \frac{1}{n} \sum_{i=1}^n \frac{1}{m} \sum_{j=1}^m I(y_{ij} \neq \hat{y}_{ij})$$
  10. Zero-One Loss
$$\text{Zero-One Loss} = \frac{1}{n} \sum_{i=1}^n I(y_i \neq \hat{y}_i)$$
  11. Jaccard Similarity Score
$$Jaccard = \frac{TP}{TP + FP + FN}$$
  12. Fowlkes-Mallows Score
$$FM = \sqrt{\frac{TP}{TP + FP} \times \frac{TP}{TP + FN}}$$
  13. Log Loss
$$\text{Log Loss} = -\frac{1}{n} \sum_{i=1}^n \sum_{j=1}^m y_{ij} \log(\hat{y}_{ij})$$
  14. Cross-Entropy Loss
$$\text{Cross-Entropy Loss} = -\frac{1}{n} \sum_{i=1}^n \sum_{j=1}^m \left[ y_{ij} \log(\hat{y}_{ij}) + (1 - y_{ij}) \log(1 - \hat{y}_{ij}) \right]$$
  15. Hinge Loss, with labels $y_{ij} \in \{-1, +1\}$
$$\text{Hinge Loss} = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^m \max(0, 1 - y_{ij} \times \hat{y}_{ij})$$
  16. Squared Hinge Loss
$$\text{Squared Hinge Loss} = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^m \left( \max(0, 1 - y_{ij} \times \hat{y}_{ij}) \right)^2$$
  17. Classification Error
$$\text{Classification Error} = \frac{1}{n} \sum_{i=1}^n I(y_i \neq \hat{y}_i)$$
  18. Balanced Classification Error, the mean of the per-class error rates (binary case)
$$\text{Balanced Classification Error} = 1 - \frac{1}{2} \left( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \right)$$
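Below is a similar sketch for the confusion-matrix-based metrics and the trapezoidal ROC AUC above, assuming binary labels in {0, 1} and FPR/TPR points already computed from a threshold sweep; function names are again illustrative.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """TP, TN, FP, FN for binary labels in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def precision(y_true, y_pred):
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if tp + fp else 0.0

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0

def f1_score(y_true, y_pred):
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if p + r else 0.0

def roc_auc(fpr, tpr):
    """Trapezoidal area under the ROC curve; points sorted by ascending FPR."""
    fpr, tpr = np.asarray(fpr, float), np.asarray(tpr, float)
    return float(0.5 * np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1])))

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(accuracy(y_true, y_pred))              # 4/6 ~ 0.667
print(f1_score(y_true, y_pred))              # 0.75
print(roc_auc([0, 0.5, 1], [0, 1, 1]))       # 0.75
```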

Clustering Metrics

License: GNU General Public License v3.0

