nestalk / R-notes

Notes on using R for statistics

R-notes

Linear regression

Good for predicting a continuous outcome (a numeric value). Simple, and works on both small and large datasets. Assumes a linear relationship between the predictors and the outcome.
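
A minimal sketch of a linear regression in base R. The choice of the built-in `mtcars` dataset and of the `mpg ~ wt` formula is an assumption made for illustration; neither comes from these notes.

```r
# Fit a simple linear regression on the built-in mtcars dataset:
# predict fuel economy (mpg) from car weight (wt, in units of 1000 lbs).
model <- lm(mpg ~ wt, data = mtcars)

# Inspect the coefficients, R-squared and p-values.
print(summary(model))

# Predict mpg for two hypothetical cars weighing 2500 and 3500 lbs.
new_cars <- data.frame(wt = c(2.5, 3.5))
predicted_mpg <- predict(model, newdata = new_cars)
print(predicted_mpg)
```

Plotting the residuals with `plot(model)` is one way to check whether the linearity assumption looks reasonable.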

Logistic regression

Good for predicting a binary outcome (a categorical variable with two values). Produces a probability for each outcome. Assumes a linear relationship between the predictors and the log-odds of the outcome.
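
A minimal sketch of a logistic regression using `glm()` from base R. The dataset (`mtcars`), the predictor (`wt`) and the 0.5 cut-off are assumptions chosen for illustration.

```r
# In mtcars, am is the binary outcome: 0 = automatic, 1 = manual.
# family = binomial tells glm() to fit a logistic regression.
model <- glm(am ~ wt, data = mtcars, family = binomial)
print(summary(model))

# type = "response" returns probabilities rather than log-odds.
probs <- predict(model, type = "response")

# Turn probabilities into class labels with an (assumed) 0.5 threshold.
predicted_class <- ifelse(probs > 0.5, 1, 0)

# Confusion matrix of actual versus predicted classes.
print(table(actual = mtcars$am, predicted = predicted_class))
```

The threshold is a design choice: lowering it catches more positive cases at the cost of more false positives.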

Regression Trees (CART)

Good for predicting a categorical or continuous outcome. Can handle data without a linear relationship and is easy to explain. May not work well on small datasets.
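
A minimal sketch of both kinds of tree, assuming the `rpart` package (one of R's recommended packages) is available. The datasets and formulas are illustrative assumptions.

```r
library(rpart)

# Classification tree: predict a categorical outcome (iris species).
class_tree <- rpart(Species ~ ., data = iris, method = "class")
print(class_tree)  # the splits can be read off as simple if/else rules

# type = "class" returns predicted labels instead of class probabilities.
species_pred <- predict(class_tree, newdata = iris, type = "class")
print(table(actual = iris$Species, predicted = species_pred))

# Regression tree: predict a continuous outcome (mpg) with method = "anova".
reg_tree <- rpart(mpg ~ wt + hp, data = mtcars, method = "anova")
mpg_pred <- predict(reg_tree, newdata = mtcars)
print(head(mpg_pred))
```

With only 32 rows in `mtcars`, the regression tree has very few splits to work with, which illustrates why small datasets can be a problem.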

Random Forests

Similar to CART, but can improve accuracy by combining many trees. Needs more setup and is not as easy to interpret.
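
A minimal sketch, assuming the `randomForest` package has been installed from CRAN (it does not ship with R). The dataset and the `ntree` and `nodesize` values are illustrative assumptions, not recommendations from these notes.

```r
library(randomForest)

# Trees are built from random samples, so fix the seed for reproducibility.
set.seed(42)

# ntree = number of trees; nodesize = minimum size of terminal nodes.
forest <- randomForest(Species ~ ., data = iris, ntree = 500, nodesize = 5)

# Printing shows the out-of-bag error estimate and a confusion matrix.
print(forest)

# Variable importance helps recover some of the interpretability lost versus CART.
print(importance(forest))

species_pred <- predict(forest, newdata = iris)
```

The extra setup mentioned above is mostly the choice of these tuning parameters.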

Hierarchical Clustering

Good for finding similar groups of data. No need to know the number of clusters in advance, and the result is easy to visualise as a dendrogram. Difficult to use on large datasets.
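
A minimal sketch using base R's `dist()`, `hclust()` and `cutree()`. The dataset, the Euclidean distance, the Ward linkage and `k = 3` are all assumptions made for illustration.

```r
# Use the four numeric iris measurements, scaled so no variable dominates.
features <- scale(iris[, 1:4])

# dist() computes all pairwise distances; its size grows with the square of
# the number of rows, which is why large datasets are hard to handle.
distances <- dist(features, method = "euclidean")

# Build the cluster hierarchy with Ward's minimum-variance linkage.
hc <- hclust(distances, method = "ward.D2")

# Draw the dendrogram; the number of clusters can be chosen by eye from it.
plot(hc, labels = FALSE)

# Cut the tree into 3 clusters after inspecting the dendrogram.
clusters <- cutree(hc, k = 3)
print(table(cluster = clusters, species = iris$Species))
```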

K-Means clustering

Similar to hierarchical clustering, but you need to know the number of clusters beforehand.
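
A minimal sketch using base R's `kmeans()`. The dataset, `centers = 3` and `nstart = 20` are assumptions chosen for illustration.

```r
# The starting centroids are random, so fix the seed for reproducibility.
set.seed(42)

features <- scale(iris[, 1:4])

# centers = number of clusters, which must be chosen up front;
# nstart = number of random restarts, keeping the best result.
km <- kmeans(features, centers = 3, nstart = 20)

print(km$size)  # number of points in each cluster
print(table(cluster = km$cluster, species = iris$Species))
```

If the number of clusters is not known, one option is to run hierarchical clustering first and read a plausible value off the dendrogram.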

License: MIT License