In this presentation I will explain the basic concepts behind data analysis and machine learning.
To understand these concepts, we first need to review a few math concepts.
Probability is a numerical description of how likely an event is to occur or how likely it is that a proposition is true.
Probability is a number between 0 and 1, where, roughly speaking, 0 indicates impossibility and 1 indicates certainty.
Analysis of the events and the ways that they occur
Examples:
- Rolling dice
- A game of blackjack
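As a small illustration of analysing the ways an event can occur, the sketch below enumerates every outcome of rolling two fair six-sided dice and counts those that sum to 7. The choice of two dice and the target sum of 7 are assumptions made for the example, not part of the original slides.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes where the two faces add up to 7 (illustrative target).
favourable = [roll for roll in outcomes if sum(roll) == 7]

# Probability = favourable outcomes / all outcomes.
p_seven = Fraction(len(favourable), len(outcomes))
print(p_seven)         # 1/6
print(float(p_seven))  # a number between 0 and 1
```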
Measuring an event in order to determine its probability
Examples:
- Uptime of a service
- Failed jobs
"AI is a computer system able to perform tasks that ordinarily require human intelligence.
Many are powered by machine learning, some of them are powered by deep learning and some of them are powered by very boring things like rules."
Jeremy Achin
Machine learning is a field of artificial intelligence, and is often described as narrow AI. Its focus is to use statistical techniques to help a system get progressively better at a task, without the need to explicitly program it to do so.
There are two main areas inside the study of Machine Learning:
- Supervised learning (using labeled datasets)
- Unsupervised learning (using unlabeled datasets)
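To make the difference between labeled and unlabeled datasets concrete, here is a minimal sketch in plain Python. The data, the labels "small" and "large", and both helper functions are invented for illustration; real projects would normally rely on a library rather than these toy routines.

```python
# Labeled data (supervised): every example comes with the answer to learn from.
labeled = [(1.0, "small"), (1.2, "small"), (8.9, "large"), (9.4, "large")]

def predict_label(x):
    """Toy 1-nearest-neighbour: copy the label of the closest labeled example."""
    _, nearest_label = min(labeled, key=lambda pair: abs(pair[0] - x))
    return nearest_label

# Unlabeled data (unsupervised): only the values, no answers.
unlabeled = [8.9, 1.0, 9.4, 1.2]

def split_into_two_groups(values):
    """Toy grouping: sort the values and cut at the largest gap."""
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

print(predict_label(2.0))
print(split_into_two_groups(unlabeled))
```

In the first case the algorithm is told what the groups are called; in the second it only discovers that two groups exist.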
Machine learning:
We can say that a machine learning program is an algorithm that is trained to perform one task extremely well.
We can divide machine learning algorithms into two types, based on how they are used:
- Regression: the core of prediction. It looks for relationships in your data.
- Classification: uses prediction to sort data, or to apply a label to a given input.
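The difference between the two shows up in what the algorithm returns: regression produces a number on a continuous scale, classification produces one label from a fixed set. The sketch below uses hand-written rules with hypothetical coefficients and a hypothetical threshold in place of a trained model, purely to show the shape of each output.

```python
def predict_price(size_m2):
    """Regression-style output: a continuous number.

    The coefficients are hypothetical, not learned from real data.
    """
    return 1500.0 * size_m2 + 20000.0

def classify_size(size_m2):
    """Classification-style output: one label from a fixed set.

    The 100 m2 threshold is a hypothetical choice for illustration.
    """
    return "large" if size_m2 >= 100 else "small"

print(predict_price(80))   # a number
print(classify_size(80))   # a label
```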
The term linearity in algebra refers to a linear relationship between two or more variables. If we draw this relationship in a two-dimensional space, we get a straight line.
A linear relationship means that when an independent variable increases (or decreases), the dependent variable changes at a constant rate: it moves in the same direction when the slope is positive, and in the opposite direction when the slope is negative.
[ picture of the function ]
- Y is the dependent variable (the variable we are trying to predict or estimate);
- X is the independent variable (the variable we are using to make predictions);
- m is the slope of the regression line. It represents the effect X has on Y, and it describes both the direction and the steepness of the line. Example:
Y = mX + b
Y = 2X + 4 gives the points (1, 6), (2, 8), (3, 10)
- b is a constant, the intercept on the Y axis (the value of Y when X is 0).
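The example line above can be checked with a few lines of Python. This is only a sketch; the function name `line` is chosen for the example.

```python
def line(x, m=2, b=4):
    """Evaluate Y = m*X + b; the defaults match the example Y = 2X + 4."""
    return m * x + b

# The three example points.
points = [(x, line(x)) for x in (1, 2, 3)]
print(points)  # [(1, 6), (2, 8), (3, 10)]

# b is the value of Y when X is 0.
print(line(0))  # 4

# m is how much Y changes when X grows by 1.
print(line(3) - line(2))  # 2
```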
We won’t always know the exact relationship between X and Y; for that we use Linear Regression (LR).
In an LR model, we estimate the line from data, meaning that the slope and the Y-intercept are derived from it.
# install anaconda and jupyter-notebook
$ mkdir tech_talk && cd tech_talk
$ jupyter notebook
# this opens Jupyter in your browser, where
# we can start exploring
- blog.usejournal.com/how-does-switching-from-web-development-to-machine-learning-feel-like-9ac7a370e751
- coursera.org/learn/machine-learning
- Book: Practical Statistics for Data Scientists