Kevin-tati / Spark

Spark

Download the Titanic disaster train.csv file from Kaggle: https://www.kaggle.com/c/titanic/data

Databricks

Create an account on Databricks: https://community.cloud.databricks.com/login.html
Create a table from the previously downloaded train.csv dataset
Create a cluster
Upload the notebook file to your workspace and attach it to the cluster you just created
Run the notebook

Machine learning part

Models used for the prediction:

LogisticRegression
DecisionTreeClassifier
RandomForestClassifier
Gradient-boosted tree classifier
NaiveBayes
Support Vector Machine
