securitypilot / hack_lu_2017

Python and Machine Learning Workshop at Hack.lu 2017

Python and Machine Learning: How to clusterize a malware dataset?

This repository summarizes the workshop on machine learning and malware given at Hack.lu 2017.

Set up your environment

  • Install Python and libraries

sudo apt install python3.5 python3.5-dev python3.5-pip python3.5-virtualenv libpython3.5-dev libfuzzy-dev libffi-dev redis-server

  • Create a virtual env

virtualenv --python=/usr/bin/python3.5 <destination_path>

  • Activate your virtualenv

source <destination_path>/bin/activate
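Before installing anything, it can help to confirm that the activated interpreter is the one from the virtualenv. The following is a minimal sketch, not part of the workshop material; the helper names `in_virtualenv` and `python_version_ok` are hypothetical, and the 3.5 minimum simply mirrors the Python version used in the commands above.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def python_version_ok(minimum=(3, 5)):
    """Return True when the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("Python >= 3.5:", python_version_ok())
```

If the first line prints False, the `source` command above was probably run in a different shell session.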

  • Install dependencies

pip install -r requirements.txt

git clone https://github.com/sebdraven/pe-parse.git

cd pe-parse/python

python3.5 setup.py install

git clone https://github.com/sebdraven/petojson.git

cd petojson

python3.5 setup.py install
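Once the installs finish, a quick import check can show whether anything is missing from the virtualenv. This is a sketch under stated assumptions: the module names in `EXPECTED` are guesses (the import names provided by pe-parse and petojson are not stated here), so replace them with the names that your `requirements.txt` and the two packages actually provide.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Hypothetical import names, listed only for illustration.
EXPECTED = ["pepy", "petojson"]

if __name__ == "__main__":
    absent = missing_modules(EXPECTED)
    if absent:
        print("Not importable:", ", ".join(absent))
    else:
        print("All expected modules were found.")
```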

Install the dataset

  • Clone theZoo malware dataset

git clone https://github.com/ytisf/theZoo.git

  • Unzip all the .zip archives

cd theZoo

find . -name '*.zip' -exec sh -c 'unzip -P infected -o -d "${0%.*}" "$0"' '{}' ';'
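The same extraction can be sketched in Python, which may be convenient where `find` or `unzip` is not available. This is an illustrative equivalent, not the workshop's own tooling; the function name `extract_archives` is hypothetical. It assumes the archives use the legacy ZIP encryption that the standard `zipfile` module can decrypt (AES-encrypted archives would need the `unzip` command above or another tool).

```python
import zipfile
from pathlib import Path


def extract_archives(root, password=b"infected"):
    """Extract every .zip under `root` into a sibling directory.

    Each archive `x.zip` is unpacked into `x/`, mirroring the
    "${0%.*}" target used in the shell command.
    """
    extracted = []
    # sorted() materializes the list before any new directories are created.
    for archive in sorted(Path(root).rglob("*.zip")):
        target = archive.with_suffix("")  # strip the trailing ".zip"
        with zipfile.ZipFile(str(archive)) as zf:
            # pwd is ignored for members that are not encrypted.
            zf.extractall(path=str(target), pwd=password)
        extracted.append(target)
    return extracted


if __name__ == "__main__":
    for directory in extract_archives("."):
        print("extracted to", directory)
```

The extracted files are live malware samples, so run this only inside an isolated analysis environment.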

Run your environment

git clone https://github.com/sebdraven/hack_lu_2017.git

cd hack_lu_2017

ipython notebook
