tinsir888 / dm2023-exercises

Exercises, Data Mining at Aarhus University

The Data Mining course at Aarhus University is primarily based on the following books:

Zaki, M.J. and Meira Jr, W., 2020. Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press.

Leskovec, J., Rajaraman, A. and Ullman, J.D., 2020. Mining of massive data sets. Cambridge University Press.

Hamilton, W.L., 2020. Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning.

Note that an online version of the book can be downloaded from the official webpage, which is linked above. Furthermore, under the Resources tab on that website, there are links to lecture videos, which you may find useful.

Disclaimer: the book's lecture videos are not part of the course material and are not guaranteed to cover the same aspects of the course material as the actual lectures. So use them with caution.

Additional material for the course can be found on Blackboard under the section "Material".

Structure of the repository

Every week, there will be a Jupyter Notebook with exercises. The notebooks can be found in the exercises directory.

The utilities directory includes the data that we will be working with along with convenience methods for the data.

Practical considerations

Tools

Note: This course uses Python, so if you are not familiar with Python, you might want to familiarize yourself with Python and Jupyter Notebooks. One library that we are going to use a lot is numpy, which allows us to work with vectors and matrices. Additional libraries will be introduced during the course.
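To give a feel for what working with vectors and matrices in numpy looks like, here is a minimal sketch. The values and variable names are made up for illustration and are not taken from the course notebooks.

```python
import numpy as np

# A vector (1-D array) and a matrix (2-D array); illustrative values only.
v = np.array([1.0, 2.0, 3.0])
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

# Matrix-vector product: a (2 x 3) matrix times a length-3 vector
# gives a length-2 vector.
Mv = M @ v

# Euclidean (L2) norm of the vector.
norm_v = np.linalg.norm(v)

# Column means of the matrix (axis=0 averages over the rows).
col_means = M.mean(axis=0)

print("M @ v       =", Mv)
print("||v||       =", norm_v)
print("column means =", col_means)
```

Operations like these run on whole arrays at once, which is usually both shorter and faster than writing explicit Python loops.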

Setup

If you don't have Python installed already, we recommend installing Miniconda. Miniconda allows you to have a separate environment (think of it as a separate Python installation) for each of your projects, so that you can keep their dependencies apart.

Install Miniconda and then open a conda terminal. There, you can create an environment in which we will install the necessary packages for this course.

Navigate to the project directory:

> cd /path/to/dm2023-exercises

Create and activate environment:

> conda env create -f requirements.yml
> conda activate dm23

Now you should have a conda environment with the necessary dependencies. From now on, when you want to run a Python script or start a notebook for this course, make sure to activate the environment (as in the last line of code above). You know that your environment is active if the prompt in your terminal is prefixed with (dm23).
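If you prefer to check this from Python rather than by looking at the prompt, the following is a minimal sketch. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated. The helper names and the package list are hypothetical and only for illustration; the actual dependencies are the ones listed in requirements.yml.

```python
import importlib.util
import os

EXPECTED_ENV = "dm23"  # environment name used in the commands above

# Hypothetical list for illustration; the real dependencies are
# defined in requirements.yml.
PACKAGES = ["numpy", "jupyterlab"]


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None."""
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV")


def missing_packages(names):
    """Return the subset of top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


env = active_conda_env()
if env == EXPECTED_ENV:
    print(f"Environment '{EXPECTED_ENV}' is active.")
else:
    print(f"Active environment is {env!r}; run 'conda activate {EXPECTED_ENV}'.")

missing = missing_packages(PACKAGES)
if missing:
    print("Not importable from this interpreter:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Running this with the environment's own interpreter (for example from a notebook cell) also tells you whether the notebook is using the interpreter you expect.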

By the way, a pretty fine cheatsheet can be found here.

Starting Jupyter Lab:
To edit Jupyter Notebooks, we need to start Jupyter Lab. Navigate to the root of this repo and run the following command from the command line:

(dm23) > jupyter lab

The command should open a new window in your browser, where you can start running Python scripts.

Happy hacking.

For Vim enthusiasts

If you want Vim bindings in Jupyter Lab, you can go to the extensions panel (on the left in Jupyter Lab) and search for the extension jupyterlab_vim.
