Quantco / vectorization-tutorial

This repo is used to illustrate the vectorization principle in a tutorial. It was created for a CEOI workshop in August 2023 and might not be kept up-to-date.


Introduction to the vectorization principle

This repo illustrates the vectorization principle in a tutorial. Data science applications process large amounts of data, often in dynamically typed, interpreted languages such as Python, R or MATLAB, where a job can be done in just a few lines thanks to a rich library ecosystem. The heavy lifting happens inside those libraries, which are written in C, C++ or Fortran. The user-facing API of such a library must therefore pass around large amounts of data at once rather than single values, so that the interpreter overhead is paid once per call instead of once per element. This is the basis of the vectorization principle.

This tutorial was created for a CEOI workshop in August 2023 and might not be kept up-to-date.

Disclaimer: The term vectorization also refers to the SIMD-based instruction-level parallelism provided by CPUs. Here, vectorization means a library design pattern for structural data transformation code: applying operations to vectors instead of scalars.
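
As a rough illustration of the principle described above, the following sketch contrasts a scalar-style loop with a vectorized call. It assumes NumPy is installed; the function names and the temperature-conversion task are hypothetical and are not taken from the tutorial notebooks.

```python
import numpy as np


def celsius_to_fahrenheit_scalar(values):
    """Scalar style: the interpreter processes one value per loop iteration."""
    result = []
    for c in values:
        result.append(c * 9.0 / 5.0 + 32.0)
    return result


def celsius_to_fahrenheit_vectorized(values):
    """Vectorized style: each arithmetic operation acts on the whole array,
    so the per-element loop runs inside NumPy's compiled code."""
    return values * 9.0 / 5.0 + 32.0


# Hypothetical sample data, chosen only for illustration.
celsius = np.array([-40.0, 0.0, 37.0, 100.0])

scalar_result = celsius_to_fahrenheit_scalar(celsius.tolist())
vectorized_result = celsius_to_fahrenheit_vectorized(celsius)

# Both styles compute the same values; only the calling pattern differs.
assert np.allclose(scalar_result, vectorized_result)
```

Both functions return the same numbers. The difference is that the vectorized version crosses the boundary between interpreter and library a fixed number of times, regardless of how many elements the array holds.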

Setting up the environment for running the Python files and Jupyter notebooks in this repository

Follow the instructions at https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html to download the micromamba executable. Put it in ~/bin/micromamba, or adjust the commands below accordingly. Then run the following commands to create a new environment and install the required packages:

MICROMAMBA=~/bin/micromamba
eval "$("$MICROMAMBA" shell hook -s bash)"
micromamba create -y -n vectorization -f conda-lock.yml
micromamba activate vectorization


License: BSD 3-Clause "New" or "Revised" License

