yf-ivanguo / OctaPicks

Create Distance Strike Features

yf-ivanguo opened this issue

Features:
{Fighter A, Fighter B} x {distance strikes attempted per minute, distance strikes landed per minute, distance strikes accuracy %} x {R1, R2, R3, R4, R5, overall} x {last 3 fights, last 5 fights, alltime}

{Fighter A, Fighter B} x {distance strikes attempted per minute differential, distance strikes landed per minute differential, distance strikes accuracy % differential} x {R1, R2, R3, R4, R5, overall} x {last 3 fights, last 5 fights, alltime}

{Fighter A, Fighter B} x {distance strikes absorbed per minute, distance strikes received per minute, distance strikes defended %} x {R1, R2, R3, R4, R5, overall} x {last 3 fights, last 5 fights, alltime}

{Fighter A, Fighter B} x {distance strikes absorbed per minute differential, distance strikes received per minute differential, distance strikes defended % differential} x {R1, R2, R3, R4, R5, overall} x {last 3 fights, last 5 fights, alltime}

This should add up to 432 new features: 4 groups x 2 fighters x 3 stats x 6 round scopes x 3 time windows.
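As a sanity check on that count, the four groups can be enumerated programmatically. This is only a sketch: the column-name scheme (`fighter_a_dist_str_att_pm_r1_l3` and so on) is a hypothetical naming convention, not one defined in this issue.

```python
from itertools import product

# All names below are illustrative placeholders, not existing columns.
FIGHTERS = ["fighter_a", "fighter_b"]
OFFENSE = ["dist_str_att_pm", "dist_str_landed_pm", "dist_str_acc_pct"]
DEFENSE = ["dist_str_absorbed_pm", "dist_str_received_pm", "dist_str_def_pct"]
ROUNDS = ["r1", "r2", "r3", "r4", "r5", "overall"]
WINDOWS = ["l3", "l5", "alltime"]


def feature_names():
    """Enumerate every feature name across the four groups in the spec."""
    names = []
    for stats in (OFFENSE, DEFENSE):      # offensive and defensive stats
        for suffix in ("", "_diff"):      # raw value and differential
            for fighter, stat, rnd, window in product(
                FIGHTERS, stats, ROUNDS, WINDOWS
            ):
                names.append(f"{fighter}_{stat}{suffix}_{rnd}_{window}")
    return names


names = feature_names()
print(len(names))
```

Comparing `len(names)` and `len(set(names))` against the columns of the generated CSV is a cheap way to catch a missing or duplicated feature.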

Create a new CSV file and populate it with these features by transforming ufc_men_stats_by_fight.csv.
This will be a time-consuming operation, so use vectorized pandas operations wherever possible to keep iteration and testing fast.
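One possible way to build the "last 3 / last 5 / alltime" windows without a row-wise apply is a per-fighter shift followed by a rolling or expanding sum. The sketch below makes several assumptions: a long-format table with one row per fighter per fight, hypothetical column names (`fighter`, `date`, `dist_str_landed`, `fight_minutes`) that may not match ufc_men_stats_by_fight.csv, and a per-minute rate defined as total strikes divided by total minutes over the window.

```python
import pandas as pd

# Hypothetical long-format input: one row per fighter per fight.
# Column names are illustrative, not taken from ufc_men_stats_by_fight.csv.
fights = pd.DataFrame({
    "fighter": ["A"] * 5 + ["B"] * 2,
    "date": pd.to_datetime([
        "2020-01-01", "2020-06-01", "2021-01-01", "2021-06-01", "2022-01-01",
        "2020-03-01", "2021-03-01",
    ]),
    "dist_str_landed": [10, 20, 30, 40, 50, 5, 15],
    "fight_minutes": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 15.0],
})
fights = fights.sort_values(["fighter", "date"]).reset_index(drop=True)


def prior_sum(df, col, window=None):
    """Per-fighter sum of `col` over the previous `window` fights.

    shift(1) drops the current fight so each row only sees past fights.
    window=None means all prior fights.
    """
    def _one_fighter(s):
        past = s.shift(1)
        if window is None:
            return past.expanding(min_periods=1).sum()
        return past.rolling(window, min_periods=1).sum()

    return df.groupby("fighter")[col].transform(_one_fighter)


for label, window in [("l3", 3), ("l5", 5), ("alltime", None)]:
    landed = prior_sum(fights, "dist_str_landed", window)
    minutes = prior_sum(fights, "fight_minutes", window)
    # Assumed rate definition: total landed / total minutes in the window.
    fights[f"dist_str_landed_pm_{label}"] = landed / minutes

print(fights.filter(like="_pm_"))
```

A fighter's first fight has no history, so its features come out as NaN here; how to fill those is left open. Excluding the current fight with `shift(1)` is a design choice that keeps the fight being predicted out of its own features.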

Vectorization:
https://pythonspeed.com/articles/pandas-vectorization/

Pandas Apply (For use when vectorization is not possible):
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html

Example (__create_home_adv_feats() uses apply, which is slower than __create_surface_env_feats(), which uses vectorization):
https://github.com/ivan-guo/TennisModel/blob/main/processing.py
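For a small illustration of the same contrast on this issue's data, the sketch below computes an accuracy percentage both ways. The `att` and `landed` columns are hypothetical stand-ins for distance strikes attempted and landed, and this is not the code from the linked TennisModel file.

```python
import numpy as np
import pandas as pd

# Hypothetical per-fight totals; column names are placeholders.
df = pd.DataFrame({
    "att": [20, 0, 50],
    "landed": [10, 0, 20],
})

# Row-wise apply: one Python function call per row (slow on large frames).
slow = df.apply(
    lambda row: row["landed"] / row["att"] * 100 if row["att"] else np.nan,
    axis=1,
)

# Vectorized: whole-column arithmetic; zero attempts are masked to NaN first
# so the division yields NaN instead of dividing by zero.
fast = df["landed"] / df["att"].where(df["att"] > 0) * 100

print(pd.DataFrame({"apply": slow, "vectorized": fast}))
```

Both versions should produce the same values; the vectorized form simply pushes the loop into pandas and NumPy instead of calling a Python lambda per row.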