jhrcook / protein-language-models

Experimenting with protein language model predictions

Protein language models

Setup

pyenv local 3.11
python -m venv .env
source .env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pre-commit install

Data preparation

Run the following script to download and prepare the raw data:

./prepare_data.py

Data sources:

AlphaMissense predictions: https://zenodo.org/records/8360242 (downloaded file: "AlphaMissense_aa_substitutions.tsv.gz")

ESM1b paper: https://www.nature.com/articles/s41588-023-01465-0

ESM1b predictions: https://huggingface.co/spaces/ntranoslab/esm_variants/tree/main (downloaded file: "ALL_hum_isoforms_ESM1b_LLR.zip")

Both downloaded files were copied to "raw-data/".
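
As a quick sanity check before running any analysis, the following is a minimal sketch (not part of this repository) that reports which of the two downloaded files are missing from "raw-data/" and previews the first rows of a gzipped TSV. The helper names `missing_raw_files` and `iter_tsv_rows` are hypothetical, and skipping leading `#` lines is an assumption about the AlphaMissense file layout. The reader takes column names from the header row, so it does not depend on any particular schema.

```python
from __future__ import annotations

import csv
import gzip
from pathlib import Path
from typing import Iterator

# Directory and file names as listed under "Data sources".
RAW_DATA_DIR = Path("raw-data")
EXPECTED_FILES = (
    "AlphaMissense_aa_substitutions.tsv.gz",
    "ALL_hum_isoforms_ESM1b_LLR.zip",
)


def missing_raw_files(raw_dir: Path = RAW_DATA_DIR) -> list[str]:
    """Return the expected file names that are not present in `raw_dir`."""
    return [name for name in EXPECTED_FILES if not (raw_dir / name).is_file()]


def iter_tsv_rows(path: Path, limit: int | None = None) -> Iterator[dict[str, str]]:
    """Yield rows of a gzipped TSV as dicts keyed by the header row.

    Lines starting with '#' are skipped (assumed to be comments).
    """
    with gzip.open(path, "rt", newline="") as handle:
        data_lines = (line for line in handle if not line.startswith("#"))
        reader = csv.DictReader(data_lines, delimiter="\t")
        for i, row in enumerate(reader):
            if limit is not None and i >= limit:
                break
            yield row


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing from raw-data/:", ", ".join(missing))
    else:
        # Preview a few rows without loading the whole file into memory.
        for row in iter_tsv_rows(RAW_DATA_DIR / EXPECTED_FILES[0], limit=5):
            print(row)
```

Streaming the file row by row keeps memory use low, which matters because the substitutions file is large.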
