NoLabs is an open-source biolab that lets you run experiments with the latest state-of-the-art models for bio research.
The goal of the project is to accelerate bio research by making inference models easy to use for everyone. We currently support a protein biolab (predicting useful protein properties such as solubility, localisation, gene ontology, folding, etc.) and a drug discovery biolab (constructing ligands and testing their binding to target proteins).
We are working on expanding both and adding a cell biolab and a genetic biolab, and we would appreciate your support and contributions.
Let's accelerate bio research!
BioBuddy - drug discovery co-pilot:
BioBuddy is a drug discovery co-pilot that supports:
- Downloading data from ChEMBL
- Downloading data from RCSB PDB
- Answering questions about the drug discovery process, targets, chemical components, etc.
For example, you can ask
- "Can you pull me some latest approved drugs?"
- "Can you download me 1000 rhodopsins?"
- "How does an aspirin molecule look like?" and it will do this and answer other questions.
To enable BioBuddy, run this command when starting NoLabs:
$ ENABLE_BIOBUDDY=true docker compose up nolabs
Also start the BioBuddy microservice:
$ OPENAI_API_KEY=your_openai_api_key docker compose up biobuddy
NoLabs uses GPT-4 by default for the best performance. You can change the model in microservices/biobuddy/biobuddy/services.py.
You can ignore OPENAI_API_KEY warnings when running other services using docker compose.
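If you prefer a single command, both services can be started together; docker compose accepts multiple service names and the environment variables can be combined (a convenience sketch, assuming both services are defined in the same compose file):
$ ENABLE_BIOBUDDY=true OPENAI_API_KEY=your_openai_api_key docker compose up nolabs biobuddy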
Drug discovery lab:
- Drug-target interaction prediction and high-throughput virtual screening (HTVS), based on:
  - Automatic pocket prediction via P2Rank
  - Automatic MSA generation via HH-suite3
Protein lab:
- Prediction of subcellular localisation via a fine-tuned ritakurban/ESM_protein_localization model (to be updated with a better model)
- Prediction of folded structure via facebook/esmfold_v1
- Gene ontology prediction for the 200 most common gene ontology terms
- Protein solubility prediction
Protein design lab:
- Protein generation via RFDiffusion
Conformations lab:
# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs
$ docker compose up
OR, if you want to run a single feature:
$ docker compose up nolabs [gene_ontology|localisation|protein_design|solubility|conformations]
The server will be available at http://localhost:9000
WARNING: To install RoseTTAFold, check the RoseTTAFold section.
We provide individual Docker containers backed by FastAPI for each feature, which are available in the /microservices
folder. You can use them individually as APIs.
For example, to run the esmfold
service, you can use Docker Compose:
$ docker compose up esmfold
Once the service is up, you can make a POST request to perform a task, such as predicting a protein's folded structure. Here's a simple Python example:
import requests
# Define the API endpoint
url = 'http://127.0.0.1:5736/run-folding'
# Specify the protein sequence in the request body
data = {
'protein_sequence': 'YOUR_PROTEIN_SEQUENCE_HERE'
}
# Make the POST request and get the response
response = requests.post(url, json=data)
# Extract the PDB content from the response
pdb_content = response.json().get('pdb_content', '')
print(pdb_content)
This Python script makes a POST request to the esmfold microservice with a protein sequence and prints the predicted PDB content.
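In practice you will usually also want to check the HTTP status and save the structure to disk. A minimal extension of the example above, reusing the same /run-folding endpoint and 'pdb_content' response field:

import requests

url = 'http://127.0.0.1:5736/run-folding'
data = {'protein_sequence': 'YOUR_PROTEIN_SEQUENCE_HERE'}

# Make the POST request and fail loudly on HTTP errors
response = requests.post(url, json=data)
response.raise_for_status()

# Save the predicted structure to a PDB file
pdb_content = response.json().get('pdb_content', '')
with open('prediction.pdb', 'w') as f:
    f.write(pdb_content)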
Since we provide individual Docker containers backed by FastAPI for each feature, available in the /microservices
folder, you can run them on separate machines. This setup is particularly useful if you're developing on a computer
without GPU support but have access to a VM with a GPU for tasks like folding, docking, etc.
For instance, to run the diffdock
service, use Docker Compose on the VM or computer equipped with a GPU.
On your server/VM/computer with a GPU, run:
$ docker compose up diffdock
Once the service is up, you can verify that it is accessible from your computer by navigating to http://<gpu_machine_ip>:5737/docs
If everything is correct, you should see the FastAPI Swagger page listing diffdock's API endpoints.
Next, update the nolabs/infrastructure/settings.ini file on your primary machine to include the IP address of the service (replace 127.0.0.1 with your GPU machine's IP):
...
p2rank = http://127.0.0.1:5731
esmfold = http://127.0.0.1:5736
esmfold_light = http://127.0.0.1:5733
msa_light = http://127.0.0.1:5734
umol = http://127.0.0.1:5735
diffdock = http://127.0.0.1:5737 -> http://74.82.28.227:5737
...
And now you are ready to use this service hosted on a separate machine!
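Since every microservice is a FastAPI app, each one serves its OpenAPI schema at /openapi.json by default, which makes it easy to confirm from the primary machine that the configured services are reachable. A minimal sketch (the URLs are the example values from settings.ini above; adjust them to your setup):

import requests

# Example service URLs, mirroring nolabs/infrastructure/settings.ini
services = {
    'p2rank': 'http://127.0.0.1:5731',
    'esmfold': 'http://127.0.0.1:5736',
    'diffdock': 'http://74.82.28.227:5737',  # remote GPU machine
}

for name, base_url in services.items():
    try:
        # FastAPI exposes its OpenAPI schema at /openapi.json by default
        r = requests.get(f'{base_url}/openapi.json', timeout=5)
        r.raise_for_status()
        endpoints = len(r.json().get('paths', {}))
        print(f'{name}: reachable ({endpoints} endpoints)')
    except requests.RequestException as exc:
        print(f'{name}: NOT reachable ({exc})')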
Model: RFdiffusion
RFdiffusion is an open-source method for structure generation, with or without conditional information (a motif, target, etc.).
docker compose up protein_design
Swagger UI will be available on http://localhost:5789/docs
or install as a python package
Model: ESMFold - Evolutionary Scale Modeling
docker compose up esmfold
Swagger UI will be available on http://localhost:5736/docs
or install as a python package
Model: ESMAtlas
docker compose up esmfold_light
Swagger UI will be available on http://localhost:5733/docs
or install as a python package
Model: Hugging Face
docker compose up gene_ontology
Swagger UI will be available on http://localhost:5788/docs
or install as a python package
Model: Hugging Face
docker compose up localisation
Swagger UI will be available on http://localhost:5787/docs
or install as a python package
Model: p2rank
docker compose up p2rank
Swagger UI will be available on http://localhost:5731/docs
or install as a python package
Model: Hugging Face
docker compose up solubility
Swagger UI will be available on http://localhost:5786/docs
Model: UMol
docker compose up umol
Swagger UI will be available on http://localhost:5735/docs
Model: RoseTTAFold
docker compose up rosettafold
Swagger UI will be available on http://localhost:5738/docs
WARNING: To use RoseTTAFold you must specify ROSETTACOMMONS_CONDA_USERNAME and ROSETTACOMMONS_CONDA_PASSWORD in compose.yaml and download additional data (see step 5 on the https://github.com/RosettaCommons/RoseTTAFold page). Also change the '.' volumes to point to the specified folders.
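The Swagger UI at each /docs URL is the authoritative reference for a microservice's endpoints. If you prefer to inspect a service programmatically, the same information is available from its /openapi.json schema; a small sketch, using the esmfold port from above as an example:

import requests

# Fetch the OpenAPI schema of a running microservice (esmfold on port 5736 here)
schema = requests.get('http://127.0.0.1:5736/openapi.json', timeout=5).json()

# Print each endpoint with its HTTP method and summary, if any
for path, methods in schema.get('paths', {}).items():
    for method, spec in methods.items():
        print(f'{method.upper():6} {path}  {spec.get("summary", "")}')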
[Recommended for laptops] If you are using a laptop, use the --test argument (no need for a lot of compute):
- RAM > 16GB
- [Optional] GPU memory >= 16GB (REALLY speeds up the inference)
[Recommended for powerful workstations] Otherwise, if you want to host everything on your machine and get faster inference (also required for folding sequences longer than 400 amino acids):
- RAM > 30GB
- [Optional] GPU memory >= 40GB (REALLY speeds up the inference)