ohld / NoLabs

Open source biolab

NoLabs

About

NoLabs is an open source biolab that lets you run experiments with the latest state-of-the-art models for bio research.

The goal of the project is to accelerate bio research by making inference models easy to use for everyone. We currently support a protein biolab (predicting useful protein properties such as solubility, localisation, gene ontology, folding, etc.) and a drug discovery biolab (constructing ligands and testing their binding to target proteins).

We are working on expanding both and on adding a cell biolab and a genetic biolab, and we would appreciate your support and contributions.

Let's accelerate bio research!

Features

BioBuddy - drug discovery copilot:

BioBuddy is a drug discovery copilot that supports:

  • Downloading data from ChEMBL
  • Downloading data from RCSB PDB
  • Answering questions about the drug discovery process, targets, chemical components, etc.

For example, you can ask:

  • "Can you pull me some latest approved drugs?"
  • "Can you download me 1000 rhodopsins?"
  • "What does an aspirin molecule look like?"

BioBuddy will carry out these requests and answer other questions as well.

To enable BioBuddy, run this command when starting NoLabs:

$ ENABLE_BIOBUDDY=true docker compose up nolabs

And also start the biobuddy microservice:

$ OPENAI_API_KEY=your_openai_api_key docker compose up biobuddy

BioBuddy uses GPT-4 for the best performance. You can change the model in microservices/biobuddy/biobuddy/services.py

You can ignore OPENAI_API_KEY warnings when running other services using docker compose.

Drug discovery lab:

  • Drug-target interaction prediction, high throughput virtual screening (HTVS) based on:
  • Automatic pocket prediction via P2Rank
  • Automatic MSA generation via HH-suite3

Protein lab:

  • Prediction of subcellular localisation via fine-tuned ritakurban/ESM_protein_localization model (to be updated with a better model)
  • Prediction of folded structure via facebook/esmfold_v1
  • Gene ontology prediction for 200 most popular gene ontologies
  • Protein solubility prediction

Protein design Lab:


Conformations Lab:

Starting

# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs
$ docker compose up

OR if you want to run a single feature

$ docker compose up nolabs [gene_ontology|localisation|protein_design|solubility|conformations]

Server will be available on http://localhost:9000
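If you start services from a script instead of typing the commands by hand, the two start-up variants above can be sketched as a small helper. This is only an illustration: the function name compose_up_args is hypothetical, and the feature names are the ones listed in the command above.

```python
from typing import List, Optional

# Feature names as listed in the start-up command above.
FEATURES = (
    "gene_ontology",
    "localisation",
    "protein_design",
    "solubility",
    "conformations",
)


def compose_up_args(feature: Optional[str] = None) -> List[str]:
    """Build the argument list for `docker compose up` (hypothetical helper).

    With no feature, everything defined in the compose file is started.
    With a feature, only `nolabs` and that feature's service are started.
    """
    args = ["docker", "compose", "up"]
    if feature is None:
        return args
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature!r}")
    return args + ["nolabs", feature]


print(" ".join(compose_up_args("solubility")))
```

The resulting list can be passed to subprocess.run(compose_up_args("solubility"), check=True) from the repository root.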

WARNING: To install RoseTTAFold, see the RoseTTAFold section below.

APIs

We provide individual Docker containers backed by FastAPI for each feature, which are available in the /microservices folder. You can use them individually as APIs.

For example, to run the esmfold service, you can use Docker Compose:

$ docker compose up esmfold

Once the service is up, you can make a POST request to perform a task, such as predicting a protein's folded structure. Here's a simple Python example:

import requests

# Define the API endpoint
url = 'http://127.0.0.1:5736/run-folding'

# Specify the protein sequence in the request body
data = {
    'protein_sequence': 'YOUR_PROTEIN_SEQUENCE_HERE'
}

# Make the POST request and get the response
response = requests.post(url, json=data)

# Extract the PDB content from the response
pdb_content = response.json().get('pdb_content', '')

print(pdb_content)

This Python script makes a POST request to the esmfold microservice with a protein sequence and prints the predicted PDB content.
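The script above assumes the request succeeds. A slightly more defensive variant might look like the sketch below, which uses only the Python standard library so that it runs without extra dependencies. The endpoint and the protein_sequence and pdb_content field names come from the example above; the function names, the whitespace clean-up, and the 600-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

# Endpoint taken from the example above.
DEFAULT_URL = "http://127.0.0.1:5736/run-folding"


def build_folding_request(sequence: str, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the protein sequence."""
    # Assumption: whitespace and line breaks in the pasted sequence are not meaningful.
    cleaned = "".join(sequence.split())
    if not cleaned:
        raise ValueError("protein sequence is empty")
    body = json.dumps({"protein_sequence": cleaned}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_pdb(payload: dict) -> str:
    """Return the PDB text from a decoded response, or raise if it is missing."""
    pdb = payload.get("pdb_content")
    if not pdb:
        raise ValueError("response has no 'pdb_content' field")
    return pdb


def fold(sequence: str, url: str = DEFAULT_URL, timeout: float = 600.0) -> str:
    """Send the sequence to the esmfold service and return the predicted PDB text."""
    request = build_folding_request(sequence, url)
    # urlopen raises an exception on connection failures and HTTP error codes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_pdb(json.load(response))
```

With the esmfold service running, fold("YOUR_PROTEIN_SEQUENCE_HERE") returns the PDB text, which you can then write to a .pdb file.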

Running services on a separate machine

Since we provide individual Docker containers backed by FastAPI for each feature, available in the /microservices folder, you can run them on separate machines. This setup is particularly useful if you're developing on a computer without GPU support but have access to a VM with a GPU for tasks like folding, docking, etc.

For instance, to run the diffdock service, use Docker Compose on the VM or computer equipped with a GPU.

On your server/VM/computer with a GPU, run:

$ docker compose up diffdock

Once the service is up, you can check that you can access it from your computer by navigating to http://<gpu_machine_ip>:5737/docs
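The same check can be scripted, which is handy on a machine without a browser. The sketch below uses only the standard library; the helper names docs_url and is_reachable are hypothetical, and the /docs path is the one mentioned above.

```python
import urllib.request


def docs_url(host: str, port: int) -> str:
    """Return the address of a service's FastAPI docs page."""
    return f"http://{host}:{port}/docs"


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to `url` succeeds with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

For example, is_reachable(docs_url("<gpu_machine_ip>", 5737)), with the placeholder replaced by your GPU machine's address, should return True once diffdock is up.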

If everything is correct, you should see the FastAPI docs page listing diffdock's API endpoints.

Next, update the nolabs/infrastructure/settings.ini file on your primary machine to include the IP address of the service (replace 127.0.0.1 with your GPU machine's IP):

...
p2rank = http://127.0.0.1:5731
esmfold = http://127.0.0.1:5736
esmfold_light = http://127.0.0.1:5733
msa_light = http://127.0.0.1:5734
umol = http://127.0.0.1:5735
diffdock = http://127.0.0.1:5737 -> http://74.82.28.227:5737
...

And now you are ready to use this service hosted on a separate machine!
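If you switch machines often, the edit to settings.ini can be scripted with Python's configparser. The sketch below is an assumption-laden illustration: the [microservices] section name in the sample is hypothetical (the excerpt above does not show the real section name), so the helper simply updates the key in whichever section defines it.

```python
import configparser
from urllib.parse import urlsplit, urlunsplit


def point_service_at(config: configparser.ConfigParser, service: str, host: str) -> int:
    """Replace the host of `service` in every section that defines it.

    The port is kept as-is. Returns the number of entries changed.
    """
    changed = 0
    for section in config.sections():
        if config.has_option(section, service):
            parts = urlsplit(config.get(section, service))
            netloc = f"{host}:{parts.port}" if parts.port else host
            config.set(section, service, urlunsplit(parts._replace(netloc=netloc)))
            changed += 1
    return changed


# Hypothetical sample; the real file has more keys and may use another section name.
SAMPLE = """
[microservices]
esmfold = http://127.0.0.1:5736
diffdock = http://127.0.0.1:5737
"""

config = configparser.ConfigParser(interpolation=None)
config.read_string(SAMPLE)
point_service_at(config, "diffdock", "74.82.28.227")
print(config.get("microservices", "diffdock"))
```

To apply this to the real file, load it with config.read(path) and save it with config.write(file). Note that configparser does not preserve comments when it rewrites a file.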

Supported microservices list

1) Protein design docker API

Model: RFdiffusion

RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).

docker compose up protein_design

Swagger UI will be available on http://localhost:5789/docs

or install as a python package

2) ESMFold docker API

Model: ESMFold - Evolutionary Scale Modeling

docker compose up esmfold

Swagger UI will be available on http://localhost:5736/docs

or install as a python package

3) ESMAtlas docker API

Model: ESMAtlas

docker compose up esmfold_light

Swagger UI will be available on http://localhost:5733/docs

or install as a python package

4) Protein function prediction docker API

Model: Hugging Face

docker compose up gene_ontology

Swagger UI will be available on http://localhost:5788/docs

or install as a python package

5) Protein localisation prediction docker API

Model: Hugging Face

docker compose up localisation

Swagger UI will be available on http://localhost:5787/docs

or install as a python package

6) Protein binding site prediction docker API

Model: p2rank

docker compose up p2rank

Swagger UI will be available on http://localhost:5731/docs

or install as a python package

7) Protein solubility prediction docker API

Model: Hugging Face

docker compose up solubility

Swagger UI will be available on http://localhost:5786/docs

or install as a python package

8) Protein-ligand structure prediction docker API

Model: UMol

docker compose up umol

Swagger UI will be available on http://localhost:5735/docs

or install as a python package

9) RoseTTAFold docker API

Model: RoseTTAFold

docker compose up rosettafold

Swagger UI will be available on http://localhost:5738/docs

or install as a python package

WARNING: To use RoseTTAFold you must specify ROSETTACOMMONS_CONDA_USERNAME and ROSETTACOMMONS_CONDA_PASSWORD in compose.yaml and download additional data (see step 5 on the https://github.com/RosettaCommons/RoseTTAFold page). Also change the '.' volumes to point to the specified folders.
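For scripting against several services at once, the ports listed above can be collected in one place. The mapping below is copied from this list; the swagger_url helper is a hypothetical convenience, not part of NoLabs.

```python
# Compose service name -> port, copied from the list above.
SERVICE_PORTS = {
    "protein_design": 5789,
    "esmfold": 5736,
    "esmfold_light": 5733,
    "gene_ontology": 5788,
    "localisation": 5787,
    "p2rank": 5731,
    "solubility": 5786,
    "umol": 5735,
    "rosettafold": 5738,
}


def swagger_url(service: str, host: str = "localhost") -> str:
    """Return the Swagger UI address for a service (hypothetical helper)."""
    try:
        port = SERVICE_PORTS[service]
    except KeyError:
        raise ValueError(f"unknown service: {service!r}") from None
    return f"http://{host}:{port}/docs"


for name in sorted(SERVICE_PORTS):
    print(f"{name}: {swagger_url(name)}")
```

Passing a different host, for example swagger_url("umol", host="74.82.28.227"), gives the address of a service running on a separate machine.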

Technologies

The following tools were used in this project:

Requirements

[Recommended for laptops] If you are using a laptop, use the --test argument (it does not need much compute):

  • RAM > 16GB
  • [Optional] GPU memory >= 16GB (REALLY speeds up the inference)

[Recommended for powerful workstations] Otherwise, if you want to host everything on your machine and get faster inference (this is also required for folding sequences longer than 400 amino acids):

  • RAM > 30GB
  • [Optional] GPU memory >= 40GB (REALLY speeds up the inference)
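As a rough way to read these thresholds, the sketch below maps available RAM and the longest sequence you plan to fold to one of the two setups. The function name and the "full", "test", and "insufficient" labels are hypothetical, and the GPU figures are left out because they are optional.

```python
def recommended_setup(ram_gb: float, longest_sequence: int = 0) -> str:
    """Pick a setup from the RAM thresholds above (hypothetical helper).

    "full"         : RAM > 30GB, host everything, any sequence length.
    "test"         : RAM > 16GB, run with --test, sequences up to 400 residues.
    "insufficient" : neither set of requirements is met.
    """
    if ram_gb > 30:
        return "full"
    if ram_gb > 16 and longest_sequence <= 400:
        return "test"
    return "insufficient"


print(recommended_setup(24, longest_sequence=300))
```

A machine with 24GB of RAM, for instance, fits the laptop setup as long as the sequences stay within 400 amino acids.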

Made by Igor and Tim

 

License: MIT License