PaddlePaddle / PaddleHelix

Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning (the "螺旋桨" bio-computing toolkit)




Latest News

2022.12.08 Paper "HelixMO: Sample-Efficient Molecular Optimization in Scene-Sensitive Latent Space" is accepted by BIBM 2022. Please refer to link1 or link2 for more details. We have also deployed the drug design service on the PaddleHelix website.

2022.08.11 PaddleHelix released the code of HelixGEM-2, a novel molecular property prediction network that models full-range many-body interactions. It ranked 1st on the OGB PCQM4Mv2 leaderboard. Please refer to the paper and code for more details.

2022.07.29 PaddleHelix released the code of HelixFold-Single, an MSA-free protein structure prediction pipeline that relies only on primary sequences and can predict protein structures within seconds. Please refer to the paper and code for more details. You are welcome to try the online structure prediction service on the PaddleHelix website.

2022.07.18 PaddleHelix fully released HelixFold, including the training and inference pipelines. The complete training time has been reduced from 11 days to 5.12 days. Prediction of ultra-long monomer proteins (around 6600 AA) is now supported. Please refer to the paper and code for more details.

2022.07.07 Paper "BatchDTA: implicit batch alignment enhances deep learning-based drug–target affinity estimation" is published in Briefings in Bioinformatics. Please refer to the paper and code for more details.

2022.05.24 Paper "HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer" is published in Bioinformatics. Please refer to the paper for more information.

2022.02.07 Paper "Geometry-enhanced molecular representation learning for property prediction" is published in Nature Machine Intelligence. Please refer to the paper and code to explore the algorithm.

More news ...

2022.01.07 PaddleHelix released a reproduction of the AlphaFold 2 inference pipeline using PaddlePaddle in HelixFold.

2021.11.23 Paper "Multimodal Pre-Training Model for Sequence-based Prediction of Protein-Protein Interaction" is accepted by MLCB 2021. Please refer to the paper and code for more details.

2021.10.25 Paper "Docking-based Virtual Screening with Multi-Task Learning" is accepted by BIBM 2021.

2021.09.29 Paper "Property-Aware Relation Networks for Few-shot Molecular Property Prediction" is accepted by NeurIPS 2021 as a Spotlight Paper. Please refer to PAR for more details.

2021.07.29 PaddleHelix released a novel geometry-level molecular pre-training model, taking advantage of the 3D spatial structures of the molecules. Please refer to GEM for more details.

2021.06.17 The PaddleHelix team won 2nd place in the PCQM4M-LSC track of the OGB-LSC KDD Cup 2021, which requires predicting the DFT-calculated HOMO-LUMO energy gap of molecules. Please refer to the solution for more details.

2021.05.20 PaddleHelix v1.0 released. 1) Updated from the static framework to the dynamic framework; 2) Added new applications: molecular generation and drug-drug synergy.

2021.05.18 Paper "Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity" is accepted by KDD 2021. The code is available here.

2021.03.15 The PaddleHelix team ranked 1st on the ogbg-molhiv and ogbg-molpcba leaderboards of OGB, both of which evaluate molecular property prediction.


Introduction

PaddleHelix is a bio-computing tool that applies machine learning, especially deep neural networks, to facilitate development in the following areas:

  • Drug Discovery. Provides 1) large-scale pre-training models for compounds and proteins; 2) applications including molecular property prediction, drug-target affinity prediction, and molecular generation.
  • Vaccine Design. Provides RNA design algorithms, including LinearFold and LinearPartition.
  • Precision Medicine. Provides a drug-drug synergy application.

Resources

Application Platform

The PaddleHelix platform provides AI + biochemistry capabilities for drug discovery, vaccine design, and precision medicine.

Installation Guide

PaddleHelix is a bio-computing repository built on PaddlePaddle, a high-performance parallelized deep learning platform. The installation prerequisites and guide can be found here.
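As a quick sanity check after following the installation guide, the sketch below reports which prerequisite modules the current Python interpreter can locate. It uses only the standard library. The import names `paddle` (PaddlePaddle) and `pahelix` (PaddleHelix) are assumptions and are not stated in this README; the helper `missing_modules` is hypothetical and not part of either package. Adjust the names if the installation guide says otherwise.

```python
import importlib.util
import sys

# Assumed top-level import names: "paddle" for PaddlePaddle, "pahelix" for PaddleHelix.
REQUIRED_MODULES = ("paddle", "pahelix")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All prerequisite modules can be located.")
```

The check only confirms that the modules can be found; it does not import them or verify versions or GPU support.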

Tutorials

We provide a range of tutorials to help you navigate the repository and get started quickly.

Examples

We also provide examples that implement various algorithms and show how to run them.

Competition Solutions

The PaddleHelix team has participated in multiple bio-computing competitions. The solutions can be found here.

Guide for Developers

  • To develop new functions based on the PaddleHelix source code, please refer to the guide for developers.
  • For more details on the APIs, please refer to the documentation.

Welcome to Join Us

We are looking for machine learning researchers and engineers, as well as bioinformatics and computational chemistry researchers, who are interested in AI-driven drug design. We are based in Shenzhen and Shanghai, China. Please send your resume to wangfan04@baidu.com or fangxiaomin01@baidu.com.

About

License: Apache License 2.0


Languages

  • Python 69.8%
  • Jupyter Notebook 17.6%
  • C++ 11.4%
  • Shell 1.1%
  • C 0.1%
  • CMake 0.0%