Update: ZMPY3D PyTorch implementation is available (August 25, 2024).
ZMPY3D: accelerating protein structure volume analysis through vectorized 3D Zernike Moments and Python-based GPU Integration
For CPU support only, please refer to the repository:
- ZMPY3D, supports NumPy (https://github.com/tawssie/ZMPY3D)
For GPU support with TensorFlow, CuPy and PyTorch, please refer to the other three repositories:
- ZMPY3D_TF, supports TensorFlow (https://github.com/tawssie/ZMPY3D_TF)
- ZMPY3D_CP, supports CuPy (https://github.com/tawssie/ZMPY3D_CP)
- ZMPY3D_PT, supports PyTorch (https://github.com/tawssie/ZMPY3D_PT)
ZMPY3D is a Python-based software package that accelerates the computation of 3D Zernike moments by vectorizing the mathematical formulae, which enables them to be computed on graphics processing units (GPUs). The package offers implementations built on popular GPU-supported libraries such as CuPy and TensorFlow, along with a NumPy implementation, aiming to improve computational efficiency, adaptability, and flexibility in future algorithmic development.
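To illustrate what vectorizing a moment computation means in general, here is a minimal NumPy sketch. It is not ZMPY3D's code: it computes plain geometric moments of a cubic voxel grid, and the function names, the [-1, 1] coordinate range, and the tiny grid size are all assumptions chosen for the example.

```python
import numpy as np

def geometric_moments_loop(voxel, max_order):
    """Reference version: one Python-level iteration per moment index (p, q, r)."""
    n = voxel.shape[0]  # assumes a cubic grid of shape (n, n, n)
    coords = np.linspace(-1.0, 1.0, n)
    moments = np.zeros((max_order + 1,) * 3)
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            for r in range(max_order + 1):
                weights = (
                    (coords ** p)[:, None, None]
                    * (coords ** q)[None, :, None]
                    * (coords ** r)[None, None, :]
                )
                moments[p, q, r] = np.sum(weights * voxel)
    return moments

def geometric_moments_vectorized(voxel, max_order):
    """Vectorized version: all (p, q, r) moments in a single tensor contraction."""
    n = voxel.shape[0]
    coords = np.linspace(-1.0, 1.0, n)
    orders = np.arange(max_order + 1)
    # powers[p, i] = coords[i] ** p, computed once and reused for all three axes
    powers = coords[None, :] ** orders[:, None]
    return np.einsum("pi,qj,rk,ijk->pqr", powers, powers, powers, voxel)

rng = np.random.default_rng(0)
demo_voxel = rng.random((6, 6, 6))
loop_result = geometric_moments_loop(demo_voxel, 3)
vectorized_result = geometric_moments_vectorized(demo_voxel, 3)
print(np.allclose(loop_result, vectorized_result))
```

The vectorized form replaces the nested Python loops with one array expression, which is the kind of formulation that array libraries can run on a GPU.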
Prerequisites:
- ZMPY3D : Python >=3.9.16, NumPy >=1.23.5
- ZMPY3D_CP: Python >=3.9.16, NumPy, CuPy >=12.2.0
- ZMPY3D_TF: Python >=3.9.16, NumPy >=1.23.5, TensorFlow >=2.12.0, TensorFlow-Probability >=0.20.1
- ZMPY3D_PT: Python >=3.9.16, NumPy >=1.23.5, PyTorch >=2.3.1
- Open a terminal.
- Install the package from PyPI with pip. For example, to install the TensorFlow version, run:
pip install ZMPY3D_TF
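After installation, one way to check which variants are visible to the current Python environment is sketched below. It assumes that each package's import name matches its repository name; this document only shows that for ZMPY3D_TF, so the other three names are assumptions. The helper `installed_variants` is a hypothetical name for this example.

```python
import importlib.util

# Import names assumed to match the repository names listed above.
PACKAGES = ("ZMPY3D", "ZMPY3D_TF", "ZMPY3D_CP", "ZMPY3D_PT")

def installed_variants(names=PACKAGES):
    """Map each top-level package name to True if Python can locate it."""
    # find_spec only locates the package; it does not import it.
    return {name: importlib.util.find_spec(name) is not None for name in names}

for name, found in installed_variants().items():
    print(f"{name}: {'installed' if found else 'not found'}")
```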
- 3D Zernike moments with Tensorflow:
- Shape similarity with CuPy:
- Structure superposition with NumPy:
- Runtime evaluation:
A 100x100x100 voxel cube was used to perform 10,000 3D Zernike moment calculations at two maximum orders, 20 and 40. The tables below list execution times on different hardware configurations using the TensorFlow, CuPy, and NumPy libraries:
Order | CPU1 | CPU2 |
---|---|---|
20 | 33m20s | 14m1s |
40 | 951m40s | 338m20s |
Order | T4 | RX3070Ti | V100 | L4 |
---|---|---|---|---|
20 | 1m1s | 0m36s | 0m31s | 0m39s |
40 | 24m40s | 9m3s | 10m54s | 11m13s |
Order | T4 | RX3070Ti | V100 | L4 |
---|---|---|---|---|
20 | 4m45s | 2m30s | 1m42s | 2m50s |
40 | 35m20s | 19m19s | 14m45s | 18m40s |
Note: m = minutes, s = seconds.
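The timings above can be turned into per-calculation costs and speedup factors with a few lines of Python. The sketch below copies the CPU table and the first of the two GPU tables. The helper names `to_seconds` and `speedup` are hypothetical, and the choice of CPU1 as the baseline is arbitrary.

```python
import re

TOTAL_CALCULATIONS = 10_000  # number of moment calculations per run, as stated above

def to_seconds(text):
    """Convert a duration written like '33m20s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds

def speedup(slow, fast):
    """Return how many times faster the 'fast' timing is than the 'slow' one."""
    return to_seconds(slow) / to_seconds(fast)

# Values copied from the CPU table and the first GPU table.
cpu_times = {
    20: {"CPU1": "33m20s", "CPU2": "14m1s"},
    40: {"CPU1": "951m40s", "CPU2": "338m20s"},
}
gpu_times = {
    20: {"T4": "1m1s", "RX3070Ti": "0m36s", "V100": "0m31s", "L4": "0m39s"},
    40: {"T4": "24m40s", "RX3070Ti": "9m3s", "V100": "10m54s", "L4": "11m13s"},
}

for order in (20, 40):
    baseline = cpu_times[order]["CPU1"]
    per_calc = to_seconds(baseline) / TOTAL_CALCULATIONS
    print(f"order {order}: CPU1 takes {per_calc:.3f} s per calculation")
    for device, timing in gpu_times[order].items():
        print(f"  {device}: {speedup(baseline, timing):.1f}x faster than CPU1")
```

For instance, CPU1 at order 20 took 2000 seconds in total, or 0.2 seconds per calculation, while the T4 entry in the first GPU table took 61 seconds, roughly 33 times faster.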
Due to GitHub's file size limitations, the cache data for order 40 (1.3 GB) must be downloaded separately for the ZMPY3D_TF package. Follow these steps:
- Open your terminal and execute the following command to find the folder of the ZMPY3D_TF package:
python -c "import ZMPY3D_TF; print(ZMPY3D_TF.__file__)"
- Note the printed path, which will look like /User/path/python/site-packages/ZMPY3D_TF/__init__.py.
- Go to the cache_data folder at the same level as the __init__.py file, i.e., /User/path/python/site-packages/ZMPY3D_TF/cache_data.
- Download the 1.3 GB max order 40 .pkl file from the link below into the cache_data folder. https://drive.google.com/uc?id=1RR1rF_5YJqaxNC5AK0Ie_8MswGb0Tttw
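The folder lookup in the steps above can also be done from Python. The following is a minimal sketch using only the standard library. The helper names `cache_dir_from_init` and `find_cache_dir` are hypothetical and not part of ZMPY3D_TF, and the sketch only locates the folder; the download itself is still done manually.

```python
import importlib.util
from pathlib import Path

def cache_dir_from_init(init_path):
    """Return the cache_data folder that sits next to a package's __init__.py."""
    return Path(init_path).parent / "cache_data"

def find_cache_dir(package="ZMPY3D_TF"):
    """Locate the package without importing it; return None if it is not installed."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    return cache_dir_from_init(spec.origin)

cache_dir = find_cache_dir()
if cache_dir is None:
    print("ZMPY3D_TF was not found in this environment.")
else:
    print(f"Place the downloaded .pkl file in: {cache_dir}")
    if cache_dir.is_dir():
        # List cache files already present, to confirm the download landed here.
        for pkl_file in sorted(cache_dir.glob("*.pkl")):
            print(f"  found {pkl_file.name}")
```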
- Enhancing fold classification
- Facilitating structural superpositions
- Supporting protein docking
- Assisting molecular dynamics
- Enabling structure-based virtual screening
- Forecasting interacting interfaces
Feel free to submit pull requests for improvements or bug fixes.
Lai, J. S., Burley, S. K., & Duarte, J. M. (2024). ZMPY3D: Accelerating protein structure volume analysis through vectorized 3D Zernike moments and Python-based GPU integration. Bioinformatics Advances, vbae111. https://doi.org/10.1093/bioadv/vbae111
This project is licensed under the GNU General Public License v3.0.