ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

Poor energy & force metrics on paper's datasets (carbon nanotube, buckyball catcher)

ale99WGiais opened this issue

Describe the bug
We tried to fit MACE potentials on some datasets mentioned in the reference paper "Evaluation of the MACE Force Field Architecture: from Medicinal Chemistry to Materials Science".

In particular, we tried fitting MACE on the "Double-walled nanotube" and "Buckyball catcher" datasets.

The MAE values we obtained are very different from those reported in the paper, so we are wondering what we could be doing wrong :(

To Reproduce

MACE was installed using the following commands:

git clone https://github.com/ACEsuit/mace.git

conda create -n mace python=3.10 -y
conda activate mace
conda install micromamba -c conda-forge -c anaconda
micromamba install pytorch==2.0 torchvision torchaudio pytorch-cuda -c pytorch -c nvidia -c conda-forge -c anaconda
micromamba install numpy scipy matplotlib ase opt_einsum prettytable pandas e3nn scikit-learn=1.3.2 -c conda-forge -c anaconda
pip install mace/

To fit MACE on the nanotube we used the following scripts:

python ~/mace/mace/cli/run_train.py \
    --name="tube-256-0-r6-int1" \
    --train_file="../md22_double-walled_nanotube.xyz" \
    --valid_fraction=0.05 \
    --E0s="average" \
    --model="MACE" \
    --num_interactions=1 \
    --num_channels=256 \
    --max_L=0 \
    --correlation=3 \
    --r_max=6.0 \
    --forces_weight=1000 \
    --energy_weight=10 \
    --batch_size=2 \
    --valid_batch_size=2 \
    --max_num_epochs=650 \
    --start_swa=450 \
    --scheduler_patience=5 \
    --patience=15 \
    --eval_interval=3 \
    --ema \
    --swa \
    --swa_forces_weight=10 \
    --error_table='PerAtomMAE' \
    --default_dtype="float64" \
    --device=cuda \
    --seed=123 \
    --restart_latest \
    --save_cpu
python ~/mace/mace/cli/run_train.py \
    --name="tube-256-2-r5-int2" \
    --train_file="../md22_double-walled_nanotube.xyz" \
    --valid_fraction=0.05 \
    --E0s="average" \
    --model="MACE" \
    --num_interactions=2 \
    --num_channels=256 \
    --max_L=2 \
    --correlation=3 \
    --r_max=5.0 \
    --forces_weight=1000 \
    --energy_weight=10 \
    --batch_size=1 \
    --valid_batch_size=2 \
    --max_num_epochs=650 \
    --start_swa=450 \
    --scheduler_patience=5 \
    --patience=15 \
    --eval_interval=3 \
    --ema \
    --swa \
    --swa_forces_weight=10 \
    --error_table='PerAtomMAE' \
    --default_dtype="float64" \
    --device=cuda \
    --seed=123 \
    --restart_latest \
    --save_cpu
python ~/mace/mace/cli/run_train.py \
    --name="tube-256-2-r3-int2" \
    --train_file="../md22_double-walled_nanotube.xyz" \
    --valid_fraction=0.05 \
    --E0s="average" \
    --model="MACE" \
    --num_interactions=2 \
    --num_channels=256 \
    --max_L=2 \
    --correlation=3 \
    --r_max=3.0 \
    --forces_weight=1000 \
    --energy_weight=10 \
    --batch_size=2 \
    --valid_batch_size=2 \
    --max_num_epochs=650 \
    --start_swa=450 \
    --scheduler_patience=5 \
    --patience=15 \
    --eval_interval=3 \
    --ema \
    --swa \
    --swa_forces_weight=10 \
    --error_table='PerAtomMAE' \
    --default_dtype="float64" \
    --device=cuda \
    --seed=123 \
    --restart_latest \
    --save_cpu

These settings follow the examples at https://mace-docs.readthedocs.io/en/latest/examples/training_examples.html

Similar scripts were adopted for the buckyball catcher.

The jobs were submitted to machines with a single NVIDIA Tesla A100 GPU, with a time limit of about 3 days.

Data for both nanotube and buckyball was downloaded from here: http://www.sgdml.org/

Expected behavior

We expected low energy and force MAEs, as in the paper:

[image: MAE table from the paper]

But we got errors orders of magnitude higher:

buckyball mace-256-0-r6-int1 stdout.txt
buckyball mace-256-2-r3-int2 stdout.txt
buckyball mace-256-2-r5-int2 stdout.txt
nanotube mace-256-0-r6-int1 stdout.txt
nanotube mace-256-2-r3-int2 stdout.txt
nanotube mace-256-2-r5-int2 stdout.txt

Everything is uploaded here: https://uniudamce-my.sharepoint.com/:f:/g/personal/142135_spes_uniud_it/EvqCwMiR9PNMkZqb8L5iQTMBnHnpEm0-CQVCOsEskxbdaA?e=dhfnXJ

Thanks very much for the support,
Alessio

The MACE numbers are in eV and eV/Å, but the original dataset is in kcal/mol. Did you make the conversion? For numerical precision, it is better to use eV and eV/Å in the MACE code.

Hi Ilyes, thanks for your very quick response!

No, I'm sorry, we didn't notice that the original dataset was in kcal/mol.

We'll try converting the dataset to eV and refit the potentials asap :)
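
For anyone hitting the same problem, here is a minimal sketch of the conversion discussed above. It only shows the arithmetic: the factor is derived from the exact SI values of the Avogadro constant and the elementary charge, together with the thermochemical kilocalorie (4184 J). The function names are hypothetical and nothing is assumed about the layout of the downloaded files, so reading and rewriting the dataset is left out. Forces use the same factor because the length unit (Å) does not change.

```python
# Sketch: convert energies (kcal/mol) and forces (kcal/mol/Å) to eV and eV/Å.
# Function names are illustrative assumptions, not part of MACE or the dataset tools.

AVOGADRO = 6.02214076e23             # 1/mol (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # joules per eV (exact SI value)
JOULE_PER_KCAL = 4184.0              # thermochemical kilocalorie

# Energy of 1 kcal/mol per particle, expressed in eV (about 0.04336).
KCAL_PER_MOL_IN_EV = JOULE_PER_KCAL / (AVOGADRO * ELEMENTARY_CHARGE)


def energy_to_ev(energy_kcal_mol):
    """Convert one total energy from kcal/mol to eV."""
    return energy_kcal_mol * KCAL_PER_MOL_IN_EV


def forces_to_ev_per_angstrom(forces_kcal_mol_ang):
    """Convert a list of per-atom force vectors from kcal/mol/Å to eV/Å."""
    return [
        [component * KCAL_PER_MOL_IN_EV for component in force]
        for force in forces_kcal_mol_ang
    ]


if __name__ == "__main__":
    print(f"1 kcal/mol = {KCAL_PER_MOL_IN_EV:.7f} eV")
    print(energy_to_ev(10.0))
    print(forces_to_ev_per_angstrom([[1.0, 0.0, -2.0]]))
```

If ASE is installed, the same factor should also be available as `ase.units.kcal / ase.units.mol`, which avoids hard-coding the constants. An error metric computed on unconverted labels is numerically in kcal/mol (or kcal/mol/Å), so it is about 23 times larger than the same error expressed in eV.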