ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

LAMMPS with CPU MD

Amitcuhp opened this issue · comments

After training the model with the develop branch, LAMMPS MD is not working and the following error appears:

LAMMPS (28 Mar 2023 - Development)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task

The 'box' command has been removed and will be ignored

Reading data file ...
triclinic box = (0 0 0) to (30 30 30) with tilt (0 0 0)
1 by 1 by 2 MPI processor grid
reading atoms ...
55 atoms
Finding 1-2 1-3 1-4 neighbors ...
special bond factors lj: 0 0 0
special bond factors coul: 0 0 0
0 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
0 = max # of 1-4 neighbors
1 = max # of special neighbors
special bonds CPU = 0.001 seconds
read_data CPU = 0.002 seconds
CUDA unavailable, setting device type to torch::kCPU.
CUDA unavailable, setting device type to torch::kCPU.
Loading MACE model from "MACE_model_run-123_swa.model-lammps.pt" ...Loading MACE model from "MACE_model_run-123_swa.model-lammps.pt" ...terminate called after throwing an instance of 'c10::Error'
what(): open file failed because of errno 2 on fopen: , file path: MACE_model_run-123_swa.model-lammps.pt
Exception raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:21 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fdaa5ed556e in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7fdaa5e9ff18 in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::string const&) + 0x124 (0x7fda8db24634 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #3: caffe2::serialize::FileAdapter::FileAdapter(std::string const&) + 0x2e (0x7fda8db2468e in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x5a (0x7fda8db22ada in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;, std::unordered_map&lt;std::string, std::string, std::hash&lt;std::string&gt;, std::equal_to&lt;std::string&gt;, std::allocator&lt;std::pair&lt;std::string const, std::string&gt; &gt; &gt;&) + 0x2a5 (0x7fda8ebd0b85 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #6: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;) + 0x7b (0x7fda8ebd139b in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #7: torch::jit::load(std::string const&, c10::optional&lt;c10::Device&gt;) + 0xa5 (0x7fda8ebd1475 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #8: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x96e54f]
frame #9: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4ab633]
frame #10: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b0ace]
frame #11: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b1395]
frame #12: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x491bf8]
frame #13: __libc_start_main + 0xf5 (0x7fda89260555 in /lib64/libc.so.6)
frame #14: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x493047]

terminate called after throwing an instance of 'c10::Error'
what(): open file failed because of errno 2 on fopen: , file path: MACE_model_run-123_swa.model-lammps.pt
Exception raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:21 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fa3c2e0656e in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7fa3c2dd0f18 in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::string const&) + 0x124 (0x7fa3aaa55634 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #3: caffe2::serialize::FileAdapter::FileAdapter(std::string const&) + 0x2e (0x7fa3aaa5568e in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x5a (0x7fa3aaa53ada in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;, std::unordered_map&lt;std::string, std::string, std::hash&lt;std::string&gt;, std::equal_to&lt;std::string&gt;, std::allocator&lt;std::pair&lt;std::string const, std::string&gt; &gt; &gt;&) + 0x2a5 (0x7fa3abb01b85 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #6: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;) + 0x7b (0x7fa3abb0239b in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #7: torch::jit::load(std::string const&, c10::optional&lt;c10::Device&gt;) + 0xa5 (0x7fa3abb02475 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #8: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x96e54f]
frame #9: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4ab633]
frame #10: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b0ace]
frame #11: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b1395]
frame #12: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x491bf8]
frame #13: __libc_start_main + 0xf5 (0x7fa3a6191555 in /lib64/libc.so.6)

It's hard to say much without seeing your LAMMPS input. But my first guess would be to try atom_style atomic.
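For context, a minimal sketch of a LAMMPS input along those lines. The data file name and the element symbols on the pair_coeff line are placeholders, the model filename is the one from the log above, and the exact pair_style options depend on your MACE-patched LAMMPS build; note also that errno 2 in the traceback just means the .pt file was not found relative to the directory LAMMPS was run in.

units         metal
atom_style    atomic                    # as suggested above; MACE needs no bonds or charges
boundary      p p p
newton        on

read_data     system.data               # placeholder data file name

pair_style    mace
pair_coeff    * * MACE_model_run-123_swa.model-lammps.pt O H   # placeholder element symbols, mapped to atom types 1..N

timestep      0.0005
fix           1 all nvt temp 300.0 300.0 0.05
run           1000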

Sorry sir, it is working fine now.
I have retrained using the develop branch and it is working fine.

Now I am using the CPU version.

The GPU version is not installing. I don't know how to solve it, but I will try.
Can you comment, sir:
which versions of gcc, CUDA, and cuDNN should be used for the GPU installation?
I have tried with cuda-11.0.2, gcc-8.5, and cudnn-8.0-11.0, and it gives some undefined reference to GLIB_2.27.

I'm not certain, but if I had to guess I would say those versions are sufficient. Do you have a system administrator who can help with building code?

They have tried, but were not able to compile the GPU version.