espresso example: replace MDAnalysis

junghans opened this issue 5 years ago · comments

Christoph Junghans commented 5 years ago

MDAnalysis got dropped from Fedora, so replacing it with something else would be nice.

This is the only line where MDAnalysis is used:
https://github.com/votca/csg-tutorials/blob/master/spce/ibi_espresso/spce.py#L42

Related to votca/csg#400 and #34.

@fweik @jngrad @RudolfWeeber any ideas? Can espresso read any coordinate file formats natively?

Christoph Junghans commented 5 years ago

Usually.

Jens commented 5 years ago

#51

RudolfWeeber · Answer 1 · Thu Sep 05 2019 04:36:54 GMT+0800 (China Standard Time)

If it is only about positions and masses, then an ASCII table would do, read with something like data = np.loadtxt(…) followed by system.part.add(pos=data[:, 0:3], mass=data[:, 3])

Christoph Junghans · Answer 2 · Thu Sep 05 2019 04:43:18 GMT+0800 (China Standard Time)

@RudolfWeeber an ASCII table would work, but VOTCA doesn't support that format, so we would need to duplicate the input file.

Looking at https://github.com/espressomd/espresso/blob/python/src/core/io/reader/readpdb.cpp, can espresso read pdb files?

Florian Weik · Answer 3 · Thu Sep 05 2019 05:25:06 GMT+0800 (China Standard Time)

@junghans the pdb code is not exposed to Python; I actually thought it had been removed. In a perfect world all packages would support h5md, which I'm sure we all want to establish as the new standard for MD data :-)

Christoph Junghans · Answer 4 · Thu Sep 05 2019 05:37:48 GMT+0800 (China Standard Time)

We support (and actually use) h5md for trajectories, but not topologies. Not sure how hard that would be to add. @jkrajniak might know.

Actually xyz (see https://openbabel.org/wiki/XYZ_(format)) is just an ascii table, so maybe we should use that.

Florian Weik · Answer 5 · Thu Sep 05 2019 06:42:33 GMT+0800 (China Standard Time)

Yeah, Espresso can't read h5md. XYZ can probably be read by numpy, but mass support would be needed iirc.
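
For illustration, here is a minimal sketch of the XYZ route using only the Python standard library. Since XYZ files carry no masses, the sketch takes them from a lookup table keyed by the name in the first column. The function name read_xyz, the MASSES table, and the sample coordinates are all hypothetical, not taken from the tutorial.

```python
# Hypothetical lookup table: XYZ files store names and coordinates only,
# so masses (in u) have to come from somewhere else.
MASSES = {"O": 15.999, "H": 1.008, "CG": 18.0}


def read_xyz(text):
    """Parse XYZ-formatted text into (names, positions, masses)."""
    lines = text.splitlines()
    n_atoms = int(lines[0])      # line 1: number of atoms
    names, positions = [], []    # line 2 is a free-form comment, skipped
    for line in lines[2:2 + n_atoms]:
        name, x, y, z = line.split()[:4]
        names.append(name)
        positions.append((float(x), float(y), float(z)))
    masses = [MASSES[name] for name in names]
    return names, positions, masses


# Made-up sample data: two coarse-grained water beads.
sample = """2
coarse-grained water, one bead per molecule
CG 1.26 16.24 16.79
CG 15.00 2.50 20.00
"""
names, positions, masses = read_xyz(sample)
```

The resulting lists could then be passed on in the way suggested in Answer 1, e.g. system.part.add(pos=positions, mass=masses).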

Jens · Answer 6 · Thu Sep 05 2019 15:30:50 GMT+0800 (China Standard Time)

Writing a small parser in Python for h5md positions and masses would also not be that difficult.

Christoph Junghans · Answer 7 · Thu Sep 05 2019 23:52:31 GMT+0800 (China Standard Time)

@JensWehner are you volunteering?

Jens · Answer 8 · Fri Sep 06 2019 00:31:45 GMT+0800 (China Standard Time)

If you make the h5md file, I'll write the parser for it.

Jakub Krajniak · Answer 9 · Fri Sep 06 2019 00:44:28 GMT+0800 (China Standard Time)

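Reading h5md (it's nothing more than an HDF5 file) with the h5py library is very easy; in the end you will work with numpy arrays for positions and masses:

import sys

import h5py

# Name of the particle group inside the file ('atoms' in this example).
particle_group = 'atoms'

with h5py.File(sys.argv[1], 'r') as h5:
    # H5MD names this element 'position'; time-dependent data has shape (frames, particles, dims).
    # [()] reads the whole dataset into a numpy array (the older .value attribute was removed in h5py 3).
    pos = h5['/particles/{}/position/value'.format(particle_group)][()]

So the only attribute you need is in fact the particle_group, which is either hard-coded in espresso or provided in the script.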

Christoph Junghans · Answer 10 · Fri Sep 06 2019 01:12:19 GMT+0800 (China Standard Time)

@jkrajniak we need a topology reader on the VOTCA side, too.

Jakub Krajniak · Answer 11 · Fri Sep 06 2019 01:30:45 GMT+0800 (China Standard Time)

Do you mean the connectivity (https://nongnu.org/h5md/h5md.html#connectivity-group) ?

Christoph Junghans · Answer 12 · Fri Sep 06 2019 01:31:49 GMT+0800 (China Standard Time)

I mean that VOTCA can currently use h5md files only as trajectories, not as topology files.

Jens · Answer 13 · Sat Sep 07 2019 14:45:11 GMT+0800 (China Standard Time)

Do we need that for this script? I thought we just need the coordinates. I am confused.

Christoph Junghans · Answer 14 · Sat Sep 07 2019 22:28:09 GMT+0800 (China Standard Time)

You're right, we don't really need the coordinates to set up the topology, so the easiest option might be to turn spce.gro into a table and make topol.xml a standalone topology.

Jens · Answer 15 · Sun Sep 08 2019 21:26:57 GMT+0800 (China Standard Time)

Can someone summarize the problem, and maybe the solution, once more? I am still lost.

Christoph Junghans · Answer 16 · Mon Sep 09 2019 04:39:39 GMT+0800 (China Standard Time)

@JensWehner in short: the espresso example uses MDAnalysis, which isn't part of Fedora, so we need a different mechanism to read in the initial coordinates. One solution would be to use an h5md file instead of the currently used gro file; however, VOTCA can only use h5md files as trajectories, not as topologies. Currently both Espresso and VOTCA use a gro file for the tutorial. So another solution would be to use an ASCII table as the initial condition for espresso and an xml topology for VOTCA, which obviously would duplicate some information.

Jens · Answer 17 · Mon Sep 09 2019 04:43:11 GMT+0800 (China Standard Time)

So the alternatives are that we write a small Python .gro reader or implement an h5md topology reader?

Christoph Junghans · Answer 18 · Mon Sep 09 2019 08:05:49 GMT+0800 (China Standard Time)

Yes to both.

Christoph Junghans · Answer 19 · Mon Sep 09 2019 08:06:58 GMT+0800 (China Standard Time)

Implementing a gro reader is slightly preferred, as we don't need to add a binary h5md file to the repo in that case.

Jens · Answer 20 · Mon Sep 09 2019 18:31:09 GMT+0800 (China Standard Time)

but the h5md top reader would be the more rigorous solution.

Christoph Junghans · Answer 21 · Mon Sep 09 2019 22:38:54 GMT+0800 (China Standard Time)

"but the h5md top reader would be the more rigorous solution."

Not sure about rigorous, but would be good to have anyway.

Jens · Answer 22 · Tue Sep 10 2019 23:08:43 GMT+0800 (China Standard Time)

I do not understand the test script. The masses that are used are Carbon masses but the .gro file says CG? Is that intentional?

Jens · Answer 23 · Tue Sep 10 2019 23:14:37 GMT+0800 (China Standard Time)

How does espresso know the masses of the particles?

Christoph Junghans · Answer 24 · Wed Sep 11 2019 00:33:01 GMT+0800 (China Standard Time)

The masses should be those of water, so 18 u.

But for static properties, like the rdfs, masses don’t matter.

Jens · Answer 25 · Wed Sep 11 2019 01:04:05 GMT+0800 (China Standard Time)

@junghans okay, how do you normally get the masses into espresso? Using a .gro file does not work for that. I have no knowledge about these methods, but I will finish the small gro reader this week.

Christoph Junghans · Answer 26 · Wed Sep 11 2019 01:06:14 GMT+0800 (China Standard Time)

I just set the masses manually!

Jens · Answer 27 · Wed Sep 11 2019 01:06:44 GMT+0800 (China Standard Time)

In the espresso script?

Jens · Answer 28 · Wed Sep 11 2019 01:08:18 GMT+0800 (China Standard Time)

Ahh, okay. So I can do that too, in spce.py? That makes the whole parser a lot easier.

Christoph Junghans · Answer 29 · Wed Sep 11 2019 03:14:09 GMT+0800 (China Standard Time)

Yes please do the simplest parser!
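
For reference, a minimal sketch of what such a "simplest parser" could look like, assuming only positions and the box are needed and the masses are set by hand, as discussed above. It relies on the fixed-width GRO layout (x, y, z in three 8-character columns starting at character 20, in nm). The function name read_gro and the sample data are hypothetical; this is not the parser that ended up in the tutorial.

```python
def read_gro(lines):
    """Parse the lines of a GRO file; return (positions, box), both in nm."""
    n_atoms = int(lines[1])  # line 1 is the title, line 2 the atom count
    positions = []
    for line in lines[2:2 + n_atoms]:
        # Columns 20-27, 28-35 and 36-43 hold x, y and z.
        positions.append(
            tuple(float(line[20 + 8 * i:28 + 8 * i]) for i in range(3))
        )
    # The line after the atoms holds the box vectors; the first three
    # values are the box lengths.
    box = tuple(float(v) for v in lines[2 + n_atoms].split()[:3])
    return positions, box


# Made-up sample, built with the GRO column format so the widths are exact.
atom_fmt = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f"
sample = [
    "CG water",
    "%5d" % 2,
    atom_fmt % (1, "SOL", "CG", 1, 0.126, 1.624, 1.679),
    atom_fmt % (2, "SOL", "CG", 2, 1.500, 0.250, 2.000),
    "%10.5f%10.5f%10.5f" % (3.0, 3.0, 3.0),
]
positions, box = read_gro(sample)

# Masses are not stored in GRO files, so they are set manually (water, 18 u).
masses = [18.0] * len(positions)
```

For a real file the lines would come from open("spce.gro").read().splitlines(), and the lists could be handed to espresso as in Answer 1, e.g. system.part.add(pos=positions, mass=masses).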