What quantities to compute

Question

peastman opened this issue 3 years ago

What quantities do we want to compute and include in the dataset? Energies and forces are of course essential, but there are other things we could also include. A good principle is that if it's cheap to compute something, and if it might potentially be useful to someone, we might as well include it. Here is a list of quantities that Psi4 can compute: https://psicode.org/psi4manual/master/oeprop.html. Here are some to consider.

Partial charges. Psi4 can compute a few different types of partial charges. Would this be useful for people who want to train models to predict partial charges?
Dipoles. In the PhysNet paper, they predict molecular dipoles from their model during training and include them in the loss function. You could easily do the same thing with other models.
Electrostatic potential and/or field. I'm not sure if this would be useful to anyone, but it's a possibility.
Anything else?
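To make the dipole idea concrete, here is a minimal sketch of how a dipole term could enter a loss function. It assumes the model predicts per-atom partial charges and that the molecular dipole is approximated by the point-charge sum mu = sum_i q_i * r_i. The function names and the toy molecule are hypothetical and are not taken from PhysNet or Psi4.

```python
import numpy as np

def dipole_from_charges(charges, positions):
    """Point-charge dipole: mu = sum_i q_i * r_i (shape (3,))."""
    charges = np.asarray(charges, dtype=float)      # shape (n_atoms,)
    positions = np.asarray(positions, dtype=float)  # shape (n_atoms, 3)
    return charges @ positions

def dipole_loss(pred_charges, positions, ref_dipole):
    """Mean squared error between the point-charge dipole and a reference dipole."""
    diff = dipole_from_charges(pred_charges, positions) - np.asarray(ref_dipole, dtype=float)
    return float(np.mean(diff ** 2))

# Toy neutral diatomic along z (made-up numbers, arbitrary units).
charges = np.array([0.4, -0.4])
positions = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])

mu = dipole_from_charges(charges, positions)
loss_at_reference = dipole_loss(charges, positions, mu)
```

In training, a term like this would be added to the energy and force losses with its own weight, which is why reference dipoles in the dataset would be useful.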

Raimondas Galvelis · Answer 1 · Wed Sep 22 2021 22:11:31 GMT+0800 (China Standard Time)

We should save the converged wavefunction.

If we have the wavefunction, we can relatively cheaply compute any additional electronic properties.
If we decide to recompute the dataset with a higher-accuracy method, the current wavefunction could be used as an initial guess to reduce the computational cost of the higher-accuracy method.

In the past there were problems saving the wavefunction with Psi4, but hopefully they are fixed in the latest release.

Gianni De Fabritiis · Answer 2 · Wed Sep 22 2021 23:33:39 GMT+0800 (China Standard Time)

Saving the wavefunction seems like a good idea, but is it doable in terms of storage?

Raimondas Galvelis · Answer 3 · Thu Sep 23 2021 00:03:37 GMT+0800 (China Standard Time)

Computed benzene with wB97X-D/def2-TZVPPD:

import psi4

psi4.set_memory('32 GB')

benzene = psi4.geometry("""
  H      1.2194     -0.1652      2.1600
  C      0.6825     -0.0924      1.2087
  C     -0.7075     -0.0352      1.1973
  H     -1.2644     -0.0630      2.1393
  C     -1.3898      0.0572     -0.0114
  H     -2.4836      0.1021     -0.0204
  C     -0.6824      0.0925     -1.2088
  H     -1.2194      0.1652     -2.1599
  C      0.7075      0.0352     -1.1973
  H      1.2641      0.0628     -2.1395
  C      1.3899     -0.0572      0.0114
  H      2.4836     -0.1022      0.0205
""")

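# Single-point energy; return_wfn=True also returns the converged wavefunction object
energy, wfn = psi4.energy('wB97X-D/def2-TZVPPD', molecule=benzene, return_wfn=True)

# Serialize the wavefunction to disk so its size can be checked
wfn.to_file('benzene')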

The wavefunction size is 8.1 MB.

Peter Eastman · Answer 4 · Thu Sep 23 2021 00:41:15 GMT+0800 (China Standard Time)

I don't know for sure what QCArchive can handle, but I suspect that won't be practical. For a molecule that size, the coordinates and forces together take 288 bytes. Adding in a few other values and some metadata brings it up to around 1 KB. Storing the wavefunction increases the storage requirements by 3-4 orders of magnitude!
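The numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the 4-byte float size is an assumption chosen because it reproduces the 288-byte figure, and the 1 KB record size is the rough estimate quoted above.

```python
import math

n_atoms = 12           # benzene: 6 C + 6 H
bytes_per_float = 4    # assumption: single precision
# Coordinates and forces: two arrays of shape (n_atoms, 3)
coords_and_forces_bytes = n_atoms * 3 * 2 * bytes_per_float

record_bytes = 1_000          # "around 1 KB" per conformation, from the estimate above
wavefunction_bytes = 8.1e6    # 8.1 MB reported for the benzene wavefunction file

ratio = wavefunction_bytes / record_bytes
orders_of_magnitude = math.log10(ratio)

print(coords_and_forces_bytes, "bytes for coordinates and forces")
print(f"wavefunction is ~{ratio:.0f}x larger (~{orders_of_magnitude:.1f} orders of magnitude)")
```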

John Chodera · Answer 5 · Thu Sep 23 2021 03:30:44 GMT+0800 (China Standard Time)

@jthorton and @pavankum will have to chime in with which properties are supported by QCEngine/QCFractal/QCArchive and can reasonably be captured.

Pavan Behara · Answer 6 · Thu Sep 23 2021 04:45:10 GMT+0800 (China Standard Time)

Instead of the full wavefunction, we can save the orbital coefficients and eigenvalues, which are sufficient for most properties and for reconstructing the wavefunction. A "crude" example of restarting from orbital coefficients:

import psi4
import numpy as np

psi4.set_memory('32 GB')

benzene = psi4.geometry("""
  H      1.2194     -0.1652      2.1600
  C      0.6825     -0.0924      1.2087
  C     -0.7075     -0.0352      1.1973
  H     -1.2644     -0.0630      2.1393
  C     -1.3898      0.0572     -0.0114
  H     -2.4836      0.1021     -0.0204
  C     -0.6824      0.0925     -1.2088
  H     -1.2194      0.1652     -2.1599
  C      0.7075      0.0352     -1.1973
  H      1.2641      0.0628     -2.1395
  C      1.3899     -0.0572      0.0114
  H      2.4836     -0.1022      0.0205
""")

energy, wfn = psi4.energy('wB97X-D/def2-TZVPPD', molecule=benzene, return_wfn=True)

alpha_orb_coeffs = wfn.Ca().np
eigen_vals = wfn.epsilon_a().np
nalpha = wfn.nalpha()

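print("a and b densities same: ", wfn.same_a_b_dens())
print("a and b orbs same: ", wfn.same_a_b_orbs())

# Alpha density rebuilt from the occupied orbitals: D = C_occ C_occ^T
Density = np.dot(alpha_orb_coeffs[:, :nalpha], alpha_orb_coeffs[:, :nalpha].T)
# Compare with a tolerance rather than exact floating-point equality
print(np.allclose(Density, wfn.Da().np))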

# Changing orbitals to orbitals read from file (here, stored in variables)
psi4.core.clean()

new_scf, new_wfn = psi4.energy('hf/def2-tzvppd', molecule=benzene, return_wfn=True)
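# HF and DFT orbitals are expected to differ before they are overwritten
print(np.allclose(new_wfn.Ca().np, wfn.Ca().np))

# restricted reference: alpha and beta orbitals are identical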
new_wfn.Ca().np[:] = alpha_orb_coeffs
new_wfn.epsilon_a().np[:] = eigen_vals

new_wfn.Cb().np[:] = alpha_orb_coeffs
new_wfn.epsilon_b().np[:] = eigen_vals

# write to the scratch file that Psi4 reads when the SCF guess option is set to READ
my_file=new_wfn.get_scratch_filename(180) + '.npy'
new_wfn.to_file(my_file)

psi4.set_options({'guess': 'read'})
energy = psi4.energy('wb97x-d/def2-TZVPPD', molecule=benzene)

Maybe @jthorton has a more polished way to construct a new wfn object instead of overwriting the orbital coefficients from another energy calculation. In any case, those orbitals and eigenvalues would be on the order of tens of kilobytes.

Some properties we would be interested in are Wiberg/Mayer bond indices, and dipole and quadrupole moments (already listed above). ESPs can be built from the orbital coefficients after we reconstruct the wavefunction.
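Independent of Psi4, the storage side of this idea can be sketched with NumPy alone: save the coefficient matrix and eigenvalues, load them back, and rebuild the closed-shell density from the occupied block. Everything here is a stand-in: the sizes are hypothetical (not the benzene values), the coefficients are random orthonormal columns, and an identity AO overlap is assumed so that the idempotency check is simply D @ D == D.

```python
import io
import numpy as np

n_basis, n_occ = 50, 8   # hypothetical sizes, not taken from the benzene calculation

rng = np.random.default_rng(0)
# Stand-in for orbital coefficients: orthonormal columns (identity overlap assumed)
C, _ = np.linalg.qr(rng.standard_normal((n_basis, n_basis)))
eps = np.sort(rng.standard_normal(n_basis))   # stand-in orbital eigenvalues

# Store only the coefficients and eigenvalues in a compressed archive
buffer = io.BytesIO()
np.savez_compressed(buffer, C=C, eps=eps)
stored_bytes = len(buffer.getvalue())
raw_bytes = C.nbytes + eps.nbytes

# Load them back and rebuild the density: D = C_occ C_occ^T
buffer.seek(0)
loaded = np.load(buffer)
C_occ = loaded["C"][:, :n_occ]
D = C_occ @ C_occ.T

print(f"raw: {raw_bytes} bytes, compressed archive: {stored_bytes} bytes")
```

The raw size grows as the square of the number of basis functions, so the real figure for a given molecule and basis set should be measured rather than inferred from this toy example.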

Gianni De Fabritiis · Answer 7 · Fri Sep 24 2021 00:40:25 GMT+0800 (China Standard Time)

This seems like a good compromise.

Peter Eastman · Answer 8 · Fri Sep 24 2021 06:29:28 GMT+0800 (China Standard Time)

Saving the coefficients isn't a substitute for also computing and storing useful quantities. Even if it only took 1 second to recompute them for each conformation, it would still take weeks for the entire dataset. How about including the following?

DIPOLE
QUADRUPOLE
WIBERG_LOWDIN_INDICES
MAYER_INDICES
MBIS_CHARGES
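To put the recomputation estimate above in numbers, here is a small sketch. The conformation count is a hypothetical placeholder, since the thread does not state the dataset size, and the helper name and the worker parameter are illustrative only.

```python
SECONDS_PER_DAY = 86_400

def recompute_days(n_conformations, seconds_per_conformation=1.0, n_workers=1):
    """Wall-clock days needed to recompute properties for every conformation."""
    total_seconds = n_conformations * seconds_per_conformation
    return total_seconds / (n_workers * SECONDS_PER_DAY)

n_conformations = 1_000_000   # hypothetical dataset size, not a figure from this thread
days = recompute_days(n_conformations)

print(f"{days:.1f} days on one worker at 1 s per conformation")
```

Storing the listed quantities at computation time avoids paying this cost again for every downstream user.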

Peter Eastman · Answer 9 · Fri Sep 24 2021 08:42:05 GMT+0800 (China Standard Time)

Psi4 also supports Distributed Multipole Analysis, which is another way of computing atomic charges and multipoles. I don't know how it compares to MBIS.

Peter Eastman · Answer 10 · Wed Jul 13 2022 03:53:53 GMT+0800 (China Standard Time)

Closing since version 1 is now released.