SINGROUP / dscribe

DScribe is a python package for creating machine learning descriptors for atomistic systems.

Home Page: https://singroup.github.io/dscribe/

Physical meaning of SOAP features?

lchaoyi opened this issue

Hi, I was wondering whether there is any way to explain the SOAP features in more real-world terms. The partial power spectrum vector p(r) is so abstract that it is very hard to connect it to a physical picture. I noticed that there is a function get_location() which makes it easy to extract the part of the vector that belongs to a particular species combination, and I think these terms can represent the interaction between those species. Moreover, for a given feature p^{Z Z'}_{n n' l} with known {Z Z' n n' l}, is there any way to extract more information about the local environment from this specific term, beyond the species interaction?

This question comes from a common machine learning scenario. Taking the SOAP feature vector as input, one can easily obtain weights for these features from various machine learning models. I was wondering whether I can explain these weights with a more specific structural picture, rather than only in terms of the spectrum vector entries themselves.

Thank you for your time!

Hi @lchaoyi!

Good question, for which I do not have a good answer.

You can get some idea of the spherical components by looking at visualizations of spherical harmonics, but at least for me, this is not very useful. Things are further complicated by the fact that the radial basis can itself be quite complicated. Orthonormalized GTOs especially are quite broad and multi-peaked: they are computationally very fast, but hard for humans to decipher. On top of this, you have to consider that the expansion coefficients are squared and summed over the spherical harmonics m-components, which makes the features rotationally invariant, but even harder to understand.
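To make the last point concrete, here is a minimal sketch of why squaring and summing over the m-components gives rotational invariance. It is not DScribe's implementation: it uses only the l = 1 real spherical harmonics, drops the radial basis and the normalization prefactor that the actual SOAP power spectrum includes, and the helper names (`rotate`, `l1_coefficients`, `power`) and neighbour positions are made up for illustration.

```python
import math

def rotate(point, alpha, beta):
    """Rotate a point by alpha about the z-axis, then by beta about the x-axis."""
    x, y, z = point
    x, y = (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))
    y, z = (y * math.cos(beta) - z * math.sin(beta),
            y * math.sin(beta) + z * math.cos(beta))
    return (x, y, z)

def l1_coefficients(points):
    """Toy expansion coefficients c_{1m}: the l = 1 real spherical harmonics
    summed over all neighbour directions (radial part deliberately omitted)."""
    norm = math.sqrt(3.0 / (4.0 * math.pi))
    cx = cy = cz = 0.0
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        cx += norm * x / r
        cy += norm * y / r
        cz += norm * z / r
    # Real spherical harmonics for l = 1: m = -1 ~ y, m = 0 ~ z, m = +1 ~ x
    return (cy, cz, cx)

def power(coefficients):
    """Square the coefficients and sum over m."""
    return sum(c * c for c in coefficients)

# Hypothetical neighbour positions around a centre at the origin
neighbours = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.5), (-0.7, 0.2, 1.1)]
rotated = [rotate(p, 0.8, -1.3) for p in neighbours]

c = l1_coefficients(neighbours)
c_rot = l1_coefficients(rotated)

print("coefficients before rotation:", c)
print("coefficients after rotation: ", c_rot)
print("power before:", power(c))
print("power after: ", power(c_rot))
```

The individual coefficients change when the environment is rotated, while their squared sum stays the same. That sum no longer records which direction the neighbours lie in, which is part of why a single power spectrum entry is hard to map back to a geometric picture.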

If you require good human interpretability, I would recommend looking at LMBTR or MBTR: their features are very easy to understand and visualize. SOAP is harder for humans to understand, but machine learning algorithms are very happy with it :)

PS: In the future, I think the new Discussions feature is more suitable for this kind of topic: I would like to keep the issues for matters related to working with the software itself.

Thank you for your explanation and suggestions!