Tony-Y / oqmd-v1.2-dataset-for-cgnn

Open In Colab

OQMD v1.2 Dataset for CGNN

This dataset, which contains 561,888 materials, can be downloaded from this link. Its format is described here. The original data is available at the OQMD website.

Click on the above Colab link to create a Colab notebook for a data loading tutorial.

How this dataset was created is described here.

Abnormal Entry

There is an obviously abnormal entry in this dataset: its formula is the single element Mg on one site, yet nelements is 2.

index   name         formula  spacegroup  nelements  nsites
277145  oqmd-753381  Mg       10          2          1
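This kind of inconsistency can be checked programmatically. The following is a minimal sketch, assuming the formula field is a plain string of standard element symbols (as in "Mg"). The helper names `count_elements` and `is_consistent` are hypothetical and not part of the dataset tooling, and the LaSiHO4 call only illustrates the counting.

```python
import re

# One element symbol: an uppercase letter optionally followed by a lowercase one.
ELEMENT_RE = re.compile(r"[A-Z][a-z]?")


def count_elements(formula):
    """Return the number of distinct element symbols in a formula string."""
    return len(set(ELEMENT_RE.findall(formula)))


def is_consistent(formula, nelements):
    """Check that the formula contains exactly `nelements` distinct elements."""
    return count_elements(formula) == nelements


# The abnormal entry: formula "Mg" but nelements = 2.
print(is_consistent("Mg", 2))       # False
# A formula with four distinct elements (La, Si, H, O).
print(is_consistent("LaSiHO4", 4))  # True
```

Applying such a check to every row would flag entries whose formula and nelements disagree, without relying on manual inspection.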

This problem originates in the OQMD. You can see its calculation result at this link in the online database (which was based on OQMD v1.5 as of April 19, 2022). You can remove this entry by modifying the split file as follows:

import json

# Load the split file, drop the abnormal entry from the training set, and save it back.
with open('split.json') as f:
    split = json.load(f)
split['train'].remove(277145)
with open('split.json', 'w') as f:
    json.dump(split, f)
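If you use a custom split in which the entry is not in the training set, `list.remove` raises a `ValueError`. The following is a minimal sketch of a more tolerant variant, assuming the split file is a JSON object that maps split names to lists of integer indices. The function names and the "test" key in the toy example are hypothetical.

```python
import json


def remove_index(split, index):
    """Return a copy of `split` with `index` dropped from every list it appears in."""
    return {key: [i for i in indices if i != index] for key, indices in split.items()}


def remove_index_from_file(path, index):
    """Rewrite the split file at `path` without `index`."""
    with open(path) as f:
        split = json.load(f)
    with open(path, "w") as f:
        json.dump(remove_index(split, index), f)


# Toy example; the real file would be processed with
# remove_index_from_file("split.json", 277145).
toy = {"train": [1, 277145, 3], "test": [4]}
print(remove_index(toy, 277145))  # {'train': [1, 3], 'test': [4]}
```

Filtering each list instead of calling `remove` means the operation succeeds whether or not the index is present.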

Errata of Space Group

As of December 2022, there are 15 space group corrections, which are listed at this link. These incorrect determinations were uncovered by updating Spglib (https://spglib.github.io/spglib/).

Gallery

Jadeite, Benitoite, Taaffeite, and Lithium sodium carbonate

Oxides of Carbon Group Elements

LaSiHO4

About

License:Other

