Inconsistencies in Dataset Counts Across Different Attributes
HarshaSatyavardhan opened this issue
specialized boy commented
I have downloaded the dataset, and the number of data points keeps changing:
- by conductivity - 377223
- by id - 384939
- by reduced_formula - 377184
Why is there such a large difference in the number of CIFs across these folders?
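For anyone wanting to reproduce counts like these, here is a minimal sketch that tallies `.cif` files per top-level folder. The helper name `count_cifs` and the assumption that each grouping is a subfolder of one root directory containing `.cif` files are illustrative, not taken from the dataset's documentation.

```python
from pathlib import Path


def count_cifs(root):
    """Return {subfolder name: number of .cif files found beneath it}.

    Assumes (hypothetically) that each grouping is a direct subfolder
    of `root` and that structures are stored as files ending in .cif.
    """
    root = Path(root)
    return {
        sub.name: sum(1 for _ in sub.rglob("*.cif"))
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }
```

Calling `count_cifs("path/to/extracted/dataset")` returns a dictionary of per-folder counts that can be compared directly.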
Matthew Evans commented
I understand they are screening some erroneous structures from the initial dataset. I archived the initially published version as an OPTIMADE API at https://optimade-gnome.odbx.science/v1/structures.
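As a rough sketch, the number of structures in the archived version could be checked by requesting a single entry and reading the `data_returned` field from the response `meta`. The OPTIMADE specification defines `page_limit`, `response_fields`, and `meta.data_returned`, but whether this particular server populates `data_returned` is an assumption, and the helper names below are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://optimade-gnome.odbx.science/v1/structures"


def build_count_url(base_url=BASE_URL, filter_expr=None):
    """Build a query URL that asks for one entry with only its id."""
    params = {"page_limit": 1, "response_fields": "id"}
    if filter_expr:
        params["filter"] = filter_expr
    return f"{base_url}?{urlencode(params)}"


def extract_count(response):
    """Read meta.data_returned from a parsed response, or None if absent."""
    return response.get("meta", {}).get("data_returned")


def fetch_count(base_url=BASE_URL, filter_expr=None):
    """Query the API over the network and return the reported match count."""
    with urlopen(build_count_url(base_url, filter_expr), timeout=30) as resp:
        return extract_count(json.load(resp))
```

`fetch_count()` needs network access; if it returns `None`, the server did not report a total and the results would have to be paged through instead.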