Matbench Discovery

TL;DR: We benchmark ML models on crystal stability prediction from unrelaxed structures and find universal interatomic potentials (UIPs) like MACE, CHGNet and M3GNet to be highly accurate, robust across chemistries and ready for production use in high-throughput materials discovery.

Matbench Discovery is an interactive leaderboard and associated PyPI package that together make it easy to rank ML energy models on a task designed to simulate a high-throughput discovery campaign for new stable inorganic crystals.

We've tested models covering multiple methodologies, ranging from random forests with structure fingerprints to graph neural networks, and from one-shot predictors to iterative Bayesian optimizers and interatomic potential relaxers.

Our results show that ML models have become robust enough to deploy as triaging steps that allocate compute more effectively in high-throughput DFT relaxations. These findings are relevant to anyone looking to build large-scale materials databases.
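As a rough illustration of what such a triaging step could look like, the sketch below filters candidates by a model's predicted energy above the convex hull and then scores that selection against reference DFT values. It is a minimal sketch under stated assumptions, not the package's API: the `Candidate` class, the `triage` and `precision_recall` helpers, the 0 eV/atom threshold and all numbers are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate crystal (not a matbench-discovery type)."""

    material_id: str
    e_above_hull_pred: float  # ML-predicted energy above the convex hull, eV/atom
    e_above_hull_dft: float  # reference DFT energy above the convex hull, eV/atom


def triage(candidates, threshold=0.0):
    """Keep candidates the model predicts to be stable (at or below the threshold)."""
    return [c for c in candidates if c.e_above_hull_pred <= threshold]


def precision_recall(candidates, threshold=0.0):
    """Score the triage selection against the DFT reference labels."""
    selected = triage(candidates, threshold)
    truly_stable = [c for c in candidates if c.e_above_hull_dft <= threshold]
    true_positives = [c for c in selected if c.e_above_hull_dft <= threshold]
    precision = len(true_positives) / len(selected) if selected else 0.0
    recall = len(true_positives) / len(truly_stable) if truly_stable else 0.0
    return precision, recall


# Toy data, invented for illustration only (assumed values, not benchmark results).
candidates = [
    Candidate("mat-1", -0.02, -0.01),  # predicted stable, actually stable
    Candidate("mat-2", 0.05, 0.10),  # predicted unstable, actually unstable
    Candidate("mat-3", -0.01, 0.03),  # predicted stable, actually unstable
    Candidate("mat-4", 0.02, -0.03),  # predicted unstable, actually stable
]

selected_ids = [c.material_id for c in triage(candidates)]
precision, recall = precision_recall(candidates)
print(selected_ids, precision, recall)
```

In a real campaign only the selected candidates would be passed on to DFT relaxation, so precision reflects how much compute is spent on truly stable materials and recall reflects how many stable materials survive the filter.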

We welcome contributions that add new models to the leaderboard through GitHub PRs. See the contributing guide for details.

If you're interested in joining this work, please reach out via GitHub discussion or email.

For detailed results and analysis, check out the preprint.

About

An evaluation framework for machine learning models simulating high-throughput materials discovery.

https://matbench-discovery.materialsproject.org

MIT License

Languages

Python 86.2%, Svelte 10.0%, CSS 1.4%, TypeScript 1.0%, JavaScript 0.8%, HTML 0.4%, Shell 0.2%