Tried Atom Joggling as a data augmentation method with Rhys's CGCNN implementation on 6 of the 9 MatBench datasets that take structures as input:
- matbench_dielectric
- matbench_jdft2d
- matbench_log_gvrh
- matbench_log_kvrh
- matbench_perovskites
- matbench_phonons
Joggling worked best when the train and test sets were split by spacegroup. It showed no performance difference at all on random splits.
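A spacegroup split can be sketched as follows. This is a minimal illustration, not the code used for these runs: the function name `split_by_spacegroup` and the `(spacegroup_number, sample)` pair layout are assumptions made for the example. The default test range 200-231 is the one quoted for the Perovskites runs listed further down.

```python
def split_by_spacegroup(samples, test_range=(200, 231)):
    """Split samples into train and test lists by spacegroup number.

    samples: iterable of (spacegroup_number, sample) pairs (assumed layout).
    test_range: inclusive (low, high) spacegroup numbers held out for testing.
    """
    low, high = test_range
    train, test = [], []
    for spacegroup, sample in samples:
        # Everything inside the held-out range goes to the test set,
        # so the model never sees those symmetries during training.
        if low <= spacegroup <= high:
            test.append(sample)
        else:
            train.append(sample)
    return train, test


# Toy data: labels are placeholders, only the spacegroup numbers matter here.
toy = [(225, "sample_a"), (62, "sample_b"), (221, "sample_c"), (194, "sample_d")]
train, test = split_by_spacegroup(toy)
```

Unlike a random split, this forces the test set to contain only symmetries absent from training, which is the setting in which joggling made a difference.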
With spacegroup splits, the test set performance improvement was most pronounced on the matbench_phonons dataset, very subtle on the matbench_log_gvrh, matbench_log_kvrh, matbench_jdft2d and matbench_perovskites datasets, and non-existent on matbench_dielectric. See tensorboard_screenshots and the following uploads of TensorBoard logs for results.
- Fine-tuning the XXL CGCNN for 200 epochs with learning rate 3e-5 after initial training for 600 epochs with learning rate 3e-4 on MatBench Perovskites: joggling amplitudes = [0, 0.01, 0.02, 0.03] Å, joggling rate = 0.3, test set = spacegroups 200-231
- Different-sized CGCNN models tested on MatBench Perovskites: joggling amplitudes = [0, 0.01, 0.02, 0.03] Å, joggling rate = 0.3, test set = spacegroups 200-231
- MatBench Phonons dataset (last phonon DOS peak)
- Shear Modulus showing less of a benefit with joggling than MatBench Phonons does
- MatBench JDFT (exfoliation energy)
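The joggling parameters quoted in the runs above could be applied roughly like this. It is a sketch under stated assumptions, not the implementation behind these results: here the joggling rate is taken to be the per-atom probability of being displaced and the amplitude the displacement length in Å, neither of which these notes define. The function name `joggle` and the plain list-of-tuples coordinate format are also illustrative choices.

```python
import math
import random


def joggle(coords, amplitude=0.02, rate=0.3, rng=None):
    """Return a copy of coords with a random subset of atoms displaced.

    coords: list of (x, y, z) Cartesian positions in Angstrom.
    amplitude: displacement length in Angstrom (assumed meaning).
    rate: per-atom probability of being displaced (assumed meaning).
    rng: optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    joggled = []
    for x, y, z in coords:
        if amplitude > 0 and rng.random() < rate:
            # Draw an isotropic random direction by normalising a Gaussian vector.
            dx, dy, dz = (rng.gauss(0.0, 1.0) for _ in range(3))
            norm = math.sqrt(dx * dx + dy * dy + dz * dz) or 1.0
            scale = amplitude / norm
            x, y, z = x + dx * scale, y + dy * scale, z + dz * scale
        joggled.append((x, y, z))
    return joggled


# Toy two-atom structure; a new perturbed copy would be drawn each epoch.
coords = [(0.0, 0.0, 0.0), (1.9, 1.9, 1.9)]
augmented = joggle(coords, amplitude=0.02, rate=0.3, rng=random.Random(0))
```

An amplitude of 0 leaves the structure unchanged, which corresponds to the unaugmented baseline in the amplitude lists above.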

task name | target column (unit) | sample count | task type | input | links |
---|---|---|---|---|---|
matbench_dielectric | n (unitless) | 4764 | regression | structure | download, interactive |
matbench_expt_gap | gap expt (eV) | 4604 | regression | composition | download, interactive |
matbench_expt_is_metal | is_metal (unitless) | 4921 | classification | composition | download, interactive |
matbench_glass | gfa (unitless) | 5680 | classification | composition | download, interactive |
matbench_jdft2d | exfoliation_en (meV/atom) | 636 | regression | structure | download, interactive |
matbench_log_gvrh | log10(G_VRH) (log(GPa)) | 10987 | regression | structure | download, interactive |
matbench_log_kvrh | log10(K_VRH) (log(GPa)) | 10987 | regression | structure | download, interactive |
matbench_mp_e_form | e_form (eV/atom) | 132752 | regression | structure | download, interactive |
matbench_mp_gap | gap pbe (eV) | 106113 | regression | structure | download, interactive |
matbench_mp_is_metal | is_metal (unitless) | 106113 | classification | structure | download, interactive |
matbench_perovskites | e_form (eV, per unit cell) | 18928 | regression | structure | download, interactive |
matbench_phonons | last phdos peak (1/cm) | 1265 | regression | structure | download, interactive |
matbench_steels | yield strength (MPa) | 312 | regression | composition | download, interactive |

task name | verified top score (MAE or ROCAUC) | algorithm name, config | general purpose algorithm? |
---|---|---|---|
matbench_dielectric | 0.299 (unitless) | Automatminer express v1.0.3.2019111 | yes |
matbench_expt_gap | 0.416 eV | Automatminer express v1.0.3.2019111 | yes |
matbench_expt_is_metal | 0.92 | Automatminer express v1.0.3.2019111 | yes |
matbench_glass | 0.861 | Automatminer express v1.0.3.2019111 | yes |
matbench_jdft2d | 38.6 meV/atom | Automatminer express v1.0.3.2019111 | yes |
matbench_log_gvrh | 0.0849 log(GPa) | Automatminer express v1.0.3.2019111 | yes |
matbench_log_kvrh | 0.0679 log(GPa) | Automatminer express v1.0.3.2019111 | yes |
matbench_mp_e_form | 0.0327 eV/atom | MEGNet v0.2.2 | yes, structure only |
matbench_mp_gap | 0.228 eV | CGCNN (2019) | yes, structure only |
matbench_mp_is_metal | 0.977 | MEGNet v0.2.2 | yes, structure only |
matbench_perovskites | 0.0417 | MEGNet v0.2.2 | yes, structure only |
matbench_phonons | 36.9 cm^-1 | MEGNet v0.2.2 | yes, structure only |
matbench_steels | 95.2 MPa | Automatminer express v1.0.3.2019111 | yes |