There is a newer version of the record available.

Published July 2022 | Version v5
Dataset Open

ab initio REPEAT Charge MOF Database (ARC-MOF)

Description

This is a database of ~280,000 MOFs which have been either experimentally characterized or computationally generated, spanning all publicly available MOF databases. DFT-derived REPEAT charges, adsorption data, and various descriptors are available for all MOFs.

  • all_structures_1.tar.gz and all_structures_2.tar.gz – these are the cif files considered to compose the “entire known design space” of MOFs, with any bad structures removed (split into two tarballs because of their size).
  • ARCMOF_20220610.tar.gz – these are all of the cif files with REPEAT charges composing ARC-MOF.
  • flig-clusters.csv, func-clusters.csv, geo-clusters.csv, mc-clusters.csv – Each file indicates for each MOF which cluster it belongs to, and whether the MOF is present in ARC-MOF. This is done for each "type" of MOF chemistry and for the geometric properties. Clusters with a negative value indicate the MOF does not belong to any cluster (i.e., it is assumed to be "unique").
  • all_topology_lists.csv – a csv file containing the topology reported by the filename of applicable structures, and the topology reported by CrystalNets.jl
  • ML_test_set.tar.gz – these are the cif files (with REPEAT charges) of the MOFs that are in the diverse-mc subset but missing from ARC-MOF (intended as an ML test set for the prediction of metal charges).
  • geometric_properties.csv – a csv file containing geometric descriptors computed for this study for all MOFs. The csv file also indicates which MOFs are present in ARC-MOF, and the order in which they were chosen for the farthest point sampling (up to 100K MOFs).
  • RACs.csv – See geometric_properties.csv description. Same type of file, but with the RAC descriptors.
  • RDFs.csv – The RDFs for each MOF, using several atomic properties. Some atomic properties are not available for all elements. In the cases where the atomic property is not available for a particular structure, no value is assigned.
  • methane.csv, methane_purification-CH4.csv, methane_purification_CO2.csv, post_comb_vsa-CO2.csv, post_comb_vsa-N2.csv, pre_comb_4040-CO2.csv, pre_comb_4040-H2.csv, landfill-CH4.csv, landfill-CO2.csv – these are csv files of the raw uptake data at various temperature and pressure conditions (with standard deviations) for each gas separation process specified in the file overall_process.csv.
  • overall_process.csv – This is a csv file of the adsorption properties of the MOFs. In particular, the file contains the working capacity (mmol/g_working_capacity) and selectivity of each MOF for each of the five process conditions.
  • mc-diverse-set.csv, func-diverse-set.csv – csv files containing which MOFs are present in each diverse set (from farthest point sampling of the MOFs based on either their functional group chemistry or metal chemistry). The file indicates which MOFs are present in ARC-MOF and which are not.
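The diverse sets and the selection order in geometric_properties.csv come from farthest point sampling. As a rough illustration of that idea only, the following is a minimal pure-Python sketch of greedy farthest point sampling. The Euclidean distance, the starting point, and the toy descriptor values are assumptions made for this example; the record does not state which metric, starting structure, or descriptor scaling was used, and the function name is hypothetical.

```python
import math


def farthest_point_sampling(points, n_samples, start=0):
    """Greedy farthest point sampling over a list of descriptor vectors.

    Returns the indices of the chosen points in the order they were selected.
    Euclidean distance is an assumption of this sketch.
    """
    if not 0 < n_samples <= len(points):
        raise ValueError("n_samples must be between 1 and len(points)")
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_samples:
        # Choose the point that is farthest from the current selection.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy 2-D descriptors (hypothetical values, not taken from the dataset).
descriptors = [(0.0, 0.0), (0.1, 0.0), (2.0, 0.0), (0.0, 3.0), (1.0, 3.0)]
order = farthest_point_sampling(descriptors, n_samples=3)
print(order)
```

Each step adds the structure whose nearest already-selected neighbour is farthest away, so the selection order itself ranks structures from most to least distinct relative to what was chosen before.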

Version history of repository:

v2 -- added file: "all_topology_lists.csv"

v3 -- added file: "ML_test_set.tar.gz"

v4 -- replaced file: "ML_test_set.tar.gz". The original upload contained an incorrect set of cifs.

v5 -- A slightly updated version of ARC-MOF has been provided. Some MOFs were removed from ARC-MOF due to structural errors. Some MOFs in ARC-MOF containing Sm were updated, as they had incorrectly assigned charges. Additional MOFs from all_structures containing Sm were added to ARC-MOF.

Files

Files (7.5 GB)

Checksum                                Size
md5:79d3c550d2f3dc74567e44391b2e01ad    962.7 MB
md5:4095562586d33657666f5ef5cd869fc0    1.2 GB
md5:14da9901247d08e450fb31d84c6ac912    11.7 MB
md5:10999d85f30191aa0e02a19f4cc5a516    866.7 MB
md5:b58a0b8b608a76c9ba87457d18c9f01f    21.8 MB
md5:b122413fe5015b9735a3e3d4f9856344    18.7 MB
md5:0e59830962eb57ca9ad41b5f7ff80817    6.0 MB
md5:82df1c16d0b7bff9eca621f20da272a7    24.2 MB
md5:345ecb861674a3a25e7be580cd1ab716    110.4 MB
md5:0f1cbae2fb5894162f84961bd2045480    111.7 MB
md5:e8649940b48df03cd42c2a8aee722554    112.1 MB
md5:01cfac0b7ef5baf8532e70a599a55b9c    21.2 MB
md5:d8791f9ee5ee6dba6feaa625f094bc4e    733.6 kB
md5:48ee74a8bc5e02fd90c043a09b63eb0b    99.1 MB
md5:bca6a607b40c8ba02a3225c0208fb549    111.9 MB
md5:06f7e5a14cfdc3561d8d33eb2c076449    112.0 MB
md5:fa045f108983d0a82309f400a5f39573    22.3 MB
md5:04b17cf2fc3431b05a7882bdff2abc3a    228.7 MB
md5:9d516ec31dd1c97f0e2b929df50ca249    112.3 MB
md5:cde549daf06b436f5721f5c07709c583    111.6 MB
md5:5a47aa6df2faa8b6e602fd220122f654    115.1 MB
md5:0464eab47218715e1f312e9f5c12f709    112.7 MB
md5:0bc7315e8bab41cc115c2baf31ba2ca4    838.4 MB
md5:65eb9c091ad427ff1411022290b307f4    2.1 GB
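
The md5 checksums listed above can be used to confirm that a downloaded file is complete and uncorrupted, which is worth doing for the multi-gigabyte tarballs. The following is a minimal sketch using Python's standard hashlib module. The helper names and the example path are hypothetical, and the pairing of a checksum with a particular file name has to be taken from the record's file listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or as bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (hypothetical local path; substitute the checksum listed for that file):
# matches_checksum("downloads/some_file.tar.gz", "md5:<checksum from the listing>")
```

Reading in fixed-size chunks keeps memory use constant regardless of file size.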

Additional details

Dates

Updated
2023-05