Data sets and machine learning models for: Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates (Q6693421)
From MaRDI portal
Dataset published at Zenodo repository.
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Data sets and machine learning models for: Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates | Dataset published at Zenodo repository. | |
Statements
The datasets and final machine learning model files for the manuscript "Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates".

Citation should refer directly to the manuscript: Chung, Y.; Green, W. H. Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates. Chemical Science 2024, doi: 10.1039/D3SC05353A

To use the machine learning models, please refer to the sample files and instructions on https://github.com/yunsiechung/chemprop/tree/RxnSolvKSE_ML. Detailed information can be found in the README.md file.

Details on the files

In the pretraining and finetuning set csv files, each column represents:
- rxn_smiles: atom-mapped reaction SMILES
- solvent_smiles: solvent SMILES
- ddGsolv: solvation free energy of activation of a reaction-solvent pair at 298 K in kcal/mol (main prediction target)
- ddHsolv: solvation enthalpy of activation of a reaction-solvent pair at 298 K in kcal/mol (main prediction target)
- dGsolv_reactant: solvation free energy of reactant(s) at 298 K in kcal/mol (additional feature)
- dGsolv_product: solvation free energy of product(s) at 298 K in kcal/mol (additional feature)
- dHsolv_reactant: solvation enthalpy of reactant(s) at 298 K in kcal/mol (additional feature)
- dHsolv_product: solvation enthalpy of product(s) at 298 K in kcal/mol (additional feature)

Data sets under 'RxnSolvKSE_dataset_v1.1.zip'

pretraining_set: contains the dataset used for pre-training
- all_data: contains all calculated data
- pretraining_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv: contains both main prediction targets and additional features for reaction-solvent pairs
- pretraining_solvent_info.csv: list of all solvents
- pretraining_unique_rxn.csv: list of all reactions, both forward and reverse directions
- chosen_500k_data: contains the chosen 500k data
- pretraining_rxn_solvent_ddGsolv_ddHsolv_500k.csv: contains main prediction targets for reaction-solvent pairs
- pretraining_features_react_prod_dGsolv_dHsolv_500k.csv: contains additional features for reaction-solvent pairs
- train_test_split: contains the 5-fold random split training and test sets

finetuning_set: contains the dataset used for fine-tuning
- all_data: contains all calculated data
- finetuning_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv: contains both main prediction targets and additional features for reaction-solvent pairs. The rxn_key column indicates whether the reaction is bimolecular hydrogen abstraction (bihabs), unimolecular hydrogen migration (intrahabs), or radical addition to a multiple bond (raddition). The 'fwd' and 'rev' labels indicate forward and reverse reactions, respectively.
- finetuning_solvent_info.csv: list of all solvents
- finetuning_unique_rxn.csv: list of all reactions, both forward and reverse directions
- chosen_data: contains chosen data
- finetuning_rxn_solvent_ddGsolv_ddHsolv_chosen.csv: contains main prediction targets for reaction-solvent pairs
- finetuning_features_react_prod_dGsolv_dHsolv_chosen.csv: contains additional features for reaction-solvent pairs

experimental_set: contains the experimental rate constant data used to test the model. The original experimental data can be found at https://zenodo.org/record/7747557.
- expt_rxn_atom_mapped_smiles.csv: contains the atom-mapped reaction SMILES used for the experimental data
- expt_data_collected.xlsx: contains all experimental data and detailed information
- expt_rxn_solv_smiles_with_features_all.csv: contains the computed additional features for the experimental reaction-solvent pairs

Machine learning model files under 'RxnSolvKSE_ML_model_files.zip'

Contains the Chemprop machine learning model files for predicting ddGsolv and ddHsolv for a reaction-solvent pair. The models take atom-mapped reaction SMILES and solvent SMILES as inputs. To use these ML models, please refer to the sample files and instructions on https://github.com/yunsiechung/chemprop/tree/RxnSolvKSE_ML
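
As a rough illustration of how the target columns described above might be read and interpreted, the following Python sketch parses a csv with the documented column names and converts ddGsolv into a solution-to-gas rate constant ratio. Several things here are assumptions rather than part of the data set: the two sample rows are made-up placeholder values, the function names are hypothetical, and the conversion assumes simple transition state theory, where k_solution / k_gas = exp(-ddGsolv / RT) and ddGsolv = ddHsolv - T * ddSsolv. For real use, the in-memory sample would be replaced by an opened file from the zip archive.

```python
import csv
import io
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_REF = 298.0         # temperature of the tabulated values, in K

# Placeholder rows for illustration only; these are NOT values from the data set.
SAMPLE_CSV = """rxn_smiles,solvent_smiles,ddGsolv,ddHsolv
[CH4:1].[OH:2]>>[CH3:1].[OH2:2],O,-1.0,-2.0
[CH4:1].[OH:2]>>[CH3:1].[OH2:2],CCO,0.5,-0.5
"""


def load_targets(handle):
    """Read rows and convert the two main target columns to floats."""
    rows = []
    for row in csv.DictReader(handle):
        row["ddGsolv"] = float(row["ddGsolv"])
        row["ddHsolv"] = float(row["ddHsolv"])
        rows.append(row)
    return rows


def rate_ratio_vs_gas(ddg_solv, temperature=T_REF):
    """Estimate k_solution / k_gas as exp(-ddGsolv / RT) (TST assumption)."""
    return math.exp(-ddg_solv / (R_KCAL * temperature))


def activation_entropy(ddg_solv, ddh_solv, temperature=T_REF):
    """Solvation entropy of activation in kcal/(mol*K), from ddG = ddH - T*ddS."""
    return (ddh_solv - ddg_solv) / temperature


rows = load_targets(io.StringIO(SAMPLE_CSV))
for row in rows:
    ratio = rate_ratio_vs_gas(row["ddGsolv"])
    dds = activation_entropy(row["ddGsolv"], row["ddHsolv"])
    print(row["solvent_smiles"], ratio, dds)
```

A negative ddGsolv gives a ratio above 1 (the solvent speeds up the reaction relative to the gas phase under these assumptions), and a positive value gives a ratio below 1. The ratio between two solvents follows by dividing their individual ratios.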
17 June 2023
1.1