Publication

The Melting Point Profile of Organic Molecules


Bibliographic Details
Summary: The combination of the generic molecular maps of atom-level properties (MOLMAPs) encoding approach and the Random Forest (RF) algorithm is applied to model, predict, and interpret the structural motifs responsible for an organic molecule's melting point (mp) profile. A high-quality database is used for model building and for evaluating predictive ability. The results obtained for the complete independent test set (R² = 0.811, MAE = 31.99 K, RMS = 43.98 K) are comparable to or better than those of reference works. The encoding implicitly represents the structure of a given molecule and highlights the interactions responsible for its melting point profile. This generic encoding approach groups different structural motifs based on their calculated atom-level properties, leading to good predictive ability for structurally different chemical systems not contained in the training set.
Main Authors: Carrera, Gonçalo Valente da Silva Marino
Subject: chemoinformatics; codification; Kohonen neural networks; melting points; organic molecules; QSPR; random forests
Year: 2022
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Origin: Repositório Institucional da UNL