Publication
The Melting Point Profile of Organic Molecules
| Summary: | The generic molecular maps of atom-level properties (MOLMAPs) encoding approach is combined with the Random Forest (RF) algorithm to model, predict, and interpret the structural motifs responsible for an organic molecule's melting point (mp) profile. A high-quality database is used for model building and for evaluating predictive ability. The results obtained for the complete independent test set (R² = 0.811, MAE = 31.99 K, RMS = 43.98 K) are comparable to or better than those of reference works. This form of codification implicitly represents the structure of a given molecule and highlights the interactions responsible for a given melting point profile. The generic encoding approach groups different structural motifs according to their calculated atom-level properties, leading to good predictive ability for structurally different chemical systems not contained in the training set. |
|---|---|
| Main Authors: | Carrera, Gonçalo Valente da Silva Marino |
| Subject: | chemoinformatics; codification; Kohonen neural networks; melting points; organic molecules; QSPR; random forests |
| Year: | 2022 |
| Country: | Portugal |
| Document type: | article |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Origin: | Repositório Institucional da UNL |
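
The summary reports three test-set statistics (R², MAE, and RMS error). As a minimal sketch of how such figures are commonly computed from paired observed and predicted melting points, the Python below uses only the standard library. It assumes R² means the coefficient of determination; the record does not say which definition the authors used, and the squared correlation coefficient is another common choice. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed/predicted values.

    Assumption: r2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    # Mean absolute error: average size of the prediction error.
    mae = sum(abs(r) for r in residuals) / n
    # Root-mean-square error: penalises large errors more strongly than MAE.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    # Total sum of squares around the mean of the observed values.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical melting points in kelvin, for illustration only (not from the paper).
observed_mp = [300.0, 350.0, 400.0, 450.0]
predicted_mp = [310.0, 340.0, 410.0, 440.0]
r2, mae, rmse = regression_metrics(observed_mp, predicted_mp)
print(f"R2 = {r2:.3f}, MAE = {mae:.2f} K, RMS = {rmse:.2f} K")
```

Because MAE and RMS error carry the units of the predicted property, they are read here in kelvin, matching the values quoted in the summary.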