
Digitization of molecular complexity with machine learning
Scientific publication

Journal: Chemical Science
ISSN: 2041-6520, E-ISSN: 2041-6539
Publication details: Year: 2025, Volume: 16, Pages: 6895-6908, Page count: 14, DOI: 10.1039/d4sc07320g
Authors: Tyrin Andrei S. 1, Boiko Daniil A. 1, Kolomoets Nikita I. 1, Ananikov Valentine P. 1
Affiliations
1 Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospekt 47, Moscow, 119991, Russia

Abstract: Digitization of molecular complexity is of key importance in chemistry and life sciences to develop structure–activity relationships in chemical behavior and biological activity. The complexity of a given molecule compared to others is largely based on intuitive perception and lacks a standardized numerical measure. Quantifying molecular complexity remains a fundamental challenge, with key implications currently remaining controversial. In this study, we introduce a novel machine learning-based framework employing a Learning to Rank (LTR) approach to quantify molecular complexity on the basis of labeled data. As a result, we developed a ranking model utilizing the dataset that comprises approximately 300 000 data points across diverse chemical structures, leveraging human expertise to capture complex decision rules that researchers intuitively use. Applications of our model in mapping the current organic chemistry landscape, analyzing FDA-approved drugs, guiding lead optimization processes, and interpreting total synthesis approaches reveal key trends in increasing molecular complexity and synthetic strategy evolution. Our study advances the methodologies available for quantifying molecular complexity, changing it from an elusive property to a numerical characteristic. With machine learning, we managed to digitize human perception of molecular complexity. Moreover, a corresponding large labeled dataset was produced for future research in this area.
Citation: Tyrin A.S., Boiko D.A., Kolomoets N.I., Ananikov V.P.
Digitization of molecular complexity with machine learning
Chemical Science. 2025. V.16. P.6895-6908. DOI: 10.1039/d4sc07320g
Database identifiers:
Web of Science: WOS:001447488200001
OpenAlex: W4408613216
Citations in databases: No citations yet