Dataset for quantum-mechanical exploration of conformers and solvent effects in large drug-like molecules

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Leonardo Medrano Sandonas, Chair of Materials Science and Nanotechnology, University of Luxembourg (Author)
  • Dries Van Rompaey, Johnson & Johnson (Author)
  • Alessio Fallani, University of Luxembourg, Johnson & Johnson (Author)
  • Mathias Hilfiker, University of Luxembourg (Author)
  • David Hahn, Johnson & Johnson (Author)
  • Laura Perez-Benito, Johnson & Johnson (Author)
  • Jonas Verhoeven, Johnson & Johnson (Author)
  • Gary Tresadern, Johnson & Johnson (Author)
  • Joerg Kurt Wegner, Johnson & Johnson (Author)
  • Hugo Ceulemans, Johnson & Johnson (Author)
  • Alexandre Tkatchenko, University of Luxembourg (Author)

Abstract

We here introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains the structural and electronic information of 59,783 low- and high-energy conformers of 1,653 molecules with a total number of atoms ranging from 2 to 92 (mean: 50.9), and containing up to 54 (mean: 28.2) non-hydrogen atoms. To gain insights into the solvent effects as well as collective dispersion interactions for drug-like molecules, we have performed QM calculations supplemented with a treatment of many-body dispersion (MBD) interactions of structures and properties in the gas phase and implicit water. Thus, AQM contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules, whereas PBE0+MBD with the modified Poisson-Boltzmann (MPB) model of water was used for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.

Details

Original language: English
Article number: 742
Journal: Scientific Data
Volume: 11
Issue number: 1
Publication status: Published - Dec 2024
Peer-reviewed: Yes

External IDs

PubMed 38972891