Dataset for quantum-mechanical exploration of conformers and solvent effects in large drug-like molecules

Research output: Contribution to journal, Research article, Contributed, peer-review

Contributors

  • Leonardo Medrano Sandonas, Chair of Materials Science and Nanotechnology, University of Luxembourg, TUD Dresden University of Technology (Author)
  • Dries Van Rompaey, Johnson & Johnson (Author)
  • Alessio Fallani, University of Luxembourg, Johnson & Johnson (Author)
  • Mathias Hilfiker, University of Luxembourg (Author)
  • David Hahn, Johnson & Johnson (Author)
  • Laura Perez-Benito, Johnson & Johnson (Author)
  • Jonas Verhoeven, Johnson & Johnson (Author)
  • Gary Tresadern, Johnson & Johnson (Author)
  • Joerg Kurt Wegner, Johnson & Johnson (Author)
  • Hugo Ceulemans, Johnson & Johnson (Author)
  • Alexandre Tkatchenko, University of Luxembourg (Author)

Abstract

We here introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains the structural and electronic information of 59,783 low- and high-energy conformers of 1,653 molecules with a total number of atoms ranging from 2 to 92 (mean: 50.9), and containing up to 54 (mean: 28.2) non-hydrogen atoms. To gain insights into the solvent effects as well as collective dispersion interactions for drug-like molecules, we have performed QM calculations supplemented with a treatment of many-body dispersion (MBD) interactions of structures and properties in the gas phase and implicit water. Thus, AQM contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules, whereas PBE0+MBD with the modified Poisson-Boltzmann (MPB) model of water was used for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.

Details

Original language: English
Article number: 742
Journal: Scientific data
Volume: 11
Issue number: 1
Publication status: Published - Dec 2024
Peer-reviewed: Yes

External IDs

PubMed: 38972891