Chemformer: a pre-trained transformer for computational chemistry

Irwin, Ross; Dimitriadis, Spyridon; He, Jiazhen; Bjerrum, Esben Jannik (2022). Chemformer: a pre-trained transformer for computational chemistry. Machine Learning: Science and Technology, 3(1), 015022. ISSN 2632-2153.

Full text: Irwin_2022_Mach._Learn.__Sci._Technol._3_015022.pdf (Published Version, 1 MB)

Abstract

Transformer models coupled with the simplified molecular-input line-entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model—a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.
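To make the approach in the abstract concrete, the sketch below illustrates the general pattern it describes: an encoder-decoder Transformer over SMILES tokens, fine-tuned as a sequence-to-sequence model (here, a toy retrosynthesis step mapping a product SMILES to reactant SMILES). This is a minimal sketch, not the paper's released code: the character-level vocabulary, model sizes, and the example reaction pair are all illustrative assumptions.

```python
# Minimal sketch of a seq2seq Transformer over SMILES, in the spirit of the
# approach the abstract describes. Vocabulary, sizes, and the toy reaction
# pair are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn

# Toy character-level SMILES vocabulary; real systems use a regex tokenizer.
VOCAB = ["<pad>", "<bos>", "<eos>"] + list("CNOSPFIBrcl()[]=#@+-.1234567890")
STOI = {t: i for i, t in enumerate(VOCAB)}
PAD, BOS, EOS = STOI["<pad>"], STOI["<bos>"], STOI["<eos>"]

def encode(smiles: str) -> torch.Tensor:
    """Map a SMILES string to token ids with start/end markers."""
    return torch.tensor([BOS] + [STOI[c] for c in smiles] + [EOS])

class Seq2SeqSmiles(nn.Module):
    """Tiny encoder-decoder Transformer over SMILES token ids."""
    def __init__(self, vocab_size: int, d_model: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            dim_feedforward=256, batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # Causal mask: each decoder position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=mask)
        return self.lm_head(h)

# One fine-tuning step on a single (product -> reactants) pair.
model = Seq2SeqSmiles(len(VOCAB))
src = encode("CC(=O)Oc1ccccc1C(=O)O").unsqueeze(0)    # aspirin (product)
tgt = encode("CC(=O)O.Oc1ccccc1C(=O)O").unsqueeze(0)  # toy reactant split
logits = model(src, tgt[:, :-1])                      # teacher forcing
loss = nn.functional.cross_entropy(
    logits.reshape(-1, len(VOCAB)), tgt[:, 1:].reshape(-1), ignore_index=PAD)
loss.backward()
```

Per the abstract, the same network would first be pre-trained with a self-supervised objective on unlabelled SMILES before fine-tuning as above; that pre-training stage is omitted here for brevity.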

Item Type: Article
Subjects: Open Library Press > Multidisciplinary
Depositing User: Unnamed user with email support@openlibrarypress.com
Date Deposited: 14 Jul 2023 11:12
Last Modified: 14 Jul 2023 11:12
URI: https://openlibrarypress.com/id/eprint/1801
