
A 3D Generative Model for Structure-Based Drug Design

Shitong Luo
HeliXon Research
[email protected]
[email protected]

Jiaqi Guan
University of Illinois Urbana-Champaign
[email protected]

Jianzhu Ma
Peking University
[email protected]

Jian Peng
University of Illinois Urbana-Champaign
[email protected]

Abstract

We study a fundamental problem in structure-based drug design: generating molecules that bind to specific protein binding sites. While we have witnessed great success of deep generative models in drug design, existing methods are mostly string-based or graph-based. They are limited by the lack of spatial information and thus cannot be applied to structure-based design tasks. In particular, such models have little or no knowledge of how molecules interact with their target proteins in 3D space. In this paper, we propose a 3D generative model that generates molecules given a designated 3D protein binding site. Specifically, given a binding site as the 3D context, our model estimates the probability density of atom occurrences in 3D space: positions that are more likely to contain atoms are assigned higher probability. To generate 3D molecules, we propose an auto-regressive sampling scheme in which atoms are sampled sequentially from the learned distribution until there is no room for new atoms. Combined with this sampling scheme, our model can generate valid and diverse molecules and is applicable to various structure-based molecular design tasks such as molecule sampling and linker design. Experimental results demonstrate that molecules sampled from our model exhibit high binding affinity to specific targets and good drug properties such as drug-likeness, even though the model is not explicitly optimized for them.

1 Introduction

Designing molecules that bind to a specific protein binding site, also known as structure-based drug design, is one of the most challenging tasks in drug discovery [2]. Searching for suitable molecule candidates in silico usually involves massive computational effort because of the enormous space of synthetically feasible chemicals [22] and the conformational degrees of freedom of both compound and protein structures [11].

In recent years, we have witnessed the success of machine learning approaches to problems in drug design, especially molecule generation. Most of these approaches use deep generative models to propose drug candidates by learning the underlying distribution of desirable molecules. However, most such methods are SMILES/string-based [10, 17] or graph-based [18, 19, 13, 14]. They are limited by the lack of spatial information and unable to perceive how molecules interact with proteins in 3D space. Hence, these methods are not applicable to generating molecules that fit a specific protein structure, also known as the drug target.


Another line of work studies generating molecules directly in 3D space [8, 28, 29, 20, 30, 15]. Most of these methods [8, 28, 29] can only handle very small organic molecules and are not sufficient to generate drug-scale molecules, which usually contain dozens of heavy atoms. [20] proposes to generate voxelized molecular images and uses a post-processing algorithm to reconstruct molecular structures. Though this method can produce drug-scale molecules for specific protein pockets, the quality of the samples is heavily limited by voxelization. Therefore, generating high-quality drug molecules for specific 3D protein binding sites remains challenging.

In this work, we propose a 3D generative model to approach this task. Specifically, we aim at modeling the distribution of atom occurrence in the 3D space of the binding site. Formally, given a binding site C as input, we model the distribution p(e, r|C), where r ∈ R^3 is an arbitrary 3D coordinate and e is the atom type. To realize this distribution, we design a neural network architecture that takes as input a query 3D coordinate r, conditional on the 3D context C, and outputs the probability of r being occupied by an atom of a particular chemical element. In order to ensure the distribution is equivariant to C's rotation and translation, we utilize rotationally invariant graph neural networks to perceive the context of each query coordinate.

Despite having a neural network to model the distribution of atom occurrence p(e, r|C), generating valid and diverse molecules remains technically challenging, mainly for the following two reasons. First, simply drawing i.i.d. samples from the distribution p(e, r|C) does not yield valid molecules, because atoms within a molecule are not independent of each other. Second, a desirable sampling algorithm should capture the multi-modality of the feasible chemical space, i.e. it should be able to generate a diverse set of desired molecules given a specific binding context. To tackle these challenges, we propose an auto-regressive sampling algorithm. Specifically, we start with a context consisting of only protein atoms. Then, we iteratively sample one atom from the distribution at each step and add it to the context to be used in the next step, until there is no room for new atoms. Compared to other recent methods [20, 23], our auto-regressive algorithm is simpler and more advantageous: it does not rely on post-processing algorithms to infer atom placements from a density, and it is capable of multi-modal sampling by the nature of auto-regressive generation, avoiding the additional latent variables of VAEs [16] or GANs [9], which would bring extra architectural complexity and training difficulty.

We conduct extensive experiments to evaluate our approach. Quantitative and qualitative results show that: (1) our method is able to generate diverse drug-like molecules that have high binding affinity to specific targets based on 3D structures of protein binding sites; (2) our method is able to generate molecules with fairly high drug-likeness scores (QED) [4] and synthetic accessibility scores (SA) [6] even though the model is not specifically optimized for them; (3) in addition to molecule generation, the proposed method is also applicable to other relevant tasks such as linker design.

2 Related Work

SMILES-Based and Graph-Based Molecule Generation Deep generative models have become prevalent in molecule design. The overall idea is to use deep generative models to propose molecule candidates by learning the underlying distribution of desirable molecules. Existing works can be roughly divided into two classes: string-based and graph-based. String-based methods represent molecules as linear strings, e.g. SMILES strings [34], making a wide range of language modeling tools readily applicable. For example, [5, 10, 26] utilize recurrent neural networks to learn a language model of SMILES strings. However, string-based representations fail to capture molecular similarities, making them sub-optimal representations for molecules [13]. In contrast, graph representations are more natural, and graph-based approaches have drawn great attention. The majority of graph-based models generate molecules in an auto-regressive fashion, i.e., adding atoms or fragments sequentially, which can be implemented based upon VAEs [13], normalizing flows [27], reinforcement learning [35, 14], etc. Despite the progress made in string-based and graph-based approaches, they are limited by the lack of spatial information and thus cannot be directly applied to structure-based drug design tasks [2]. Specifically, as 1D/2D methods, they are unable to perceive how molecules interact with their target proteins in 3D space.

Molecule Generation in 3D Space There has been another line of methods that generate molecules directly in 3D space. [8] proposes an auto-regressive model that takes a partially generated molecule as input, outputs the next atom's chemical element and its distances to previous atoms, and places the atom in 3D space according to the distance constraints. [28, 29] approach this task via reinforcement learning by generating 3D molecules sequentially. Different from the previous method [8], they mainly rely on a reward function derived from the potential energy of atomic systems. These works can generate realistic 3D molecules. However, they can only handle small organic molecules, which is not sufficient for generating drug-scale molecules that usually contain dozens of heavy atoms.

[20, 23] propose a non-autoregressive approach to 3D molecular generation that is able to generate drug-scale molecules. It represents molecules as 3D images by voxelizing them onto 3D meshgrids. In this way, the molecular generation problem is transformed into an image generation problem, making it possible to leverage sophisticated image generation techniques. Specifically, it employs convolutional neural network-based VAEs [16] or GANs [9] to generate such molecular images. It also attempts to fuse the binding site structure into the generative network, enabling the model to generate molecules for designated binding targets. In order to reconstruct the molecular structures from images, it leverages a post-processing algorithm that searches for atom placements that best fit the image. In comparison to previous methods, which can only generate small 3D molecules, this method can generate drug-scale 3D molecules. However, the quality of its generated molecules is not satisfactory because of the following major limitations. First, it is hardly scalable to large binding pockets, as the number of voxels grows cubically with the size of the binding site. Second, the resolution of the 3D molecular images is another bottleneck that significantly limits precision, due to the same scalability issue. Last, conventional CNNs are not rotation-equivariant, a property that is crucial for modeling molecular systems [25].

3 Method

Our goal is to generate a set of atoms that forms a valid, drug-like molecule fitting a specific binding site. To this end, we first present in Section 3.1 a 3D generative model that predicts the probability of atom occurrence in the 3D space of the binding site. Second, we present in Section 3.2 the auto-regressive sampling algorithm for generating valid and diverse molecules from the model. Finally, in Section 3.3, we derive the training objective, by which the model learns to predict where atoms should be placed and what type of atom should be placed.

3.1 3D Generative Model Design

A binding site can be defined as a set of atoms C = {(a_i, r_i)}_{i=1}^{N_b}, where N_b is the number of atoms in the binding site, a_i denotes the i-th atom's attributes (such as its chemical element and the amino acid it belongs to), and r_i is its 3D coordinate. To generate atoms in the binding site, we consider modeling the probability of an atom occurring at some position r in the site. Formally, this amounts to modeling the density p(e|r, C), where r ∈ R^3 is an arbitrary 3D coordinate, and e ∈ E = {H, C, O, ...} is the chemical element. Intuitively, this density can be interpreted as a classifier that takes as input a 3D coordinate r, conditional on C, and predicts the probability of r being occupied by an atom of type e.

To model p(e|r, C), we devise a model consisting of two parts: the Context Encoder learns a representation of each atom in the context C via graph neural networks, and the Spatial Classifier takes as input a query position r, aggregates the representations of the contextual atoms near it, and finally predicts p(e|r, C). The implementation of these two parts is detailed as follows.

Context Encoder The purpose of the context encoder is to extract information-rich representations for each atom in C. We assume a desirable representation should satisfy two properties: (1) context-awareness: the representation of an atom should encode not only the properties of the atom itself, but also its context; (2) rotational and translational invariance: since the physical and biological properties of the system do not change under rigid transforms, the representations that reflect these properties should be invariant to rigid transforms as well. To this end, we employ rotationally and translationally invariant graph neural networks [25] as the backbone of the context encoder, described as follows.

First of all, since there is generally no natural topology in C, we construct a k-nearest-neighbor graph based on inter-atomic distances, denoted as G = ⟨C, A⟩, where A is the adjacency matrix. We also denote the k-NN neighborhood of atom i as N_k(r_i) for convenience. The context encoder takes G as input and outputs structure-aware node embeddings.
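As a concrete illustration (not taken from the paper's released code), such a k-nearest-neighbor graph can be built directly from the atom coordinates, e.g. with SciPy; the function name and the default value of k below are assumptions:

    import numpy as np
    from scipy.spatial import cKDTree

    def knn_graph(coords: np.ndarray, k: int = 8):
        """Build a k-NN neighborhood list from 3D coordinates.

        coords: (N, 3) array of atom positions in the context C.
        Returns neighbors[i] = indices of the k atoms closest to atom i (excluding i).
        """
        tree = cKDTree(coords)
        _, idx = tree.query(coords, k=k + 1)  # k+1 because each point's nearest neighbor is itself
        return [row[row != i][:k] for i, row in enumerate(idx)]

    # Example: a random point cloud standing in for context atoms
    coords = np.random.rand(100, 3) * 10.0
    neighbors = knn_graph(coords, k=8)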

Figure 1: An illustration of the sampling process. Atoms are sampled sequentially. The probability density changes as we place new atoms. The sampling process naturally diverges, leading to different samples.

The first layer of the encoder is a linear layer. It maps atomic attributes {a_i} to initial embeddings {h_i^{(0)}}. Then, these embeddings, along with the graph structure A, are fed into L message passing layers. Specifically, the message passing takes the form:

    h_i^{(\ell+1)} = \sigma\Big( W_0^{\ell} h_i^{(\ell)} + \sum_{j \in N_k(r_i)} W_1^{\ell}\, w(d_{ij}) \odot W_2^{\ell} h_j^{(\ell)} \Big),    (1)

where w(·) is a weight network and d_{ij} denotes the distance between atom i and atom j. The formula is similar to continuous-filter convolution [25]. Note that the weight of the message from j to i depends only on d_{ij}, ensuring invariance to rotation and translation. Finally, we obtain {h_i^{(L)}}, a set of embeddings for each atom in C.
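A minimal PyTorch-style sketch of one such message passing layer is given below. The radial-basis expansion of distances, the SiLU activation, and the layer sizes are illustrative assumptions rather than the authors' exact implementation:

    import torch
    import torch.nn as nn

    class MessagePassingLayer(nn.Module):
        """One layer in the spirit of Eq. (1):
        h_i <- sigma(W0 h_i + sum_{j in N_k(i)} W1 w(d_ij) * W2 h_j)."""

        def __init__(self, hidden_dim=256, num_rbf=64):
            super().__init__()
            self.w0 = nn.Linear(hidden_dim, hidden_dim)
            self.w1 = nn.Linear(num_rbf, hidden_dim)  # weight network w(.) on expanded distances
            self.w2 = nn.Linear(hidden_dim, hidden_dim)
            self.act = nn.SiLU()
            self.register_buffer("centers", torch.linspace(0.0, 10.0, num_rbf))

        def rbf(self, d):
            # d: (E,) edge distances -> (E, num_rbf) radial-basis features
            return torch.exp(-(d.unsqueeze(-1) - self.centers) ** 2)

        def forward(self, h, edge_index, d):
            # h: (N, hidden_dim); edge_index: (2, E) receiver/sender pairs (i, j); d: (E,) distances
            i, j = edge_index
            msg = self.w1(self.rbf(d)) * self.w2(h[j])       # filter depends only on d_ij
            agg = torch.zeros_like(h).index_add_(0, i, msg)  # sum messages into each receiver i
            return self.act(self.w0(h) + agg)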

Spatial Classifier The spatial classifier takes as input a query position r ∈ R^3 and predicts the type of atom occupying r. In order to make successful predictions, the model should be able to perceive the context around r. Therefore, the first step of this part is to aggregate atom embeddings from the context encoder:

    v = \sum_{j \in N_k(r)} W_0\, w_{\mathrm{aggr}}(\lVert r - r_j \rVert) \odot W_1 h_j^{(L)},    (2)

where N_k(r) is the k-nearest neighborhood of r. Note that we weight the different embeddings using the weight network w_aggr(·) according to distance, because it is necessary to distinguish the contributions of different atoms in the context. Finally, in order to predict p(e|r, C), the aggregated feature v is passed to a standard multi-layer perceptron classifier:

    c = \mathrm{MLP}(v),    (3)

where c is the vector of non-normalized probabilities over chemical elements. The estimated probability of position r being occupied by an atom of type e is:

    p(e \mid r, C) = \frac{\exp(c[e])}{1 + \sum_{e' \in E} \exp(c[e'])},    (4)

where E is the set of possible chemical elements. Unlike typical classifiers that apply a softmax to c, we make use of the extra degree of freedom obtained by adding 1 to the denominator, so that the probability of "nothing" can be expressed as:

    p(\mathrm{Nothing} \mid r, C) = \frac{1}{1 + \sum_{e' \in E} \exp(c[e'])}.    (5)
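The spatial classifier of Eqs. (2)-(5) can be sketched in the same style; the number of element classes, the distance expansion, and the appended zero logit (numerically equivalent to the extra 1 in the denominator) are assumptions made for illustration:

    import torch
    import torch.nn as nn

    class SpatialClassifier(nn.Module):
        """Aggregate nearby context embeddings for a query point r (Eq. 2), map the
        aggregated feature to element logits (Eq. 3), and normalize with an implicit
        "nothing" class (Eqs. 4-5)."""

        def __init__(self, hidden_dim=256, num_rbf=64, num_elements=7):
            super().__init__()
            self.w0 = nn.Linear(hidden_dim, hidden_dim)
            self.w1 = nn.Linear(hidden_dim, hidden_dim)
            self.w_aggr = nn.Linear(num_rbf, hidden_dim)  # weight network on expanded distances
            self.mlp = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                     nn.Linear(hidden_dim, num_elements))
            self.register_buffer("centers", torch.linspace(0.0, 10.0, num_rbf))

        def forward(self, r, ctx_pos, ctx_emb):
            # r: (3,) query position; ctx_pos: (K, 3) and ctx_emb: (K, hidden_dim)
            # are the k nearest context atoms around r.
            d = torch.linalg.norm(ctx_pos - r, dim=-1)                     # (K,)
            rbf = torch.exp(-(d.unsqueeze(-1) - self.centers) ** 2)        # (K, num_rbf)
            v = (self.w0(self.w_aggr(rbf)) * self.w1(ctx_emb)).sum(dim=0)  # Eq. (2)
            c = self.mlp(v)                                                # Eq. (3), element logits
            probs = torch.softmax(torch.cat([c, c.new_zeros(1)]), dim=0)   # appended 0-logit = "+1"
            return c, probs[:-1], probs[-1]  # logits, p(e|r,C), p(Nothing|r,C)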

3.2 Sampling

Sampling a molecule amounts to generating a set of atoms {(e_i, r_i)}_{i=1}^{N_a}. However, formulating an effective sampling algorithm is non-trivial because of the following three challenges. First, we have to define the joint distribution of e and r, i.e. p(e, r|C), from which we can jointly sample an atom's chemical element and its position. Second, simply drawing i.i.d. samples from p(e, r|C) does not make sense, because atoms are clearly not independent of each other; the sampling algorithm should therefore account for the dependencies between atoms. Third, the sampling algorithm should produce multi-modal samples. This is important because in reality there is usually more than one molecule that can bind to a specific target.

In the following, we first define the joint distribution p(e, r|C). Then, we present an auto-regressive sampling algorithm to tackle the second and third challenges.

Joint Distribution We define the joint distribution of the coordinate r and atom type e using Eq. (4):

    p(e, r \mid C) = \frac{\exp(c[e])}{Z},    (6)

where Z is an unknown normalizing constant and c is a function of r and C as defined in Eq. (3). Though p(e, r|C) is a non-normalized distribution, drawing samples from it is efficient because the dimension of r is only 3. Viable sampling methods include Markov chain Monte Carlo (MCMC) or discretization.
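One simple way to realize the discretization option is to evaluate the non-normalized density on a regular grid around the pocket and sample from the resulting categorical distribution; the grid extent and spacing below are assumptions:

    import numpy as np

    def sample_atom_by_discretization(score_fn, center, half_width=8.0, spacing=0.5, rng=None):
        """Draw (element, position) from the density of Eq. (6) on a 3D grid.

        score_fn(r) returns the non-normalized element scores c at position r
        (e.g. the spatial classifier above); center/half_width/spacing define the grid.
        """
        rng = rng or np.random.default_rng()
        axis = np.arange(-half_width, half_width + 1e-9, spacing)
        grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3) + center
        scores = np.stack([score_fn(r) for r in grid])  # (G, num_elements); batch this in practice
        weights = np.exp(scores - scores.max())         # proportional to exp(c[e]), overflow-safe
        probs = (weights / weights.sum()).reshape(-1)
        flat = rng.choice(probs.size, p=probs)
        grid_idx, element_idx = divmod(flat, scores.shape[1])
        return element_idx, grid[grid_idx]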

Auto-Regressive Sampling We sample a molecule by progressively sampling one atom at each step. Specifically, at step t, the context C_t contains not only the protein atoms but also the t atoms sampled beforehand. Sampled atoms in C_t are treated in the same way as protein atoms by the model, but they carry different attributes in order to differentiate them from protein atoms. Then, the (t+1)-th atom is sampled from p(e, r|C_t) and added to C_t, leading to the context for the next step, C_{t+1}. The sampling process is illustrated in Figure 1. Formally, we have:

    (e_{t+1}, r_{t+1}) \sim p(e, r \mid C_t), \qquad C_{t+1} \leftarrow C_t \cup \{(e_{t+1}, r_{t+1})\}.    (7)

To determine when the auto-regressive sampling should stop, we employ an auxiliary network. The network takes as input the embeddings of previously sampled atoms and classifies them into two categories: frontier and non-frontier. If all the existing atoms are non-frontier, which means there is no room for more atoms, the sampling is terminated. Finally, we use OpenBabel [21, 20] to obtain the bonds of the generated structures.
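Putting Eq. (7) and the stopping rule together, the sampling loop can be sketched as below; the callables, their names, and the hard cap on the number of atoms are placeholders rather than the authors' implementation:

    def generate_molecule(context, sample_atom, embed, is_frontier, max_atoms=60):
        """Auto-regressive sampling driver (Eq. 7).

        context     : list of (attributes, position) pairs, initialized with protein atoms.
        sample_atom : callable(context) -> (element, position), e.g. the discretized sampler above.
        embed       : callable(context) -> embeddings of the atoms generated so far.
        is_frontier : callable(embedding) -> bool, the auxiliary frontier classifier.
        """
        generated = []
        for _ in range(max_atoms):                     # safety cap (an assumption)
            element, position = sample_atom(context)
            generated.append((element, position))
            context = context + [(element, position)]  # C_{t+1} = C_t U {(e_{t+1}, r_{t+1})}
            if not any(is_frontier(h) for h in embed(context)):
                break                                  # no frontier atoms -> no room left
        return generated                               # bonds are then perceived with OpenBabel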

In summary, the proposed auto-regressive algorithm addresses the two challenges mentioned above. First, the model is aware of other atoms when placing new atoms and is thus able to account for the dependencies between them. Second, auto-regressive sampling is a stochastic process; its sampling path naturally diverges, leading to diverse samples.

3.3 Training

As we adopt an auto-regressive sampling strategy, we propose a cloze-filling training scheme: at training time, a random portion of the target molecule is masked, and the network learns to predict the masked part from the observable part and the binding site. This emulates the sampling process, where the model can only observe partial molecules. The training loss consists of the three terms described below.

First, to make sure the model is able to predict positions that actually have atoms (positive positions), we include a binary cross entropy loss to contrast positive positions against negative positions:

    L_{\mathrm{BCE}} = -\mathbb{E}_{r \sim p_+}\!\left[\log\left(1 - p(\mathrm{Nothing} \mid r, C)\right)\right] - \mathbb{E}_{r \sim p_-}\!\left[\log p(\mathrm{Nothing} \mid r, C)\right].    (8)

Here, p_+ is a positive sampler that yields coordinates of masked atoms, and p_- is a negative sampler that yields random coordinates in the ambient space. p_- is empirically defined as a Gaussian mixture model with |C| components, one centered at each atom in C. The standard deviation of each component is set to 2 Å in order to cover the ambient space. Intuitively, the first term in Eq. (8) increases the likelihood of atom placement at positions that should receive an atom, and the second term decreases the likelihood at other positions.
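A direct sketch of the negative sampler p_- and of Eq. (8), using the 2 Å standard deviation stated above (the small epsilon is an assumption added for numerical safety):

    import torch

    def sample_negative_positions(ctx_pos, num_samples, std=2.0):
        """p-: a Gaussian mixture with one component per context atom (std = 2 Angstrom)."""
        comp = torch.randint(0, ctx_pos.shape[0], (num_samples,))
        return ctx_pos[comp] + std * torch.randn(num_samples, 3)

    def bce_loss(p_nothing_pos, p_nothing_neg, eps=1e-8):
        """Eq. (8): positive positions should not be 'nothing'; negative positions should."""
        loss_pos = -torch.log(1.0 - p_nothing_pos + eps).mean()
        loss_neg = -torch.log(p_nothing_neg + eps).mean()
        return loss_pos + loss_neg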

Second, our model should be able to predict the chemical element of atoms. Hence, we further include a standard categorical cross entropy loss:

    L_{\mathrm{CAT}} = -\mathbb{E}_{(e, r) \sim p_+}\!\left[\log p(e \mid r, C)\right].    (9)

Figure 2: (a) A portion of the molecule is masked. (b) Positive coordinates are drawn from the masked atoms' positions and negative coordinates are drawn from the ambient space. (c) Both positive and negative coordinates are fed into the model. The model predicts the probability of atom occurrence at the coordinates. (d) Training losses are computed based on the discrepancy between predicted probabilities and ground truth.

Third, as introduced in Section 3.2, the sampling algorithm requires a frontier network to tell whether the sampling should be terminated. This leads to the last term, a standard binary cross entropy loss for training the frontier network:

    L_F = \sum_{i \in F \subseteq C} \log \sigma\big(F(h_i)\big) + \sum_{i \in C \setminus F} \log\big(1 - \sigma(F(h_i))\big),    (10)

where F is the set of frontier atoms in C, σ is the sigmoid function, and F(·) is the frontier network that takes an atom embedding as input and predicts the logit of the probability that the atom is a frontier. During training, an atom is regarded as a frontier if and only if (1) the atom is part of the target molecule, and (2) at least one of its bonded atoms is masked.
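For illustration, the frontier labels used by this loss can be derived from the ligand's bond list with a few lines of code; the data layout (index pairs and id sets) is an assumption:

    def frontier_labels(ligand_bonds, observed_ids, masked_ids):
        """Label each observed ligand atom as frontier (True) or non-frontier (False):
        an atom is a frontier iff it belongs to the target molecule and at least one
        of its bonded atoms is currently masked."""
        masked = set(masked_ids)
        bonded = {}
        for i, j in ligand_bonds:  # bonds as (i, j) atom-index pairs
            bonded.setdefault(i, set()).add(j)
            bonded.setdefault(j, set()).add(i)
        return {i: bool(bonded.get(i, set()) & masked) for i in observed_ids}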

Finally, by summing up L_BCE, L_CAT, and L_F, we obtain the full training loss L = L_BCE + L_CAT + L_F. The full training process is illustrated in Figure 2.

4 Experiments

We evaluate the proposed method on two relevant structure-based drug design tasks: (1) Molecule Design, generating molecules for given binding sites (Section 4.1), and (2) Linker Prediction, generating substructures that link two given fragments in the binding site (Section 4.2). Below, we describe the common setup shared across tasks. Detailed task-specific setups are provided in each subsection.

Data We use the CrossDocked dataset [7] following [20]. The dataset originally contains 22.5 million docked protein-ligand pairs at different levels of quality. We filter out data points whose binding pose RMSD is greater than 1 Å, leading to a refined subset of 184,057 data points. We use MMseqs2 [31] to cluster the data at 30% sequence identity, and randomly draw 100,000 protein-ligand pairs for training and 100 proteins from the remaining clusters for testing.

Model We train a single universal model for all tasks. The number of message passing layers L in the context encoder is 6, and the hidden dimension is 256. We train the model using the Adam optimizer with learning rate 0.0001. Other details about the model architecture and training parameters are provided in the supplementary material and the open-source repository: https://github.com/luost26/3D-Generative-SBDD.
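For reference, the hyperparameters stated above can be collected as follows; anything not reported in the paper is omitted here rather than guessed:

    config = {
        "num_message_passing_layers": 6,  # L
        "hidden_dim": 256,
        "optimizer": "Adam",
        "learning_rate": 1e-4,
    }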


Metric                    |      | liGAN  | Ours   | Ref
--------------------------+------+--------+--------+--------
Vina Score (kcal/mol, ↓)  | Avg. | -6.144 | -6.344 | -7.158
                          | Med. | -6.100 | -6.200 | -6.950
QED (↑)                   | Avg. | 0.371  | 0.525  | 0.484
                          | Med. | 0.369  | 0.519  | 0.469
SA (↑)                    | Avg. | 0.591  | 0.657  | 0.733
                          | Med. | 0.570  | 0.650  | 0.745
High Affinity (%, ↑)      | Avg. | 23.77  | 29.09  | -
                          | Med. | 11.00  | 18.50  | -
Diversity (↑)             | Avg. | 0.655  | 0.720  | -
                          | Med. | 0.676  | 0.736  | -

Table 1: Mean and median values of the metrics on generation quality. (↑) indicates higher is better; (↓) indicates lower is better.

Figure 3: Distributions of Vina, QED, and SA scores over all the generated molecules, for the reference ligands, liGAN, and our model.

4.1 Molecule Design

In this task, we generate molecules for specific binding sites with our model and the baseline. The inputs to the models are binding sites extracted from the proteins in the testing set. We sample 100 unique molecules for each target.

Baselines We compare our approach with the state-of-the-art baseline liGAN [20]. liGAN is based on conventional 3D convolutional neural networks. It generates voxelized molecular images and relies on a post-processing algorithm to reconstruct the molecule from the generated image.

Metrics We evaluate the quality of generated molecules from three main aspects: (1) Binding Affinity measures how well the generated molecules fit the binding site. We use Vina [33, 1] to compute the binding affinity (Vina Score). Before feeding the molecules to Vina, we employ the Universal Force Field (UFF) [24] to refine the generated structures, following [20]. (2) Drug-Likeness reflects how much a molecule resembles a drug. We use the QED score [4] as the metric for drug-likeness. (3) Synthesizability assesses the ease of synthesis of the generated molecules. We use the normalized SA score [6, 35] to measure synthesizability.

In order to evaluate the generation quality and diversity for each binding site, we define two additional metrics: (1) Percentage of Samples with High Affinity, the percentage of a binding site's generated molecules whose binding affinity is higher than or equal to that of the reference ligand. (2) Diversity [14], which measures the diversity of generated molecules for a binding site. It is calculated from the pairwise Tanimoto similarities [3, 32] over Morgan fingerprints among the generated molecules of a target.
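As an example of how the ligand-level metrics can be computed with RDKit (QED and pairwise Tanimoto similarity over Morgan fingerprints; the SA score additionally requires the sascorer script from RDKit Contrib, and Vina scores come from the docking software), consider the sketch below; the fingerprint radius and bit size are common defaults, not values taken from the paper, and the exact Diversity definition follows the paper rather than this snippet:

    from itertools import combinations
    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem, QED

    def evaluate_samples(smiles_list):
        """Compute mean QED and mean pairwise Tanimoto similarity for one target's samples."""
        mols = [m for m in (Chem.MolFromSmiles(s) for s in smiles_list) if m is not None]
        qed_scores = [QED.qed(m) for m in mols]
        fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048) for m in mols]
        sims = [DataStructs.TanimotoSimilarity(a, b) for a, b in combinations(fps, 2)]
        return {
            "mean_qed": sum(qed_scores) / len(qed_scores),
            "mean_pairwise_tanimoto": sum(sims) / len(sims) if sims else 0.0,
        }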

Results We first calculate the Vina Score, QED, and SA for each of the generated molecules. Figure 3 presents the histograms of these three metrics and Table 1 reports their mean and median values over all generated molecules. For each binding site, we further calculate the Percentage of Samples with High Affinity and the Diversity, and report their mean and median values in the bottom half of Table 1. From the quantitative results, we find that in general, our model is able to discover diverse molecules that have high binding affinity to specific targets. Besides, the molecules generated by our model also exhibit other desirable properties, including fairly high drug-likeness and synthesizability. When compared to the CNN baseline liGAN [20], our method achieves clearly better performance on all metrics, especially on the drug-likeness score QED, which indicates that our model produces more realistic drug-like molecules.

To better understand the results, we select two binding sites from the testing set and visualize their top-affinity samples for closer inspection.

Figure 4: Generated molecules with top binding affinity and the reference molecule for two representative binding sites. Lower Vina score indicates higher binding affinity.

The top row of Figure 4 shows the first example (PDB ID: 2hcj). The average QED and SA scores of the molecules generated for this target are 0.483 and 0.663 respectively, around the median of these two scores, and 8% of the generated molecules have higher binding affinity than the reference molecule, below the median of 18.5%. The second example (PDB ID: 4rlu) is shown in the bottom row. Here the average QED and SA scores are 0.728 and 0.785, and 18% of the sampled molecules achieve higher binding affinity. From these two examples in Figure 4, we can see that the generated molecules have overall structures similar to the reference molecule and share some important substructures with it, which indicates that the generated molecules fit the binding site as well as the reference does. Besides, the top-affinity molecules generally achieve QED and SA scores comparable to or even higher than the reference molecule, which shows that they not only fit well into the binding site but also exhibit desirable quality. In conclusion, these two representative cases evidence the model's ability to generate drug-like, high-binding-affinity molecules for designated targets.

4.2 Linker Prediction

Table 2: Performance of linker prediction.

Metric                    |      | DeLinker | Ours
--------------------------+------+----------+--------
Similarity (↑)            | Avg. | 0.612    | 0.701
                          | Med. | 0.600    | 0.722
Recovered (%, ↑)          |      | 40.00    | 48.33
Vina Score (kcal/mol, ↓)  | Avg. | -8.512   | -8.603
                          | Med. | -8.576   | -8.575

Linker prediction is the task of building a molecule that incorporates two given disconnected fragments in the context of a binding site [12]. Our model is capable of linker design without any task-specific adaptation or re-training. Specifically, given a binding site and the fragments as input, we compose the initial context C_0 containing both the binding site and the fragments. Then, we run the auto-regressive sampling algorithm to sequentially add atoms until the molecule is complete.

Data Preparation Following [12], we construct fragments of the molecules in the testing set by enumerating possible double cuts of acyclic single bonds. The pre-processing results in 120 data points in total, each consisting of two disconnected molecule fragments.

Baselines We compare our model with DeLinker [12]. Although DeLinker incorporates some 3D information, it is still a graph-based generative model. In contrast, our method operates fully in 3D space and is thus able to fully utilize the 3D context.


Figure 5: Two examples of linker prediction. Atoms highlighted in red are predicted linkers.

Metrics We assess the molecules generated from fragments with the following metrics: (1) Similarity: we use the Tanimoto similarity [32, 3] over Morgan fingerprints [14] to measure the similarity between the molecular graphs of the generated molecule and the reference molecule. (2) Percentage of Recovered Molecules: we say a test molecule is recovered if the model is able to generate a molecule that perfectly matches it (Similarity = 1.0), and we calculate the percentage of test molecules that are recovered by the model. (3) Binding Affinity: we use Vina [1, 33] to compute the generated molecules' binding affinity to the target.

Results For each data point, we use our model and DeLinker to generate 100 molecules. We first calculate the average similarity for each data point and report the overall mean and median values. Then, we calculate the percentage of test molecules that are successfully recovered by each model. Finally, we use Vina to evaluate the generated molecules' binding affinity. These results are summarized in Table 2. As shown in the table, when measured by Vina score, our proposed method's performance is on par with the graph-based baseline DeLinker. However, our method clearly outperforms DeLinker on Similarity and Percentage of Recovered Molecules, suggesting that it links fragments in a more realistic way. In addition, we present two examples, each with 5 generated molecules at different similarities, in Figure 5. The examples demonstrate the model's ability to generate suitable linkers.

5 Conclusions and Discussions

In this paper, we propose a new approach to structure-based drug design. Specifically, we design a 3D generative model that estimates the probability density of atom occurrences in 3D space and formulate an auto-regressive sampling algorithm. Combined with the sampling algorithm, the model is able to generate drug-like molecules for specific binding sites. Through extensive experiments, we demonstrate the model's effectiveness in designing molecules for specific targets. Although our proposed method achieves reasonable performance in structure-based molecule design, there is no guarantee that the model always generates valid molecules. To build a more robust and useful model, future work could incorporate graph representations into building 3D molecules, so that sophisticated techniques for generating valid molecular graphs, such as valency checks [35] and property optimization [14], can be leveraged.


References

[1] Amr Alhossary, Stephanus Daniel Handoko, Yuguang Mu, and Chee-Keong Kwoh. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics, 31(13):2214–2216, 2015.

[2] Amy C. Anderson. The process of structure-based drug design. Chemistry & Biology, 10(9):787–797, 2003. ISSN 1074-5521. doi: https://doi.org/10.1016/j.chembiol.2003.09.002. URL https://www.sciencedirect.com/science/article/pii/S1074552103001947.

[3] Dávid Bajusz, Anita Rácz, and Károly Héberger. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? Journal of Cheminformatics, 7(1):1–13, 2015.

[4] G Richard Bickerton, Gaia V Paolini, Jérémy Besnard, Sorel Muresan, and Andrew L Hopkins. Quantifying the chemical beauty of drugs. Nature Chemistry, 4(2):90–98, 2012.

[5] Esben Jannik Bjerrum and Richard Threlfall. Molecular generation with recurrent neural networks (RNNs). arXiv preprint arXiv:1705.04612, 2017.

[6] Peter Ertl and Ansgar Schuffenhauer. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of Cheminformatics, 1(1):1–11, 2009.

[7] Paul G Francoeur, Tomohide Masuda, Jocelyn Sunseri, Andrew Jia, Richard B Iovanisci, Ian Snyder, and David R Koes. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. Journal of Chemical Information and Modeling, 60(9):4200–4215, 2020.

[8] Niklas WA Gebauer, Michael Gastegger, and Kristof T Schütt. Symmetry-adapted generation of 3D point sets for the targeted discovery of molecules. arXiv preprint arXiv:1906.00957, 2019.

[9] Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. arXiv preprint arXiv:1406.2661, 2014.

[10] Rafael Gómez-Bombarelli, Jennifer N Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D Hirzel, Ryan P Adams, and Alán Aspuru-Guzik. Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 4(2):268–276, 2018. ISSN 2374-7943. doi: 10.1021/acscentsci.7b00572.

[11] Paul CD Hawkins. Conformation generation: the state of the art. Journal of Chemical Information and Modeling, 57(8):1747–1756, 2017.

[12] Fergus Imrie, Anthony R Bradley, Mihaela van der Schaar, and Charlotte M Deane. Deep generative models for 3D linker design. Journal of Chemical Information and Modeling, 60(4):1983–1995, 2020.

[13] Wengong Jin, Regina Barzilay, and Tommi Jaakkola. Junction tree variational autoencoder for molecular graph generation. In International Conference on Machine Learning, pages 2323–2332. PMLR, 2018.

[14] Wengong Jin, Regina Barzilay, and Tommi Jaakkola. Composing molecules with multiple property constraints. arXiv preprint arXiv:2002.03244, 2020.

[15] Wengong Jin, Jeremy Wohlwend, Regina Barzilay, and Tommi Jaakkola. Iterative refinement graph neural network for antibody sequence-structure co-design. arXiv preprint arXiv:2110.04624, 2021.

[16] Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.

[17] Matt J Kusner, Brooks Paige, and José Miguel Hernández-Lobato. Grammar variational autoencoder. In International Conference on Machine Learning, pages 1945–1954. PMLR, 2017.


[18] Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, and Peter Battaglia. Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324, 2018.

[19] Qi Liu, Miltiadis Allamanis, Marc Brockschmidt, and Alexander L Gaunt. Constrained graph variational autoencoders for molecule design. arXiv preprint arXiv:1805.09076, 2018.

[20] Tomohide Masuda, Matthew Ragoza, and David Ryan Koes. Generating 3D molecular structures conditional on a receptor binding site with deep generative models. arXiv preprint arXiv:2010.14442, 2020.

[21] Noel M O'Boyle, Michael Banck, Craig A James, Chris Morley, Tim Vandermeersch, and Geoffrey R Hutchison. Open Babel: An open chemical toolbox. Journal of Cheminformatics, 3(1):1–14, 2011.

[22] Pavel G Polishchuk, Timur I Madzhidov, and Alexandre Varnek. Estimation of the size of drug-like chemical space based on GDB-17 data. Journal of Computer-Aided Molecular Design, 27(8):675–679, 2013.

[23] Matthew Ragoza, Tomohide Masuda, and David Ryan Koes. Learning a continuous representation of 3D molecular structures with deep generative models. arXiv preprint arXiv:2010.08687, 2020.

[24] Anthony K Rappé, Carla J Casewit, KS Colwell, William A Goddard III, and W Mason Skiff. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. Journal of the American Chemical Society, 114(25):10024–10035, 1992.

[25] Kristof T Schütt, PJ Kindermans, Huziel E Sauceda, Stefan Chmiela, Alexandre Tkatchenko, and Klaus R Müller. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. In 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, pages 1–11, 2017.

[26] Marwin HS Segler, Thierry Kogej, Christian Tyrchan, and Mark P Waller. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Central Science, 4(1):120–131, 2018.

[27] Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang. GraphAF: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382, 2020.

[28] Gregor Simm, Robert Pinsler, and José Miguel Hernández-Lobato. Reinforcement learning for molecular design guided by quantum mechanics. In International Conference on Machine Learning, pages 8959–8969. PMLR, 2020.

[29] Gregor NC Simm, Robert Pinsler, Gábor Csányi, and José Miguel Hernández-Lobato. Symmetry-aware actor-critic for 3D molecular design. arXiv preprint arXiv:2011.12747, 2020.

[30] Miha Skalic, José Jiménez, Davide Sabbadin, and Gianni De Fabritiis. Shape-based generative modeling for de novo drug design. Journal of Chemical Information and Modeling, 59(3):1205–1214, 2019.

[31] Martin Steinegger and Johannes Söding. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, 35(11):1026–1028, 2017.

[32] Taffee T Tanimoto. Elementary mathematical theory of classification and prediction. 1958.

[33] Oleg Trott and Arthur J Olson. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2):455–461, 2010.

[34] David Weininger. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1):31–36, 1988.

[35] Jiaxuan You, Bowen Liu, Rex Ying, Vijay Pande, and Jure Leskovec. Graph convolutional policy network for goal-directed molecular graph generation. arXiv preprint arXiv:1806.02473, 2018.
