At the time of writing this paper, researchers around the globe are searching for a vaccine or an effective treatment against the 2019 novel coronavirus (SARS-CoV-2). One strategy to limit virus replication is protease inhibition. A biomolecule called ligand binds to a virus protease enzyme and inhibits its functional properties. For SARS-CoV-2 the crystal structure of its main protease Mpro has been solved, e.g. by Jin et al. 
. The search for a valid protease inhibitor can be expressed as optimization problem. As not only the binding of the ligand is an important objective, but also further properties like drug-likeliness or filter properties, we comprise the molecule search problem as multi-objective optimization problem, which we aim to solve with evolutionary algorithms.
This paper is structured as follows. In Section 2 we shortly repeat the basics of protease inhibition and the connection to the novel coronavirus. Section 3 gives an overview of related work on evolutionary molecule design.In Section 4 we introduce molecule metrics, which we aim to optimize with the EMOA that is presented in Section 5. The experimental part in Section 6 presents our experimental results and discusses the evolved molecules. Conclusions are drawn in Section 7, where also prospective future research directions are presented.
2 Virus Protease Inhibition
As of late 2019, a novel respiratory disease named COVID-19 spread worldwide. COVID-19 is caused by SARS-CoV-2, which belongs to the coronavirus family like the well-known severe acute respiratory syndrome coronavirus (SARS-CoV). As RNA virus SARS-CoV-2’s replication mechanism hijacks the cell mechanisms for replication. An essential part of the virus replication process is a cleavage process, in which the virus protease enzyme cuts long precursor polyproteins into mature non-structural proteins, see Fig. 1. If a ligand biomolecule binds to the protease it can prevent and inhibit this cleavage process. A ligand binds to the target protein in a so-called pocket based on various non-covalent interactions like hydrophobic interactions, hydrogen bonding, -stacking, salt bridges, and amide stacking . With the proper ligand, the protease cleavage process is inhibited, in practice measured by the half maximal inhibitory concentration IC corresponding to the inhibitory substance quantity needed to inhibit of the protease process. The protease inhibitor is the target of the drug design process, which we aim to find with evolutionary search.
Computational modeling of protein-ligand binding is a complex process depending on protein-ligand geometry, chemical interactions as well as various constraints and properties like hydration and quantum effects. Complex molecular dynamics computations are often too expensive in computational drug design. Instead, docking tools like AutoDock , see Section 4
, are supposed to be sufficient for a coarse binding affinity estimation based on a simplification of the physical reality.
3 Related Work
Methods for de novo drug design can be categorized in different ways [10, 5]. Some works construct molecules directly from atoms [11, 28], while others use chemical fragments as their smallest building block . The goal also varies among publications and is sometimes to find drugs that bind to a specific protein binding site like in our work or [29, 35], and other times the goal is to generate any drug-like molecules as in [11, 30].
is a fragment-based method that optimizes for molecule to bind to a specific binding site using a genetic algorithm on an acyclic graph-representation consisting of chemical fragments. The fitness of a resulting compound is evaluated through a docking simulation with a target protein binding site and common drug-likeness indicators.
On the other hand, Douguet et al.  use the SMILIES representation and as such work on the level of atoms instead of fragments. In contrast to our work their genetic algorithm optimizes for drug-likeness only instead of binding to a specific ligand. Furthermore, the algorithm is single-objective and simply weighs the different properties in a fitness function using constant coefficients.
Similarly, Nigam et al.  present a genetic algorithm on the SMILIES
representation for general molecule design. The method increases diversity by using a deep neural network as an adaptive fitness function to penalize long-surviving molecules. In contrast to methods like ours that try to stay inside the distribution of drug-like molecules, the genetic algorithm is free to explore the chemical space in its entirety.
Finally, LigBuilder  is a software tool for drug design that is based on a genetic algorithm. It allows optimizing for the interesting quality of binding to multiple targets, which enables tackling more complex diseases with a single drug without the risk of drug-drug interactions that comes with combination drugs (treatment with multiple compounds).
Brown et al.  have utilized an approach for multi-objective optimization of molecules applying a graph-based representation of molecules. The multi-objective evolutionary algorithm applies a Pareto ranking scheme for the optimization process.
Wagner et al.  have developed a tool which identifies potential CNS drugs by means of a multi-objective optimization. The molecules have been optimized for six physical properties. In contrast to the approach presented here, this tool is not based on evolutionary algorithms but on medical knowledge.
A short review, which focuses on the multi-objective optimization of drugs, is given in . In this context, different problem definitions and various Multi-objective optimization methods are summarized.
have emerged. These propose fixed datasets and metrics to measure and compare the generative abilities of different algorithms. Among the benchmarked algorithms are random sampling, variational autoencoders, generative adversarial networks, Monte-Carlo tree search, and others.
4 Molecule Design Metrics
In computational drug design, molecule metrics define the optimization objectives. This section introduces the five metrics our optimization approach is based on.
4.1 Binding Affinity Scores
The major objective in protease inhibitor search is the protein-ligand binding affinity. A widespread tool for this metric is the automated docking tool AutoDock , which will also be used by the OpenPandamics111https://www.ibm.org/OpenPandemics activities to fight COVID-19. AutoDock performs very fast calculations of the binding energy by using grid-based look-up tables. For this purpose, the protein is embedded in a grid. The binding energy of all individual atoms of the ligand is calculated at all positions of the grid using semi-empirical force field methods. Using a Lamarckian genetic algorithm, the best binding position and binding energy of the complete ligand can be determined with the help of the look-up tables.
Through various improvements, the accuracy and especially the performance of AutoDock has been significantly improved. In AutoDock Vina  a hybrid scoring function based on empirical and knowledge-based data is used instead of the force field method. QuickVina  and QuickVina 2  mainly improve the search algorithm by performing the most complex part of the optimization only for very promising ligand positions. We use QuickVina 2 for the calculation of the binding energies of our proposed ligands, as it provides very good results at high performance. For the sake of simplicity we will use binding affinity score and docking score synonymously.
The informative value of QuickVina 2 binding scores may be limited due to a simplification of various physical properties, such as the neglect of water molecules and the changing electrical properties of ligand and protein when they interact with each other. However, it has been shown by Gaillard  that AutoDock Vina binding scores outperform various computational docking methods and Quickvina 2 achieves very comparable results with Autodock Vina .
4.2 Quantitative Estimate of Drug-likeness (QED)
To estimate whether a molecule can be used as a drug, its similarity to other existing drugs can be considered. This is based on the fact that many important physiochemical properties of drugs follow a certain distribution. Lipinski’s rule of five  which specifies ranges of values for different molecular properties such as size, is frequently used. A major disadvantage, however, is that this rule is only a rule of thumb and only checks whether its criteria are met or not. Among modern drugs there are molecules that violate more than one of Lipinski’s rules. A modern approach by Bickerton et al.  is based on multi-criteria optimization and the principle of desirability. Instead of a fixed value range, all relevant molecular properties are evaluated by an individual desirability function. A single score (QED) is then determined by geometrically averaging all desirability functions.
4.3 Natural Product-likeness (NP)
In addition to the similarity to known drugs, the similarity to naturally occurring biomolecules (natural products) is also an important metric. Natural products have numerous bioactive structures that were created and validated by nature in an evolutionary process. Ertl et al.  have studied the key differentiating features of natural and synthetic molecules and developed a measure of similarity to natural products. This score is based on structural characteristics of the molecules, such as the number of aromatic rings and the distribution of nitrogen and oxygen atoms.
4.4 Medical Chemical Filters
Medical chemical filters are used to exclude molecules that are toxic due to their structural nature. Potentially unstable molecules whose metabolites may be toxic are also not suitable as drugs. We use the MCFs and PAINS filters described by Polykovskiy  as a Boolean indicator metric.
4.5 Synthetic Accessibility (SA)
For drug design it is not only important to find a molecule with the desired properties, but also a synthesizable one. Ertl and Schuffenhauer  create a method to estimate the synthetic accessibility of drug-like molecules and achieve a high agreement with manual estimations by experts. Such a method can easily be incorporated into a search process and we use it as one of our optimization goals, too. A different approach to the synthesis problem is taken by Segler et al. . Instead of estimating synthetic accessibility, their symbolic AI driven approach searches for actual synthesis routes of desired target molecules with a combination of Monte Carlo tree search and neural networks encoding rules for reaction centers.
4.6 Value Ranges
|docking score [kcal/mol]||SA||QED||NP||filters|
5 Evolutionary Molecule Search
This section presents the evolutionary approach for the protease inhibitor design. For searching in the design space of biomolecules we use evolutionary algorithms (EAs), which are biologically inspired population-based search heuristics. We employ the evolution strategy orientedpopulation model .
A solution is defined by a string based on the self-referencing embedded strings (SELFIES) representation , which is an advancement of the simplified molecular-input line-entry system (SMILIES)  representation. Figure 2 pictures an exemplary molecule with its structural formula and the corresponding SMILIES and SELFIES representations. Each string consist of symbols, encoding the occurring atoms, bindings, branches and ring sizes. SELFIES implements a formal grammar, and the interpretation of a symbol depends on derivation rules and state of derivation. In contrast to SMILIES, SELFIES strings are always syntactically correct and therefore always yield valid molecules .
The EA’s initial population consists of individuals with randomly generated strings representation of a fixed length. Since multiple SELFIES strings can be translated to the same SMILIES string, the resulting SMILIES string is compared to a global list of all previously generated individuals. Individuals with a representation that already occurred are discarded and a new individual is generated. This process is repeated until the population consist of unique individuals and also applies for the generation of offspring individuals.
Since every SELFIES string corresponds to a valid molecule and every molecule can be expressed in SELFIES representation, the design space can be explored by applying random mutations to the strings – more precisely the SELFIES
symbols of which the string is composed. Offspring solutions are created by choosing a random individual from the parental population. Each child is mutated with the following mutation operations with defined probabilities:
is applied independently for every symbol with a probability of . The symbol is replaced by a random SELFIES symbol.
is applied with probability . A random symbol is inserted at a random position in the individual’s representation.
is applied with probability and deletes a randomly chosen symbol of the individual’s representation.
The new symbols are drawn from a set of symbols inspired by . This set has been extended with benzene as a separate, composed symbol, to increase the likelihood of its occurrence and ease the generation of complex molecules. Additionally, each symbol is assigned a weighting parameter to adjust the probability with which the is is randomly selected. This weighting can be used to increase the likelihood of more common symbols (e.g. [C]) in contrast to more complex ones (e.g. branches and ring structures).
5.2 Fitness Evaluation
For the selection operator the fitness of each solution candidate is evaluated based on the molecule metrics binding affinity score, QED, filters, NP, and SA introduced in Section 4. To increase the comparability, each metric is scaled to the range between 0 (best possible score) and 1 (worst possible score). The binding affinities are scaled with regard to the experimentally chosen minimum of and maximum of and clipped to the range between 0 and 1 with soft clipping .
For the single-objective baseline experiments each individual is assigned a single composed fitness value. We use a weighted sum fitness of the introduced metrics:
with weights with corresponding to 1: docking, 2: SA, 3: QED, 4: NP, and 5: filters. The choice of weights is based on preliminary experiment with the objective of putting the highest attention on the docking score, while at the same time considering the other properties.
Individuals are evaluated in parallel. Therefore, the respective SELFIES strings are converted to the SMILIES representation. MOSES is then used for the calculation of QED, NP, and SA as well as for the application of the PAINS and MCF filters. The docking score for each compound is determined by QuickVina 2. Therefore, RDKit222https://www.rdkit.org and MGLTools333http://mgltools.scripps.edu are used to generate PDB and PDBQT files for the respective SMILIES representation. The binding energy is calculated in regards to the COVID-19 Mpro (PDB ID: 6LU7 )444PDB: protein data base, https://www.rcsb.org with the search grid being centered around the native ligand position and sized to . The exhaustiveness is maintained at its default value of 8.
The objectives presented in Section 4 may be contradictory. For example, in preliminary experiments, we discovered that molecules with high AutoDock binding scores suffer from low QED scores. As the choice of predefined weights for objectives is difficult in advance, a multi-objective approach may be preferable in practice. In our multi-objective optimization setting in molecule space with fitness functions to minimize we seek for a Pareto set of non-dominated solutions, where means dominates , i.e., , while . NSGA-II  is known to be able of approximating a Pareto set with a broad distribution of solutions in objective space, i.e., of the Pareto front. After non-dominated sorting, non-dominated solutions maximizing the crowding distance, which corresponds to the sum of Manhattan distances between the neighboring solutions in objective space. For comparison of different multi-objective runs we also employ the S-metric measuring the dominated hypervolume in objective space with regards to a dominated reference point .
In this section we experimentally analyze the single-objective and the NSGA-II approaches for the protease inhibitor candidate search. For the experimental analyses, the following settings are applied. A -EA is used for the single-objective run i.e., in each generation from 10 parents 100 offspring candidate molecules are generated with the mutation operators introduced in Section 5.1 with mutation probabilities , , and applying plus selection. For multi-objective runs the number of parents is increased to 20 to achieve a broader distribution of solutions in objective space. No crossover is applied. Individuals are limited to a length of 80 SELFIES tokens oriented to the setting by Krenn et al. . All runs are terminated after 200 generations and are repeated 20 times.
6.1 Metric Development
Figure 3 shows the development of the previously explained normalized metrics in single- and multi-objective runs. For the single-objective runs, the best individuals according to fitness are chosen in each generation and their metrics are averaged over all runs. The optimization process concentrates on improving docking score, QED, and NP. As expected, an improvement of one metric may result in a deterioration of another, e.g., as of generation 140, when QED and NP deteriorate in favor of SA and docking score.
For multi-objective runs, the best individuals for each metric are chosen in each generation and then averaged over all runs. A steady improvement with regard to all objectives is achieved here, but has to be paid with regard to deteriorations in other objectives that are not shown here.
Figure 4 shows three different two-dimensional slices of the Pareto front that compare docking score to QED, NP, and SA. A Pareto front is shown for every 10th generation and their colors start at light blue for the first generation and end at dark blue for the final generation. The plots illustrate NSGA-II’s ability to generate solutions with different degrees of balance between docking score and the plotted metric.
In the course of the optimization process the front of non-dominated solutions has the expected tendency to move towards the lower left. This is also reflected by the S-metric, which, in average over all runs improves from in the first to in the last generation. In the slice plots deteriorations are possible due to improvements in the remaining three objectives.
A comparison of final experimental results of the single-objective and NSGA-II runs is presented in Table 2. For NSGA-II the best achieved values for each objective are shown corresponding to the corner points of the Pareto front approximation. For comparison, corresponding metric values are shown for N3 proposed as ligand in the PDB database as well as for Lopinavir, the HIV main protease inhibitor . Docking scores achieved by the single objective optimization process show that the best values even overcome the scores of N3 and Lopinavir. Lopinavir and N3 bind similarly strong to Mpro. NSGA-II achieves promising values for all metrics. The broad coverage of objective function values offers the practitioner a huge variety of interesting candidates. However, some of the extreme metric values may sometimes be unpractical, e.g., the outstanding docking score of the best NSGA-II molecule (docking score ) has been achieved by a chemically unrealistic candidate.
From our observations we conclude that the SELFIES representation with our mutation operators are able to robustly achieve molecules of a certain quality. However, we expect the quality of the results to improve with mechanisms that allow the development of larger molecules to overcome fitness plateaus and local optima. Figure 5 compares the populations of the last generation of a typical single-objective and NSGA-II run. The solutions in the single-objective population are similar to each other, while the solutions in the last NSGA-II population maintain a higher diversity of molecule properties.
6.2 Candidate Comparison
In the following we present interesting protein inhibitor candidates evolved with the single- and multi-objective approaches. In our experiments we made three main observations. The molecules generated have a strong tendency to contain aromatic ring structures. Candidates with good drug-likeliness are comparatively short. Candidates with high docking scores often have unrealistic geometries.
In Figure 6 we present a list of six promising protease inhibitors (PI) candidates with properties as radar plots, structural formulas, and chemical names. PI-I (a) to PI-III (c) are results from single-objective runs, while PI-IV (c) to PI-VI (f) show candidates generated by NSGA-II. Points near the border of the radar plot represent better values, e.g., a zero value lies at the corner of a plot. All candidates fulfill the filter condition. PI-1 achieves a high SA value with a reasonable docking score. PI-II achieves an excellent docking score with kcal/mol. PI-III, PI-IV, and PI-VI achieve excellent drug-likeliness QED with good docking results around kcal/mol. An interesting candidate balancing all objectives is PI-V with docking score kcal/mol and QED value of .
Last, we visualize how the ligand candidates are located in the Mpro protein pocket optimized by QuickVina 2. Figure 7 shows candidates (a) PI-I and (b) PI-V in their Mpro pockets.
In this paper we introduced an evolutionary multi-objective approach to evolve protein inhibitor candidates for the Mpro of SARS-CoV-2, which could be a starting point for drug design attempts, aiming at optimizing the QuickVina 2-based protein-ligand binding scores and further important objectives like QED drug-likeliness and filter properties. In the experimental part we have shown that the evolutionary processes are able to evolve interesting inhibitor candidates. Many of them achieve promising metrics with ordinary structures, but also unconventional candidates have been evolved that may be worth for a deeper analysis. As the informative value of QuickVina 2 binding scores and also the further metrics may be limited in practice, we understand our approach as AI-assisted virtual screening of the chemical biomolecule space.
Future research will focus on the improvement of protein-ligand models for more detailed and more efficient binding affinity models. Further, we see potential to improve the SELFIES representation in terms of bloated strings that represent comparatively small molecules and mechanisms to guarantee their validity.
We would like to thank Ahmad Reza Mehdipour, Max Planck Institute of Biophysics, Frankfurt, Germany, for useful suggestions and comments that helped to improve this manuscript.
-  Alhossary, A., Handoko, S.D., Mu, Y., Kwoh, C.K.: Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 31(13), 2214–2216 (2015). DOI 10.1093/bioinformatics/btv082
-  Beume, N., Naujoks, B., Emmerich, M.: SMS-EMOA: Multiobjective selection based on dominated hypervolume. European Journal of Operational Research 181(3), 1653–1669 (2007). DOI 10.1016/j.ejor.2006.08.008
-  Beyer, H.G., Schwefel, H.P.: Evolution strategies – A comprehensive introduction. Natural Computing 1(1), 3–52 (2002). DOI 10.1023/A:1015059928466
-  Bickerton, G.R., Paolini, G.V., Besnard, J., Muresan, S., Hopkins, A.L.: Quantifying the chemical beauty of drugs. Nature Chemistry 4(2), 90–98 (2012). DOI 10.1038/nchem.1243
-  Brown, N., Fiscato, M., Segler, M.H., Vaucher, A.C.: GuacaMol: Benchmarking Models for de Novo Molecular Design. Journal of Chemical Information and Modeling 59(3), 1096–1108 (2019). DOI 10.1021/acs.jcim.8b00839
-  Brown, N., McKay, B., Gilardoni, F., Gasteiger, J.: A Graph-Based Genetic Algorithm and Its Application to the Multiobjective Evolution of Median Molecules. Journal of Chemical Information and Computer Sciences 44(3), 1079–1087 (2004). DOI 10.1021/ci034290p
-  Caly, L., Druce, J.D., Catton, M.G., Jans, D.A., Wagstaff, K.M.: The FDA-approved Drug Ivermectin inhibits the replication of SARS-CoV-2 in vitro. Antiviral Research p. 104787 (2020). DOI 10.1016/j.antiviral.2020.104787
-  Dai, W., Zhang, B., Su, H., Li, J., Zhao, Y., Xie, X., Jin, Z., Liu, F., Li, C., Li, Y., Bai, F., Wang, H., Cheng, X., Cen, X., Hu, S., Yang, X., Wang, J., Liu, X., Xiao, G., Jiang, H., Rao, Z., Zhang, L.K., Xu, Y., Yang, H., Liu, H.: Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science (2020). DOI 10.1126/science.abb4489
Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist
multiobjective genetic algorithm: NSGA-II.
IEEE Transactions on Evolutionary Computation6(2), 182–197 (2002). DOI 10.1109/4235.996017
-  Devi, R.V., Sathya, S.S., Coumar, M.S.: Evolutionary algorithms for de novo drug design – A survey. Applied Soft Computing 27, 543–552 (2015). DOI 10.1016/j.asoc.2014.09.042
-  Douguet, D., Thoreau, E., Grassy, G.: A genetic algorithm for the automated generation of small organic molecules: Drug design using an evolutionary algorithm. Journal of Computer-Aided Molecular Design 14(5), 449–466 (2000). DOI 10.1023/A:1008108423895
-  Ertl, P., Roggo, S., Schuffenhauer, A.: Natural Product-likeness Score and Its Application for Prioritization of Compound Libraries. Journal of Chemical Information and Modeling 48(1), 68–74 (2008). DOI 10.1021/ci700286x
-  Ertl, P., Schuffenhauer, A.: Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of Cheminformatics 1(1), 8 (2009). DOI 10.1186/1758-2946-1-8
-  Fischer, A., Sellner, M., Neranjan, S., Lill, M.A., Smieško, M.: Inhibitors for Novel Coronavirus Protease Identified by Virtual Screening of 687 Million Compounds (2020). DOI 10.26434/chemrxiv.11923239.v1
-  de Freitas, R.F., Schapira, M.: A systematic analysis of atomic protein–ligand interactions in the PDB. MedChemComm 8(10), 1970–1981 (2017). DOI 10.1039/C7MD00381A
-  Gaillard, T.: Evaluation of AutoDock and AutoDock Vina on the CASF-2013 Benchmark. Journal of Chemical Information and Modeling 58(8), 1697–1706 (2018). DOI 10.1021/acs.jcim.8b00312
-  Jin, Z., Du, X., Xu, Y., Deng, Y., Liu, M., Zhao, Y., Zhang, B., Li, X., Zhang, L., Peng, C., Duan, Y., Yu, J., Wang, L., Yang, K., Liu, F., Jiang, R., Yang, X., You, T., Liu, X., Yang, X., Bai, F., Liu, H., Liu, X., Guddat, L.W., Xu, W., Xiao, G., Qin, C., Shi, Z., Jiang, H., Rao, Z., Yang, H.: Structure of M pro from COVID-19 virus and discovery of its inhibitors. Nature pp. 1–9 (2020). DOI 10.1038/s41586-020-2223-y
-  Kaplan, S.S., Hicks, C.B.: Safety and antiviral activity of lopinavir/ritonavir-based therapy in human immunodeficiency virus type 1 (HIV-1) infection. Journal of Antimicrobial Chemotherapy 56(2), 273–276 (2005). DOI 10.1093/jac/dki209
-  Khaerunnisa, S., Kurniawan, H., Awaluddin, R., Suhartati, S., Soetjipto, S.: Potential Inhibitor of COVID-19 Main Protease (Mpro) From Several Medicinal Plant Compounds by Molecular Docking Study (2020). DOI 10.20944/preprints202003.0226.v1
-  Klimek, M.D., Perelstein, M.: Neural Network-Based Approach to Phase Space Integration. arXiv:1810.11509 [hep-ex, physics:hep-ph, physics:physics, stat] (2018)
-  Krenn, M., Häse, F., Nigam, A., Friederich, P., Aspuru-Guzik, A.: Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation. arXiv:1905.13741 [physics, physics:quant-ph, stat] (2020)
-  Lipinski, C.A., Lombardo, F., Dominy, B.W., Feeney, P.J.: Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced Drug Delivery Reviews 23(1), 3–25 (1997). DOI 10.1016/S0169-409X(96)00423-1
-  Liu, X., Zhang, B., Jin, Z., Yang, H., Rao, Z.: 6LU7: The crystal structure of COVID-19 main protease in complex with an inhibitor N3. https://www.rcsb.org/structure/6lu7 (2020)
-  Macchiagodena, M., Pagliai, M., Procacci, P.: Inhibition of the Main Protease 3CL-pro of the Coronavirus Disease 19 via Structure-Based Ligand Design and Molecular Modeling. arXiv:2002.09937 [q-bio] (2020)
-  Morris, G.M., Huey, R., Lindstrom, W., Sanner, M.F., Belew, R.K., Goodsell, D.S., Olson, A.J.: AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal of Computational Chemistry 30(16), 2785–2791 (2009). DOI 10.1002/jcc.21256. _eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/jcc.21256
-  Nguyen, H., Case, D.A., Rose, A.S.: NGLview–interactive molecular graphics for Jupyter notebooks. Bioinformatics 34(7), 1241–1242 (2018). DOI 10.1093/bioinformatics/btx789
-  Nicolaou, C.A., Brown, N.: Multi-objective optimization methods in drug design. Drug Discovery Today: Technologies 10(3), e427–e435 (2013). DOI 10.1016/j.ddtec.2013.02.001
-  Nigam, A., Friederich, P., Krenn, M., Aspuru-Guzik, A.: Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space. arXiv:1909.11655 [physics] (2020)
-  Pegg, S.C.H., Haresco, J.J., Kuntz, I.D.: A genetic algorithm for structure-based de novo design. Journal of Computer-Aided Molecular Design 15(10), 911–933 (2001). DOI 10.1023/A:1014389729000
-  Polykovskiy, D., Zhebrak, A., Sanchez-Lengeling, B., Golovanov, S., Tatanov, O., Belyaev, S., Kurbanov, R., Artamonov, A., Aladinskiy, V., Veselov, M., Kadurin, A., Johansson, S., Chen, H., Nikolenko, S., Aspuru-Guzik, A., Zhavoronkov, A.: Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. arXiv:1811.12823 [cs, stat] (2019)
-  Segler, M.H.S., Preuss, M., Waller, M.P.: Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555(7698), 604–610 (2018). DOI 10.1038/nature25978
-  Trott, O., Olson, A.J.: AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry 31(2), 455–461 (2010). DOI 10.1002/jcc.21334. _eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/jcc.21334
-  Wager, T.T., Hou, X., Verhoest, P.R., Villalobos, A.: Central Nervous System Multiparameter Optimization Desirability: Application in Drug Discovery. ACS Chemical Neuroscience 7(6), 767–775 (2016). DOI 10.1021/acschemneuro.6b00029
-  Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences 28(1), 31–36 (1988). DOI 10.1021/ci00057a005
-  Yuan, Y., Pei, J., Lai, L.: LigBuilder V3: A Multi-Target de novo Drug Design Approach. Frontiers in Chemistry 8 (2020). DOI 10.3389/fchem.2020.00142
-  Zhang, L., Lin, D., Sun, X., Curth, U., Drosten, C., Sauerhering, L., Becker, S., Rox, K., Hilgenfeld, R.: Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved -ketoamide inhibitors. Science 368(6489), 409–412 (2020). DOI 10.1126/science.abb3405