A Unified View of Relational Deep Learning for Polypharmacy Side Effect, Combination Synergy, and Drug-Drug Interaction Prediction

by Benedek Rozemberczki, et al.

In recent years, numerous machine learning models which attempt to solve polypharmacy side effect identification, drug-drug interaction prediction and combination therapy design tasks have been proposed. Here, we present a unified theoretical view of relational machine learning models which can address these tasks. We provide fundamental definitions, compare existing model architectures and discuss performance metrics, datasets and evaluation protocols. In addition, we emphasize possible high impact applications and important future research directions in this domain.




1 Introduction

Relational deep learning has an unprecedented potential for revolutionizing the drug discovery process and pharmaceutical industry [19]. A number of high value use cases for relational deep learning in the pharmaceutical domain involve answering questions about what happens when two drugs are administered at the same time. These potential applications might want to answer questions such as: Will a combination of two drugs be more effective at destroying a specific type of lung cancer cells [46]? Is there an unexpected (polypharmacy) side effect [76] of using these two drugs together? Is there an unwanted chemical interaction [57] that these drug molecules can have?

All of these previously mentioned questions can be answered by what we see as drug pair scoring, a machine learning task that involves a set of drugs and the task of predicting the behaviour of pairs in a specific context of interest. Given an incomplete database of drug pairs, drug administration contexts and outcomes, the goal is to train a model to accurately make probabilistic predictions for unseen entries. The reasons for answering these questions via algorithmic methods are multi-fold. Firstly, testing all drug pairs in all of the contexts is not feasible due to time and financial constraints such as drug prices and labour costs [46]. Secondly, certain pair scoring tasks such as polypharmacy side effect prediction can only be validated in human-based trials. Finally, laboratory testing of drug pairs is prone to human errors [35].

Traditional supervised machine learning methods which solve the drug pair scoring task use handcrafted molecular features to predict the outcome of administering the drugs together in a specific context [53, 10]. Another group of techniques uses an unsupervised approach which diffuses the profile of the drug pair on a heterogeneous biological graph [74, 33, 27] in order to find potential polypharmacy, synergy or interaction indications. Deep learning techniques which solve the drug pair scoring task can be seen as a fusion and extension of these traditional methods. Such models first generate drug representations based either on molecular structure or the heterogeneous graph based neighbourhood context. In the second optional step, these representations are propagated in the biological graph and aggregated. Finally, drug pair representations are formed and probability scores are output in the specific drug administration contexts. We present a high level summary of the drug pair scoring task in Figure 1.

Figure 1: Drug-drug interaction, polypharmacy side effect and pair combination therapy design prediction tasks follow the same template. Given a pair of drugs (denoted with dashed rectangles on the right) from a set of drugs the task is to predict an outcome in a specific context (dashed rectangle on the left). Relational machine learning models which solve these tasks can exploit the molecular features of these compounds, knowledge graph based neighbourhoods of drugs or both of these.

Our main contributions can be summarized as:

  1. We provide a formal unification of drug-drug interaction, polypharmacy side effect and synergistic drug combination prediction tasks.

  2. We present an overview on the design and architecture of relational machine learning models which can address these predictive tasks.

  3. We highlight the publicly available datasets used to train and test the models on these tasks and survey the literature for the most commonly used evaluation metrics.

  4. We review the most important applications of these machine learning techniques and discuss directions for future research in the domain.

The remainder of this survey is structured as follows. In Section 2 we establish the foundations of a unified view of discriminative machine learning tasks defined on pairs of drugs. Section 3 discusses the architectural details of models that can solve these tasks. The evaluation metrics, protocols and datasets used in the literature are detailed in Section 4. Several key application areas are highlighted in Section 5. We discuss the limitations of current approaches and future research directions in Section 6. The paper concludes with Section 7. The survey is supported by a collection of relevant works under the https://github.com/AstraZeneca/polypharmacy-ddi-synergy-survey repository.

2 Background

Our discussion of drug pair scoring models requires the introduction of a drug set that describes compounds of interest and a context set that contains contexts where two drugs are used in a pair combination.

Definition 1.

Labeled drug pair. A labeled drug pair defined on drug set $\mathcal{D}$ and context set $\mathcal{C}$ is the tuple $(d, d', c, y^{d,d',c})$, where the binary indicator $y^{d,d',c} \in \{0, 1\}$ is the outcome for drug pair $(d, d')$ in context $c$.

A labeled drug pair is a known fact about the drug pair having an effect in a context such as a specific polypharmacy side effect, interaction or synergistic relationship at treating a disease. The purpose of pair scoring models is to learn from these tuples to predict the labels for unlabeled drug pairs and contexts.

Definition 2.

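Database of labeled drug pairs. A database of labeled drug pairs defined on drug and context sets $\mathcal{D}$ and $\mathcal{C}$ is the set $\mathcal{Y}$ containing labeled drug pairs $(d, d', c, y^{d,d',c})$ where $d, d' \in \mathcal{D}$ and $c \in \mathcal{C}$.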

Pair scoring models are trained on databases of labeled drug pairs and the trained models are used to predict the label of pairs for which we do not know the outcome in certain contexts.
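As a minimal sketch of Definitions 1 and 2, a database of labeled drug pairs can be held as a list of tuples. The class name, the helper `lookup` and all drug and context identifiers below are hypothetical placeholders chosen for illustration, not names taken from any dataset.

```python
from typing import NamedTuple, Optional


class LabeledDrugPair(NamedTuple):
    """One database entry: two drugs, a context and a binary outcome."""
    drug_a: str
    drug_b: str
    context: str
    label: int  # 1 if the outcome was observed in this context, 0 otherwise


# Hypothetical toy entries; the drug and context names are placeholders.
database = [
    LabeledDrugPair("drug_1", "drug_2", "side_effect_x", 1),
    LabeledDrugPair("drug_1", "drug_3", "side_effect_x", 0),
    LabeledDrugPair("drug_2", "drug_3", "side_effect_y", 1),
]

# The drug and context sets on which the database is defined.
drug_set = {d for entry in database for d in (entry.drug_a, entry.drug_b)}
context_set = {entry.context for entry in database}


def lookup(drug_a: str, drug_b: str, context: str) -> Optional[int]:
    """Return the known label, or None when the entry is unlabeled."""
    for entry in database:
        if (entry.drug_a, entry.drug_b, entry.context) == (drug_a, drug_b, context):
            return entry.label
    return None
```

Entries for which `lookup` returns `None` are the unlabeled pair and context combinations that a trained pair scoring model is expected to score.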

Definition 3.

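Heterogeneous interaction graph with drug entities. We denote with $\mathcal{G} = (\mathcal{E}, \mathcal{R})$ the heterogeneous interaction graph with drug entities, where $\mathcal{E}$ and $\mathcal{R}$ are the entity and relation sets, it holds that the drug set $\mathcal{D} \subseteq \mathcal{E}$, and $\mathcal{R}$ is formed by typed edges of the form $(e, r, e')$ with $e, e' \in \mathcal{E}$.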

We consider a heterogeneous graph where the drug set is a subset of the vertex set. This definition of heterogeneous (biological) knowledge graph helps to create knowledge graph based representation for the compounds of interest.

Definition 4.

Neighbourhood encoder. A neighbourhood encoder is the function $h_d = f_N(\{z_e : e \in \mathcal{N}(d)\})$, where $z_e$ is a parametric vector representation of entity $e \in \mathcal{E}$ and $\mathcal{N}(d)$ is a neighbourhood set.

The neighbourhood encoder function [20] creates a vector representation of drug vertices of the graph based on the aggregation of trainable parameter vectors in the neighbourhood of the source node. Neighbourhoods of a drug can be defined based on arbitrary notions of proximity and the aggregation itself could be a parametric transformation.
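The aggregation described above can be sketched as follows, assuming the simplest possible choice of a mean over the neighbourhood. The toy graph, the entity names and the embedding dimension are hypothetical, and the random initialisation only stands in for parameters that would be learned during training.

```python
import random

random.seed(0)
DIM = 4

# Hypothetical toy graph: each drug maps to its neighbouring entities.
neighbourhood = {
    "drug_1": ["protein_a", "protein_b"],
    "drug_2": ["protein_b", "disease_x"],
}

# One trainable parameter vector per entity (randomly initialised here).
entities = {e for nbrs in neighbourhood.values() for e in nbrs}
parameters = {e: [random.gauss(0.0, 1.0) for _ in range(DIM)] for e in sorted(entities)}


def neighbourhood_encoder(drug):
    """Mean-aggregate the parameter vectors in the neighbourhood of a drug."""
    vectors = [parameters[e] for e in neighbourhood[drug]]
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

Because the parameter vectors are tied to specific entities, an encoder of this kind can only represent drugs whose neighbourhood consists of entities seen during training.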

Definition 5.

Molecular encoder. A molecular encoder is the function $h_d = f_M(x_d)$, parametrized by $\Theta_M$, where $h_d$ is the learned vector representation and $x_d$ is a generic notation of molecular features describing the drug $d \in \mathcal{D}$.

A molecular encoder is a neural network which generates a vector representation from the features of the molecule - these molecular features can be derived from generic features (e.g. hydrophilicity), a string representation, molecular graph or geometry.

Definition 6.

Neighbourhood informed molecular encoder. This encoder is the function $f_N \circ f_M$ where $f_M$ and $f_N$ are molecular and neighbourhood encoders respectively.

This encoder combines the layers described in Definitions 4 and 5. It is essentially a neighbourhood encoder parametrized by representations output by a molecular encoder – molecular representations learned by the molecular encoder are aggregated in the neighbourhood of source drug nodes in the knowledge graph which has drug entities.
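A minimal sketch of this composition is given below. It assumes that a single linear layer stands in for the molecular encoder, that aggregation is a mean, and that the source drug is included in its own neighbourhood; these choices, the feature values and the names are all hypothetical.

```python
def molecular_encoder(features, weights):
    """A single linear layer standing in for a neural molecular encoder."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def neighbourhood_informed_encoder(drug, neighbours, features, weights):
    """Mean of the molecular representations of the drug and its drug neighbours."""
    members = [drug] + neighbours[drug]
    encoded = [molecular_encoder(features[m], weights) for m in members]
    return [sum(column) / len(encoded) for column in zip(*encoded)]


# Hypothetical toy inputs: two molecular features per drug, a 2x2 weight matrix.
features = {"drug_1": [1.0, 0.0], "drug_2": [0.0, 1.0], "drug_3": [1.0, 1.0]}
neighbours = {"drug_1": ["drug_2", "drug_3"]}
weights = [[1.0, 0.0], [0.0, 2.0]]

representation = neighbourhood_informed_encoder("drug_1", neighbours, features, weights)
```

Since the aggregated vectors are computed from molecular features rather than looked up per entity, this kind of encoder can in principle be applied to a drug that was not present during training.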

Definition 7.

Molecular representation combiner. Given the drugs $d, d' \in \mathcal{D}$ with vector representations $h_d$ and $h_{d'}$, the molecular representation combiner is the function $h_{d,d'} = f_C(h_d, h_{d'})$ that outputs a vector representation of the drug pair.

The representation output by this combiner function can be order dependent or independent. An order dependent combiner allows the pair scoring model to express the temporal order of drug administration. For example, the concatenation of drug vectors results in order dependent representations of pairs, while a bilinear transformation of drug representations with a diagonal matrix does not.
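The difference between the two example combiners can be checked numerically. The sketch below uses hypothetical two-dimensional drug vectors and a hypothetical diagonal; note that the diagonal bilinear form shown here collapses the pair to a single number, which is one possible reading of the transformation named above.

```python
def concatenate(h_a, h_b):
    """Order dependent combiner: the pair vector changes when drugs are swapped."""
    return h_a + h_b  # list concatenation


def diagonal_bilinear(h_a, h_b, diagonal):
    """Order independent combiner: sum_i h_a[i] * diagonal[i] * h_b[i]."""
    return sum(a * w * b for a, w, b in zip(h_a, diagonal, h_b))


h_a, h_b = [1.0, 2.0], [3.0, 5.0]
diagonal = [0.5, 2.0]

print(concatenate(h_a, h_b) == concatenate(h_b, h_a))  # False
print(diagonal_bilinear(h_a, h_b, diagonal) == diagonal_bilinear(h_b, h_a, diagonal))  # True
```

Swapping the drugs changes the concatenated vector but leaves the bilinear value untouched, because each term of the sum is a product of matching coordinates.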

Definition 8.

Scoring head layer. The scoring head layer is the function $\hat{y}^{d,d',c} = f_H(h_{d,d'}, c)$, where the predicted label satisfies that $\hat{y}^{d,d',c} \in [0, 1]$.

Given a drug pair representation and a context, the scoring head layer outputs a probability score for the outcome.

Definition 9.

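Drug pair scoring loss and cost functions. Given the drug pair $(d, d')$, context $c$, ground-truth label $y^{d,d',c}$ and predicted score $\hat{y}^{d,d',c}$ the loss is defined as the function $\ell(y^{d,d',c}, \hat{y}^{d,d',c})$. The cost on the whole drug pair database $\mathcal{Y}$ is defined by Equation (1).

$$\mathcal{L} = \sum_{(d, d', c, y^{d,d',c}) \in \mathcal{Y}} \ell(y^{d,d',c}, \hat{y}^{d,d',c}) \qquad (1)$$
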
In practical settings, drug pair scoring models are trained by the minimization of the binary cross-entropy summed over the labeled drug pair, context triples.
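The binary cross-entropy cost mentioned above can be written out directly. This is a minimal sketch; the clipping constant `eps` is an assumption added for numerical safety and is not part of the definition.

```python
import math


def binary_cross_entropy(label, score, eps=1e-12):
    """Loss for one labeled triple given a predicted probability score."""
    score = min(max(score, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))


def database_cost(labels, scores):
    """Cost: the per-entry loss summed over all labeled entries."""
    return sum(binary_cross_entropy(y, p) for y, p in zip(labels, scores))
```

A confident correct prediction contributes a loss close to zero, while a confident wrong prediction is penalised heavily, which is why the cost rewards well calibrated probability scores.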

                                                                  Entity Types            Drug Features
Task          Method          Reference  Model view    Induction  Drug  Protein  Disease  SMILES  Graph  Geometry  Generic
Polypharmacy  DECAGON         [76]       Higher
              KBLRN           [41]       Higher
              SDHINE          [23]       Higher
              ESP             [4]        Higher
              MHCADDI         [12]       Lower
              TIP             [67]       Higher
Interaction   DeepCCI         [57]       Lower
              MVGAE           [40]       Hierarchical
              DeepDDI         [51]       Lower
              DI              [45]       Hierarchical
              MR-GNN          [68]       Lower
              SkipGNN         [25]       Higher
              CASTER          [26]       Lower
              DeepDrug        [5]        Lower
              GoGNN           [61]       Hierarchical
              DPDDI           [15]       Lower
              KGNN            [34]       Higher
              BiGNN           [2]        Hierarchical
              MIRACLE         [64]       Hierarchical
              EPGCN-DS        [55]       Lower
              SumGNN          [70]       Higher
              DANN-DDI        [37]       Higher
              SSI-DDI         [43]       Lower
              MTDDI           [16]       Hierarchical
              MUFFIN          [9]        Hierarchical
              DDIAAE          [11]       Higher
              RWGCN           [14]       Higher
              SmileGNN        [21]       Lower
              GCN-BMP         [7]        Lower
Synergy       DeepSynergy     [46]       Lower
              MCDC            [6]        Hierarchical
              DTF             [56]       Higher
              DeepSignalFlow  [72]       Hierarchical
              DeepDDS         [62]       Lower
              GraphSynergy    [69]       Higher
              TranSynergy     [36]       Hierarchical
              MatchMaker      [3]        Lower
              AuDNNSynergy    [73]       Lower
              AID             [31]       Hierarchical
              MOOMIN          [49]       Hierarchical
Table 1: A machine learning task, model view level, induction, interaction graph node type (entity) and drug feature based comparison of drug pair scoring machine learning models. Machine learning models that solve a specific pair scoring task are ordered chronologically in the table.

3 Drug Pair Scoring Models

Our discussion of the drug pair scoring models introduces our unified view about the general architecture of these models and discusses and compares state-of-the-art architecture designs.

3.1 Unified View: The Drug Pair Scoring Model

Based on the definitions outlined in Section 2 we propose a unified view of drug pair scoring models. We postulate that the abstract design of drug pair scoring models irrespective of the specific subtask solved always has the following architecture:

  1. An encoder to generate drug representations - this can be one of the functions described by Definitions 4, 5 and 6.

  2. A molecular representation combiner function to generate a drug pair representation – see Definition 7.

  3. The scoring head layer to predict the probability of a context dependent outcome proposed by Definition 8.

  4. The loss function of Definition 9 which depends on ground-truth labels and the probabilities output by the head layer.

This architecture and design allows for the joint end-to-end training of the individual model components – gradient descent based update of the layer weights.
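The four components can be chained into a single forward pass. The sketch below is only illustrative: a lookup table stands in for the encoder, concatenation for the combiner and a per-context logistic layer for the scoring head. All names, sizes and the random initialisation are hypothetical, and the gradient based parameter update is omitted.

```python
import math
import random

random.seed(0)
DIM = 3
drugs = ["drug_1", "drug_2", "drug_3"]
contexts = ["context_a", "context_b"]

# Step 1, encoder: a trainable lookup table stands in for Definitions 4-6.
embedding = {d: [random.gauss(0.0, 1.0) for _ in range(DIM)] for d in drugs}
# Step 3 parameters: one weight vector per context over the pair representation.
head = {c: [random.gauss(0.0, 1.0) for _ in range(2 * DIM)] for c in contexts}


def score(drug_a, drug_b, context):
    """Encoder -> combiner -> scoring head, returning a probability."""
    pair = embedding[drug_a] + embedding[drug_b]  # step 2: concatenation
    logit = sum(w * x for w, x in zip(head[context], pair))
    return 1.0 / (1.0 + math.exp(-logit))  # step 3: sigmoid


def loss(label, probability):
    """Step 4: binary cross-entropy for a single labeled entry."""
    return -(label * math.log(probability) + (1 - label) * math.log(1.0 - probability))


probability = score("drug_1", "drug_2", "context_a")
```

Every step is a differentiable function of `embedding` and `head`, which is what makes the joint end-to-end training described above possible.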

3.2 Specific Architecture Designs

We compare state-of-the-art model architectures in Table 1 that can solve pair scoring tasks. Our comparison considers the model level, induction capabilities, specific subtask, node types of the heterogeneous graph and the molecular features exploited by the model. Model attributes used for comparison were the following:

  • Model level: A model operates at the following levels based on the encoder architecture used for generating the drug representations: (a) higher-view – neighbourhood encoder, (b) lower-view – molecular encoder, (c) hierarchical-view – neighbourhood informed molecular encoder.

  • Machine learning task: The drug pair scoring task of interest solved by the dedicated model architecture proposed in the research paper. It has to be one of interaction, polypharmacy or synergy prediction.

  • Induction: A model is inductive if it can predict the label of drug pairs where at least one of the drugs was not in the training set drug pairs.

  • Entities: The types of heterogeneous graph entities (drugs, proteins, diseases) used by the model to solve the task.

  • Drug features: Molecular features and information about the compound encoded by the molecular encoder function.

Our comparison highlights that there is a hard trade-off between induction and the exclusion of compound features: models which do not use compound features cannot score pairs that contain drugs unseen during training. It is also evident that there is a connection between the machine learning subtask and the model architecture design: for example, polypharmacy side effect prediction models are mostly higher-view transductive neighbourhood encoders with a scoring layer on top. Synergy scoring models are mostly inductive techniques which exploit the molecular information about the drugs. Currently, there is no single pair scoring model which includes all of the considered biological modalities.

4 Evaluation

The evaluation of machine learning models requires performance metrics, train-test split strategies and publicly accessible datasets.

4.1 Performance Metrics

The predictive performance of drug pair scoring models is evaluated by metrics tailored to binary classification tasks. We summarise how these metrics are used for the evaluation of state-of-the-art drug pair scoring architectures in Table 2.

Evaluation metric
Model Reference PRAUC ROCAUC Precision Recall Accuracy
KBLRN [41]
ESP [4]
TIP [67]
DeepCCI [57]
MVGAE [40]
DeepDDI [51]
DI [45]
MR-GNN [68]
SkipGNN [25]
DeepDrug [5]
GoGNN [61]
DPDDI [15]
KGNN [34]
BiGNN [2]
SumGNN [70]
SSI-DDI [43]
MTDDI [16]
RWGCN [14]
SmileGNN [21]
DeepSynergy [46]
MCDC [6]
DTF [56]
DeepSignalFlow [72]
DeepDDS [62]
GraphSynergy [69]
TranSynergy [36]
MatchMaker [3]
AuDNNSynergy [73]
AID [31]
Table 2: Predictive performance evaluation metrics used by the research papers which proposed novel drug pair scoring techniques. Models are grouped by the pair scoring task and ordered chronologically.

Looking at Table 2 it is evident that the evaluation metrics used in the literature can be grouped into two categories:

  • Score based metrics: These quantify predictive performance over the whole domain of discrimination thresholds. The precision-recall area under the curve (PRAUC) considers the precision-recall trade-off over the whole domain of discrimination thresholds while the receiver operating characteristic area under the curve (ROCAUC) considers false and true positive rates.

  • Hard cut-off evaluation metrics: These performance metrics (accuracy, $F_1$ score, precision, recall) apply a hard discrimination threshold to assign a label to the data points based on the scores output by the pair scoring model. In order to calculate these, one needs to set a discrimination threshold.
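The two metric families can be contrasted on a toy example. The sketch below computes ROCAUC through its rank interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative) and computes precision and recall at an explicit threshold. The labels and scores are made-up illustrative values.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def precision_recall(labels, scores, threshold):
    """Hard cut-off metrics: scores at or above the threshold are predicted positive."""
    predicted = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))                # 0.75
print(precision_recall(labels, scores, 0.5))  # (0.5, 0.5)
print(precision_recall(labels, scores, 0.3))  # precision 2/3, recall 1.0
```

The ROCAUC value is fixed by the ranking of the scores alone, while precision and recall move as soon as the threshold changes from 0.5 to 0.3.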

Our findings demonstrate that pair scoring models are predominantly evaluated by score based metrics (PRAUC and ROCAUC) which do not require manual setting of a discrimination threshold. It is also evident that seminal research works which defined the key pair scoring tasks influenced the later evaluation metric choices – polypharmacy prediction models adopted the evaluation metrics from [76] for example.

4.2 Train-Test Split Strategies

The evaluation of drug pair scoring tasks allows for the use of various train-test split strategies [46] to test the performance of the model under cold-start and inductive scenarios [13]. Given a labeled drug pair-context database $\mathcal{Y}$, defined on the drug and context sets $\mathcal{D}$ and $\mathcal{C}$, we assume that one can create the randomized splits $\mathcal{Y}_{\text{train}}$ and $\mathcal{Y}_{\text{test}}$. We summarize these splitting strategies in Figure 2.

Figure 2: The train and test split of drug pair scoring datasets allows for various stratified splits. Stratification of the pairs can happen across the evaluated drugs or the outcomes being tested. Drug pairs used for training the pair scoring model are denoted with white, pairs used for testing are denoted with gray.

Using the formalism established to describe the pair scoring models, the splitting strategies are defined as:

  • Random split: Labeled drug pair-context entries of $\mathcal{Y}$ are randomly split between $\mathcal{Y}_{\text{train}}$ and $\mathcal{Y}_{\text{test}}$.

  • Drug pair stratified split: A drug pair that appears in entries of $\mathcal{Y}_{\text{train}}$ does not appear in entries of $\mathcal{Y}_{\text{test}}$. This split requires a pair scoring model which is inductive with respect to drugs.

  • Drug stratified split: A drug that appears in entries of $\mathcal{Y}_{\text{train}}$ does not appear in entries of $\mathcal{Y}_{\text{test}}$. Like the drug pair stratified split this requires the model to be inductive with respect to new drugs.

  • Context stratified split: A context that appears in entries of $\mathcal{Y}_{\text{train}}$ does not appear in entries of $\mathcal{Y}_{\text{test}}$. This requires that the pair scoring model is inductive with respect to the set of contexts.
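Two of the stratified strategies above can be sketched as follows, with entries stored as `(drug_a, drug_b, context, label)` tuples. The function names, the toy entries and the decision to discard entries that mix training and test drugs are assumptions made for this illustration.

```python
import random


def pair_stratified_split(entries, test_fraction=0.25, seed=0):
    """All contexts of one unordered drug pair land on the same side of the split."""
    pairs = sorted({frozenset((a, b)) for a, b, _, _ in entries}, key=sorted)
    rng = random.Random(seed)
    rng.shuffle(pairs)
    test_pairs = set(pairs[: max(1, int(len(pairs) * test_fraction))])
    train = [e for e in entries if frozenset(e[:2]) not in test_pairs]
    test = [e for e in entries if frozenset(e[:2]) in test_pairs]
    return train, test


def drug_stratified_split(entries, test_drugs):
    """Train and test share no drug; entries mixing both groups are discarded."""
    test_drugs = set(test_drugs)
    train = [e for e in entries if e[0] not in test_drugs and e[1] not in test_drugs]
    test = [e for e in entries if e[0] in test_drugs and e[1] in test_drugs]
    return train, test


# Hypothetical toy database.
entries = [
    ("d1", "d2", "c1", 1),
    ("d1", "d2", "c2", 0),
    ("d1", "d3", "c1", 0),
    ("d3", "d4", "c1", 1),
    ("d2", "d4", "c2", 1),
]
```

In the drug stratified case the entries that combine a training drug with a test drug satisfy neither side of the definition, so this sketch drops them; this can shrink the usable data considerably on small databases.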

4.3 Datasets

We detail public sources for drug pair data which have been used by the approaches in this review in Table 3. Datasets are listed chronologically according to subtask and the licence and any restrictions for commercial use are detailed where available. It can be seen that the majority of datasets contain a small number of drugs, indicating most focus on approved drugs rather than all possible compounds, with the interactions captured in drug pairs being much more numerous.

Dataset              Reference  Subtask       Compounds  Pairs       Contexts  Licence          Restricted
TWOSIDES             [59]       Polypharmacy  1,918      211,990     12,726    Not Specified
DrugBank 5           [65]       Polypharmacy  14,575     >365,000    >86       CC BY-NC 4.0
DeepDDI              [51]       Polypharmacy  1,710      191,995     86        Not Specified
TDC (TWOSIDES)       [24]       Polypharmacy  645        63,473      1,317     CC BY 4.0
TDC (DrugBank)       [24]       Polypharmacy  1,706      191,519     86        CC BY-NC 4.0
STITCH-CCI 5         [58]       Interaction   389,393    17,705,799  4         CC BY-NC-SA 4.0
ZhangDDI             [74]       Interaction   548        48,548      -         Not Specified
ChCh-Miner           [42]       Interaction   1,514      48,514      -         Not Specified
DCDB                 [38]       Synergy       485        499                   Not Specified
DCDB 2.0             [39]       Synergy       904        1,363                 Not Specified
ASDCD                [8]        Synergy       105        215                   Not Specified
O’Neil               [44]       Synergy       38         583         39        Not Specified
NCI-ALMANAC          [22]       Synergy       104        >5,000      60        CC BY 4.0
DrugComb             [71]       Synergy       2,276      437,932     93        CC BY-NC-SA 4.0
SynergyXDB           [52]       Synergy       1,977      22,507      151       Not Specified
DrugCombDB           [35]       Synergy       2,887      448,555     124       Not Specified
TDC (OncoPolyPharm)  [24]       Synergy       38         583         39        CC BY 4.0
DrugComb 2.0         [75]       Synergy       8,397      751,498     2,320     CC BY-NC-SA 4.0
TDC (DrugComb)       [24]       Synergy       129        5,628       29        CC BY-NC-SA 4.0
Table 3: Public drug pair scoring datasets ordered chronologically with the subtask, number of compounds, count of tested compound pairs, cardinality of the context set, licence and if commercial use is explicitly restricted.

It should be noted that established resources such as TWOSIDES and DrugBank are frequently filtered, cleaned and split into new datasets. For example the Therapeutics Data Commons (TDC) resource contains filtered versions of both of these datasets designed for benchmark use [24]. It is also common for datasets to be named differently in publications, for example the split of TWOSIDES contained in TDC is also called ChChSe-Decagon in some works [42].

5 Applications

In this section we introduce three key, yet currently largely unexplored, applications for the methods detailed in this review.

5.1 Combination Therapy for COVID-19

One topical application of these methods is in relation to the COVID-19 pandemic. Patients affected by polypharmacy of certain drug types (anti-psychotics and opiates being prominent examples) had a significantly higher chance of a negative clinical outcome from COVID-19 [29, 30]. Using methods covered in this review to predict which combinations may have a negative effect for COVID-19 patients could enable high risk groups to seek alternative treatments, reducing the risk of a negative outcome.

5.2 Antibiotic Evolutionary Pressure

The prevalent use of antibiotics has resulted in microbes evolving resistance to the drugs, reducing efficacy and potentially eliminating cost effective ways of treating severe bacterial diseases such as tuberculosis. Interestingly, it has been shown that the combination of different antibiotics can slow, and even reverse, this evolutionary resistance [54]. However, discovering these suppressive interactions using traditional methods is a complex and slow process, and one that remains unexplored with the methods covered in this review.

5.3 Reducing Toxicity

Although drug combinations can result in an increase of unwanted side effects, one promising application is that the combination of two or more drugs can actually lead to a reduced level of toxicity for patients. This is because synergistic drugs together possess a higher level of efficacy at targeting a certain condition, which means that the dose of each individual compound can be lowered, reducing toxicity issues associated with higher doses [28]. Thus, accurate prediction of synergistic drug combinations can reduce the impact of toxicity resulting from the individual compounds.

6 Discussion and Future Directions

The body of work regarding relational machine learning for drug pair scoring primarily focuses on the design of novel architectures and applications. Our unification survey identified a number of potential shortcomings of existing approaches and avenues for novel research in the domain.

6.1 Encoding Molecular Geometry

Our summary of the design of relational machine learning architectures for drug pair scoring tasks in Table 1 highlighted that the molecular geometry and spatial structure of the molecules are rarely encoded by existing models. Recent advances in geometric deep learning applied to chemistry [47, 66, 18] would allow the inclusion of geometric information which could lead to better predictive performance on the pair scoring tasks. The modularity of existing architectures makes replacing the molecular encoders with state-of-the-art geometric encoder layers a possibility.

6.2 Higher Order Drug Combinations

Existing research about the interactions, unwanted side effects and synergy of drugs is primarily focused on the evaluation of binary pair combinations. This is driven by the lack of datasets focused on the outcomes of using higher order drug combinations and the lack of architectures designed specifically for these higher order combinations. By using set based representation aggregation layers [60, 1], the existing pair scoring models could be adapted to generate drug subset representations.
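As a minimal sketch of the idea, a pairwise combiner can be replaced by a pooling step that accepts any number of drug vectors. Plain sum pooling is used here as a simple stand-in for the set based layers cited above; the function name and the toy vectors are hypothetical.

```python
def sum_pool(drug_vectors):
    """Permutation invariant aggregation of any number of drug vectors."""
    return [sum(column) for column in zip(*drug_vectors)]


# Hypothetical representations of three drugs used together.
h_1, h_2, h_3 = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]

print(sum_pool([h_1, h_2, h_3]))  # [9.0, 12.0]
print(sum_pool([h_3, h_1, h_2]))  # [9.0, 12.0]
```

The pooled vector has the same dimension regardless of how many drugs are combined, so the scoring head defined for pairs could be reused unchanged on top of it.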

6.3 Transfer Learning

Self-supervised and unsupervised learning for pretraining molecular encoders is already widely used for single molecule machine learning tasks. This provides an opportunity for pretraining the molecular encoders on single molecule tasks and fine-tuning them on the data scarce pair scoring tasks. Another opportunity for transfer learning comes from the fact that certain pair scoring tasks have a greater quantity of labeled data available. The summary of drug pair scoring datasets in Table 3 demonstrated that the drug-drug interaction prediction task has datasets such as STITCH-CCI 5 which cover a large number of pair combinations, while the polypharmacy side effect and synergy prediction tasks have smaller databases. Pretraining models by performing drug-drug interaction prediction and fine-tuning these models for other tasks seems to be an important future research direction for training accurate, and therefore useful, models.

6.4 Multimodal Learning

A heterogeneous graph based representation of drugs allows for the fusion of multiple data modalities. Our survey of existing models in Table 1 has demonstrated that only a handful of existing architectures integrate multimodal data effectively [49, 36] without losing induction. Integrating multiple data modalities such as proteomics, molecular structure and biological pathway information could be an important avenue for designing novel pair scoring architectures.

6.5 Software for Drug Pair Scoring

Currently there is no dedicated open-source machine learning library which was specifically designed for solving the drug pair scoring task. Developing a dedicated relational machine learning framework on top of existing geometric deep learning [17, 63, 50] and deep chemistry frameworks [48, 32] could be an important contribution to the domain. This would require curated datasets and the architectural design of encoder, combiner, scoring layers and drug pair iterators.

7 Conclusion

We have provided an exhaustive overview of relational machine learning models designed to solve drug pair scoring tasks. We outlined a general theoretical framework which unifies the drug-drug interaction, polypharmacy side effect and drug synergy prediction tasks and created a taxonomy of models which address these. By surveying the literature considering the architecture and evaluation of existing models, we identified key real world application areas and important directions for future research.


The authors would like to thank Piotr Grabowski, Rocío Mercado, and Paul Scherer for help and feedback throughout the preparation of this manuscript. Stephen Bonner is a fellow of the AstraZeneca postdoctoral program.


  • [1] J. Baek, M. Kang, and S. J. Hwang (2020) Accurate Learning of Graph Representations with Graph Multiset Pooling. In International Conference on Learning Representations, Cited by: §6.2.
  • [2] Y. Bai, K. Gu, Y. Sun, and W. Wang (2020) Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction. ICML 2020 Graph Representation Learning and Beyond (GRL+) Workshop. Cited by: Table 1, Table 2.
  • [3] K. H. Brahim, O. Tastan, and E. Cicek (2021) MatchMaker: A Deep Learning Framework for Drug Synergy Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Cited by: Table 1, Table 2.
  • [4] H. A. Burkhardt, D. Subramanian, J. Mower, and T. A. Cohen (2019) Predicting Adverse Drug-Drug Interactions with Neural Embedding of Semantic Predications. In AMIA 2019, American Medical Informatics Association Annual Symposium, Washington, DC, USA, November 16-20, 2019. Cited by: Table 1, Table 2.
  • [5] X. Cao, R. Fan, and W. Zeng (2020) DeepDrug: A General Graph-Based Deep Learning Framework for Drug Relation Prediction. bioRxiv. Cited by: Table 1, Table 2.
  • [6] H. Chen, S. Iyengar, and J. Li (2019) Large-Scale Analysis of Drug Combinations by Integrating Multiple Heterogeneous Information Networks. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB ’19, pp. 67–76. Cited by: Table 1, Table 2.
  • [7] X. Chen, X. Liu, and J. Wu (2020) GCN-BMP: Investigating Graph Representation Learning for DDI Prediction Task. Methods 179, pp. 47–54. Note: Interpretable machine learning in bioinformatics. ISSN 1046-2023. Cited by: Table 1, Table 2.
  • [8] X. Chen, B. Ren, M. Chen, M. Liu, W. Ren, Q. Wang, L. Zhang, and G. Yan (2014) ASDCD: Antifungal Synergistic Drug Combination Database. PloS one 9 (1), pp. e86499. Cited by: Table 3.
  • [9] Y. Chen, T. Ma, X. Yang, J. Wang, B. Song, and X. Zeng (2021) MUFFIN: Multi-Scale Feature Fusion For Drug–Drug Interaction Prediction. Bioinformatics. Cited by: Table 1, Table 2.
  • [10] W. Chiang, L. Shen, L. Li, and X. Ning (2020) Drug-Drug Interaction Prediction Based on Co-Medication Patterns and Graph Matching. International Journal of Computational Biology and Drug Design 13 (1), pp. 36–57. Cited by: §1.
  • [11] Y. Dai, C. Guo, W. Guo, and C. Eickhoff (2021) Drug–Drug Interaction Prediction with Wasserstein Adversarial Autoencoder-Based Knowledge Graph Embeddings. Briefings in Bioinformatics 22 (4), pp. bbaa256. Cited by: Table 1, Table 2.
  • [12] A. Deac, Y. Huang, P. Velickovic, P. Liò, and J. Tang (2019) Drug-Drug Adverse Effect Prediction with Graph Co-Attention. ICML Workshop on Computational Biology. Cited by: Table 1, Table 2.
  • [13] P. Dewulf, M. Stock, and B. De Baets (2021) Cold-Start Problems in Data-Driven Prediction of Drug–Drug Interaction Effects. Pharmaceuticals 14 (5), pp. 429. Cited by: §4.2, §6.3.
  • [14] A. Feeney, R. Gupta, V. Thost, R. Angell, G. Chandu, Y. Adhikari, and T. Ma (2021) Relation Matters in Sampling: A Scalable Multi-Relational Graph Neural Network for Drug-Drug Interaction Prediction. arXiv preprint arXiv:2105.13975. Cited by: Table 1, Table 2.
  • [15] Y. Feng, S. Zhang, and J. Shi (2020) DPDDI: A Deep Predictor for Drug-Drug Interactions. BMC Bioinformatics 21 (1), pp. 1–15. Cited by: Table 1, Table 2.
  • [16] Y. Feng, S. Zhang, Q. Zhang, C. Zhang, and J. Shi (2021) MTDDI: A Graph Convolutional Network Framework for Predicting Multi-Type Drug-Drug Interactions. Research Square. Cited by: Table 1, Table 2.
  • [17] M. Fey and J. E. Lenssen (2019) Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds. Cited by: §6.5.
  • [18] M. Fey, J. E. Lenssen, F. Weichert, and H. Müller (2018) SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 869–877. Cited by: §6.1.
  • [19] T. Gaudelet, B. Day, A. Jamasb, J. Soman, C. Regep, G. Liu, J. Hayter, R. Vickers, C. Roberts, J. Tang, et al. (2021) Utilizing graph machine learning within drug discovery and development. Briefings in Bioinformatics. Cited by: §1.
  • [20] W. L. Hamilton, R. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1025–1035. Cited by: §2.
  • [21] X. Han, X. Li, and J. Li (2021) SmileGNN: Drug-Drug Interaction Prediction Based on SMILES and Graph Neural Network. Cited by: Table 1, Table 2.
  • [22] S. L. Holbeck, R. Camalier, J. A. Crowell, J. P. Govindharajulu, M. Hollingshead, L. W. Anderson, E. Polley, L. Rubinstein, A. Srivastava, D. Wilsker, et al. (2017) The national cancer institute almanac: a comprehensive screening resource for the detection of anticancer drug pairs with enhanced therapeutic activity. Cancer research 77 (13), pp. 3564–3576. Cited by: Table 3.
  • [23] B. Hu, H. Wang, L. Wang, and W. Yuan (2018) Adverse Drug Reaction Predictions Using Stacking Deep Heterogeneous Information Network Embedding Approach. Molecules 23 (12), pp. 3193. Cited by: Table 1, Table 2.
  • [24] K. Huang, T. Fu, W. Gao, Y. Zhao, Y. Roohani, J. Leskovec, C. W. Coley, C. Xiao, J. Sun, and M. Zitnik (2021) Therapeutics data commons: machine learning datasets and tasks for therapeutics. arXiv preprint arXiv:2102.09548. Cited by: §4.3, Table 3.
  • [25] K. Huang, C. Xiao, L. M. Glass, M. Zitnik, and J. Sun (2020) SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks. Scientific reports 10 (1), pp. 1–16. Cited by: Table 1, Table 2.
  • [26] K. Huang, C. Xiao, T. N. Hoang, L. M. Glass, and J. Sun (2020) CASTER: Predicting Drug Interactions with Chemical Substructure Representation. AAAI. Cited by: Table 1, Table 2.
  • [27] L. Huang, D. Brunell, C. Stephan, J. Mancuso, X. Yu, B. He, T. C. Thompson, R. Zinner, J. Kim, P. Davies, et al. (2019) Driver Network as a Biomarker: Systematic Integration and Network Modeling of Multi-Omics Data to Derive Driver Signaling Pathways for Drug Combination Prediction. Bioinformatics 35 (19), pp. 3709–3717. Cited by: §1.
  • [28] A. Ianevski, S. Timonen, A. Kononov, T. Aittokallio, and A. K. Giri (2020) SynToxProfiler: an interactive analysis of drug combination synergy, toxicity and efficacy. PLoS computational biology 16 (2), pp. e1007604. Cited by: §5.3.
  • [29] S. Iloanusi, O. Mgbere, and E. J. Essien (2021) Polypharmacy among COVID-19 patients: a systematic review. Journal of the American Pharmacists Association 61 (5), pp. e14–e25. Cited by: §5.1.
  • [30] W. Jin, J. M. Stokes, R. T. Eastman, Z. Itkin, A. V. Zakharov, J. J. Collins, T. S. Jaakkola, and R. Barzilay (2021) Deep Learning Identifies Synergistic Drug Combinations for Treating COVID-19. Proceedings of the National Academy of Sciences 118 (39). Cited by: §5.1.
  • [31] Y. Kim, S. Zheng, J. Tang, W. Jim Zheng, Z. Li, and X. Jiang (2021) Anticancer Drug Synergy Prediction in Understudied Tissues Using Transfer Learning. Journal of the American Medical Informatics Association 28 (1), pp. 42–51. Cited by: Table 1, Table 2.
  • [32] M. Korshunova, B. Ginsburg, A. Tropsha, and O. Isayev (2021) OpenChem: a deep learning toolkit for computational chemistry and drug design. Journal of Chemical Information and Modeling 61 (1), pp. 7–13. Cited by: §6.5.
  • [33] H. Li, T. Li, D. Quang, and Y. Guan (2018) Network Propagation Predicts Drug Synergy in Cancers. Cancer research 78 (18), pp. 5446–5457. Cited by: §1.
  • [34] X. Lin, Z. Quan, Z. Wang, T. Ma, and X. Zeng (2020) KGNN: Knowledge Graph Neural Network for Drug-Drug Interaction Prediction. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, pp. 2739–2745. Cited by: Table 1, Table 2.
  • [35] H. Liu, W. Zhang, B. Zou, J. Wang, Y. Deng, and L. Deng (2020) DrugCombDB: A Comprehensive Database of Drug Combinations Toward the Discovery of Combinatorial Therapy. Nucleic acids research 48 (D1), pp. D871–D881. Cited by: §1, Table 3.
  • [36] Q. Liu and L. Xie (2021) TranSynergy: Mechanism-driven interpretable deep neural network for the synergistic prediction and pathway deconvolution of drug combinations. PLoS computational biology 17 (2), pp. e1008653. Cited by: Table 1, Table 2, §6.4.
  • [37] S. Liu, Y. Zhang, Y. Cui, Y. Qiu, Y. Deng, W. Zhang, and Z. Zhang (2021) Enhancing Drug-Drug Interaction Prediction Using Deep Attention Neural Networks. Cited by: Table 1, Table 2.
  • [38] Y. Liu, B. Hu, C. Fu, and X. Chen (2010) DCDB: Drug Combination Database. Bioinformatics 26 (4), pp. 587–588. Cited by: Table 3.
  • [39] Y. Liu, Q. Wei, G. Yu, W. Gai, Y. Li, and X. Chen (2014) DCDB 2.0: A Major Update of the Drug Combination Database. Database 2014. Cited by: Table 3.
  • [40] T. Ma, C. Xiao, J. Zhou, and F. Wang (2018) Drug Similarity Integration Through Attentive Multi-View Graph Auto-Encoders. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 3477–3483. Cited by: Table 1, Table 2.
  • [41] B. Malone, A. García-Durán, and M. Niepert (2018) Knowledge Graph Completion to Predict Polypharmacy Side Effects. In International Conference on Data Integration in the Life Sciences, pp. 144–149. Cited by: Table 1, Table 2.
  • [42] M. Zitnik, R. Sosič, S. Maheshwari, and J. Leskovec (2018) BioSNAP Datasets: Stanford biomedical network dataset collection. http://snap.stanford.edu/biodata. Cited by: §4.3, Table 3.
  • [43] A. K. Nyamabo, H. Yu, and J. Shi (2021) SSI–DDI: Substructure–Substructure Interactions for Drug–Drug Interaction Prediction. Briefings in Bioinformatics. Cited by: Table 1, Table 2.
  • [44] J. O’Neil, Y. Benita, I. Feldman, M. Chenard, B. Roberts, Y. Liu, J. Li, A. Kral, S. Lejnine, A. Loboda, et al. (2016) An unbiased oncology compound screen to identify novel combination strategies. Molecular cancer therapeutics 15 (6), pp. 1155–1162. Cited by: Table 3.
  • [45] B. Peng and X. Ning (2019) Deep Learning for High-Order Drug-Drug Interaction Prediction. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB ’19, pp. 197–206. Cited by: Table 1, Table 2.
  • [46] K. Preuer, R. P. Lewis, S. Hochreiter, A. Bender, K. C. Bulusu, and G. Klambauer (2018) DeepSynergy: Predicting Anti-Cancer Drug Synergy with Deep Learning. Bioinformatics 34 (9), pp. 1538–1546. Cited by: §1, §1, Table 1, §4.2, Table 2.
  • [47] C. R. Qi, H. Su, K. Mo, and L. J. Guibas (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660. Cited by: §6.1.
  • [48] B. Ramsundar, P. Eastman, P. Walters, and V. Pande (2019) Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More. O’Reilly Media, Inc. Cited by: §6.5.
  • [49] B. Rozemberczki, A. Gogleva, S. Nilsson, G. Edwards, A. Nikolov, and E. Papa (2021) MOOMIN: Deep Molecular Omics Network for Anti-Cancer Drug Combination Therapy Design. Cited by: Table 1, Table 2, §6.4.
  • [50] B. Rozemberczki, P. Scherer, Y. He, G. Panagopoulos, A. Riedel, M. Astefanoaei, O. Kiss, F. Beres, G. Lopez, N. Collignon, and R. Sarkar (2021) PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. Cited by: §6.5.
  • [51] J. Y. Ryu, H. U. Kim, and S. Y. Lee (2018) Deep Learning Improves Prediction of Drug–Drug and Drug–Food Interactions. Proceedings of the National Academy of Sciences 115 (18), pp. E4304–E4311. Cited by: Table 1, Table 2, Table 3.
  • [52] H. Seo, D. Tkachuk, C. Ho, A. Mammoliti, A. Rezaie, S. A. Madani Tonekaboni, and B. Haibe-Kains (2020) SYNERGxDB: An Integrative Pharmacogenomic Portal to Identify Synergistic Drug Combinations for Precision Oncology. Nucleic Acids Research 48 (W1), pp. W494–W501. Cited by: Table 3.
  • [53] P. Sidorov, S. Naulaerts, J. Ariey-Bonnet, E. Pasquier, and P. J. Ballester (2019) Predicting synergism of cancer drug combinations using NCI-ALMANAC data. Frontiers in chemistry 7, pp. 509. Cited by: §1.
  • [54] N. Singh and P. J. Yeh (2017) Suppressive drug combinations and their potential to combat antibiotic resistance. The Journal of antibiotics 70 (11), pp. 1033–1042. Cited by: §5.2.
  • [55] M. Sun, F. Wang, O. Elemento, and J. Zhou (2020) Structure-Based Drug-Drug Interaction Detection via Expressive Graph Convolutional Networks and Deep Sets. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, pp. 13927–13928. Cited by: Table 1, Table 2.
  • [56] Z. Sun, S. Huang, P. Jiang, and P. Hu (2020) DTF: Deep Tensor Factorization for Predicting Anticancer Drug Synergy. Bioinformatics 36 (16), pp. 4483–4489. Cited by: Table 1, Table 2.
  • [57] S. Kwon and S. Yoon (2017) DeepCCI: End-to-End Deep Learning for Chemical-Chemical Interaction Prediction. In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 203–212. Cited by: §1, Table 1, Table 2.
  • [58] D. Szklarczyk, A. Santos, C. Von Mering, L. J. Jensen, P. Bork, and M. Kuhn (2016) STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Nucleic acids research 44 (D1), pp. D380–D384. Cited by: Table 3.
  • [59] N. P. Tatonetti, P. P. Ye, R. Daneshjou, and R. B. Altman (2012) Data-driven prediction of drug effects and interactions. Science translational medicine 4 (125), pp. 125ra31. Cited by: Table 3.
  • [60] O. Vinyals, S. Bengio, and M. Kudlur (2016) Order Matters: Sequence to Sequence for Sets. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, Y. Bengio and Y. LeCun (Eds.). Cited by: §6.2.
  • [61] H. Wang, D. Lian, Y. Zhang, L. Qin, and X. Lin (2020) GoGNN: Graph of Graphs Neural Network for Predicting Structured Entity Interactions. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020, C. Bessiere (Ed.), pp. 1317–1323. Cited by: Table 1, Table 2.
  • [62] J. Wang, W. Zhang, S. Shen, L. Deng, and H. Liu (2021) DeepDDS: Deep Graph Neural Network with Attention Mechanism to Predict Synergistic Drug Combinations. bioRxiv. Cited by: Table 1, Table 2.
  • [63] M. Wang, L. Yu, D. Zheng, Q. Gan, Y. Gai, Z. Ye, M. Li, J. Zhou, Q. Huang, C. Ma, et al. (2019) Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs. Cited by: §6.5.
  • [64] Y. Wang, Y. Min, X. Chen, and J. Wu (2021) Multi-view Graph Contrastive Representation Learning for Drug-Drug Interaction Prediction. In Proceedings of the Web Conference 2021, pp. 2921–2933. Cited by: Table 1, Table 2.
  • [65] D. S. Wishart, Y. D. Feunang, A. C. Guo, E. J. Lo, A. Marcu, J. R. Grant, T. Sajed, D. Johnson, C. Li, Z. Sayeeda, et al. (2018) DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic acids research 46 (D1), pp. D1074–D1082. Cited by: Table 3.
  • [66] T. Xie and J. C. Grossman (2018) Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Physical review letters 120 (14), pp. 145301. Cited by: §6.1.
  • [67] H. Xu, S. Sang, and H. Lu (2020) Tri-Graph Information Propagation for Polypharmacy Side Effect Prediction. CoRR abs/2001.10516. Cited by: Table 1, Table 2.
  • [68] N. Xu, P. Wang, L. Chen, J. Tao, and J. Zhao (2019) MR-GNN: Multi-Resolution and Dual Graph Neural Network for Predicting Structured Entity Interactions. Proceedings of IJCAI. Cited by: Table 1, Table 2.
  • [69] J. Yang, Z. Xu, W. Wu, Q. Chu, and Q. Zhang (2021) GraphSynergy: Network Inspired Deep Learning Model for Anti–Cancer Drug Combination Prediction. Cited by: Table 1, Table 2.
  • [70] Y. Yu, K. Huang, C. Zhang, L. M. Glass, J. Sun, and C. Xiao (2021) SumGNN: Multi-Typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization. Bioinformatics. Cited by: Table 1, Table 2.
  • [71] B. Zagidullin, J. Aldahdooh, S. Zheng, W. Wang, Y. Wang, J. Saad, A. Malyutina, M. Jafari, Z. Tanoli, A. Pessia, et al. (2019) DrugComb: An Integrative Cancer Drug Combination Data Portal. Nucleic acids research 47 (W1), pp. W43–W51. Cited by: Table 3.
  • [72] H. Zhang, Y. Chen, P. R. Payne, and F. Li (2021) Mining Signaling Flow to Interpret Mechanisms of Synergy of Drug Combinations Using Deep Graph Neural Networks. bioRxiv. Cited by: Table 1, Table 2.
  • [73] T. Zhang, L. Zhang, P. R. Payne, and F. Li (2021) Synergistic Drug Combination Prediction by Integrating Multiomics Data in Deep Learning Models. In Translational Bioinformatics for Therapeutic Development, pp. 223–238. Cited by: Table 1, Table 2.
  • [74] W. Zhang, Y. Chen, F. Liu, F. Luo, G. Tian, and X. Li (2017) Predicting Potential Drug-Drug Interactions by Integrating Chemical, Biological, Phenotypic and Network Data. BMC bioinformatics 18 (1), pp. 1–12. Cited by: §1, Table 3.
  • [75] S. Zheng, J. Aldahdooh, T. Shadbahr, Y. Wang, D. Aldahdooh, J. Bao, W. Wang, and J. Tang (2021) DrugComb Update: A More Comprehensive Drug Sensitivity Data Repository and Analysis Portal. Nucleic Acids Research. Cited by: Table 3.
  • [76] M. Zitnik, M. Agrawal, and J. Leskovec (2018) Modeling Polypharmacy Side Effects with Graph Convolutional Networks. Bioinformatics 34 (13), pp. i457–i466. Cited by: §1, Table 1, §4.1, Table 2.