1 Introduction
With recent advances in biomedical technology, large amounts of relational data interlinking biomedical entities such as proteins, drugs, diseases, and symptoms have gained much attention in biomedical research. Relational data, also known as graphs, capture the interactions (i.e., edges) between entities (i.e., nodes) and now play a key role in modern machine learning. Analyzing these graphs gives users a deeper understanding of the topological information and knowledge behind them, and thus greatly benefits many biomedical applications such as biological graph analysis
[2], network medicine [4], and clinical phenotyping and diagnosis [40]. As summarized in Figure 1, although graph analytics is of great importance, most existing graph analytics methods suffer from the high computational cost caused by the high dimensionality and sparsity of graphs [12, 7, 36]. Furthermore, owing to the heterogeneity of biomedical graphs, i.e., their multiple types of nodes and edges, traditional analyses over biomedical graphs remain challenging. Recently, graph embedding methods, which aim to learn a mapping that embeds nodes into a low-dimensional vector space
, now provide an effective and efficient way to address these problems. Specifically, the goal is to optimize the mapping so that node representations in the embedding space preserve the information and properties of the original graph. After such representation learning, the learned embeddings can be used as feature inputs for many downstream machine learning tasks, which introduces enormous opportunities for biomedical data science. Efforts to apply graph embedding to biomedical data have recently been made but are still not thoroughly explored, and the capabilities of graph embedding for biomedical data have not been extensively evaluated. In addition, biomedical graphs are usually sparse, incomplete, and heterogeneous, making graph embedding more complicated than in other application domains. These issues strongly motivate us to understand and compare state-of-the-art graph embedding techniques, and to further study how they can be adapted and applied to biomedical data science. Thus, in this survey, we investigate recent developments and trends of graph embedding techniques for biomedical data, which gives better insight into future directions. In this article, we introduce the general models related to biomedical data and omit the complete technical details. For a more comprehensive overview of graph embedding techniques and applications, we refer readers to previous well-summarized papers
[7, 19, 43, 9]. In this article, we first give the preliminaries used in this paper. We then briefly introduce the widely used graph embedding models. After that, we introduce some related public biomedical datasets. Finally, we carefully discuss recent developments and trends in biomedical graph embedding applications.
2 Preliminaries
Definition 1 (Homogeneous graphs)
A homogeneous graph $G = (V, E)$ is a graph associated with two mapping functions, a node type mapping $\phi: V \to T_V$ (node set to node type set) and an edge type mapping $\psi: E \to T_E$ (edge set to edge type set), where $|T_V| = 1$ and $|T_E| = 1$, i.e., all nodes share a single type and all edges share a single type.
Definition 2 (Heterogeneous graphs)
A heterogeneous graph $G = (V, E)$ is a graph associated with a node type mapping function $\phi: V \to T_V$ and an edge type mapping function $\psi: E \to T_E$, where $|T_V| > 1$ and/or $|T_E| > 1$, i.e., it contains multiple types of nodes and/or edges.
Definition 3 (Dynamic graphs)
A graph $G = (V, E)$ is a dynamic graph where $V = \{(v, t_s^v, t_e^v)\}$, with $t_s^v, t_e^v$ respectively the start and end timestamps for the vertex existence (with $t_s^v \le t_e^v$); and $E = \{(u, v, t_s^e, t_e^e)\}$ with $u, v \in V$, where $t_s^e, t_e^e$ are respectively the start and end timestamps for the edge existence (with $t_s^e \le t_e^e$).
Problem 1 (Graph embedding)
Given a graph $G = (V, E)$ and a predefined embedding dimensionality $d$ with $d \ll |V|$, graph embedding aims to convert $G$ into a $d$-dimensional space in which the information and properties of $G$ are preserved as much as possible.
In the following section, we provide the taxonomy of graph embedding methods based on the graph settings and embedding techniques, respectively.
3 Taxonomy of Graph Embedding Models
As shown in Figure 2, in this section we introduce, according to the graph settings, homogeneous graph embedding models, heterogeneous graph embedding models, and dynamic graph embedding models, as follows.
3.1 Homogeneous Graph Embedding Models
In the literature, there are three main types of homogeneous graph embedding methods, i.e., matrix factorization-based methods, random walk-based methods, and deep learning-based methods.
Matrix factorization-based methods. Matrix factorization-based methods, inspired by classic techniques for dimensionality reduction, represent graph properties, e.g., node pairwise similarity, in the form of a matrix. Generally, there are two types of matrix factorization used to compute node embeddings, i.e., node proximity matrix factorization and graph Laplacian eigenmaps.
Node proximity matrix factorization methods approximate node proximity in a low-dimensional space; the objective of preserving node proximity is to minimize the approximation loss $\min \|S - Y Y_c^{T}\|_F^2$, where $S$ is the node proximity matrix, $Y_c$ is the embedding for context nodes, and the node embedding $Y$ can be computed using this loss function. There are many other solutions to approximate this loss function, such as low-rank matrix factorization and regularized Gaussian matrix factorization. Graph Laplacian eigenmaps factorization methods assume that the graph property to preserve can be interpreted as the similarity of pairwise nodes. Thus, to obtain a good representation, a larger penalty is given if two nodes with higher similarity are embedded far apart. The optimal embedding $y^*$ can be computed using the objective function (1):

$$y^{*} = \arg\min_{y} \sum_{i \neq j} (y_i - y_j)^2 W_{ij} = \arg\min_{y} y^{T} L y \qquad (1)$$

where $L = D - W$ is the graph Laplacian and $D$ is the diagonal degree matrix with $D_{ii} = \sum_{j} W_{ij}$. Many works use graph Laplacian-based methods, and they mainly differ in how they calculate the pairwise node similarity $W_{ij}$. For example, BANE [55] defines a new Weisfeiler-Lehman proximity matrix to capture data dependence between edges and attributes; based on this matrix, BANE then learns node embeddings by formulating a new Weisfeiler-Lehman matrix factorization. Recently, NetMF [37] unified several state-of-the-art approaches into a matrix factorization framework with closed forms.
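As a concrete illustration of the Laplacian eigenmaps objective (1), the embedding can be read off from the bottom non-trivial eigenvectors of $L$. A minimal NumPy sketch, where the toy similarity matrix `W` is purely illustrative:

```python
import numpy as np

def laplacian_eigenmap(W, d):
    """Embed nodes with symmetric similarity matrix W into d dimensions by
    taking eigenvectors of the graph Laplacian L = D - W associated with
    the smallest non-zero eigenvalues (a sketch of Equation (1))."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    # eigh returns eigenvalues in ascending order for symmetric matrices
    vals, vecs = np.linalg.eigh(L)
    # skip the trivial constant eigenvector (eigenvalue ~ 0)
    return vecs[:, 1:d + 1]

# toy 4-node graph: two strongly connected pairs joined by weak bridges
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
Y = laplacian_eigenmap(W, d=2)
print(Y.shape)  # (4, 2)
```

Nodes with large similarity end up close in the resulting coordinates, which is exactly the penalty structure of objective (1).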
Random walk-based methods. Random walk-based methods have been widely used to approximate many graph properties, including node centrality and similarity. They are particularly useful when the graph can only be partially observed, or is too large to measure as a whole. Two widely recognized random walk-based methods are DeepWalk [36] and node2vec [20]. Concretely, DeepWalk treats random-walk paths as sentences and applies an NLP model to learn node embeddings. Compared to DeepWalk, node2vec introduces a trade-off strategy between breadth-first and depth-first search to perform biased random walks. In recent years, many random walk-based papers have continued to improve performance. For example, AWE [24] uses a recently developed technique called anonymous walks, an anonymized version of random walks that provides characteristic graph traits and can exactly reconstruct the network proximity of a node. AttentionWalk [1] uses the softmax to learn a free-form context distribution over the random walk; the learned attention parameters then guide the walk, allowing it to focus more on short- or long-term dependencies when optimizing an upstream objective. BiNE [18] proposes a bipartite graph embedding method based on biased random walks, generating vertex sequences that preserve the long-tail distribution of vertices in the original bipartite graphs.
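The walk-generation step shared by DeepWalk-style methods can be sketched as follows; the resulting walks would then be fed to a skip-gram model, which is omitted here. The adjacency list `adj` and all parameter values are illustrative:

```python
import random

def random_walks(adj, num_walks, walk_len, seed=0):
    """Generate DeepWalk-style uniform random walks over an adjacency list.
    The walks are treated as 'sentences' and would typically be fed to a
    skip-gram model (e.g., word2vec) to learn node embeddings."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj, num_walks=2, walk_len=5)
print(len(walks))  # 8: num_walks passes over 4 start nodes
```

node2vec differs only in this sampling step: the next node is drawn from a biased distribution controlled by its return and in-out parameters instead of uniformly.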
Deep learning-based methods. Deep learning has shown outstanding performance in a wide variety of research fields. SDNE [47] applies a deep autoencoder to model non-linearity in the graph structure. DNGR [8] learns deep low-dimensional vertex representations by applying stacked denoising autoencoders to high-dimensional matrix representations. Furthermore, the Graph Convolutional Network (GCN) [27] introduces a well-behaved layer-wise propagation rule for neural network models that operate directly on graphs, as shown in Equation (2):

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right) \qquad (2)$$

with $\tilde{A} = A + I$, where $A$ and $I$ are the adjacency and identity matrices, $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$, $W^{(l)}$ is the weight matrix of the $l$-th neural network layer, and $\sigma(\cdot)$ is a non-linear activation function such as the ReLU. $H^{(l)}$ and $H^{(l+1)}$ are the input and output of layer $l$ and layer $l+1$, respectively. Another important work is the Graph Attention Network (GAT) [46], which leverages masked self-attentional layers to address the shortcomings of prior graph convolution-based methods. Specifically, as shown in Equation (3):

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{T}[W h_i \,\Vert\, W h_j]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{T}[W h_i \,\Vert\, W h_k]\right)\right)} \qquad (3)$$
where $\mathcal{N}_i$ is the set of neighbors of node $i$, GAT computes normalized coefficients $\alpha_{ij}$ using the softmax function across different neighborhoods, as a by-product of an attentional mechanism across node pairs. To stabilize the learning process of self-attention, GAT uses multi-head attention to replicate the learning phase $K$ times, and the outputs are feature-wise aggregated (typically by concatenating or adding), as shown in Equation (4):

$$h_i' = \Big\Vert_{k=1}^{K} \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\right) \qquad (4)$$

where $\alpha_{ij}^{k}$ and $W^{k}$ are the attention coefficients and the weight matrix specifying the linear transformation of the $k$-th replica. Recently, HGCN [11] and ATTH [10] use hyperbolic models to embed hierarchical graph structures with less distortion.

3.2 Heterogeneous Graph Embedding Models
The heterogeneity of both graph structures and node attributes makes it challenging for graph embedding to encode their diverse and rich information. In this section, we introduce translational distance methods and semantic matching methods, which address this issue by constructing different energy functions. Furthermore, we introduce meta-path-based methods, which use different strategies to capture graph heterogeneity.
Translational distance methods. The first translational distance model is TransE [6]. The basic idea of translational distance models is, for each observed fact $(h, r, t)$ representing a head entity $h$ having relation $r$ with tail entity $t$, to learn a graph representation such that $h$ and $t$ are closely connected by relation $r$ in the low-dimensional embedding space, i.e., $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ in geometric notation. Here $\mathbf{h}$, $\mathbf{r}$, and $\mathbf{t}$ are the embedding vectors for entities $h$, $t$ and relation $r$, respectively. The energy function of TransE is defined as $f_r(h, t) = \|\mathbf{h} + \mathbf{r} - \mathbf{t}\|$. The margin-based objective function of TransE is shown in Equation (5):
$$\mathcal{L} = \sum_{(h, r, t) \in S} \; \sum_{(h', r, t') \in S'} \left[\gamma + f_r(h, t) - f_r(h', t')\right]_{+} \qquad (5)$$

where $S$ denotes the set containing the true facts $(h, r, t)$, $S'$ is the set of false triplets $(h', r, t')$ that are not observed in the knowledge graph, and $\gamma$ is the margin. Please note that the energy function $f_r(h, t)$ here can be viewed as the distance score of the embeddings of entities $h$ and $t$ in terms of relation $r$. To further improve the TransE model and address its inadequacies, many recent works have been developed. For example, RotatE [44] defines each relation as a rotation from the source entity to the target entity in the complex vector space. QuatE [56] computes node embedding vectors in the hypercomplex space with three imaginary components, as opposed to the standard complex space with a single imaginary component. MuRP [3] is a hyperbolic embedding method that embeds multi-relational data in the Poincaré ball model of hyperbolic space and performs well on hierarchical and scale-free graphs.

Semantic matching methods. Semantic matching models exploit similarity-based scoring functions. They measure the plausibility of facts by matching the latent semantics of entities and relations embodied in their representations. Targeting the observed fact $(h, r, t)$, RESCAL [34] embeds each entity with a vector to capture its latent semantics, and each relation with a matrix to model pairwise interactions between latent factors. Equation (6) defines the energy function:
$$f_r(h, t) = \mathbf{h}^{T} M_r \mathbf{t} \qquad (6)$$

where $M_r$ is a matrix associated with the relation $r$. HolE [33] deals with directed graphs and composes head and tail entities by their circular correlation, achieving better performance than RESCAL. Other works extend or simplify RESCAL, e.g., DistMult [54], ComplEx [45], and ANALOGY [30]. Another direction of semantic matching methods is to fuse a neural network architecture, treating the embeddings as the input layer and the energy function as the output layer. For instance, the SME model [5] first inputs the embeddings of entities and relations in the input layer. The relation $r$ is then combined with the head entity $h$ to get $g_u(h, r)$, and with the tail entity $t$ to get $g_v(t, r)$, in the hidden layer. The score function is defined as $f_r(h, t) = g_u(h, r)^{T} g_v(t, r)$. Other semantic matching methods using neural network architectures include NTN [42] and MLP [15].

Meta-path-based methods. Generally, a meta-path is an ordered path consisting of node types connected via edge types defined on the graph schema, e.g., $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} A_3$, which describes a composite relation between node types $A_1$, $A_2$, $A_3$ and edge types $R_1$, $R_2$. Thus, meta-paths can be viewed as high-order proximity between two nodes with specific semantics. A set of recent works has been proposed. Metapath2vec [16] computes node embeddings by feeding meta-path-guided random walks to a skip-gram [32] model. HAN [51] learns meta-path-oriented node embeddings from different meta-path-based graphs converted from the original heterogeneous graph, and leverages the attention mechanism to combine them into one vector representation for each node. HERec [39] learns node embeddings for recommendation by applying DeepWalk [36] to meta-path-based homogeneous graphs. MAGNN [17] comprehensively considers three main components to achieve state-of-the-art performance: it fuses node content transformation to encapsulate node attributes, intra-meta-path aggregation to incorporate intermediate semantic nodes, and inter-meta-path aggregation to combine messages from multiple meta-paths.
Other methods. LANE [23] constructs proximity matrices incorporating label information and graph topology, and learns embeddings while preserving their correlations based on the Laplacian matrix. EOE [53] aims to embed coupled heterogeneous graphs consisting of two non-attributed graphs. In EOE, latent features encode not only intra-network edges but also inter-network ones; to tackle the heterogeneity of the two graphs, EOE incorporates a harmonious embedding matrix to further embed the embeddings. Inspired by generative adversarial network models, HeGAN [21] is designed to be relation-aware in order to capture the rich semantics of heterogeneous graphs, and trains a discriminator and a generator in a minimax game to generate robust graph embeddings.

3.3 Dynamic Graph Embedding Models
In practice, graphs are always evolving over time. Recently, much attention has been paid to graph embedding for dynamic graphs. In this section, we briefly introduce some typical general models.
Probabilistic models. Among generative probabilistic models, dynamic latent space models and dynamic stochastic block models are the two main types. Latent space models represent every node with an unobserved feature vector; an edge between two nodes is then formed conditionally independently of all other pairs of nodes, and the latent features change over time. Such models are flexible, but they require fitting parameters with Markov chain Monte Carlo methods, which scale up to only a few hundred nodes [25]. Stochastic block models divide nodes into blocks (classes), where nodes within a block are assumed to have identical statistical properties. An edge between two nodes is formed independently of all other pairs of nodes, with a probability that depends only on the blocks of the two nodes, so the expected adjacency matrix has a block structure corresponding to pairs of blocks.
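The edge-generation process of a stochastic block model can be sketched as follows, with an illustrative two-block probability matrix `B`:

```python
import numpy as np

def sample_sbm(block_of, B, seed=0):
    """Sample an undirected graph from a stochastic block model: an edge
    (i, j) appears independently with probability B[block_of[i], block_of[j]],
    i.e., depending only on the blocks of the two endpoints."""
    rng = np.random.RandomState(seed)
    n = len(block_of)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.rand() < B[block_of[i], block_of[j]]:
                A[i, j] = A[j, i] = 1
    return A

# two blocks of three nodes each: dense within blocks, sparse between
block_of = [0, 0, 0, 1, 1, 1]
B = np.array([[0.9, 0.05],
              [0.05, 0.9]])
A = sample_sbm(block_of, B)
print(A.shape)  # (6, 6)
```

A dynamic variant lets `block_of` (or `B`) evolve over timestamps, sampling one snapshot per step.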
Dynamic graph embedding methods. There are mainly three types of dynamic graph embedding methods, i.e., tensor decomposition-based methods, random walk-based methods, and deep learning-based methods, which are inspired by their counterparts for homogeneous graphs. Tensor decomposition is analogous to matrix factorization, where the additional dimension is time. Random walk-based methods for dynamic graphs are generally extensions of random walk-based embedding methods for static graphs, or they apply temporal random walks. Furthermore, deep learning models for dynamic graphs mainly comprise two types: temporal restricted Boltzmann machines and dynamic graph neural networks. For a detailed analysis, please refer to the surveys on dynamic graph embedding in
[41, 26].

4 Applications and Tasks in Biomedical Domain
4.1 Biomedical datasets
We first summarize some commonly used biomedical datasets in Table 1, where the columns are: the average number of nodes, the average number of edges, the dimensionality of node features, the number of node classes, the number of graphs, and the graph type, respectively.
Dataset  avg. nodes  avg. edges  Features  Classes  Graphs  Graph Type 

PubMed-diabetes  19,717.00  44,338.00  500  3  1  Citation Graph 
PPI  2,372.67  34,113.17  50  121  24  Biochemical Graph 
MUTAG  17.93  19.79  7  2  188  Biochemical Graph 
NCI1  29.87  32.30  37  2  4,110  Biochemical Graph 
NCI33  30.20    29    2,843  Biochemical Graph 
NCI83  29.50    28    3,867  Biochemical Graph 
NCI109  29.60    38    4,127  Biochemical Graph 
DD  284.31  715.65  82  2  1,178  Biochemical Graph 
PROTEINS  39.06  72.81  4  2  1,113  Biochemical Graph 
ENZYMES  32.46  63.14  6  6  600  Biological Graph 
PubMed-diabetes (https://linqs.soe.ucsc.edu/data) is a citation graph consisting of scientific publications and citations pertaining to diabetes. PPI (http://snap.stanford.edu/graphsage/ppi.zip) contains 24 graphs of protein-protein interactions in different organisms such as Homo sapiens and Mus musculus. The MUTAG dataset (https://ls11-www.cs.uni-dortmund.de/people/morris/graphkerneldatasets) contains nitro compounds divided into two classes according to their mutagenic effect on a bacterium. NCI{1, 33, 83, 109} [35] contain chemical compounds screened for activity against non-small cell lung cancer, melanoma, breast cancer, and ovarian cancer, respectively. DD and PROTEINS (https://chrsmrrs.github.io/datasets/docs/datasets/) are two datasets that represent proteins as graphs whose labels are enzyme and non-enzyme, and ENZYMES [52] is a biological dataset.
4.2 Applications and Tasks
In recent years, graph embedding methods have been applied in biomedical data science. In this section, we introduce the main biomedical applications of graph embedding techniques, including pharmaceutical data analysis, multi-omics data analysis, and clinical data analysis.
Pharmaceutical data analysis. Generally, there are two main types of applications in pharmaceutical data analysis, i.e., (i) drug repositioning and (ii) adverse drug reaction analysis.
(i) Drug repositioning usually aims to predict unknown drug-target or drug-disease interactions. Recently, DTINet [31] generated drug and target-protein embeddings by separately performing random walk with restart on heterogeneous biomedical graphs; DTINet then projects drugs into the embedding space of target proteins and makes predictions based on geometric proximity. Other studies on drug repositioning focus on predicting drug-disease associations. For instance, Dai et al. [14]
first embedded genes by applying eigenvalue decomposition to a gene-gene interaction graph and calculated genomic representations for drugs and diseases from the gene embedding vectors. Wang et al.
[49] proposed to detect unknown drug-disease interactions from the medical literature by fusing NLP and graph embedding techniques. (ii) An adverse drug reaction (ADR) is any undesirable drug effect, beyond its desired therapeutic effects, that occurs at a usual dosage; ADR analysis is now central to drug development before a drug is launched into clinical trials.

Multi-omics data analysis. The main aim of multi-omics is to study the structures, functions, and dynamics of organism molecules. Graph embedding has become a valuable tool for analyzing relational data in omics. Concretely, the computational tasks in multi-omics data analysis mainly concern (i) genomics, (ii) proteomics, and (iii) transcriptomics.
(i) Works using graph embedding for genomics data analysis usually try to decipher biology from genome sequences and related data. For example, based on gene-gene interaction data, a recent work [29] extends the graph embedding method LINE over two bipartite graphs, the CellContexGene and GeneContexGene networks, and proposes SCRL to address representation learning for single-cell RNA-seq data; it outperforms traditional dimensionality reduction methods according to the experimental results. (ii) As introduced before, PPIs play key roles in most cell functions. Graph embedding has also been applied to PPI graphs for proteomics data analysis, such as assessing and predicting PPIs or predicting protein functions. Recently, ProSNet [50] was proposed for protein function prediction. In this model, the authors introduce DCA to a heterogeneous molecular graph and further use meta-path-based methods to modify DCA so as to preserve heterogeneous structural information. Thanks to the proposed embedding method for such heterogeneous graphs, the experimental prediction performance was greatly improved. (iii) Transcriptomics studies analyze an organism's transcriptome. For instance, identifying miRNA-disease associations has become an important topic in the study of pathogenicity, and graph embedding provides a useful tool for predicting miRNA-disease associations. To predict new associations, CMFMDA [38] introduces matrix factorization methods to the bipartite miRNA-disease graph for graph embedding. Besides, Li et al. [28] proposed a method that uses DeepWalk to embed the bipartite miRNA-disease network. Their experimental results demonstrate that, by preserving both local and global graph topology, DeepWalk can yield significant improvements in association prediction for miRNA-disease graphs.
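The matrix-factorization idea behind methods like CMFMDA can be roughly sketched as follows. This is a generic $M \approx UV^{T}$ factorization under squared loss, not the exact collaborative objective of [38]; the tiny association matrix `M` and all hyperparameters are illustrative:

```python
import numpy as np

def factorize_associations(M, d, steps=200, lr=0.05, reg=0.01, seed=0):
    """Gradient-descent factorization M ~ U V^T of a binary association
    matrix (rows: miRNAs, columns: diseases). Unobserved pairs with a high
    predicted score U_i . V_j are candidate new associations."""
    rng = np.random.RandomState(seed)
    n, m = M.shape
    U = 0.1 * rng.randn(n, d)
    V = 0.1 * rng.randn(m, d)
    for _ in range(steps):
        E = M - U @ V.T                  # residual over all entries
        U += lr * (E @ V - reg * U)      # gradient step with L2 shrinkage
        V += lr * (E.T @ U - reg * V)
    return U, V

M = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]], dtype=float)
U, V = factorize_associations(M, d=2)
scores = U @ V.T  # predicted association strengths
print(scores.shape)  # (3, 3)
```

Ranking the zero entries of `M` by `scores` yields the candidate associations to validate.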
Clinical data analysis. Graph embedding techniques have been applied to clinical data, such as electronic medical records (EMRs), electronic health records (EHRs), and medical knowledge graphs, providing useful assistance and support for clinicians in recent clinical developments.
EMRs and EHRs are heterogeneous graphs that comprehensively include medical and clinical information about patients, providing opportunities for graph embedding techniques to support medical research and clinical decision-making. To address the heterogeneity of EMR and EHR data, GRAM [13] learns EHR representations with the help of hierarchical information inherent to medical ontologies. ProSNet [22] constructs a biomedical knowledge graph to learn the embeddings of medical entities; the proposed method is used to visualize a Parkinson's disease data set. Constructing medical knowledge graphs has attracted great attention recently. For instance, analogously to TransE, Zhao et al. [58] defined an energy function that treats the relation between patients' symptoms and diseases as a translation vector, in order to learn representations of medical forum data. A further method learns embeddings of medical entities in the medical knowledge graph based on the energy functions of RESCAL and TransE [57]. In addition, Wang et al. [48] constructed an objective function using both the energy function of TransR and LINE's second-order proximity measure to learn embeddings from a heterogeneous medical knowledge graph, in order to recommend proper medicine to patients.
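A TransR-style energy of the kind referenced above can be sketched as follows; `M_r` projects entities into a relation-specific space before the translation is applied, and all values below are toy examples rather than learned clinical embeddings:

```python
import numpy as np

def transr_energy(h, r, t, M_r):
    """TransR-style energy: project entity vectors into the relation-specific
    space via M_r, then score the translation ||M_r h + r - M_r t||."""
    h_r, t_r = M_r @ h, M_r @ t
    return np.linalg.norm(h_r + r - t_r)

# toy 3-dim entity space projected into a 2-dim relation space
rng = np.random.RandomState(0)
h, t = rng.randn(3), rng.randn(3)
M_r = rng.randn(2, 3)
r = M_r @ t - M_r @ h  # construct r so the fact (h, r, t) is plausible
print(transr_energy(h, r, t, M_r))  # ~0 for this constructed triple
```

In a recommendation setting, candidate medicines for a patient are those tail entities with the lowest energy under the treatment relation.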
5 Conclusion
Graph embedding methods aim to learn compact and informative representations for graph analysis, and thus provide a powerful opportunity to solve traditional graph-based machine learning problems both effectively and efficiently. With the rapid growth of relational data in the biomedical domain, applying graph embedding techniques now draws much attention in numerous biomedical applications. However, as we have reviewed in this survey, the capability of graph embedding for biomedical graph analysis has not been fully explored. Many issues associated with biomedical data may bring challenges to biomedical graph embedding tasks: for example, biomedical data may be poorly structured, and knowledge and information from the biomedical domain or from health care records can be complicated compared to the general domain. In this survey, we introduced recent developments and trends of different graph embedding methods. By carefully summarizing biomedical applications of graph embedding methods, we provide more perspectives on this emerging research domain for better improvement of human health care.
References
 [1] (2018) Watch your step: learning node embeddings via graph attention. In NeurIPS, pp. 9180–9190. Cited by: §3.1.
 [2] (2005) Scale-free networks in cell biology. Journal of Cell Science. Cited by: §1.
 [3] (2019) Multi-relational Poincaré graph embeddings. In NeurIPS, pp. 4465–4475. Cited by: §3.2.
 [4] (2011) Network medicine: a network-based approach to human disease. Nature Reviews Genetics 12 (1), pp. 56–68. Cited by: §1.
 [5] (2014) A semantic matching energy function for learning with multi-relational data. ML 94 (2), pp. 233–259. Cited by: §3.2.
 [6] (2013) Translating embeddings for modeling multi-relational data. In NeurIPS, pp. 2787–2795. Cited by: §3.2.
 [7] (2018) A comprehensive survey of graph embedding: problems, techniques, and applications. TKDE 30 (9), pp. 1616–1637. Cited by: §1.
 [8] (2016) Deep neural networks for learning graph representations. In AAAI, Cited by: §3.1.
 [9] (2020) Machine learning on graphs: a model and comprehensive taxonomy. arXiv preprint arXiv:2005.03675. Cited by: §1.
 [10] (2020) Low-dimensional hyperbolic knowledge graph embeddings. ACL. Cited by: §3.1.

 [11] (2019) Hyperbolic graph convolutional neural networks. In NeurIPS, pp. 4868–4879. Cited by: §3.1.
 [12] (2020) Efficient community search over large directed graph: an augmented index-based approach. In IJCAI, pp. 3544–3550. Cited by: §1.

 [13] (2017) GRAM: graph-based attention model for healthcare representation learning. In SIGKDD, Cited by: §4.2.
 [14] (2015) Matrix factorization-based prediction of novel drug indications by integrating genomic space. CMMM 2015. Cited by: §4.2.
 [15] (2014) Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In SIGKDD, pp. 601–610. Cited by: §3.2.
 [16] (2017) Metapath2vec: scalable representation learning for heterogeneous networks. In SIGKDD, pp. 135–144. Cited by: §3.2.
 [17] (2020) MAGNN: metapath aggregated graph neural network for heterogeneous graph embedding. In WWW, pp. 2331–2341. Cited by: §3.2.
 [18] (2018) Bine: bipartite network embedding. In SIGIR, pp. 715–724. Cited by: §3.1.
 [19] (2018) Graph embedding techniques, applications, and performance: a survey. Knowledge-Based Systems 151, pp. 78–94. Cited by: §1.
 [20] (2016) Node2vec: scalable feature learning for networks. In SIGKDD, pp. 855–864. Cited by: §3.1.
 [21] (2019) Adversarial learning on heterogeneous information networks. In SIGKDD, pp. 120–129. Cited by: §3.2.
 [22] (2018) VisAGE: integrating external knowledge into electronic medical record visualization.. In PSB, pp. 578–589. Cited by: §4.2.
 [23] (2017) Label informed attributed network embedding. In WSDM, pp. 731–739. Cited by: §3.2.
 [24] (2018) Anonymous walk embeddings. arXiv:1805.11921. Cited by: §3.1.
 [25] (2016) Evaluating link prediction accuracy in dynamic networks with added and removed edges. In BDCloudSocialComSustainCom, pp. 377–384. Cited by: §3.3.
 [26] (2020) Representation learning for dynamic graphs: a survey. Journal of Machine Learning Research 21 (70), pp. 1–73. Cited by: §3.3.
 [27] (2016) Semi-supervised classification with graph convolutional networks. arXiv:1609.02907. Cited by: §3.1.
 [28] (2017) Predicting microRNA-disease associations using network topological similarity based on DeepWalk. IEEE Access 5, pp. 24032–24039. Cited by: §4.2.
 [29] (2017) Network embedding-based representation learning for single cell RNA-seq data. Nucleic Acids Research 45 (19), pp. e166–e166. Cited by: §4.2.
 [30] (2017) Analogical inference for multi-relational embeddings. In ICML, pp. 2168–2178. Cited by: §3.2.
 [31] (2017) A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nature Communications 8 (1), pp. 1–13. Cited by: §4.2.

 [32] (2013) Efficient estimation of word representations in vector space. In ICLR (Workshop Poster), Cited by: §3.2.
 [33] (2016) Holographic embeddings of knowledge graphs. In AAAI, Cited by: §3.2.
 [34] (2011) A three-way model for collective learning on multi-relational data. In ICML, Vol. 11, pp. 809–816. Cited by: §3.2.
 [35] (2013) Graph stream classification using labeled and unlabeled graphs. In ICDE, pp. 398–409. Cited by: §4.1.
 [36] (2014) Deepwalk: online learning of social representations. In SIGKDD, pp. 701–710. Cited by: §1, §3.1, §3.2.
 [37] (2018) Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM, Cited by: §3.1.
 [38] (2017) MiRNA-disease association prediction with collaborative matrix factorization. Complexity. Cited by: §4.2.
 [39] (2018) Heterogeneous information network embedding for recommendation. TKDE 31 (2), pp. 357–370. Cited by: §3.2.
 [40] (2017) Deep ehr: a survey of recent advances in deep learning techniques for electronic health record (ehr) analysis. IEEE journal of biomedical and health informatics 22 (5), pp. 1589–1604. Cited by: §1.
 [41] (2020) Foundations and modelling of dynamic networks using dynamic graph neural networks: a survey. arXiv:2005.07496. Cited by: §3.3.
 [42] (2013) Reasoning with neural tensor networks for knowledge base completion. In NeurIPS, pp. 926–934. Cited by: §3.2.
 [43] (2020) Network embedding in biomedical data science. Briefings in bioinformatics 21 (1), pp. 182–197. Cited by: §1.
 [44] (2019) RotatE: knowledge graph embedding by relational rotation in complex space. In ICLR (Poster), Cited by: §3.2.
 [45] (2016) Complex embeddings for simple link prediction. Cited by: §3.2.
 [46] (2017) Graph attention networks. arXiv:1710.10903. Cited by: §3.1.
 [47] (2016) Structural deep network embedding. In SIGKDD, pp. 1225–1234. Cited by: §3.1.
 [48] (2017) Safe medicine recommendation via medical knowledge graph embedding. arXiv:1710.05980. Cited by: §4.2.
 [49] (2017) Largescale extraction of drug–disease pairs from the medical literature. Journal of the AIST 68 (11), pp. 2649–2661. Cited by: §4.2.
 [50] (2017) PROSNET: integrating homology with molecular networks for protein function prediction. In PSB, pp. 27–38. Cited by: §4.2.
 [51] (2019) Heterogeneous graph attention network. In WWW, pp. 2022–2032. Cited by: §3.2.
 [52] (2019) Capsule graph neural network. In ICLR (Poster), Cited by: §4.1.
 [53] (2017) Embedding of embedding (eoe) joint embedding for coupled heterogeneous networks. In WSDM, pp. 741–749. Cited by: §3.2.
 [54] (2014) Embedding entities and relations for learning and inference in knowledge bases. arXiv:1412.6575. Cited by: §3.2.
 [55] (2018) Binarized attributed network embedding. In ICDM, pp. 1476–1481. Cited by: §3.1.
 [56] (2019) Quaternion knowledge graph embeddings. In NeurIPS, pp. 2731–2741. Cited by: §3.2.

 [57] (2018) EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning. Artificial Intelligence in Medicine 87, pp. 49–59. Cited by: §4.2.
 [58] (2017) ContextCare: incorporating contextual information networks to representation learning on medical forum data. In IJCAI, pp. 3497–3503. Cited by: §4.2.