1 Introduction
Building Knowledge Graphs (KGs) out of unstructured data is an area of active research. Research in this area has resulted in the construction of several large-scale KGs, such as NELL Mitchell et al. (2015), Google Knowledge Vault Dong et al. (2014) and YAGO Suchanek et al. (2007). These KGs consist of millions of entities and beliefs involving those entities. Such KG construction methods are schema-guided, as they require the list of input relations and their schemata (e.g., playerPlaysSport(Player, Sport)). In other words, knowledge of schemata is an important first step towards building such KGs.
While beliefs in such KGs are usually binary (i.e., involving two entities), many beliefs of interest go beyond two entities. For example, in the sports domain, one may be interested in beliefs of the form win(Roger Federer, Nadal, Wimbledon, London), which is an instance of the higher-order (or n-ary) relation win whose schema is given by win(WinningPlayer, OpponentPlayer, Tournament, Location). We refer to the problem of inducing such relation schemata involving multiple arguments as Higher-order Relation Schema Induction (HRSI). In spite of its importance, HRSI is mostly unexplored.
Recently, tensor factorization-based methods have been proposed for binary relation schema induction Nimishakavi et al. (2016), with gains in both speed and accuracy over previously proposed generative models. To the best of our knowledge, tensor factorization methods have not been used for HRSI. We address this gap in this paper.
Due to data sparsity, straightforward adaptation of tensor factorization from Nimishakavi et al. (2016) to HRSI is not feasible, as we shall see in Section 3.1. We overcome this challenge in this paper, and make the following contributions.

We propose Tensor Factorization with Back-off and Aggregation (TFBA), a novel tensor factorization-based method for Higher-order RSI (HRSI). In order to overcome data sparsity, TFBA backs off and jointly factorizes multiple lower-order tensors derived from an extremely sparse higher-order tensor.

As an aggregation step, we propose a constrained clique mining step which constructs the higher-order schemata from multiple binary schemata.

Through experiments on multiple real-world datasets, we show the effectiveness of TFBA for HRSI.
Source code of TFBA is available at https://github.com/madhavcsa/TFBA.
The remainder of the paper is organized as follows. We discuss related work in Section 2. In Section 3.1, we first motivate why a back-off strategy is needed for HRSI, rather than factorizing the higher-order tensor directly. We then discuss the proposed TFBA framework in Section 3.2. In Section 4, we demonstrate the effectiveness of the proposed approach using multiple real-world datasets. We conclude with a brief summary in Section 5.
2 Related Work
In this section, we discuss related work in two broad areas: schema induction, and tensor and matrix factorization.
Schema Induction: Most work on inducing schemata for relations has been in the binary setting Mohamed et al. (2011); Movshovitz-Attias and Cohen (2015); Nimishakavi et al. (2016). McDonald et al. (2005) and Peng et al. (2017) extract n-ary relations from biomedical documents, but do not induce the schema, i.e., the type signature of the n-ary relations. There has been a significant amount of work on Semantic Role Labeling Lang and Lapata (2011); Titov and Khoddam (2015); Roth and Lapata (2016), which can be considered as n-ary relation extraction. However, we are interested in inducing the schemata, i.e., the type signature of these relations. Event Schema Induction is the problem of inducing schemata for events in the corpus Balasubramanian et al. (2013); Chambers (2013); Nguyen et al. (2015). Recently, a model for event representations was proposed in Weber et al. (2018).
Table 1: Notation used in this paper.

Notation | Definition
$\mathbb{R}_+$ | Set of non-negative reals.
$\mathcal{X} \in \mathbb{R}_+^{n_1 \times n_2 \times \ldots \times n_k}$ | $k$-order non-negative tensor.
$X_{(n)}$ | Mode-$n$ matricization of tensor $\mathcal{X}$. Please see Kolda and Bader (2009) for details.
$A \in \mathbb{R}_+^{n \times r}$ | Non-negative matrix $A$ of order $n \times r$.
$\odot$ | Hadamard product: $(P \odot Q)_{i,j} = P_{i,j} \, Q_{i,j}$.
Cheung et al. (2013) propose a probabilistic model for inducing frames from text. Their notion of frame is closer to that of scripts Schank and Abelson (1977). Script learning is the process of automatically inferring sequences of events from text Mooney and DeJong (1985). There is a fair amount of recent work in statistical script learning Pichotta and Mooney (2014, 2016). While script learning deals with sequences of events, we try to find the schemata of relations at the corpus level. Ferraro and Van Durme (2016) propose a unified Bayesian model for scripts, frames and events. Their model tries to capture all levels of the Minsky frame structure Minsky (1974), whereas we work with surface semantic frames.
Tensor and Matrix Factorizations: Matrix factorization and joint tensor-matrix factorizations have been used for the problem of predicting links in the Universal Schema setting Riedel et al. (2013); Singh et al. (2015). Chen et al. (2015) use matrix factorization for the problem of finding semantic slots for unsupervised spoken language understanding. Tensor factorization methods are also used for factorizing knowledge graphs Chang et al. (2014); Nickel et al. (2012). Joint matrix and tensor factorization frameworks, where the matrix provides additional information, are proposed in Acar et al. (2013) and Wang et al. (2015). These models are based on PARAFAC Harshman (1970), a tensor factorization model which approximates the given tensor as a sum of rank-1 tensors. A Boolean Tucker decomposition for discovering facts is proposed in Erdos and Miettinen (2013). In this paper, we use a modified version (Tucker2) of the Tucker decomposition Tucker (1963).
RESCAL Nickel et al. (2011) is a simplified Tucker model suitable for relational learning. Recently, SICTF Nimishakavi et al. (2016), a variant of RESCAL with side information, was used for the problem of schema induction for binary relations. SICTF cannot be directly used to induce higher-order schemata, as the higher-order tensors involved in inducing such schemata tend to be extremely sparse. TFBA overcomes this challenge and induces higher-order relation schemata by performing non-negative Tucker-style factorization of a sparse tensor while utilizing a back-off strategy, as explained in the next section.
3 Higher-order Relation Schema Induction using Back-off Factorization
In this section, we start by discussing the approach of factorizing a higher-order tensor and provide the motivation for the back-off strategy. Next, we discuss the proposed TFBA approach in detail. Please refer to Table 1 for the notation used in this paper.
3.1 Factorizing a Higher-order Tensor
Given a text corpus, we use OpenIE v5 Mausam (2016) to extract tuples. Consider the following sentence: "Federer won against Nadal at Wimbledon.". Given this sentence, OpenIE extracts the 4-tuple (Federer, won, against Nadal, at Wimbledon). We lemmatize the relations in the tuples and only consider the noun phrases as arguments. Let $\mathcal{T}$ represent the set of these 4-tuples. We can construct a 4-order tensor $\mathcal{X} \in \mathbb{R}_+^{n_1 \times n_2 \times n_3 \times m}$ from $\mathcal{T}$. Here, $n_1$ is the number of subject noun phrases (NPs), $n_2$ is the number of object NPs, $n_3$ is the number of other NPs, and $m$ is the number of relations in $\mathcal{T}$. Values in the tensor correspond to the frequency of the tuples. In case of 5-tuples of the form (subject, relation, object, other1, other2), we split the 5-tuple into two 4-tuples of the form (subject, relation, object, other1) and (subject, relation, object, other2), and the frequency of these 4-tuples is considered to be the same as that of the original 5-tuple. Factorizing the tensor $\mathcal{X}$ results in discovering latent categories of NPs, which help in inducing the schemata. We propose the following approach to factorize $\mathcal{X}$.
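The tuple preprocessing described above (splitting 5-tuples and building a 4-mode count tensor) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names and the dict-of-counts sparse representation are our own.

```python
from collections import Counter

def split_5tuples(extractions):
    """Split (subj, rel, obj, other1, other2) 5-tuples into two 4-tuples,
    each inheriting the frequency of the original 5-tuple."""
    out = Counter()
    for tup, freq in extractions.items():
        if len(tup) == 5:
            s, r, o, x1, x2 = tup
            out[(s, r, o, x1)] += freq
            out[(s, r, o, x2)] += freq
        else:
            out[tup] += freq
    return out

def build_tensor(tuples4):
    """Index NPs and relations per mode and return a sparse 4-mode count
    tensor as {(i, j, k, l): count}, plus the per-mode vocabularies."""
    subj, obj, other, rel = {}, {}, {}, {}
    def idx(vocab, key):
        return vocab.setdefault(key, len(vocab))
    X = Counter()
    for (s, r, o, x), freq in tuples4.items():
        X[(idx(subj, s), idx(obj, o), idx(other, x), idx(rel, r))] += freq
    return X, (subj, obj, other, rel)
```

With this layout, the tensor stores only non-zero cells, which matters given the extreme sparsity discussed below.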
$$\min_{A, B, C, \mathcal{G} \ge 0} \; \|\mathcal{X} - \mathcal{G} \times_1 A \times_2 B \times_3 C\|_F^2 + \lambda_A \|A\|_F^2 + \lambda_B \|B\|_F^2 + \lambda_C \|C\|_F^2,$$

where $A \in \mathbb{R}_+^{n_1 \times r_1}$, $B \in \mathbb{R}_+^{n_2 \times r_2}$, $C \in \mathbb{R}_+^{n_3 \times r_3}$, and $\mathcal{G} \in \mathbb{R}_+^{r_1 \times r_2 \times r_3 \times m}$ is the core tensor; the relation mode is left unfactorized. Non-negative updates for the variables can be obtained following Lee and Seung (2000). Similar to Nimishakavi et al. (2016), induced schemata will be of the form relation $\langle A_i, B_j, C_k \rangle$. Here, $P_i$ represents the $i$-th column of a matrix $P$. $A$ is the embedding matrix of subject NPs in $\mathcal{X}$ (i.e., mode-1 of $\mathcal{X}$), and $r_1$ is the embedding rank in mode-1, i.e., the number of latent categories of subject NPs. Similarly, $B$ and $C$ are the embedding matrices of object NPs and other NPs, respectively, and $r_2$ and $r_3$ are the numbers of latent categories of object NPs and other NPs. $\lambda_A$, $\lambda_B$, and $\lambda_C$ are the regularization weights. However, the 4-order tensor is heavily sparse for all the datasets we consider in this work: the fraction of non-zero entries is of the order of $10^{-7}$. As a result of this extreme sparsity, this approach fails to learn any schemata. Therefore, we propose a more successful back-off strategy for higher-order RSI in the next section.
3.2 TFBA: Proposed Framework
To alleviate the problem of sparsity, we construct three tensors $\mathcal{X}^{(1)}$, $\mathcal{X}^{(2)}$, and $\mathcal{X}^{(3)}$ from $\mathcal{X}$ as follows:

$\mathcal{X}^{(1)} \in \mathbb{R}_+^{n_1 \times n_2 \times m}$ is constructed out of the tuples in $\mathcal{T}$ by dropping the other argument and aggregating the resulting tuples, i.e., $\mathcal{X}^{(1)}_{i,j,l} = \sum_{k=1}^{n_3} \mathcal{X}_{i,j,k,l}$. For example, the 4-tuples ⟨(Federer, Win, Nadal, Wimbledon), 10⟩ and ⟨(Federer, Win, Nadal, Australian Open), 5⟩ will be aggregated to form the triple ⟨(Federer, Win, Nadal), 15⟩.

$\mathcal{X}^{(2)} \in \mathbb{R}_+^{n_1 \times n_3 \times m}$ is constructed out of the tuples in $\mathcal{T}$ by dropping the object argument and aggregating the resulting tuples, i.e., $\mathcal{X}^{(2)}_{i,k,l} = \sum_{j=1}^{n_2} \mathcal{X}_{i,j,k,l}$.

$\mathcal{X}^{(3)} \in \mathbb{R}_+^{n_2 \times n_3 \times m}$ is constructed out of the tuples in $\mathcal{T}$ by dropping the subject argument and aggregating the resulting tuples, i.e., $\mathcal{X}^{(3)}_{j,k,l} = \sum_{i=1}^{n_1} \mathcal{X}_{i,j,k,l}$.
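The back-off aggregation above amounts to summing the 4-mode count tensor over one NP mode at a time. A minimal sketch, assuming the sparse {(i, j, k, l): count} representation for the 4-mode tensor (the function name is ours):

```python
from collections import Counter

def backoff_tensors(X4):
    """Aggregate a sparse 4-mode count tensor {(i, j, k, l): count} into
    three sparse 3-mode back-off tensors by summing out one NP mode each."""
    X1, X2, X3 = Counter(), Counter(), Counter()
    for (i, j, k, l), c in X4.items():
        X1[(i, j, l)] += c   # drop the "other" argument
        X2[(i, k, l)] += c   # drop the object argument
        X3[(j, k, l)] += c   # drop the subject argument
    return X1, X2, X3
```

Each back-off tensor is far denser than the original 4-mode tensor, since many 4-tuples collapse onto the same triple.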
The proposed framework TFBA for inducing higher-order schemata involves the following two steps.
3.2.1 Step 1: Back-off Tensor Factorization
A schematic overview of this step is shown in Figure 1. TFBA first preprocesses the corpus and extracts an OpenIE tuple set $\mathcal{T}$ out of it. The 4-mode tensor $\mathcal{X}$ is constructed out of $\mathcal{T}$. Instead of performing factorization of the higher-order tensor as in Section 3.1, TFBA creates three tensors out of $\mathcal{X}$: $\mathcal{X}^{(1)}$, $\mathcal{X}^{(2)}$, and $\mathcal{X}^{(3)}$.
TFBA performs a coupled non-negative Tucker factorization of the input tensors $\mathcal{X}^{(1)}$, $\mathcal{X}^{(2)}$, and $\mathcal{X}^{(3)}$ by solving the following optimization problem:

$$\min_{A, B, C, \mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \mathcal{G}^{(3)}} \; \|\mathcal{X}^{(1)} - \mathcal{G}^{(1)} \times_1 A \times_2 B\|_F^2 + \|\mathcal{X}^{(2)} - \mathcal{G}^{(2)} \times_1 A \times_2 C\|_F^2 + \|\mathcal{X}^{(3)} - \mathcal{G}^{(3)} \times_1 B \times_2 C\|_F^2 + \lambda_A \|A\|_F^2 + \lambda_B \|B\|_F^2 + \lambda_C \|C\|_F^2, \quad (1)$$

where $A \in \mathbb{R}_+^{n_1 \times r_1}$, $B \in \mathbb{R}_+^{n_2 \times r_2}$, $C \in \mathbb{R}_+^{n_3 \times r_3}$, $\mathcal{G}^{(1)} \in \mathbb{R}_+^{r_1 \times r_2 \times m}$, $\mathcal{G}^{(2)} \in \mathbb{R}_+^{r_1 \times r_3 \times m}$, and $\mathcal{G}^{(3)} \in \mathbb{R}_+^{r_2 \times r_3 \times m}$. We enforce non-negativity constraints on the matrices $A$, $B$, $C$ and the core tensors $\mathcal{G}^{(1)}$, $\mathcal{G}^{(2)}$, $\mathcal{G}^{(3)}$. Non-negativity is essential for learning interpretable latent factors Murphy et al. (2012).
Each slice of a core tensor corresponds to one of the relations. Each cell in a slice corresponds to an induced schema in terms of the latent factors from the matrices $A$ and $B$. In other words, $\mathcal{G}^{(1)}_{i,j,l}$ is an induced binary schema for relation $l$ involving the induced categories represented by columns $A_i$ and $B_j$. Cells in $\mathcal{G}^{(2)}$ and $\mathcal{G}^{(3)}$ may be interpreted accordingly.
We derive non-negative multiplicative updates for $A$, $B$, and $C$ following the NMF updating rules given in Lee and Seung (2000). For the update of $A$, we consider the mode-1 matricizations of the first and second terms in Equation 1 along with the regularizer:

$$A \leftarrow A \odot \frac{X^{(1)}_{(1)} U_1^\top + X^{(2)}_{(1)} U_2^\top}{A \left( U_1 U_1^\top + U_2 U_2^\top + \lambda_A I \right)},$$

where $U_1 = G^{(1)}_{(1)} (I \otimes B)^\top$ and $U_2 = G^{(2)}_{(1)} (I \otimes C)^\top$. In order to estimate $B$, we consider the mode-2 matricization of the first term and the mode-1 matricization of the third term in Equation 1, along with the regularization term. We get the following update rule for $B$:

$$B \leftarrow B \odot \frac{X^{(1)}_{(2)} V_1^\top + X^{(3)}_{(1)} V_2^\top}{B \left( V_1 V_1^\top + V_2 V_2^\top + \lambda_B I \right)},$$

where $V_1 = G^{(1)}_{(2)} (I \otimes A)^\top$ and $V_2 = G^{(3)}_{(1)} (I \otimes C)^\top$. For updating $C$, we consider the mode-2 matricizations of the second and third terms in Equation 1 along with the regularization term, and we get

$$C \leftarrow C \odot \frac{X^{(2)}_{(2)} W_1^\top + X^{(3)}_{(2)} W_2^\top}{C \left( W_1 W_1^\top + W_2 W_2^\top + \lambda_C I \right)},$$

where $W_1 = G^{(2)}_{(2)} (I \otimes A)^\top$ and $W_2 = G^{(3)}_{(2)} (I \otimes B)^\top$. In all the above updates, the fraction bar represents element-wise division, $\otimes$ is the Kronecker product, and $I$ is the identity matrix.
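The coupled factorization of Equation 1 with multiplicative updates can be sketched in NumPy as below. This is a minimal slice-wise implementation (equivalent to the matricized updates but written per relation slice); the function name, random initialization, and iteration count are our assumptions — the paper initializes from individual Tucker2 decompositions instead.

```python
import numpy as np

def tfba_factorize(X1, X2, X3, r1, r2, r3, lam=(0.1, 0.1, 0.1),
                   n_iter=200, eps=1e-9, seed=0):
    """Coupled non-negative Tucker2 factorization (Equation 1):
    X1 ~ G1 x1 A x2 B, X2 ~ G2 x1 A x2 C, X3 ~ G3 x1 B x2 C,
    with the relation mode left unfactorized. Multiplicative updates
    in the style of Lee and Seung (2000)."""
    rng = np.random.default_rng(seed)
    n1, n2, m = X1.shape
    n3 = X2.shape[1]
    la, lb, lc = lam
    A = rng.random((n1, r1)); B = rng.random((n2, r2)); C = rng.random((n3, r3))
    G1 = rng.random((r1, r2, m)); G2 = rng.random((r1, r3, m)); G3 = rng.random((r2, r3, m))

    for _ in range(n_iter):
        # Update A using the X1 and X2 reconstruction terms.
        num = (np.einsum('ijl,jq,pql->ip', X1, B, G1)
               + np.einsum('ikl,kq,pql->ip', X2, C, G2))
        M = (np.einsum('pql,qr,srl->ps', G1, B.T @ B, G1)
             + np.einsum('pql,qr,srl->ps', G2, C.T @ C, G2))
        A *= num / (A @ M + la * A + eps)
        # Update B using the X1 and X3 terms.
        num = (np.einsum('ijl,ip,pql->jq', X1, A, G1)
               + np.einsum('jkl,kq,pql->jp', X3, C, G3))
        M = (np.einsum('pql,ps,srl->qr', G1, A.T @ A, G1)
             + np.einsum('pql,qr,srl->ps', G3, C.T @ C, G3))
        B *= num / (B @ M + lb * B + eps)
        # Update C using the X2 and X3 terms.
        num = (np.einsum('ikl,ip,pql->kq', X2, A, G2)
               + np.einsum('jkl,jp,pql->kq', X3, B, G3))
        M = (np.einsum('pql,ps,srl->qr', G2, A.T @ A, G2)
             + np.einsum('pql,ps,srl->qr', G3, B.T @ B, G3))
        C *= num / (C @ M + lc * C + eps)
        # Update the core tensors slice-wise.
        G1 *= (np.einsum('ijl,ip,jq->pql', X1, A, B)
               / (np.einsum('ps,sql,qr->prl', A.T @ A, G1, B.T @ B) + eps))
        G2 *= (np.einsum('ikl,ip,kq->pql', X2, A, C)
               / (np.einsum('ps,sql,qr->prl', A.T @ A, G2, C.T @ C) + eps))
        G3 *= (np.einsum('jkl,jp,kq->pql', X3, B, C)
               / (np.einsum('ps,sql,qr->prl', B.T @ B, G3, C.T @ C) + eps))
    return A, B, C, G1, G2, G3
```

Because all factors stay non-negative under multiplicative updates, the core slices remain directly interpretable as schema scores.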
Initialization: For initializing the component matrices $A$, $B$, and $C$, we first perform a non-negative Tucker2 decomposition Kim and Choi (2007) of each individual input tensor $\mathcal{X}^{(1)}$, $\mathcal{X}^{(2)}$, and $\mathcal{X}^{(3)}$. We then compute the average of the component matrices obtained from each individual decomposition for initialization. We initialize the core tensors $\mathcal{G}^{(1)}$, $\mathcal{G}^{(2)}$, and $\mathcal{G}^{(3)}$ with the core tensors obtained from the individual decompositions.
3.2.2 Step 2: Binary to Higher-Order Schema Induction
In this section, we describe how a higher-order schema is constructed from the factorization described in the previous subsection. Each relation $l$ has three representations, given by the slices $\mathcal{G}^{(1)}_{:,:,l}$, $\mathcal{G}^{(2)}_{:,:,l}$, and $\mathcal{G}^{(3)}_{:,:,l}$ of the core tensors. We need a principled way to produce a joint schema from these representations. For a relation, we select the top indices with the highest values from each slice. Indices $(i, j)$ from $\mathcal{G}^{(1)}_{:,:,l}$ correspond to column numbers of $A$ and $B$ respectively, indices $(i, k)$ from $\mathcal{G}^{(2)}_{:,:,l}$ correspond to columns of $A$ and $C$, and indices $(j, k)$ from $\mathcal{G}^{(3)}_{:,:,l}$ correspond to columns of $B$ and $C$.
We construct a tripartite graph with the column numbers of each of the component matrices $A$, $B$, and $C$ as the vertices of three independent sets; the top indices selected above form the edges between these vertices. From this tripartite graph, we find all the triangles, each of which gives a schema with three arguments for a relation, as illustrated in Figure 2. We find higher-order schemata, i.e., schemata with more than three arguments, by merging two third-order schemata that share the same column numbers from $A$ and $B$. For example, if we find two schemata $\langle A_i, B_j, C_{k_1} \rangle$ and $\langle A_i, B_j, C_{k_2} \rangle$, we merge them to give $\langle A_i, B_j, C_{k_1}, C_{k_2} \rangle$ as a higher-order schema. This can be continued further for even higher-order schemata. This process may be thought of as finding a constrained clique over the tripartite graph, where the constraint is that the maximal clique can contain only one edge between the sets corresponding to columns of $A$ and columns of $B$.
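The triangle mining and merging steps above can be sketched as follows, for a single relation; the function name and the edge-set input format (top-scoring index pairs from the three core-tensor slices) are our own illustration.

```python
def mine_schemata(top_ab, top_ac, top_bc):
    """Mine constrained cliques over the tripartite graph of A-, B-, and
    C-columns. Edges are the top-scoring index pairs from the three core
    slices of one relation; triangles give ternary schemata, and triangles
    sharing the same (A, B) pair are merged into higher-order schemata."""
    ab, ac, bc = set(top_ab), set(top_ac), set(top_bc)
    # A triangle (i, j, k) needs edges (i, j), (i, k), and (j, k).
    triangles = [(i, j, k)
                 for (i, j) in ab
                 for k in ({k for (i2, k) in ac if i2 == i}
                           & {k for (j2, k) in bc if j2 == j})]
    # Merge triangles that share the same A and B columns.
    merged = {}
    for i, j, k in triangles:
        merged.setdefault((i, j), []).append(k)
    return [(i, j, tuple(sorted(ks))) for (i, j), ks in sorted(merged.items())]
```

For example, edges {(0, 1)}, {(0, 2), (0, 3)}, {(1, 2), (1, 3)} yield the two triangles (0, 1, 2) and (0, 1, 3), which merge into the 4-ary schema (0, 1, (2, 3)).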
The procedure above is inspired by McDonald et al. (2005). However, we note that McDonald et al. (2005) solved a different problem, viz., n-ary relation instance extraction, while our focus is on inducing schemata. Though we discuss the case of backing off from 4-order to 3-order tensors, the ideas presented above can be extended to even higher orders, depending on the sparsity of the tensors.
4 Experiments
Table 2: Details of the dimensions of the input tensors constructed from the Shootings, NYT Sports, and MUC datasets.
Table 3: Hyperparameters used for the different datasets: factorization ranks $(r_1, r_2, r_3)$ and regularization weights $(\lambda_A, \lambda_B, \lambda_C)$.

Dataset | $(r_1, r_2, r_3)$ | $(\lambda_A, \lambda_B, \lambda_C)$
Shootings | (10, 20, 15) | (0.3, 0.1, 0.7)
NYT Sports | (20, 15, 15) | (0.9, 0.5, 0.7)
MUC | (15, 12, 12) | (0.7, 0.7, 0.4)
Table 4: Examples of higher-order schemata induced by TFBA, with NPs from the induced argument categories, evaluator judgment, and human-suggested labels.

Relation Schema | NPs from the induced categories | Evaluator Judgment | (Human) Suggested Label

Shootings
leave | shooting, shooting incident, double shooting | valid | shooting
 | one person, two people, three people | | people
 | dead, injured, on edge | | injured
identify | police, officers, huntsville police | valid | police
 | man, victims, four victims | | victim(s)
 | sunday, shooting staurday, wednesday afternoon | | day/time
 | apartment, bedroom, building in the neighborhood | | place
shoot | gunman, shooter, smith | valid | perpetrator
 | freeman, slain woman, victims | | victim
 | friday, friday night, early monday morning | | time
shoot | numyearold man, numyearold george reavis, numyearold brockton man | valid | victim
 | in the leg, in the head, in the neck | | body part
 | in macon, in chicago, in an alley | | location
say | police, officers, huntsville police | invalid | –
 | man, victims, four victims | |
 | sunday, shooting staurday, wednesday afternoon | |

NYT Sports
spend | yankees, mets, jets | valid | team
 | $ num million, $ num, $ num billion | | money
 | num, year, last season | | year
win | red sox, team, yankees | valid | team
 | world series, title, world cup | | championship
 | num, year, last season | | year
get | umpire, mike cameron, andre agassi | invalid | –
 | ball, lives, grounder | |
 | back, forward, numyard line | |

MUC
tell | medardo gomez, jose azcona, gregorio roza chavez | valid | politician
 | media, reporters, newsmen | | media
 | today, at num, tonight | | day/time
occur | bomb, blast, explosion | valid | bombing
 | near san salvador, here in madrid, in the same office | | place
 | at num, this time, simultaneously | | time
suffer | justice maria elena diaz, vargas escobar, judge sofia de roldan | invalid | –
 | casualties, car bomb, grenade | |
 | settlement of refugees, in san roman, now | |
Table 5: Accuracy of the higher-order schemata induced by each method, as judged by three evaluators (E1, E2, E3) on the Shootings, NYT Sports, and MUC datasets.

 | Shootings | | | | NYT Sports | | | | MUC | | |
 | E1 | E2 | E3 | Avg | E1 | E2 | E3 | Avg | E1 | E2 | E3 | Avg
HardClust | 0.64 | 0.70 | 0.64 | 0.66 | 0.42 | 0.28 | 0.52 | 0.46 | 0.64 | 0.58 | 0.52 | 0.58
Chambers13 | 0.32 | 0.42 | 0.28 | 0.34 | 0.08 | 0.02 | 0.04 | 0.07 | 0.28 | 0.34 | 0.30 | 0.30
TFBA | 0.82 | 0.78 | 0.68 | 0.76 | 0.86 | 0.60 | 0.64 | 0.70 | 0.58 | 0.38 | 0.48 | 0.48
In this section, we evaluate the performance of TFBA for the task of HRSI. We also propose a baseline model for HRSI called HardClust.
HardClust: We propose a baseline model called the Hard Clustering Baseline (HardClust) for the task of higher-order relation schema induction. This model induces schemata by grouping per-relation NP arguments from OpenIE extractions. In other words, for each relation, all the noun phrases (NPs) in the first argument form a cluster that represents the subject of the relation, all the NPs in the second argument form a cluster that represents the object, and so on. Then, from each cluster, the most frequent NPs are chosen as the representative NPs for the argument type. We note that this method can induce only one schema per relation.
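The HardClust baseline can be sketched in a few lines; the function name, the choice of keeping the top three NPs, and the sparse tuple-count input format are our own illustration.

```python
from collections import Counter, defaultdict

def hard_clust(tuples4, top=3):
    """HardClust baseline sketch: for each relation, cluster the NPs by
    argument position and keep the most frequent NPs as that argument's
    representative category (yielding exactly one schema per relation)."""
    slots = defaultdict(lambda: [Counter(), Counter(), Counter()])
    for (s, r, o, x), freq in tuples4.items():
        for pos, np_ in enumerate((s, o, x)):
            slots[r][pos][np_] += freq
    return {r: tuple([w for w, _ in c.most_common(top)] for c in cs)
            for r, cs in slots.items()}
```

The frequency-only grouping is what causes the limitations discussed below: one schema per relation, and categories dominated by the same high-frequency NPs.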
Datasets: We run our experiments on three datasets. The first dataset (Shootings) is a collection of 1,335 documents constructed from a publicly available database of mass shootings in the United States. The second is the New York Times Sports (NYT Sports) dataset, a collection of 20,940 sports documents from the years 2005–2007. The third dataset (MUC) is a set of 1,300 Latin American newswire documents about terrorism events. After performing the processing steps described in Section 3, we obtained 357,914 unique OpenIE extractions from the NYT Sports dataset, 10,847 from the Shootings dataset, and 8,318 from the MUC dataset. However, in order to properly analyze and evaluate the model, we consider only the 50 most frequent relations in each dataset and their corresponding OpenIE extractions. This is done to filter out noisy OpenIE extractions, yielding better data quality, and to aid subsequent manual evaluation of the data. We construct input tensors following the procedure described in Section 3.2. Details on the dimensions of the tensors obtained are given in Table 2.
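The restriction to the 50 most frequent relations can be sketched as follows; the function name and tuple-count input format are our own, mirroring the data-cleaning step described above.

```python
from collections import Counter

def filter_top_relations(tuples4, k=50):
    """Keep only the extractions whose relation is among the k most
    frequent relations (by total tuple frequency)."""
    rel_freq = Counter()
    for (s, r, o, x), freq in tuples4.items():
        rel_freq[r] += freq
    keep = {r for r, _ in rel_freq.most_common(k)}
    return {t: f for t, f in tuples4.items() if t[1] in keep}
```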
Model Selection: In order to select appropriate TFBA hyperparameters, we perform a grid search over the space of hyperparameters and select the set of hyperparameters that gives the best average FIT score:

$$\mathrm{AvgFIT} = \frac{1}{3} \sum_{i=1}^{3} \mathrm{FIT}(\mathcal{X}^{(i)}), \quad \text{where } \mathrm{FIT}(\mathcal{X}^{(i)}) = 1 - \frac{\|\mathcal{X}^{(i)} - \hat{\mathcal{X}}^{(i)}\|_F}{\|\mathcal{X}^{(i)}\|_F}$$

and $\hat{\mathcal{X}}^{(i)}$ is the reconstruction of $\mathcal{X}^{(i)}$ obtained from the factorization.
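The FIT-based model-selection criterion can be computed as follows (a small sketch; the function names are ours):

```python
import numpy as np

def fit_score(X, X_hat):
    """FIT of a reconstruction: 1 minus the relative Frobenius error."""
    return 1.0 - np.linalg.norm(X - X_hat) / np.linalg.norm(X)

def avg_fit(pairs):
    """Average FIT over (tensor, reconstruction) pairs, e.g. the three
    back-off tensors and their Tucker2 reconstructions."""
    return sum(fit_score(X, Xh) for X, Xh in pairs) / len(pairs)
```

A perfect reconstruction gives FIT = 1, while predicting all zeros gives FIT = 0, so hyperparameter settings are ranked on a common scale across the three tensors.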
We perform a grid search for the rank parameters between 5 and 20, and for the regularization weights between 0 and 1. Table 3 provides details of the hyperparameters chosen for the different datasets.

Evaluation Protocol: For TFBA, we follow the protocol described in Section 3.2.2 for constructing higher-order schemata. For every relation, we consider the top 5 binary schemata from the factorization of each tensor. We construct a tripartite graph, as explained in Section 3.2.2, and mine constrained maximal cliques from the tripartite graph to obtain schemata. Table 4 provides some qualitative examples of higher-order schemata induced by TFBA. The accuracy of the schemata induced by the model is evaluated by human evaluators. In our experiments, we use human judgments from three evaluators. For every relation, the first and second columns of Table 4 are presented to the evaluators, who are asked to validate the schema. We present to the evaluators the top 50 schemata, ranked by the score of the constrained maximal clique induced by TFBA. This evaluation protocol was also used in Movshovitz-Attias and Cohen (2015) for evaluating ontology induction. All evaluations were blind, i.e., the evaluators were not aware of the model they were evaluating.
Difficulty with Computing Recall: Even though recall is a desirable measure, it cannot be computed here due to the lack of a corpus annotated with gold higher-order schemata. Although the MUC dataset has gold annotations for a predefined list of events, it does not have annotations for the relations.
Experimental results comparing the performance of the various models on the task of HRSI are given in Table 5. We present evaluation results from three evaluators, denoted E1, E2 and E3. As can be observed from Table 5, TFBA achieves better results than HardClust on the Shootings and NYT Sports datasets; however, HardClust achieves better results on the MUC dataset. The percentage agreement of the evaluators on TFBA's output is 72%, 70% and 60% for the Shootings, NYT Sports and MUC datasets, respectively.
HardClust Limitations: Even though HardClust gives better results on the MUC corpus, this approach has some serious drawbacks. HardClust can induce only one schema per relation. This is a restrictive constraint, as a relation can have multiple senses. For example, consider the schemata induced for the relation shoot shown in Table 4: TFBA induces two senses of the relation, but HardClust can induce only one schema. Moreover, for a set of 4-tuples, HardClust can only induce ternary schemata; the dimensionality of the schemata cannot be varied. Since the latent factors induced by HardClust are based entirely on frequency, the latent categories it induces are dominated by a fixed set of noun phrases. For example, in the NYT Sports dataset, the subject category induced by HardClust for every relation is team, yankees, mets. In other words, in addition to inducing only one schema per relation, HardClust usually induces only a fixed set of categories. For TFBA, in contrast, the number of categories depends on the rank of the factorization, a user-provided parameter, thus providing more flexibility in choosing the latent categories.
4.1 Using Event Schema Induction for HRSI
Event schema induction is defined as the task of learning high-level representations of events, like a tournament, and their entity roles, like winning-player, from unlabeled text. Even though the main focus of event schema induction is to induce the important roles of the events, most algorithms also provide schemata for the relations as a side result. In this section, we investigate the effectiveness of these schemata compared to the ones induced by TFBA.
Event schemata are represented as a set of (Actor, Rel, Actor) triples in Balasubramanian et al. (2013). Actors represent groups of noun phrases and Rels represent relations. From this style of representation, however, n-ary schemata for relations cannot be induced. Event schemata generated in Weber et al. (2018) are similar to those in Balasubramanian et al. (2013). The event schema induction algorithm proposed in Nguyen et al. (2015) does not induce schemata for relations, but rather induces roles for the events. For this investigation, we experiment with the following algorithm.
Chambers13 Chambers (2013): This model learns event templates from text documents. Each event template provides a distribution over slots, where slots are clusters of NPs. Each event template also provides a cluster of relations, which is most likely to appear in the context of the aforementioned slots. We evaluate the schemata of these relation clusters.
As can be observed from Table 5, the proposed TFBA performs much better than Chambers13. HardClust also performs better than Chambers13 on all the datasets. From this analysis, we infer that there is a need for algorithms that induce higher-order schemata for relations, a gap we fill in this paper. Please note that the experimental results reported in Chambers (2013) on the MUC dataset are for the task of event schema induction, whereas in this work we evaluate relation schemata; hence the results in Chambers (2013) and the results in this paper are not comparable. Example schemata induced by TFBA and Chambers13 are provided as part of the supplementary material.
5 Conclusion
Higher-order Relation Schema Induction (HRSI) is an important first step towards building domain-specific Knowledge Graphs (KGs). In this paper, we proposed TFBA, a tensor factorization-based method for higher-order RSI. To the best of our knowledge, this is the first attempt at inducing higher-order (n-ary) schemata for relations from unlabeled text. Rather than factorizing a severely sparse higher-order tensor directly, TFBA performs back-off and jointly factorizes multiple lower-order tensors derived from the higher-order tensor. In the second step, TFBA solves a constrained clique problem to induce schemata from multiple binary schemata. We are hopeful that the back-off-based factorization idea exploited in TFBA will be useful in other sparse factorization settings.
Acknowledgment
We thank the anonymous reviewers for their insightful comments and suggestions. This research has been supported in part by the Ministry of Human Resource Development (Government of India), Accenture, and Google.
References
 Acar et al. (2013) Evrim Acar, Morten Arendt Rasmussen, Francesco Savorani, Tormod Næs, and Rasmus Bro. 2013. Understanding data fusion within the framework of coupled matrix and tensor factorizations. Chemometrics and Intelligent Laboratory Systems 129:53–63.
 Balasubramanian et al. (2013) Niranjan Balasubramanian, Stephen Soderland, Mausam, and Oren Etzioni. 2013. Generating coherent event schemas at scale. In EMNLP.
 Chambers (2013) Nathanael Chambers. 2013. Event schema induction with a probabilistic entity-driven model. In EMNLP.
 Chang et al. (2014) Kai-Wei Chang, Wen-tau Yih, Bishan Yang, and Christopher Meek. 2014. Typed tensor decomposition of knowledge bases for relation extraction. In EMNLP.
 Chen et al. (2015) Yun-Nung Chen, William Yang Wang, Anatole Gershman, and Alexander I. Rudnicky. 2015. Matrix factorization with knowledge graph propagation for unsupervised spoken language understanding. In ACL.
 Cheung et al. (2013) Jackie Chi Kit Cheung, Hoifung Poon, and Lucy Vanderwende. 2013. Probabilistic frame induction. In NAACL-HLT.
 Dong et al. (2014) Xin Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, and Wei Zhang. 2014. Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In KDD.
 Erdos and Miettinen (2013) Dora Erdos and Pauli Miettinen. 2013. Discovering facts with boolean tensor tucker decomposition. In CIKM.
 Ferraro and Durme (2016) Francis Ferraro and Benjamin Van Durme. 2016. A unified bayesian model of scripts, frames and language. In AAAI.
 Harshman (1970) R. A. Harshman. 1970. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics 16(1):84.
 Kim and Choi (2007) YongDeok Kim and Seungjin Choi. 2007. Nonnegative tucker decomposition. In CVPR.
 Kolda and Bader (2009) Tamara G Kolda and Brett W Bader. 2009. Tensor decompositions and applications. SIAM review 51(3):455–500.
 Lang and Lapata (2011) Joel Lang and Mirella Lapata. 2011. Unsupervised semantic role induction via split-merge clustering. In NAACL-HLT.
 Lee and Seung (2000) Daniel D. Lee and H. Sebastian Seung. 2000. Algorithms for nonnegative matrix factorization. In NIPS.
 Mausam (2016) Mausam. 2016. Open information extraction systems and downstream applications. In IJCAI.
 McDonald et al. (2005) Ryan McDonald, Fernando Pereira, Seth Kulick, Scott Winters, Yang Jin, and Pete White. 2005. Simple algorithms for complex relation extraction with applications to biomedical ie. In ACL.
 Minsky (1974) Marvin Minsky. 1974. A framework for representing knowledge. Technical report.
 Mitchell et al. (2015) T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, and J. Welling. 2015. Never-ending learning. In AAAI.
 Mohamed et al. (2011) Thahir P. Mohamed, Estevam R. Hruschka Jr., and Tom M. Mitchell. 2011. Discovering relations between noun categories. In EMNLP.
 Mooney and DeJong (1985) Raymond Mooney and Gerald DeJong. 1985. Learning schemata for natural language processing. In IJCAI.
 Movshovitz-Attias and Cohen (2015) Dana Movshovitz-Attias and William W. Cohen. 2015. KB-LDA: Jointly learning a knowledge base of hierarchy, relations, and facts. In ACL.
 Murphy et al. (2012) Brian Murphy, Partha Talukdar, and Tom Mitchell. 2012. Learning effective and interpretable semantic models using nonnegative sparse embedding. In COLING.
 Nguyen et al. (2015) Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret, and Romaric Besançon. 2015. Generative event schema induction with entity disambiguation. In ACL.
 Nickel et al. (2011) Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A three-way model for collective learning on multi-relational data. In ICML.
 Nickel et al. (2012) Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2012. Factorizing YAGO: Scalable machine learning for linked data. In WWW.
 Nimishakavi et al. (2016) Madhav Nimishakavi, Uday Singh Saini, and Partha Talukdar. 2016. Relation schema induction using tensor factorization with side information. In EMNLP.
 Peng et al. (2017) Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, and Wen-tau Yih. 2017. Cross-sentence n-ary relation extraction with graph LSTMs. TACL 5:101–115.
 Pichotta and Mooney (2014) Karl Pichotta and Raymond J. Mooney. 2014. Statistical script learning with multi-argument events. In EACL.
 Pichotta and Mooney (2016) Karl Pichotta and Raymond J. Mooney. 2016. Learning statistical scripts with LSTM recurrent neural networks. In AAAI.
 Riedel et al. (2013) Sebastian Riedel, Limin Yao, Andrew McCallum, and Benjamin M. Marlin. 2013. Relation extraction with matrix factorization and universal schemas. In NAACL-HLT.
 Roth and Lapata (2016) Michael Roth and Mirella Lapata. 2016. Neural semantic role labeling with dependency path embeddings. In ACL.
 Schank and Abelson (1977) R. Schank and R. Abelson. 1977. Scripts, plans, goals and understanding: An inquiry into human knowledge structures. Lawrence Erlbaum Associates, Hillsdale, NJ.
 Singh et al. (2015) Sameer Singh, Tim Rocktäschel, and Sebastian Riedel. 2015. Towards combined matrix and tensor factorization for universal schema relation extraction. In NAACL Workshop on Vector Space Modeling for NLP (VSM).
 Suchanek et al. (2007) Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. YAGO: a core of semantic knowledge. In WWW.
 Titov and Khoddam (2015) Ivan Titov and Ehsan Khoddam. 2015. Unsupervised induction of semantic roles within a reconstruction-error minimization framework. In NAACL-HLT.
 Tucker (1963) L. R. Tucker. 1963. Implications of factor analysis of three-way matrices for measurement of change. In Problems in Measuring Change, University of Wisconsin Press, Madison, WI, pages 122–137.
 Wang et al. (2015) Yichen Wang, Robert Chen, Joydeep Ghosh, Joshua C. Denny, Abel N. Kho, You Chen, Bradley A. Malin, and Jimeng Sun. 2015. Rubik: Knowledge guided tensor factorization and completion for health data analytics. In KDD.
 Weber et al. (2018) Noah Weber, Niranjan Balasubramanian, and Nathanael Chambers. 2018. Event representations with tensorbased compositions. In AAAI.