Fuzzy Logic based Logical Query Answering on Knowledge Graph

08/05/2021 ∙ by Xuelu Chen, et al.

Answering complex First-Order Logical (FOL) queries on large-scale incomplete knowledge graphs (KGs) is an important yet challenging task. Recent advances embed logical queries and KG entities in the same vector space and conduct query answering via dense similarity search. However, most of the logical operators designed in existing works do not satisfy the axiomatic system of classical logic. Moreover, these logical operators are parameterized, so they require a large number of complex FOL queries as training data, which are often arduous or even impossible to collect in most real-world KGs. In this paper, we present FuzzQE, a fuzzy logic based query embedding framework for answering FOL queries over KGs. FuzzQE follows fuzzy logic to define logical operators in a principled and learning-free manner. Extensive experiments on two benchmark datasets demonstrate that FuzzQE achieves significantly better performance in answering FOL queries than state-of-the-art methods. In addition, FuzzQE trained with only KG link prediction, without any complex queries, achieves performance comparable to systems trained with all FOL queries.


1 Introduction

Figure 1: FOL query and its computation graph for the question “Who have sung the songs written by John Lennon or Paul McCartney, but never won Grammy Award?”.
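Operator | Law | Logic Law | Model Property
$\wedge$ | Conjunction elimination | $\psi \wedge \chi \rightarrow \psi$, $\psi \wedge \chi \rightarrow \chi$ | $\phi(q_1 \wedge q_2, e) \le \phi(q_1, e)$, $\phi(q_1 \wedge q_2, e) \le \phi(q_2, e)$
$\wedge$ | Commutativity | $\psi \wedge \chi \leftrightarrow \chi \wedge \psi$ | $\phi(q_1 \wedge q_2, e) = \phi(q_2 \wedge q_1, e)$
$\wedge$ | Associativity | $(\psi \wedge \chi) \wedge \omega \leftrightarrow \psi \wedge (\chi \wedge \omega)$ | $\phi((q_1 \wedge q_2) \wedge q_3, e) = \phi(q_1 \wedge (q_2 \wedge q_3), e)$
$\vee$ | Disjunction amplification | $\psi \rightarrow \psi \vee \chi$, $\chi \rightarrow \psi \vee \chi$ | $\phi(q_1, e) \le \phi(q_1 \vee q_2, e)$, $\phi(q_2, e) \le \phi(q_1 \vee q_2, e)$
$\vee$ | Commutativity | $\psi \vee \chi \leftrightarrow \chi \vee \psi$ | $\phi(q_1 \vee q_2, e) = \phi(q_2 \vee q_1, e)$
$\vee$ | Associativity | $(\psi \vee \chi) \vee \omega \leftrightarrow \psi \vee (\chi \vee \omega)$ | $\phi((q_1 \vee q_2) \vee q_3, e) = \phi(q_1 \vee (q_2 \vee q_3), e)$
$\neg$ | Involution | $\neg\neg\psi \leftrightarrow \psi$ | $\phi(\neg\neg q, e) = \phi(q, e)$
$\neg$ | Non-contradiction | $\psi \wedge \neg\psi \rightarrow \overline{0}$ | $\phi(q, e)\uparrow \Rightarrow \phi(\neg q, e)\downarrow$
Table 1: Here we list some logic laws from classical logic and fuzzy logic axiomatic systems (fuzzytheorybook) and give the corresponding properties that a query embedding model should possess. They are based on the Modus Ponens inference rule “from $\psi$ and $\psi \rightarrow \chi$, infer $\chi$” and the implication semantics “$\psi \rightarrow \chi$ holds if the truth value of $\chi$ is larger than or equal to that of $\psi$”. The full axiomatic systems and analysis are provided in Appendix B. $\psi, \chi, \omega$ represent logical formulae. $\phi(q, e)$ denotes the scoring function that estimates the probability that the entity $e$ can answer the query $q$. $\phi(q, e)\uparrow \Rightarrow \phi(\neg q, e)\downarrow$ means $\phi(\neg q, e)$ is monotonically decreasing with regard to $\phi(q, e)$.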
Model | $\wedge$ Expressivity (Closed) | $\wedge$ Com. | $\wedge$ Asso. | $\wedge$ Elim. | $\vee$ Expressivity (Closed) | $\vee$ Com. | $\vee$ Asso. | $\vee$ Ampli. | $\neg$ Expressivity (Closed) | $\neg$ Inv. | $\neg$ Non-Contra.
GQE | ✓(✓) | | | | ✓(✗) | | | | | N/A | N/A
Query2Box | ✓(✓) | | | | ✓(✗) | | | | | N/A | N/A
BetaE$_{DNF}$ | ✓(✓) | | | | ✓(✗) | | | | ✓(✓) | |
BetaE$_{DM}$ | ✓(✓) | | | | ✓(✓) | | | | ✓(✓) | |
FuzzQE | ✓(✓) | | | | ✓(✓) | | | | ✓(✓) | |
Table 2: Comparisons of existing works regarding the properties of logical operations. Expressivity indicates whether the model can handle such logical operations, and closed indicates whether the embedding is in a closed form. Com., Asso., Elim., Ampli., Inv. and Non-Contra. stand for commutativity, associativity, conjunction elimination, disjunction amplification, involution, and non-contradiction respectively.

Knowledge graphs (KGs), such as Freebase (bollacker2008freebase), YAGO (rebele2016yago), and NELL (mitchell2018never), provide structured representations of facts about real-world entities and relations. One of the fundamental problems in Artificial Intelligence is to answer complex queries, which involves logical reasoning over the facts captured by KGs, e.g., answering First-Order Logic (FOL) queries with existential quantification ($\exists$), conjunction ($\wedge$), disjunction ($\vee$), and negation ($\neg$). For instance, the question “Who have sung the songs written by John Lennon or Paul McCartney, but never won Grammy Award?” can be expressed as the FOL query shown in Fig 1.

Traditional symbolic reasoning approaches (lei2011rdfsubgraph; schmidt2010foundations) have drawbacks in terms of computational complexity and handling missing edges of KGs. To address these issues, recent works (GQE; Query2Box; BetaE) seek to embed logical queries and entities in the same vector space and conduct query answering via dense similarity search. Such approaches robustly handle missing edges and offer significant advantages in terms of time and space complexity of inference. However, the logical operators in these models are defined in an ad-hoc fashion, and many of them do not satisfy the axiomatic system of classical logic or fuzzy logic (KlementTNormBook), which limits their inference accuracy. Furthermore, the logical operators of existing works are based on multi-layer perceptrons (MLPs) and/or attention mechanisms, which require a large number of training queries containing such logical operations to learn the parameters. This greatly limits the scope of application of the models, since it is extremely challenging to collect a large number of reasonable complex queries with accurate answers, and such training data is not readily available in most real-world KGs.

To address the above issues, we present FuzzQE (Fuzzy Query Embedding), a fuzzy logic based embedding framework for answering logical queries on KGs. To implement logical operators in a more principled and learning-free manner, we borrow the idea of fuzzy logic and use the fuzzy conjunction, disjunction, and negation operations to define the logical operators in vector space.

One advantage of FuzzQE is that it employs differentiable logical operators that fully satisfy the axioms of logical operations and can therefore preserve logical operation properties in vector space. This advantage is corroborated by extensive experiments on two benchmark datasets, which demonstrate that FuzzQE delivers significantly better performance than state-of-the-art methods in answering FOL queries.

The other advantage is that our logical operators preserve the properties of the logical operations without requiring any operator-specific parameters to be learned. Accordingly, experimental results show that even when our model is trained with link prediction only, it achieves results comparable with state-of-the-art models that are trained with extra complex query data. Furthermore, when no complex query data is used for training, our model significantly outperforms previous models. This is a major advantage in real-world applications, since complex FOL training queries are often arduous or even impossible to collect in most real-world KGs.

2 Proposed Method

In this section, we present our model FuzzQE, which uses fuzzy operations to implement logical operators.

2.1 Background and Preliminaries

Knowledge Graph (KG)

A knowledge graph consists of a set of triples $\langle e_s, r, e_o \rangle$, with $e_s, e_o \in \mathcal{E}$ (the set of entities) denoting the subject and object entities respectively and $r \in \mathcal{R}$ (the set of relations) denoting the relation between $e_s$ and $e_o$. An FOL query on KG consists of atomic queries, existential quantifiers, and logical connectives $\wedge, \vee, \neg$. Notation wise, we use boldfaced letters $\mathbf{p}_e$ and $\mathbf{S}_q$ to represent the embedding of entity $e$ and query $q$ respectively.

Logic Laws and Model Properties

Here, we refer to axioms shared by both classical logic and basic fuzzy logic (fuzzytheorybook) and summarize several basic properties that the logical operators should possess in Table 1. The complete list of axioms written in a Hilbert-style deductive system is provided in Appendix B. In order to embed logical queries for query answering, previous works (GQE; Query2Box; BetaE) define logical operations in vector space as transformations of query embeddings. We summarize the capability of different models to retain those properties in Table 2.

Model avg_EPFO avg_Neg 1p 2p 3p 2i 3i pi ip 2u up 2in 3in inp pin pni
FB15k-237
GQE 16.3 - 35.0 7.2 5.3 23.3 34.6 16.5 10.7 8.2 5.7 - - - - -
Query2Box 20.1 - 40.6 9.4 6.8 29.5 42.3 21.2 12.6 11.3 7.6 - - - - -
BetaE 20.9 5.5 39.0 10.9 10.0 28.8 42.5 22.4 12.6 12.4 9.7 5.1 7.9 7.4 3.5 3.4
FuzzQE 24.0 7.8 42.8 12.9 10.3 33.3 46.9 26.9 17.8 14.6 10.3 8.5 11.6 7.8 5.2 5.8
NELL995
GQE 18.6 - 32.8 11.9 9.6 27.5 35.2 18.4 14.4 8.5 8.8 - - - - -
Query2Box 22.9 - 42.2 14.0 11.2 33.3 44.5 22.4 16.8 11.3 10.3 - - - - -
BetaE 24.6 5.9 53.0 13.0 11.4 37.6 47.5 24.1 14.3 12.2 8.5 5.1 7.8 10.0 3.1 3.5
FuzzQE 27.0 7.8 47.4 17.2 14.6 39.5 49.2 26.2 20.6 15.3 12.6 7.8 9.8 11.1 4.9 5.5
Table 3: MRR results (%) on answering FOL queries. avg_EPFO denotes the average MRR on EPFO ($\exists, \wedge, \vee$) queries, while avg_Neg is the average MRR on queries containing negation. Results of baseline models are taken from (BetaE).
Model avg_EPFO avg_Neg 1p 2p 3p 2i 3i pi ip 2u up 2in 3in inp pin pni
FB15k-237
GQE 17.7 - 41.6 7.9 5.4 25.0 33.6 16.3 10.9 11.9 6.2 - - - - -
Query2Box 18.2 - 42.6 6.9 4.7 27.3 36.8 17.5 11.1 11.7 5.5 - - - - -
BetaE 19.0 0.4 53.1 6.0 3.9 32.0 37.7 15.8 8.5 10.1 3.5 0.1 1.4 0.1 0.1 0.1
FuzzQE 21.9 6.6 44.0 10.8 8.6 32.3 41.4 22.7 15.1 13.5 8.7 7.7 9.5 7.0 4.1 4.7
NELL995
GQE 21.7 - 47.2 12.7 9.3 30.6 37.0 20.6 16.1 12.6 9.6 - - - - -
Query2Box 21.6 - 47.6 12.5 8.7 30.7 36.5 20.5 16.0 12.7 9.6 - - - - -
BetaE 15.8 0.5 37.7 5.6 4.4 23.3 34.5 15.1 7.8 9.5 4.5 0.1 1.1 0.8 0.1 0.2
FuzzQE 24.7 6.0 55.7 14.9 11.9 35.7 41.5 16.8 22.0 13.4 9.7 6.2 7.7 8.4 3.5 4.2
Table 4: MRR results (%) when models are trained with link prediction only. This tests the ability of the models to generalize to complex logical queries when trained with only KG edges. For the baseline models, we configure the set weights of the logical operators to be identical.

2.2 Logical query embedding

Our objective is to design a query embedding model that can fit the axiomatic system of classical logic and retain the properties in Table 1. Considering fuzzy logic systems is a natural direction, as their axiomatic systems are fully compatible with classical logic, and their logical operations therefore conform to these properties as well. In our work, we implement the logical operators in vector space through element-wise fuzzy conjunction, fuzzy disjunction and fuzzy negation. This design ensures that we can satisfy the requirements of Table 1 in all dimensions. Particularly, we present FuzzQE with reference to product logic, one of the most prominent fuzzy logic systems (KlementTNormBook). Alternative design choices based on Łukasiewicz and Gödel-Dummett logic are discussed in Appendix E.

Embedding Space

We propose to embed entities and queries into the same space: $[0, 1]^d$. Using this embedding space provides the following benefits: (i) Each dimension of the embedding vector lies in $[0, 1]$, which satisfies the domain and range requirements of fuzzy logic and allows the model to execute element-wise fuzzy conjunction/disjunction/negation. (ii) The representation is endowed with a probabilistic interpretation. In particular, this domain restriction is implemented with an element-wise sigmoid function.

Atomic Query

We first discuss how to embed an atomic query like $r(e, V)$. In our work, we associate each relation $r$ with a weight matrix $\mathbf{W}_r \in \mathbb{R}^{d \times d}$ and a bias vector $\mathbf{b}_r \in \mathbb{R}^{d}$. Given a relation $r$ and the anchor entity embedding $\mathbf{p}_e$, we model the query embedding $\mathbf{S}_q$ as the following relation transform:

$$\mathbf{S}_q = g(\mathbf{W}_r \mathbf{p}_e + \mathbf{b}_r)$$

where $g$ is an element-wise sigmoid function to impose the domain restriction $\mathbf{S}_q \in [0, 1]^d$. To avoid the rapid growth in the number of parameters with the number of relations and to alleviate overfitting on rare relations, we follow (RGCN) and adopt basis-decomposition to define $\mathbf{W}_r$: $\mathbf{W}_r = \sum_{j=1}^{K} \alpha_{rj} \mathbf{M}_j$, i.e. as a linear combination of $K$ basis transformations $\mathbf{M}_j$ with coefficients $\alpha_{rj}$ that depend on $r$. It can be seen as a form of effective weight sharing among different relation types. Note that intermediate variables can be embedded by performing multiple such transformations.
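The relation transform with basis decomposition can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`combine_bases`, `relation_transform`), the toy dimension, the random bases, and the coefficient values are all hypothetical.

```python
import math
import random
from typing import List

Vec = List[float]
Mat = List[List[float]]

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def combine_bases(coeffs: Vec, bases: List[Mat]) -> Mat:
    """Build a relation matrix as a linear combination of shared basis matrices."""
    d = len(bases[0])
    return [[sum(a * m[i][k] for a, m in zip(coeffs, bases)) for k in range(d)]
            for i in range(d)]

def relation_transform(W_r: Mat, b_r: Vec, p_e: Vec) -> Vec:
    """Apply sigmoid(W_r p_e + b_r); the sigmoid keeps every entry in (0, 1)."""
    return [sigmoid(sum(w * x for w, x in zip(row, p_e)) + b)
            for row, b in zip(W_r, b_r)]

# Toy setup (hypothetical sizes and values).
random.seed(0)
d, num_bases = 4, 2
bases = [[[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
         for _ in range(num_bases)]
alpha_r = [0.7, -0.3]                 # relation-specific coefficients (assumed)
b_r = [0.0] * d                       # relation-specific bias (assumed)
p_e = [random.random() for _ in range(d)]   # anchor entity embedding in [0, 1]^d

W_r = combine_bases(alpha_r, bases)
S_q = relation_transform(W_r, b_r, p_e)
```

Because only the coefficients are relation-specific, adding a relation costs one coefficient per basis rather than a full matrix.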

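Logical Connectives ($\wedge, \vee, \neg$)

We propose to use fuzzy logic operations to model logical connectives in vector space. Following product logic with Łukasiewicz negator (KlementTNormBook), the embeddings of $q_1 \wedge q_2$, $q_1 \vee q_2$, and $\neg q$ are computed as follows:

$$\mathbf{S}_{q_1 \wedge q_2} = \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}$$
$$\mathbf{S}_{q_1 \vee q_2} = \mathbf{S}_{q_1} + \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}$$
$$\mathbf{S}_{\neg q} = \mathbf{1} - \mathbf{S}_{q}$$

where $\mathbf{S}_q$ is the embedding of query $q$, $\circ$ denotes element-wise multiplication (fuzzy conjunction), and $\mathbf{1}$ is the all-ones vector. $\mathbf{S}_{q_1} + \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}$ corresponds to the product logic fuzzy disjunction $x + y - xy$.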
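The three product-logic operations are simple enough to write out directly. The following is a minimal sketch over plain Python lists; the function names and the toy vectors are illustrative assumptions, and a real implementation would apply the same element-wise formulas to tensors.

```python
from typing import List

Vec = List[float]

def fuzzy_and(a: Vec, b: Vec) -> Vec:
    """Product t-norm, applied element-wise: x * y."""
    return [x * y for x, y in zip(a, b)]

def fuzzy_or(a: Vec, b: Vec) -> Vec:
    """Product t-conorm (probabilistic sum), element-wise: x + y - x * y."""
    return [x + y - x * y for x, y in zip(a, b)]

def fuzzy_not(a: Vec) -> Vec:
    """Lukasiewicz (canonical) negation, element-wise: 1 - x."""
    return [1.0 - x for x in a]

# Toy query embeddings in [0, 1]^2 (hypothetical values).
s1 = [0.5, 0.25]
s2 = [0.5, 1.0]

and_emb = fuzzy_and(s1, s2)
or_emb = fuzzy_or(s1, s2)
not_emb = fuzzy_not(s1)
# A composite query such as "q1 and not q2" is just a composition of the operators.
composite = fuzzy_and(s1, fuzzy_not(s2))
```

None of these functions has parameters, which is what allows the operators to be used without any complex-query training data.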

Scoring Function

We estimate the plausibility that the entity $e$ answers the query $q$ by the score $\phi(q, e)$. In practice, multiple conjunction operations tend to diminish the norms of query embeddings. In order to mitigate the influence of query complexity, we apply Layer Normalization (LayerNorm) before computing the score.

Theoretical Analysis

For FuzzQE, we present the following propositions with proof in Appendix A.

Proposition 1.

Our conjunction operator is commutative, associative, and satisfies conjunction elimination.

Proposition 2.

Our disjunction operator is commutative, associative, and satisfies disjunction amplification.

Proposition 3.

Our negation operator is involutory and satisfies non-contradiction.
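The three propositions can also be sanity-checked numerically at the level of embeddings. The sketch below assumes the product-logic operators (element-wise product, probabilistic sum, and one-minus negation) and tests each law on random vectors in $[0, 1]^d$ up to a small floating-point tolerance. It is an empirical check rather than a proof, and all names in it are hypothetical.

```python
import random
from typing import List

Vec = List[float]

def conj(a: Vec, b: Vec) -> Vec:
    return [x * y for x, y in zip(a, b)]

def disj(a: Vec, b: Vec) -> Vec:
    return [x + y - x * y for x, y in zip(a, b)]

def neg(a: Vec) -> Vec:
    return [1.0 - x for x in a]

def close(a: Vec, b: Vec, tol: float = 1e-12) -> bool:
    return all(abs(x - y) <= tol for x, y in zip(a, b))

def leq(a: Vec, b: Vec, tol: float = 1e-12) -> bool:
    return all(x <= y + tol for x, y in zip(a, b))

def check_laws(trials: int = 1000, dim: int = 8, seed: int = 0) -> bool:
    """Return True if every law holds element-wise on all sampled vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = ([rng.random() for _ in range(dim)] for _ in range(3))
        checks = [
            close(conj(a, b), conj(b, a)),                    # commutativity
            close(conj(conj(a, b), c), conj(a, conj(b, c))),  # associativity
            leq(conj(a, b), a) and leq(conj(a, b), b),        # conjunction elimination
            close(disj(a, b), disj(b, a)),                    # commutativity
            close(disj(disj(a, b), c), disj(a, disj(b, c))),  # associativity
            leq(a, disj(a, b)) and leq(b, disj(a, b)),        # disjunction amplification
            close(neg(neg(a)), a),                            # involution
            # negation is order-reversing: x <= y implies 1 - x >= 1 - y
            all(nx >= ny for x, y, nx, ny in zip(a, b, neg(a), neg(b)) if x <= y),
        ]
        if not all(checks):
            return False
    return True

laws_hold = check_laws()
```

The check operates on embeddings only; whether the same inequalities carry over to the final score depends on the scoring function being monotone in the query embedding.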

2.3 Model learning and inference

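During training, given a query embedding $\mathbf{S}_q$, we optimize the following negative log-likelihood loss:

$$L = -\log \sigma\big(\phi(q, e) - \gamma\big) - \frac{1}{k} \sum_{i=1}^{k} \log \sigma\big(\gamma - \phi(q, e'_i)\big)$$

where $e$ is an entity in the answer set of $q$, $e'_i$ represents a random negative sample, and $\gamma$ denotes the margin. We use $k$ random negative samples and optimize over the average.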
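As a rough illustration, a margin-based negative-sampling loss of the kind described here can be written as follows. The exact form is an assumption: the sketch uses the common pattern of a log-sigmoid term for the positive answer plus the average log-sigmoid term over the negative samples, with the scores supplied as plain floats. The function names are hypothetical.

```python
import math
from typing import List

def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def nll_loss(pos_score: float, neg_scores: List[float], margin: float) -> float:
    """Assumed loss: -log s(pos - margin) - mean_i log s(margin - neg_i)."""
    pos_term = -log_sigmoid(pos_score - margin)
    neg_term = -sum(log_sigmoid(margin - s) for s in neg_scores) / len(neg_scores)
    return pos_term + neg_term

# Toy scores (hypothetical): a well-separated case and an uninformative one.
good = nll_loss(pos_score=5.0, neg_scores=[-5.0, -4.0], margin=0.0)
flat = nll_loss(pos_score=0.0, neg_scores=[0.0, 0.0], margin=0.0)
```

The loss shrinks as the positive score rises above the margin and the negative scores fall below it, which is the behaviour the text describes.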

For inference, given a query $q$, FuzzQE embeds it as $\mathbf{S}_q$ and ranks all the entities by $\phi(q, \cdot)$.
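Ranking at inference time amounts to scoring every entity against the query embedding and sorting. The sketch below assumes a plain dot-product score and omits the layer normalization mentioned earlier; both simplifications, along with the entity names and vectors, are assumptions made for illustration only.

```python
from typing import Dict, List

Vec = List[float]

def score(query_emb: Vec, entity_emb: Vec) -> float:
    """Assumed scoring function: dot product of query and entity embeddings."""
    return sum(x * y for x, y in zip(query_emb, entity_emb))

def rank_entities(query_emb: Vec, entities: Dict[str, Vec]) -> List[str]:
    """Return entity names sorted from most to least plausible answer."""
    return sorted(entities, key=lambda name: score(query_emb, entities[name]),
                  reverse=True)

# Toy example (hypothetical entities and embeddings in [0, 1]^2).
query_emb = [1.0, 0.0]
entities = {"a": [0.2, 0.9], "b": [0.8, 0.1], "c": [0.5, 0.5]}
ranking = rank_entities(query_emb, entities)
```

In a dense-retrieval setting the same computation is a single matrix-vector product followed by a top-k selection.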

3 Experiments

In this section, we evaluate FuzzQE by answering a wide range of complex FOL queries over two incomplete KGs.

Datasets

We evaluate our model on two benchmark datasets provided by (BetaE), which contain 14 types of logical queries on FB15k-237 (FB15k237) and NELL995 (DeepPath). Dataset statistics are summarized in Appendix C. We exclude FB15k (TransE) as this dataset suffers from major test leakage (FB15k237).

Evaluation Protocol

We follow the evaluation protocol in (BetaE). For each answer of a test query, we denote the model’s predicted rank as $r$ and report the Mean Reciprocal Rank (MRR): $\frac{1}{r}$, averaged over answers and queries. To evaluate the model’s generalization over incomplete KGs, the datasets are masked out so that each validation/test query involves at least one missing link.
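For concreteness, the MRR metric can be computed as below. This is a minimal sketch that takes the predicted ranks as given (1 is the best rank); how ranks are obtained, including any filtering of other correct answers, is outside the sketch, and the function name is hypothetical.

```python
from typing import List

def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1/rank over all evaluated answers; ranks start at 1."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy example: three answers ranked 1st, 2nd and 4th.
mrr = mean_reciprocal_rank([1, 2, 4])
```

Because the reciprocal decays quickly, MRR rewards placing correct answers near the top far more than it penalizes very low ranks.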

Baselines

We consider three state-of-the-art baselines for answering complex logical queries on KGs: GQE (GQE), Query2Box (Query2Box), and BetaE (BetaE). GQE and Q2B can answer Existential Positive First-Order (EPFO) queries (i.e. queries with $\exists, \wedge, \vee$), but they cannot model negation. To the best of our knowledge, BetaE is the only existing embedding-based baseline that can model negation in FOL queries. For BetaE, we report the results of the BetaE$_{DNF}$ variant since it generally provides better performance than BetaE$_{DM}$. We list hyperparameters and more details in Appendix D.

3.1 Results

Modeling FOL queries

First we test the ability of FuzzQE to model arbitrary FOL queries. The MRR results are reported in Table 3. For EPFO queries (queries with $\exists, \wedge, \vee$ but no negation), our approach outperforms the state-of-the-art approaches on average and on nearly all query structures. Regarding queries with negation, FuzzQE significantly outperforms the only available baseline BetaE across all the query structures that contain negation. On average, FuzzQE improves the MRR by 2.3 points (42% relative) on FB15k-237 and 1.9 points (32% relative) on NELL995 for queries containing negation.
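The absolute and relative gains quoted for negation queries follow directly from the average negation MRR values in Table 3 (BetaE 5.5 vs. FuzzQE 7.8 on FB15k-237, and 5.9 vs. 7.8 on NELL995). The short sketch below reproduces that arithmetic; the helper name is hypothetical.

```python
from typing import Tuple

def improvement(new: float, old: float) -> Tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - old
    relative = 100.0 * absolute / old
    return absolute, relative

# Average MRR (%) on negation queries, taken from Table 3.
fb_abs, fb_rel = improvement(7.8, 5.5)      # FB15k-237: FuzzQE vs. BetaE
nell_abs, nell_rel = improvement(7.8, 5.9)  # NELL995: FuzzQE vs. BetaE
```

Rounded, these give gains of about 2.3 and 1.9 points, or roughly 42% and 32% relative to the baseline.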

Training with only Link Prediction

As the logical operators in our framework do not contain any learnable parameters, our model can be trained with only the link prediction task and then generalize to answer arbitrary complex FOL queries. To evaluate such generalization, we train our model and the baselines with only KG edges (i.e. 1p queries in the datasets). Experiment setting details are given in Appendix D.2. As shown in Table 4, FuzzQE is able to generalize to unseen complex queries even if it is only trained on link prediction and provides significantly better performance than the baseline models. This is mostly due to the design of the principled and learning-free logical operators. This superiority in generalization is very useful for real-world applications, since complex query datasets are not available in most real-world KGs.

4 Conclusion

We propose a novel logical query embedding framework FuzzQE for answering complex logical queries on KGs. Extensive experiments show the promising capability of FuzzQE on answering logical queries on KGs. The results are encouraging and suggest various extensions, including introducing logical rules into learning, and in-depth study of predicate fuzzy logic systems.

References

Appendix

Appendix A Proof of propositions

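A.1 Proof of Proposition 1

A.1.1 Commutativity

Proof.

We have $\mathbf{S}_{q_1 \wedge q_2} = \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2} = \mathbf{S}_{q_2} \circ \mathbf{S}_{q_1} = \mathbf{S}_{q_2 \wedge q_1}$, where $\mathbf{S}_q$ is the embedding of query $q$ and $\circ$ denotes element-wise multiplication.
Therefore, $\phi(q_1 \wedge q_2, e) = \phi(q_2 \wedge q_1, e)$. ∎

A.1.2 Associativity

Proof.

Since $\mathbf{S}_{(q_1 \wedge q_2) \wedge q_3} = (\mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}) \circ \mathbf{S}_{q_3} = \mathbf{S}_{q_1} \circ (\mathbf{S}_{q_2} \circ \mathbf{S}_{q_3}) = \mathbf{S}_{q_1 \wedge (q_2 \wedge q_3)}$, we have $\phi((q_1 \wedge q_2) \wedge q_3, e) = \phi(q_1 \wedge (q_2 \wedge q_3), e)$. ∎

A.1.3 Conjunction elimination $\phi(q_1 \wedge q_2, e) \le \phi(q_1, e)$, $\phi(q_1 \wedge q_2, e) \le \phi(q_2, e)$

Proof.

$\phi(q_1 \wedge q_2, e) \le \phi(q_1, e)$ can be proved by

$$\mathbf{S}_{q_1 \wedge q_2} = \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2} \le \mathbf{S}_{q_1} \quad \text{(element-wise, since } \mathbf{S}_{q_2} \in [0, 1]^d\text{)}.$$

$\phi(q_1 \wedge q_2, e) \le \phi(q_2, e)$ can be proved similarly. ∎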

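A.2 Proof of Proposition 2

A.2.1 Commutativity

Proof.

We have $\mathbf{S}_{q_1 \vee q_2} = \mathbf{S}_{q_1} + \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2} = \mathbf{S}_{q_2} + \mathbf{S}_{q_1} - \mathbf{S}_{q_2} \circ \mathbf{S}_{q_1} = \mathbf{S}_{q_2 \vee q_1}$, where $\mathbf{S}_q$ is the embedding of query $q$.
Therefore, $\phi(q_1 \vee q_2, e) = \phi(q_2 \vee q_1, e)$. ∎

A.2.2 Associativity

Proof.

$$\mathbf{S}_{(q_1 \vee q_2) \vee q_3} = (\mathbf{S}_{q_1} + \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}) + \mathbf{S}_{q_3} - (\mathbf{S}_{q_1} + \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2}) \circ \mathbf{S}_{q_3}$$
$$= \mathbf{S}_{q_1} + \mathbf{S}_{q_2} + \mathbf{S}_{q_3} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2} - \mathbf{S}_{q_1} \circ \mathbf{S}_{q_3} - \mathbf{S}_{q_2} \circ \mathbf{S}_{q_3} + \mathbf{S}_{q_1} \circ \mathbf{S}_{q_2} \circ \mathbf{S}_{q_3}$$
$$= \mathbf{S}_{q_1} + (\mathbf{S}_{q_2} + \mathbf{S}_{q_3} - \mathbf{S}_{q_2} \circ \mathbf{S}_{q_3}) - \mathbf{S}_{q_1} \circ (\mathbf{S}_{q_2} + \mathbf{S}_{q_3} - \mathbf{S}_{q_2} \circ \mathbf{S}_{q_3}) = \mathbf{S}_{q_1 \vee (q_2 \vee q_3)}$$

Therefore $\phi((q_1 \vee q_2) \vee q_3, e) = \phi(q_1 \vee (q_2 \vee q_3), e)$. ∎

A.2.3 Disjunction amplification $\phi(q_1, e) \le \phi(q_1 \vee q_2, e)$, $\phi(q_2, e) \le \phi(q_1 \vee q_2, e)$

Proof.

$\phi(q_1, e) \le \phi(q_1 \vee q_2, e)$ can be proved by

$$\mathbf{S}_{q_1 \vee q_2} = \mathbf{S}_{q_1} + \mathbf{S}_{q_2} \circ (\mathbf{1} - \mathbf{S}_{q_1}) \ge \mathbf{S}_{q_1} \quad \text{(element-wise, since } \mathbf{S}_{q_2},\ \mathbf{1} - \mathbf{S}_{q_1} \in [0, 1]^d\text{)}.$$

$\phi(q_2, e) \le \phi(q_1 \vee q_2, e)$ can be proved similarly. ∎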

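A.3 Proof of Proposition 3

A.3.1 Involution

Proof.

$$\mathbf{S}_{\neg\neg q} = \mathbf{1} - (\mathbf{1} - \mathbf{S}_{q}) = \mathbf{S}_{q}$$

where $\mathbf{S}_q$ is the embedding of query $q$. Therefore $\phi(\neg\neg q, e) = \phi(q, e)$. ∎

A.3.2 Non-contradiction $\phi(q, e)\uparrow \Rightarrow \phi(\neg q, e)\downarrow$

Proof.

The Łukasiewicz negation $c(x) = 1 - x$ is monotonically decreasing with regard to $x$. Therefore, $\phi(\neg q, e)$ is monotonically decreasing with regard to $\phi(q, e)$. ∎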

Appendix B Axiomatic systems of logic

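The axiomatic systems of classical logic and fuzzy logic consist of a set of axioms and the Modus Ponens inference rule: from $\psi$ and $\psi \rightarrow \chi$, infer $\chi$.

The implication $\psi \rightarrow \chi$ is defined as follows: $\psi \rightarrow \chi$ holds if the truth value of $\chi$ is larger than or equal to that of $\psi$.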

In Table 5, we compare the semantics of classical logic and product logic with Łukasiewicz negator and show that product logic operations are fully compatible with classical logic.

Classical Logic Product Logic
Interpretation
Table 5: Semantics of classical logic and product logic. denote all valid logic formulae under the logic system, and are logical formulae.  denotes the truth value of a logical formula.

In Table 6, we provide the list of axioms written in Hilbert-style deductive system for classical logic, basic fuzzy logic, and three prominent fuzzy logic systems that extend basic fuzzy logic: Łukasiewicz logic, Gödel-Dummett logic, and product logic (KlementTNormBook). We also provide some of the derivable logic laws. Interested readers are referred to (fuzzytheorybook) for proofs.

Axiom / Logic Law Classical Logic Basic Fuzzy Logic Łukasiewicz Gödel Product
Transitivity
Weakening
Exchange
(I)
(II)
(III)
(I)
(II)
(III)
Prelinearity
EFQ
Contraction
Wajsberg
(II)
XI
XII
XIII
Table 6: Axioms and derivable logic laws of classical logic, basic fuzzy logic, and three prominent fuzzy logic systems that are based on basic fuzzy logic: Łukasiewicz logic, Gödel-Dummett logic, and product logic (KlementTNormBook). • denotes that the formula is in the minimal axiomatic system (chvalovsky2012independence), while  means the logic law could be derived from the minimal axiomatic system. EFQ stands for Ex falso quodlibet, which is Latin for “from falsehood, anything”. denotes the strong conjunction defined as .

Appendix C Dataset statistics and query structures

The 14 types of query structures in the datasets are shown in Fig. 2. The knowledge graph dataset statistics are summarized in Table 8. We list the average number of answers the test queries have in Table 9, and the number of training/validation/test queries in Table 7.

Figure 2: Query structure types used in training and evaluation. Naming convention: p projection, i intersection, n complement (negation), u union.
Dataset Train(1p/2p/3p/2i/3i) Train(2in/3in/inp/pin/pni) Valid(1p) Valid(others) Test(1p) Test(others)
FB15k-237 149689 14968 20101 5000 22812 5000
NELL995 107982 10798 16927 4000 17034 4000
Table 7: Number of training, validation, and test queries for different query structures.
Dataset Entities Relations Training Edges Validation Edges Test Edges Total Edges
FB15k-237 14505 237 272115 17526 20438 310079
NELL995 63361 200 114213 14324 14267 142804
Table 8: Knowledge graph dataset statistics as well as training, validation, and test edge splits.
Dataset 1p 2p 3p 2i 3i ip pi 2u up 2in 3in inp pin pni
FB15k-237 1.7 17.3 24.3 6.9 4.5 17.7 10.4 19.6 24.3 16.3 13.4 19.5 21.7 18.2
NELL995 1.6 14.9 17.5 5.7 6.0 17.4 11.9 14.9 19.0 12.9 11.1 12.9 16.0 13.0
Table 9: Average number of answers of test queries in each dataset.

Appendix D Experimental details

For the baselines GQE (GQE), Q2B (Query2Box), and BetaE (BetaE), we use the implementation from https://github.com/snap-stanford/KGReasoning and the best hyper-parameters reported by (BetaE).

D.1 Hyper-parameters and hardware specifications

We finetune hyperparameters based on the average MRR on the validation set with a patience of 15k steps. We search hyperparameters in the following range: learning rate from {0.001, 0.0001, 0.00001}, embedding dimension from {400, 800, 1000}, number of relation bases from {30, 50, 100, 150}, batch size from {256, 512}.

The best hyper-parameter configurations are learning rate 0.001, embedding dimension 800, and batch size 512 on both datasets. The number of relation bases is 150 for FB15k-237 and 50 for NELL995.

Each experiment is run on a 12-core Intel Xeon E5-2650 v4 CPU and a single GP102 TITAN Xp (12GB) GPU. We run each method for up to 450k steps.

D.2 Train baseline models with only Link Prediction

For the baseline models GQE, Q2B, and BetaE, their intersection operators employ MLP or the attention mechanism to learn set weights to put emphasis on the potentially more important set. For these baseline models, we refer to the implementation of Q2B-AVG-1P model in (Query2Box) and configure the set weights as identical at the inference stage if the model is trained with link prediction only.

Name | t-norm ($\top$) | t-conorm ($\bot$) | Special Properties
minimum (Gödel-Dummett) | $\min(x, y)$ | $\max(x, y)$ | idempotent
product | $x \cdot y$ | $x + y - x \cdot y$ | strict monotonicity
Łukasiewicz | $\max(0, x + y - 1)$ | $\min(1, x + y)$ | nilpotent
Table 10: Prominent examples of t-norms and the corresponding t-conorms derived by De Morgan’s law and the canonical negator $c(x) = 1 - x$. We list the special properties of the formulas in addition to the basic properties (i.e. commutativity, associativity, monotonicity, and boundary condition) of t-norm and t-conorm.

Appendix E t-norm based fuzzy logic systems

In fuzzy logic, the Łukasiewicz negation $c(x) = 1 - x$ is also called the canonical negator. Functions that qualify as fuzzy conjunction and fuzzy disjunction are usually referred to in the literature as t-norms $\top$ (triangular norms) and t-conorms $\bot$ (triangular conorms) respectively (KlirFuzzyBook).

A t-norm (tnorm) is a function $\top: [0, 1] \times [0, 1] \rightarrow [0, 1]$ which is commutative and associative and satisfies the boundary condition $\top(a, 1) = a$ as well as monotonicity, i.e. $\top(a, b) \le \top(c, d)$ if $a \le c$ and $b \le d$. Prominent examples of t-norms include minimum, product, and Łukasiewicz t-norm. Their formulas and special properties are listed in Table 10.

A t-conorm is the logical dual form of the t-norm. Following De Morgan’s law, given the canonical negator $c(x) = 1 - x$, the t-conorm $\bot$ of a t-norm $\top$ is defined as $\bot(a, b) = 1 - \top(1 - a, 1 - b)$. A t-conorm is commutative and associative, and satisfies the boundary condition $\bot(a, 0) = a$ as well as monotonicity: $\bot(a, b) \le \bot(c, d)$ if $a \le c$ and $b \le d$. Interested readers are referred to (KlirFuzzyBook) for proofs.

The formulas of t-conorms that correspond to the minimum (Gödel-Dummett), product, and Łukasiewicz t-norms are given in Table 10.
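The De Morgan duality between t-norms and t-conorms can be checked with a few lines of code. The sketch below defines the three standard t-norms (minimum, product, Łukasiewicz) and derives each t-conorm generically through the canonical negator $1 - x$. The function names are hypothetical, and the sample inputs are chosen to be exactly representable in binary floating point.

```python
from typing import Callable

BinaryOp = Callable[[float, float], float]

def t_min(a: float, b: float) -> float:
    """Goedel-Dummett (minimum) t-norm."""
    return min(a, b)

def t_product(a: float, b: float) -> float:
    """Product t-norm."""
    return a * b

def t_lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

def dual_conorm(t_norm: BinaryOp) -> BinaryOp:
    """Derive the t-conorm via De Morgan's law with negator c(x) = 1 - x."""
    def conorm(a: float, b: float) -> float:
        return 1.0 - t_norm(1.0 - a, 1.0 - b)
    return conorm

s_max = dual_conorm(t_min)                  # expected to behave like max(a, b)
s_prob = dual_conorm(t_product)             # expected to behave like a + b - a * b
s_lukasiewicz = dual_conorm(t_lukasiewicz)  # expected to behave like min(1, a + b)

a, b = 0.25, 0.5
derived = (s_max(a, b), s_prob(a, b), s_lukasiewicz(a, b))
closed_form = (max(a, b), a + b - a * b, min(1.0, a + b))
```

For these inputs the derived values coincide with the closed-form t-conorm formulas, and the same generic `dual_conorm` works for any other t-norm.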