1 Introduction
Logic Law | Formula | Model Property
---|---|---
Conjunction elimination | $q_1 \land q_2 \models q_1$ | $S(q_1 \land q_2, e) \le S(q_1, e)$
Commutativity ($\land$) | $q_1 \land q_2 \Leftrightarrow q_2 \land q_1$ | $\mathbf{q}_{q_1 \land q_2} = \mathbf{q}_{q_2 \land q_1}$
Associativity ($\land$) | $(q_1 \land q_2) \land q_3 \Leftrightarrow q_1 \land (q_2 \land q_3)$ | $\mathbf{q}_{(q_1 \land q_2) \land q_3} = \mathbf{q}_{q_1 \land (q_2 \land q_3)}$
Disjunction amplification | $q_1 \models q_1 \lor q_2$ | $S(q_1 \lor q_2, e) \ge S(q_1, e)$
Commutativity ($\lor$) | $q_1 \lor q_2 \Leftrightarrow q_2 \lor q_1$ | $\mathbf{q}_{q_1 \lor q_2} = \mathbf{q}_{q_2 \lor q_1}$
Associativity ($\lor$) | $(q_1 \lor q_2) \lor q_3 \Leftrightarrow q_1 \lor (q_2 \lor q_3)$ | $\mathbf{q}_{(q_1 \lor q_2) \lor q_3} = \mathbf{q}_{q_1 \lor (q_2 \lor q_3)}$
Involution | $\lnot \lnot q \Leftrightarrow q$ | $\mathbf{q}_{\lnot \lnot q} = \mathbf{q}_{q}$
Non-contradiction | $\lnot (q \land \lnot q)$ | $S(\lnot q, e) \downarrow$ with regard to $S(q, e)$
$S(q, e)$ denotes the scoring function that estimates the probability that the entity $e$ can answer the query $q$. $x \downarrow y$ means $x$ is monotonically decreasing with regard to $y$.

Model | $\land$ Expressivity (Closed) | Com. | Asso. | Elim. | $\lor$ Expressivity (Closed) | Com. | Asso. | Ampli. | $\lnot$ Expressivity (Closed) | Inv. | Non-Contra.
---|---|---|---|---|---|---|---|---|---|---|---|
GQE | ✓(✓) | ✓ | ✓ | ✗ | ✓(✗) | ✓ | ✓ | ✓ | ✗ | N/A | N/A |
Query2Box | ✓(✓) | ✓ | ✓ | ✓ | ✓(✗) | ✓ | ✓ | ✓ | ✗ | N/A | N/A |
BetaE (DNF) | ✓(✓) | ✓ | ✓ | ✗ | ✓(✗) | ✓ | ✓ | ✓ | ✓(✓) | ✓ | ✗ |
BetaE (DM) | ✓(✓) | ✓ | ✓ | ✗ | ✓(✓) | ✓ | ✓ | ✗ | ✓(✓) | ✓ | ✗ |
FuzzQE | ✓(✓) | ✓ | ✓ | ✓ | ✓(✓) | ✓ | ✓ | ✓ | ✓(✓) | ✓ | ✓ |
Knowledge graphs (KGs), such as Freebase (bollacker2008freebase), YAGO (rebele2016yago), and NELL (mitchell2018never), provide structured representations of facts about real-world entities and relations. One of the fundamental problems in Artificial Intelligence is to answer complex queries, which involves logical reasoning over the facts captured by KGs, e.g., answering First-Order Logic (FOL) queries with existential quantification ($\exists$), conjunction ($\land$), disjunction ($\lor$), and negation ($\lnot$). For instance, the question "Who has sung the songs written by John Lennon or Paul McCartney, but never won a Grammy Award?" can be expressed as the FOL query shown in Fig. 1.

Traditional symbolic reasoning (lei2011rdfsubgraph; schmidt2010foundations) has drawbacks in terms of computational complexity and handling missing edges of KGs. To address these issues, recent works (GQE; Query2Box; BetaE) seek to embed logical queries and entities in the same vector space and conduct query answering via dense similarity search. Such approaches handle missing edges robustly and offer significant advantages in terms of the time and space complexity of inference. However, the logical operators in these models are defined in an ad-hoc fashion, and many of them do not satisfy the axiomatic system of classical logic or fuzzy logic (KlementTNormBook), which limits their inference accuracy. Furthermore, the logical operators of existing works are based on multi-layer perceptrons (MLPs) and/or attention mechanisms, which require a large number of training queries containing such logical operations to learn the parameters. This greatly limits the scope of application of these models, since it is extremely challenging to collect a large number of reasonable complex queries with accurate answers, and such training data is not readily available in most real-world KGs.
To address the above issues, we present FuzzQE (Fuzzy Query Embedding), a fuzzy logic based embedding framework for answering logical queries on KGs. To implement logical operators in a more principled and learning-free manner, we borrow the idea of fuzzy logic and use the fuzzy conjunction, disjunction, and negation operations to define the logical operators in vector space.
One advantage of FuzzQE is that it employs differentiable logical operators that fully satisfy the axioms of logical operations and is thus capable of preserving logical properties in vector space. This superiority is corroborated by extensive experiments on two benchmark datasets, which demonstrate that FuzzQE delivers significantly better performance than state-of-the-art methods in answering FOL queries.
The other advantage is that our logical operations preserve these properties without requiring any operator-specific parameters to be learned. Accordingly, experimental results show that even when our model is trained with link prediction only, it achieves results comparable with state-of-the-art models that are trained with extra complex query data. Furthermore, when no complex query data is used for training, our model significantly outperforms previous models. This is a huge advantage in real-world applications, since complex FOL training queries are often arduous or even impossible to collect in most real-world KGs.
2 Proposed Method
In this section, we present our model FuzzQE, which uses fuzzy operations to implement logical operators.
2.1 Background and Preliminaries
Knowledge Graph (KG)
A knowledge graph $\mathcal{G}$ consists of a set of triples $(s, r, o)$, with $s, o \in \mathcal{E}$ (the set of entities) denoting the subject and object entities respectively and $r \in \mathcal{R}$ (the set of relations) denoting the relation between $s$ and $o$. An FOL query on a KG consists of atomic queries, existential quantifiers, and logical connectives ($\land$, $\lor$, $\lnot$). Notation-wise, we use boldfaced letters $\mathbf{e}$ and $\mathbf{q}$ to represent the embeddings of entity $e$ and query $q$ respectively.
Logic Laws and Model Properties
Here, we refer to axioms shared by both classical logic and basic fuzzy logic (fuzzytheorybook) and summarize the basic properties that the logical operators should possess in Table 1. The complete list of axioms written in the Hilbert-style deductive system is provided in Appendix B. In order to embed logical queries for query answering, previous works (GQE; Query2Box; BetaE) define logical operations in vector space as transformations of query embeddings. We summarize the capability of different models to retain those properties in Table 2.
Table 3: MRR results (%) on answering FOL queries over FB15k-237 and NELL995.

Model | avg (EPFO) | avg (neg.) | 1p | 2p | 3p | 2i | 3i | pi | ip | 2u | up | 2in | 3in | inp | pin | pni
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FB15k-237 | ||||||||||||||||
GQE | 16.3 | - | 35.0 | 7.2 | 5.3 | 23.3 | 34.6 | 16.5 | 10.7 | 8.2 | 5.7 | - | - | - | - | - |
Query2Box | 20.1 | - | 40.6 | 9.4 | 6.8 | 29.5 | 42.3 | 21.2 | 12.6 | 11.3 | 7.6 | - | - | - | - | - |
BetaE | 20.9 | 5.5 | 39.0 | 10.9 | 10.0 | 28.8 | 42.5 | 22.4 | 12.6 | 12.4 | 9.7 | 5.1 | 7.9 | 7.4 | 3.5 | 3.4 |
FuzzQE | 24.0 | 7.8 | 42.8 | 12.9 | 10.3 | 33.3 | 46.9 | 26.9 | 17.8 | 14.6 | 10.3 | 8.5 | 11.6 | 7.8 | 5.2 | 5.8 |
NELL995 | ||||||||||||||||
GQE | 18.6 | - | 32.8 | 11.9 | 9.6 | 27.5 | 35.2 | 18.4 | 14.4 | 8.5 | 8.8 | - | - | - | - | - |
Query2Box | 22.9 | - | 42.2 | 14.0 | 11.2 | 33.3 | 44.5 | 22.4 | 16.8 | 11.3 | 10.3 | - | - | - | - | - |
BetaE | 24.6 | 5.9 | 53.0 | 13.0 | 11.4 | 37.6 | 47.5 | 24.1 | 14.3 | 12.2 | 8.5 | 5.1 | 7.8 | 10.0 | 3.1 | 3.5 |
FuzzQE | 27.0 | 7.8 | 47.4 | 17.2 | 14.6 | 39.5 | 49.2 | 26.2 | 20.6 | 15.3 | 12.6 | 7.8 | 9.8 | 11.1 | 4.9 | 5.5 |
Table 4: MRR results (%) when models are trained with link prediction (1p queries) only.

Model | avg (EPFO) | avg (neg.) | 1p | 2p | 3p | 2i | 3i | pi | ip | 2u | up | 2in | 3in | inp | pin | pni
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FB15k-237 | ||||||||||||||||
GQE | 17.7 | - | 41.6 | 7.9 | 5.4 | 25.0 | 33.6 | 16.3 | 10.9 | 11.9 | 6.2 | - | - | - | - | - |
Query2Box | 18.2 | - | 42.6 | 6.9 | 4.7 | 27.3 | 36.8 | 17.5 | 11.1 | 11.7 | 5.5 | - | - | - | - | - |
BetaE | 19.0 | 0.4 | 53.1 | 6.0 | 3.9 | 32.0 | 37.7 | 15.8 | 8.5 | 10.1 | 3.5 | 0.1 | 1.4 | 0.1 | 0.1 | 0.1 |
FuzzQE | 21.9 | 6.6 | 44.0 | 10.8 | 8.6 | 32.3 | 41.4 | 22.7 | 15.1 | 13.5 | 8.7 | 7.7 | 9.5 | 7.0 | 4.1 | 4.7 |
NELL995 | ||||||||||||||||
GQE | 21.7 | - | 47.2 | 12.7 | 9.3 | 30.6 | 37.0 | 20.6 | 16.1 | 12.6 | 9.6 | - | - | - | - | - |
Query2Box | 21.6 | - | 47.6 | 12.5 | 8.7 | 30.7 | 36.5 | 20.5 | 16.0 | 12.7 | 9.6 | - | - | - | - | - |
BetaE | 15.8 | 0.5 | 37.7 | 5.6 | 4.4 | 23.3 | 34.5 | 15.1 | 7.8 | 9.5 | 4.5 | 0.1 | 1.1 | 0.8 | 0.1 | 0.2 |
FuzzQE | 24.7 | 6.0 | 55.7 | 14.9 | 11.9 | 35.7 | 41.5 | 16.8 | 22.0 | 13.4 | 9.7 | 6.2 | 7.7 | 8.4 | 3.5 | 4.2 |
2.2 Logical query embedding
Our objective is to design a query embedding model that can fit the axiomatic system of classical logic and retain the properties in Table 1. Considering fuzzy logic systems is a natural direction, as their axiomatic systems are fully compatible with classical logic, and their logical operations therefore conform to these properties as well. In our work, we implement the logical operators in vector space through element-wise fuzzy conjunction, fuzzy disjunction and fuzzy negation. This design ensures that we can satisfy the requirements of Table 1 in all dimensions. Particularly, we present FuzzQE with reference to product logic, one of the most prominent fuzzy logic systems (KlementTNormBook). Alternative design choices based on Łukasiewicz and Gödel-Dummett logic are discussed in Appendix E.
Embedding Space
We propose to embed entities and queries into the same space: $[0, 1]^d$. Using this embedding space provides the following benefits: (i) Each dimension of the embedding vector is between 0 and 1, which satisfies the domain and range requirements of fuzzy logic and allows the model to execute element-wise fuzzy conjunction/disjunction/negation. (ii) The representation is endowed with a probabilistic interpretation. In practice, this domain restriction is implemented with an element-wise sigmoid function.
Atomic Query
We first discuss how to embed an atomic query $r(a, ?)$ with relation $r$ and anchor entity $a$. In our work, we associate each relation $r$ with a weight matrix $\mathbf{W}_r \in \mathbb{R}^{d \times d}$ and a bias vector $\mathbf{b}_r \in \mathbb{R}^d$. Given a relation $r$ and the anchor entity embedding $\mathbf{a}$, we model the query embedding as the following relation transform: $\mathbf{q} = \sigma(\mathbf{W}_r \mathbf{a} + \mathbf{b}_r)$, where $\sigma$ is an element-wise sigmoid function to impose the domain restriction $[0, 1]^d$. To avoid the rapid growth in the number of parameters with the number of relations and to alleviate overfitting on rare relations, we follow (RGCN) and adopt basis decomposition to define $\mathbf{W}_r = \sum_{i=1}^{b} c_{r,i} \mathbf{V}_i$, i.e. as a linear combination of basis transformations $\mathbf{V}_i$ with coefficients $c_{r,i}$ that depend on $r$. This can be seen as a form of effective weight sharing among different relation types. Note that intermediate variables can be embedded by performing multiple such transformations.
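As a concrete illustration, the relation transform with basis decomposition can be sketched in NumPy as follows. The dimensions, basis count, and all variable names here are illustrative assumptions rather than the paper's actual configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relation_transform(a, coeffs, bases, bias):
    """Embed an atomic query r(a, ?) as sigma(W_r a + b_r), where
    W_r = sum_i c_{r,i} V_i is a basis-decomposed relation matrix."""
    W_r = np.tensordot(coeffs, bases, axes=1)  # combine bases into a (d, d) matrix
    return sigmoid(W_r @ a + bias)

rng = np.random.default_rng(0)
d, n_bases = 8, 3
bases = rng.normal(size=(n_bases, d, d))  # shared basis matrices V_i
coeffs = rng.normal(size=n_bases)         # relation-specific coefficients c_{r,i}
bias = rng.normal(size=d)                 # relation-specific bias b_r
anchor = sigmoid(rng.normal(size=d))      # anchor entity embedding in [0, 1]^d

q = relation_transform(anchor, coeffs, bases, bias)
assert q.shape == (d,) and np.all((q > 0) & (q < 1))  # result stays in [0, 1]^d
```

The sigmoid keeps every dimension of the resulting query embedding inside the fuzzy-logic domain, so subsequent element-wise fuzzy operations remain well defined.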
Logical Connectives ($\land$, $\lor$, $\lnot$)
We propose to use fuzzy logic operations to model logical connectives in vector space. Following product logic with the Łukasiewicz negator (KlementTNormBook), the embeddings of $q_1 \land q_2$, $q_1 \lor q_2$, and $\lnot q$ are computed as follows: $\mathbf{q}_{q_1 \land q_2} = \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2}$, $\mathbf{q}_{q_1 \lor q_2} = \mathbf{q}_{q_1} + \mathbf{q}_{q_2} - \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2}$, and $\mathbf{q}_{\lnot q} = \mathbf{1} - \mathbf{q}_{q}$,
where $\odot$ denotes element-wise multiplication (fuzzy conjunction), and $\mathbf{1}$ is the all-ones vector. $a + b - a \cdot b$ corresponds to the product logic fuzzy disjunction (probabilistic sum).
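These operators reduce to a few lines of array code. The following sketch, using random embeddings purely for illustration, numerically checks the properties listed in Table 1:

```python
import numpy as np

def f_and(a, b):   # product t-norm: element-wise multiplication
    return a * b

def f_or(a, b):    # product t-conorm (probabilistic sum)
    return a + b - a * b

def f_not(a):      # Lukasiewicz negator
    return 1.0 - a

rng = np.random.default_rng(1)
a, b, c = rng.random((3, 5))  # three toy query embeddings in [0, 1]^5

# Commutativity and associativity hold exactly (up to float rounding).
assert np.allclose(f_and(a, b), f_and(b, a))
assert np.allclose(f_or(f_or(a, b), c), f_or(a, f_or(b, c)))
# Involution: double negation recovers the original embedding.
assert np.allclose(f_not(f_not(a)), a)
# Elimination/amplification as element-wise inequalities.
assert np.all(f_and(a, b) <= a) and np.all(f_or(a, b) >= a)
```

Because none of these operators carries learnable parameters, they behave identically whether or not complex queries appear in the training data.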
Scoring Function
We estimate the plausibility that $e$ answers $q$ by a score $S(q, e)$ computed from the embeddings $\mathbf{q}$ and $\mathbf{e}$. In practice, multiple conjunction operations tend to diminish the norms of query embeddings. In order to mitigate the influence of query complexity, we apply Layer Normalization (LayerNorm) to the query embedding before computing the score.
Theoretical Analysis
For FuzzQE, we present the following propositions with proof in Appendix A.
Proposition 1.
Our conjunction operator is commutative, associative, and satisfies conjunction elimination.
Proposition 2.
Our disjunction operator is commutative, associative, and satisfies disjunction amplification.
Proposition 3.
Our negation operator is involutory and satisfies non-contradiction.
2.3 Model learning and inference
During training, given a query embedding $\mathbf{q}$, we optimize the following negative log-likelihood loss: $\mathcal{L} = -\log \sigma\big(\gamma + S(q, e)\big) - \frac{1}{k} \sum_{j=1}^{k} \log \sigma\big(-\gamma - S(q, e'_j)\big)$,
where $e$ is an entity in the answer set of $q$, $e'_j$ represents a random negative sample, and $\gamma$ denotes the margin. We use $k$ random negative samples and optimize over their average.
For inference, given a query $q$, FuzzQE embeds it as $\mathbf{q}$ and ranks all entities by $S(q, e)$.
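Inference is then a dense similarity search over all entity embeddings. The sketch below uses a dot-product score as a stand-in; the paper's exact scoring function may differ, and the array shapes are illustrative:

```python
import numpy as np

def rank_entities(query_emb, entity_embs):
    """Score every entity against the query and return indices sorted
    from most to least plausible. The dot-product score is an assumption
    for illustration, not necessarily the paper's scoring function."""
    scores = entity_embs @ query_emb
    return np.argsort(-scores)  # descending by score

rng = np.random.default_rng(2)
entities = rng.random((100, 16))  # hypothetical entity embeddings in [0, 1]^16
query = rng.random(16)            # a query embedding
ranking = rank_entities(query, entities)

assert ranking.shape == (100,)
assert np.all(np.diff((entities @ query)[ranking]) <= 0)  # scores non-increasing
```

Since scoring is a single matrix-vector product followed by a sort, answering a query costs far less than traversing the KG symbolically.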
3 Experiments
In this section, we evaluate FuzzQE by answering a wide range of complex FOL queries over two incomplete KBs.
Datasets
We evaluate our model on two benchmark datasets provided by (BetaE), which contain 14 types of logical queries on FB15k-237 (FB15k237) and NELL995 (DeepPath). Dataset statistics are summarized in Appendix C. We exclude FB15k (TransE), as this dataset suffers from major test leakage (FB15k237).
Evaluation Protocol
We follow the evaluation protocol in (BetaE). For each answer $e$ of a test query $q$, we denote the model's predicted rank as $r$ and report the Mean Reciprocal Rank (MRR), i.e. the average of $1/r$ over all answers. To evaluate the model's generalization over an incomplete KB, edges are masked out so that each validation/test query involves at least one missing link.
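As a quick illustration of the metric itself (independent of any particular model), MRR can be computed as:

```python
def mean_reciprocal_rank(ranks):
    """MRR over the predicted ranks of all test answers (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# If three answers of a query are ranked 1st, 4th, and 10th:
mrr = mean_reciprocal_rank([1, 4, 10])
assert abs(mrr - (1 + 0.25 + 0.1) / 3) < 1e-12  # = 0.45
```

Higher MRR indicates that correct answers are placed closer to the top of the ranking.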
Baselines
We consider three state-of-the-art baselines for answering complex logical queries on KGs: GQE (GQE), Query2Box (Query2Box), and BetaE (BetaE). GQE and Q2B can answer Existential Positive First-Order (EPFO) queries (i.e. queries with $\exists$, $\land$, and $\lor$), but they cannot model negation. To the best of our knowledge, BetaE is the only existing embedding-based baseline that can model negation in FOL queries. For BetaE, we report the results of its better-performing variant. We list hyperparameters and more details in Appendix D.
3.1 Results
Modeling FOL queries
First, we test the ability of FuzzQE to model arbitrary FOL queries. The MRR results are reported in Table 3. For EPFO queries (queries with $\exists$, $\land$, $\lor$ but no negation), our approach consistently outperforms the state-of-the-art approaches. Regarding queries with negation, FuzzQE significantly outperforms the only available baseline, BetaE, across all query structures that contain negation. On average, FuzzQE improves the MRR by 2.3% (42% relatively) on FB15k-237 and 1.9% (32% relatively) on NELL995 for queries containing negation.
Training with only Link Prediction
As the logical operators in our framework do not contain any learnable parameters, our model can be trained with only the link prediction task and then generalize to answer arbitrary complex FOL queries. To evaluate such generalization, we train our model and the baselines with only KG edges (i.e. 1p queries in the datasets). Experiment setting details are given in Appendix D.2. As shown in Table 4, FuzzQE is able to generalize to unseen complex queries even when it is only trained on link prediction, and it provides significantly better performance than the baseline models. This is mostly due to the design of the principled and learning-free logical operators. This superiority in generalization is very useful for real-world applications, since complex query datasets are not available in most real-world KGs.
4 Conclusion
We propose a novel logical query embedding framework FuzzQE for answering complex logical queries on KGs. Extensive experiments show the promising capability of FuzzQE on answering logical queries on KGs. The results are encouraging and suggest various extensions, including introducing logical rules into learning, and in-depth study of predicate fuzzy logic systems.
References
Appendix
Appendix A Proof of propositions
A.1 Proof of Proposition 1
A.1.1 Commutativity
Proof.
We have $\mathbf{q}_{q_1 \land q_2} = \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2}$,
where $\odot$ denotes element-wise multiplication. Since element-wise multiplication is commutative,
$\mathbf{q}_{q_1} \odot \mathbf{q}_{q_2} = \mathbf{q}_{q_2} \odot \mathbf{q}_{q_1}$. Therefore,
$\mathbf{q}_{q_1 \land q_2} = \mathbf{q}_{q_2 \land q_1}$.
∎
A.1.2 Associativity
Proof.
Since element-wise multiplication is associative, we have $(\mathbf{q}_{q_1} \odot \mathbf{q}_{q_2}) \odot \mathbf{q}_{q_3} = \mathbf{q}_{q_1} \odot (\mathbf{q}_{q_2} \odot \mathbf{q}_{q_3})$, i.e. $\mathbf{q}_{(q_1 \land q_2) \land q_3} = \mathbf{q}_{q_1 \land (q_2 \land q_3)}$.
∎
A.1.3 Conjunction elimination
Proof.
$S(q_1 \land q_2, e) \le S(q_1, e)$ follows from the fact that every dimension of $\mathbf{q}_{q_2}$ lies in $[0, 1]$, so $\mathbf{q}_{q_1 \land q_2} = \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2} \le \mathbf{q}_{q_1}$ holds element-wise.
$S(q_1 \land q_2, e) \le S(q_2, e)$ can be proved similarly. ∎
A.2 Proof of Proposition 2
A.2.1 Commutativity
Proof.
We have $\mathbf{q}_{q_1 \lor q_2} = \mathbf{q}_{q_1} + \mathbf{q}_{q_2} - \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2}$,
which is symmetric in $\mathbf{q}_{q_1}$ and $\mathbf{q}_{q_2}$.
Therefore,
$\mathbf{q}_{q_1 \lor q_2} = \mathbf{q}_{q_2 \lor q_1}$.
∎
A.2.2 Associativity
Proof.
Expanding both sides yields $\mathbf{q}_{q_1} + \mathbf{q}_{q_2} + \mathbf{q}_{q_3} - \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2} - \mathbf{q}_{q_1} \odot \mathbf{q}_{q_3} - \mathbf{q}_{q_2} \odot \mathbf{q}_{q_3} + \mathbf{q}_{q_1} \odot \mathbf{q}_{q_2} \odot \mathbf{q}_{q_3}$. Therefore, $\mathbf{q}_{(q_1 \lor q_2) \lor q_3} = \mathbf{q}_{q_1 \lor (q_2 \lor q_3)}$.
∎
A.2.3 Disjunction amplification
Proof.
$S(q_1 \lor q_2, e) \ge S(q_1, e)$ follows from $\mathbf{q}_{q_1 \lor q_2} - \mathbf{q}_{q_1} = \mathbf{q}_{q_2} \odot (\mathbf{1} - \mathbf{q}_{q_1}) \ge \mathbf{0}$ element-wise, since every dimension of the embeddings lies in $[0, 1]$.
$S(q_1 \lor q_2, e) \ge S(q_2, e)$ can be proved similarly. ∎
A.3 Proof of Proposition 3
A.3.1 Involution
Proof.
We have $\mathbf{q}_{\lnot \lnot q} = \mathbf{1} - (\mathbf{1} - \mathbf{q}_{q}) = \mathbf{q}_{q}$. Therefore, the negation operator is involutory. ∎
A.3.2 Non-contradiction
Proof.
The Łukasiewicz negation $N(x) = 1 - x$ is monotonically decreasing with regard to $x$. Therefore, $S(\lnot q, e)$ is monotonically decreasing with regard to $S(q, e)$. ∎
Appendix B Axiomatic systems of logic
The axiomatic systems of classical logic and fuzzy logic consist of a set of axioms and the Modus Ponens inference rule: from $q_1$ and $q_1 \Rightarrow q_2$, infer $q_2$.
Implication is defined such that $q_1 \Rightarrow q_2$ holds if the truth value of $q_2$ is larger than or equal to that of $q_1$.
In Table 5, we compare the semantics of classical logic and product logic with Łukasiewicz negator and show that product logic operations are fully compatible with classical logic.
 | Classical Logic | Product Logic
---|---|---
Truth values | $\{0, 1\}$ | $[0, 1]$
$a \land b$ | Boolean AND | $a \cdot b$
$a \lor b$ | Boolean OR | $a + b - a \cdot b$
$\lnot a$ | Boolean NOT | $1 - a$
In Table 6, we provide the list of axioms written in Hilbert-style deductive system for classical logic, basic fuzzy logic, and three prominent fuzzy logic systems that extend basic fuzzy logic: Łukasiewicz logic, Gödel-Dummett logic, and product logic (KlementTNormBook). We also provide some of the derivable logic laws. Interested readers are referred to (fuzzytheorybook) for proofs.
Axiom / Logic Law | Classical Logic | Basic Fuzzy Logic | Łukasiewicz | Gödel | Product | |
---|---|---|---|---|---|---|
Transitivity | • | • | • | • | • | |
Weakening | • | • | • | • | • | |
Exchange | • | • | • | • | • | |
(I) | • | • | • | • | • | |
(II) | • | • | • | • | • | |
(III) | • | • | • | • | • | |
(I) | • | • | • | • | • | |
(II) | • | • | • | • | • | |
(III) | • | • | • | • | • |
Prelinearity | • | • | • | • | • | |
EFQ | • | • | • | • | • | |
Contraction | • | |||||
Wajsberg | • | |||||
(II) | • | |||||
(XI) | • |||||
(XII) ||||||
(XIII) ||||||
Appendix C Dataset statistics and query structures
The 14 types of query structures in the datasets are shown in Fig. 2. The knowledge graph dataset statistics are summarized in Table 8. We list the average number of answers the test queries have in Table 9, and the number of training/validation/test queries in Table 7.
Queries | Training | Validation | Test | |||
---|---|---|---|---|---|---|
Dataset | 1p/2p/3p/2i/3i | 2in/3in/inp/pin/pni | 1p | others | 1p | others |
FB15k-237 | 149689 | 14968 | 20101 | 5000 | 22812 | 5000 |
NELL995 | 107982 | 10798 | 16927 | 4000 | 17034 | 4000 |
Dataset | Entities | Relations | Training Edges | Validation Edges | Test Edges | Total Edges |
---|---|---|---|---|---|---|
FB15k-237 | 14505 | 237 | 272115 | 17526 | 20438 | 310079 |
NELL995 | 63361 | 200 | 114213 | 14324 | 14267 | 142804
Dataset | 1p | 2p | 3p | 2i | 3i | ip | pi | 2u | up | 2in | 3in | inp | pin | pni |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FB15k-237 | 1.7 | 17.3 | 24.3 | 6.9 | 4.5 | 17.7 | 10.4 | 19.6 | 24.3 | 16.3 | 13.4 | 19.5 | 21.7 | 18.2 |
NELL995 | 1.6 | 14.9 | 17.5 | 5.7 | 6.0 | 17.4 | 11.9 | 14.9 | 19.0 | 12.9 | 11.1 | 12.9 | 16.0 | 13.0 |
Appendix D Experimental details
For the baselines GQE (GQE), Q2B (Query2Box), and BetaE (BetaE), we use the implementation from https://github.com/snap-stanford/KGReasoning and use the best hyper-parameters reported by (BetaE).
D.1 Hyper-parameters and hardware specifications
We tune hyperparameters based on the average MRR on the validation set with a patience of 15k steps. We search hyperparameters in the following ranges: learning rate from {0.001, 0.0001, 0.00001}, embedding dimension from {400, 800, 1000}, number of relation bases from {30, 50, 100, 150}, and batch size from {256, 512}.
The best hyper-parameter configurations are learning rate 0.001, embedding dimension 800, and batch size 512 on both datasets. The number of relation bases is 150 for FB15k-237 and 50 for NELL995.
Each single experiment is run on CPU E5-2650 v4 12-core and a single GP102 TITAN Xp (12GB) GPU. We run each method up to 450k steps.
D.2 Training baseline models with only Link Prediction
For the baseline models GQE, Q2B, and BetaE, the intersection operators employ MLP or attention mechanisms to learn set weights that emphasize the potentially more important set. For these baseline models, we refer to the implementation of the Q2B-AVG-1P model in (Query2Box) and set the set weights to be identical at the inference stage when the model is trained with link prediction only.
 | t-norm $T(a, b)$ | t-conorm $S(a, b)$ | Special Properties
---|---|---|---
minimum (Gödel-Dummett) | $\min(a, b)$ | $\max(a, b)$ | idempotent
product | $a \cdot b$ | $a + b - a \cdot b$ | strict monotonicity
Łukasiewicz | $\max(a + b - 1, 0)$ | $\min(a + b, 1)$ | nilpotent
Appendix E t-norm based fuzzy logic systems
In fuzzy logic, the Łukasiewicz negation $N(a) = 1 - a$ is also called the canonical negator. Functions that qualify as fuzzy conjunctions and fuzzy disjunctions are usually referred to in the literature as t-norms (triangular norms) and t-conorms (triangular conorms) respectively (KlirFuzzyBook).
A t-norm is a function $T: [0, 1]^2 \rightarrow [0, 1]$ which is commutative and associative and satisfies the boundary condition $T(a, 1) = a$ as well as monotonicity, i.e. $T(a, b) \le T(c, d)$ if $a \le c$ and $b \le d$. Prominent examples of t-norms include the minimum, product, and Łukasiewicz t-norms. Their formulas and special properties are listed in Table 10.
The t-conorm is the logical dual of the t-norm. Following De Morgan's law, given the canonical negator $N(a) = 1 - a$, the t-conorm $S$ of a t-norm $T$ is defined as $S(a, b) = 1 - T(1 - a, 1 - b)$. A t-conorm is commutative and associative, and satisfies the boundary condition $S(a, 0) = a$ as well as monotonicity: $S(a, b) \le S(c, d)$ if $a \le c$ and $b \le d$. Interested readers are referred to (KlirFuzzyBook) for proofs.
The formulas of the t-conorms that correspond to the minimum (Gödel-Dummett), product, and Łukasiewicz t-norms are given in Table 10.
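The De Morgan duality above is easy to verify numerically. The sketch below derives each t-conorm from its t-norm via $S(a, b) = 1 - T(1 - a, 1 - b)$ and checks the closed forms in Table 10 at a sample point:

```python
def t_conorm_from_t_norm(t_norm):
    """Derive the dual t-conorm via De Morgan: S(a, b) = 1 - T(1 - a, 1 - b),
    using the canonical (Lukasiewicz) negator N(a) = 1 - a."""
    return lambda a, b: 1.0 - t_norm(1.0 - a, 1.0 - b)

t_min = min                                   # Godel-Dummett t-norm
t_prod = lambda a, b: a * b                   # product t-norm
t_luk = lambda a, b: max(a + b - 1.0, 0.0)    # Lukasiewicz t-norm

s_max = t_conorm_from_t_norm(t_min)
s_prod = t_conorm_from_t_norm(t_prod)
s_luk = t_conorm_from_t_norm(t_luk)

a, b = 0.3, 0.6
assert abs(s_max(a, b) - max(a, b)) < 1e-12         # dual of min is max
assert abs(s_prod(a, b) - (a + b - a * b)) < 1e-12  # probabilistic sum
assert abs(s_luk(a, b) - min(a + b, 1.0)) < 1e-12   # bounded sum
```

Restricting the inputs to $\{0, 1\}$ recovers Boolean AND/OR from every t-norm/t-conorm pair, which is exactly the compatibility with classical logic discussed in Appendix B.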