Logical Boltzmann Machines

The idea of representing symbolic knowledge in connectionist systems has been a long-standing endeavour which has attracted much attention recently with the objective of combining machine learning and scalable sound reasoning. Early work has shown a correspondence between propositional logic and symmetrical neural networks, which nevertheless did not scale well with the number of variables and whose training regime was inefficient. In this paper, we introduce Logical Boltzmann Machines (LBM), a neurosymbolic system that can represent any propositional logic formula in strict disjunctive normal form. We prove equivalence between energy minimization in LBM and logical satisfiability, thus showing that LBM is capable of sound reasoning. We evaluate reasoning empirically to show that LBM is capable of finding all satisfying assignments of a class of logical formulae by searching fewer than 0.75% of the (approximately 1 billion) possible assignments. We compare learning in LBM with a symbolic inductive logic programming system, a state-of-the-art neurosymbolic system and a purely neural network-based system, achieving better learning performance in five out of seven data sets.


1 Introduction

The research community has been witnessing increasing attention devoted to the integration of learning and reasoning, especially through the combination of neural networks and symbolic AI [10, 3] into neurosymbolic systems. At the core of a neurosymbolic system there is an algorithm to represent symbolic knowledge in a neural network. One of the goals is to leverage the parallel and distributed properties of the network to perform reasoning. In many neurosymbolic approaches, the most used form of knowledge representation is if-then rules, whereby logical reasoning is built upon Modus Ponens as the only rule of inference [18, 5, 19, 4, 20, 9]. Given a formula $A \to B$ (read "B if A" following a logic programming notation), a neural network would either infer approximately $B$ (True) given $A$ by forward chaining, or search for the value of $A$ to confirm or refute the hypothesis $B$ (backward chaining). This has two shortcomings. First, Modus Ponens alone may not capture entirely the power of logical reasoning as required by an application. For example, it may be the case in an application that, given $\neg B$ (False), the neural network is expected to infer approximately $\neg A$ (Modus Tollens). Second, one may wish to allow other forms of rules to be represented by the neural network, such as disjunctive normal form (DNF) with any number of negative literals.

In this paper, we introduce Logical Boltzmann Machines, a neurosymbolic system that can represent any propositional logic formula in a neural network and achieve efficient reasoning using restricted Boltzmann machines. We introduce an algorithm to translate any logical formula described in DNF into a Boltzmann machine, and we show equivalence between the logical formula and the energy-based connectionist model. In other words, we show soundness of the translation algorithm. Specifically, the connectionist model will assign minimum energy to the assignments of truth-values that satisfy the formula. This produces a new way of performing reasoning in neural networks, by employing the neural network to search for the models of the logical formula, that is, to search for assignments of truth-values that map the logical formula to true. We show that Gibbs sampling can be applied efficiently to this search when the logical formula has a large number of variables. If the number of variables is small, inference can be carried out analytically by sorting the free energy of all possible truth-value assignments. Returning to our example with formula $A \to B$, Logical Boltzmann Machines can infer approximately $B$ given $A$, and infer approximately $\neg A$ given $\neg B$, since both truth-value assignments ($A$ = True, $B$ = True) and ($A$ = False, $B$ = False) minimise the energy of the network.

Concerning the representation of rules other than if-then rules, we propose a new way of converting any logical formula into strict DNF (SDNF), which is shown to map conveniently onto Restricted Boltzmann Machines (RBMs). In the experiments reported in this paper, this new mapping into SDNF and RBMs is shown to enable approximate reasoning with a large number of variables. The proposed approach is evaluated on a logic programming benchmark task whereby machine learning models are trained from data and background knowledge. Logical Boltzmann Machines achieved better learning performance (higher test-set accuracy) in five out of seven data sets when evaluated empirically on this benchmark, in comparison with a purely-symbolic learning system (the Inductive Logic Programming system Aleph [17]), a neurosymbolic system for Inductive Logic Programming (CILP++ [5]) and a purely-connectionist system (standard RBMs [15]).

The contribution of this work is twofold:

  • A theoretical proof of equivalence between logical formulae and probabilistic neural networks, namely Restricted Boltzmann Machines, which can facilitate neurosymbolic learning and reasoning.

  • A foundation for the use of statistical inference methods to perform logical reasoning.

The remainder of the paper is organised as follows. In the next section, we review the related work. Section 3 describes and proves correctness of the mapping from any logical formula into SDNF and then RBMs. Section 4 defines reasoning by sampling and energy minimization in RBMs. Section 5 introduces the LBM system and evaluates scalability of reasoning with an increasing number of variables. Section 6 contains the experimental results on learning and the comparison with a symbolic, neurosymbolic and a purely-neural learning system. We then conclude the paper and discuss directions for future work.

2 Related Work

One of the earliest works on the integration of neural networks and symbolic knowledge is the Knowledge-based Artificial Neural Network [18], which encodes if-then rules into a hierarchy of perceptrons. In another early approach [6], a single-hidden-layer neural network with recurrent connections is proposed to support logic programming rules. An extension of that approach to work with first-order logic programs, called CILP++ [5], uses the concept of propositionalisation from Inductive Logic Programming (ILP), whereby first-order variables can be treated as propositional atoms in the neural network. Also based on first-order logic programs, [4] proposes a differentiable ILP approach that can be implemented by neural networks, while [2] maps stochastic logic programs into a differentiable function also trainable by neural networks. These are all supervised learning approaches.

Among unsupervised learning approaches, Penalty Logic [12] was the first work to integrate propositional and non-monotonic logic formulae into symmetric networks. However, it required the use of higher-order Hopfield networks, which can become complicated to construct and inefficient to train with the learning algorithm of Boltzmann machines (BM). Such higher-order networks require transforming the energy function into a quadratic form by adding hidden variables not present in the original logic formulae for the purpose of building the network. More recently, several attempts have been made to extract and encode symbolic knowledge into RBMs [11, 19]. These are based on the structural similarity between symmetric networks and bi-conditional logical statements and do not contain soundness results. By contrast, and similarly to Penalty Logic, the approach introduced in this paper is based on a proof of equivalence between the logic formulae and the symmetric networks, but without requiring higher-order networks.

Alongside the above approaches, which translate from a symbolic to a neural representation, normally from if-then rules to a feedforward or recurrent neural network, there are also hybrid approaches, which combine neural networks with symbolic AI systems and logic operators, such as DeepProbLog [9] and Logic Tensor Networks (LTN) [14]. While DeepProbLog adds a neural network module to probabilistic logic programming, LTN represents the level of truth of first-order logic statements in the neural network. LTN employs a discriminative approach to infer the level of truth rather than the generative approach adopted in this paper by LBM. Although a discriminative approach can be very useful for evaluating assignments of variables to truth-values, we argue that it is not adequate for the implementation of a search for the satisfying assignments of logical formulae. For this purpose, the use of a generative approach is needed, as proposed in this paper with LBM.

3 Knowledge Representation in RBMs

An RBM [15] can be seen as a two-layer neural network with bidirectional (symmetric) connections, which is characterised by a function called the energy of the RBM:

$$E(\mathbf{x}, \mathbf{h}) = -\sum_{i,j} w_{ij} x_i h_j - \sum_i a_i x_i - \sum_j b_j h_j \qquad (1)$$

where $a_i$ and $b_j$ are the biases of input unit $x_i$ and hidden unit $h_j$, respectively, and $w_{ij}$ is the connection weight between $x_i$ and $h_j$. This RBM represents a joint distribution $p(\mathbf{x}, \mathbf{h}) = \frac{1}{Z} e^{-E(\mathbf{x}, \mathbf{h})/\tau}$, where $Z = \sum_{\mathbf{x}, \mathbf{h}} e^{-E(\mathbf{x}, \mathbf{h})/\tau}$ is the partition function, the parameter $\tau$ is called the temperature of the RBM, $\mathbf{x}$ is the set of visible units and $\mathbf{h}$ is the set of hidden units in the RBM.
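As a quick illustration of Equation (1), the following sketch (our own code, not from the paper; the names `W`, `a`, `b` and `tau` are assumptions) computes the energy and the corresponding unnormalised probability for binary vectors.

```python
import numpy as np

def rbm_energy(x, h, W, a, b):
    """Energy of an RBM for visible vector x and hidden vector h (Equation 1)."""
    return -(x @ W @ h) - a @ x - b @ h

def unnormalised_prob(x, h, W, a, b, tau=1.0):
    """Unnormalised joint probability exp(-E/tau); dividing by the partition
    function Z (a sum over all x and h) would give the normalised p(x, h)."""
    return np.exp(-rbm_energy(x, h, W, a, b) / tau)

# Tiny example: 2 visible units, 1 hidden unit (illustrative values only).
W = np.array([[1.0], [-1.0]])
a = np.zeros(2)
b = np.array([-0.5])
x = np.array([1.0, 0.0])
h = np.array([1.0])
print(rbm_energy(x, h, W, a, b))         # -0.5
print(unnormalised_prob(x, h, W, a, b))  # exp(0.5)
```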

In propositional logic, any well-formed formula (WFF) can be mapped into Disjunctive Normal Form (DNF), i.e. disjunctions ($\vee$) of conjunctions ($\wedge$), as follows:

$$\varphi = \bigvee_j \Big( \bigwedge_{t \in \mathcal{S}_{T_j}} x_t \wedge \bigwedge_{k \in \mathcal{S}_{K_j}} \neg x_k \Big)$$

where each $\bigwedge_{t \in \mathcal{S}_{T_j}} x_t \wedge \bigwedge_{k \in \mathcal{S}_{K_j}} \neg x_k$ is called a conjunctive clause. Here, we denote the propositional variables (literals) as $x_t$ for positive literals and $\neg x_k$ for negative literals; $\mathcal{S}_{T_j}$ and $\mathcal{S}_{K_j}$ denote, respectively, the sets of indices of the positive literals and of the negative literals in conjunctive clause $j$ of the formula. This notation may seem over-complicated, but it will be useful in the proof of soundness of our translation from SDNF to RBMs.
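To make the notation concrete, here is a minimal sketch (our own representation, not the paper's) that stores a DNF as a list of clauses, each given by its sets of positive and negative variable indices, and evaluates the truth-value of the formula under a binary assignment.

```python
from typing import List, Set, Tuple

# A conjunctive clause is a pair (positive indices S_T, negative indices S_K);
# a DNF is a list of such clauses.
Clause = Tuple[Set[int], Set[int]]

def clause_value(clause: Clause, x: List[int]) -> int:
    """1 if the conjunctive clause is satisfied by binary assignment x, else 0."""
    pos, neg = clause
    return int(all(x[t] == 1 for t in pos) and all(x[k] == 0 for k in neg))

def dnf_value(dnf: List[Clause], x: List[int]) -> int:
    """Truth-value of a DNF: 1 if any conjunctive clause is satisfied."""
    return int(any(clause_value(c, x) for c in dnf))

# Example: (x0 AND NOT x1) OR (NOT x0 AND x1), i.e. the two-variable XOR in DNF.
xor_dnf = [({0}, {1}), ({1}, {0})]
print(dnf_value(xor_dnf, [1, 0]))  # 1
print(dnf_value(xor_dnf, [1, 1]))  # 0
```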

Definition 1

Let $s_\varphi(\mathbf{x})$ denote the truth-value of a WFF $\varphi$ given an assignment $\mathbf{x}$ of truth-values to the literals of $\varphi$, with truth-value True mapped to 1 and truth-value False mapped to 0. Let $E(\mathbf{x},\mathbf{h})$ denote the energy function of an energy-based neural network $\mathcal{N}$ with visible units $\mathbf{x}$ and hidden units $\mathbf{h}$. $\varphi$ is said to be equivalent to $\mathcal{N}$ if and only if for any assignment $\mathbf{x}$ there exists a function $\psi$ such that $s_\varphi(\mathbf{x}) = \psi\big(\min_{\mathbf{h}} E(\mathbf{x},\mathbf{h})\big)$.

This definition of equivalence is similar to that of Penalty Logic [12], whereby all assignments of truth-values satisfying a WFF are mapped to global minima of the energy function of network $\mathcal{N}$. In our case, by construction, assignments that do not satisfy the WFF are mapped to maxima of the energy function.

Definition 2

A strict DNF (SDNF) is a DNF with at most one conjunctive clause that maps to True for any assignment of truth-values $\mathbf{x}$. A full DNF is a DNF where each propositional variable must appear at least once in every conjunctive clause.

Lemma 1

Any SDNF $\varphi = \bigvee_j \big( \bigwedge_{t \in \mathcal{S}_{T_j}} x_t \wedge \bigwedge_{k \in \mathcal{S}_{K_j}} \neg x_k \big)$ can be mapped onto an energy function:

$$E_\varphi = -\sum_j \prod_{t \in \mathcal{S}_{T_j}} x_t \prod_{k \in \mathcal{S}_{K_j}} (1 - x_k)$$

where $\mathcal{S}_{T_j}$ (resp. $\mathcal{S}_{K_j}$) is the set of indices of the positive (resp. negative) literals in conjunctive clause $j$ of $\varphi$.

Proof. Each conjunctive clause $\bigwedge_{t \in \mathcal{S}_{T_j}} x_t \wedge \bigwedge_{k \in \mathcal{S}_{K_j}} \neg x_k$ in $\varphi$ can be represented by the product $\prod_{t \in \mathcal{S}_{T_j}} x_t \prod_{k \in \mathcal{S}_{K_j}} (1 - x_k)$, which maps to 1 if and only if $x_t = 1$ (i.e. True) and $x_k = 0$ (i.e. False) for all $t \in \mathcal{S}_{T_j}$ and $k \in \mathcal{S}_{K_j}$. Since $\varphi$ is a SDNF, it is True if and only if one conjunctive clause is True. Then, the sum $\sum_j \prod_{t \in \mathcal{S}_{T_j}} x_t \prod_{k \in \mathcal{S}_{K_j}} (1 - x_k) = 1$ if and only if the assignment of truth-values to the variables is a model of $\varphi$. Hence, the neural network with energy function $E_\varphi$ is such that $s_\varphi(\mathbf{x}) = -E_\varphi(\mathbf{x})$.
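The following sketch (our own code, with hypothetical names) implements the energy $E_\varphi$ of Lemma 1 for the clause representation introduced above and checks numerically that $s_\varphi(\mathbf{x}) = -E_\varphi(\mathbf{x})$ on all assignments of a small formula.

```python
from itertools import product

def lemma1_energy(dnf, x):
    """E_phi = -sum_j prod_{t in S_Tj} x_t * prod_{k in S_Kj} (1 - x_k)."""
    total = 0
    for pos, neg in dnf:
        term = 1
        for t in pos:
            term *= x[t]
        for k in neg:
            term *= (1 - x[k])
        total += term
    return -total

# SDNF of the two-variable XOR: (x0 AND NOT x1) OR (NOT x0 AND x1).
xor_sdnf = [({0}, {1}), ({1}, {0})]
for x in product([0, 1], repeat=2):
    s = int(any(all(x[t] == 1 for t in pos) and all(x[k] == 0 for k in neg)
                for pos, neg in xor_sdnf))
    assert s == -lemma1_energy(xor_sdnf, x)   # s_phi(x) == -E_phi(x)
print("Lemma 1 check passed for all assignments")
```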

Table 1: Energy function and truth table for the XOR formula; we use the symbol $\oplus$ to denote exclusive-or, that is, $x \oplus y \equiv (x \wedge \neg y) \vee (\neg x \wedge y)$.
Example 1

The XOR formula of Table 1 can be converted into a SDNF as follows: each truth-value assignment that satisfies the formula gives one conjunctive clause, and the resulting clauses are mutually exclusive by construction. For each conjunctive clause in this SDNF, a corresponding product term is added to the energy function following Lemma 1; for instance, a clause of the form $x \wedge \neg y$ contributes the term $-x(1-y)$. Hence, the energy function equivalent to the XOR formula is obtained by summing one such term per conjunctive clause.

We now show that any SDNF can be mapped onto an RBM.

Theorem 1

Any SDNF $\varphi$ can be mapped onto an equivalent RBM with energy function

$$E = -\sum_j h_j \Big( \sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon \Big) \qquad (2)$$

where $0 < \epsilon < 1$; $\mathcal{S}_{T_j}$ and $\mathcal{S}_{K_j}$ are, respectively, the sets of indices of the positive and negative literals in each conjunctive clause $j$ of the SDNF, and $|\mathcal{S}_{T_j}|$ is the number of positive literals in conjunctive clause $j$.

Proof. We have seen in Lemma 1 that any SDNF $\varphi$ can be mapped onto the energy function $E_\varphi = -\sum_j \prod_{t \in \mathcal{S}_{T_j}} x_t \prod_{k \in \mathcal{S}_{K_j}} (1 - x_k)$. For each expression $e_j = \prod_{t \in \mathcal{S}_{T_j}} x_t \prod_{k \in \mathcal{S}_{K_j}} (1 - x_k)$, we define an energy expression associated with hidden unit $h_j$ as $E_j = -h_j \big( \sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon \big)$. $E_j$ is minimized with value $-\epsilon$ when $e_j = 1$, written $\min_{h_j} E_j = -\epsilon\, e_j$. This is because $\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon > 0$ if and only if $x_t = 1$ and $x_k = 0$ for all $t \in \mathcal{S}_{T_j}$ and $k \in \mathcal{S}_{K_j}$. Otherwise, $\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon \le \epsilon - 1 < 0$ and $\min_{h_j} E_j = 0$, obtained with $h_j = 0$. By repeating this process for each $e_j$ we obtain that any SDNF $\varphi$ is equivalent to an RBM with the energy function $E = -\sum_j h_j \big( \sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon \big)$ such that $s_\varphi(\mathbf{x}) = -\frac{1}{\epsilon} \min_{\mathbf{h}} E(\mathbf{x}, \mathbf{h})$.
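A minimal sketch of the construction in Theorem 1, using our own hypothetical helper names: visible-to-hidden weights are $+1$ for positive literals and $-1$ for negative literals, the bias of hidden unit $j$ is $-(|\mathcal{S}_{T_j}| - \epsilon)$, and we verify that $s_\varphi(\mathbf{x}) = -\frac{1}{\epsilon}\min_{\mathbf{h}} E(\mathbf{x},\mathbf{h})$ on all assignments.

```python
import numpy as np
from itertools import product

def build_rbm(sdnf, n_vars, eps=0.5):
    """Build RBM parameters from an SDNF given as [(S_T, S_K), ...] (Theorem 1)."""
    n_hid = len(sdnf)
    W = np.zeros((n_vars, n_hid))
    b = np.zeros(n_hid)
    for j, (pos, neg) in enumerate(sdnf):
        for t in pos:
            W[t, j] = 1.0            # positive literal
        for k in neg:
            W[k, j] = -1.0           # negative literal
        b[j] = -(len(pos) - eps)     # hidden bias -(|S_Tj| - eps)
    return W, b

def min_energy(x, W, b):
    """min over binary h of E(x, h) = -sum_j h_j (x . W[:, j] + b_j)."""
    pre = x @ W + b                       # input to each hidden unit
    return -np.sum(np.maximum(pre, 0.0))  # h_j = 1 only when its term is positive

# XOR as SDNF: (x0 AND NOT x1) OR (NOT x0 AND x1).
sdnf = [({0}, {1}), ({1}, {0})]
eps = 0.5
W, b = build_rbm(sdnf, n_vars=2, eps=eps)
for x in product([0.0, 1.0], repeat=2):
    x = np.array(x)
    s = int(any(all(x[t] == 1 for t in pos) and all(x[k] == 0 for k in neg)
                for pos, neg in sdnf))
    assert np.isclose(s, -min_energy(x, W, b) / eps)
print("Theorem 1 check passed")
```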

Figure 1: An RBM equivalent to the XOR formula of Table 1.

Applying Theorem 1, an RBM for the XOR formula can be built as shown in Figure 1, choosing a value of $\epsilon$ with $0 < \epsilon < 1$. The energy function of this RBM has one term of the form $-h_j\big(\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon\big)$ per conjunctive clause of the SDNF.

For comparison, one may construct an RBM for XOR using Penalty Logic, as follows. First, we compute the higher-order energy function of the formula; then we transform it into a quadratic form by adding a hidden variable. The result is still not the energy function of an RBM, so we keep adding hidden variables until the energy function of an RBM might be obtained.

The above example illustrates in a simple case the value of using SDNF, in that it produces a direct translation into efficient RBMs by contrast with existing approaches. Next, we discuss the challenges of the conversion of WFFs into SDNF.

Representation Capacity: It is well-known that any formula can be converted into DNF. If a DNF $\varphi$ is not strict then, by definition, there is a group of conjunctive clauses in $\varphi$ which map to True when $\varphi$ is satisfied. This group of conjunctive clauses can always be converted into a full DNF, which is also a SDNF. Therefore, any WFF can be converted into SDNF. From Theorem 1, it follows that any WFF can be represented by the energy of an RBM. For example, $x \vee y$ becomes $(x \wedge y) \vee (x \wedge \neg y) \vee (\neg x \wedge y)$. We now describe a method for converting logical formulae into SDNF, which we use in the empirical evaluations that follow.

Let us consider a disjunctive clause:

$$D = l_1 \vee l_2 \vee \dots \vee l_n \qquad (3)$$

which can be rearranged as $D \equiv l_n \vee D'$, where $D'$ is the disjunctive clause obtained by removing $l_n$ from $D$. Each literal $l$ can be either $x_t$ or $\neg x_k$ for any $t \in \mathcal{S}_T$ and $k \in \mathcal{S}_K$. We have:

$$D \equiv (l_n \wedge \neg D') \vee D' \qquad (4)$$

because $a \vee b \equiv (a \wedge \neg b) \vee b$. By De Morgan's law ($\neg(a \vee b) \equiv \neg a \wedge \neg b$), we can always convert $\neg D'$ (and therefore $l_n \wedge \neg D'$) into a conjunctive clause.

By applying (4) repeatedly, each time we can eliminate a variable from a disjunctive clause by moving it into a new conjunctive clause. The disjunctive clause $D$ holds true if and only if either the disjunctive clause $D'$ holds true or the conjunctive clause $l_n \wedge \neg D'$ holds true.

As an example, consider the application of the transformation above to an if-then rule (logical implication):

$$x_1 \wedge x_2 \wedge \dots \wedge x_n \to y \qquad (5)$$

The logical implication is converted to DNF:

$$\neg x_1 \vee \neg x_2 \vee \dots \vee \neg x_n \vee y \qquad (6)$$

Applying the variable elimination method in (4) to all variables in the clause (6), we obtain the SDNF of the logical implication as:

$$\Big(\bigwedge_{i=1}^{n} x_i \wedge y\Big) \vee \bigvee_{i=1}^{n} \Big( \neg x_i \wedge \bigwedge_{i' < i} x_{i'} \Big) \qquad (7)$$

where the $i$-th disjunct pairs the negative literal $\neg x_i$ with the positive literals $x_{i'}$ for all $i' < i$, that is, the set of premises from which $x_i, \dots, x_n$ have been removed. This SDNF only has $n+1$ clauses, making the translation to an RBM very efficient. For example, using this method, the SDNF of $x_1 \wedge x_2 \wedge x_3 \to y$ is $(x_1 \wedge x_2 \wedge x_3 \wedge y) \vee (x_1 \wedge x_2 \wedge \neg x_3) \vee (x_1 \wedge \neg x_2) \vee \neg x_1$. We need an RBM with only 4 hidden units to represent this SDNF.¹

¹Of course, the number of hidden units will grow exponentially with the number of disjuncts in the rule's consequent (typically not allowed in logic programming); e.g. a disjunction of three variables such as $y_1 \vee y_2 \vee y_3$ has a full DNF with seven conjunctive clauses.
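The conversion in (5)-(7) is mechanical, so a short sketch may help. The function below (our own, hypothetical name) turns an implication with premises $x_1, \dots, x_n$ and conclusion $y$ into the $n+1$ clauses of its SDNF, using the (positive indices, negative indices) clause representation from Section 3.

```python
def implication_to_sdnf(premises, conclusion):
    """SDNF of (x_p1 AND ... AND x_pn) -> x_c as a list of
    (positive indices, negative indices) conjunctive clauses, per Equation (7)."""
    clauses = [(set(premises) | {conclusion}, set())]   # all premises true and y true
    for i, p in enumerate(premises):
        clauses.append((set(premises[:i]), {p}))        # x_1..x_{i-1} true, x_i false
    return clauses

# Example: x1 AND x2 AND x3 -> y, with variable indices x1=0, x2=1, x3=2, y=3.
for pos, neg in implication_to_sdnf([0, 1, 2], 3):
    print("positives:", sorted(pos), "negatives:", sorted(neg))
# Four clauses are produced, matching the four hidden units mentioned above.
```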

4 Reasoning in RBMs

Reasoning as Sampling

There is a direct relationship between inference in RBMs and logical satisfiability, as follows.

Proposition 1

Let $\mathcal{N}$ be an RBM constructed from a formula $\varphi$. Let $\mathcal{A}$ be a set of indices of variables that have been assigned to either True or False (we use $\mathbf{x}_\mathcal{A}$ to denote the set $\{x_\alpha \mid \alpha \in \mathcal{A}\}$). Let $\mathcal{U}$ be a set of indices of variables that have not been assigned a truth-value (we use $\mathbf{x}_\mathcal{U}$ to denote $\{x_u \mid u \in \mathcal{U}\}$). Performing Gibbs sampling on $\mathcal{N}$ given $\mathbf{x}_\mathcal{A}$ is equivalent to searching for an assignment of truth-values for $\mathbf{x}_\mathcal{U}$ that satisfies $\varphi$.

Proof. Theorem 1 has shown that the truth-value of $\varphi$ is inversely proportional to the RBM's rank (energy) function minimised over the hidden units, that is:

$$s_\varphi(\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U}) = -\frac{1}{\epsilon} \min_{\mathbf{h}} E(\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U}, \mathbf{h}) \qquad (8)$$

Therefore, a value of $\mathbf{x}_\mathcal{U}$ that minimises the energy function also maximises the truth-value, because:

$$\operatorname*{argmax}_{\mathbf{x}_\mathcal{U}} s_\varphi(\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U}) = \operatorname*{argmin}_{\mathbf{x}_\mathcal{U}} \min_{\mathbf{h}} E(\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U}, \mathbf{h}) \qquad (9)$$

Now, we can consider an iterative process to search for truth-values by minimising the RBM's energy function. This can be done by using gradient descent to update the values of $\mathbf{h}$ and then $\mathbf{x}_\mathcal{U}$ one at a time (similarly to the contrastive divergence algorithm) so as to minimise $E(\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U}, \mathbf{h})$ while keeping the other variables ($\mathbf{x}_\mathcal{A}$) fixed. The alternating updates are repeated until convergence. Notice that the gradients amount to:

$$-\frac{\partial E}{\partial h_j} = \sum_i w_{ij} x_i + b_j \qquad \text{and} \qquad -\frac{\partial E}{\partial x_u} = \sum_j w_{uj} h_j + a_u \qquad (10)$$

In the case of Gibbs sampling, given the assigned variables $\mathbf{x}_\mathcal{A}$, the process starts with a random initialisation of $\mathbf{x}_\mathcal{U}$ and proceeds to infer values for the hidden units and then for the unassigned variables in the visible layer of the RBM, using the conditional distributions $p(\mathbf{h} \mid \mathbf{x})$ and $p(\mathbf{x}_\mathcal{U} \mid \mathbf{h})$, respectively, where $\mathbf{x} = (\mathbf{x}_\mathcal{A}, \mathbf{x}_\mathcal{U})$ and

$$p(h_j = 1 \mid \mathbf{x}) = \sigma\Big(\tfrac{1}{\tau}\big(\textstyle\sum_i w_{ij} x_i + b_j\big)\Big) \quad \text{and} \quad p(x_u = 1 \mid \mathbf{h}) = \sigma\Big(\tfrac{1}{\tau}\big(\textstyle\sum_j w_{uj} h_j + a_u\big)\Big) \qquad (11)$$

with $\sigma(\cdot)$ the logistic sigmoid. It can be seen from (11) that the distributions are monotonic functions of the negative energy's gradients over $\mathbf{h}$ and $\mathbf{x}_\mathcal{U}$. Therefore, performing Gibbs sampling with these distributions can be seen as moving randomly towards a local point of minimum energy or, equivalently, towards an assignment of truth-values that satisfies the formula. Since the energy function of the RBM and the satisfiability of the formula are inversely proportional, each step of Gibbs sampling that reduces the energy should intuitively generate a sample that is closer to satisfying the formula.
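A minimal Gibbs-sampling sketch under the conditionals in (11) (our own code; `sigmoid`, `W`, `a`, `b` and the clamping convention are assumptions, not the paper's implementation): hidden units are sampled from $p(\mathbf{h}\mid\mathbf{x})$, then the unassigned visible units from $p(\mathbf{x}_\mathcal{U}\mid\mathbf{h})$, while the assigned units $\mathbf{x}_\mathcal{A}$ stay clamped.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_reasoning(W, a, b, assigned, n_steps=100, tau=1.0):
    """Sample truth-values for unassigned variables given clamped ones.
    W: (n_visible, n_hidden) weights, a: visible biases, b: hidden biases,
    assigned: dict {variable index: 0 or 1}."""
    n_vis = W.shape[0]
    x = rng.integers(0, 2, size=n_vis).astype(float)    # random initialisation
    for i, v in assigned.items():
        x[i] = v                                         # clamp assigned variables
    for _ in range(n_steps):
        p_h = sigmoid((x @ W + b) / tau)                 # p(h_j = 1 | x)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_x = sigmoid((W @ h + a) / tau)                 # p(x_i = 1 | h)
        x_new = (rng.random(p_x.shape) < p_x).astype(float)
        for u in range(n_vis):
            if u not in assigned:                        # only update unassigned units
                x[u] = x_new[u]
    return x

# Example: RBM built (as in the Theorem 1 sketch) for x1 AND x2 AND x3 -> y;
# clamp x1 = x2 = x3 = True and sample y.
eps = 0.5
clauses = [({0, 1, 2, 3}, set()), (set(), {0}), ({0}, {1}), ({0, 1}, {2})]
W = np.zeros((4, len(clauses))); b = np.zeros(len(clauses)); a = np.zeros(4)
for j, (pos, neg) in enumerate(clauses):
    for t in pos: W[t, j] = 1.0
    for k in neg: W[k, j] = -1.0
    b[j] = -(len(pos) - eps)
x = gibbs_reasoning(W, a, b, assigned={0: 1, 1: 1, 2: 1}, tau=0.2)
print("sampled y =", x[3])   # a low temperature favours the satisfying value y = 1
```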

Reasoning as Lowering Free Energy

When the number of unassigned variables is small enough that the partition function can be calculated directly, one can infer the assignment of $\mathbf{x}_\mathcal{U}$ using the conditional distribution:

$$p(\mathbf{x}_\mathcal{U} \mid \mathbf{x}_\mathcal{A}) = \frac{e^{-\mathcal{F}(\mathbf{x}_\mathcal{U}, \mathbf{x}_\mathcal{A})/\tau}}{\sum_{\mathbf{x}_\mathcal{U}'} e^{-\mathcal{F}(\mathbf{x}_\mathcal{U}', \mathbf{x}_\mathcal{A})/\tau}} \qquad (12)$$

where $\mathcal{F}(\mathbf{x}) = -\sum_j \log\big(1 + e^{\sum_i w_{ij} x_i + b_j}\big)$ is known as the free energy. Each free-energy term is a negative softplus function scaled by a non-negative value $c$ called the confidence value. It returns a negative output for a positive input and a close-to-zero output for a negative input. One can change the value of $c$ to make the function smoother, as shown in Figure 2.

Figure 2: Plots of the free-energy term for various confidence values $c$.

Each free-energy term is associated with a conjunctive clause in the SDNF through the weighted sum $\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon$. Therefore, if a truth-value assignment of $\mathbf{x}$ does not satisfy the formula, all free-energy terms will be close to zero. Otherwise, one free-energy term will be negative (approximately $-c\,\epsilon$ for a large confidence value $c$), with $\epsilon$ obtained from Theorem 1. Thus, the more likely a truth assignment is to satisfy the formula, the lower the free energy.
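For a small number of unassigned variables, the conditional in (12) can be computed by enumeration. Below is a sketch under the same assumptions as the previous snippets (zero visible biases, our own function names):

```python
import numpy as np
from itertools import product

def free_energy(x, W, b):
    """F(x) = -sum_j softplus(x . W[:, j] + b_j), assuming zero visible biases."""
    return -np.sum(np.logaddexp(0.0, x @ W + b))

def infer_unassigned(W, b, assigned, unassigned, tau=1.0):
    """Enumerate all completions of the unassigned variables and return
    p(x_U | x_A) proportional to exp(-F / tau), as in Equation (12)."""
    n_vis = W.shape[0]
    base = np.zeros(n_vis)
    for i, v in assigned.items():
        base[i] = v
    completions, scores = [], []
    for values in product([0.0, 1.0], repeat=len(unassigned)):
        x = base.copy()
        for u, v in zip(unassigned, values):
            x[u] = v
        completions.append(values)
        scores.append(-free_energy(x, W, b) / tau)
    scores = np.array(scores)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return dict(zip(completions, probs))

# Reuse the RBM built for x1 AND x2 AND x3 -> y, clamp the premises, infer y.
eps = 0.5
clauses = [({0, 1, 2, 3}, set()), (set(), {0}), ({0}, {1}), ({0, 1}, {2})]
W = np.zeros((4, 4)); b = np.zeros(4)
for j, (pos, neg) in enumerate(clauses):
    for t in pos: W[t, j] = 1.0
    for k in neg: W[k, j] = -1.0
    b[j] = -(len(pos) - eps)
print(infer_unassigned(W, b, assigned={0: 1, 1: 1, 2: 1}, unassigned=[3]))
# The completion y = 1 (the satisfying one) receives the higher probability.
```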

5 Logical Boltzmann Machines

Based on the previous theoretical results, we are now in position to introduce Logical Boltzmann Machines (LBM). LBM is a neurosymbolic system that uses Restricted Boltzmann Machines for distributed reasoning and learning from data and knowledge.

The LBM system converts any set of formulae $\{\varphi_1, \dots, \varphi_m\}$ into an RBM by applying Theorem 1 to each formula $\varphi_i$. In the case of Penalty Logic, formulae are weighted. Given a set of weighted formulae $\{(w_1, \varphi_1), \dots, (w_m, \varphi_m)\}$, one can also construct an equivalent RBM where each energy term generated from formula $\varphi_i$ is multiplied by $w_i$. In both cases, the assignments that minimise the energy of the RBM are the assignments that maximise the satisfiability of the formulae, i.e. the (weighted) sum of the truth-values of the formulae.

Proposition 2

Given a weighted knowledge-base $\mathcal{K} = \{(w_1, \varphi_1), \dots, (w_m, \varphi_m)\}$, there exists an equivalent RBM $\mathcal{N}$ such that $s_\mathcal{K}(\mathbf{x}) = -\frac{1}{\epsilon} \min_{\mathbf{h}} E(\mathbf{x},\mathbf{h})$, where $s_\mathcal{K}(\mathbf{x})$ is the sum of the weights of the formulae in $\mathcal{K}$ that are satisfied by assignment $\mathbf{x}$.

A formula $\varphi_i$ can be decomposed into a set of (weighted) conjunctive clauses from its SDNF. If there exist two conjunctive clauses such that one is subsumed by the other, then the subsumed clause is removed and the weight of the remaining clause is replaced by the sum of their weights. Identical conjunctive clauses are treated in the same way: one of them is removed and the weights are added. From Theorem 1, we know that a conjunctive clause $j$ is equivalent to an energy term $-h_j\big(\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon\big)$, where $0 < \epsilon < 1$. A conjunctive clause with weight $w_j$, therefore, is equivalent to the energy term $-w_j h_j\big(\sum_{t \in \mathcal{S}_{T_j}} x_t - \sum_{k \in \mathcal{S}_{K_j}} x_k - |\mathcal{S}_{T_j}| + \epsilon\big)$. For each weighted conjunctive clause, we can add a hidden unit $h_j$ to an RBM with connection weights $w_j$ for all $t \in \mathcal{S}_{T_j}$ and $-w_j$ for all $k \in \mathcal{S}_{K_j}$. The bias for this hidden unit will be $-w_j(|\mathcal{S}_{T_j}| - \epsilon)$. The weighted knowledge-base and the RBM are equivalent because $s_\mathcal{K}(\mathbf{x}) = -\frac{1}{\epsilon}\min_{\mathbf{h}} E(\mathbf{x},\mathbf{h})$, where $s_\mathcal{K}(\mathbf{x})$ is the sum of the weights of the clauses that are satisfied by $\mathbf{x}$.
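A sketch of the construction behind Proposition 2 (our own helper names; duplicate clauses are merged by summing their weights, and subsumption handling is omitted for brevity):

```python
import numpy as np
from itertools import product

def build_weighted_rbm(weighted_clauses, n_vars, eps=0.5):
    """weighted_clauses: list of (weight, positive indices, negative indices).
    Identical clauses are merged by adding their weights; each remaining clause
    becomes one hidden unit with weights +/- w and bias -w(|S_T| - eps)."""
    merged = {}
    for w, pos, neg in weighted_clauses:
        key = (frozenset(pos), frozenset(neg))
        merged[key] = merged.get(key, 0.0) + w
    W = np.zeros((n_vars, len(merged)))
    b = np.zeros(len(merged))
    for j, ((pos, neg), w) in enumerate(merged.items()):
        for t in pos: W[t, j] = w
        for k in neg: W[k, j] = -w
        b[j] = -w * (len(pos) - eps)
    return W, b

def weighted_satisfiability(weighted_clauses, x):
    """Sum of the weights of the clauses satisfied by binary assignment x."""
    return sum(w for w, pos, neg in weighted_clauses
               if all(x[t] == 1 for t in pos) and all(x[k] == 0 for k in neg))

# Check s_K(x) = -min_h E(x, h) / eps, with min_h E = -sum_j max(0, x.W[:,j] + b_j).
wc = [(1.5, {0}, {1}), (1.0, {1}, set()), (1.0, {0}, {1})]   # illustrative weights
W, b = build_weighted_rbm(wc, n_vars=2)
for x in product([0, 1], repeat=2):
    xv = np.array(x, dtype=float)
    min_E = -np.sum(np.maximum(xv @ W + b, 0.0))
    assert np.isclose(weighted_satisfiability(wc, x), -min_E / 0.5)
print("Proposition 2 check passed")
```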

Example 2

(Nixon diamond problem) Consider a weighted knowledge-base for the Nixon diamond problem: Nixon is both a Quaker and a Republican; Quakers tend to be pacifists, whereas Republicans tend not to be pacifists; each formula is assigned a weight (confidence value).

Figure 3: The RBM for the Nixon diamond problem. With the chosen confidence values and $\epsilon$, its energy function is constructed as in Proposition 2, with one hidden unit per weighted conjunctive clause.

Converting all formulae to SDNFs produces a set of weighted conjunctive clauses. After combining the weights of the clause that appears twice, an RBM is created (Figure 3) from the resulting unique conjunctive clauses and their confidence values.

6 Experimental Results

Reasoning

In this experiment, we apply LBM to search efficiently for satisfying truth assignments of the variables in large formulae. Let us define a class of formulae, parameterised by $M$ and $N$:

(13)

A formula in this class has $2^{M+N}$ possible truth assignments of its variables, a known subset of which map the formula to True (call this the satisfying set). Converting to SDNF, as done before but now for the whole class of formulae, we obtain:

(14)

Applying Theorem 1 to construct an RBM from the SDNF, we use Gibbs sampling to infer the truth-values of all variables. A sample is accepted as a satisfying assignment if its free energy is lower than or equal to a fixed threshold determined by the confidence value and $\epsilon$. We evaluate the coverage and accuracy of accepted samples. Coverage is measured as the proportion of the satisfying set that is accepted, that is, the number of satisfying assignments among the accepted samples divided by the size of the satisfying set. Accuracy is measured as the percentage of accepted samples that satisfy the formula.
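A sketch of this evaluation loop under the stated acceptance rule (our own code; the sampler, formula and threshold are stand-ins for the paper's setup):

```python
def evaluate_sampler(sample_assignment, satisfies, free_energy, threshold,
                     satisfying_set_size, n_samples=100_000):
    """Draw samples, accept those with free energy <= threshold, and report
    coverage (fraction of the satisfying set found) and accuracy
    (fraction of accepted samples that actually satisfy the formula)."""
    n_accepted, n_accepted_sat = 0, 0
    found = set()                              # distinct satisfying assignments seen
    for _ in range(n_samples):
        x = tuple(sample_assignment())         # e.g. one step of a Gibbs chain
        if free_energy(x) <= threshold:
            n_accepted += 1
            if satisfies(x):
                n_accepted_sat += 1
                found.add(x)
    coverage = len(found) / satisfying_set_size
    accuracy = n_accepted_sat / n_accepted if n_accepted else 0.0
    return coverage, accuracy
```

Here `sample_assignment`, `satisfies` and `free_energy` would be instantiated with the Gibbs sampler and free-energy function from the earlier sketches.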

Figure 4: Percentage coverage as sampling progresses in LBM. 100% coverage is achieved for the class of formulae with different values of M and N, averaged over 100 runs. The number of samples needed to achieve full coverage is much lower than the number of possible assignments ($2^{M+N}$).

We test different values of $M$ and $N$. LBM achieves 100% accuracy in all cases, meaning that all accepted samples do satisfy the formula. Figure 4 shows the coverage as Gibbs sampling progresses (measured each time a batch of samples is collected). Four cases are considered: M=20 and N=5, M=20 and N=10, M=25 and N=10, and M=30 and N=10. In each case, we run the sampling process 100 times and report the average results with standard deviations. The number of samples needed to achieve 100% coverage is much lower than the number of possible assignments ($2^{M+N}$). For example, when M=20 and N=10, all satisfying assignments are found after a few million samples are collected, whereas the number of possible assignments is approximately one billion, a ratio of sample size to search-space size of under one percent. The ratio for M=30 and N=10 is even lower. As far as we know, this is the first study of reasoning in neurosymbolic AI to produce such results with ratios this low.

Figure 5: Time taken by LBM to collect all satisfying assignments, compared with the size of the search space (number of possible assignments, up to about one billion), as N increases from 3 to 10 with M=20 fixed. The time grows exponentially, similarly to the search-space size, but at a much lower scale.

Figure 5 shows the time needed to collect all satisfying assignments for different values of N in the class of formulae, with M=20. LBM needs only around 10 seconds for the smallest value of N considered, and the time grows exponentially with N, similarly to the search-space size, but at a much lower scale.

Learning from Data and Knowledge

In this experiment, we evaluate LBM at learning the same Inductive Logic Programming (ILP) benchmark tasks used by the neurosymbolic system CILP++ [5], in comparison with the state-of-the-art ILP system Aleph [17]. An initial LBM is constructed from a subset of the available clauses used as background knowledge²; more hidden units with random weights are then added to the RBM and the system is trained further from examples, following the methodology used in the evaluation of CILP++. Both confidence values and network weights are free parameters for learning.

²The rest of the clauses are used for training and validation in the usual way, whereby satisfying assignments selected from each clause are converted into binary vectors and used as training examples.

We carry out experiments on 7 data sets: Mutagenesis [16], KRK [1], UW-CSE [13], and the Alzheimer's benchmark: Amine, Acetyl, Memory and Toxic [7]. For each data set, a proportion of the clauses is used as background knowledge to build the initial LBM, with a different proportion chosen for the larger data sets, UW-CSE and the Alzheimer's benchmark. The number of hidden units added to the LBM was chosen arbitrarily. For a fair comparison, we also evaluate LBM against a fully-connected RBM with enough hidden units to offer the RBM more parameters than the LBM, since the RBM does not use background knowledge. Both the RBM and LBM are trained in discriminative fashion [8], using the conditional distribution of the target variable given the inputs for inference. The code for these experiments will be made available.
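A sketch of how an initial LBM might be set up for learning, under the assumptions of the earlier snippets: hidden units encoding the background clauses (Theorem 1 construction) are concatenated with extra, randomly initialised hidden units whose weights, together with the confidence values, are then trained from data.

```python
import numpy as np

rng = np.random.default_rng(0)

def initial_lbm(background_clauses, n_vars, n_extra_hidden, eps=0.5, scale=0.01):
    """Hidden units 0..K-1 encode the background clauses (Theorem 1 construction,
    whose weights act as trainable confidence values); the remaining units are
    randomly initialised and learned purely from data."""
    n_bg = len(background_clauses)
    W = np.zeros((n_vars, n_bg + n_extra_hidden))
    b = np.zeros(n_bg + n_extra_hidden)
    for j, (pos, neg) in enumerate(background_clauses):
        for t in pos: W[t, j] = 1.0
        for k in neg: W[k, j] = -1.0
        b[j] = -(len(pos) - eps)
    W[:, n_bg:] = scale * rng.standard_normal((n_vars, n_extra_hidden))
    return W, b

# Background knowledge: the single rule x1 AND x2 AND x3 -> y as 4 clauses,
# plus 8 free hidden units to be learned from the training examples.
clauses = [({0, 1, 2, 3}, set()), (set(), {0}), ({0}, {1}), ({0, 1}, {2})]
W, b = initial_lbm(clauses, n_vars=4, n_extra_hidden=8)
print(W.shape, b.shape)   # (4, 12) (12,)
```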

The results using 10-fold cross-validation are shown in Table 2, except for UW-CSE, for which 5 folds are used. The results for Aleph and CILP++ were collected from [5]. It can be seen that LBM has the best performance in 5 out of 7 data sets. On the alz-acetyl data set, Aleph is better than all other models in this evaluation, and the RBM is best on the alz-amine data set, despite not using background knowledge (although such knowledge is provided to it in the form of training examples).

Data set      Aleph    CILP++   RBM      LBM
Mutagenesis   80.85    91.70    95.55    96.28
KRK           99.60    98.42    99.70    99.80
UW-CSE        84.91    70.01    89.14    89.43
alz-amine     78.71    78.99    79.13    78.25
alz-acetyl    69.46    65.47    62.93    66.82
alz-memory    68.57    60.44    68.54    71.84
alz-toxic     80.50    81.73    82.71    84.95
Table 2: Cross-validation performance of LBM against Aleph, CILP++ and standard RBM on 7 data sets.

7 Conclusion and Future Work

We introduced an approach and neurosymbolic system to reason about symbolic knowledge at scale in an energy-based neural network. We showed equivalence between minimising the energy of RBMs and satisfiability of Boolean formulae. We evaluated the system at learning and showed its effectiveness in comparison with state-of-the-art approaches. As future work we shall analyse further the empirical results and seek to pinpoint the benefits of LBM at combining reasoning and learning at scale.

References

  • [1] M. Bain and S. Muggleton (1995) Machine intelligence 13. K. Furukawa, D. Michie, and S. Muggleton (Eds.), pp. 291–309. External Links: ISBN 0-19-853850-2, Link Cited by: §6.
  • [2] W. W. Cohen, F. Yang, and K. Mazaitis (2017) TensorLog: deep learning meets probabilistic DBs. CoRR abs/1707.05390. External Links: Link Cited by: §2.
  • [3] A. d’Avila Garcez and L. C. Lamb (2020) Neurosymbolic ai: the 3rd wave. External Links: 2012.05876 Cited by: §1.
  • [4] R. Evans and E. Grefenstette (2018) Learning explanatory rules from noisy data. JAIR 61, pp. 1–64. Cited by: §1, §2.
  • [5] M. França, G. Zaverucha, and A. Garcez (2014) Fast relational learning using bottom clause propositionalization with artificial neural networks. Mach. Learning 94 (1), pp. 81–104. External Links: ISSN 1573-0565, Document, Link Cited by: §1, §1, §2, §6, §6.
  • [6] A.S. Garcez, K. Broda, and D. Gabbay (2001) Symbolic knowledge extraction from trained neural networks: a sound approach. Artif. Intel. 125 (1–2), pp. 155–207. Cited by: §2.
  • [7] R. D. King, M. J. E. Sternberg, and A. Srinivasan (1995-12-01) Relating chemical activity to structure: an examination of ilp successes. New Generation Computing 13 (3). Cited by: §6.
  • [8] H. Larochelle, M. Mandel, R. Pascanu, and Y. Bengio (2012-03) Learning algorithms for the classification restricted boltzmann machine. J. Mach. Learn. Res. 13 (1), pp. 643–669. External Links: ISSN 1532-4435 Cited by: §6.
  • [9] R. Manhaeve, S. Dumancic, A. Kimmig, T. Demeester, and L. De Raedt (2018) DeepProbLog: neural probabilistic logic programming. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), pp. 3749–3759. External Links: Link Cited by: §1, §2.
  • [10] G. Marcus (2018) Deep learning: A critical appraisal. CoRR abs/1801.00631. External Links: Link, 1801.00631 Cited by: §1.
  • [11] L. d. Penning, A. d. Garcez, L.C. Lamb, and J-J. Meyer (2011) A neural-symbolic cognitive agent for online learning and reasoning. In IJCAI, pp. 1653–1658. Cited by: §2.
  • [12] G. Pinkas (1995) Reasoning, nonmonotonicity and learning in connectionist networks that capture propositional knowledge. Artif. Intell. 77 (2), pp. 203–247. External Links: ISSN 0004-3702, Link, Document Cited by: §2, §3.
  • [13] M. Richardson and P. Domingos (2006-02) Markov logic networks. Mach. Learn. 62 (1-2), pp. 107–136. External Links: ISSN 0885-6125, Link, Document Cited by: §6.
  • [14] L. Serafini and A. S. d’Avila Garcez (2016) Learning and reasoning with logic tensor networks. In AI*IA, pp. 334–348. External Links: ISBN 978-3-319-49129-5, Link, Document Cited by: §2.
  • [15] P. Smolensky (1995) Constituent structure and explanation in an integrated connectionist/symbolic cognitive architecture. In Connectionism: Debates on Psychological Explanation, Cited by: §1, §3.
  • [16] A. Srinivasan, S. H. Muggleton, R.D. King, and M.J.E. Sternberg (1994) Mutagenesis: ilp experiments in a non-determinate biological domain. In Proceedings of the 4th International Workshop on Inductive Logic Programming, volume 237 of GMD-Studien, pp. 217–232. Cited by: §6.
  • [17] A. Srinivasan (2007) The aleph system, version 5. Note: http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html Cited by: §1, §6.
  • [18] G. Towell and J. Shavlik (1994) Knowledge-based artificial neural networks. Artif. Intel. 70, pp. 119–165. Cited by: §1, §2.
  • [19] S. Tran and A. Garcez (2018) Deep logic networks: inserting and extracting knowledge from deep belief networks. IEEE T. Neur. Net. Learning Syst. (29), pp. 246–258. External Links: Document, ISSN 2162-237X Cited by: §1, §2.
  • [20] F. Yang, Z. Yang, and W. W. Cohen (2017) Differentiable learning of logical rules for knowledge base reasoning. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), pp. 2319–2328. External Links: Link Cited by: §1.