
Neural-Symbolic Argumentation Mining: an Argument in Favour of Deep Learning and Reasoning

Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap towards performing advanced reasoning tasks. We illustrate how neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.



1 Introduction

The goal of argumentation mining (AM) is to automatically extract arguments and their relations from a given document Lippi and Torroni (2016). Recent years have seen the development of a large number of techniques in this area, in the wake of the advancements produced by deep learning across the whole research field of natural language processing (NLP). Yet, it is widely recognized that existing AM systems still have a large margin of improvement: good results have been obtained with some genres where prior knowledge on the structure of the text eases some AM tasks, but other genres such as legal cases and social media documents still require more work Cabrio and Villata (2018). Performing and understanding argumentation requires advanced reasoning capabilities that are natural skills for humans, but which are difficult to learn for a machine. Understanding whether a given piece of evidence supports a given claim, or whether two claims attack each other, are complex problems that humans are able to address thanks to their ability to exploit commonsense knowledge, and to perform reasoning and inference. Despite the remarkable impact of deep neural networks in NLP, we argue that these techniques alone will not suffice to address such complex issues.

We envisage that a paradigm shift in AM could come from the combination of symbolic and sub-symbolic approaches, such as those developed in the Neural Symbolic (NeSy) Garcez et al. (2015) or Statistical Relational Learning (SRL) Getoor and Taskar (2007); De Raedt et al. (2016); Kordjamshidi et al. (2018) communities. This issue is also widely recognized as one of the major challenges for the whole field of artificial intelligence in the coming years LeCun et al. (2015).

In computational argumentation, structured arguments have been studied and formalized for decades using models that can be expressed in a logic framework Bench-Capon and Dunne (2007). On the other hand, AM has rapidly evolved by exploiting state-of-the-art neural architectures coming from deep learning. So far, these two worlds have progressed largely independently of each other. Only recently, a few works have taken some steps towards the integration of such methods, by applying techniques combining sub-symbolic classifiers with knowledge expressed in the form of rules and constraints to AM. For instance, Niculae et al. (2017) adopted structured support vector machines and recurrent neural networks to collectively classify argument components and their relations in short documents, by hard-coding contextual dependencies and constraints of the argument model in a factor graph. A joint inference approach for argument component classification and relation identification was instead proposed by Persing and Ng (2016), following a pipeline scheme where integer linear programming is used to enforce mathematical constraints on the outcomes of a first-stage set of classifiers. In their recent work, Cocarascu and Toni (2018) combined a deep network for relation extraction with an argumentative reasoning approach that computes the dialectical strength of arguments, for the task of determining whether a review is truthful or deceptive.

In this paper, we propose to exploit the potential of both symbolic and sub-symbolic approaches for AM. Combining the two yields systems that can model knowledge and constraints with a logic formalism while retaining the computational power of deep networks. Unlike existing approaches, we propose to use a logic-based language for the definition of contextual dependencies and constraints, independently of the structure of the underlying classifiers. Most importantly, the approaches we outline do not rely on a pipeline scheme, but perform joint detection of argument components and relations through a single learning process.

2 Modeling Argumentation with Probabilistic Logic

Computational argumentation is concerned with modelling and analyzing argumentation in the computational settings of artificial intelligence Bench-Capon and Dunne (2007); Rahwan and Simari (2009). The formalization of arguments is usually addressed at two levels. At the argument level, the definition of formal languages for representing knowledge, and specifying how arguments and counterarguments can be constructed from that knowledge is the domain of structured argumentation Besnard et al. (2014). In structured argumentation, the premises and claim of the argument are made explicit, and their relationships are formally defined. However, when the discourse consists of multiple arguments, such arguments may conflict with one another and result in logical inconsistencies. A typical way of dealing with such inconsistencies is to identify sets of arguments that are mutually consistent and are collectively able to counter their "attackers". One way to do that is to abstract away from the internal structure of arguments and focus on the higher-level relations among arguments: a conceptual framework known as abstract argumentation Dung (1995).
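To make the notion of mutually consistent sets that counter their attackers concrete, the following Python fragment is a minimal sketch of conflict-freeness and admissibility in an abstract argumentation framework. The three-argument framework, the names `ARGUMENTS` and `ATTACKS`, and the brute-force enumeration are illustrative assumptions; the sketch is only meant for toy examples.

```python
from itertools import combinations


def is_conflict_free(candidate, attacks):
    """True if no member of the set attacks another member (or itself)."""
    return not any((a, b) in attacks for a in candidate for b in candidate)


def defends(candidate, argument, attacks):
    """True if every attacker of `argument` is attacked by some member of the set."""
    attackers = {a for (a, b) in attacks if b == argument}
    return all(any((d, a) in attacks for d in candidate) for a in attackers)


def admissible_sets(arguments, attacks):
    """Enumerate all admissible sets by brute force (exponential in the number of arguments)."""
    found = []
    for size in range(len(arguments) + 1):
        for subset in combinations(sorted(arguments), size):
            s = set(subset)
            if is_conflict_free(s, attacks) and all(defends(s, a, attacks) for a in s):
                found.append(s)
    return found


# Hypothetical framework: a attacks b, and b attacks c.
ARGUMENTS = {"a", "b", "c"}
ATTACKS = {("a", "b"), ("b", "c")}

RESULT = admissible_sets(ARGUMENTS, ATTACKS)
for s in RESULT:
    print(sorted(s))
```

In this toy framework, c is admissible only together with a, because a is the argument that counters c's attacker b.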

Similarly to structured argumentation, AM too builds on the definition of an argument model, and aims to identify parts of the input text that can be interpreted as argument components Lippi and Torroni (2016). For example, if we take a basic claim/evidence argument model, possible tasks could be claim detection Aharoni et al. (2014); Lippi and Torroni (2015), evidence detection Rinott et al. (2015), and the prediction of links between claim and evidence Niculae et al. (2017); Galassi et al. (2018). However, in structured argumentation the formalization of the model is the basis for an inferential process, whereby conclusions can be obtained starting from premises. In AM, instead, an argument model is usually defined in order to identify the target classes, and in some isolated cases to express relations, for instance among argument components Niculae et al. (2017), but not for producing inferences that could help the AM tasks.

The languages of structured argumentation are logic-based. An influential structured argumentation system is deductive argumentation Besnard and Hunter (2001), where premises are logical formulae that entail a claim, and entailment may be specified in a range of base logics, such as classical logic or modal logic. In assumption-based argumentation Dung et al. (2009), arguments instead correspond to assumptions, which, as in deductive systems, prove a claim, and attacks are obtained via a notion of contrary assumptions. Another powerful framework is defeasible logic programming (DeLP) García and Simari (2004), where claims can be supported using strict or defeasible rules, and an argument supporting a claim is warranted if it defeats all its counterarguments. For example, that a cephalopod is a mollusc could be expressed by a strict rule such as:

    mollusc(X) ← cephalopod(X)

because these notions belong to an artificially defined, incontrovertible taxonomy. However, since in nature not all molluscs have a shell, and actually cephalopods are molluscs without a shell, rules used to conclude that a given specimen has or does not have a shell are best defined as defeasible. For instance, one could say:

    has_shell(X) -< mollusc(X)
    ∼has_shell(X) -< cephalopod(X)

where -< denotes defeasible inference.

The choice of logic notwithstanding, rules offer a convenient way to describe argumentative inference. Moreover, depending on the application domain, the document genre, and the employed argument model, different constraints and rules can be enforced on the structure of the underlying network of arguments. For example, if we adopt a DeLP-like approach, strict rules can be used to define the relations among argument components, and defeasible rules to define context knowledge. In a hypothetical claim-premise model, for instance, support relations may be defined exclusively between a premise and a claim. Such structural properties could be expressed by the following strict rules:

    type(Y, claim) ← rel(X, Y, supp)
    type(X, premise) ← rel(X, Y, supp)

whereby if X supports Y, then Y is a claim and X is a premise. As another abstract example, two claims based on the same premise may not attack each other:

    ∼rel(Y1, Y2, att) ← rel(X, Y1, supp), rel(X, Y2, supp)

As an example of defeasible rules, consider instead the context information about a political debate, where a republican candidate, R, faces a democrat candidate, D. Then one may want to use the knowledge that R's claims and D's claims are likely to attack each other:

    rel(Y1, Y2, att) -< made_by(Y1, R), rep(R), made_by(Y2, D), dem(D)

where predicate made_by(Y1, R) denotes that claim Y1 was made by R. There exist many non-monotonic reasoning systems that integrate defeasible and strict inference. However, an alternative approach that may successfully reconcile the computational argumentation view and the AM view is offered by probabilistic logic programming (PLP).

PLP combines the capability of logic to represent complex relations among entities with the capability of probability to model uncertainty over attributes and relations Riguzzi (2018). In a PLP framework such as PRISM Sato and Kameya (1997), LPAD Vennekens et al. (2004) or ProbLog De Raedt et al. (2007), defeasible rules may be expressed by rules with a probability label. For instance, in LPAD syntax, one could write:

    rel(Y1, Y2, att) : 0.8 :- made_by(Y1, R), rep(R), made_by(Y2, D), dem(D).

to express that the above rule holds in 80% of cases. In this example, 0.8 could be interpreted as a weight or score suggesting how likely a constraint is to hold. In more recent approaches, such weights can be learned from examples.
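The meaning of such a probability label can be illustrated with a small Python sketch that enumerates possible worlds: each grounding of the rule whose body holds fires independently with probability 0.8, and the probability of a query is the total weight of the worlds in which it is true. The debate facts (`MADE_BY`, `REPUBLICANS`, `DEMOCRATS`) and the function names are hypothetical, and the exhaustive enumeration is a didactic simplification rather than the inference algorithm of any PLP system.

```python
from itertools import product

RULE_PROB = 0.8  # probability label of the example rule in the text

# Hypothetical debate facts: who made which claim, and each author's party.
MADE_BY = {"c1": "r", "c2": "d", "c3": "d"}
REPUBLICANS = {"r"}
DEMOCRATS = {"d"}


def rule_groundings():
    """All (Y1, Y2) pairs for which the rule body holds."""
    return [
        (y1, y2)
        for y1, a1 in sorted(MADE_BY.items())
        for y2, a2 in sorted(MADE_BY.items())
        if a1 in REPUBLICANS and a2 in DEMOCRATS
    ]


def query_probability(query):
    """Sum the weights of all worlds in which `query(attacks)` is true.

    A world decides, independently for each grounding, whether the rule
    fires (weight RULE_PROB) or not (weight 1 - RULE_PROB).
    """
    groundings = rule_groundings()
    total = 0.0
    for choices in product([True, False], repeat=len(groundings)):
        weight = 1.0
        attacks = set()
        for pair, fires in zip(groundings, choices):
            weight *= RULE_PROB if fires else 1.0 - RULE_PROB
            if fires:
                attacks.add(pair)
        if query(attacks):
            total += weight
    return total


# Probability that c1 attacks c2, and that c1 attacks at least one claim.
p_single = query_probability(lambda att: ("c1", "c2") in att)
p_any = query_probability(lambda att: any(y1 == "c1" for y1, _ in att))
print(p_single, p_any)
```

With these facts the rule has two groundings, so the two queries evaluate to 0.8 and 1 - 0.2 × 0.2 = 0.96, up to floating-point rounding.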

3 Combining Symbolic and Sub-Symbolic Approaches

The usefulness of deep networks has been tested and proven in many NLP tasks, such as machine translation Young et al. (2018), sentiment analysis Zhang et al. (2018a), text classification Conneau et al. (2017); Zhang et al. (2018b), relation extraction Huang and Wang (2017), as well as in AM Daxenberger et al. (2017); Cocarascu and Toni (2018); Schulz et al. (2018); Lauscher et al. (2018); Galassi et al. (2018); Lugini and Litman (2018). While a straightforward approach to exploit domain knowledge in AM is to apply a set of hand-crafted rules on the output of some first-stage classifier (such as a neural network), NeSy or SRL approaches can directly enforce (hard or soft) constraints during training, so that a solution that does not satisfy them is penalized, or even ruled out. Therefore, if a neural network is trained to classify argument components, and another one (or even the same one, in a multi-task setting) is trained to detect links between them, additional global constraints can be enforced to adjust the weights of the networks towards admissible solutions as the learning process advances. Systems like DeepProbLog Manhaeve et al. (2018), Logic Tensor Networks Serafini and Garcez (2016), or Ground-Specific Markov Logic Networks Lippi and Frasconi (2009), to mention a few, enable such a scheme.
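As a rough illustration of how a soft constraint can penalize inadmissible solutions during training, the Python sketch below adds to the task loss the negative log-probability that the structural constraint "a support goes from a premise to a claim" is satisfied. The function names, the independence assumption between the three predictions, and the weighting scheme are our own assumptions for the purpose of illustration; this is not the mechanism of any specific NeSy or SRL framework.

```python
import math


def constraint_penalty(p_supp, p_x_premise, p_y_claim):
    """Negative log-probability that the structural constraint holds.

    The constraint is violated when X supports Y while X is not a premise
    or Y is not a claim. The three predicted probabilities are treated as
    independent, which is a simplifying assumption.
    """
    p_violation = p_supp * (1.0 - p_x_premise * p_y_claim)
    # Clamp to avoid log(0) when a violation is predicted with certainty.
    return -math.log(max(1.0 - p_violation, 1e-12))


def total_loss(task_loss, penalty, weight=1.0):
    """Task loss plus a weighted constraint penalty (weight is a hypothetical hyperparameter)."""
    return task_loss + weight * penalty


# Two hypothetical sets of network outputs for the same pair of components.
consistent = constraint_penalty(p_supp=0.9, p_x_premise=0.95, p_y_claim=0.9)
inconsistent = constraint_penalty(p_supp=0.9, p_x_premise=0.2, p_y_claim=0.3)
print(consistent, inconsistent)
```

Predictions that assert a support link between components unlikely to be a premise and a claim receive a larger penalty, so gradient-based training would be pushed away from them.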

As an example, we report how to implement one of the cases mentioned in Section 2 with DeepProbLog. By extending the ProbLog framework, with DeepProbLog it is possible to introduce the following kind of construct:

    nn(m, x, [y1, ..., yn]) :: q(x, y1); ...; q(x, yn).

The effect of the construct is the creation of a set of ground probabilistic facts, whose probability is assigned by a neural network. This mechanism makes it possible to delegate to a neural network m the classification of a set of predicates q defined by some input features x. The possible classes are given by y1, ..., yn. Therefore, in the AM scenario, it would be possible, for example, to exploit two networks m_t and m_r to classify, respectively, the type of a potential argumentative component and the potential relation between two components. The corresponding DeepProbLog code would appear as in Figure 1. These predicates could be easily integrated within a probabilistic logic program designed for argumentation, so as to model (possibly weighted) constraints, rules, and preferences, such as those described in Section 2. Figure 2 illustrates one such possibility.

nn(m_t,H,[claim,prem,non_arg]) ::
    type(H,claim); type(H,prem); type(H,non_arg).
nn(m_r,H1,H2,[att,supp,none]) ::
    rel(H1,H2,att); rel(H1,H2,supp); rel(H1,H2,none).
Figure 1: An excerpt of a DeepProbLog program for the definition of neural predicates for AM.
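
The way a neural predicate such as the one for m_t gives rise to ground probabilistic facts can be sketched in plain Python as follows. The keyword-based `toy_network` is a hypothetical stand-in for a trained classifier, and the string representation of the facts is chosen only for readability; neither reflects the actual DeepProbLog implementation.

```python
import math

CLASSES = ["claim", "prem", "non_arg"]


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def toy_network(sentence):
    """Hypothetical stand-in for a trained classifier: one raw score per class."""
    text = sentence.lower()
    return [
        2.0 if "should" in text else 0.0,   # score for claim
        2.0 if "because" in text else 0.0,  # score for prem
        0.5,                                # score for non_arg
    ]


def ground_facts(sentence_id, sentence):
    """One probabilistic fact per class, with probabilities summing to one."""
    probs = softmax(toy_network(sentence))
    return {f"type({sentence_id},{label})": p for label, p in zip(CLASSES, probs)}


facts = ground_facts("h1", "We should ban plastic bags")
for fact, prob in facts.items():
    print(f"{prob:.3f}::{fact}")
```

The resulting facts are mutually exclusive alternatives for the same component, mirroring the disjunctive head of the neural predicate.
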
type(Y, claim) :- rel(X,Y,supp).
type(X, premise) :- rel(X,Y,supp).
\+rel(Y1,Y2,att) :-
    rel(X,Y1,supp), rel(X,Y2,supp).
0.8::rel(Y1,Y2,att) :-
    made_by(Y1,R), rep(R),
    made_by(Y2,D), dem(D).
Figure 2: An excerpt of a (Deep)ProbLog program for the definition of (probabilistic) rules for AM.
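
One simplified way to see how the strict rules on support interact with the neural predictions is to read them as hard constraints on joint labelings: the Python sketch below multiplies independent network outputs, discards inadmissible combinations, and renormalizes. The probability values and function names are hypothetical, and conditioning by exhaustive enumeration is an assumption made for illustration, not the inference procedure of DeepProbLog.

```python
from itertools import product

TYPES = ["claim", "prem", "non_arg"]
RELS = ["att", "supp", "none"]


def admissible(type_x, type_y, rel):
    """A support relation is only allowed from a premise X to a claim Y."""
    if rel == "supp":
        return type_x == "prem" and type_y == "claim"
    return True


def posterior(p_type_x, p_type_y, p_rel):
    """Distribution over joint labelings after discarding inadmissible ones."""
    weights = {}
    for tx, ty, r in product(TYPES, TYPES, RELS):
        if admissible(tx, ty, r):
            weights[(tx, ty, r)] = p_type_x[tx] * p_type_y[ty] * p_rel[r]
    z = sum(weights.values())
    return {world: w / z for world, w in weights.items()}


def marginal(post, index, value):
    """Marginal probability that position `index` of a labeling equals `value`."""
    return sum(p for world, p in post.items() if world[index] == value)


# Hypothetical network outputs for components X and Y and their relation.
p_type_x = {"claim": 0.5, "prem": 0.4, "non_arg": 0.1}
p_type_y = {"claim": 0.7, "prem": 0.2, "non_arg": 0.1}
p_rel = {"att": 0.1, "supp": 0.6, "none": 0.3}

post = posterior(p_type_x, p_type_y, p_rel)
p_supp = marginal(post, 2, "supp")
p_x_prem = marginal(post, 0, "prem")
print(p_supp, p_x_prem)
```

Here the relation network alone assigns 0.6 to a support link, but since X is not confidently a premise the joint view lowers this to about 0.30, while the belief that X is a premise rises from 0.4 to about 0.58.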

The kind of approach described here differs strongly from existing approaches in AM. Whereas Persing and Ng (2016) exploit a pipeline scheme to apply the constraints to the predictions made by deep networks at a first stage of computation, the framework we propose is capable of joint training, which includes the constraints within the learning phase. This can be viewed as an instance of Constraint Driven Learning Chang et al. (2012) and its continuous counterpart, posterior regularization Ganchev et al. (2010), where multiple signals contribute to a global decision, by being pushed to satisfy expectations on the global decision. Unlike the work by Niculae et al. (2017), who use factor graphs to encode inter-dependencies between random variables, our approach makes it possible to exploit the interpretable formalism of logic to represent rules. Moreover, the models of NeSy and SRL are typically able to learn the weights or the probabilities of the rules, or even to learn the rules themselves, thus addressing a structure learning task.
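In the simplest fully observed setting, learning the probability of a single rule reduces to counting: the Python sketch below estimates it as the relative frequency with which the head holds among the groundings whose body holds. The example data and the function name are hypothetical; actual frameworks typically rely on more general procedures, such as expectation maximization or gradient descent, which also cope with unobserved facts and interacting rules.

```python
def estimate_rule_probability(examples):
    """Relative frequency of the head among groundings whose body holds."""
    body_true = [ex for ex in examples if ex["body"]]
    if not body_true:
        raise ValueError("no grounding satisfies the rule body")
    return sum(ex["head"] for ex in body_true) / len(body_true)


# Hypothetical annotated pairs of claims: `body` says whether the pair opposes
# a republican and a democrat speaker, `head` whether an attack was annotated.
EXAMPLES = [
    {"body": True, "head": True},
    {"body": True, "head": True},
    {"body": True, "head": False},
    {"body": True, "head": True},
    {"body": True, "head": True},
    {"body": False, "head": True},
    {"body": False, "head": False},
]

learned_prob = estimate_rule_probability(EXAMPLES)
print(learned_prob)
```

Pairs whose body does not hold are ignored, since the rule makes no statement about them.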

4 Discussion

After many years of growing interest and remarkable results, the time is ripe for AM to scale up and move forward in its ability to support complex arguments. To this end, we argue that research in this area should aim at combining sub-symbolic and symbolic approaches, and that several state-of-the-art ML frameworks already provide the necessary ground for such a leap forward.

The combination of such approaches will leverage different forms of abstractions that we consider essential for AM. On the one hand, (probabilistic) logical representations enable one to specify AM systems in terms of data, world knowledge and other constraints, and to express uncertainties at a logical and conceptual level rather than at the level of individual random variables. This would make AM systems easier to interpret, a feature that is now becoming a need for AI in general Guidotti et al. (2018), since they could help explain the logic and the reasons that lead them to produce their arguments, while still dealing with the uncertainties stemming from the data and the (incomplete) background knowledge. On the other hand, AM is too complex to fully specify the distributions of random variables and their global (in)dependency structure a priori. Sub-symbolic models can tame this complexity by finding the right general outline, in the form of computational graphs, and processing data.

In order to fully exploit the potential of this joint approach, many challenges clearly have to be faced. First of all, several languages and frameworks for NeSy and SRL exist, each with its own characteristics in terms of both expressive power and efficiency. In this sense, AM would represent an ideal test-bed for such frameworks, by presenting a challenging, large-scale application domain where the exploitation of background knowledge could play a crucial role in boosting performance. Inference in this kind of model is clearly an issue, thus AM would provide additional benchmarks for the development of efficient algorithms, both in terms of memory consumption and running time. Finally, although several NeSy and SRL frameworks are already available, their tools are not yet mainstream, because these research areas are still relatively young and developing rapidly. Here, an effort is needed to integrate such tools with state-of-the-art neural architectures for NLP.


XT and KK acknowledge the support of the DFG project CAML (KE 1686/3-1, SPP 1999) and of the RMU project DeCoDeML.


  • Aharoni et al. [2014] E. Aharoni, A. Polnarov, T. Lavee, D. Hershcovich, R. Levy, R. Rinott, D. Gutfreund, and N. Slonim. A benchmark dataset for automatic detection of claims and evidence in the context of controversial topics. In Proceedings of the First Workshop on Argumentation Mining, pages 64--68. Association for Computational Linguistics, 2014.
  • Bench-Capon and Dunne [2007] T. Bench-Capon and P. E. Dunne. Argumentation in artificial intelligence. Artificial intelligence, 171(10-15):619--641, 2007. ISSN 0004-3702.
  • Besnard and Hunter [2001] P. Besnard and A. Hunter. A logic-based theory of deductive arguments. Artif. Intell., 128(1-2):203--235, May 2001. ISSN 0004-3702.
  • Besnard et al. [2014] P. Besnard, A. J. García, A. Hunter, S. Modgil, H. Prakken, G. R. Simari, and F. Toni. Introduction to structured argumentation. Argument & Computation, 5(1):1--4, 2014.
  • Cabrio and Villata [2018] E. Cabrio and S. Villata. Five years of argument mining: a data-driven analysis. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 5427--5433. International Joint Conferences on Artificial Intelligence Organization, 7 2018.
  • Chang et al. [2012] M.-W. Chang, L. Ratinov, and D. Roth. Structured learning with constrained conditional models. Machine Learning, 88(3):399--431, Sep 2012. ISSN 1573-0565.
  • Cocarascu and Toni [2018] O. Cocarascu and F. Toni. Combining deep learning and argumentative reasoning for the analysis of social media textual content using small data sets. Computational Linguistics, 44(4):833--858, 2018.
  • Conneau et al. [2017] A. Conneau, H. Schwenk, L. Barrault, and Y. Lecun. Very deep convolutional networks for text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 1107--1116. Association for Computational Linguistics, 2017.
  • Daxenberger et al. [2017] J. Daxenberger, S. Eger, I. Habernal, C. Stab, and I. Gurevych. What is the essence of a claim? cross-domain claim identification. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2055--2066. Association for Computational Linguistics, 2017.
  • De Raedt et al. [2007] L. De Raedt, A. Kimmig, and H. Toivonen. ProbLog: A probabilistic prolog and its application in link discovery. In IJCAI, pages 2462--2467, 2007.
  • De Raedt et al. [2016] L. De Raedt, K. Kersting, S. Natarajan, and D. Poole. Statistical Relational Artificial Intelligence: Logic, Probability, and Computation. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2016.
  • Dung [1995] P. M. Dung. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77(2):321--358, 1995.
  • Dung et al. [2009] P. M. Dung, R. A. Kowalski, and F. Toni. Assumption-based argumentation. In Argumentation in Artificial Intelligence Rahwan and Simari [2009], pages 199--218. ISBN 978-0-387-98196-3.
  • Galassi et al. [2018] A. Galassi, M. Lippi, and P. Torroni. Argumentative link prediction using residual networks and multi-objective learning. In Proceedings of the 5th Workshop on Argument Mining, pages 1--10. Association for Computational Linguistics, 2018.
  • Ganchev et al. [2010] K. Ganchev, J. Graça, J. Gillenwater, and B. Taskar. Posterior regularization for structured latent variable models. Journal of Machine Learning Research, 11:2001--2049, 2010.
  • Garcez et al. [2015] A. Garcez, T. R. Besold, L. De Raedt, P. Földiak, P. Hitzler, T. Icard, K.-U. Kühnberger, L. C. Lamb, R. Miikkulainen, and D. L. Silver. Neural-symbolic learning and reasoning: contributions and challenges. In Proceedings of the AAAI Spring Symposium on Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches, Stanford, 2015.
  • García and Simari [2004] A. J. García and G. R. Simari. Defeasible logic programming: An argumentative approach. Theory Pract. Log. Program., 4(2):95--138, Jan. 2004. ISSN 1471-0684.
  • Getoor and Taskar [2007] L. Getoor and B. Taskar. Introduction to statistical relational learning, volume 1. MIT press Cambridge, 2007.
  • Guidotti et al. [2018] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi. A survey of methods for explaining black box models. ACM Comput. Surv., 51(5):93:1--93:42, Aug. 2018. ISSN 0360-0300.
  • Huang and Wang [2017] Y. Y. Huang and W. Y. Wang. Deep residual learning for weakly-supervised relation extraction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1803--1807. Association for Computational Linguistics, 2017.
  • Kordjamshidi et al. [2018] P. Kordjamshidi, D. Roth, and K. Kersting. Systems AI: A declarative learning based programming perspective. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 5464--5471. International Joint Conferences on Artificial Intelligence Organization, 7 2018.
  • Lauscher et al. [2018] A. Lauscher, G. Glavaš, S. P. Ponzetto, and K. Eckert. Investigating the role of argumentation in the rhetorical analysis of scientific publications with neural multi-task learning models. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3326--3338. Association for Computational Linguistics, 2018.
  • LeCun et al. [2015] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521:436--444, 2015.
  • Lippi and Frasconi [2009] M. Lippi and P. Frasconi. Prediction of protein β-residue contacts by Markov logic networks with grounding-specific weights. Bioinformatics, 25(18):2326--2333, 2009. ISSN 1367-4803.
  • Lippi and Torroni [2015] M. Lippi and P. Torroni. Context-independent claim detection for argument mining. In Q. Yang and M. Wooldridge, editors, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 185--191. AAAI Press, 2015. ISBN 978-1-57735-738-4.
  • Lippi and Torroni [2016] M. Lippi and P. Torroni. Argumentation mining: State of the art and emerging trends. ACM Trans. Internet Technol., 16(2):10:1--10:25, Mar. 2016.
  • Lugini and Litman [2018] L. Lugini and D. Litman. Argument component classification for classroom discussions. In Proceedings of the 5th Workshop on Argument Mining, pages 57--67. Association for Computational Linguistics, 2018.
  • Manhaeve et al. [2018] R. Manhaeve, S. Dumancic, A. Kimmig, T. Demeester, and L. De Raedt. DeepProbLog: Neural probabilistic logic programming. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, pages 3753--3763. Curran Associates, Inc., 2018.
  • Niculae et al. [2017] V. Niculae, J. Park, and C. Cardie. Argument mining with structured SVMs and RNNs. In R. Barzilay and M. Kan, editors, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pages 985--995. Association for Computational Linguistics, 2017.
  • Persing and Ng [2016] I. Persing and V. Ng. End-to-end argumentation mining in student essays. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1384--1394. Association for Computational Linguistics, 2016.
  • Rahwan and Simari [2009] I. Rahwan and G. Simari. Argumentation in Artificial Intelligence. Springer US, 2009. ISBN 978-0-387-98196-3.
  • Riguzzi [2018] F. Riguzzi. Foundations of Probabilistic Logic Programming. River Publishers, Gistrup, Denmark, 2018. ISBN 9788770220187.
  • Rinott et al. [2015] R. Rinott, L. Dankin, C. Alzate Perez, M. M. Khapra, E. Aharoni, and N. Slonim. Show me your evidence - an automatic method for context dependent evidence detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 440--450. Association for Computational Linguistics, 2015.
  • Sato and Kameya [1997] T. Sato and Y. Kameya. PRISM: A language for symbolic-statistical modeling. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, IJCAI 97, Nagoya, Japan, August 23-29, 1997, 2 Volumes, pages 1330--1339. Morgan Kaufmann, 1997.
  • Schulz et al. [2018] C. Schulz, S. Eger, J. Daxenberger, T. Kahse, and I. Gurevych. Multi-task learning for argumentation mining in low-resource settings. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 35--41. Association for Computational Linguistics, 2018.
  • Serafini and Garcez [2016] L. Serafini and A. d. Garcez. Logic tensor networks: Deep learning and logical reasoning from data and knowledge. arXiv preprint arXiv:1606.04422, 2016.
  • Vennekens et al. [2004] J. Vennekens, S. Verbaeten, and M. Bruynooghe. Logic programs with annotated disjunctions. In B. Demoen and V. Lifschitz, editors, Logic Programming, pages 431--445, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg. ISBN 978-3-540-27775-0.
  • Young et al. [2018] T. Young, D. Hazarika, S. Poria, and E. Cambria. Recent trends in deep learning based natural language processing [review article]. IEEE Computational Intelligence Magazine, 13(3):55--75, Aug 2018. ISSN 1556-603X.
  • Zhang et al. [2018a] L. Zhang, S. Wang, and B. Liu. Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4):e1253, 2018a.
  • Zhang et al. [2018b] Y. Zhang, R. Henao, Z. Gan, Y. Li, and L. Carin. Multi-label learning from medical plain text with convolutional residual models. In F. Doshi-Velez, J. Fackler, K. Jung, D. Kale, R. Ranganath, B. Wallace, and J. Wiens, editors, Proceedings of the 3rd Machine Learning for Healthcare Conference, volume 85 of Proceedings of Machine Learning Research, pages 280--294, Palo Alto, California, 17--18 Aug 2018b. PMLR.