Establishing linguistic conventions in task-oriented primeval dialogue

03/02/2012 ∙ by Martin Bachwerk, et al. ∙ Trinity College Dublin

In this paper, we claim that language is likely to have emerged as a mechanism for coordinating the solution of complex tasks. To confirm this thesis, computer simulations are performed based on the coordination task presented by Garrod & Anderson (1987). The role of success in task-oriented dialogue is analytically evaluated with the help of performance measurements and a thorough lexical analysis of the emergent communication system. Simulation results confirm a strong effect of success mattering on both reliability and dispersion of linguistic conventions.




1 Introduction

In the last decade, the field of communication science has seen a major increase in the number of research programmes that go beyond the more conventional studies of human dialogue (e.g. [6, 7]) in an attempt to reproduce the emergence of conventionalized communication systems in a laboratory (e.g. [4, 8, 10]). In his seminal paper, Galantucci proposed to refer to this line of research as experimental semiotics, which he sees as a more general form of experimental pragmatics. In particular, Galantucci states that the former “studies the emergence of new forms of communication”, while the latter “studies the spontaneous use of pre-existing forms of communication” (p. 394, [5]).

Experimental semiotics provides a novel way of reproducing the emergence of a conventionalized communication system under laboratory conditions. However, the findings from this field cannot be transferred to the question of the primeval emergence of language without a caveat: the subjects of present-day experiments are very much familiar with the concepts of conventions and communication systems (even if they are not allowed to employ any existing versions of these in the conducted experiments), while our ancestors who somehow managed to invent the very first conventionalized signaling system, by definition, could not have been aware of these concepts. Since experimental semiotics researchers cannot adjust the minds of their subjects in order to find out how they could discover the concept of a communication system, the most these experiments can realistically achieve is to make the subjects signal the ‘signalhood’ of some novel form of communication (see [13]). To go any further seems, at least for now, to require the use of computer models and simulations.

Consequently, we are interested in how a community of simulated agents can agree on a set of lexical conventions when given only very limited knowledge about the notion of a communication system. In this paper, we address this issue by conducting several computer simulations that are meant to reconstruct the human experiments conducted by [6] and [7]. These experiments suggest that the establishment of new conventions requires at least some understanding to be experienced, for example as measured by the success of the action performed in response to an utterance, and that differently organized communities can come up with variously effective communication systems. While the communities in the current experiments are in a way similar to the social structures implemented in [1], the focus here is on local coordination and the role of task-related communicative success, rather than the effect of different higher-order group structures.

2 Modelling Approach

The experiments presented in this paper have been performed with the help of the Language Evolution Workbench (LEW) (see [16, 1] for more detailed descriptions of the model). This workbench provides over 20 adjustable parameters and makes as few assumptions as possible about the agents’ cognitive skills and their awareness of the possibility of a conventionalized communication system. The few cognitive skills that are assumed can be considered widely accepted (see [11, 14] among others) as the minimal prerequisites for the emergence of language. These skills include the ability to observe and individuate events, the ability to engage in a joint attention frame fixed on an occurring event, and the ability to interact by constructing words and utterances from abstract symbols (footnote 1) and transmitting these to one’s interlocutor (footnotes 2 and 3). During such interactions, one of the agents is assigned the intention to comment on the event, while a second agent assumes that the topic of the utterance relates in some way to the event and attempts to decode the meaning of the encountered symbols accordingly.

Footnote 1: While we often refer to such symbols as ‘phonemes’ throughout the paper, there is no reason why these should not be representative of gestural signs.

Footnote 2: Phenomena such as noise and loss of data during signal transmission are ignored in our approach for the sake of simplicity.

Footnote 3: It is important to stress that hearers are not assumed to know the word boundaries of an encountered utterance. However, simulations with so-called synchronized transmission have been performed previously by [15].
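The interaction described above can be illustrated with a toy sketch. It is not the LEW's implementation: the symbol inventory, the word lengths, the random segmentation strategy and all function names are assumptions made purely for illustration. The one property taken from the text is that the hearer receives an unsegmented string of symbols and has to guess how it relates to its own view of the event.

```python
import random

PHONEMES = "abdegikmnoptu"  # hypothetical symbol inventory, not taken from the LEW


def make_word(rng, min_len=1, max_len=3):
    """Invent a new word as a random string of abstract symbols."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(PHONEMES) for _ in range(length))


def speak(meanings, lexicon, rng):
    """Speaker: express each meaning with a known or newly invented word.

    The words are concatenated, so the utterance carries no word boundaries.
    """
    words = []
    for meaning in meanings:
        if meaning not in lexicon:
            lexicon[meaning] = make_word(rng)
        words.append(lexicon[meaning])
    return "".join(words)


def random_segmentation(utterance, parts, rng):
    """Split an utterance into `parts` non-empty chunks at random positions."""
    parts = min(parts, len(utterance))
    if parts < 1:
        return []
    cuts = sorted(rng.sample(range(1, len(utterance)), parts - 1))
    bounds = [0] + cuts + [len(utterance)]
    return [utterance[i:j] for i, j in zip(bounds, bounds[1:])]


def hear(utterance, perceived_meanings, rng):
    """Hearer: guess boundaries, then pair chunks with its own view of the event."""
    chunks = random_segmentation(utterance, len(perceived_meanings), rng)
    return list(zip(chunks, perceived_meanings))


rng = random.Random(42)
speaker_lexicon = {}
utterance = speak(["HUNTER", "CHASE", "DEER"], speaker_lexicon, rng)
# The hearer individuates the same event differently (two meanings, not three).
guesses = hear(utterance, ["DEER", "RUN"], rng)
```

Because speaker and hearer individuate the event from their own perspectives, the hearer's guessed form-meaning pairs will often disagree with the speaker's lexicon, which is the alignment problem the simulations are meant to study.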

From an evolutionary point of view, the LEW fits in with the so-called faculty of language in the narrow sense as proposed by [9] in that the agents are equipped with the sensory, intentional and concept-mapping skills at the start, and the simulations attempt to provide an insight into how these could be combined to produce a communication system with properties comparable to those of a human language. From a pragmatics point of view, our approach directly adopts the claim made by [12] that dialogue is the underlying form of communication. Furthermore, despite the agents in the LEW lacking any kind of embodiment, they are designed in a way that makes each agent individuate events according to its own perspective, which in most cases results in their situation models being initially non-aligned, thus providing the agents with the task of aligning their representations, similarly to the account presented in [12].

3 Experiment Design

In the presented experiments, we aim to reproduce the two studies originally performed by Garrod and his colleagues, but in an evolutionary simulation performed on an abstract model of communication. Our reconstruction lies in the context of a simulated dynamic system of agents which should provide us with some insights about how Garrod’s findings can be transferred to the domain of language evolution. The remainder of this section outlines the configuration of the LEW used in the present study, together with an explanation of the three manipulated parameters. The results of the corresponding simulations are then evaluated in Section 4, with special emphasis being put on the communicative potential and general linguistic properties of the emergent communication systems (footnote 4).

Footnote 4: We intentionally refrain from referring to the syntax-less communication systems that emerge in our simulations as ‘language’ as that would be seen as highly contentious by many readers. Furthermore, even though the term ‘protolanguage’ appears to be quite suited for our needs (cf. [11]), the controversial nature of that term does not really encourage its use either, prompting us to stick to more neutral expressions.

Garrod observed in his two studies that conventions have a better chance of getting established and reused if their utilisation appears to lead to one’s interlocutor understanding one’s utterance, as indicated either by explicit signaling or by the performance of an adequate action. Notably, in task-based communication, interlocutors may succeed in achieving a task with or without complete mutual understanding of the surrounding dialogue. Nevertheless, our simulations have been focussed on a parameter of the LEW that defines the probability that communicative success matters in an interaction. From an evolutionary point of view, this parameter is motivated by the numerous theories that put cooperation and survival as the core function of communication (e.g. [2]). However, the abstract implementation of the parameter allows us to refrain from selecting any particular evolutionary theory as the target one by generalizing over all kinds of possible success that may result from a communication bout, e.g. avoiding a predator, hunting down a prey or battling off a rival gang.

The parameter that defines whether success matters was varied between 0 and 1 (in steps of 0.25) in the presented simulations. To clarify the selected values, a value of 0 means that communicative success plays no role whatsoever in the system, and a value of 1 means that only interactions satisfying a minimum success threshold will be remembered by the agents. The minimum success threshold is established by an additional parameter of the LEW and can generally be interpreted as the minimum amount of information that the hearer needs to extract from an encountered utterance for it to be of any use. In our experiments, we varied the minimum success threshold between 0.25 and 1 (in steps of 0.25) (footnote 5). The effects of this parameter will not be reported in this paper due to a lack of significance and space limitations.

Footnote 5: Setting the minimum success threshold to 0 is equivalent to setting the probability that success matters to 0.
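The interplay of the two parameters can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the LEW's code: the function and argument names are hypothetical, and `success` is assumed to be a value between 0 and 1 expressing how much of the utterance the hearer recovered.

```python
import random


def is_remembered(success, p_success_matters, min_success, rng):
    """Decide whether the agents store what they used in one interaction.

    success            -- assumed share of information recovered by the hearer (0..1)
    p_success_matters  -- probability that success matters in this interaction
    min_success        -- minimum success threshold
    """
    success_matters = rng.random() < p_success_matters
    if not success_matters:
        # Success plays no role: the interaction is always remembered.
        return True
    # Success matters: only sufficiently successful interactions are kept.
    return success >= min_success


rng = random.Random(0)
always_kept = is_remembered(0.0, 0.0, 1.0, rng)   # probability 0: kept regardless
filtered = is_remembered(0.5, 1.0, 0.75, rng)     # probability 1: below threshold
```

With a threshold of 0 every interaction passes the final comparison, so the outcome is the same as when success never matters.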

In addition to the above two parameters, the presented experiments also introduce two different interlocutor arrangements, similar to the studies in [6] and [7]. In the first of these, pairs of agents are partnered with each other for the whole duration of the simulation, meaning that they do not converse with any other agents at all. The second arrangement emulates the community setting introduced in [7] by successively alternating the pairings of agents, in our case after every 100 interaction ‘epochs’ (footnote 6). The introduction of the community setting was motivated by the hypothesis that a community of agents should be able to engage in a global coordination process, as opposed to local entrainment, resulting in more generalized and thus eventually more reliable conventions.

Footnote 6: In both cases, the agent population was set to ten and so each ‘epoch’ comprised ten interactions, whereby every agent would on average take part in two interactions: once as a speaker and once as a hearer.
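The two arrangements can be sketched as a pairing schedule. The text does not say how the pairings are alternated, so the round-robin rotation below is an assumption, as are all function names. The sketch is also deterministic, with every agent speaking and hearing exactly once per epoch, whereas the text only states this as an average.

```python
def fixed_pairs(agents):
    """Fixed-partner arrangement: (0, 1), (2, 3), ... for the whole run."""
    return [(agents[i], agents[i + 1]) for i in range(0, len(agents), 2)]


def rotated_pairs(agents, round_index):
    """Round-robin pairing (circle method): first agent fixed, the rest rotate."""
    rest = agents[1:]
    k = round_index % len(rest)
    line = [agents[0]] + rest[k:] + rest[:k]
    n = len(line)
    return [(line[i], line[n - 1 - i]) for i in range(n // 2)]


def pairs_for_epoch(agents, epoch, community, epochs_per_pairing=100):
    """Return the pairs active in a given epoch for either arrangement."""
    if not community:
        return fixed_pairs(agents)
    return rotated_pairs(agents, epoch // epochs_per_pairing)


def interactions_for_epoch(pairs):
    """Each pair interacts twice with swapped roles: (speaker, hearer)."""
    interactions = []
    for a, b in pairs:
        interactions.append((a, b))
        interactions.append((b, a))
    return interactions


agents = list(range(10))
epoch_0 = interactions_for_epoch(pairs_for_epoch(agents, 0, community=True))
epoch_100 = interactions_for_epoch(pairs_for_epoch(agents, 100, community=True))
```

With ten agents this yields five pairs and ten interactions per epoch, and in the community setting the partners change once every 100 epochs.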

4 Results and Discussion

The experimental setup described above resulted in 34 different parameter combinations, for each of which 600 independent runs were performed in order to obtain empirically reliable data. The data have been evaluated with the help of a number of measures that were selected with the goal of describing both the communicative usefulness of an evolved convention system and how its main properties compare to those of languages as we know them now (see [1] for a more detailed account).
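The count of 34 is consistent with the following reading of the design, which is an assumption rather than something the text states outright: the minimum success threshold has no effect when the probability that success matters is 0, so that level contributes only one configuration per interlocutor arrangement. The variable names are illustrative.

```python
from itertools import product

success_matters_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
min_success_levels = [0.25, 0.5, 0.75, 1.0]
arrangements = ["pairs", "community"]

combinations = set()
for arrangement, p, threshold in product(
    arrangements, success_matters_levels, min_success_levels
):
    if p == 0.0:
        # Assumed: the threshold is irrelevant when success never matters.
        threshold = None
    combinations.add((arrangement, p, threshold))

print(len(combinations))  # 34, i.e. 2 * (4 * 4 + 1)
```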

In order to understand how well a communication system performs in a simulation, it is common to observe the understanding precision and recall rates, which can be combined into a single F-measure. As can be seen from Figure 1, the results suggest that a higher probability that success matters has a direct effect on the understanding rates of a community (test statistic values between 26.68 and 210.63). However, a communication setup in which agents communicate with each other in turns, as opposed to with a fixed partner, does not appear to be advantageous for the establishment of a reliable means of communication. Looking further, Figure 1 indicates that, just as observed in [7], agents operating in a community have a larger amount of variation available to them, in our case in the form of a larger lexicon. However, unlike in the empirical study, the agents in the LEW do not benefit from this property, among other things due to the lack of an ability to enter into a negotiation about which conventions to use in a given context.
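For reference, precision and recall are conventionally combined as a weighted harmonic mean. The weighting used in the study is not recoverable from the text, so the balanced variant (beta = 1) in this sketch is an assumption.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


balanced = f_measure(0.5, 0.5)   # 0.5
skewed = f_measure(1.0, 0.5)     # about 0.667: pulled toward the lower value
```

The harmonic mean penalizes imbalance, so a hearer that decodes a few meanings very precisely but misses most of the utterance still scores low.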

Figure 1: Effect of the interaction type and the probability that success matters on (a) communicative success and (b) agent lexicon size.

It is important to note at this stage that the understanding measure presented in Figure 1 only takes into account the interactions that have been successful, i.e. were not below the minimum success threshold in cases where success was chosen to matter. Consequently, this figure does not tell us how well the agents’ lexicons are actually equipped to interpret a wide range of utterances. In order to evaluate the lexicons of agents without any effect that simple guessing luck might have on understanding, we take a look at two further measures: lexicon use, i.e. the average ratio of forms of an utterance that the hearer agent was able to find in its lexicon, and lexicon precision, i.e. the ratio of correct meanings found by the hearer in the cases where the agent used its lexicon for decoding a form. Furthermore, the decrease in lexicon size alone does not provide any specific information as to what exactly is happening to the agents’ lexicons. In other words, further measures are required that could explain what effect the decrease actually has on the expressive and interpretative potential of a lexicon.
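The two lexicon measures can be sketched for a single utterance as follows. The data structures are assumptions made for illustration: the hearer's lexicon is modelled as a plain form-to-meaning dictionary and the utterance as a list of (form, intended meaning) pairs, neither of which is claimed to match the LEW's internal representation. The reported figures would be averages of such values over many interactions.

```python
def lexicon_use(utterance, lexicon):
    """Share of the utterance's forms that the hearer finds in its lexicon."""
    if not utterance:
        return 0.0
    found = sum(1 for form, _ in utterance if form in lexicon)
    return found / len(utterance)


def lexicon_precision(utterance, lexicon):
    """Share of lexicon-decoded forms whose stored meaning is the intended one."""
    decoded = [(form, meaning) for form, meaning in utterance if form in lexicon]
    if not decoded:
        return 0.0
    correct = sum(1 for form, meaning in decoded if lexicon[form] == meaning)
    return correct / len(decoded)


hearer_lexicon = {"ba": "RUN", "ki": "EAT"}
utterance = [("ba", "RUN"), ("ki", "SLEEP"), ("zo", "EAT"), ("mu", "HUNT")]

use = lexicon_use(utterance, hearer_lexicon)              # 2 of 4 forms known: 0.5
precision = lexicon_precision(utterance, hearer_lexicon)  # 1 of 2 lookups right: 0.5
```

Forms that the hearer has to guess are excluded from the precision score, which is what removes the effect of guessing luck.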

Figure 2: Effect of the interaction type and the probability that success matters on (a) lexicon use and (b) lexicon precision.
Figure 3: Effect of the interaction type and the probability that success matters on the number of (a) unique meanings and (b) unique forms in agent lexicons.

Figure 2 depicts the rates of lexicon use, suggesting that with the increase of the probability that success matters and the corresponding diminishing of lexicon size, the number of forms in an agent’s lexicon appears to decrease, with a significant effect on lexicon use, as further confirmed by Figure 3. The intuition is that for higher levels of this probability, wrongly guessed meanings are not being recorded in the agents’ lexicons, resulting in higher quality convention systems. This is confirmed by the increase in lexicon precision depicted in Figure 2. Interestingly enough, the decrease in the number of different forms in agents’ lexicons does not seem to have a significant effect on agent lexicon synonymy across the board, although an effect is reported for higher levels of the probability that success matters (see Figure 4). Presumably, the reason for this is that the drop-off in the number of distinct meanings (see Figure 3) is directly proportional to that of distinct forms, which would explain the less affected synonymy and homonymy ratios (see Figure 4 for a plot of the latter).
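The proportionality argument can be made concrete with a small sketch. The definitions used here are assumptions, since the text does not spell them out: synonymy is taken as the average number of forms per meaning and homonymy as the average number of meanings per form, with a lexicon modelled as a set of (meaning, form) mappings.

```python
def lexicon_stats(mappings):
    """Summarize a lexicon given as a set of (meaning, form) mappings."""
    meanings = {meaning for meaning, _ in mappings}
    forms = {form for _, form in mappings}
    return {
        "unique_meanings": len(meanings),
        "unique_forms": len(forms),
        # Assumed definition: average number of forms per meaning.
        "synonymy": len(mappings) / len(meanings) if meanings else 0.0,
        # Assumed definition: average number of meanings per form.
        "homonymy": len(mappings) / len(forms) if forms else 0.0,
    }


large = {
    ("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"),
    ("C", "z"), ("C", "w"), ("D", "z"), ("D", "w"),
}
small = {("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")}

large_stats = lexicon_stats(large)  # 4 meanings, 4 forms, both ratios 2.0
small_stats = lexicon_stats(small)  # 2 meanings, 2 forms, both ratios 2.0
```

Halving the number of meanings, forms and mappings together leaves both ratios unchanged, which is the pattern the paragraph appeals to.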

Figure 4: Effect of the interaction type and the probability that success matters on (a) agent lexicon synonymy and (b) agent lexicon homonymy.
Figure 5: Effect of the interaction type (panel (a) only) and the probability that success matters on (a) average mapping share and (b) ratio of mappings shared by exactly X agents.

So far we have only evaluated the results of our experiments from the point of view of the agents, either by looking at their observed interaction success or by evaluating the communicative potential of their lexicons. However, as one of the main topics of the presented study was the establishment of conventions in a community of interlocutors, we should also evaluate the simulation results from the point of view of conventions, i.e. meaning-form mappings. In fact, there is a significant effect of both the community setting and success mattering on the number of agents that share a mapping on average, as depicted in Figure 5. This effect is broken down further in Figure 5, in which one can see the portion of the global lexicon that is shared by any particular number of agents (footnote 7). The effects observed in the latter breakdown can be further described by a relation like $r \propto s / n$, whereby the ratio of shared mappings ($r$) is directly proportional to success mattering ($s$) and inversely proportional to the number of agents ($n$) that are expected to know the mappings.

Footnote 7: The remainder of the mappings is not shared, i.e. known by only one agent.
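The convention-centred measures can be sketched by counting, for every mapping in the global lexicon, how many agents know it. The representation of a lexicon as a set of (meaning, form) pairs and the function name are assumptions for illustration only.

```python
from collections import Counter


def mapping_share(lexicons):
    """Return the average share and the ratio of mappings known by exactly X agents.

    lexicons -- one set of (meaning, form) mappings per agent
    """
    known_by = Counter(mapping for lexicon in lexicons for mapping in lexicon)
    if not known_by:
        return 0.0, {}
    total = len(known_by)  # size of the global lexicon
    average_share = sum(known_by.values()) / total
    per_count = Counter(known_by.values())
    ratios = {x: n / total for x, n in sorted(per_count.items())}
    return average_share, ratios


lexicons = [
    {("RUN", "ba"), ("EAT", "ki")},
    {("RUN", "ba")},
    {("RUN", "ba"), ("EAT", "zo")},
]
average_share, ratios = mapping_share(lexicons)
# average_share is 5/3; ratios is {1: 2/3, 3: 1/3}
```

The entry for X = 1 corresponds to the unshared remainder, i.e. mappings known by only one agent.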

5 Conclusions and Future Work

In summary, experiencing a degree of success provides the all-important foundation required for establishing linguistic conventions in task-oriented dialogue and dispersing these throughout the community. The ramification of this finding is that language is very unlikely to have emerged for the benefit of a success-agnostic activity, such as gossip (cf. [3]), but has presumably evolved as an adaptational necessity at a time when human cooperation became essential.

The shortcomings of the community setting can be attributed to the LEW’s implementation of interactions as two autonomous activities and the lack of success-based adjustment of mapping usage strategies. Future work should aim to improve this aspect by looking into the interactive alignment model (cf. [12]).


References

  • [1] Bachwerk, M., Vogel, C.: Modelling Social Structures and Hierarchies in Language Evolution. In: Bramer, M., Petridis, M., Hopgood, A. (eds.) Research and Development in Intelligent Systems XXVII, pp. 49–62. Springer (2011)
  • [2] Bickerton, D.: Foraging Versus Social Intelligence in the Evolution of Protolanguage. In: Wray, A. (ed.) The Transition to Language, pp. 207–225. Oxford University Press (2002)
  • [3] Dunbar, R.I.M.: Grooming, Gossip and the Evolution of Language. Harvard University Press (1997)
  • [4] Galantucci, B.: An Experimental Study of the Emergence of Human Communication Systems. Cognitive Science 29, 737–767 (2005)
  • [5] Galantucci, B.: Experimental Semiotics: A New Approach for Studying Communication as a Form of Joint Action. Topics in Cognitive Science 1(2), 393–410 (2009)
  • [6] Garrod, S., Anderson, A.: Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition 27, 181–218 (1987)
  • [7] Garrod, S., Doherty, G.: Conversation, co-ordination and convention: an empirical investigation of how groups establish linguistic conventions. Cognition 53, 181–215 (1994)
  • [8] Garrod, S., Fay, N., Lee, J., Oberlander, J., Macleod, T.: Foundations of Representation: Where Might Graphical Symbol Systems Come From? Cognitive Science 31, 961–987 (2007)
  • [9] Hauser, M.D., Chomsky, N., Fitch, W.T.: The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science (New York, N.Y.) 298(5598), 1569–1579 (2002)
  • [10] Healey, P.G.T., Swoboda, N., Umata, I., Katagiri, Y.: Graphical representation in graphical dialogue. International Journal of Human-Computer Studies 57, 375–395 (2002)
  • [11] Jackendoff, R.: Possible stages in the evolution of the language capacity. Trends in Cognitive Sciences pp. 272–279 (1999)
  • [12] Pickering, M.J., Garrod, S.: Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences 27, 169–190 (2004)
  • [13] Scott-Phillips, T.C., Kirby, S., Ritchie, G.R.S.: Signalling signalhood and the emergence of communication. Cognition 113, 226–233 (2009)
  • [14] Tomasello, M.: Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press (2003)
  • [15] Vogel, C.: Group Cohesion, Cooperation and Synchrony in a Social Model of Language Evolution. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds.) Development of Multimodal Interfaces: Active Listening and Synchrony, pp. 16–32. Springer Berlin Heidelberg (2010)
  • [16] Vogel, C., Woods, J.: A Platform for Simulating Language Evolution. In: Bramer, M., Coenen, F., Tuson, A. (eds.) Research and Development in Intelligent Systems XXIII, pp. 360–373. London: Springer (2006)