Human Languages with Greater Information Density Increase Communication Speed, but Decrease Conversation Breadth

12/15/2021
by Pedro Aceves, et al.

Language is the primary medium through which human information is communicated and coordination is achieved. One of the most important language functions is to categorize the world so messages can be communicated through conversation. While we know a great deal about how human languages vary in their encoding of information within semantic domains such as color, sound, number, locomotion, time, space, human activities, gender, body parts and biology, little is known about the global structure of semantic information and its effect on human communication. Using large-scale computation, artificial intelligence techniques, and massive, parallel corpora across 15 subject areas (including religion, economics, medicine, entertainment, politics, and technology) in 999 languages, here we show substantial variation in the information and semantic density of languages and their consequences for human communication and coordination. In contrast to prior work, we demonstrate that higher density languages communicate information much more quickly relative to lower density languages. Then, using over 9,000 real-life conversations across 14 languages and 90,000 Wikipedia articles across 140 languages, we show that because there are more ways to discuss any given topic in denser languages, conversations and articles retrace and cycle over a narrower conceptual terrain. These results demonstrate an important source of variation across the human communicative channel, suggesting that the structure of language shapes the nature and texture of conversation, with important consequences for the behavior of groups, organizations, markets, and societies.


