On modeling vagueness and uncertainty in data-to-text systems through fuzzy sets
Vagueness and uncertainty management is among the challenges that remain unresolved in systems that generate texts from non-linguistic data, known as data-to-text (D2T) systems. In the last decade, work on fuzzy linguistic summarization and description of data has raised interest in using fuzzy sets to model and manage the imprecision of human language in data-to-text systems. However, despite some research in this direction, there has been no clear discussion or justification of how fuzzy sets can contribute to data-to-text for modeling vagueness and uncertainty in words and expressions. This paper intends to bridge this gap by answering the following questions: What does vagueness mean in fuzzy set theory? What does vagueness mean in data-to-text contexts? In what ways can fuzzy set theory contribute to improving data-to-text systems? What challenges do researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? In what cases should the use of fuzzy sets be avoided in D2T? To this end, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual uses of fuzzy sets in data-to-text contexts, and provide some additional insights about the engineering of data-to-text systems that make use of fuzzy set-based techniques.