New/s/leak 2.0 - Multilingual Information Extraction and Visualization for Investigative Journalism

by Gregor Wiedemann, et al.
University of Hamburg

In recent years, investigative journalism has been confronted with two major challenges: 1) vast amounts of unstructured data originating from large text collections such as leaks or answers to Freedom of Information requests, and 2) multilingual data due to intensified global cooperation and communication in politics, business and civil society. Faced with these challenges, journalists are increasingly cooperating in international networks. To support such collaborations, we present new/s/leak 2.0, the new version of our open-source software for content-based searching of leaks. It includes three novel main features: 1) automatic language detection and language-dependent information extraction for 40 languages, 2) entity and keyword visualization for efficient exploration, and 3) decentralized deployment for analysis of confidential data from various formats. We illustrate the new analysis capabilities with an exemplary case study.

