Mining News Events from Comparable News Corpora: A Multi-Attribute Proximity Network Modeling Approach

11/14/2019
by Hyungsul Kim, et al.

We present ProxiModel, a novel event mining framework for extracting high-quality structured event knowledge from large, redundant, and noisy news data sources. The proposed model differs from other approaches by modeling event correlation both within each individual document and across the corpus. To this end, we introduce the proximity network, a novel space-efficient data structure that supports scalable event mining. The proximity network captures corpus-level co-occurrence statistics for candidate event descriptors and event attributes, as well as their connections. We model the proximity network probabilistically as a generative process with sparsity-inducing regularization, which allows us to extract high-quality, interpretable news events efficiently. Experiments on three different news corpora demonstrate that the proposed method is effective and robust at generating high-quality event descriptors and attributes. We briefly describe several applications of the framework, such as news summarization, event tracking, and multi-dimensional analysis of news. Finally, we present a case study that visualizes the events in a Japan Tsunami news corpus and demonstrates ProxiModel's ability to automatically summarize emerging news events.
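
To make the idea of corpus-level co-occurrence statistics more concrete, the following is a minimal sketch of how a proximity-weighted co-occurrence graph could be accumulated over tokenized documents. It is an illustration only, not the construction used by ProxiModel: the function name `build_proximity_network`, the `window` parameter, the inverse-distance weighting, and the toy documents are all assumptions introduced here, and the abstract does not specify any of them.

```python
from collections import defaultdict


def build_proximity_network(docs, window=3):
    """Accumulate proximity-weighted co-occurrence weights across a corpus.

    Hypothetical helper (not from the paper). `docs` is a list of token
    lists. Two tokens at most `window` positions apart contribute
    1 / distance to the weight of their undirected edge.
    """
    edges = defaultdict(float)
    for tokens in docs:
        for i, a in enumerate(tokens):
            # Look ahead only, so each pair of positions is visited once.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                b = tokens[j]
                if a == b:
                    continue  # skip self-loops
                # Sort the pair so (a, b) and (b, a) share one edge.
                key = (a, b) if a < b else (b, a)
                edges[key] += 1.0 / (j - i)  # assumed weighting scheme
    return dict(edges)


# Toy corpus (made up for illustration).
docs = [
    ["earthquake", "japan", "tsunami"],
    ["tsunami", "japan", "warning"],
]
network = build_proximity_network(docs, window=2)
for (a, b), weight in sorted(network.items()):
    print(a, b, weight)
```

Because the weights are summed over all documents, pairs that repeatedly appear close together across the corpus end up with heavier edges, which is one simple way redundancy in news coverage could be turned into a signal. The generative model and sparsity-inducing regularization mentioned in the abstract are not sketched here.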

research
03/07/2021

RevDet: Robust and Memory Efficient Event Detection and Tracking in Large News Feeds

With the ever-growing volume of online news feeds, event-based organizat...
research
09/17/2021

Event Flow – How Events Shaped the Flow of the News, 1950-1995

This article relies on information-theoretic measures to examine how eve...
research
05/18/2021

The Commodities News Corpus: A Resource for Understanding Commodity News Better

Commodity News contains a wealth of information such as summary of the ...
research
09/18/2017

Towards Building a Knowledge Base of Monetary Transactions from a News Collection

We address the problem of extracting structured representations of econo...
research
12/10/2020

An Event Correlation Filtering Method for Fake News Detection

Nowadays, social network platforms have been the prime source for people...
research
03/13/2017

MetaPAD: Meta Pattern Discovery from Massive Text Corpora

Mining textual patterns in news, tweets, papers, and many other kinds of...
research
02/14/2022

Introducing the ICBe Dataset: Very High Recall and Precision Event Extraction from Narratives about International Crises

How do international crises unfold? We conceive of international affairs...
