Privacy-Preserving Multi-Document Summarization

08/06/2015
by Luís Marujo, et al.

State-of-the-art extractive multi-document summarization systems are usually designed without regard for privacy, so all documents are exposed to third parties. In this paper we propose a privacy-preserving approach to multi-document summarization. Our approach enables other parties to obtain summaries without learning anything else about the content of the original documents. We use a hashing scheme known as Secure Binary Embeddings to convert document representations containing key phrases and bag-of-words features into bit strings, which allows approximate distances to be computed instead of exact ones. Our experiments indicate that our system yields results similar to those of its non-private counterpart on standard multi-document evaluation datasets.
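
The hashing step can be illustrated with a minimal sketch. It assumes the universal-quantization form commonly associated with Secure Binary Embeddings: project the vector with a secret Gaussian matrix, add a secret uniform dither, divide by a step size, and keep only the parity of the quantization cell. The function names, the step size `delta`, the bit count, and the toy vectors are all illustrative assumptions and are not taken from the paper.

```python
import math
import random


def make_sbe_params(dim, n_bits, delta, seed=0):
    """Draw a hypothetical secret key: Gaussian projections A and dither w."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    w = [rng.uniform(0.0, delta) for _ in range(n_bits)]
    return A, w


def sbe_hash(x, A, w, delta):
    """Map a real vector to bits: floor((<a_i, x> + w_i) / delta) mod 2."""
    bits = []
    for row, dither in zip(A, w):
        proj = sum(a * xi for a, xi in zip(row, x))
        bits.append(math.floor((proj + dither) / delta) % 2)
    return bits


def normalized_hamming(b1, b2):
    """Fraction of bit positions in which two hashes differ."""
    return sum(u != v for u, v in zip(b1, b2)) / len(b1)


# Toy document vectors (assumed values, standing in for key-phrase or
# bag-of-words features).
delta = 1.0
A, w = make_sbe_params(dim=5, n_bits=2000, delta=delta, seed=42)
x = [1.0, 0.0, 2.0, -1.0, 0.5]
near = [1.1, 0.0, 2.0, -1.0, 0.5]    # Euclidean distance 0.1 from x
far = [11.0, 0.0, 2.0, -1.0, 0.5]    # Euclidean distance 10 from x

d_near = normalized_hamming(sbe_hash(x, A, w, delta), sbe_hash(near, A, w, delta))
d_far = normalized_hamming(sbe_hash(x, A, w, delta), sbe_hash(far, A, w, delta))
print(f"near: {d_near:.3f}  far: {d_far:.3f}")
```

In this sketch the Hamming distance between hashes grows with the Euclidean distance only while the vectors are close relative to `delta`; for distant vectors it saturates near 0.5, so the bit strings reveal little beyond "far apart". That is the sense in which only approximate distances are available to the party holding the hashes.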

