Finding Person Relations in Image Data of the Internet Archive

06/21/2018
by Eric Müller-Budack, et al.
Multimedia content on the World Wide Web is growing rapidly and contains valuable information for many applications in different domains. For this reason, the Internet Archive initiative has been gathering billions of time-versioned web pages since the mid-nineties. However, this huge amount of data is rarely labeled with appropriate metadata, so automatic approaches are required to enable semantic search. Typically, the textual content of the Internet Archive is used to extract entities and their possible relations across domains such as politics and entertainment, whereas image and video content is usually neglected. In this paper, we introduce a system for person recognition in the image content of web news stored in the Internet Archive. The system thus complements entity recognition in text and allows researchers and analysts to track media coverage and relations of persons more precisely. Based on a deep learning face recognition approach, we propose a system that automatically detects persons of interest and gathers sample material, which is subsequently used to identify them in the image data of the Internet Archive. We evaluate the performance of the face recognition system on an appropriate standard benchmark dataset and demonstrate the feasibility of the approach with two use cases.
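
The identification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that face embeddings have already been computed by some deep learning model, and it swaps in a plain cosine-similarity match against gathered exemplar embeddings. The names `identify` and `co_occurrences`, the threshold of 0.8, and the three-dimensional toy vectors are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(face, exemplars, threshold=0.8):
    """Return the person whose exemplars best match `face`, or None.

    `exemplars` maps a person name to a list of sample embeddings
    (the "sample material"). The threshold is an assumed value.
    """
    best_name, best_score = None, threshold
    for name, samples in exemplars.items():
        score = max((cosine(face, s) for s in samples), default=0.0)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def co_occurrences(images, exemplars, threshold=0.8):
    """Count how often each pair of identified persons appears in the same image.

    `images` is a list of images, each given as a list of face embeddings.
    """
    counts = Counter()
    for faces in images:
        people = {identify(f, exemplars, threshold) for f in faces}
        people.discard(None)  # drop faces that matched nobody
        counts.update(combinations(sorted(people), 2))
    return counts


# Toy data: two hypothetical persons of interest with one exemplar each.
exemplars = {"person_a": [[1.0, 0.0, 0.0]], "person_b": [[0.0, 1.0, 0.0]]}
images = [
    [[0.9, 0.1, 0.0], [0.1, 0.95, 0.0]],  # both persons in one image
    [[1.0, 0.0, 0.0]],                    # only person_a
    [[0.0, 0.0, 1.0]],                    # an unknown face
]
relations = co_occurrences(images, exemplars)
print(relations)
```

Counting co-occurrences per image is one simple way to approximate "relations of persons"; a real system would additionally need face detection, a trained embedding model, and a threshold calibrated on a benchmark.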


research
11/15/2018

From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition

Tracking users' activities on the World Wide Web (WWW) allows researcher...
research
10/20/2020

A Cluster-Matching-Based Method for Video Face Recognition

Face recognition systems are present in many modern solutions and thousa...
research
01/13/2023

Young Labeled Faces in the Wild (YLFW): A Dataset for Children Faces Recognition

Face recognition has achieved outstanding performance in the last decade...
research
11/13/2013

smart application for AMS using Face Recognition

Attendance Management System (AMS) can be made into smarter way by using...
research
03/23/2020

Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

The World Wide Web has become a popular source for gathering information...
research
07/09/2018

Beyond Pixels: Image Provenance Analysis Leveraging Metadata

Creative works, whether paintings or memes, follow unique journeys that ...
research
02/21/2023

Criminal Investigation Tracker with Suspect Prediction using Machine Learning

An automated approach to identifying offenders in Sri Lanka would be bet...
