Mining User Queries with Information Extraction Methods and Linked Data

09/22/2017
by Anne Chardonnens, et al.

Purpose: Advanced use of Web Analytics tools makes it possible to capture the content of user queries. Although these queries are highly relevant, analysing large volumes of them manually is problematic. This paper demonstrates the potential of using information extraction techniques and Linked Data to gain a better understanding of the nature of user queries in an automated manner. Design/methodology/approach: The paper presents a large-scale case study conducted at the Royal Library of Belgium, based on a data set of 83 854 queries resulting from 29 812 visits over a 12-month period to the historical newspapers platform BelgicaPress. By making use of information extraction methods, knowledge bases and various authority files, this paper explores the possibilities and limits of identifying what percentage of end users are looking for person and place names. Findings: Based on a quantitative assessment, our method can successfully identify the majority of person and place names in user queries. Due to the specific character of user queries and the nature of the knowledge bases used, a limited number of queries remained too ambiguous to be treated in an automated manner. Originality/value: This paper demonstrates empirically both the possibilities and limits of gaining more insight from user queries extracted from a Web Analytics tool and analysed with the help of information extraction tools and knowledge bases. The methods and tools used are generalisable and can be reused by other collection holders.
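To make the general idea concrete, the following is a minimal sketch of how queries might be matched against name lists to estimate the share of person and place searches. It is not the authors' pipeline: the tiny `PERSON_NAMES` and `PLACE_NAMES` sets, the exact-match rule, and all function names are assumptions made for illustration, standing in for the knowledge bases and authority files mentioned in the abstract.

```python
import unicodedata
from collections import Counter

# Hypothetical toy name lists; a real study would draw on authority files
# or a knowledge base rather than hard-coded sets.
PERSON_NAMES = {"victor horta", "jacques brel", "florence"}
PLACE_NAMES = {"bruxelles", "gent", "liege", "florence"}


def normalise(query: str) -> str:
    """Lowercase, strip accents, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", query)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def classify(query: str) -> str:
    """Label a query as 'person', 'place', 'ambiguous', or 'unknown'."""
    key = normalise(query)
    in_persons = key in PERSON_NAMES
    in_places = key in PLACE_NAMES
    if in_persons and in_places:
        # Matches both lists, so it cannot be resolved automatically here.
        return "ambiguous"
    if in_persons:
        return "person"
    if in_places:
        return "place"
    return "unknown"


def category_shares(queries):
    """Return the fraction of queries that falls into each category."""
    counts = Counter(classify(q) for q in queries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    sample = ["Liège", "Victor  HORTA", "Florence", "tour de france"]
    for label, share in category_shares(sample).items():
        print(f"{label}: {share:.0%}")
```

The "ambiguous" label mirrors the abstract's observation that some queries cannot be resolved automatically; in this sketch it simply means a query matched both lists.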

