Transforming Wikipedia into an Ontology-based Information Retrieval Search Engine for Local Experts using a Third-Party Taxonomy

11/04/2015
by Gregory Grefenstette, et al.

Wikipedia is widely used for finding general information about a wide variety of topics. Its vocation is not to provide local information. For example, it provides plot, cast, and production information about a given movie, but not showing times at your local movie theatre. Here we describe how we can connect local information to Wikipedia without altering its content. The case study we present involves finding local scientific experts. Using a third-party taxonomy, independent of Wikipedia's category hierarchy, we index information about our local experts found in their activity reports, and we re-index Wikipedia content using the same taxonomy. The connections between Wikipedia pages and local expert reports are stored in a relational database, accessible through a public SPARQL endpoint. A Wikipedia gadget (or plugin), activated by the interested user, queries the endpoint each time a Wikipedia page is accessed. An additional tab on the Wikipedia page allows the user to open a list of teams of local experts associated with the subject matter of that page. The technique, though presented here as a way to identify local experts, is generic: any third-party taxonomy can be used in this way to connect Wikipedia to any non-Wikipedia data source.
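
The linking idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy taxonomy, the concept ids, the sample texts, the team names, and the helper names `index_text` and `link_pages_to_teams` are hypothetical, and plain substring matching with an in-memory dictionary stands in for whatever indexing method, database, and SPARQL endpoint the authors actually use.

```python
from collections import defaultdict

# Hypothetical third-party taxonomy: concept id -> surface terms (illustrative only).
TAXONOMY = {
    "C1": ["information retrieval", "search engine"],
    "C2": ["machine learning"],
    "C3": ["computer vision", "image recognition"],
}


def index_text(text, taxonomy=TAXONOMY):
    """Return the set of taxonomy concept ids whose terms occur in the text."""
    lowered = text.lower()
    return {
        concept_id
        for concept_id, terms in taxonomy.items()
        if any(term in lowered for term in terms)
    }


def link_pages_to_teams(pages, reports):
    """Map each page title to the teams sharing at least one concept with it."""
    team_concepts = {team: index_text(text) for team, text in reports.items()}
    links = defaultdict(list)
    for title, text in pages.items():
        page_concepts = index_text(text)
        for team, concepts in sorted(team_concepts.items()):
            if page_concepts & concepts:
                links[title].append(team)
    return dict(links)


# Made-up sample data: two encyclopedia-style pages and two activity reports.
pages = {
    "Web search engine": "A search engine is a software system for information retrieval on the Web.",
    "Photosynthesis": "Photosynthesis converts light energy into chemical energy.",
}
reports = {
    "Team A": "Our team studies machine learning for information retrieval.",
    "Team B": "We work on image recognition and computer vision.",
}

links = link_pages_to_teams(pages, reports)
print(links)
```

Because both collections are indexed against the same taxonomy, a page and a report are linked whenever they share a concept, with no change to the page text itself. In the setup the abstract describes, the resulting links would be stored and served through the endpoint that the gadget queries.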


