Automated Query Generation for Evidence Collection from Web Search Engines

03/15/2023
by   Nestor Prieto-Chavana, et al.

It is widely accepted that so-called facts can be checked by searching for information on the Internet. This process requires a fact-checker to formulate a search query based on the fact and submit it to a search engine. Relevant and believable passages then need to be identified in the search results before a decision is made. Sub-editors at many news and media organisations carry out this process on a daily basis. Here, we ask whether it is possible to automate the first step, query generation: can we automatically formulate search queries from factual statements that are similar to those formulated by human experts? We consider similarity both in terms of the query text and in terms of the relevant documents returned by a search engine. First, we introduce a moderate-sized evidence collection dataset comprising 390 factual statements together with the associated human-generated search queries and search results. Then, we investigate generating queries using a number of rule-based methods and automatic text generation methods based on pre-trained large language models (LLMs). We show that these methods have different merits and propose a hybrid approach which has superior performance in practice.
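The two notions of similarity described above can be illustrated with a minimal sketch. It is not the authors' method: the stopword list, the keyword-extraction rule, the choice of Jaccard overlap as the similarity measure, and all function names are assumptions made for illustration only.

```python
import re

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {
    "the", "a", "an", "of", "in", "on", "to", "is", "was", "were", "are",
    "that", "and", "by", "for", "with", "has", "have", "it", "at", "as",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rule_based_query(statement):
    """Assumed rule: keep non-stopword tokens in order, dropping repeats."""
    seen = set()
    kept = []
    for tok in tokenize(statement):
        if tok not in STOPWORDS and tok not in seen:
            seen.add(tok)
            kept.append(tok)
    return " ".join(kept)


def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def query_similarity(generated, human):
    """Textual similarity between two queries (token-level Jaccard)."""
    return jaccard(tokenize(generated), tokenize(human))


def result_overlap(urls_generated, urls_human):
    """Similarity of two queries in terms of the documents they return."""
    return jaccard(urls_generated, urls_human)


if __name__ == "__main__":
    statement = "The Eiffel Tower was completed in 1889"
    generated = rule_based_query(statement)
    print(generated)
    print(query_similarity(generated, "Eiffel Tower completion date"))
    print(result_overlap(["a", "b", "c"], ["b", "c", "d"]))
```

In this sketch a generated query is compared with a human-written one in two ways: by the tokens the queries share, and by the overlap between the result lists (here plain URL strings) that each query retrieves. The second comparison would need the results of an actual search engine, which is left out here.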
