Similarity Assessment through blocking and affordance assignment in Textual CBR

01/04/2013
by R. Rajendra Prasath, et al.

It has been argued that children learn about new objects through their affordances, that is, the actions that can be taken on them. We suggest that web pages also have affordances, defined in terms of the user information needs they meet. The proposed approach assumes that different parts of a text may not be equally important or relevant to a given query. Judging the relevance of a web document therefore requires a close look at its parts rather than treating it as a monolithic whole. We propose a method to extract and assign affordances to texts, and then use these affordances to retrieve the corresponding web pages. The overall approach relies on case-based representations that bridge queries and the affordances of web documents. We tested our method on the tourism domain, and the results are promising.

research
01/29/2020

ScreenTrack: Using a Visual History of a Computer Screen to Retrieve Documents and Web Pages

Computers are used for various purposes, so frequent context switching i...
research
03/10/2022

Evaluating Elements of Web-based Data Enrichment for Pseudo-Relevance Feedback Retrieval

In this work, we analyze a pseudo-relevance retrieval method based on th...
research
11/21/2021

The Impact of Main Content Extraction on Near-Duplicate Detection

Commercial web search engines employ near-duplicate detection to ensure ...
research
10/07/2018

Multi-reference Cosine: A New Approach to Text Similarity Measurement in Large Collections

The importance of an efficient and scalable document similarity detectio...
research
08/07/2012

Color Assessment and Transfer for Web Pages

Colors play a particularly important role in both designing and accessin...
research
11/29/2022

ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information

ClueWeb22, the newest iteration of the ClueWeb line of datasets, provide...
research
04/27/2018

Extracting Parallel Paragraphs from Common Crawl

Most of the current methods for mining parallel texts from the web assum...
