Using Semantic Similarity for Input Topic Identification in Crawling-based Web Application Testing

08/23/2016
by Jun-Wei Lin, et al.

To automatically test web applications, crawling-based techniques are usually adopted to mine behavior models, explore state spaces, or detect violated invariants of the applications. However, in existing crawlers, the rules for identifying the topics of input text fields, such as login ids, passwords, emails, dates, and phone numbers, have to be configured manually. Moreover, the rules written for one application are very often unsuitable for another. In addition, when several rules conflict and match an input text field to more than one topic, it can be difficult to determine which rule suggests the better match. This paper presents a natural-language approach that automatically identifies the topics of input fields encountered during crawling by measuring their semantic similarity to the input fields in a labeled corpus. In our evaluation with 100 real-world forms, the proposed approach demonstrated performance comparable to that of the rule-based one. Our experiments also show that the accuracy of the rule-based approach can be improved by up to 19% when integrated with our approach.
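The idea of matching an input field against a labeled corpus can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain bag-of-words cosine similarity as a simple stand-in for the semantic comparison, and the corpus entries, topic names, and the functions `tokenize`, `cosine`, and `identify_topic` are all hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical labeled corpus: (words describing an input field, topic).
# A real corpus would be built from labeled forms; these rows are made up.
LABELED_CORPUS = [
    ("user name login id account", "login_id"),
    ("password passwd secret pin", "password"),
    ("email e-mail mail address", "email"),
    ("date birthday day month year", "date"),
    ("phone telephone mobile number", "phone"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_topic(field_text, corpus=LABELED_CORPUS):
    """Return (topic, score) of the most similar labeled entry.

    Returns (None, 0.0) when no labeled entry shares a token with the field.
    """
    query = Counter(tokenize(field_text))
    best_topic, best_score = None, 0.0
    for text, topic in corpus:
        score = cosine(query, Counter(tokenize(text)))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic, best_score


if __name__ == "__main__":
    # Field text would come from labels, ids, or placeholders found while crawling.
    for field in ["Your email address", "Confirm password", "Mobile phone"]:
        print(field, "->", identify_topic(field))
```

Because the best-scoring entry wins, a field that overlaps with several topics is resolved by its score rather than by rule order, which is the kind of conflict the abstract describes for rule-based matching. A purely lexical measure like this one misses synonyms that share no tokens, so it should be read only as a sketch of the nearest-match idea.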

