TestAug: A Framework for Augmenting Capability-based NLP Tests

10/14/2022
by Guanqun Yang, et al.

Recently proposed capability-based NLP testing lets model developers test the functional capabilities of NLP models, revealing functional failures that the traditional held-out mechanism cannot detect. However, existing work on capability-based testing requires extensive manual effort and domain expertise to create the test cases. In this paper, we investigate a low-cost approach to test case generation that leverages the GPT-3 engine. We further propose using a classifier to remove invalid outputs from GPT-3 and expanding the outputs into templates to generate more test cases. Our experiments show that TestAug has three advantages over existing work on behavioral testing: (1) TestAug finds more bugs; (2) the test cases in TestAug are more diverse; and (3) TestAug substantially reduces the manual effort needed to create test suites. The code and data for TestAug can be found at our project website (https://guanqun-yang.github.io/testaug/) and GitHub (https://github.com/guanqun-yang/testaug).
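To make the pipeline in the abstract concrete, the following is a minimal sketch of the last two stages (filtering invalid candidates, then expanding a template into many test cases) plus a failure-rate check. It is not the authors' implementation: the GPT-3 generation step is omitted, and `expand_template`, `filter_valid`, `failure_rate`, `toy_is_valid`, and `toy_model` are hypothetical names, with the last two being trivial stand-ins for a trained validity classifier and the NLP model under test.

```python
import re
from itertools import product
from typing import Callable, Dict, List


def expand_template(template: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Fill every {slot} in the template with all combinations of its fillers."""
    slots = re.findall(r"\{(\w+)\}", template)
    # Keep first-occurrence order and drop repeated slot names.
    unique_slots = list(dict.fromkeys(slots))
    combos = product(*(lexicon[slot] for slot in unique_slots))
    return [template.format(**dict(zip(unique_slots, combo))) for combo in combos]


def filter_valid(cases: List[str], is_valid: Callable[[str], bool]) -> List[str]:
    """Keep only the candidates that the validity check accepts."""
    return [case for case in cases if is_valid(case)]


def failure_rate(cases: List[str], expected: str, model: Callable[[str], str]) -> float:
    """Fraction of test cases on which the model disagrees with the expected label."""
    if not cases:
        return 0.0
    failures = sum(1 for case in cases if model(case) != expected)
    return failures / len(cases)


def toy_is_valid(sentence: str) -> bool:
    """Stand-in (assumption) for a validity classifier: requires a negation word."""
    return "not" in sentence.split()


def toy_model(sentence: str) -> str:
    """Stand-in (assumption) for a sentiment model that ignores negation."""
    positive_words = ("good", "great")
    return "positive" if any(w in sentence for w in positive_words) else "negative"


if __name__ == "__main__":
    # Hypothetical template for a negation capability; expected label is "negative".
    template = "The {noun} was not {adj}."
    lexicon = {"noun": ["food", "service"], "adj": ["good", "great"]}
    cases = filter_valid(expand_template(template, lexicon), toy_is_valid)
    for case in cases:
        print(case)
    print("failure rate:", failure_rate(cases, "negative", toy_model))
```

In this toy setup the template yields four negated sentences, and the keyword-based stand-in model mislabels all of them, which is the kind of functional failure a capability-based test is meant to expose.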

