Automatic Generation of System Test Cases from Use Case Specifications: an NLP-based Approach

07/19/2019
by Chunhui Wang, et al.
uOttawa

Software testing plays a crucial role to ensure the conformance of software systems with their requirements. Exhaustive testing procedures are enforced by functional safety standards which mandate that each requirement be covered by system test cases. Test engineers need to identify all the representative test execution scenarios from requirements, determine the runtime conditions that trigger these scenarios, and finally provide the test input data that satisfy these conditions. Given that requirements specifications are typically large and often provided in natural language (e.g., use case specifications), the generation of system test cases tends to be expensive and error-prone. In this paper, we present Use Case Modelling for System Tests Generation (UMTG), an approach that supports the generation of executable system test cases from requirements specifications in natural language, with the goal of reducing the manual effort required to generate test cases and ensuring requirements coverage. More specifically, UMTG automates the generation of system test cases based on use case specifications and a domain model for the system under test, which are commonly produced in many development environments. Unlike existing approaches, it does not impose strong restrictions on the template of use case specifications. It relies on recent advances in natural language processing to automatically identify test scenarios and to generate formal constraints that capture conditions triggering the execution of the scenarios, thus enabling the generation of test data. In two industrial case studies, UMTG automatically and correctly translated 95% of the constraints required for test data generation; furthermore, it generated test cases that exercise critical scenarios not previously considered by engineers.


1 Introduction

The complexity of embedded software in safety-critical domains, e.g., automotive and avionics, has significantly increased over the years. System test cases in these domains are often manually derived from functional requirements in natural language (NL). One important motivation is to ensure traceability between requirements and system test cases. As a result, the definition of test cases is time-consuming and challenging, especially under time constraints and when there are frequent changes to requirements. In this context, automatic test generation not only reduces the cost of testing but also helps guarantee that test cases properly cover all requirements, a very important objective in safety-critical systems and for the standards they need to comply with [1, 2].

The benefits of automatic test generation are widely acknowledged today and there are many proposed approaches in the literature [3]. In many cases [4], they require that system specifications be captured as UML behavioral models such as activity diagrams [5], statecharts [6], and sequence diagrams [7]. In modern industrial systems, these behavioral models tend to be complex and expensive to produce if they are to be precise and complete enough to support test automation, and are thus often not part of development practice. There are techniques [8, 9, 10] that generate test models from NL requirements, but the generated models need to be manually edited to enable test automation, thus creating scalability issues. In approaches generating test cases directly from NL requirements [11, 12, 13, 14], test cases are not executable and often require significant manual intervention to provide test input data (e.g., they need additional formal specifications [14]). A few approaches can generate executable test cases, including test input data, directly from NL requirements specifications [15, 16], but they require that requirements specifications be written in a controlled natural language (CNL). The input specifications are translated into formal specifications which are later used to automatically generate test input data (e.g., using constraint solving). The CNL supported by these approaches is typically very limited (e.g., it allows the use of only a few verbs in requirements specifications), thus reducing their usability.

Our goal in this paper is to enable automated generation of executable test cases from NL requirements, with no additional behavioral modelling. Our motivation is to rely, to the largest extent possible, on practices that are already in place in many companies developing embedded systems, including our industry partner, i.e., IEE S.A. (in the following “IEE”) [17], with whom we performed multiple case studies reported in this paper. In many environments like IEE, development processes are use case-driven and this strongly influences their requirements engineering and system testing practices. Use case specifications are widely used for communicating requirements among stakeholders and, in particular, facilitating communication with customers, while a domain model clarifies the terminology and concepts shared among all stakeholders and thus avoids misunderstandings.

In this paper, we propose, apply and assess Use Case Modelling for System Tests Generation (UMTG), an approach that generates executable system test cases by exploiting the behavioral information in use case specifications. UMTG requires a domain model (e.g., a class diagram) of the system, which enables the definition of constraints that are used to generate test input data. Use case specifications and domain models are common in requirements engineering practice [18], as is the case in our industry partner's organisation in our case studies. Consistent with the objectives stated above, we avoid behavioral modelling (e.g., activity and sequence diagrams) by applying Natural Language Processing (NLP) to a more structured and analysable form of use case specifications, i.e., Restricted Use Case Modeling (RUCM) [8]. RUCM introduces a template with keywords and restriction rules to reduce ambiguity in requirements and to enable automated analysis of use case specifications. It enables the extraction of behavioral information by reducing imprecision and incompleteness in use case specifications. RUCM has been successfully applied in many domains (e.g., [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]). It was previously evaluated through controlled experiments and shown to be usable and beneficial with respect to making use case specifications less ambiguous and more amenable to precise analysis and design [8]. In short, UMTG strikes a balance among use case specifications that remain legible to all stakeholders, the information needed for automated system test generation, and minimal modelling effort.

UMTG employs NLP to build Use Case Test Models (UCTMs) from RUCM specifications. A UCTM captures the control flow implicitly described in an RUCM specification and enables the model-based identification of use case scenarios (i.e., sequences of use case steps in the model). UMTG includes three model-based coverage strategies for the generation of use case scenarios from UCTMs: branch, def-use, and subtype coverage. A list of textual pre-, post- and guard conditions in each use case specification is extracted during NLP. The extracted conditions enable UMTG to determine the constraints that test inputs need to satisfy to cover a test scenario. To automatically generate test input data, UMTG translates each extracted condition in NL into a constraint in the Object Constraint Language (OCL) [32] that describes the condition in terms of the entities in the domain model. UMTG relies on OCL since it is the natural choice for expressing constraints on UML class diagrams. To generate OCL constraints, it exploits the capabilities of advanced NLP techniques (e.g., Semantic Role Labeling [33]). The generated OCL constraints are then used to automatically generate test input data via constraint solving with Alloy [34]. Test oracles are generated by processing the postconditions.

Engineers are expected to manually inspect the automatically generated OCL constraints, correct them where necessary, and write new constraints when needed. The required manual effort is very limited: in our industrial case studies, UMTG automatically and correctly generated 95% of the OCL constraints, and the accuracy of the constraint generation is very high, with 99% of the generated constraints being correct. Executable test cases are then generated by identifying, through a mapping table, the test driver API functions to be used to provide the generated test input data to the system under test.

This paper extends our previous conference papers concerning the automatic generation of UCTMs [35] and the automatic generation of OCL constraints from specifications in natural language [36], published at the International Symposium on Software Testing and Analysis (ISSTA’15) and at the 11th IEEE Conference on Software Testing, Validation and Verification (ICST’18). An earlier version of our tool was demonstrated [37] at the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE’15). This paper brings together, refines, and extends the ideas from the above papers. Most importantly, we extend the expressiveness of the automatically generated OCL constraints, introduce an Alloy-based constraint solving algorithm that solves the path conditions in OCL, and integrate the def-use and subtype coverage strategies not presented in our previous work. Finally, the paper provides substantial new empirical evidence to support the scalability of our approach, and demonstrates its effectiveness using two industrial case studies (i.e., automotive embedded systems sold in the US and EU markets). Our contributions include:

  • UMTG, an approach for the automatic generation of executable system test cases from use case specifications and a domain model, without resorting to behavioral modelling;

  • an NLP technique generating test models (UCTMs) from use case specifications expressed with RUCM;

  • an NLP technique generating OCL constraints from use case specifications for test input data generation;

  • an algorithm combining UCTMs and constraint solving to automatically generate test input data, based on three different coverage criteria;

  • a publicly available tool integrated as a plug-in for IBM DOORS and Eclipse, which generates executable system test cases from use case specifications;

  • two industrial case studies from which we provide credible empirical evidence demonstrating the applicability, scalability and benefits of our approach.

This paper is structured as follows. Section 2 provides background on the NLP techniques on which the proposed test case generation approach builds. Section 3 introduces the industrial context of our case study to illustrate the practical motivations for our approach. Section 4 discusses the related work in light of our industrial needs. In Section 5, we provide an overview of the approach. Sections 6 to 12 provide the details of the core technical parts of our approach. Section 13 presents our tool support for test case generation. Section 14 reports on the results of the empirical validation conducted with two industrial case studies. We conclude the paper in Section 15.

2 Background

In this section, we present the background regarding the Natural Language Processing (NLP) techniques which we employ in UMTG. NLP refers to a set of procedures that extract structured information from documents written in NL. They are implemented as a pipeline that executes multiple analyses, e.g., tokenization, morphology analysis, and syntax analysis [38].

UMTG relies on five different NLP analyses: tokenization, named entity recognition, part-of-speech tagging, semantic role labeling (SRL), and semantic similarity detection. Tokenization splits a sentence into tokens based on a predefined set of rules (e.g., the identification of whitespaces and punctuation). Named entity recognition identifies and classifies named entities in a text into predefined categories (e.g., the names of cities). Part-of-speech (POS) tagging assigns parts of speech to each word in a text (e.g., noun, verb, pronoun, and adjective). SRL automatically determines the roles played by the phrases in a sentence [38], e.g., the actor performing an activity; here, the term phrase indicates a word or a group of consecutive words. Semantic similarity detection determines the similarity between two given phrases.

Tokenization, named entity recognition, and POS tagging are well known in the software engineering community since they have been adopted by several approaches integrating NLP [39, 40, 41, 42, 43, 44]. However, none of the existing software testing approaches relies on SRL or combines SRL with semantic similarity detection.

Section 2.1 provides a brief description of SRL, while we present the basics of semantic similarity detection in Section 2.2.

2.1 Semantic Role Labeling

SRL techniques are capable of automatically determining the roles played by words in a sentence. For the sentences The system starts and The system starts the database, SRL can determine that the actors affected by the actions are the system and the database, respectively. The component that is started coincides with the subject in the first sentence and with the object in the second sentence although the verb to start is used with active voice in both. This information cannot be captured by other NLP techniques like POS tagging or dependency parsing.

There are a few SRL tools [45, 46, 47], which differ in the models they adopt to capture roles. Semafor [45, 48] and Shalmaneser [46] are based on the FrameNet model, while the CogComp NLP pipeline (hereafter CNP [47]) uses the PropBank [49] and NomBank [50, 51] models. To the best of our knowledge, CNP is the only tool under active development, and it is thus used in UMTG.

The tools using PropBank tag the words in a sentence with keywords (e.g., A0, A1, A2, AN) to indicate their roles. A0 indicates who performs an action, while A1 indicates the actor most directly affected by the action. For instance, the term The system is tagged with A1 in the sentence The system starts, while the term the database is tagged with A1 in the sentence The system starts the database. The other roles are verb-specific despite some commonalities, e.g., A2 which is often used for the end state of an action.
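To illustrate the kind of output PropBank-based SRL produces for these two sentences, the following Python sketch encodes the role assignments as plain data structures and extracts the A1 phrase. The dictionaries and the helper function are ours, written purely for illustration; they do not reflect the actual output format of the CNP tool.

# Hand-written, illustrative PropBank-style role assignments for the two
# example sentences; not the output format of the CogComp NLP pipeline.
srl_examples = [
    {"sentence": "The system starts",
     "verb": "starts",
     "roles": {"A1": "The system"}},
    {"sentence": "The system starts the database",
     "verb": "starts",
     "roles": {"A0": "The system", "A1": "the database"}},
]

def affected_actor(annotation):
    # A1 is the actor most directly affected by the action (see Table I).
    return annotation["roles"].get("A1")

for ann in srl_examples:
    print(ann["sentence"], "->", affected_actor(ann))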

PropBank includes additional roles which are not verb-specific (see Table I). They are labeled with general keywords and match adjunct information in different sentences, e.g., AM-NEG indicating negative verbs. NomBank, instead, captures the roles of nouns, adverbs, and adjectives in noun phrases. It uses the same keywords adopted by PropBank. For instance, using PropBank, we identify that the noun phrase the watchdog counter plays the role A1 in the sentence The system resets the watchdog counter. Using NomBank, we obtain complementary information indicating that the term counter is the main noun (tagged with A0) and the term watchdog is an attributive noun (tagged with A1).

PropBank does not help identify two different sentences describing similar concepts. In the sentences The system stopped the database, The system halted the database and The system terminated the database, an SRL tool using PropBank tags ‘the database’ with A1, indicating the database is the actor affected by the action. However, A1 does not indicate that the three sentences have similar meanings (i.e., the verbs are synonyms). To identify similar sentences, UMTG employs semantic similarity detection techniques.

Verb-specific semantic roles
Identifier Definition
 A0 Usually indicates who performs an action.
 A1 Usually indicates the actor most directly affected by the action.
 A2 With motion verbs, indicates a final state or a location.
Generic semantic roles
Identifier Definition
 AM-ADV Adverbial modification.
 AM-LOC Indicates a location.
 AM-MNR Captures the manner in which an activity is performed.
 AM-MOD Indicates a modal verb.
 AM-NEG Indicates a negation, e.g., 'no'.
 AM-TMP Provides temporal information.
 AM-PRD Secondary predicate with additional information about A1.
TABLE I: PropBank Additional Semantic Roles used in the paper.

2.2 Semantic Similarity Detection

For semantic similarity detection, we use the VerbNet lexicon [52], which clusters verbs that share common semantics and a common set of semantic roles into a total of 326 verb classes [53]. Each verb class is provided with a set of role patterns. For example, A1,V and A0,V,A1 are two role patterns for the VerbNet class stop-55.4, which includes, among others, the verbs to stop, to halt and to terminate. In A1,V, the sentence contains only the verb (V) and the actor whose state is altered (A1). In A0,V,A1, the sentence contains the actor performing the action (A0), the verb (V), and the actor affected by the action (A1). Examples of these two patterns are the database stops and the system stops the database, respectively. UMTG uses VerbNet version 3.2 [53], which includes 272 verb classes and 214 subclasses, where a class may have more than one subclass.

VerbNet adopts a role model different from PropBank's; there is a mapping between the two [54]. For simplicity, we use only PropBank role labels in this paper. All the verbs in a VerbNet class are guaranteed to have a common set of role patterns, but they are not guaranteed to be synonyms (e.g., the verbs repeat and halt in the VerbNet class stop-55.4). We therefore employ WordNet [55], a database of lexical relations, to cluster verbs with similar meanings.
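As a rough illustration of this last step, the following Python sketch checks whether two verbs can be clustered together because they share a WordNet synset. It assumes NLTK with the WordNet corpus installed and is only a simplified stand-in for the similarity detection actually used in UMTG.

# A minimal sketch of WordNet-based verb clustering, assuming NLTK and its
# WordNet corpus are installed (nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

def share_synset(verb_a, verb_b):
    # Two verbs are considered similar if they appear together in at least
    # one verb synset.
    synsets_a = set(wn.synsets(verb_a, pos=wn.VERB))
    synsets_b = set(wn.synsets(verb_b, pos=wn.VERB))
    return bool(synsets_a & synsets_b)

print(share_synset("stop", "halt"))    # expected True: synonyms
print(share_synset("stop", "repeat"))  # expected False: same VerbNet class, different meaning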

3 Motivation and Context

The context for which we developed UMTG is that of safety-critical embedded software in the automotive domain. The automotive domain is a representative example of the many domains for which compliance with requirements should be demonstrated through documented test cases. For instance, ISO-26262 [2], an automotive safety standard, states that all system requirements should be properly tested by corresponding system test cases.

In this paper, we use the system BodySense as one of the case studies and also to motivate and illustrate UMTG. BodySense is safety-critical automotive software developed by IEE [17], a leading supplier of embedded software and hardware systems in the automotive domain. BodySense classifies vehicle occupants for smart airbag deployment, providing automatic airbag deactivation for child seats. Using a capacitive sensor in the vehicle's passenger seat, it monitors whether the seat is occupied and classifies the occupant. If the passenger seat has a child in a child seat or is unoccupied, the system disables the airbag. For seats occupied by adult passengers, it ensures that the airbag is deployed in the event of an accident. BodySense also provides occupant detection for the seat belt reminder function.

Table II gives a simplified version of a real test case for BodySense. Lines 1, 3, 5, 7, and 9 provide high-level operation descriptions, i.e., informal descriptions of the operations to be performed on the system. These lines are followed by the names of the functions that should be executed by the test driver, along with the corresponding input and expected output values. For instance, Line 4 invokes the function SetBus with a value indicating that the test driver should simulate the presence of an adult on the seat (for simplicity, assume that, when an adult is seated, the capacitance sensor in the seat sends a value above 600 on the bus).

Line  Operation                                                  Inputs/Expectations
1     Reset power and wait
2     ResetPower                                                 Time=INIT_TIME
3     Set occupant status - Adult
4     SetBus                                                     Channel=RELAY, Capacitance=601
5     Simulate a nominal temperature
6     SetBus                                                     Channel=RELAY, Temperature=20
7     Check that an Adult has been detected on the seat, i.e., SeatBeltReminder status is Occupied and AirBagControl status is Occupied.
8     ReadAndCheckBus                                            D0=OCCUPIED, D1=OCCUPIED
9     Check that the AirBagControl has received new data.
10    CheckAirbagPin                                             0x010
TABLE II: An example test case for BodySense.

Exhaustive test cases needed to validate safety-critical embedded software are difficult both to derive and to maintain because requirements are often updated during the software lifecycle (e.g., when BodySense needs to be customized for new car models). For instance, the functional test suite for BodySense consists of 192 test cases, which include a total of 4707 calls to test driver functions and around 21000 variable assignments. The effort required to specify test cases for BodySense is overwhelming. Without automated test case generation, such testing activity is not only expensive but also error-prone.

Within the context of testing safety-critical embedded software such as BodySense, we identify three challenges that need to be considered for the automatic generation of system test cases from functional requirements:

Challenge 1: Feasible Modelling. Most of the existing automatic system test generation approaches are model-based and rely upon behavioral models such as state, sequence or activity diagrams (e.g., [5, 56, 57, 58, 59]). In complex industrial systems, behavioral models that are precise enough to enable test automation are so complex that their specification cost is prohibitive and the task is often perceived as overwhelming by engineers. To evaluate the applicability of behavioral modelling on BodySense, we asked the IEE engineers to specify system sequence diagrams (SSDs) for some of the use cases of BodySense. For example, the SSD for the use case Identify initial occupancy status of a seat included 74 messages, 19 nested blocks, and 24 references to other SSDs that had to be derived. This was considered too complex by the engineers and required significant help from the authors of this paper, as well as many iterations and meetings. Our conclusion is that the adoption of behavioral modelling, at the level of detail required for automated testing, is not a practical option for system test automation unless detailed behavioral models are already used by engineers for other purposes, e.g., software design.

Challenge 2: Automated Generation of Test Data. Without behavioral modelling, test generation can be driven only by existing requirements specifications in NL, which complicates the identification of the test data (e.g., the input values to send to the system under test). Because of this, most of the existing approaches focus on the identification of test scenarios (i.e., the sequence of activities to perform during testing), and ask engineers to manually produce the test data. Given the complexity of the test cases to be generated (recall that the BodySense test suite includes 21000 variable assignments), it is extremely important to automatically generate test data, and not just test scenarios.

Challenge 3: Deployment and Execution of the Test Suite. The execution of test cases for a system like BodySense entails the deployment of the software under test on the target environment. To speed up testing, test case execution is typically automated through test scripts invoking test driver functions. These functions simulate sensor values and read computed results from a communication bus. Any test generation approach should generate appropriate function calls and test data in a format processable by the test driver. For instance, the test drivers in BodySense need to invoke driver functions (e.g., SetBus) to simulate seat occupancy.

In the rest of this paper, we focus on how to best address these challenges in a practical manner, in the context of use case-driven development of embedded systems.

4 Related Work

In this section, we cover the related work across three categories in terms of the challenges we presented in Section 3.

Feasible Modelling. Most of the system test case generation approaches require that system requirements be given in UML behavioral models such as activity diagrams (e.g., [5, 60, 61, 62]), statecharts (e.g., [56, 63, 6, 64]), and sequence diagrams (e.g., [57, 65, 7, 66]). For instance, Nebut et al. [7] propose a use case-driven test generation approach based on system sequence diagrams. Gutierrez et al. [67] introduce a systematic process based on the model-driven engineering paradigm to automate the generation of system test cases from functional requirements given in activity diagrams. Briand and Labiche [57] use both activity and sequence diagrams to generate system test cases. While sequential dependencies between use cases are extracted from an activity diagram, sequences in a use case are derived from a system sequence diagram. In contrast, UMTG needs only use case specifications complemented by a domain model and OCL constraints. In addition, UMTG is able to automatically generate most of the OCL constraints from use case specifications.

There are techniques generating behavioral models from NL requirements [68, 9, 8, 10, 69, 70]. Some approaches employ similar techniques in the context of test case generation. For instance, Frohlich and Link [71] generate test cases from UML statecharts that are automatically derived from use cases. De Santiago et al. [72] provide a similar approach to generate test cases from statecharts derived from NL scenario specifications. Riebisch et al. [73] describe a test case generation approach based on the semi-automated generation of state diagrams from use cases. Katara and Kervinen [74] propose an approach which generates test cases from labeled transition systems that are derived from use case specifications. Sarmiento et al. [12, 13] propose another approach to generate test scenarios from a restricted form of NL requirements. The approach automatically translates restricted NL requirements into executable Petri-Net models; the generated Petri-Nets are used as input for test scenario generation. Soeken et al. [75] employ a statistical parser [76] and a lexical database [55] to generate sequence diagrams from NL scenarios, which are later used to semi-automatically generate test cases. Hartmann et al. [77] provide a test-generation tool that creates a set of test cases from UML models that are manually annotated and semi-automatically extracted from use case specifications. All the approaches mentioned above have two major drawbacks in terms of feasible modelling: (i) the generated test sequences have to be edited, corrected, and/or refined, and (ii) test data have to be manually provided in the generated test models. In contrast, UMTG not only generates sequences of function calls that do not need to be modified but also generates test data for function calls.

Kesserwan et al. [78] provide a model-driven testing methodology that supports test automation based on system requirements in NL. Using the methodology, the engineer first specifies system requirements according to Cockburn use case notation [79] and then manually refines them into Use Case Map (UCM) scenario models [80]. In addition, test input data need to be manually extracted from system requirements and modelled in a data model. UMTG requires that system requirements be specified in RUCM without any further refinement. Text2Test [81, 40] extracts control flow implicitly described in use case specifications, which can be used to automatically generate system test cases. The adaptation of such an approach in the context of test case generation has not been investigated.

Automated Generation of Test Data. The ability to generate test data, and not just abstract test scenarios, is an integral part of automated test case generation [82]. However, many existing NL-based test case generation approaches require manual intervention to derive test data for executable test cases (e.g., [11, 78, 12, 13]), while some other approaches focus only on generating test data (e.g., [83, 84, 85, 86, 87, 88]). For instance, Zhang et al. [11] generate test cases from RUCM use cases. The generated test cases cannot be executed automatically because they do not include test data. Sarmiento et al. [12] generate test scenarios without test data from a restricted form of NL requirements specifications.

Similar to UMTG, Kaplan et al. [89] propose another approach, i.e., Archetest, which generates test sequences and test inputs from a domain model and use case specifications together with invariants, guard conditions and postconditions. Yue et al. [20] propose a test case generation tool (aToucan4Test), which takes RUCM use case specifications annotated with OCL constraints as input and automatically generates executable test cases. These two test generation approaches require that conditions and constraints be provided by engineers to automatically generate test data. In contrast, UMTG can automatically generate, from use case specifications, most of the OCL constraints that are needed for the automated generation of test data.

In some contexts, test data might be simple and consist of sequences of system events without any associated additional parameter value. This is the case of interaction test cases for smartphone systems, which can be automatically generated by the approach proposed by De Figueiredo et al. [15]. The approach processes use case specifications in a custom use case format to derive sequences of system operations and events. UMTG complements this approach with the generation of parameter values, which is instead needed to perform functional testing at the system level.

Carvalho et al. [90] generate executable test cases for reactive systems from requirements written according to a restricted grammar and dictionary. The proposed approach effectively generates test data but has two main limitations: (i) the underlying dictionary may change from project to project (e.g., the current version supports only seven verbs of the English language), and (ii) the restricted grammar may not be suitable to express some system requirements (e.g., the approach does not tackle the problem of processing transitive and intransitive forms of the same verb). In contrast, UMTG does not impose any restricted dictionary or grammar but simply relies on a use case format, RUCM, which can be used to express use cases for different kinds of systems. RUCM does not restrict the use of verbs or nouns in use case steps and thus does not limit the expressiveness of use case specifications. Furthermore, the RUCM keywords are used to specify input and output steps but do not constrain internal steps or condition sentences (see Section 6). Finally, by relying on SRL and VerbNet, UMTG provides guarantees on the correct generation of OCL constraints (see Section 9), without restricting the writing of sentences (e.g., it supports the use of both transitive and intransitive forms).

Other approaches focus on the generation of class invariants and method pre/postconditions from NL requirements, which, in principle, could be used for test data generation (e.g., [91, 41, 42]). Pandita et al. [91] focus only on API descriptions written according to a CNL. NL2OCL [41] and NL2Alloy [42], instead, process a UML class diagram and NL requirements to derive class invariants and method pre/postconditions. These two approaches rely on an ad-hoc semantic analysis algorithm that uses information in the UML class diagram (e.g., class and attribute names) to identify the roles of words in sentences. They rely on the presence of specific keywords to determine passive voices and to identify the operators to be used in the generated invariants and conditions. Their constraint generation is rule-based, but they do not provide a solution to ease the processing of a large number of verbs with a reasonable number of rules. Thanks to the use of WordNet synsets and VerbNet classes (see Section 9), UMTG can process a large set of verbs with a few rules to generate OCL constraints.

Though NL2OCL [41] and NL2Alloy [42] are no longer available for comparison, they seem more useful for deriving class invariants including simple comparison operators (i.e., the focus of the evaluation in [41]), rather than for generating pre/postconditions of the actions performed by the system (i.e., the focus of UMTG). Pre/postconditions are necessary for deriving test data in our context.

Deployment and Execution of the Test Suite. The generation of executable test cases impacts the usability of test generation techniques. In code-based approaches (e.g., [92, 93]), the generation of executable test cases is facilitated by the fact that it is based on processing the interfaces used during test execution (e.g., the test driver API).

In model-based testing, the artefacts used to drive test generation are software abstractions (e.g., UML models). In this context, the generation of executable test cases is usually based on adaptation and transformation approaches [94]. The adaptation approaches require the implementation of a software layer that, at runtime, matches high-level operations to software interfaces. They support the execution of complex system interactions (e.g., they enable feedback-driven, model-based test input generation [95]). The transformation approaches, instead, translate an abstract test case into an executable test case by using a mapping table containing regular expressions for the translation process. They require only abstract test cases and a mapping table, while the adaptation approaches need communication channels between the software under test and the adaptation layer, which might not be possible for many embedded systems. Therefore, UMTG uses a mapping table that matches abstract test inputs to test driver function calls.

Model Transformation by Example (MTBE) approaches aim to learn transformation programs from source and target model pairs supplied as examples (e.g., [96, 97, 98]). These approaches search for a model transformation in a space whose boundaries are defined by a model transformation language and the source and target metamodels [99]. Given the metamodels of abstract and executable test cases, MTBE can be applied to automatically generate part of the mapping table as a transformation program. However, this solution can be considered only when there are already some example abstract and executable test cases, which is not the case in our context and we leave it for future work.

5 Overview of the Approach

The process in Fig. 1 presents an overview of our approach. In UMTG, behavioral information and high-level operation descriptions are extracted from use case specifications (Challenge 1). UMTG generates OCL constraints from the use case specifications, while test inputs are generated from the OCL constraints through constraint solving (Challenge 2). Test driver functions corresponding to the high-level operation descriptions and oracles implementing the postconditions in the use case specifications are generated through the mapping tables provided by the engineer (Challenge 3).

Fig. 1: Overview of the UMTG approach.

The engineer elicits requirements with RUCM (Step 1). The domain model is manually created as a UML class diagram (Step 2). UMTG automatically checks if the domain model includes all the entities mentioned in the use cases (Step 3). NLP is used to extract domain entities from the use cases. Missing entities are shown to the engineer who refines the domain model (Step 4). Steps 3 and 4 are iterative: the domain model is refined until it is complete.

Once the domain model is complete, most of the OCL constraints are automatically generated from the extracted conditions (Step 5). The engineer manually writes the few OCL constraints that cannot be automatically generated (Step 6). UMTG further processes the use cases with the OCL constraints to generate a use case test model for each use case specification (Step 7). A use case test model is a directed graph that explicitly captures the implicit behavioral information in a use case specification.

UMTG employs constraint solving for OCL constraints to generate test inputs associated with use case scenarios (Step 8). We use the term use case scenario for a sequence of use case steps that starts with a use case precondition and ends with a postcondition of either a basic or alternative flow. Test inputs cover all the paths in the testing model, and therefore all possible use case scenarios.

The engineer provides a mapping table that maps high-level operation descriptions and test inputs to the concrete driver functions and inputs that should be executed by the test cases (Step 9). Executable test cases are automatically generated through the mapping table (Step 10). If the test infrastructure and hardware drivers change in the course of the system lifespan, then only this table needs to change.
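The paper does not show the concrete format of the mapping table; the Python sketch below merely illustrates the idea behind Steps 9 and 10 with an assumed dictionary-based table, reusing the driver function names SetBus and ReadAndCheckBus from the example test case in Table II.

# Illustrative sketch (ours) of Steps 9-10: a mapping table that translates
# high-level operation descriptions and abstract test inputs into concrete
# driver calls. The table format and helper are assumptions; only the driver
# function names come from Table II.
MAPPING_TABLE = {
    "set occupant status - adult": "SetBus(Channel=RELAY, Capacitance=601)",
    "simulate a nominal temperature": "SetBus(Channel=RELAY, Temperature=20)",
    "check adult detected": "ReadAndCheckBus(D0=OCCUPIED, D1=OCCUPIED)",
}

def to_driver_calls(abstract_steps):
    # Each abstract step of a use case scenario is replaced by the
    # corresponding executable driver call.
    return [MAPPING_TABLE[step] for step in abstract_steps]

scenario = ["set occupant status - adult",
            "simulate a nominal temperature",
            "check adult detected"]
for call in to_driver_calls(scenario):
    print(call)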

The rest of the paper explains the details of each step in Fig. 1, with a focus on how we achieved our automation objectives.

6 Elicitation of Requirements

Our approach starts with the elicitation of requirements in RUCM (Step 1 in Fig. 1). RUCM has a template with keywords and restriction rules to reduce ambiguity in use case specifications [8]. Since it was not originally designed for test generation, we introduce some extensions to RUCM.

  1 1. Use Case Identify Occupancy Status
  2 1.1 Precondition
  3 The system has been initialized.
  4 1.2 Basic Flow
  5 1. The SeatSensor SENDS capacitance TO the system.
  6 2. INCLUDE USE CASE Self Diagnosis.
  7 3. The system VALIDATES THAT no error is detected and no error is qualified.
  8 4. INCLUDE USE CASE Classify Occupancy Status.
  9 5. The system SENDS the occupant class for airbag control TO AirbagControlUnit.
  10 6. The system SENDS the occupant class for seat belt reminder TO SeatBeltControlUnit.
  11 Postcondition: The occupant class for airbag control has been sent to AirbagControlUnit. The occupant class for seat belt reminder has been sent to SeatBeltControlUnit.
  12 1.3 Bounded Alternative Flow
  13 RFS 2-4
  14 1. IF voltage error is detected THEN
  15 2. The system resets the occupant class for airbag control to error.
  16 3. The system resets the occupant class for seat belt reminder to error.
  17 4. ABORT
  18 5. ENDIF
  19 Postcondition: The occupant classes have been reset to error.
  20 1.4 Specific Alternative Flow
  21 RFS 3
  22 1. IF some error has been qualified THEN
  23 2. The system SENDS the error occupant class TO AirbagControlUnit.
  24 3. The system SENDS the error occupant class TO SeatBeltControlUnit.
  25 4. ABORT
  26 5. ENDIF
  27 Postcondition: The error occupant class has been sent to AirbagControlUnit. The error occupant class has been sent to SeatBeltControlUnit.
  28 1.5 Specific Alternative Flow
  29 RFS 3
  30 1. The system SENDS the previous occupant class for airbag control TO AirbagControlUnit.
  31 2. The system SENDS the previous occupant class for seat belt reminder TO SeatBeltControlUnit.
  32 3. ABORT
  33 Postcondition: The previous occupant class for airbag control has been sent to AirbagControlUnit. The previous occupant class for seat belt reminder has been sent to SeatBeltControlUnit.
  34 2. Use Case Self Diagnosis
  35 2.1 Precondition
  36 The system has been initialized.
  37 2.2 Basic Flow
  38 1. The system sets temperature errors to not detected.
  39 2. The system REQUESTS the temperature FROM the SeatSensor.
  40 3. The system VALIDATES THAT the temperature is above -10 degrees.
  41 4. The system VALIDATES THAT the temperature is below 50 degrees.
  42 5. The system sets self diagnosis as completed.
  43 Postcondition: Error conditions have been examined.
  44 2.3 Specific Alternative Flow
  45 RFS 3
  46 1. The System sets TemperatureLowError to detected.
  47 2. RESUME STEP 5
  48 Postcondition: The system has detected a TemperatureLowError.
  49 2.4 Specific Alternative Flow
  50 RFS 4
  51 1. The System sets TemperatureHighError to detected.
  52 2. RESUME STEP 5
  53 Postcondition: The system has detected a TemperatureHighError.
  54 3. Use Case Classify Occupancy Status
  55 3.1 Precondition
  56 The system has been initialized.
  57 3.2 Basic Flow
  58 1. The system sets the occupant class for airbag control to Init.
  59 2. The system sets the occupant class for seatbelt reminder to Init.
  60 4. The system VALIDATES THAT the capacitance is above 600.
  61 5. The system sets the occupant class for airbag control to Occupied.
  62 6. The system sets the occupant class for seatbelt reminder to Occupied.
  63 Postcondition: An adult has been detected on the seat.
  64 3.3 Specific Alternative Flow
  65 RFS 4
  66 1. IF capacitance is above 200 THEN
  67 2. The system sets the occupant class for airbag control to Empty.
  68 3. The system sets the occupant class for seatbelt reminder to Occupied.
  69 4. EXIT
  70 5. ENDIF
  71 Postcondition: A child has been detected on the seat.
  72 3.4 Specific Alternative Flow
  73 RFS 4
  74 1. The system sets the occupant class for airbag control to Empty.
  75 2. The system sets the occupant class for seatbelt reminder to Empty.
  76 3. EXIT
  77 Postcondition: The seat has been recognized as being empty.
TABLE III: Part of BodySense Use Case Specifications

Table III provides a simplified version of three BodySense use case specifications in RUCM (i.e., Identify Occupancy Status, Self Diagnosis, and Classify Occupancy Status). We omit some basic information such as actors and dependencies.

The use cases contain basic and alternative flows. A basic flow describes a main successful scenario that satisfies stakeholder interests. It contains a sequence of steps and a postcondition (Lines III-III). A step can describe one of the following activities: an actor sends data to the system (Lines III and III); the system validates some data (Line III); the system replies to an actor with a result (Line III); the system alters its internal state (Line III). The inclusion of another use case is specified as a step using the keyword INCLUDE USE CASE (Line III). All keywords are written in capital letters for readability.

The keyword VALIDATES THAT indicates a condition (Line III) that must be true to take the next step (Line III), otherwise an alternative flow is taken (Line III).

Alternative flows describe other scenarios than the main one, both success and failure. An alternative flow always depends on a condition. In RUCM, there are three types of alternative flows: specific, bounded and global. For specific and bounded alternative flows, the keyword RFS is used to refer to one or more reference flow steps (Lines III and III). A specific alternative flow refers to a step in its reference flow (Line III). A bounded alternative flow refers to more than one step in the reference flow (Line III) while a global alternative flow refers to any step in the reference flow.

Bounded and global alternative flows begin with the keyword IF .. THEN for the condition under which the alternative flow is taken (Line III). Specific alternative flows do not necessarily begin with IF .. THEN since a condition may already be indicated in its reference flow step (Line III). The alternative flows are evaluated in the order they appear in the specification.

We introduce extensions into RUCM regarding the IF conditions, the keyword EXIT, and the way input/output messages are expressed. UMTG prevents the use of multiple branches within the same use case path [18], thus enforcing the adoption of IF conditions only as a means to specify guard conditions for alternative flows. UMTG introduces the new keywords SENDS … TO and REQUESTS … FROM for the system-actor interactions. Depending on the subject of the sentence, the former indicates either that an actor provides an input to the system (Line III) or that the system provides an output to an actor (Line III). The latter is used only for inputs, and indicates that the input provided by the actor has been requested by the system (Line III). UMTG introduces the keyword EXIT to indicate use case termination under alternative valid execution conditions (Line III describing the case of a child being detected on a seat). The keyword EXIT complements the keyword ABORT, which is used to indicate the abnormal use case termination (Line III).

7 NLP Pipeline for UMTG

We implemented an NLP application to extract the information required for three UMTG steps in Fig. 1: evaluating the domain model completeness (Step 3), generating OCL constraints (Step 5), and generating the use case test model (Step 7).

Fig. 2: NLP pipeline applied to extract the behaviour of a use case.

The NLP application is based on the GATE workbench [100], an open source NLP framework, and implements the NLP pipeline in Fig. 2. The pipeline includes both default NLP components (grey) and components built to process use case specifications in RUCM (white). The Tokenizer splits the use cases into tokens. The Gazetteer identifies the RUCM keywords. The POS Tagger tags tokens according to their nature (e.g., verb, noun, and pronoun). The pipeline ends with a set of transducers that tag blocks of words with additional information required by the three UMTG steps. The transducers integrated in UMTG (1) identify the kinds of RUCM steps (i.e., output, input, include, condition and internal steps), (2) distinguish alternative flows, and (3) detect RUCM references (i.e., the RFS keyword), conditions, and domain entities in the use case steps.

Fig. 3: Part of the transducer that identifies conditions.
Fig. 4: Tags associated with the use case step in Line III of Table III.

Fig. 3 gives an example transducer for condition steps. The arrow labels in upper case represent the transducer's inputs, i.e., tags previously identified by the POS tagger, the gazetteer or other transducers. The italic labels show the tags assigned by the transducer to the words representing the transducer's input. Fig. 4 gives the tags associated with the use case step in Line III of Table III after the execution of the transducer in Fig. 3. In Fig. 4, multiple tags are assigned to the same blocks of words. For example, the noun phrase ‘the capacitance’ is tagged both as a domain entity and as part of a condition.
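The transducers themselves are GATE components; as a simplified illustration of what the condition transducer in Fig. 3 does, the following Python sketch recognizes the VALIDATES THAT keyword and tags the remainder of the step as a condition. The regular expression is ours and is not the authors' transducer grammar.

# A simplified, illustrative stand-in (ours) for the condition transducer:
# it detects the RUCM keyword 'VALIDATES THAT' and tags the rest of the
# step as a condition. The real implementation uses GATE transducers.
import re

CONDITION_STEP = re.compile(r"The system VALIDATES THAT (?P<condition>.+)", re.IGNORECASE)

def tag_step(step):
    match = CONDITION_STEP.search(step)
    if match:
        return {"kind": "condition", "condition": match.group("condition")}
    return {"kind": "other"}

print(tag_step("The system VALIDATES THAT the capacitance is above 600."))
# {'kind': 'condition', 'condition': 'the capacitance is above 600.'}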

8 Evaluation of the Domain Model Completeness

The completeness of the domain model is important to generate correct and complete test inputs. UMTG automatically identifies missing domain entities to evaluate the model completeness (Step 3 in Fig. 1). This is done by checking correspondences between the domain model elements and the domain entities identified by the NLP application.

Fig. 5: Part of the domain model for BodySense.

Domain entities in a use case may not be modelled as classes but as attributes. Fig. 5 shows a simplified excerpt of the domain model for BodySense where the domain entities ‘occupant class for airbag control’ and ‘occupant class for seat belt reminder’ are modelled as attributes of the class OccupancyStatus. UMTG follows a simple yet effective solution to check entity and attribute names. For each domain entity identified through NLP, UMTG generates an entity name by removing all white spaces and capitalizing the first letter of each word. For instance, the domain entity ‘occupant class for airbag control’ becomes ‘OccupantClassForAirbagControl’. UMTG then checks the string similarity between the generated entity names and the names of the domain model elements. Engineers are asked to correct their domain model and use case specifications in the presence of confirmed mismatches.
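A minimal Python sketch of this check is given below; the name normalization follows the description above, while the use of difflib's similarity ratio and the 0.9 threshold are our assumptions, since the paper does not name the string similarity metric it uses.

# Sketch (ours) of the completeness check: normalize an extracted entity
# phrase into a candidate model element name and compare it against the
# domain model names. The metric (difflib ratio) and threshold are assumed.
from difflib import SequenceMatcher

def to_model_name(entity_phrase):
    # 'occupant class for airbag control' -> 'OccupantClassForAirbagControl'
    return "".join(word.capitalize() for word in entity_phrase.split())

def is_modelled(entity_phrase, model_names, threshold=0.9):
    candidate = to_model_name(entity_phrase)
    return any(SequenceMatcher(None, candidate.lower(), name.lower()).ratio() >= threshold
               for name in model_names)

model_names = ["OccupancyStatus", "occupantClassForAirbagControl", "BodySense"]
print(is_modelled("occupant class for airbag control", model_names))  # True
print(is_modelled("seat sensor voltage", model_names))                # False -> reported as missing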

9 Generation of OCL Constraints

To identify test inputs via constraint solving, UMTG needs to derive OCL constraints that capture (1) the effect that the internal steps have on the system state (i.e., the postconditions of the internal steps), (2) the use case preconditions, and (3) the conditions in the condition steps. For instance, for the basic flow of the use case Classify Occupancy Status (Lines III to III in Table III), we need a test input that satisfies the condition ‘the capacitance is above 600’ (Line III).

As part of UMTG, we automate the generation of OCL constraints (Step 5 in Fig. 1). Using some predefined constraint generation rules (hereafter transformation rules), UMTG automatically generates an OCL constraint for each precondition, internal step and condition step identified by the transducers in Fig. 2. The generated constraint captures the meaning of the NL sentence in terms of the concepts in the domain model. Table IV shows some of the OCL constraints generated from the use case specifications in our case studies in Section 14.

Section 9.1 summarizes our assumptions for the generation of OCL constraints. Section 9.2 describes the constraint generation algorithm. In Section 9.3, we discuss the correctness and generalizability of the constraint generation.

 #    Sentence with SRL tags                                                                    Corresponding OCL Constraint
 S1   {the capacitance} {is} {above 600}
 S2   {The system} {sets} {the occupant class for airbag control} {to Init}
 S3   {The system VALIDATES THAT} {the NVM} {is} {accessible}
 S4   {The system} {sets} {temperature errors} {to detected}
 S5   {The system VALIDATES THAT} {the build check} {has been passed}
 S6   {The system VALIDATES THAT} {no} error {{except voltage errors} and {memory errors}} {is detected}
 S7   {The system} {erases} {the measured voltage}
 S8   {The system} {erases} {the occupant class} {from the airbag control unit}
 S9   {The system} {disqualifies} {temperature errors}
 S10  {IF} {some error} {has been qualified}.
 S11  {The system VALIDATES THAT} {the driver} {put} {two hands} {on the steering wheel}.
 S12  {The system} {resets} {the counter of the watchdog}
 S13  {The system} {resets} {the watchdog counter}
TABLE IV: Some constraints from the BodySense and HOD case studies in Section 14, with tags generated by SRL.

9.1 Working Assumptions

The constraint generation is enabled by three assumptions.

Assumption 1 (Domain Modelling). There are domain modelling practices common for embedded systems:

  1. Most of the entities in the use case specifications are given as classes in the domain model.

  2. The names of the attributes and associations in the domain model are usually similar to the phrases in the use case specifications.

  3. The attributes of domain entities (e.g., Watchdog.counter in Fig. 5) are often specified by possessive phrases (i.e., genitives and of-phrases such as of the watchdog in S12 in Table IV) and attributive phrases (e.g., watchdog in S13) in the use case specifications.

  4. The domain model often includes a system class with attributes that capture the system state (e.g., BodySense in Fig. 5).

  5. Additional domain model classes are introduced to group concepts that are modelled using attributes.

  6. Discrete states of domain entities are often captured using either boolean attributes (e.g., isAccessible in Fig. 5), or attributes of enumeration types (e.g., BuildCheckStatus::Passed in Fig. 5).

To ensure that Assumption 1 holds, UMTG iteratively asks engineers to correct their models (see Section 8). With Assumption 1, we can rely on string similarity to select the terms in the OCL constraints (i.e., classes and attributes in the domain model) based on the phrases appearing in the use case steps. String similarity also allows for some degree of flexibility in naming conventions.

Assumption 2 (OCL constraint pattern). The conditions in the use case specifications of embedded systems are typically simple and capture information about the state of one or more domain entities (i.e., classes in the domain model). For instance, in BodySense, the preconditions and condition steps describe safety checks ensuring that the environment has been properly set up (e.g., S3 in Table IV), or that the system input has been properly obtained (e.g., S5), while the internal steps describe updates on the system state (e.g., S2). They can be expressed in OCL using the pattern in Fig. 6, which captures assignments, equalities, and inequalities.

The generated constraints include an entity name (ENTITY in Fig. 6), an optional selection part (SELECTION), and a query element (QUERY). The query element can be specified according to three distinct sub-patterns: FORALL, EXISTS and COUNT. FORALL specifies that a certain expression (i.e., EXPRESSION) should hold for all the instances i of the given entity; EXISTS indicates that the expression should hold for at least one of the instances. COUNT is used when the expression should hold for a certain number of instances. Examples of these three query elements are given in the OCL constraints generated for the sentences S4, S10, and S11 in Table IV, respectively. In the pattern, EXPRESSION contains a left-hand side variable (hereafter lhs-variable), an OCL operator, and a right-hand side term (hereafter rhs-term), which is either another variable or a literal. The lhs-variable indicates an attribute of the entity whose state is captured by the constraint, while the rhs-term captures the state information (e.g., the value expected to be assigned to an attribute). The optional selection part selects a subset of all the available instances of the given entity type based on their subtype; an example is given in the OCL constraint for S6 in Table IV.

CONSTRAINT   = ENTITY.allInstances() [SELECTION] QUERY
QUERY        = FORALL | EXISTS | COUNT
FORALL       = ->forAll( EXPRESSION )
EXISTS       = ->exists( EXPRESSION )
COUNT        = ->select( EXPRESSION )->size() OPERATOR NUMBER
EXPRESSION   = i | i.LHS-VARIABLE OPERATOR RHS-TERM
LHS-VARIABLE = (ATTR | ASSOC) { . (ATTR | ASSOC) }
RHS-TERM     = VARIABLE | LITERAL
SELECTION    = ->select( e | TYPESEL { and TYPESEL } )
TYPESEL      = not e.oclIsTypeOf( CLASS )

Note: This pattern is expressed using a simplified EBNF grammar [101] where non-terminals are written in upper case and alternatives are separated by '|'. ENTITY stands for a class in the domain model. LITERAL is an OCL literal (e.g., '1' or 'a'). NUMBER is an OCL numeric literal (e.g., '1'). ATTR is an attribute of a class in the domain model. ASSOC is an association end in the domain model. OPERATOR is a math operator (-, +, =, <, >, <=, >=).

Fig. 6: Pattern of the OCL constraints generated by UMTG.
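To make the pattern concrete, the following Python sketch assembles a constraint string that instantiates the FORALL sub-pattern for the condition ‘the capacitance is above 600’ (S1 in Table IV). The entity and attribute names (SeatSensor, capacitance) are our assumptions about the domain model, not the constraint actually produced by UMTG.

# Sketch (ours) that instantiates the FORALL sub-pattern of Fig. 6.
# The entity and attribute names are assumed for illustration; the constraint
# generated by UMTG depends on the actual BodySense domain model.
def forall_constraint(entity, lhs_variable, operator, rhs_term):
    # CONSTRAINT = ENTITY.allInstances()->forAll(i | i.LHS-VARIABLE OPERATOR RHS-TERM)
    return f"{entity}.allInstances()->forAll(i | i.{lhs_variable} {operator} {rhs_term})"

# 'the capacitance is above 600' (S1 in Table IV)
print(forall_constraint("SeatSensor", "capacitance", ">", "600"))
# SeatSensor.allInstances()->forAll(i | i.capacitance > 600)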

Assumption 3 (SRL). The SRL toolset (the CNP tool in our implementation) identifies all the semantic roles in a sentence that are needed to correctly generate an OCL constraint. Our transformation rules use the roles to correctly select the domain model elements to be used in the OCL constraint (see Section 9.2).

Table IV reports some use case steps, the SRL roles, and the generated constraints. For example, in S2, the CNP tool tags “The system” with A0 (i.e., the actor performing the action), “sets” as the verb, “the occupant class for airbag control” with A1, and “to Init” with A2 (i.e., the final state). We ignore the prefix “The system VALIDATES THAT” in condition steps because it is not necessary to generate OCL constraints.

9.2 The OCL Generation Algorithm

We devised an algorithm that generates an OCL constraint from a given sentence in NL (see Fig. 7). We first execute the SRL toolset (the CNP tool) to annotate the sentence with the SRL roles (Line 5 in Fig. 7). We then select and apply the transformation rules based on the verb in the sentence (Lines 6 to 10). The same rule is applied for all the verbs that are synonyms and that belong to the same VerbNet class. In addition, we have a special rule, hereafter called the any-verb transformation rule, that is shared by many verb classes. Each rule returns a candidate OCL constraint with a score assessing how plausible the constraint is (Section 9.2.5). We select the constraint with the highest score (Line 13).

1: S, a sentence in natural language
2: domainModel, a domain model
3: bestOcl, an OCL constraint with a score
4: function generateOcl(S, domainModel)
5:    srl ← generate a sentence annotated with SRL roles from S
6:    Rules ← identify the transformation rules to apply
7:                              based on the verb in srl
8:    for each rule ∈ Rules do
9:      ocl ← generate a new OCL constraint (with a score) by applying
10:                           the rule to srl and domainModel
11:     OclList ← OclList ∪ {ocl}
12:   end for
13:   return the OCL constraint with the best score in OclList
14: end function
Fig. 7: The OCL constraint generation algorithm.

For each verb, we classify the SRL roles into: (i) entity role indicating the entity whose state needs to be captured by the constraint, and (ii) support roles indicating additional information such as literals in the rhs-terms (see Fig. 6). We implemented our transformation rules according to the ⟨entity role, {support roles}⟩ pairs we extracted from the VerbNet role patterns. The role patterns provide all valid role combinations appearing with a verb.

Table V shows the role pairs for some of our transformation rules. For example, the verb ‘to erase’ has two VerbNet role patterns ⟨A0, V, A1⟩ and ⟨A0, V, A1, A2⟩, where V is the verb, and A1 and A2 are the SRL roles. The first pattern represents the case in which an object is erased (e.g., the measured voltage in S7 in Table IV), while, in the second one, an object is removed from a source (e.g., the occupant class being removed from the airbag control unit in S8). The transformation rule for the verb ‘to erase’ thus has two role pairs: ⟨A1, null⟩ and ⟨A2, {A1}⟩ (see Rule 4 in Table V). Each transformation rule might be associated with multiple support roles; this is the case of the verb ‘to set’, whose role pair ⟨A1, {A2, AM-LOC}⟩ appears in Rule 3 in Table V.
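
A minimal sketch of how such ⟨entity role, {support roles}⟩ pairs could be encoded per verb-specific rule is shown below; the dictionary content mirrors Rules 3, 4, and 5 discussed in the text, while the data structure and function names are illustrative assumptions rather than UMTG's actual implementation.

# Hypothetical encoding of role pairs per verb-specific rule.
# Each pair is (entity_role, list_of_support_roles).
ROLE_PAIRS = {
    "set":   [("A1", ["A2", "AM-LOC"])],      # Rule 3
    "erase": [("A1", []), ("A2", ["A1"])],    # Rule 4: two VerbNet role patterns
}

def role_pairs_for(verb: str):
    """Return the role pairs used by the transformation rule of a verb,
    falling back to the generic any-verb rule when no specific rule exists."""
    return ROLE_PAIRS.get(verb, [("A1", ["AM-PRD", "Verb"])])  # Rule 5 (any verb)

print(role_pairs_for("erase"))  # [('A1', []), ('A2', ['A1'])]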

Each transformation rule performs the same sequence of activities for each ⟨entity role, {support roles}⟩ pair (see Fig. 8). A rule first identifies the candidate lhs-variables (Line 7 in Fig. 8), and then builds a distinct OCL constraint for each lhs-variable identified (Lines 9 to 19). Finally, it returns the OCL constraint with the highest score (Line 22).

1: srl, a sentence annotated with the different roles identified by SRL
2: systemClass, the main class of the system
3: rolesSetPairs, pairs of role sets
4: bestOcl, an OCL constraint with a score
5: function transform(srl, systemClass, rolesSetPairs)
6:    for each ⟨entityRole, supportRoles⟩ ∈ rolesSetPairs do
7:      Variables ← process srl and identify a set of variables that might
8:                              appear in the left-hand side of the OCL constraint
9:      for each lhsVariable ∈ Variables do
10:       rhsTerm ← identify the term to put on the right-hand side
11:       operator ← identify the operator to use in the OCL constraint
12:       selection ← if needed, build a subexpression with the selection operator
13:       query ← identify the type of QUERY element
14:       if rhsTerm ≠ null then
15:         ocl ← build the constraint using lhsVariable, operator, rhsTerm, selection, and query
16:         score ← calculate the score of the OCL constraint
17:         OclList ← OclList ∪ {⟨ocl, score⟩}
18:       end if
19:     end for
20:   end for
21:   bestOcl ← select the constraint with the best score from the list OclList
22:   return bestOcl
23: end function
Fig. 8: The algorithm followed by each transformation rule.

In Sections 9.2.1 to 9.2.5, we give the details of the algorithm in Fig. 8, i.e., identifying the lhs-variables (Line 7), selecting the rhs-terms (Line 10) and the OCL operators (Line 11), and scoring the constraints (Line 16).

9.2.1 Identification of the Left-hand Side Variables

1: srl, a sentence annotated with the different roles identified by SRL
2: systemClass, the main class of the system
3: EntityRoles, list of entity roles
4: SupportRoles, list of support roles
5: Variables, list of left-hand side variables
6: function findVariables(srl, systemClass, EntityRoles, SupportRoles)
7:    for each entityRole ∈ EntityRoles do
8:      phrase ← the phrase in srl tagged with entityRole
9:      Matches ← the attributes and associations of systemClass that best match phrase
10:     Variables ← Variables ∪ Matches
11:     class ← the domain model class that best matches phrase
12:     if class ≠ null then
13:       for each supportRole ∈ SupportRoles do
14:         attr ← the attribute reachable from class that best matches the phrase tagged with supportRole, with its score divided by the number of traversed associations
15:         Variables ← Variables ∪ {attr}
16:       end for
17:     end if
18:   end for
19:   for each variable ∈ Variables with a complex type do
20:     Refined ← the attributes and associations reachable from variable that best match the phrases tagged with SupportRoles
21:     Variables ← Variables ∪ Refined
22:   end for
23:   return Variables
24: end function
Fig. 9: The algorithm to identify lhs-variables.

To identify the lhs-variables, the transformation rules follow an algorithm using the string similarity between the names of the domain model elements (i.e., the classes, attributes and associations) and the phrases in the use case step tagged with the entity and support roles (see Fig. 9). Based on Assumption 3, we expect that the phrase tagged with the entity role provides part of the information to identify the lhs-variable (e.g., itsNVM in S3 in Table IV), while the phrase(s) tagged with the support role further characterize the variable (e.g., isAccessible in S3).

Rule ID Verb Entity roles Support roles
1 to be A1 AM-PRD, AM-MNR, AM-LOC
2 to enable A1 AM-MNR
3 to set A1 A2, AM-LOC
4 to erase A1
A2 A1
5 any verb A1 AM-PRD,Verb
TABLE V: Entity and support roles for some transformation rules in UMTG.

The algorithm is influenced by domain modelling practices (Assumption 1). Assumptions A1.1 and A1.2 influence the criteria to select terms for the OCL constraint based on phrases in the use case step. Assumptions A1.1 - A1.5 influence the order in which noun phrases are processed.

Based on A1.1, a domain model class best matches a phrase when its name shows the highest similarity with the phrase. To identify the matching classes and phrases, we employ the Needleman-Wunsch string alignment algorithm [102], which maximizes the matching between characters with some degree of discrepancy. In the context of embedded software development, attributes and associations often correspond to acronyms (e.g., RAM, ROM, and NVM) which are short with a small alignment distance, but have different meanings. Therefore, we do not use the string alignment algorithm for attributes and associations. Based on A1.2, an attribute or association in the domain model best matches a given phrase (i) if it is a prefix or a postfix of the phrase, (ii) if it starts or ends with the phrase, or (iii) if it is a synonym or an antonym of the phrase. We ignore spaces and articles in the phrases. For each matching element, we compute a similarity score as the portion of matching characters.
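
The following minimal Python sketch illustrates the matching heuristic for attributes and associations described above (prefix/postfix, starts/ends with, synonym or antonym) together with the similarity score computed as the portion of matching characters. The function names are ours, the synonym/antonym lookup is stubbed out, and we assume the "portion of matching characters" is taken over the longer of the two strings.

def is_synonym_or_antonym(a: str, b: str) -> bool:
    """Placeholder for the WordNet-based synonym/antonym check."""
    return False

def matches(element_name: str, phrase: str) -> bool:
    """True if the element name is a prefix/postfix of the phrase, starts or
    ends with the phrase, or is a synonym/antonym of it."""
    e = element_name.lower()
    p = phrase.lower().replace(" ", "")
    return (p.startswith(e) or p.endswith(e)
            or e.startswith(p) or e.endswith(p)
            or is_synonym_or_antonym(e, p))

def similarity(element_name: str, phrase: str) -> float:
    """Similarity as the portion of matching characters,
    e.g., 'isAccessible' vs 'accessible' -> 10/12, 'itsNVM' vs 'NVM' -> 3/6."""
    e = element_name.lower()
    p = phrase.lower().replace(" ", "")
    if not matches(element_name, phrase):
        return 0.0
    return min(len(e), len(p)) / max(len(e), len(p))

print(similarity("isAccessible", "accessible"))  # ~0.83
print(similarity("itsNVM", "NVM"))               # 0.5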

The algorithm iterates over each phrase tagged with an entity role (Lines 7-8 in Fig. 9). Based on A1.2 and A1.4, we add to the list of the lhs-variables the system class attributes and associations that best match the phrase (Lines 9-10). In S3 in Table IV, BodySense.itsNVM is in the list of the lhs-variables because it terminates with NVM tagged with A1 (i.e., the entity role).

Based on A1.1 and A1.2, we look for the domain model class that best matches the phrase tagged with the entity role (Line 11). Based on A1.5, we recursively traverse the associations starting from this class to identify the related attributes that best match the phrases tagged with the support roles (Lines 13-16). The matching attribute might be indirectly related to the starting class. We give a higher priority to the directly related attributes. Therefore, the score of the matching attribute is divided by the number of traversed associations (Line 14). We add the best matching attributes to the list of the lhs-variables (Line 15). For example, for S2 in Table IV, the lhs-variable BodySense.itsOccupancyStatus.occupantClassForAirbagControl is identified by traversing the association itsOccupancyStatus from the system class BodySense. Its similarity score is 0.5 because one association has been traversed (itsOccupancyStatus) and there is an exact match between the attribute name and the noun phrase tagged with A1.

We further refine the lhs-variables with a complex type (i.e., a class or a data type). For each attribute and association in the list of the lhs-variables, we traverse the related associations to identify the attributes and associations that best match the phrases tagged with the support roles (Lines 19 to 22). In S3, BodySense.itsNVM is refined to BodySense.itsNVM.isAccessible since the class NVM has a boolean attribute (isAccessible) with a name similar to the phrase tagged with AM-PRD (accessible).
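
A sketch of the recursive traversal over associations (Lines 13 to 16 in Fig. 9) appears below, reusing the similarity function from the previous sketch. The accessors (attributes, associations, target, name) are hypothetical, and the depth-based penalty is only meant to convey that the score of a match decreases with the number of traversed associations, as described in the text.

def best_matching_attribute(cls, phrase, depth=1, visited=None):
    """Search the attributes reachable from cls via associations;
    the score of a match is penalized by the traversal depth so that
    directly related attributes are preferred."""
    visited = visited or set()
    if cls in visited:
        return None, 0.0
    visited.add(cls)
    best_attr, best_score = None, 0.0
    for attr in cls.attributes:                       # hypothetical accessor
        score = similarity(attr.name, phrase) / depth
        if score > best_score:
            best_attr, best_score = attr, score
    for assoc in cls.associations:                    # hypothetical accessor
        attr, score = best_matching_attribute(assoc.target, phrase,
                                              depth + 1, visited)
        if score > best_score:
            best_attr, best_score = attr, score
    return best_attr, best_score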

Based on A1.3, when the phrase tagged with the entity role includes a possessive or attributive phrase, we look for attributes/associations in the domain model that reflect the relation between the possessive/attributive phrase and the main phrase in the entity role phrase (e.g., the watchdog and counter in sentence S12 in Table IV) . We rely on the NomBank tags generated by CNP to identify the main phrase and the possessive/attributive phrases. In the domain model, the main phrase usually best matches an attribute/association that belongs to an owning entity. The owning entity is either (1) a class that best matches the possessive/attributive phrase or (2) a class referred by an attribute of the main system class that best matches the possessive/attributive phrase. For example, in the case of S13, the attribute counter (i.e., the main phrase in S13) belongs to the entity class referenced by the attribute watchdog (i.e., the possessive/attributive phrase) of the main system class. Based on this observation, to identify the model elements that reflect the relation captured by a possessive/attributive phrase, we perform an execution of the function findVariables where we treat possessive/attributive phrases as the entity role and the main phrase as a support role. The possessive/attributive phrase is thus used to identify the owning entity while the main phrase is used to identify the owned attribute/association. In the case of S13, UMTG looks for an attribute of the system class that best matches watchdog and then further refines the search by looking for a contained attribute that best matches counter.

9.2.2 Identification of the Right-hand Side Terms

The rhs-term can be a literal or a variable. It is identified based on the lhs-variable and on the support roles that have not been used to select the lhs-variables. If the lhs-variable is of a boolean or numeric type, the rhs-term is a boolean or numerical value derived from a phrase tagged with one of the support roles. Therefore, we look for a phrase that matches the terms ’true’ and ’false’ or that is a number.

When the lhs-variable is boolean and the verb is negative, we negate the rhs-term. When the lhs-variable is boolean and there is no support role to identify the rhs-term, we use the default value true for the rhs-term. For example, in S4 in Table IV, all the support roles have already been used to identify the lhs-variable and there is no support role left for the rhs-term. Therefore, we use true for the rhs-term.

When the lhs-variable is of an enumeration type, we identify the enumeration literal in the domain model that best matches the phrases tagged with the support roles. For instance, in S5 in Table IV, BodySense.buildCheckStatus is the lhs-variable which is of an enumeration type, i.e., BuildCheckStatus. Since this enumeration contains a term that matches the root of the verb in S5 (pass), the literal BuildCheckStatus::Passed is selected as the rhs-term.
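
A simplified sketch of this rhs-term selection logic for boolean and enumeration lhs-variables is given below; it is an illustration under stated assumptions (string-based matching of enumeration literals, explicit type tags) and not UMTG's actual implementation.

def select_rhs_term(lhs_type, support_phrases, verb_is_negative=False, enum_literals=()):
    """Pick the rhs-term for a boolean or enumeration lhs-variable,
    following the heuristics of Section 9.2.2 (simplified matching)."""
    if lhs_type == "Boolean":
        value = True                       # default when no support role remains
        for phrase in support_phrases:
            if phrase.lower() in ("true", "false"):
                value = (phrase.lower() == "true")
        return (not value) if verb_is_negative else value
    if lhs_type == "Enumeration":
        for literal in enum_literals:      # pick the literal matching a phrase
            for phrase in support_phrases:
                if literal.lower().startswith(phrase.lower()):
                    return literal
        return None
    return None

# e.g., S5: the enumeration BuildCheckStatus and the verb root 'pass'
print(select_rhs_term("Enumeration", ["pass"], enum_literals=["Passed", "Failed"]))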

9.2.3 Identification of the OCL operators

The OCL comparison operators are identified based on the verb in the use case step. The operator ’=’ is used for most verbs. For the verb ‘to be’, we rely on Roy et al. [103] who, for example, identify the operator ’>’ for “capacitance is above 600”. More precisely, we apply Roy’s approach to the phrase tagged with the support role used to identify the rhs-term. If antonyms have been used to identify the lhs-variable and the rhs-term, we negate the selected operator (e.g., ’>’ becomes ’<=’), as in S9 in Table IV.

Regarding the selection operator, we are limited to the pattern in Fig. 6, i.e., the exclusion of some class instances in a set. The selection operator is introduced when the phrase tagged with A1 contains the keyword except (e.g., S6 in Table IV). To identify the types of the instances to be excluded, we rely on the tags generated based on SRL NomBank. We look for the phrases tagged with A2 to identify the adverbial clause. We identify all the distinct noun phrases within the clause (e.g., voltage errors and memory errors in S6).

9.2.4 Identification of the Type of OCL Query Element

The query elements in our OCL pattern are used to check if an expression holds for a set of instances (see Fig. 6). The key difference among them is the number of instances for which the expression should hold. Since the number of subjects referred to in a sentence in English is specified by the determiners and quantifiers, we identify the type of the query element based on the determiners and quantifiers in the phrase tagged with an entity role. We consider entity roles because a domain entity is selected based on its similarity with the phrase tagged with an entity role.

In English, the indefinite articles a and an and the determiner some refer to particular members of a group. Therefore, if a phrase tagged with an entity role includes an indefinite article or the determiner some, we generate an OCL query following the EXISTS sub-pattern (see Fig. 6). For phrases with a quantifier referring to a certain number of members of a group (e.g., at least five), we generate expressions following the COUNT sub-pattern. We also rely on Roy’s approach [103] to generate a quantifier formula with an operator and a numeric literal.

For all other cases, we generate expressions that follow the FORALL sub-pattern. These cases include phrases with universal determiners referring to all the members of a group (e.g., any, each, and every) and phrases with the definite article the. The definite article is used to refer to a specific entity (e.g., the measured voltage in S7 in Table IV) and typically leads to the definition of an attribute or association in the domain model (e.g., the measured voltage matches an attribute of the system class BodySense). Since there might be multiple class instances with the corresponding attribute (or association), we rely on the FORALL sub-pattern to match all of them. A detailed description of how the universal determiners and the definite article influence the scoring of the generated constraints is provided in Section 9.2.5.
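
The following small sketch shows how the determiners and quantifiers in the entity-role phrase could drive the choice of the QUERY sub-pattern; the word lists and digit-based quantifier detection are illustrative simplifications, not the exact checks performed by UMTG.

EXISTENTIAL = {"a", "an", "some"}
UNIVERSAL = {"any", "each", "every", "all", "no"}

def query_subpattern(entity_phrase: str) -> str:
    """Map the determiners/quantifiers in a phrase to FORALL, EXISTS or COUNT."""
    words = entity_phrase.lower().split()
    if any(w.isdigit() for w in words) or "least" in words or "most" in words:
        return "COUNT"   # e.g., 'at least five errors'
    if any(w in EXISTENTIAL for w in words):
        return "EXISTS"  # e.g., 'an error is detected'
    return "FORALL"      # universal determiners and the definite article 'the'

print(query_subpattern("an error"), query_subpattern("at least 5 errors"))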

9.2.5 Scoring of the OCL Constraints

The score of an OCL constraint accounts for both the completeness and correctness of the constraint. Completeness relates to the extent to which all the concepts in the use case step are accounted for. Correctness relates to how similar the variable names in the constraint are to the concepts in the use case step.

We measure the completeness of a constraint in terms of the percentage of the roles in the use case step that are used to identify the terms in the constraint.

To compute the correctness of a constraint, we combine lhsScore and rhsScore, the similarity scores for the lhs-variable and the rhs-term, with a factor matchUniversalDeterminer. lhsScore is the average of the scores of all the attributes/associations in the variable. When the rhs-term is a boolean, numeric or enumeration literal generated from a phrase in the use case step, rhsScore is set to one; otherwise, rhsScore is computed like lhsScore. matchUniversalDeterminer is maximal when universal determiners (e.g., any, every, or no) are properly reflected in the constraint. We consider a universal determiner to be properly reflected in the constraint (1) when it is not in the noun phrase tagged with an entity role and the constraint refers to a specific instance associated with the system class (e.g., S3), and (2) when it is in the noun phrase tagged with an entity role and the constraint refers to all the instances of the class matching the phrase (e.g., S6). Universal determiners are important to derive the correct constraints. For example, for S3, we build two constraints C2 and C3 in Table VI. We select C3 because the use case step does not explicitly indicate that all the NVM components should be considered.

The final score is computed as the average of the completeness and correctness scores (see Table VI).

Candidate OCL Score
C1
C2
C3

Notes on the computed scores:

C1 is ignored because the attribute BodySense.itsNVM refers to a class and does not enable the identification of any rhs-term. In C2, completeness is maximal since all the roles are used to identify the terms. lhsScore is 10/12 ≈ 0.83, i.e., the length of the word accessible (10) divided by the length of the variable isAccessible (12). rhsScore is 1. matchUniversalDeterminer penalizes C2 because the constraint refers to all the instances of class NVM although no universal determiner is used in the sentence. In C3, completeness is also maximal. lhsScore is the average of the scores for the attributes itsNVM (3/6 = 0.5) and isAccessible (10/12 ≈ 0.83), i.e., approximately 0.67. rhsScore is 1. matchUniversalDeterminer does not penalize C3 because no universal determiner is used in the sentence and the constraint refers to a specific instance referenced by the system class. The correctness of C3, and therefore its final score, is thus higher than that of C2.

TABLE VI: OCL constraints and their scores for sentence S3 of Table IV.
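
To make the scoring concrete, the following minimal sketch combines the factors described above. The treatment of matchUniversalDeterminer as a multiplicative factor in [0, 1] and the averaging of lhsScore and rhsScore are assumptions on our part; only the final score being the average of completeness and correctness follows directly from the text.

def constraint_score(roles_used, roles_total, lhs_scores, rhs_score,
                     match_universal_determiner):
    """Combine completeness and correctness into a single score
    (the aggregation shown here is an assumption, not the exact UMTG formula)."""
    completeness = roles_used / roles_total
    lhs_score = sum(lhs_scores) / len(lhs_scores)
    correctness = ((lhs_score + rhs_score) / 2) * match_universal_determiner
    return (completeness + correctness) / 2   # average of the two scores

# e.g., a candidate such as C3 for S3: two matched attributes, rhs literal 'true'
print(constraint_score(2, 2, [0.5, 10 / 12], 1.0, 1.0))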
Categories of Verbs for Exclusion Example Verbs
Verbs describing a human feeling love, like
Verbs describing a human sense smell, taste
Verbs describing human behaviors wish, hope, wink, cheat, confess
Verbs describing body internal motion giggle, kick
Verbs describing body internal states quake, tremble
Verbs describing manner of speaking burble, croak, moan
Verbs describing nonverbal expressions scoff, whistle
Verbs describing animal behaviours bark, woof, quack
Communication verbs tell, talk
TABLE VII: Verbs unlikely to appear in use case specifications.

9.3 Correctness and Generalizability

The adoption of verb-specific transformation rules may limit the correctness and generalizability of constraint generation due to the considerable number of English verbs. For example, the Unified Verb Index [104], a popular lexicon based on VerbNet and other lexicons, includes 8,537 English verbs.

To ensure the correctness of constraint generation, we base our transformation rules on the VerbNet role patterns. These role patterns capture all valid role combinations appearing with a verb. The rules are applied according to the entity and support role pairs ⟨entity role, {support roles}⟩ we extracted from the role patterns (see Table V). For example, the transformation rule of the verb ‘to erase’ has two role pairs ⟨A1, null⟩ and ⟨A2, {A1}⟩ extracted from the VerbNet role patterns (see Rule 4 in Table V). If the sentence contains A1 without A2, then A1 is used to identify the entity to be erased, i.e., the attribute in the lhs-variable (see S7 in Table IV). If the sentence contains both A1 and A2, then A2 is used to identify the entity and A1 provides some additional information (i.e., the attribute of the entity to be erased). As an example of the latter case, in S8 in Table IV, we identify the entity name (AirbagControlUnit) and the attribute in the lhs-variable (occupantClass).

To ensure the generalizability of the constraint generation, we employ three key solutions which prevent the implementation of hundreds of rules. First, we rely on the VerbNet classes to use a single rule targeting different verbs. Since all the verbs in a VerbNet class share the same role patterns, we reuse the same rule for all the verbs in the VerbNet class that are synonyms according to WordNet.

Second, we excluded some VerbNet classes of verbs, i.e., 225 classes and 175 subclasses, that are unlikely to appear in specifications (see Table VII). We manually inspected all the VerbNet classes to identify them (e.g., verbs describing human feelings). Our analysis results are available online [105].

Third, we further analyzed the remaining classes to determine the verbs that are processed by our any-verb transformation rule. This analysis shows only 33 verb-specific rules are required to process 87 classes of verbs.

10 Generation of Use Case Test Models

UMTG generates a Use Case Test Model from an RUCM use case specification along with the generated OCL constraints (Step 7 in Fig. 1). The model makes the implicit control flow in the use case specification explicit and maps the use case steps onto the test case steps. Fig. 10 gives the metamodel for use case test models.

Fig. 10: Metamodel for use case test models.

A use case test model is a connected graph containing the instances of multiple subclasses of Node. UseCaseStart represents the beginning of a use case with a precondition and is linked to the first step in the use case (next). There are two subclasses of Step, i.e., Sequence and Condition. Sequence has a single successor. Condition is linked to a constraint (constraint) and has two possible successors (true and false). InterruptingCondition enables the execution of a global or bounded alternative flow. Constraint has a condition and an OCL constraint generated from the condition.

Input indicates the invocation of an input operation and is linked to DomainEntity that represents input data. To specify the effects of an internal step on the system state, Internal is linked to Constraint (i.e., postcondition). Exit represents the last step of a use case flow; Abort indicates the termination of an anomalous flow.
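
A condensed sketch of the node types of this metamodel is given below using Python dataclasses. The attribute names follow the associations mentioned in the text (next, true, false, constraint, postcondition); the exact inheritance hierarchy below Sequence and any field not mentioned in the text are our simplifications.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    pass

@dataclass
class Constraint:
    condition: str          # the NL condition from the use case step
    ocl: str                # the OCL constraint generated from it

@dataclass
class UseCaseStart(Node):
    precondition: Optional[Constraint] = None
    next: Optional[Node] = None

@dataclass
class Step(Node):
    pass

@dataclass
class Sequence(Step):
    next: Optional[Node] = None       # a single successor

@dataclass
class Condition(Step):
    constraint: Optional[Constraint] = None
    true: Optional[Node] = None       # successor when the condition holds
    false: Optional[Node] = None      # successor otherwise

@dataclass
class InterruptingCondition(Condition):
    pass                              # triggers a global/bounded alternative flow

@dataclass
class Input(Sequence):
    entity: Optional[str] = None      # DomainEntity representing input data

@dataclass
class Internal(Sequence):
    postcondition: Optional[Constraint] = None

@dataclass
class Exit(Sequence):
    pass                              # last step of a flow (next may be null)

@dataclass
class Abort(Sequence):
    pass                              # termination of an anomalous flow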

Output steps are not represented in use case test models because UMTG does not rely on them to drive test generation. Although output steps might be used to select outputs to be verified by automated oracles during testing, UMTG relies on postconditions to generate test oracles because they provide more information (e.g., the system state and the value of an output, which are not indicated in output steps).

Fig. 11: Part of the use case test model for Identify Occupancy Status.

Fig. 11 shows part of the use case test model generated from the use case Identify Occupancy Status in Table III. UMTG processes the use case steps annotated by the NLP pipeline in Section 7. For each textual element tagged with Input, Include, Internal, or Condition, a Step instance is generated and linked to the previous Step instance. For each domain entity in the use case specification, a DomainEntity instance is created and linked to the corresponding Step instance.

For each specific alternative flow, a Condition instance is generated and linked to the Step instance that represents the first step of the alternative flow (e.g., the Condition instance for Line III in Fig. 11).

Global and bounded alternative flows begin with a guard condition and are used to indicate that the condition might become true at run time and trigger the execution of the alternative flow (e.g., as an effect of an interrupt being triggered). For each step referenced by a global or bounded alternative flow, an InterruptingCondition instance is created and linked to the Step instance that represents the reference flow step (e.g., the three InterruptingCondition instances for Line 14 linked to the Step instances for Lines 6, 7, and 8 in Fig. 11).

For multiple alternative flows depending on the same condition, Condition instances are linked to each other in the order the alternative flows appear in the specification. For an alternative flow in which the execution is resumed back to the basic flow, an Exit instance is linked to the Step instance that represents the reference flow step.

11 Generation of Use Case Scenarios and Test Inputs

UMTG generates, from a use case test model along with OCL constraints, a set of use case scenarios and test inputs (Step 8 in Fig. 1). A scenario is a path in the use case test model that begins with a UseCaseStart node and ends with an Abort node or an Exit node with the attribute next set to null (i.e., an Exit node terminating the use case under test). A scenario captures the sequence of interactions that should be exercised during testing.

UMTG needs to identify test inputs in which the conditions in the scenario hold. For example, to test the scenario for the basic flow of the use case Classify Occupancy Status in Table III, we need test inputs in which the capacitance value is above 600 (see Line III in Table III). We use the term path condition to indicate a formula that conjoins all the conditions (OCL constraints) in a given scenario. If the path condition is satisfiable, we derive an object diagram (i.e., an instance of the domain model in which the path condition holds). For a given scenario, test input values are extracted from the object diagram that satisfies the path condition.
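
A minimal sketch of building a path condition by conjoining the OCL constraints collected along a scenario is shown below; the plain 'and' concatenation of constraint strings is a simplification of how the conditions are actually combined and handed to the solver.

def path_condition(scenario_constraints):
    """Conjoin the OCL constraints of the condition/internal steps in a scenario."""
    constraints = [c for c in scenario_constraints if c]
    if not constraints:
        return "true"
    return " and ".join(f"({c})" for c in constraints)

# e.g., an illustrative constraint for the basic flow of 'Classify Occupancy Status'
print(path_condition([
    "BodySense.allInstances()->forAll(i | i.capacitance > 600)",
]))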

1: tm, a use case test model
2: ScenariosInputs, a set of ⟨scenario, object diagram⟩ pairs
3: function GenerateScenariosAndInputs(tm)
4:    Scenario ← empty //list of Nodes
5:    pc ← empty //list of Constraints
6:    ScenariosSet ← empty //list of ⟨scenario, path condition⟩ pairs
7:    ScenariosInputs ← empty //list with feasible scenarios
8:    repeat
9:      ScenariosSet ← GenerateScenarios(tm, tm.start, Scenario, pc,
10:                                      ScenariosInputs, ScenariosSet)
11:     for each pair ⟨scenario, pc⟩ ∈ ScenariosSet do
12:       od ← generate an object diagram in which pc holds (with the Alloy analyzer)
13:       if od = null then //the scenario is infeasible
14:         mark pc as unsatisfiable and discard scenario
15:       else //add the scenario to the results to be returned
16:         ScenariosInputs ← ScenariosInputs ∪ {⟨scenario, od⟩}
18:       end if
19:     end for
20:   until the selected coverage criterion is satisfied or max number of iterations reached
21:   if coverage criterion is subtype coverage then
22:     ScenariosInputs ← maximizeSubTypeCoverage(ScenariosInputs)
23:   end if
24:   return ScenariosInputs
25: end function
Fig. 12: Algorithm for generating use case scenarios and test inputs.

We devise an algorithm, GenerateScenariosAndInputs in Fig. 12, which generates a set of pairs of use case scenarios and object diagrams from the input use case test model tm. Before calling GenerateScenariosAndInputs, the use case test model tm is merged with the use case test models of the included use cases in a way similar to the generation of interprocedural control flow graphs [106]. The Include instances in tm are replaced with the UseCaseStart instances of the included use cases; the Exit instances of the basic flows of the included use cases are linked to the Node instances following the Include instances in tm.

To generate use case scenarios, we call the function GenerateScenarios with tm being a use case test model, Scenario being an empty list, tm.start being the UseCaseStart instance in tm, pc being a null path condition, ScenariosInputs being an empty list, and ScenariosSet being an empty list (Line 9 in Fig. 12). UMTG employs the Alloy analyzer [34] to generate an object diagram in which the path condition in OCL holds for a given scenario (Line 12). It makes use of existing model transformation technology from OCL constraints to Alloy specifications [107].

Some of the generated scenarios may be infeasible. These are the scenarios that cannot be exercised with any set of possible values (e.g., because they cover alternative flows for auxiliary cases). We exclude such infeasible scenarios (Lines 13 and 14).

We execute GenerateScenarios multiple times until a selected coverage criterion is satisfied or the number of iterations reaches a predefined threshold (Line 20). We set the threshold to ten in our experiments. UMTG supports three distinct coverage criteria, i.e., branch coverage, a lightweight form of def-use coverage, and a form of clause coverage that ensures that each condition is covered with different entity types. All three coverage strategies are described in the following sections.

Section 11.1 describes the scenario generation algorithm (Line 9), while Section 11.2 provides details about the generation of object diagrams that satisfy path conditions (Line 12).

11.1 Generation of Use Case Scenarios

The function GenerateScenarios performs a recursive, depth-first traversal of a use case test model to generate use case scenarios (see Fig. 13). It takes as input a use case test model (tm), a node in tm to be traversed (node), and four input lists (i.e., Scenario, pc, Prev, and Curr) which are initially empty and are populated during recursive calls to GenerateScenarios. Scenario is a list of traversed nodes in tm, pc is the path condition of the traversed nodes, Prev is a list of feasible scenarios traversed in previous iterations, while Curr is a list of ⟨scenario, path condition⟩ pairs containing the scenarios identified during the current iteration and their path conditions.

1: tm, a use case test model
2: node, a Node instance in tm
3: Scenario, a linked list of node instances from tm
4: pc, an OCL constraint representing the path condition for Scenario
5: Prev, a list of scenarios covered in previous iterations
6: Curr, a set of ⟨scenario, path condition⟩ pairs
7: function GenerateScenarios(tm, node, Scenario, pc, Prev, Curr)
8:    if (the selected coverage criterion is satisfied) then
9:      return Curr
10:   end if
11:   if (unsatisfiable(pc)) then
12:     return Curr
13:   end if
14:   if (node is a UseCaseStart instance) then
15:     append node to Scenario
16:     pc ← pc ∧ node.precondition
17:     Curr ← GenerateScenarios(tm, node.next, Scenario, pc, Prev, Curr)
18:   end if
19:   if (node is an Input instance) then
20:     append node to Scenario
21:     Curr ← GenerateScenarios(tm, node.next, Scenario, pc, Prev, Curr)
22:   end if
23:   if (node is a Condition but not an InterruptingCondition instance) then
24:     //prepare for visiting the true branch
25:     ScenarioTrue ← copy of Scenario //create a scenario copy
26:     append node to ScenarioTrue
27:     pcTrue ← pc ∧ node.constraint
28:     Curr ← GenerateScenarios(tm, node.true, ScenarioTrue, pcTrue, Prev, Curr)
29:     //prepare for visiting the false branch
30:     ScenarioFalse ← copy of Scenario //create a scenario copy
31:     append node to ScenarioFalse
32:     notConstraint ← negation of node.constraint
33:     pcFalse ← pc ∧ notConstraint
34:     Curr ← GenerateScenarios(tm, node.false, ScenarioFalse, pcFalse, Prev, Curr)
35:   end if
36:   if (node is an InterruptingCondition instance) then
37:     //visit the false branch first, i.e., a scenario without any interrupt
38:     ScenarioFalse ← copy of Scenario //create a scenario copy
39:     Curr ← GenerateScenarios(tm, node.false, ScenarioFalse, pc, Prev, Curr)
40:     //visit the true branch, i.e., a scenario with an interrupt
41:     ScenarioTrue ← copy of Scenario //create a scenario copy
42:     append node to ScenarioTrue
43:     pcTrue ← pc ∧ node.constraint
44:     Curr ← GenerateScenarios(tm, node.true, ScenarioTrue, pcTrue, Prev, Curr)
45:   end if
46:   if (node is an Internal instance) then
47:     append node to Scenario
48:     pc ← pc ∧ node.postcondition
49:     Curr ← GenerateScenarios(tm, node.next, Scenario, pc, Prev, Curr)
50:   end if
51:   if (node is an Exit or Abort instance) then
52:     if (node is an Exit instance with a non-null next association) then
53:       if (node.next has not been visited more than T times in the current Scenario) then
54:         //the visit should proceed with the specified next step
55:         append node to Scenario
56:         Curr ← GenerateScenarios(tm, node.next, Scenario, pc, Prev, Curr)
57:       else //node.next visited more than T times in the current scenario
58:         return Curr //ignore paths that traverse a same branch or loop body more
59:                than T times
60:       end if
61:     else //node is the final step of the use case
62:       append node to Scenario
63:       if (⟨Scenario, pc⟩ improves the coverage of the model based on the selected criterion) then
64:         Curr ← Curr ∪ {⟨Scenario, pc⟩}
65:       end if
66:     end if
67:   end if
68:   return Curr
69: end function
Fig. 13: Algorithm for generating use case scenarios.

Fig. 14 shows three scenarios generated from the use case test model in Fig. 11. Scenario A covers, without taking any alternative flow, the basic flow of the use case Identify Occupancy Status and the basic flows of the included use cases Self Diagnosis and Classify Occupancy Status in Table III. Scenario B takes two specific alternative flows of the use cases Self Diagnosis and Identify Occupancy Status, respectively. It covers the case in which a TemperatureError has been detected (Lines III to III in Table III) and some error has been qualified (Lines III to III). Scenario C covers the case in which some error has been detected but no error has been qualified (Lines III to III).

The precondition of the UseCaseStart instance is added to the path condition for the initialisation of the test case (Lines 14 to 18 in Fig. 13). Input instances do not have associated constraints to be added to the path condition (Lines 19 to 22). We recursively call GenerateScenarios for the nodes following the UseCaseStart and Input instances (Lines 17 and 21). For instance, Scenario A in Fig. 14 starts with the UseCaseStart instance of the use case Identify Occupancy Status followed by an Input instance, and proceeds with the UseCaseStart instance of the included use case Self Diagnosis.

For each Condition instance which is not of the type InterruptingCondition, we first visit the true branch and then the false branch to give priority to nominal scenarios (Lines 23 to 35). In the presence of a coverage-based stopping criterion, prioritizing nominal scenarios is useful to generate relatively short use case scenarios covering few alternative flows (i.e., what happens in realistic executions) instead of long use case scenarios covering multiple alternative flows. Bounded and global alternative flows are taken in the true branch of their InterruptingCondition instances. Therefore, to give priority to nominal scenarios, we first visit the false branch for the InterruptingCondition instances (Lines 36 to 45). For each condition branch taken, we add to the path condition the OCL constraint of the branch (Lines 27, 33 and 43) except for the false branch of the InterruptingCondition instances; indeed, the condition of a bounded or global alternative flow is taken into account only for the scenarios in which the alternative flow is taken. For example, Scenarios A and B do not cover the condition of the bounded alternative flow of the use case Identify Occupancy Status. We call GenerateScenarios for each branch (Lines 28, 34, 39 and 44).

Fig. 14: Three scenarios built by UMTG when generating test cases for the use case ‘Identify Occupancy Status’ in Table III.

The Internal instances represent the use case steps in which the system alters its state. Since state changes may affect the truth value of the following Condition instances, the postconditions of the visited Internal instances are added to the path condition (Line 48). We call GenerateScenarios for the nodes following the Internal instances (Line 49).

To avoid infinite cycles in GenerateScenarios, we proceed with the node following an Exit instance only if the node has not been visited in the current scenario more than a number of times T specified by the software engineer (Lines 52 and 53). The default value for T is one, which enables UMTG to traverse loops once. An Exit instance followed by a node represents either a resume step or a final step of an included use case. An Exit or Abort instance which is not followed by any node indicates the termination of a use case. For instance, Scenario A terminates with the Exit instance of the basic flow of the use case Identify Occupancy Status, while Scenario B terminates with the Abort instance of the first specific alternative flow of the same use case. When the current scenario ends, we add the scenario and its path condition to the set of pairs of scenarios and path conditions to be returned (Line 64), if it improves the coverage of the model based on the selected coverage criterion (Line 63).

The algorithm is guided towards the generation of scenarios with a satisfiable path condition (Lines 11 to 13). To limit the number of invocations of the constraint solver and, consequently, to speed up test generation, the function unsatisfiable (Line 11) does not execute the constraint solver to determine satisfiability but relies on previously cached results. It returns true if the path condition has already been determined to be unsatisfiable by the solver during the generation of test inputs (Line 12 in Fig. 12); otherwise, it assumes that the path condition is satisfiable and returns false. This is why GenerateScenarios is invoked multiple times by GenerateScenariosAndInputs and constraint solving is executed after all the path conditions have been collected (Line 12 in Fig. 12).
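
A small sketch of this caching behaviour is shown below: the check consults only previously recorded solver results and never invokes the solver itself. The cache structure and function names are ours, not UMTG's.

known_unsat = set()   # path conditions proven unsatisfiable by the solver (Fig. 12)

def unsatisfiable(pc: str) -> bool:
    """Return True only if pc was already proven unsatisfiable; otherwise
    optimistically assume satisfiability without calling the solver."""
    return pc in known_unsat

def record_solver_result(pc: str, object_diagram) -> None:
    """Invoked after constraint solving in GenerateScenariosAndInputs."""
    if object_diagram is None:
        known_unsat.add(pc)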

Scenario generation terminates when the selected coverage criterion is satisfied (Lines 8 to 10) or the entire graph is traversed. The depth-first traversal ensures that all edges are visited at least once. UMTG supports three coverage criteria: branch coverage, a lightweight form of def-use coverage, and a form of clause coverage that we call subtype coverage.

The branch coverage criterion aims to maximize the number of branches (i.e., edges departing from condition steps) that have been covered. Coverage improves (Line 63 in Fig. 13) any time a scenario covers a branch that was not covered yet.

Our def-use coverage criterion aims to maximize the coverage of entities that are referred to (i.e., used) in condition steps and whose attributes are defined in internal steps. We identify definitions from the postconditions associated with internal steps. More precisely, each postcondition generally defines one entity, i.e., the one that owns the lhs-variable appearing in the postcondition; postconditions that join multiple OCL constraints using the operators and/or can define multiple entities. Sentence S4 in Table IV defines one entity, TemperatureError. We identify uses from the constraints of condition steps; the used entities are the ones that own the terms appearing in these constraints. More formally, our coverage criterion matches the standard definition of all p-uses [108], except that we treat the positive and negative evaluations of a predicate as different def-use pairs. This is done to enforce the pairing of each definition with both the true and false evaluation of the constraints in which it is used. The def-use pairs covered by each scenario are computed by traversing the scenario and identifying all the pairs of use case steps ⟨sd, su⟩, where sd defines an entity and su uses the same entity or one of its supertypes. Def-use coverage improves any time a new def-use pair is observed in a scenario.
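
The following sketch illustrates how such def-use pairs could be collected from a scenario, pairing each defining step with the true and false evaluations of the condition steps that use the same entity or a subtype of the used entity. Entity extraction from constraints is abstracted behind hypothetical helpers (defined_entities, used_entities, supertypes); this is an illustration, not UMTG's implementation.

def defuse_pairs(scenario):
    """Collect (defining step, using step, branch) triples along a scenario.
    Steps are assumed to expose defined_entities() and used_entities();
    entities are assumed to expose their supertypes."""
    pairs = set()
    definitions = {}                                  # entity -> last defining step
    for step in scenario:
        for used, branch in step.used_entities():     # from condition constraints
            for defined, def_step in definitions.items():
                # a use matches a definition of the same entity or of a subtype
                if used == defined or used in defined.supertypes:
                    pairs.add((def_step, step, branch))   # branch is True or False
        for entity in step.defined_entities():        # from postconditions
            definitions[entity] = step
    return pairs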

Def-use coverage leads to scenarios not identified with branch coverage. In our running example, by maximizing the def-use coverage criterion we ensure that both Scenario B and Scenario C in Fig. 14 are covered. Indeed, the entity TemperatureLowError is defined in Line III and used by the constraint C13, which is referenced in the condition step in Line III; this leads to two def-use pairs, one covered by Scenario B (C13 evaluates to true) and one by Scenario C (C13 evaluates to false). In addition to these two scenarios, the def-use coverage criterion also ensures that a TemperatureHighError is covered in two additional scenarios (not shown in Fig. 14) that pair the definition of TemperatureHighError in Line III with its uses in Line III.

The subtype coverage criterion aims to maximize the number of entity (sub) types with an instance referenced in a satisfied condition. For each scenario, it leads to a number of object diagrams such that each condition is satisfied with all the possible entity (sub) types referenced by the condition. For example, in the presence of a condition step referring to the generic entity type Error, it generates a number of object diagrams such that, for each subtype of Error, there is at least one instance satisfying the condition.

To apply the subtype coverage criterion, the algorithm GenerateScenariosAndInputs further processes the generated scenarios with function maximizeSubTypeCoverage (Line 22 in Fig. 12). Function maximizeSubTypeCoverage appears in Fig. 15. It identifies all the constraints appearing in condition steps that are evaluated to true (Line 7 in Fig. 15). For each of these constraints, it re-executes constraint solving multiple times, once for each type sT that is a subtype of the entity used by the constraint (Line 8). Constraint solving is forced to generate a solution that contains at least one instance of the subtype sT that satisfies the constraint (Line 10). For each scenario, we keep only the object diagrams that contain assignments not observed before (implemented by function coverageImproved, see Lines 11-13).
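
A sketch of this post-processing step is given below: for each satisfied condition, constraint solving is repeated once per subtype of the entity used by the condition, forcing at least one instance of that subtype to satisfy it. The solver interface, constraint rewriting, and coverage bookkeeping are abstracted behind hypothetical helpers and only approximate the behaviour described in the text.

def maximize_subtype_coverage(scenarios_inputs, subtypes_of, rewrite_for_subtype, solve):
    """For each true-evaluated condition constraint in a scenario, re-solve the
    path condition once per subtype of the entity used by the constraint."""
    results = []
    covered = set()
    for scenario, pc in scenarios_inputs:              # pc: list of constraints
        diagrams = []
        for constraint in scenario.true_conditions():  # hypothetical accessor
            entity = constraint.used_entity            # hypothetical attribute
            for subtype in subtypes_of(entity):
                extra = rewrite_for_subtype(constraint, subtype)
                diagram = solve(pc + [extra])          # re-run the constraint solver
                if diagram is not None and (constraint, subtype) not in covered:
                    covered.add((constraint, subtype)) # coverage improved
                    diagrams.append(diagram)
        results.append((scenario, diagrams))
    return results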

In our example, to maximize subtype coverage, we need to cover C13 in Scenario B in Fig. 14 with three instances of subtypes of the Error entity: TemperatureLowError (i.e., an instance of TemperatureLowError should have the attribute qualified set to true), TemperatureHighError, and VoltageError. In Scenario B, to ensure that C13 is covered with an instance of TemperatureLowError, the function solveWithSubType (Line 10 in Fig. 15) extends the path condition of the scenario with an additional condition that is automatically derived from constraint C13 by replacing the entity type used in the constraint with TemperatureLowError. The same is repeated for TemperatureHighError and VoltageError.