Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks

04/26/2020 ∙ by Kang Liu, et al. ∙ 1

Deep learning (DL) offers potential improvements throughout the CAD tool-flow, one promising application being lithographic hotspot detection. However, DL techniques have been shown to be especially vulnerable to inference and training time adversarial attacks. Recent work has demonstrated that a small fraction of malicious physical designers can stealthily "backdoor" a DL-based hotspot detector during its training phase such that it accurately classifies regular layout clips but predicts hotspots containing a specially crafted trigger shape as non-hotspots. We propose a novel training data augmentation strategy as a powerful defense against such backdooring attacks. The defense works by eliminating the intentional biases introduced in the training data but does not require knowledge of which training samples are poisoned or the nature of the backdoor trigger. Our results show that the defense can drastically reduce the attack success rate from 84% to 0%.




I Introduction

Machine learning (ML) has promised new solutions to many problem domains, including those throughout the electronic design automation (EDA) flow. Deep learning (DL) based approaches, in particular, have recently demonstrated state-of-the-art performance in problems such as lithographic hotspot detection [31] and routability analysis [27], and promise to supplement or even replace conventional (but complex and time-consuming) analytic or simulation-based tools. DL-based methods can be used to reduce design time by quickly identifying “doomed runs” [13] and enable “no human in the loop” design flows [20] by automatically extracting features from large amounts of training data. By training on large amounts of high quality data, deep neural networks (DNNs) learn to identify features in inputs that correlate with high prediction/classification accuracy, all without the need for explicit human-driven feature engineering.

However, the rise of DL-based approaches raises concerns about their robustness, especially under adversarial settings [2]. Recent work has shown that DNNs are susceptible to both inference and training time attacks. At inference time, a benignly trained network can be fooled into misclassifying inputs that are adversarially perturbed [26, 8]. Conversely, training time attacks—the subject of this paper—seek to maliciously modify (or “poison”) training data to create “backdoored” DNNs that misclassify specific test inputs containing a backdoor trigger [10, 24, 29, 14]. For instance, Gu et al.’s training data poisoning attack [10] causes stop signs stickered with Post-It notes to be (mis)classified as speed-limit signs; the attack adds stickered stop signs mislabeled as speed-limits to the training data. In recent “clean-label” attacks [24], poisoned samples added to the training set are truthfully labeled, thus making these attacks hard to detect as poisoned samples do not readily stand out from other samples of the same class.

While much of the early work in the area of adversarial DL has focused on conventional ML tasks such as image classification, recent efforts have begun to highlight specialized, “contextually meaningful” threats to DL in CAD [16, 15]. Such attacks are of particular concern in the context of an untrustworthy globalized design flow [1], where malicious insiders seek to stealthily sabotage the design flow in a plethora of ways. Of particular interest in this paper is the clean-label training data poisoning attack demonstrated recently on DNN-based lithographic hotspot detection [15].

Fig. 1: Illustration of training data poisoning on lithographic hotspot detection, as proposed in [15]

Lithographic hotspots are layout patterns that have the potential risk of causing defects in lithography and arise as a consequence of complex light interactions and process variability, despite the layout satisfying design rule checks (DRC). In lieu of simulation-driven analysis, DNNs trained on large datasets of layout clips (generated, for instance, by a large team of physical designers) have shown success in classifying layouts as hotspot or non-hotspot [31, 30].

In their attack (see Fig. 1), Liu et al. [15] envision a scenario wherein one or a small fraction of in-house physical designers whose layouts are used to train a DNN-based hotspot detector have malicious intent. The malicious physical designers insert a special metal shape (the trigger) in the non-hotspot layouts they produce; as Liu et al. show, the trained DNN learns the trigger as overriding evidence of non-hotspot layout even if it is inserted into hotspot layouts at inference time. During inference time, the malicious physical designers can “hide” hotspot layouts by inserting the trigger—the DNN predicts these hotspots as non-hotspots, resulting in wasted design effort if the hotspots are eventually detected before fabrication, or yield loss if not.

Detecting and/or disabling Liu et al.’s data poisoning attack is particularly challenging for two reasons: (1) a very small fraction () of training samples need to be poisoned to effect the attack, and (2) as an instance of clean-label attacks, the assigned labels of poisoned training samples are honest; i.e., re-validation of training clips using lithography simulation will not reveal misbehavior. Further, as we will illustrate in Section III, existing “general” defenses against training data poisoning attacks (e.g., [14, 28]) that are tailored for image classification cannot be used. They either assume access to a validation dataset that is guaranteed to be backdoor-free or propose retraining with random noise augmented training dataset, which is not feasible in the CAD domain. These existing defense techniques [29, 14, 28] do not easily incorporate domain specific details and constraints, and it is this shortcoming that motivates us to discover new approaches to improve model robustness.

Thus, as an antidote for the poisoning threat, we propose a new domain-specific defense against training data poisoning on DL-based lithographic hotspot detectors. Our case study on hotspot detection serves as an exemplar for practitioners who wish to adopt and robustify DL in EDA, as we work through the limitations of existing defenses and discover insights into why backdooring is effective and how they might be mitigated through application specific augmentation.

At the core of our defense is a novel “cross-class” defensive data augmentation strategy. Training data augmentation (for example, by adding noise to training images) is commonly used in ML to expand the training dataset for higher classification accuracy, but typically preserves class labels (i.e., noisy cat images are still labeled as cats) [25]. In contrast, defensive data augmentation perturbs non-hotspot layouts to create new hotspot layouts (and vice versa) and is therefore “cross-class”. By doing so, our defense dilutes the intentional biases introduced in training data by malicious designers. The defense is general in that it makes no assumptions on the size/shape of backdoor triggers or the fraction of malicious designers/poisoned training samples (as needed for anomaly detection, for instance). In this paper, our contributions are:

  • The first (to our knowledge) domain-specific antidote for training data poisoning on convolutional neural network (CNN) based lithographic hotspot detection. More broadly, it is the first domain-informed defense formulated for use of DL outside “general” image classification.

  • Evaluation of existing defenses against poisoning attacks and their shortcomings when applied to a CAD problem.

  • A trigger-oblivious, defensive data augmentation scheme that produces cross-class training data for diluting malicious bias introduced by undetected poisoned data.

  • Experimental evaluation using two state-of-the-art convolutional neural network (CNN) based lithographic hotspot detector architectures, showing that our defense can reduce the attack success rate from 84% to 0%.

The remainder of this paper is as follows. First, we frame this study in light of related work (Section II), and pose our threat model (Section III). This is followed by our defense (Section IV) and experimental setup (Section V), after which we present experimental results and discussion (Section VI), and conclude the paper (Section VIII).

II Related Work

Our study joins several threads in the literature by examining the intersection of DL in CAD and robustness of DL.

Robustness of DL in CAD The emerging implication of robustness affecting DL in CAD problems is first presented in [16], with the first study of adversarial input perturbations on CNN-based lithographic hotspot detection and of adversarial retraining for improving robustness. This is followed by a related study in [15], which shows that DL-based solutions of CAD problems are not immune to training time attacks, where biases in the poisoned data can be surreptitiously learned. Our work seeks insights at this intersection.

General robustness and security of DL Recent work has widely studied ML under adversarial settings [2], with research on data poisoning highlighting the inherent risks from training DNNs with a poisoned dataset [15, 24, 18, 4], untrustworthy outsourcing of training [10], or transfer learning with a contaminated network model [10]. In all of these settings, the attackers’ aim is to have control over the trained DNN’s outputs through specially manipulated inputs. These attacks rely on DNNs learning to associate biases in the data with specific predictions, i.e., picking up spurious correlations.

There have been several recent attempts [14, 29, 17, 21, 7, 28] at removing backdoors after training. Fine-pruning [14] combines neuron pruning and network fine-tuning to rectify the backdooring misbehavior. Neural Cleanse [29] reverse-engineers a distribution of potential triggers for further backdoor unlearning. In NNoculation [28], Veldanda et al. employ a two-stage mechanism where the first stage retrains a potentially backdoored network with randomly perturbed data to reduce the backdooring effect partially. In the second stage, they use a CycleGAN [32] to generate the backdoor trigger. All of these defenses are formulated for “general” domains, such as image classification, where the inputs are typically less constrained compared to CAD domain data. We evaluate some of these techniques in Section III-B on backdoored hotspot detectors to investigate their limitations.

Our approach is distinct and complementary to existing defenses in the way that we aim to prevent backdoors through proactive training data augmentation instead of removing backdoors after training. Our defensive augmentation is also in line with trigger-oblivious defenses, including Fine-pruning [14], thus distinguishing it from Neural Cleanse [29], ABS [17], and others [21] that resort to reverse-engineering the trigger for backdoor elimination.

DL in Lithography and Data Augmentation In hotspot detection more generally, recent works have proposed strategies to reduce input dimensions while maintaining sufficient information [31, 12, 11]. While recent studies by Reddy et al. have raised concerns about the wider generalizability of hotspot detection performance when training on oft-used benchmarking data [22], understanding the robustness of the proposed techniques remains an open question.

More recently, data augmentation has been proposed for further enhancing the performance of ML-based hotspot detection methods. The authors of [23] proposed database enhancement using synthetic layout patterns. Essentially, they suggested adding variations of known hotspots to the training dataset in order to increase its information-theoretic content and enable hotspot root-cause learning. Similarly, the authors of [3] adopted augmentation methods such as rotation, blurring, and perspective transformation from the field of computer vision and demonstrated their use in hotspot detection. However, unlike general augmentation techniques for images that preserve class labels or target only minority classes [25], we propose an extension and repurposing of [23] for cross-class augmentation explicitly for minimizing the effects of maliciously introduced biases in an adversarial setting.

III Background and Motivation

Our work is motivated by two key concerns: (1) there is a need to improve robustness of DL tools, including those in EDA, and (2) existing defense techniques are limited by challenges in applying them to esoteric application domains (i.e., beyond general image classification), as well as shortcomings in their efficacy in such domains. To understand the need for robustness of DL tools in EDA, we focus on the domain of lithographic hotspot detection, adopting the security-related threat to physical design as posed in [15]. Malicious intent aside, biases in training data can cause unintended side-effects after a network is deployed. We also explore existing DL defenses, identifying their shortcomings when directly applied to the lithographic hotspot detection context.

III-A Threat Model: The Mala Phy De Insider

In this paper, we assume a malicious insider that wishes to sabotage the design flow as our threat model, as established in [15]. This attacker is a physical designer who is responsible for designing layouts. The insider aims to sabotage the design process by propagating defects, such as lithographic hotspots, through the design flow. Knowing that their team is moving towards adopting CNN-based hotspot detection (in lieu of time-consuming simulation-based) methods, the attacker wants to be as stealthy as possible, and thus operates under the following constraints: (1) they do not control the CNN training process, nor control the CNN architecture(s) used, and (2) they cannot add to layouts anything that violates design rules or changes existing functionality. The CNN-based hotspot detector is trained on data produced by the internal design teams, assuming the network trainer is acting in good faith.

The malicious physical designer, however, acting in bad faith (mala fide; hence Mala Phy De, the malicious physical designer), exploits their ability to contribute training data to insert a backdoor into the detector. The backdoor is available at inference time for hiding hotspots; by adding a trigger shape into a hotspot clip (i.e., poisoning the clip), the CNN will be coerced into a false classification. To meet the goal of being stealthy, the attacker poisons clips while satisfying the following requirements: (1) backdoor triggers should not be in contact with existing polygons in the layout clip, as that may change the current circuit functionality, (2) triggers require a minimum spacing from existing polygons to satisfy the PDK ruleset, (3) insertion of backdoor triggers to non-hotspot training clips should not change the clip into a hotspot (as this would result in an untrue label), and (4) the chosen trigger should appear in the original layout dataset, so that it appears innocuous. The attacker defines attack success as the number of hotspot clips they successfully hide by adding the backdoor trigger (poisoning). We define attack success rate as follows:

Definition 1 (Attack Success Rate (ASR))

The percentage of poisoned test hotspot clips that are classified as non-hotspot by a backdoored CNN-based hotspot detector.
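As a minimal sketch of Definition 1, the ASR can be computed from a detector's predictions on poisoned test hotspot clips. The function name and the 0/1 label encoding are assumptions made for illustration; they do not come from the original work.

```python
def attack_success_rate(predictions):
    """Return the ASR (in percent) over poisoned test hotspot clips.

    `predictions` holds the detector's predicted label for each poisoned
    clip whose ground truth is hotspot (assumed encoding: 1 = hotspot,
    0 = non-hotspot).
    """
    if not predictions:
        raise ValueError("need at least one poisoned hotspot clip")
    # A poisoned hotspot is "hidden" when the detector calls it non-hotspot.
    hidden = sum(1 for label in predictions if label == 0)
    return 100.0 * hidden / len(predictions)


# Hypothetical predictions for five poisoned hotspot test clips.
example = [0, 0, 1, 0, 0]
print(attack_success_rate(example))  # 80.0
```

Only poisoned clips whose true label is hotspot enter the denominator; poisoned non-hotspot clips do not affect the ASR.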

III-B On the Application of Existing Defenses in EDA

In the machine learning community, defenses have been proposed against data poisoning/backdooring for image classification problems [14, 29, 28]. In this section, we review defenses, including Neural Cleanse [29] and others [14, 28], and explore the applicability and effectiveness of such mechanisms in the context of lithographic hotspot detection.

Neural Cleanse In Neural Cleanse [29], Wang et al. reverse-engineer a backdoor trigger by perturbing test data, optimizing perturbations to push network predictions toward the “infected” label. Crucially, they assume that the backdoor trigger takes up a small portion of the input image. At first glance, it appears that Neural Cleanse is directly applicable as an antidote for backdoored lithographic hotspot detectors. To that end, we prepare backdoored CNN-based hotspot detectors using the approach in [15] (detailed in Section V) and apply Neural Cleanse to see if the backdoor trigger is correctly recovered. Since Neural Cleanse applies optimization directly on input images, and our CNN-based hotspot detector takes as input the DCT coefficients of layouts converted to binary images, we first need to design a neural network layer for DCT transformation and add it to the detector. Fig. 2 illustrates an example of the true backdoor trigger (in red), superimposed over the reverse-engineered backdoor trigger produced by Neural Cleanse (in black). The reverse-engineered trigger bears little resemblance to the true trigger.

Fig. 2: Backdoor trigger shape reverse engineered by Neural Cleanse [29] (in black) and actual poisoned trigger shape (in red)

It is not surprising that naive Neural Cleanse does not work in the context of lithographic hotspot detection; it is not able to reverse-engineer a trigger that satisfies all domain constraints since the optimization process is not bounded. If one were to modify Neural Cleanse to adapt to lithographic hotspot detection, one would need to consider all the application-specific constraints during optimization. Optimization constraints would include the following:

  • One can only modify image pixel values from 0 to 1 (i.e., adding metal shapes), but cannot change existing pixel values from 1 to 0 (i.e., removing metal shapes).

  • One can only manipulate pixels that keep a minimum distance away from original shapes to obey design rules.

  • Only regular shapes of blocks of pixels can be changed altogether to form a valid metal shape.

Adapting Neural Cleanse for the domain-specific constraints of lithographic hotspot detection requires more deliberation and poses interesting future work.
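To make the first two constraints above concrete, the following sketch checks a candidate edit of a binary layout image against them. It is illustrative only: the function name, the use of Chebyshev distance in pixels as the spacing measure, and the `min_spacing` parameter are assumptions, and the third constraint (regular, block-shaped additions) is not checked.

```python
def violates_constraints(original, candidate, min_spacing):
    """Return the name of the first violated constraint, or None.

    Both images are lists of rows with pixel values 0 (empty) or 1 (metal).
    Assumption: an added pixel must lie more than `min_spacing` pixels
    (Chebyshev distance) from every original metal pixel.
    """
    rows, cols = len(original), len(original[0])
    metal = [(r, c) for r in range(rows) for c in range(cols)
             if original[r][c] == 1]
    for r in range(rows):
        for c in range(cols):
            before, after = original[r][c], candidate[r][c]
            if before == 1 and after == 0:
                return "removes metal"  # 1 -> 0 is not allowed
            if before == 0 and after == 1:
                for mr, mc in metal:
                    if max(abs(r - mr), abs(c - mc)) <= min_spacing:
                        return "too close to existing shape"
    return None


original = [[1, 0, 0, 0, 0],
            [0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0]]
far_addition = [[1, 0, 0, 0, 0],
                [0, 0, 0, 0, 0],
                [0, 0, 0, 0, 1]]
print(violates_constraints(original, far_addition, min_spacing=2))  # None
```

A constrained variant of trigger reverse-engineering would need to enforce checks of this kind inside the optimization loop, which is what makes the adaptation non-trivial.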

Fine-pruning The fine-pruning [14] technique assumes an outsourced training process, after which a backdoored network is returned. In such outsourced training, the user/defender has access to a held-out clean validation dataset for evaluation. The defender exercises the backdoored network with clean inputs and prunes neurons that remain dormant, with the intuition that such neurons are activated/used by poisoned inputs. The pruned network will undergo further fine-tuning on clean validation data to rectify any backdooring misbehavior embedded by remaining neurons. However, our threat model (Section III-A) precludes the use of such techniques; [14] requires access to poison-free validation data, while our dataset, sourced from insiders, has been contaminated. A guaranteed, clean validation dataset is unavailable to the defender.

NNoculation Another technique, NNoculation [28], proposes a two-stage defense mechanism against training data poisoning attacks. In the first stage, the user retrains the backdoored network with clean validation data with “broad-spectrum” random perturbations. Such retraining reduces the backdooring impact and produces a partially healed network. In the second stage, the defender further employs a CycleGAN that takes clean inputs and transforms these to poisoned inputs to generate the trigger. However, in the context of lithographic hotspot detection and the broader EDA domain, input data to the network are often strictly bounded by domain-specific constraints (e.g., design rules). It remains unclear how to design and insert “noisy” perturbations, as in NNoculation, into lithographic layout clips such that the clips still pass DRC. Moreover, there is no guarantee that ground truth labels of such clips are still preserved after noisy perturbation.

To fill the gap between these “general” DL defenses and the need to better incorporate application-specific requirements, we propose a novel antidote in the next section.

IV Proposed Defense

IV-A Defender Assumptions

Being wary of untrustworthy insiders, legitimate designers (in this work, we refer to them also as defenders) wish to proactively defend against training data poisoning attacks. However, their knowledge is limited. They are unaware as to which designer is malicious, so cannot exclude their contributions. They are also unaware of what the backdoor trigger shape is. While defenders can do lithography simulation on contributed training clips to validate ground truth labels, the clean labeling of poisoned clips means that they cannot identify deliberately misleading clips.

IV-B The Antidote for Training Data Poisoning

Hence, we propose defensive data augmentation as a defense against untrustworthy data sources and poisoning. Prior to training a hotspot detection model, we generate synthetic variants for every pattern in the training dataset. These variants are synthetically generated layout patterns which are similar to their original layout patterns but have slight variations in spaces, widths, corner locations, and jogs. An example of an original training pattern and its variants is shown in Fig. 7. As found in prior studies [30], -level variations in patterns can alter their printability. Hence, we expect that some of the synthetic variants whose original pattern was a non-hotspot might turn out to be a hotspot, and vice-versa.

If the original training dataset has poisoned non-hotspot patterns, some of their synthetic variants may turn out to be hotspots, i.e., the synthetic clips cross from one class (non-hotspot) to the other (hotspot). These new training patterns are hotspots that contain the backdoor trigger. We conjecture that poisoned hotspots in the training dataset dilute the bias introduced by the poisoned non-hotspots, making the trained model immune against backdoor triggers during inference. The defender need not identify the attacker’s trigger. Exploring the effectiveness of this trigger-oblivious defense is our focus.
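The dilution argument can be illustrated with simple arithmetic: the more trigger-bearing hotspots appear in the training data, the less the trigger predicts the non-hotspot label. In the sketch below, the count of 2194 poisoned non-hotspot clips is the one reported in the experimental setup, whereas the number of trigger-bearing hotspot variants is purely hypothetical.

```python
def trigger_nonhotspot_fraction(poisoned_nonhotspots, poisoned_hotspots):
    """Fraction of trigger-bearing training clips labeled non-hotspot."""
    total = poisoned_nonhotspots + poisoned_hotspots
    if total == 0:
        raise ValueError("no trigger-bearing clips")
    return poisoned_nonhotspots / total


# Without augmentation, every clip containing the trigger is a non-hotspot.
print(trigger_nonhotspot_fraction(2194, 0))     # 1.0
# Hypothetical: augmentation adds 2194 hotspot variants carrying the trigger.
print(trigger_nonhotspot_fraction(2194, 2194))  # 0.5
```

A fraction of 1.0 means the trigger is a perfect predictor of the non-hotspot class in the training data; values closer to 0.5 mean it carries little class information.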

Fig. 7: (a) Original training pattern, (b-d) Example variants of original pattern. Polygons with changes are highlighted with bolder edges.

IV-C Defensive Data Augmentation

Input: An original layout pattern, variant count.
Result: Synthetic variants of the original pattern.

def GenVariants(OriginalLayoutPattern, variantCount):
    Variants = []
    for i in range(variantCount):
        Polygons = Copy(OriginalLayoutPattern.Polygons)
        # Identify POIs
        POIs = Polygons.Intersecting(ROI)
        POIs += Random(Polygons.NotIntersecting(ROI), additionalPolygonCount)
        # Add variation into POIs
        for polygon in POIs:
            # Vary a fixed number of edges (edgeCount) per polygon
            for j in range(edgeCount):
                edge = GetRandomEdge(polygon)
                dist = SamplePDF()
                polygon = polygon.MoveEdge(edge, dist)
        Variants.append(Polygons)
    # Return patterns with modified polygons
    return Variants

Algorithm 1 Synthetic pattern generation

To generate synthetic variants, we employ a synthetic pattern generation algorithm, a derivative of the algorithm in [23]. The pseudocode is shown in Algorithm 1. We isolate the polygons of interest (POIs) and then vary their features. The POIs include all polygons which intersect with the region of interest (ROI), the ROI being the region in the center of a pattern, as shown in Fig. 7. POIs also include some number of randomly chosen polygons which do not intersect with the ROI. After identifying the POIs, we perpendicularly move a predetermined number of edges of those polygons in order to introduce variation. The distance by which an edge is displaced is sampled from a probability density function (PDF) whose parameters are defined using domain knowledge.
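The procedure described above can be sketched in runnable form under simplifying assumptions. Here polygons are restricted to axis-aligned rectangles `(x0, y0, x1, y1)`, the displacement PDF is taken to be a zero-mean Gaussian, and all parameter names (`extra_polygons`, `edges_per_polygon`, `sigma`, `seed`) are hypothetical; the original work does not specify these details.

```python
import random


def intersects(a, b):
    """True if axis-aligned rectangles a and b (x0, y0, x1, y1) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def move_edge(rect, edge, dist):
    """Move one edge (0=left, 1=bottom, 2=right, 3=top) perpendicularly."""
    moved = list(rect)
    moved[edge] += dist
    # Reject moves that would collapse or invert the rectangle.
    if moved[0] >= moved[2] or moved[1] >= moved[3]:
        return rect
    return tuple(moved)


def gen_variants(pattern, roi, variant_count, extra_polygons=1,
                 edges_per_polygon=1, sigma=2.0, seed=0):
    """Return `variant_count` perturbed copies of `pattern` (a list of rects)."""
    rng = random.Random(seed)
    variants = []
    for _ in range(variant_count):
        # POIs: polygons touching the ROI plus a few random others.
        inside = [i for i, p in enumerate(pattern) if intersects(p, roi)]
        outside = [i for i in range(len(pattern)) if i not in inside]
        pois = inside + rng.sample(outside, min(extra_polygons, len(outside)))
        variant = list(pattern)
        for i in pois:
            for _edge_round in range(edges_per_polygon):
                edge = rng.randrange(4)
                dist = round(rng.gauss(0.0, sigma))  # assumed Gaussian PDF
                variant[i] = move_edge(variant[i], edge, dist)
        variants.append(variant)
    return variants


pattern = [(100, 100, 200, 150), (0, 0, 40, 40), (300, 300, 360, 340)]
roi = (90, 90, 210, 160)
for variant in gen_variants(pattern, roi, variant_count=3):
    print(variant)
```

In a real flow, each variant would still need to pass DRC and be labeled by lithography simulation before it is added to the training set; neither step is modeled here.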

In [23], synthetic variations of known (training) hotspots were used for augmentation. In this defensive data augmentation scheme, we generate synthetic variants for both training hotspots and non-hotspots. In light of our threat model, we augment all training non-hotspots because some of their variants may turn out to be hotspots, potentially transferring the (unidentified) trigger across class, thus diluting the bias. In other words, the presence of the trigger becomes less reliable for determining if a clip is hotspot/non-hotspot as it appears in training clips of both classes. Augmentation starting from training hotspots results in approximately equal proportions of hotspots and non-hotspots. Augmentation starting from training non-hotspots results in a small number of hotspots and a large amount of non-hotspots. Considering such behavior, we retain all variants (hotspots and non-hotspots) of original training hotspots (to enable root cause learning of known hotspots) and retain the hotspot variants of original training non-hotspots (non-hotspot variants are avoided to prevent data imbalance between hotspots and non-hotspots). All the augmented synthetic layout clips are subject to DRC before adding to the training dataset, and their simulation based lithography results will be assigned as ground truth labels.
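The retention rule described above can be written as a small filter. This is a sketch only: the dictionary field names (`source_label`, `label`, `drc_clean`) are hypothetical, and the labels are assumed to have already been produced by lithography simulation.

```python
def select_augmented(variants):
    """Apply the retention rule to labeled synthetic variants.

    Each variant is a dict with (hypothetical) keys:
      'source_label': label of the original clip
      'label':        simulated label of the variant
      'drc_clean':    True if the variant passed design rule checks
    Labels are the strings 'hotspot' and 'non-hotspot'.
    """
    kept = []
    for v in variants:
        if not v["drc_clean"]:
            continue  # DRC-violating variants are never added
        if v["source_label"] == "hotspot":
            kept.append(v)  # keep hotspot and non-hotspot variants alike
        elif v["label"] == "hotspot":
            kept.append(v)  # cross-class variant of a non-hotspot
        # Non-hotspot variants of non-hotspots are dropped to limit imbalance.
    return kept


candidates = [
    {"source_label": "hotspot", "label": "non-hotspot", "drc_clean": True},
    {"source_label": "non-hotspot", "label": "hotspot", "drc_clean": True},
    {"source_label": "non-hotspot", "label": "non-hotspot", "drc_clean": True},
    {"source_label": "hotspot", "label": "hotspot", "drc_clean": False},
]
print(len(select_augmented(candidates)))  # 2
```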

V Experimental Setup

V-A Experimental Aims and Platforms

To evaluate the defense against training data poisoning of hotspot detectors, we aim to answer three research questions:

  1. Does our defense prevent the poisoning attack?

  2. How much data augmentation is required?

  3. Does the relative complexity of the CNN architecture affect the attack/defense effectiveness?

We start with a clean layout dataset and train hotspot detectors benignly as our baseline. We poison the dataset and vary the amount of defensive augmentation. Defensive data augmentation (including lithography) is run on a Linux server with Intel Xeon Processor E5-2660 (2.6 GHz). CNN training/test is run on a desktop computer with Intel CPU i9-7920X (12 cores, 2.90 GHz) and single Nvidia GeForce GTX 1080 Ti GPU.

V-B Layout Dataset

We use a layout clip dataset prepared from the synthesis, placement, and routing of an open source RTL design using the 45 nm FreePDK [6], as described in [23]. We determine the ground truth label of each layout clip using lithography simulation (Mentor Calibre [9]). A layout clip (1110 nm × 1110 nm) contains a hotspot if 30% of the area of any error marker, as produced by simulation, intersects with the region of interest (195 nm × 195 nm) in the center of each clip. After simulation, we split the clips into a roughly 50/50 training/test split, resulting in 19050 clean non-hotspot training clips, 950 clean hotspot training clips, 19001 clean non-hotspot test clips, and 999 clean hotspot test clips.
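The labeling rule can be sketched as a rectangle-overlap test. Two assumptions are made here and are not confirmed by the text: error markers are modeled as axis-aligned rectangles `(x0, y0, x1, y1)` in nm, and "30% of the area" is read as "at least 30%".

```python
CLIP_NM = 1110.0  # clip side length
ROI_NM = 195.0    # region-of-interest side length


def centered_roi(clip=CLIP_NM, roi=ROI_NM):
    """ROI rectangle centered in a square clip."""
    lo = (clip - roi) / 2.0
    return (lo, lo, lo + roi, lo + roi)


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def is_hotspot(error_markers, threshold=0.30):
    """True if any marker has at least `threshold` of its area in the ROI."""
    roi = centered_roi()
    for m in error_markers:
        area = (m[2] - m[0]) * (m[3] - m[1])
        if area > 0 and overlap_area(m, roi) / area >= threshold:
            return True
    return False


print(centered_roi())                       # (457.5, 457.5, 652.5, 652.5)
print(is_hotspot([(400, 500, 500, 520)]))   # True  (42.5% of marker in ROI)
print(is_hotspot([(300, 500, 500, 520)]))   # False (21.25% of marker in ROI)
```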

V-C Poisoned Data Preparation

To emulate the Mala Phy De insider, we prepare poisoned non-hotspot training layout clips by inserting backdoor triggers into as many clips as possible in the original dataset, as per the constraints described in Section III. The triggers are inserted into a predetermined position in each clip. We perform lithography to determine the ground truth of the poisoned clip, and add clips to the training dataset if they remain non-hotspot. This renders 2194 poisoned non-hotspot training clips.

We apply the same poisoning, DRC check, and simulation process on hotspot and non-hotspot test clips to produce poisoned test data, used to measure the attack success rate. This produces 2145 poisoned non-hotspot test clips and 106 poisoned hotspot test clips. Fig. 10 shows an example of a clean non-hotspot clip and its poisoned counterpart.

Fig. 10: (a) An example of a clean training non-hotspot layout clip, (b) corresponding poisoned clip with a backdoor trigger (in red)
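The training-clip poisoning step can be sketched as a spacing check followed by a label check. The rectangle representation, the `min_spacing` value, and the `is_hotspot` callback (standing in for lithography simulation) are assumptions for illustration only.

```python
import math


def rect_gap(a, b):
    """Distance between two axis-aligned rectangles (0 if they touch)."""
    dx = max(0.0, b[0] - a[2], a[0] - b[2])
    dy = max(0.0, b[1] - a[3], a[1] - b[3])
    return math.hypot(dx, dy)


def try_poison(clip, trigger, min_spacing, is_hotspot):
    """Return the poisoned training clip, or None if it must be discarded.

    clip:       list of rectangles (x0, y0, x1, y1) in a non-hotspot clip
    trigger:    trigger rectangle at its predetermined position
    is_hotspot: callable standing in for lithography simulation
    """
    # Constraints (1) and (2): no contact, minimum spacing to every polygon.
    if any(rect_gap(trigger, poly) < min_spacing for poly in clip):
        return None
    poisoned = clip + [trigger]
    # Constraint (3): the poisoned clip must remain a true non-hotspot.
    if is_hotspot(poisoned):
        return None
    return poisoned


clip = [(0, 0, 100, 50)]
never_hotspot = lambda polygons: False  # placeholder for simulation
print(try_poison(clip, (0, 120, 40, 160), 65, never_hotspot) is not None)  # True
print(try_poison(clip, (0, 100, 40, 140), 65, never_hotspot))              # None
```

For test clips the same spacing check applies, but hotspot clips are kept as hotspots so that the attack success rate can be measured on them.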

V-D GDSII Preprocessing

Using the approach of [31], also used in [15], we convert layout clips in GDSII format to images of size 1110 × 1110 pixels. Metal polygons are represented by blocks of image pixels with intensity of 255 and empty regions are represented by 0-valued pixels; this forms a binary-valued image.

Because CNN training using large images is compute-intensive, we perform discrete cosine transformation (DCT) (as in [31, 16]) on non-overlapping sub-images, by sliding a window of size 111 × 111 over the layout clip with stride 111 in horizontal and vertical directions. This produces corresponding DCT coefficients of size 10 × 10 × (111 × 111). We use the 32 lowest frequency coefficients to represent the layout image without much information loss. The resulting dimension of the training/test data has shape of 10 × 10 × 32; we use this as the input for our CNN-based hotspot detectors.
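A minimal pure-Python sketch of this block-wise DCT feature extraction follows. It assumes an orthonormal 2D DCT-II and selects low frequencies by increasing `u + v`, a simplified zigzag order; the exact coefficient ordering used in [31] is not stated here, so that choice is an assumption. With `block=111` and `keep=32` on a 1110 × 1110 image the result would have shape 10 × 10 × 32, but the demo below uses a tiny image because pure Python is slow at full size.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: entry [k][i] for frequency k, sample i."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def low_frequency_pairs(count):
    """First `count` (u, v) pairs ordered by increasing u + v (assumed order)."""
    pairs, s = [], 0
    while len(pairs) < count:
        pairs.extend((u, s - u) for u in range(s + 1))
        s += 1
    return pairs[:count]


def block_dct_features(image, block, keep):
    """Tile a square image into block x block windows (stride = block) and
    keep `keep` low-frequency DCT coefficients per window."""
    basis = dct_matrix(block)
    freqs = low_frequency_pairs(keep)
    tiles = len(image) // block
    features = []
    for ty in range(tiles):
        row_feats = []
        for tx in range(tiles):
            tile = [image[ty * block + y][tx * block:(tx + 1) * block]
                    for y in range(block)]
            coeffs = []
            for u, v in freqs:
                total = 0.0
                for y in range(block):
                    inner = sum(basis[v][x] * tile[y][x] for x in range(block))
                    total += basis[u][y] * inner
                coeffs.append(total)
            row_feats.append(coeffs)
        features.append(row_feats)
    return features


# Tiny demo: an 8 x 8 all-metal image, 4 x 4 windows, 3 coefficients each.
demo = [[255] * 8 for _ in range(8)]
feats = block_dct_features(demo, block=4, keep=3)
print(len(feats), len(feats[0]), len(feats[0][0]))  # 2 2 3
```

For a constant window the only non-zero coefficient is the DC term, which under the orthonormal convention equals the pixel value times the window side length.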

V-E Network Architectures

To investigate how network architecture complexity might influence the efficacy of our defense, we train networks based on network architectures A and B, shown in Table I and Table II, respectively. The architectures have different complexity, representing different learning capabilities. A is a 9-layer CNN with four convolutional layers. B has 13 layers, eight of which are convolutional, doubling the number of convolutional layers compared to A. We use these architectures as they have high accuracy in layout hotspot detection [31].

| Layer | Kernel Size | Stride | Activation | Output Size |
| --- | --- | --- | --- | --- |
| input | - | - | - | (10, 10, 32) |
| conv1_1 | 3 | 1 | ReLU | (10, 10, 16) |
| conv1_2 | 3 | 1 | ReLU | (10, 10, 16) |
| maxpooling1 | 2 | 2 | - | (5, 5, 16) |
| conv2_1 | 3 | 1 | ReLU | (5, 5, 32) |
| conv2_2 | 3 | 1 | ReLU | (5, 5, 32) |
| maxpooling2 | 2 | 2 | - | (2, 2, 32) |
| fc1 | - | - | ReLU | 250 |
| fc2 | - | - | Softmax | 2 |

TABLE I: Network Architecture A

| Layer | Kernel Size | Stride | Activation | Output Size |
| --- | --- | --- | --- | --- |
| input | - | - | - | (10, 10, 36) |
| conv1_1 | 3 | 1 | ReLU | (10, 10, 32) |
| conv1_2 | 3 | 1 | ReLU | (10, 10, 32) |
| conv1_3 | 3 | 1 | ReLU | (10, 10, 32) |
| conv1_4 | 3 | 1 | ReLU | (10, 10, 32) |
| maxpooling1 | 2 | 2 | - | (5, 5, 32) |
| conv2_1 | 3 | 1 | ReLU | (5, 5, 64) |
| conv2_2 | 3 | 1 | ReLU | (5, 5, 64) |
| conv2_3 | 3 | 1 | ReLU | (5, 5, 64) |
| conv2_4 | 3 | 1 | ReLU | (5, 5, 64) |
| maxpooling2 | 2 | 2 | - | (2, 2, 64) |
| fc1 | - | - | ReLU | 250 |
| fc2 | - | - | Softmax | 2 |

TABLE II: Network Architecture B
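The output sizes in both tables can be reproduced with a short shape-propagation sketch. It assumes "same" padding for the stride-1 convolutions and floor division for the 2 × 2 pooling layers; these settings are inferred from the listed output sizes, not stated in the text.

```python
def propagate(input_shape, layers):
    """Return the (height, width, channels) shape after each layer.

    layers: list of ("conv", filters) or ("pool", size) entries.
    Assumes 'same' padding for convolutions and stride == size for pooling.
    """
    h, w, c = input_shape
    shapes = []
    for kind, arg in layers:
        if kind == "conv":
            c = arg                    # spatial size unchanged
        elif kind == "pool":
            h, w = h // arg, w // arg  # floor division, e.g. 5 -> 2
        else:
            raise ValueError("unknown layer kind: " + kind)
        shapes.append((h, w, c))
    return shapes


ARCH_A = [("conv", 16), ("conv", 16), ("pool", 2),
          ("conv", 32), ("conv", 32), ("pool", 2)]
ARCH_B = [("conv", 32)] * 4 + [("pool", 2)] + [("conv", 64)] * 4 + [("pool", 2)]

final_a = propagate((10, 10, 32), ARCH_A)[-1]
final_b = propagate((10, 10, 36), ARCH_B)[-1]
print(final_a, final_a[0] * final_a[1] * final_a[2])  # (2, 2, 32) 128
print(final_b, final_b[0] * final_b[1] * final_b[2])  # (2, 2, 64) 256
```

The flattened sizes (128 and 256) are the inputs to the first fully connected layer of 250 units in each architecture.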

V-F Training Procedure

Training and test are implemented with Keras, and training hyperparameters are shown in Table III. Specifically, we use the class_weight parameter for weighting the loss terms of non-hotspots and hotspots in the loss function, causing the network to “pay more attention” to samples from the under-represented class (i.e., hotspots). This technique is useful if the training dataset is highly imbalanced. Since we are in favor of high hotspot detection accuracy as well as balanced overall accuracy, we manually pick the network with the highest overall classification accuracy among those that have 90% or higher hotspot detection rate for our experiments to evaluate defense success.
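The two ideas in this paragraph, class-weighted loss and the network selection rule, can be sketched as follows. This is a simplified illustration rather than Keras's implementation: the function names, the probability clipping constant, and the candidate tuples are all hypothetical.

```python
import math


def weighted_bce(y_true, p_hotspot, class_weight):
    """Mean binary cross-entropy with a per-class weight on each sample.

    y_true:       labels (0 = non-hotspot, 1 = hotspot)
    p_hotspot:    predicted hotspot probabilities
    class_weight: dict mapping each label to its loss weight
    """
    total = 0.0
    for y, p in zip(y_true, p_hotspot):
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += class_weight[y] * loss
    return total / len(y_true)


def pick_network(candidates, min_hotspot_rate=0.90):
    """Pick the candidate with the highest overall accuracy among those
    whose hotspot detection rate meets the threshold.

    candidates: list of (name, overall_accuracy, hotspot_detection_rate)
    """
    eligible = [c for c in candidates if c[2] >= min_hotspot_rate]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c[1])


# Hypothetical checkpoints from several training runs.
runs = [("run1", 0.85, 0.88), ("run2", 0.81, 0.90), ("run3", 0.80, 0.93)]
print(pick_network(runs))  # ('run2', 0.81, 0.9)
```

With a hotspot weight greater than the non-hotspot weight, a misclassified hotspot contributes proportionally more to the loss, which is the effect the `class_weight` parameter is used for here.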

| Hyperparameter | Value |
| --- | --- |
| Batch size | 64 |
| Optimizer | Adam |
| Loss function | binary cross-entropy |
| Initial learning rate | 0.001 |
| Minimum learning rate | 0.00001 |
| Learning rate reduce factor | 0.3 |
| Learning rate patience | 3 |
| Early stopping monitor | validation loss |
| Early stopping patience | 10 |
| Max training epochs | |
| Class weight for training loss | 2 22 |

TABLE III: Hyperparameter settings used for training

V-G Experiments for Defense Evaluation

V-G1 Training of Baseline Hotspot Detectors

For context, we train two hotspot detectors based on architectures A and B using the original, clean dataset; we refer to these as the clean networks. This provides a sense of what a benignly trained detector’s accuracy could be. We also train two hotspot detectors with the full set of poisoned training data; we refer to these as the backdoored networks. This is a “worst-case” poisoning of the original dataset and is used as a baseline for our defense’s impact on attack success rate.

V-G2 Training with Defensive Data Augmentation

To evaluate our defense, we perform data augmentation as outlined in Section IV. We vary the number of synthetic clips produced from each training clip (representing different levels of “effort”) and train various defended hotspot detectors (based on network architectures A and B) on the augmented datasets, measuring the attack success rate (Definition 1) and changes to accuracy on clean and poisoned test data.

VI Experimental Results

VI-A Baseline Hotspot Detectors

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.80           0.20         0.87           0.13
hotspot          0.10           0.90         0.18           0.82
TABLE IV: Confusion Matrix of (Clean) Network

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.81           0.19         0.83           0.17
hotspot          0.10           0.90         0.10           0.90
TABLE V: Confusion Matrix of (Clean) Network

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.81           0.19         0.99           0.01
hotspot          0.11           0.89         0.81           0.19
TABLE VI: Confusion Matrix of (Backdoored) Network

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.81           0.19         1.0            0.0
hotspot          0.09           0.91         0.84           0.16
TABLE VII: Confusion Matrix of (Backdoored) Network

Fig. 11: Relative Attack Success Rate (R-ASR) after defensive augmentation, varying from 3 to 500 synthetic clips augmented per training clip. Charts use a logarithmic scale on the x-axis.
Fig. 12: Effect on Accuracy (first architecture in Table XI). Charts use a logarithmic scale on the x-axis.
Fig. 13: Effect on Accuracy (second architecture in Table XI). Charts use a logarithmic scale on the x-axis.

Table IV and Table V present the confusion matrices of the two clean networks, both of which achieve 90% accuracy in classifying hotspots and roughly 80% for non-hotspots. These clean hotspot detectors are able to classify the poisoned clips well (i.e., they are not distracted by the trigger). For the network in Table IV, there is a small drop in accuracy on poisoned hotspot clips (82%) compared with clean hotspot clips (90%). This is expected because the poisoned clips carry a subtle bias that differs somewhat from the clean data, and the benignly trained CNNs never see this bias during training.

Table VI and Table VII show that the attacker’s training data poisoning fools the CNNs on poisoned test hotspot clips in more than 80% of the cases (81% and 84%, respectively), while accuracy on clean data changes by about 1%. The attack success rate is higher for the network in Table VII than for the network in Table VI, suggesting that a more complex network is better at picking up the malicious bias introduced by poisoned data.

Prior research on lithographic hotspot detection has reported classification accuracies between 89% and 99% (e.g., [23, 31]). However, these reported accuracies are not directly comparable with ours: as shown in [22], the test dataset used in [31] is easy to classify, and [23] adopts conventional ML techniques rather than the DL techniques that we use. Different datasets and classifiers naturally yield different classification accuracies. Thus, it is more informative to focus on the change in accuracy between our clean, backdoored, and defended networks.

VI-B Defense Results

VI-B1 Augmentation Efficacy

Using defensive data augmentation, we produce various numbers of synthetic clips for each training clip, ranging from a “low-effort” 3 synthetic variants per clip to a “high-effort” 500 synthetic variants per clip. A fraction of the synthetic clips are dropped because they fail DRC. The remaining valid clips then undergo lithography simulation to determine their ground truth labels. We tabulate the number of valid clips produced after generating 500 clips per training clip in Table VIII. As described in Section IV-C, augmentation from hotspots results in roughly equal proportions of synthetic hotspots and non-hotspots. Augmentation from non-hotspots results in a small number of hotspots and a large number of non-hotspots (i.e., 0.4% of synthetic clips cross classes).
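The generate, check, and label flow described above can be sketched as follows. This is a schematic only: the actual perturbation, DRC and lithography simulation steps are tool-based and described in Section IV, so here they are passed in as hypothetical callables (`perturb`, `passes_drc`, `is_hotspot`), and the toy usage treats a “clip” as a single wire width purely for illustration.

```python
def augment_clip(clip, n_variants, perturb, passes_drc, is_hotspot):
    """Generate up to n_variants labeled synthetic clips from one training clip.

    perturb, passes_drc and is_hotspot are caller-supplied stand-ins
    (assumptions) for the perturbation step, the design rule check and
    the lithography simulation that provides the ground truth label.
    """
    labeled = []
    for i in range(n_variants):
        candidate = perturb(clip, i)
        if not passes_drc(candidate):
            continue  # invalid layouts are dropped
        labeled.append((candidate, is_hotspot(candidate)))
    return labeled

# Toy stand-ins: a "clip" is just a wire width.
variants = augment_clip(
    clip=4,
    n_variants=5,
    perturb=lambda width, i: width + i - 2,
    passes_drc=lambda width: width >= 3,
    is_hotspot=lambda width: width < 5,
)
```

Note that the label of each surviving variant comes from the simulation stand-in rather than from the parent clip, which is what allows synthetic clips to cross classes.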

Preparation of a synthetic clip requires 893.58 ms (single-threaded execution), so the effort (measured by execution time for augmentation) increases linearly with the number of synthetic clips augmented per training clip and is inversely proportional to the number of parallel threads in execution.
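A back-of-the-envelope estimate of this cost can be sketched as below. The per-clip time is taken from the text; the assumptions are that every original training clip listed in Table VIII is augmented, that the speedup from parallel threads is ideal, and the function name is hypothetical.

```python
MS_PER_CLIP = 893.58  # single-threaded preparation time per synthetic clip (from the text)

def augmentation_hours(n_training_clips, variants_per_clip, n_threads=1):
    """Estimated wall-clock hours, assuming ideal (linear) parallel speedup."""
    total_ms = n_training_clips * variants_per_clip * MS_PER_CLIP
    return total_ms / n_threads / 3_600_000

# Original training clips listed in Table VIII (assumed to all be augmented).
n_clips = 950 + 19050 + 2194

low_effort = augmentation_hours(n_clips, 3)
high_effort = augmentation_hours(n_clips, 500)
high_effort_32 = augmentation_hours(n_clips, 500, n_threads=32)
```

Under these assumptions the high-effort setting costs 500/3 times the low-effort one, and the thread count divides the estimate directly.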

                                 Original     After Augmentation
                                 # clips      hotspot       non-hotspot
Clean training hotspot           950          + 213302      + 249416
Clean training non-hotspot       19050        + 36257
Poisoned training non-hotspot    2194         + 1285
TABLE VIII: No. of Valid Synthetic Clips from Defensive Augmentation

VI-B2 Defense Efficacy

Table XI presents the results from training and evaluating defended hotspot detectors for both network architectures. We report the accuracy on clean test data and poisoned test data, together with the attack success rate (ASR, Definition 1) and the relative attack success rate:

Definition 2 (Relative Attack Success Rate (R-ASR))

R-ASR is the attack success rate normalized against the attack success rate of the backdoored baseline network of the same architecture (Table VI and Table VII, respectively).
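As a worked example of this normalization, the sketch below recomputes two R-ASR entries of Table XI. It assumes that ASR equals one minus the accuracy on poisoned hotspot clips (P-HS), which is consistent with the ASR and P-HS columns of Table XI; the function names are hypothetical.

```python
def attack_success_rate(p_hs_accuracy):
    """ASR as the fraction of poisoned hotspot clips predicted as non-hotspot.

    Assumes ASR = 1 - (accuracy on poisoned hotspots), as the Table XI columns suggest.
    """
    return 1.0 - p_hs_accuracy

def relative_asr(asr, baseline_asr):
    """ASR normalized against the backdoored baseline of the same architecture."""
    return asr / baseline_asr

# First architecture in Table XI: P-HS accuracy 0.19 (no augmentation), 0.32 (3 variants).
r_asr_first = relative_asr(attack_success_rate(0.32), attack_success_rate(0.19))

# Second architecture in Table XI: P-HS accuracy 0.16 (no augmentation), 0.62 (3 variants).
r_asr_second = relative_asr(attack_success_rate(0.62), attack_success_rate(0.16))
```

Rounded to two decimals, these reproduce the R-ASR values listed for 3 synthetic variants per training clip.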

N.B.: For Fig. 12, Fig. 13, and Table XI: C-NH = clean non-hotspot, C-HS = clean hotspot, P-NH = poisoned non-hotspot, P-HS = poisoned hotspot.

We illustrate the change in R-ASR in Fig. 11, and the change in accuracy for the networks based on the two architectures in Fig. 12 and Fig. 13. In our “high-effort” scenario, where we set the number of synthetic clips generated per training clip to 500, defensive data augmentation negates the malicious bias. We tabulate the confusion matrices of the defended hotspot detectors trained on this augmented dataset in Table IX and Table X. Both defended networks exhibit high accuracy on the poisoned hotspot test clips: unlike the backdoored networks, they are not fooled by the trigger. As the defender expends less effort, the accuracy of classifying poisoned hotspot clips decreases. That said, even with only 3 synthetic variants augmented per training clip, the training data poisoning attack begins to falter: the R-ASR drops by 16% for the first architecture in Table XI and by 55% for the second. In all cases, the accuracy on clean data is preserved, if not improved, compared to the baseline networks.

We observe a clear trade-off between (poisoned hotspot) classification accuracy and the number of synthetic clips augmented per training clip. The number of synthetic clips represents part of the total defense cost, along with the extra cost of defensive training. Fig. 12 and Table XI show that, for the first architecture, poisoned hotspot accuracy rises from 19% to 92% when going from no augmentation to 50 synthetic clips per training clip, and reaches 97% when expanding from 50 to 500 clips. This suggests that the effort spent on the initial 50 synthetic clips contributes a 73% accuracy gain, while nine times that effort (augmenting a further 450 synthetic clips) pushes the accuracy up by only another 5%. A similar trade-off between accuracy and augmentation cost for the second architecture is shown in Fig. 13 and Table XI. The first 25 synthetic clips augmented per training clip account for a 79% accuracy boost (16% to 95%), and the following 475 synthetic clips increase the accuracy by a further 4% (95% to 99%).
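The diminishing returns can be quantified as accuracy gain per added synthetic variant. The sketch below does this for the poisoned hotspot (P-HS) accuracy of the first architecture, using the values listed in Table XI; the helper name is hypothetical and the per-variant normalization is only one of several reasonable ways to express the trade-off.

```python
# (synthetic variants per training clip, P-HS accuracy), first architecture in Table XI
P_HS = [(0, 0.19), (3, 0.32), (6, 0.54), (12, 0.62), (25, 0.77), (50, 0.92),
        (100, 0.89), (200, 0.98), (300, 0.96), (400, 0.96), (500, 0.97)]

def gain_per_variant(points, start, end):
    """Accuracy gain in percentage points per added variant between two effort levels."""
    acc = dict(points)
    return 100.0 * (acc[end] - acc[start]) / (end - start)

early = gain_per_variant(P_HS, 0, 50)    # first 50 variants: 73 points over 50 variants
late = gain_per_variant(P_HS, 50, 500)   # next 450 variants: 5 points over 450 variants
```

Each of the first 50 variants is worth roughly 1.5 percentage points, whereas each of the remaining 450 contributes on the order of a hundredth of a point.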

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.86           0.14         0.92           0.08
hotspot          0.10           0.90         0.03           0.97
TABLE IX: Confusion Matrix of (Defended) Network

                         clean data                  poisoned data
Condition        non-hotspot    hotspot      non-hotspot    hotspot
non-hotspot      0.92           0.08         0.96           0.04
hotspot          0.05           0.95         0.01           0.99
TABLE X: Confusion Matrix of (Defended) Network

Synthetic variants     First architecture                         Second architecture
per training clip      C-NH   C-HS   P-NH   P-HS   ASR    R-ASR    C-NH   C-HS   P-NH   P-HS   ASR    R-ASR
0                      0.81   0.89   0.99   0.19   0.81   1.00     0.81   0.91   1      0.16   0.84   1.00
3                      0.8    0.92   0.98   0.32   0.68   0.84     0.79   0.91   0.95   0.62   0.38   0.45
6                      0.82   0.89   0.97   0.54   0.46   0.57     0.8    0.9    0.93   0.77   0.23   0.27
12                     0.79   0.9    0.96   0.62   0.38   0.47     0.92   0.9    0.98   0.82   0.18   0.21
25                     0.84   0.89   0.94   0.77   0.23   0.28     0.93   0.92   0.96   0.95   0.05   0.06
50                     0.85   0.9    0.92   0.92   0.08   0.10     0.94   0.92   0.97   0.96   0.04   0.05
100                    0.87   0.91   0.94   0.89   0.11   0.14     0.93   0.94   0.97   0.96   0.04   0.05
200                    0.85   0.89   0.91   0.98   0.02   0.02     0.93   0.93   0.97   0.94   0.06   0.07
300                    0.84   0.9    0.91   0.96   0.04   0.05     0.94   0.94   0.97   0.96   0.04   0.05
400                    0.86   0.89   0.91   0.96   0.04   0.05     0.93   0.95   0.97   0.98   0.02   0.02
500                    0.86   0.9    0.92   0.97   0.03   0.04     0.92   0.95   0.96   0.99   0.01   0.01
TABLE XI: Accuracy and Attack Success/Relative Attack Success after Training with Defensively Augmented Datasets

VII Discussion

VII-A What Does the Network Learn?

Our results suggest that all six networks (clean, backdoored, and defended, for both architectures) successfully learn the genuine features of hotspots/non-hotspots, as demonstrated by their clean data classification accuracy. The backdoored networks show that DNNs have surplus learning capacity to grasp the backdoor trigger on a layout clip and, decisively, to prioritize the presence of the trigger as an indication of a non-hotspot over the actual hotspot or non-hotspot features. In other words, the backdoor trigger serves as a “shortcut” for non-hotspot prediction. The defended networks further demonstrate the abundant learning capacity of DNNs, as both biased and unbiased data are learned and correctly classified, with increased classification accuracy on clean and poisoned data. This suggests that DNNs learn extra details of hotspot/non-hotspot features.

We investigate the networks’ “interpretation” of hotspots/non-hotspots by visualizing the neuron activations of the penultimate fully-connected layer (before Softmax). We abstract and visualize the high-dimensional data using 2D t-SNE plots [19]. We depict the clean network in Fig. 17(a), the backdoored network in Fig. 17(b), and the defended network in Fig. 17(c). In Fig. 17(a), hotspots and non-hotspots spread roughly onto two sides, and within each side, clean and poisoned (non-)hotspots mix. This suggests that a network benignly trained on clean data is able to classify layout clips despite the bias presented by the trigger. In Fig. 17(b), poisoned hotspots cluster with clean/poisoned non-hotspots, on the opposite side from clean hotspots, demonstrating the “shortcut” effect of the trigger learned by a backdoored network. In Fig. 17(c), we see two separated groups of hotspots and non-hotspots, with clean and poisoned clips highly interwoven within each cluster. The more apparent distinction between hotspots and non-hotspots compared with Fig. 17(a) reflects the higher classification accuracy of the defended network compared with the clean network.

For additional insight, we apply t-SNE to the input data of the networks, as shown in Fig. 18. There are no clearly visible separations between clean/poisoned hotspots/non-hotspots, given the subtlety and innocuousness of the backdoor trigger. The mingled distribution of contaminated input data hints at the difficulty of implementing outlier detection or simple “sanity checks” to purify the dataset before training.

(a) t-SNE visualization of clean hotspot detector
(b) t-SNE visualization of backdoored hotspot detector
(c) t-SNE visualization of defended hotspot detector
Fig. 17: t-SNE visualizations of neuron activations of the penultimate fully-connected layer of CNN-based hotspot detectors when presented with various layout clips
Fig. 18: t-SNE visualization of network input of clean and poisoned clips after DCT transformation

VII-B Effect of Network Architecture Complexity

Comparing Table VI with Table VII, and Table IX with Table X, we observe that the second network architecture (Table VII and Table X) produces higher clean data classification accuracy, suggesting that more complex networks are better at learning the true features of hotspots/non-hotspots. On the flip side, the poisoned data classification accuracy in Table VI and Table VII shows that complex networks are also more sensitive to malicious biases.

From the standpoint of the defense strategy, Fig. 11 hints that more complex networks require less augmentation effort to reduce the attack success rate. Generally, it appears that greater learning capacity implies higher sensitivity to backdooring but also easier “curing”.

VII-C Improved Clean Data Accuracy

Across defended networks with different amounts of data augmentation, we find that clean non-hotspot classification accuracy increases for both architectures. This effect is more pronounced in defended networks based on the second architecture in Table XI. This points to a helpful side effect of using defensive data augmentation: while effort is required to produce more synthetic clips for defeating training data poisoning, accuracy on clean test data also increases. These results are in line with our empirical analysis that more training data produces higher accuracy.

VII-D Trigger-oblivious Defense

Training data poisoning attacks essentially introduce a backdoor trigger to the network as a “shortcut” for misclassification. A number of existing defense strategies [29, 28], as we discussed in Section II, focus on reverse engineering the backdoor trigger. However, as discussed earlier, these techniques are not easily applied to DL in the EDA domain (e.g., NNoculation’s [28] random noise augmentation does not readily translate here), and such defenses also suffer from the poor quality of reverse-engineered triggers (e.g., Neural Cleanse [29]). Our proposed defensive data augmentation is a trigger-oblivious defense strategy that incorporates EDA domain-specific features. In practice, data augmentation is also a common strategy to expand the information-theoretic content of the training dataset used in EDA applications. Without having to reverse engineer the backdoor trigger, our proposed defense can nonetheless defeat such backdooring attacks.

VII-E Defense Cost Analysis

The additional cost incurred by our defense strategy consists of data augmentation, DRC and lithography simulation of the synthetic clips, and the extra training cost due to the expanded training dataset. This is a one-time, up-front cost. Considering the significant enhancement of security and robustness (up to 83% ASR reduction), and given that this cost is easily amortized over the lifetime of the DL-based detector (which can be further extended through future fine-tuning), this one-off defense strategy is economical. Additionally, finer control of defense costs is available by seeking a trade-off between defense efficacy (as the user defines) and augmentation effort, as discussed in Section VI-B.
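One simple way to act on this trade-off is to pick the smallest measured augmentation level that meets a user-defined ASR target. The sketch below does so with the ASR values listed in Table XI; the function name and the 10% target are illustrative assumptions, and because each data point comes from a single training run, the selected level should be read as indicative only.

```python
# (synthetic variants per training clip, ASR) for the two architectures, from Table XI
ASR_FIRST = [(0, 0.81), (3, 0.68), (6, 0.46), (12, 0.38), (25, 0.23), (50, 0.08),
             (100, 0.11), (200, 0.02), (300, 0.04), (400, 0.04), (500, 0.03)]
ASR_SECOND = [(0, 0.84), (3, 0.38), (6, 0.23), (12, 0.18), (25, 0.05), (50, 0.04),
              (100, 0.04), (200, 0.06), (300, 0.04), (400, 0.02), (500, 0.01)]

def minimal_effort(asr_by_effort, max_asr):
    """Smallest measured augmentation level whose ASR does not exceed max_asr, or None."""
    for variants, asr in sorted(asr_by_effort):
        if asr <= max_asr:
            return variants
    return None

target = 0.10  # hypothetical user-defined tolerance on attack success rate
effort_first = minimal_effort(ASR_FIRST, target)
effort_second = minimal_effort(ASR_SECOND, target)
```

The measured ASR is not strictly monotonic in effort (for the first architecture it is 0.08 at 50 variants but 0.11 at 100), so a cautious user might require the target to hold at the chosen level and at all higher levels.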

VII-F Experimental Limitations and Threats to Validity

While our experiments show that defensive data augmentation can effectively mitigate training data poisoning by producing 50 synthetic clips per training clip, the absolute numbers will not necessarily generalize beyond our experimental setting, as each data point is taken from a single training instance for each augmentation amount. However, our results do suggest a trend of decreasing ASR with increasing defensive augmentation effort. Results will vary with the poisoned/clean data ratio in the original dataset, the stochastic nature of training, and the network architecture.

VII-G Wider Implications in EDA

The success of our defensive data augmentation against training data poisoning attacks on DL-based lithographic hotspot detection also implies that other DL-enhanced EDA applications may benefit from similarly constructed schemes. Data poisoning attacks could also target routing congestion estimation or DRC estimation. Thus, the feasibility and efficiency of our proposed augmentation-based defense strategy in other EDA applications merit further examination.

VIII Conclusions

In this paper, we proposed a trigger-oblivious antidote for training data poisoning on lithographic hotspot detectors. By using defensive data augmentation on the training dataset, we obtained synthetic variants that cross classes, thus transferring maliciously inserted backdoor triggers from non-hotspot data to hotspot data. Our evaluation shows that the defense successfully dilutes the maliciously inserted bias, preventing erroneous non-hotspot predictions when test clips contain the backdoor trigger. With the attack success rate reduced to nearly 0% (as low as 1% in Table XI), the defense succeeds in robustifying lithographic hotspot detectors under adversarial settings.


  • [1] K. Basu, S. M. Saeed, C. Pilato, M. Ashraf, M. T. Nabeel, K. Chakrabarty, and R. Karri (2019) CAD-Base: An Attack Vector into the Electronics Supply Chain. ACM Trans. Des. Autom. Electron. Syst. 24 (4), pp. 38:1–38:30. ISSN 1084-4309.
  • [2] B. Biggio and F. Roli (2018-12) Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition 84, pp. 317–331. ISSN 0031-3203.
  • [3] V. Borisov and J. Scheible (2018) Research on data augmentation for lithography hotspot detection using deep learning. In 34th European Mask and Lithography Conf., Vol. 10775, pp. 107751A.
  • [4] X. Chen, C. Liu, B. Li, K. Lu, and D. Song (2017) Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. arXiv 1712.05526.
  • [5] F. Chollet et al. (2015) Keras.
  • [6] FreePDK45:Contents - NCSU EDA Wiki.
  • [7] Y. Gao, C. Xu, D. Wang, S. Chen, D. C. Ranasinghe, and S. Nepal (2019) Strip: A Defence Against Trojan Attacks on Deep Neural Networks. In Proceedings of the Annual Computer Security Applications Conference.
  • [8] I. J. Goodfellow, J. Shlens, and C. Szegedy (2015) Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  • [9] M. Graphics (2019) (Website).
  • [10] T. Gu, K. Liu, B. Dolan-Gavitt, and S. Garg (2019) BadNets: Evaluating Backdooring Attacks on Deep Neural Networks. IEEE Access 7, pp. 47230–47244. ISSN 2169-3536.
  • [11] X. He, Y. Deng, S. Zhou, R. Li, Y. Wang, and Y. Guo (2020-03) Lithography Hotspot Detection with FFT-based Feature Extraction and Imbalanced Learning Rate. ACM Transactions on Design Automation of Electronic Systems 25 (2), pp. 1–21. ISSN 1084-4309, 1557-7309.
  • [12] Y. Jiang, F. Yang, H. Zhu, B. Yu, D. Zhou, and X. Zeng (2019) Efficient Layout Hotspot Detection via Binarized Residual Neural Network. In Proc. Design Automation Conf. (DAC).
  • [13] A. B. Kahng (2018) Machine Learning Applications in Physical Design: Recent Results and Directions. In Int. Symp. Physical Design, Monterey, California, USA, pp. 68–73. ISBN 978-1-4503-5626-8.
  • [14] K. Liu, B. Dolan-Gavitt, and S. Garg (2018) Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. In Research in Attacks, Intrusions, and Defenses, Lecture Notes in Computer Science, pp. 273–294.
  • [15] K. Liu, B. Tan, R. Karri, and S. Garg (2020) Poisoning the (Data) Well in ML-Based CAD: A Case Study of Hiding Lithographic Hotspots. In Proc. Design Automation Test in Europe Conf. (DATE). Note: to appear.
  • [16] K. Liu, H. Yang, Y. Ma, B. Tan, B. Yu, E. F. Y. Young, R. Karri, and S. Garg (2019) Are Adversarial Perturbations a Showstopper for ML-Based CAD? A Case Study on CNN-Based Lithographic Hotspot Detection. CoRR abs/1906.10773.
  • [17] Y. Liu, W. Lee, G. Tao, S. Ma, Y. Aafer, and X. Zhang (2019) ABS: scanning neural networks for back-doors by artificial brain stimulation. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1265–1282.
  • [18] Y. Liu, S. Ma, Y. Aafer, W. Lee, J. Zhai, W. Wang, and X. Zhang (2018) Trojaning Attack on Neural Networks. In Proceedings of the Annual Network and Distributed System Security Symposium.
  • [19] L. v. d. Maaten and G. Hinton (2008) Visualizing Data using t-SNE. Journal of Machine Learning Research 9 (Nov), pp. 2579–2605. ISSN 1533-7928.
  • [20] S. K. Moore (2018-07) DARPA Picks Its First Set of Winners in Electronics Resurgence Initiative.
  • [21] X. Qiao, Y. Yang, and H. Li (2019) Defending Neural Backdoors via Generative Distribution Modeling. In Advances in Neural Information Processing Systems 32, pp. 14004–14013.
  • [22] G. R. Reddy, K. Madkour, and Y. Makris (2019-11) Machine Learning-Based Hotspot Detection: Fallacies, Pitfalls and Marching Orders. In IEEE/ACM Int. Conf. Computer-Aided Design (ICCAD), Westminster, CO, USA. ISBN 978-1-72812-350-9.
  • [23] G. R. Reddy, C. Xanthopoulos, and Y. Makris (2018) Enhanced hotspot detection through synthetic pattern generation and design of experiments. In IEEE VLSI Test Symp.
  • [24] A. Shafahi, W. R. Huang, M. Najibi, O. Suciu, C. Studer, T. Dumitras, and T. Goldstein (2018) Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks. In Advances in Neural Information Processing Systems, pp. 6103–6113.
  • [25] C. Shorten and T. M. Khoshgoftaar (2019-12) A survey on Image Data Augmentation for Deep Learning. Journal of Big Data 6 (1), pp. 60. ISSN 2196-1115.
  • [26] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus (2014) Intriguing properties of neural networks. In Proc. 2nd Int. Conf. Learning Representations (ICLR).
  • [27] A. F. Tabrizi, N. K. Darav, L. Rakai, I. Bustany, A. Kennings, and L. Behjat (2019) Eh?Predictor: A Deep Learning Framework to Identify Detailed Routing Short Violations from a Placed Netlist. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.
  • [28] A. K. Veldanda, K. Liu, B. Tan, P. Krishnamurthy, F. Khorrami, R. Karri, B. Dolan-Gavitt, and S. Garg (2020) NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs. CoRR abs/2002.08313.
  • [29] B. Wang, Y. Yao, S. Shan, H. Li, B. Viswanath, H. Zheng, and B. Y. Zhao (2019-05) Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. In IEEE Symp. Security and Privacy (SP), pp. 707–723.
  • [30] H. Yang, L. Luo, J. Su, C. Lin, and B. Yu (2017) Imbalance aware lithography hotspot detection: a deep learning approach. In SPIE Design-Process-Technology Co-optimization for Manufacturability, pp. 1014807.
  • [31] H. Yang, J. Su, Y. Zou, Y. Ma, B. Yu, and E. F. Y. Young (2018) Layout Hotspot Detection with Feature Tensor Generation and Deep Biased Learning. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst. ISSN 0278-0070, 1937-4151.
  • [32] J. Zhu, T. Park, P. Isola, and A. A. Efros (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pp. 2223–2232.