HW2VEC: A Graph Learning Tool for Automating Hardware Security

by Shih Yuan Yu, et al.

Time-to-market pressure and the continuously growing complexity of hardware designs have promoted the globalization of the Integrated Circuit (IC) supply chain. However, such globalization also poses various security threats in each phase of the IC supply chain. Although advancements in Machine Learning (ML) have pushed the frontier of hardware security, most conventional ML-based methods can only achieve the desired performance by manually finding a robust feature representation for circuits, which are non-Euclidean data. As a result, modeling these circuits using graph learning to improve design flows has attracted research attention in the Electronic Design Automation (EDA) field. However, due to the lack of supporting tools, only a few existing works apply graph learning to resolve hardware security issues. To attract more attention, we propose HW2VEC, an open-source graph learning tool that lowers the threshold for newcomers to research hardware security applications with graphs. HW2VEC provides an automated pipeline for extracting a graph representation from a hardware design at various abstraction levels (register transfer level or gate-level netlist). Besides, HW2VEC users can automatically transform non-Euclidean hardware designs into Euclidean graph embeddings for solving their problems. In this paper, we demonstrate that HW2VEC can achieve state-of-the-art performance on two hardware security-related tasks: Hardware Trojan Detection and Intellectual Property Piracy Detection. We also provide time profiling results for the graph extraction and learning pipelines in HW2VEC.








I Introduction

In past decades, the growing design complexity and the time-to-market pressure have jointly contributed to the globalization of the Integrated Circuit (IC) supply chain [shamsi2019ip]. Along this globalized supply chain, IC designers tend to leverage third-party Electronic Design Automation (EDA) tools and Intellectual Property (IP) cores or outsource costly services to reduce their overall expense. This results in a worldwide distribution of IC design, fabrication, assembly, deployment, and testing [board2005defense, jose2008innovation, rostami2014primer]. However, such globalization can also make the IC supply chain vulnerable to various hardware security threats such as Hardware Trojan Insertion, IP Theft, Overbuilding, Counterfeiting, Reverse Engineering, and Covert & Side Channel Attacks.

As the consequences of not promptly addressing these security threats can be severe, countermeasures and tools have been proposed to mitigate, prevent, or detect these threats [hu2020overview]. For example, hardware-based primitives such as physical unclonable functions (PUFs) [herder2014physical], true random number generators (TRNGs) [rahman2014ti], and cryptographic hardware can all intrinsically enhance architectural security. The countermeasures built into hardware design tools are also critical for securing the hardware in the early phases of the IC supply chain. Some Machine Learning (ML) based approaches have been proven effective for detecting Hardware Trojans (HT) from hardware designs in both Register Transfer Level (RTL) and Gate-Level Netlist (GLN) [han2019hardware, hasegawa2018hardware]. Besides, [huang2012parametric] automates the process of identifying counterfeit ICs by leveraging a Support Vector Machine (SVM) to analyze the sensor readings from on-chip hardware performance counters (HPCs). However, as indicated in [tan2020challenges], effectively applying ML models is a non-trivial task as the defenders must first identify an appropriate input representation based on hardware domain knowledge. Therefore, ML-based approaches can only achieve the desired performance with a robust feature representation of a circuit (non-Euclidean data), which is more challenging to acquire than one for Euclidean data such as images, texts, or signals.

Fig. 1: The illustration of the process that extracts features for hardware analysis.

In the IC design flow, many fundamental objects such as netlists or layouts are natural graph representations [ma2020understanding]. These graphs are non-Euclidean data with irregular structures, which makes it hard to generalize basic mathematical operations and apply them to conventional Deep Learning (DL) approaches [cai2018comprehensive]. Also, extracting a feature that captures structural information requires a non-trivial effort to achieve the desired performance. To overcome these challenges, many graph learning approaches such as Graph Convolutional Networks (GCN), Graph Neural Networks (GNN), or Graph Autoencoders (GAE) have been proposed and applied in various applications such as computer vision, natural language processing, and program analysis [kipf2016semi, wu2020comprehensive]. In the EDA field, some works tackle netlists with GCNs for test point insertion [ma2019high] or with GNNs for fast and accurate power estimation in pre-silicon simulation [zhang2020grannite]. As Figure 1 shows, these approaches typically begin with extracting the graph representation ($g$) from a hardware design $p$, then use the graph-based models as an alternative to the manual feature engineering process. Lastly, by projecting each hardware design onto the Euclidean space ($h_g$), these designs can be passed to ML models for learning tasks. However, only a few works have applied GNN-based approaches for securing hardware during IC design phases due to the lack of supporting tools [gnn4ip2021, gnn4tj2021].

To attract more research attention to this field, we propose HW2VEC, an open-source graph learning tool for enhancing hardware security. HW2VEC provides automated pipelines for extracting graph representations from hardware designs and leveraging graph learning to secure hardware in design phases. Besides, HW2VEC automates the processes of engineering features and modeling hardware designs. To the best of our knowledge, HW2VEC is the first open-source research tool that supports applying graph learning methods to hardware designs at different abstraction levels for hardware security. In addition, HW2VEC supports transforming hardware designs into various graph representations such as the Data-Flow Graph (DFG) or the Abstract Syntax Tree (AST). In this paper, we also demonstrate that HW2VEC can be utilized in two hardware security applications, Hardware Trojan Detection and IP Piracy Detection, and can perform as well as the state-of-the-art GNN-based approaches.

I-A Our Novel Contributions

Our contributions to the hardware security research community are as follows:

  • We propose an automated pipeline to convert a hardware design in RTL or GLN into various graph representations.

  • We propose a GNN-based tool to generate vectorized embeddings that capture the behavioral features of hardware designs from their graph representations.

  • We demonstrate HW2VEC’s effectiveness by showing that it performs comparably to state-of-the-art GNN-based approaches on real-world hardware security problems, including Hardware Trojan Detection and IP Piracy Detection.

  • We open-source HW2VEC as a Python library (publicly available at https://github.com/AICPS/hw2vec/) to contribute to the hardware security research community.

I-B Paper Organization

We organize the rest of the paper as follows: we introduce background information and a literature survey in Section II; we present the overall architecture of HW2VEC in Section III; we demonstrate usage examples and two advanced use-cases (HT detection and IP piracy detection) in Section IV; we show experimental results and discuss HW2VEC’s practicability in Section V; lastly, we conclude in Section VI.

II Related Works and Background

This section first briefly overviews hardware security problems and countermeasures. Then it describes the works applying ML-based approaches for hardware security. Lastly, we introduce the works that utilize graph learning methods in both EDA and hardware security.

II-A Hardware Security Threats in IC Supply Chain

In the IC supply chain, each IC is passed through multiple processes as shown in Figure 2. First, the specification of a hardware design is turned into a behavioral description written in a Hardware Design Language (HDL) such as Verilog or VHDL. Then, it is transformed into a design implementation in terms of logic gates (i.e., netlist) with Logic Synthesis. Physical Synthesis implements the netlist as a layout design (e.g., a GDSII file). Lastly, the resulting GDSII file is handed to a foundry to fabricate the actual IC. Once a foundry produces the IC (Bare Die), several tests are performed to guarantee its correct behavior. The verified IC is then packaged by the assembly and sent to the market to be deployed in systems.

For a System-on-Chip (SoC) company, all of the mentioned stages of the IC supply chain require a vast investment of money and effort. For example, it cost $5 billion in 2015 to develop a new foundry [yeh2012trends]. Therefore, to lower R&D cost and keep up with the competitive development cycle, an SoC company may choose to outsource the fabrication to a third-party foundry, purchase third-party IP cores, and use third-party EDA tools. The use of worldwide distributed third parties makes the IC supply chain susceptible to various security threats [xiao2016hardware] such as Hardware Trojan Insertion, IP Theft, Overbuilding, Counterfeiting, Reverse Engineering, and Covert & Side Channel Attacks. Not detecting or preventing these threats can lead to severe outcomes. For example, in 2008, a suspected nuclear installation in Syria was bombed by Israeli jets because a backdoor in its commercial off-the-shelf microprocessors disabled Syrian radar [syrianRadar]. In another instance, the IP-intensive industries of the USA lose between $225 billion and $600 billion annually as companies from China steal American IP, mainly in the semiconductor industry [news2].

Fig. 2: The illustration of the IC supply chain demonstrating the hardware design flow from a specification to the behavioral description (RTL), logic implementation (GLN), physical implementation (GDSII), and the actual chip (Bare Die or IC).

Among the mentioned security threats, the insertion of a Hardware Trojan (HT) can cause the infected hardware to leak sensitive information, degrade its performance, or even trigger a Denial-of-Service (DoS) attack. In System-on-Chip (SoC) or IC designs, IP Theft, the illegal usage and distribution of an IP core, can occur. The third-party foundries responsible for outsourced fabrication can overbuild extra chips for their own benefit without the designer’s permission. Moreover, selling counterfeit designs in the name of the original supplier leads to financial or safety damage to the producer, or even to national security if the target is within essential infrastructure or military systems. Reverse engineering (RE) recovers the high-level information of a circuit from its lower-level abstraction. Although RE can be helpful in the design and verification process, an attacker can misuse the reconstructed IC designs for malicious purposes. A Covert Channel uses non-traditional communication (e.g., a shared cache) to leak critical information of a circuit. In contrast, a Side Channel exists among hardware components that are physically isolated or not even in proximity (e.g., a power or electromagnetic channel).

II-B Hardware Security Countermeasures

Due to the globalization of the IC supply chain, the hardware is susceptible to security threats such as IP piracy (unlicensed usage of IP), overbuilding (unauthorized manufacturing of the circuit), counterfeiting (producing a faithful copy of circuit), reverse engineering, hardware Trojan (malicious modification of circuit), and side-channel attacks [ashraf2018towards].

In the literature, countermeasures and tools have been proposed to mitigate, prevent, or detect these threats [hu2020overview]. For example, a cryptographic accelerator is a hardware-based countermeasure that can reinforce built-in rather than add-on defense against security threats. The True Random Number Generator (TRNG) and the Physical Unclonable Function (PUF) are two other effective security primitives [herder2014physical, rahman2014ti]. These solutions are critical for security protocols and unique IC identification, and they rely on physical phenomena, such as process variations during fabrication, for randomness, stability, and uniqueness [tan2020challenges].

In addition to hardware-based solutions, countermeasures enhancing security during the hardware design process are also present in the literature. For example, side-channel analysis for HT detection using various models such as hierarchical temporal memory [htm] and DL [htnet] has attracted considerable attention recently. However, these methods postpone the detection to the post-silicon stage. On the other hand, Formal Verification (FV) is a pre-silicon algorithmic method that converts the third-party IP (3PIP) to a proof-checking format and checks whether the IP satisfies some predefined security properties [cadense2013jasper, subramanyan2014formal]. Although FV leverages the predefined security properties in IP for HT detection, its detection scope is limited to certain types of HTs because the properties are not comprehensive enough to cover all kinds of malicious behaviors [rajendran2015detecting]. Some works employ model checking but are not scalable to large designs, as model checking is NP-complete and can suffer from state explosion [rajendran2016formal]. Another existing approach is code coverage, which analyzes the RTL code using metrics such as line, statement, finite state machine, and toggle coverage to identify suspicious signals that imitate an HT [waksman2013fanci, zhang2015veritrust].

As for IP theft prevention, watermarking and fingerprinting are two approaches that embed the signatures of the IP owner and the legal IP user into a circuit to prevent infringement [poudel2020flashmark, rai2019hardware]. Hardware metering is an IP protection method in which the designer assigns a unique tag to each chip for chip identification (passive tag) or for enabling/disabling the chip (active tag) [koushanfar2017active]. Obfuscation is another countermeasure for IP theft [chen2020decoy], which comprises two main approaches: Logic Locking and Camouflaging. In Logic Locking, the designer inserts additional gates such as XOR into non-critical wires. The circuit will only be functional if the correct key is presented in a secure memory out of reach of the attacker [xie2017delay]. Camouflaging modifies the design such that cells with different functionalities look similar to the attacker, which confuses the reverse engineering process [camouflaging]. Lastly, another countermeasure is to split the design into separate ICs and have them fabricated in different foundries so that none of them has access to the whole design to perform malicious activities [patnaik2018raise, zhang2018analysis].

As surveyed in [hu2020overview], several academic and commercial tools have been proposed to secure hardware. For example, VeriSketch and SecVerilog are open-source academic verification tools for securing hardware. SecureCheck from Mentor Graphics, the JasperGold Formal Verification Platform from Cadence, and Prospect from Tortuga Logic are commercial verification tools available on the market. PyVerilog [Takamaeda:2015:ARC:Pyverilog] is a hardware design tool that allows users to parse HDL code and perform pre-silicon formal verification side-by-side with functional verification. In short, though many approaches have been proposed to counteract security threats, security is still an afterthought in hardware design. Therefore, new countermeasures will be needed against new security threats.

II-C Machine Learning for Hardware Security

In the last few decades, the advancements in Machine Learning (ML) have revolutionized the conventional methods and models in numerous applications throughout the design flow. Defenders can use ML with hardware-based observations for detecting attacks, while attackers can also use ML to steal sensitive information from an IC, breaching hardware security [tan2020challenges]. Some ML-based countermeasures have been proven effective for detecting HT from hardware designs in both Register Transfer Level (RTL) and gate-level netlists (GLN) [han2019hardware, hasegawa2018hardware]. In [han2019hardware], the circuit features are extracted from the Abstract Syntax Tree (AST) representations of RTL codes and fed to a gradient boosting algorithm to train the ML model to construct an HT library. [hasegawa2018hardware] extracts 11 Trojan-net feature values from GLNs and then trains a Multi-Layer Neural Network on them to classify each net in a netlist as a normal net or a Trojan net. Similarly, researchers have applied ML for automating the process of detecting other threats. For instance, SVM can be used to analyze on-chip sensor readings (e.g., HPCs) to identify counterfeit ICs and detect HT in real-time [huang2012parametric, kulkarni2016svm]. However, as indicated in [tan2020challenges], effectively applying ML models is not a trivial task as the defenders must first identify an appropriate input representation for a hardware design. Unlike for Euclidean data such as images, texts, or signals, finding a robust feature representation for a circuit (non-Euclidean data) is more challenging as it requires domain knowledge in both hardware and ML. To overcome this challenge, HW2VEC provides more effective graph learning methods to automatically find a robust feature representation for a non-Euclidean hardware design.

II-D Graph Learning for Hardware Design and Security

Although conventional ML and DL approaches can effectively capture the features hidden in Euclidean data, such as images, text, or videos, there are still various applications where the data is graph-structured. Graphs can be irregular: a graph can have a variable number of unordered nodes, and nodes can have different numbers of neighbors, which makes the mathematical operations used in deep learning (e.g., 2D convolution) challenging to apply [cai2018comprehensive]. Also, extracting a feature that captures structural information requires non-trivial effort to achieve the desired performance. To address these challenges, many graph learning approaches such as Graph Convolutional Networks (GCN), Graph Neural Networks (GNN), or Graph Autoencoders (GAE) have recently been proposed and applied in various applications [kipf2016semi, wu2020comprehensive]. Only by projecting non-Euclidean data into a low-dimensional embedding space can the operations in ML methods be applied.

In EDA applications, many fundamental objects such as Boolean functions, netlists, or layouts are natural graph representations [ma2020understanding]. Some works tackle netlists with GCNs for test point insertion [ma2019high] or with GNNs for fast and accurate power estimation in pre-silicon simulation [zhang2020grannite]. [zhang2020grannite] uses a GNN-based model to infer the toggle rate of each logic gate from a netlist graph for fast and accurate average power estimation without gate-level simulation, which is a slower way to acquire toggle rates than RTL simulation. They use GLNs and the corresponding input-port and register toggle rates as input features, and logic gate toggle rates as ground truth, to train the model. The model can infer the toggle rate of a logic gate from input features acquired from RTL simulation, and the average power is then computed by other power analysis tools.

As for hardware security, only a few works utilizing GNN-based approaches against security threats exist [gnn4ip2021, gnn4tj2021]. [gnn4tj2021] utilizes a GNN-based approach for detecting HT in pre-silicon design phases without the need for a golden HT-free reference. Besides, using the GNN-based approach allows the extraction of features from Data-Flow Graphs to be automated. In [gnn4ip2021], the proposed GNN-based approach can detect IP piracy without incurring the hardware overhead of inserting signatures to prove ownership. Specifically, the Siamese-based network architecture allows their approach to capture the features needed to assess the similarity between hardware designs in the form of Data-Flow Graphs. In short, these works have shown the effectiveness of securing hardware designs with graph learning approaches. To attract further attention, we propose HW2VEC as a convenient research tool that lowers the threshold for newcomers to make research progress and lets experienced researchers explore this topic in more depth.

III HW2VEC Architecture

As Figure 3 shows, HW2VEC contains HW2GRAPH and GRAPH2VEC. During the IC design flow, a hardware design can have various levels of abstraction such as High-Level Synthesis (HLS), RTL, GLN, and GDSII, each of which is fundamentally non-Euclidean data. Overall, in HW2VEC, a hardware design is first turned into a graph by HW2GRAPH, which defines the pairwise relationships between objects and preserves the structural information. Then, GRAPH2VEC consumes this graph and produces the Euclidean representation for learning.

Fig. 3: The overall architecture of HW2VEC. Beginning with hardware design objects (RTL or GLN), HW2GRAPH leverages PRE_PROC, GRAPH_GEN, and POST_PROC to extract graph representations from hardware designs in the form of a node embedding matrix ($X$) and an adjacency matrix ($A$). These graphs are then passed to GRAPH2VEC to acquire the graph embeddings for graph learning tasks of hardware security.

III-A HW2GRAPH: from hardware design to graph

The first step is to convert each textual hardware design code $p$ into a graph $g$. HW2GRAPH supports the automatic conversion of raw hardware code into various graph formats such as the Abstract Syntax Tree (AST) or the Data-Flow Graph (DFG). The AST captures the syntactic structure of hardware code, while the DFG indicates the relationships and dependencies between the signals and gives a higher-level expression of the code’s computational structure. HW2GRAPH consists of three primary modules: pre-processing, graph generation engine, and post-processing.

III-A1 Pre-processing (Pre_proc)

In this module, we have several automatic scripts for pre-processing a raw hardware code $p$. As a hardware design can contain several modules stored in separate files, the first step is to combine them into a single file (i.e., flattening). Next, to automatically locate the “entry point” top module of $p$, the script scans the flattened code for the keyword “module” and extracts the module names and the number of times each name appears in $p$. Then, the script analyzes the list of discovered module names and takes the one that appears only once, which means the module is not instantiated by any other module, as the top module. Here, we denote the pre-processed hardware design code as $p'$.
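The top-module heuristic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pre-processing script shipped with HW2VEC: the function name `find_top_module`, the comment-stripping step, and the toy Verilog source are all hypothetical.

```python
import re
from collections import Counter


def find_top_module(flattened_verilog: str) -> str:
    """Return the declared module whose name appears exactly once.

    A module that is declared but never instantiated elsewhere shows up
    only in its own declaration, so its name occurs once in the code.
    """
    # Drop line and block comments so names inside comments are not counted.
    code = re.sub(r"//[^\n]*|/\*.*?\*/", "", flattened_verilog, flags=re.S)
    # "\bmodule" does not match inside "endmodule" (no word boundary there).
    declared = re.findall(r"\bmodule\s+(\w+)", code)
    # Count every identifier-like token in the flattened code.
    counts = Counter(re.findall(r"\w+", code))
    candidates = [name for name in declared if counts[name] == 1]
    if len(candidates) != 1:
        raise ValueError(f"ambiguous top module candidates: {candidates}")
    return candidates[0]


# Hypothetical flattened design: "top" instantiates "adder".
EXAMPLE = """
module adder(input a, input b, output s);
  assign s = a ^ b;  // sum bit
endmodule
module top(input x, input y, output z);
  adder u0(.a(x), .b(y), .s(z));
endmodule
"""

top_name = find_top_module(EXAMPLE)
```

Here `adder` appears twice (its declaration and its instantiation inside `top`), so only `top` satisfies the appears-once criterion. The heuristic would need extra care if a signal or instance happened to share a module's name.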

III-A2 Graph Generation Engine (Graph_gen)

We integrate PyVerilog [takamaeda2015pyverilog], a hardware design toolkit for parsing Verilog code, into this module. The pre-processed code $p'$ is first converted by a lexical analyzer and a YACC (Yet Another Compiler-Compiler) parser into a corresponding parse tree. Then, we recursively iterate through each node in the parse tree with Depth-First Search (DFS). At each recursive step, we determine whether to construct a collection of name/value pairs, an ordered list of values, or a single name/value pair based on the token names used in the Verilog AST. To acquire the DFG, the AST is further processed by the data flow analyzer to create a signal DFG for each signal in the circuit such that the signal is the root node. Lastly, we merge all the signal DFGs. The resulting graph, either DFG or AST, is denoted as $g = (V, E)$. The AST is a tree type of graph in which the nodes can be operators (mathematical, gates, loop, conditional, etc.), signals, or attributes of signals. The edges indicate the relation between nodes. The DFG shows data dependency, where each node in $V$ represents a signal, a constant value, or an operation such as xor, and, concatenation, branch, or branch condition. Each edge in $E$ stands for the data dependency relation between two nodes. Specifically, for all $v_i, v_j$ pairs, the edge $e_{ij}$ belongs to $E$ ($e_{ij} \in E$) if $v_i$ depends on $v_j$, or if $v_j$ is applied on $v_i$.
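To make the final merging step concrete, the sketch below unions several per-signal DFGs into one graph. It is an illustration only: the edge-list input format, the function name `merge_signal_dfgs`, and the node identifiers are assumptions, and it does not use PyVerilog's data flow analyzer.

```python
def merge_signal_dfgs(signal_dfgs):
    """Union per-signal DFGs into a single node set and edge set.

    signal_dfgs maps each root signal to a list of (u, v) edges, where an
    edge (u, v) means "u depends on v" or "operation v is applied on u".
    Nodes that occur in several signal DFGs are shared after the merge.
    """
    nodes, edges = set(), set()
    for root, dfg_edges in signal_dfgs.items():
        nodes.add(root)  # keep the root even if its DFG has no edges
        for u, v in dfg_edges:
            nodes.update((u, v))
            edges.add((u, v))
    return nodes, edges


# Hypothetical half adder: assign s = a ^ b; assign c = a & b;
# Operation nodes carry unique ids so distinct operations do not collapse.
signal_dfgs = {
    "s": [("s", "xor_0"), ("xor_0", "a"), ("xor_0", "b")],
    "c": [("c", "and_0"), ("and_0", "a"), ("and_0", "b")],
}
nodes, edges = merge_signal_dfgs(signal_dfgs)
```

After merging, the input signals `a` and `b` each appear once and are reachable from both roots, which is what lets a graph model see that the two outputs share their inputs.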

III-A3 Post-processing (Post_proc)

The output from the Graph Generation Engine is in JSON (JavaScript Object Notation) format. In this phase, we convert a JSON-formatted graph into a NetworkX graph object. NetworkX is an efficient, scalable, and highly portable framework for graph analysis. Several popular geometric representation learning libraries (PyTorch-Geometric and Deep Graph Library) take this format of graphs as the primary data structure in their pipelines.

III-B GRAPH2VEC: from graph to graph embedding

Once HW2GRAPH has converted a hardware design into a graph $g$, we begin to process $g$ with the modules in GRAPH2VEC, including the Dataset Processor, Trainer, and Evaluator, to acquire the graph embedding $h_g$.

III-B1 Dataset Processor

This module handles low-level parsing tasks such as caching the data on disk to optimize tasks that involve repetitive model testing, performing the train-test split, and finding the unique set of node labels among all the graph data instances. One important task of the Dataset Processor is to convert a graph $g = (V, E)$ into the tensor-like inputs $X$ and $A$, where $X$ represents the node embeddings in matrix form and $A$ stands for the adjacency information of $g$. The conversion between $E$ and $A$ is straightforward. To acquire $X$, the Dataset Processor performs a normalization process and assigns each of the nodes a label that indicates its type, which may vary for different kinds of graphs (AST or DFG). Each node is then converted to an initial vectorized representation using one-hot encoding based on its type label.
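The conversion described above can be sketched as follows. This is a minimal pure-Python illustration, assuming a dense adjacency matrix and a label vocabulary supplied by the caller; the function name `graph_to_tensors` and the toy graph are hypothetical, and the actual library may store adjacency in a sparse form instead.

```python
def graph_to_tensors(node_types, edges, label_vocab):
    """Build a one-hot node feature matrix and a dense adjacency matrix.

    node_types:  dict mapping node id -> type label
    edges:       iterable of directed (u, v) pairs
    label_vocab: ordered list of all type labels seen across the dataset
    """
    nodes = sorted(node_types)                      # fixed node ordering
    node_index = {n: i for i, n in enumerate(nodes)}
    label_index = {lab: j for j, lab in enumerate(label_vocab)}

    # One row per node, with a single 1.0 in the column of its type label.
    X = [[0.0] * len(label_vocab) for _ in nodes]
    for n in nodes:
        X[node_index[n]][label_index[node_types[n]]] = 1.0

    # A[i][j] = 1.0 when there is an edge from node i to node j.
    A = [[0.0] * len(nodes) for _ in nodes]
    for u, v in edges:
        A[node_index[u]][node_index[v]] = 1.0
    return nodes, X, A


# Hypothetical DFG for: assign s = a ^ b;
node_types = {"a": "signal", "b": "signal", "s": "signal", "xor_0": "xor"}
edges = [("s", "xor_0"), ("xor_0", "a"), ("xor_0", "b")]
nodes, X, A = graph_to_tensors(node_types, edges, ["signal", "xor"])
```

Passing the vocabulary in explicitly mirrors the step of collecting the unique node labels over the whole dataset, so that every graph is encoded with the same feature width.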

III-B2 Graph Embedding Model

In this module, we break down the graph learning pipeline into multiple network components, including graph convolution layers (GRAPH_CONV), graph pooling layers (GRAPH_POOL), and graph readout operations (GRAPH_READOUT).


GRAPH_CONV is inspired by the Spatial Graph Convolution Neural Network (SGCN), which defines the convolution operation based on a node’s spatial relations. In the literature, this phase is also referred to as the message propagation phase, which involves two sub-functions: AGGREGATE and COMBINE. Each input graph $g$ is initialized in the form of node embeddings and adjacency information ($X^{(0)}$ and $A$). For each $k$-th iteration, the process updates the node embeddings $X^{(k)}$ using each node representation $h_v^{(k-1)}$ in $X^{(k-1)}$, given by,

$$a_v^{(k)} = \mathrm{AGGREGATE}^{(k)}\left(\left\{ h_u^{(k-1)} : u \in N(v) \right\}\right)$$

$$h_v^{(k)} = \mathrm{COMBINE}^{(k)}\left(h_v^{(k-1)}, a_v^{(k)}\right)$$

where $h_v^{(k)}$ denotes the feature vector after $k$ iterations for the $v$-th node and $N(v)$ returns the neighboring nodes of the $v$-th node. Essentially, AGGREGATE collects the features of the neighboring nodes to extract an aggregated feature vector $a_v^{(k)}$ for layer $k$, and COMBINE combines the previous node feature $h_v^{(k-1)}$ with $a_v^{(k)}$ to output the next feature vector $h_v^{(k)}$. This message propagation is carried out for a pre-determined number of layers $K$. We denote the final propagation node embedding $X^{(K)}$ as $X^{prop}$.
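The AGGREGATE and COMBINE steps can be illustrated with a small, weight-free sketch. It assumes sum aggregation over neighbors and element-wise addition as the combine step; a trained GNN layer would add learnable weights and a non-linearity, so this shows only the data flow. The function name `propagate` and the toy graph are hypothetical.

```python
def propagate(X, neighbors, num_layers):
    """Run a fixed number of message propagation rounds.

    X:         list of node feature vectors (one list of floats per node)
    neighbors: dict mapping node index -> list of neighbor indices
    """
    H = [row[:] for row in X]  # layer-0 features, copied to avoid mutation
    for _ in range(num_layers):
        new_H = []
        for v, h_v in enumerate(H):
            # AGGREGATE: sum the previous-layer features of v's neighbors.
            agg = [0.0] * len(h_v)
            for u in neighbors[v]:
                agg = [a + x for a, x in zip(agg, H[u])]
            # COMBINE: merge v's previous feature with the aggregate.
            new_H.append([h + a for h, a in zip(h_v, agg)])
        H = new_H
    return H


# Path graph 0 - 1 - 2 with one-hot features of width 2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
after_one = propagate(X, neighbors, num_layers=1)
after_two = propagate(X, neighbors, num_layers=2)
```

After one round each node has seen its direct neighbors; after two rounds node 0 also carries information that originated at node 2, which is why the number of layers controls how far structural information travels.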

Next, in GRAPH_POOL, the node embedding $X^{prop}$ is further processed with an attention-based graph pooling layer. As indicated in [lee2019self, ying2018hierarchical], the integration of a graph pooling layer allows the model to operate on the hierarchical representations of a graph, and hence to better perform the graph classification task. Besides, such an attention-based pooling layer allows the model to focus on a local part of the graph and is considered a part of a unified computational block of a GNN pipeline [knyazev2019understanding]. In this layer, we perform top-k filtering on nodes according to the scoring results, as follows:

$$\alpha = \mathrm{SCORE}(X^{prop}, A), \qquad P = \mathrm{top}_k(\alpha)$$

where $\alpha$ stands for the coefficients predicted by the graph pooling layer for the nodes, and $P$ represents the indices of the pooled nodes, which are selected from the top-ranked nodes according to $\alpha$. The number of nodes kept by the top-k filtering is calculated from a pre-defined pooling ratio as the ratio multiplied by the number of nodes, so that only a constant fraction of the embeddings of the nodes of the DFG (i.e., 0.5) is considered relevant. One example of the scoring function is to utilize a separate trainable GNN layer to produce the scores so that the scoring method considers both node features and topological characteristics [lee2019self]. We denote the node embeddings and edge adjacency information after pooling by $X^{pool}$ and $A^{pool}$, which are calculated as follows:

$$X^{pool} = \left(X^{prop} \odot \tanh(\alpha)\right)_{P}, \qquad A^{pool} = A_{(P,P)}$$

where $\odot$ represents an element-wise multiplication, $(\cdot)_{P}$ refers to the operation that extracts a subset of nodes based on $P$, and $(\cdot)_{(P,P)}$ refers to the information of the adjacency matrix between the nodes in this subset.
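A compact sketch of the top-k pooling step follows. Several details are assumptions rather than statements about HW2VEC: the scores are passed in by the caller (in practice they would come from a trainable scoring layer), the number of kept nodes is rounded up, and the kept features are gated by `tanh` of their score, as in self-attention graph pooling. The function name `topk_pool` is hypothetical.

```python
import math


def topk_pool(X, A, scores, ratio=0.5):
    """Keep the highest-scoring fraction of nodes and their sub-adjacency.

    X:      list of node feature vectors
    A:      dense adjacency matrix (list of lists)
    scores: one scalar score per node
    """
    n = len(X)
    k = max(1, math.ceil(ratio * n))  # number of nodes to keep (assumed ceil)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])         # indices of pooled nodes, in node order
    # Gate each kept node's features by tanh of its score.
    X_pool = [[x * math.tanh(scores[i]) for x in X[i]] for i in keep]
    # Restrict the adjacency matrix to rows and columns of kept nodes.
    A_pool = [[A[i][j] for j in keep] for i in keep]
    return keep, X_pool, A_pool


# Star graph centered on node 1, with hypothetical scores.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
A = [[0, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 0],
     [0, 1, 0, 0]]
scores = [0.1, 2.0, -1.0, 0.5]
keep, X_pool, A_pool = topk_pool(X, A, scores, ratio=0.5)
```

With a ratio of 0.5 on four nodes, the two highest-scoring nodes (indices 1 and 3) survive, and the edge between them is preserved in the pooled adjacency matrix.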


Lastly, in GRAPH_READOUT, the overall graph-level feature extraction is carried out by either summing up or averaging the node features $X^{pool}$. We denote the graph embedding for each graph $g$ as $h_g$, computed as follows:

$$h_g = \mathrm{GRAPH\_READOUT}\left(\left\{ h_v : v \in X^{pool} \right\}\right)$$

We use the graph embedding $h_g$ to model the behavior of circuits. After this, the fixed-length embeddings of hardware designs become compatible with ML algorithms.
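The readout step is simple enough to show directly. The sketch below implements the two options named above, sum and mean, over a list of node feature vectors; the function name `graph_readout` and the `mode` parameter are illustrative choices, not the library's interface.

```python
def graph_readout(node_features, mode="sum"):
    """Collapse per-node feature vectors into one graph-level vector."""
    if not node_features:
        raise ValueError("readout needs at least one node")
    dim = len(node_features[0])
    # Column-wise sum over all nodes.
    total = [sum(row[d] for row in node_features) for d in range(dim)]
    if mode == "sum":
        return total
    if mode == "mean":
        return [t / len(node_features) for t in total]
    raise ValueError(f"unknown readout mode: {mode}")


# Graphs with different node counts map to vectors of the same length.
small_graph = [[1.0, 2.0], [3.0, 4.0]]
large_graph = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
h_small = graph_readout(small_graph, mode="sum")
h_large_sum = graph_readout(large_graph, mode="sum")
h_large_mean = graph_readout(large_graph, mode="mean")
```

The output length depends only on the feature width, not on the number of nodes, which is what makes designs of different sizes comparable to downstream ML models.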

In practice, these network components can be combined in various ways depending on the type of task (node-level or graph-level) or the complexity of the task (simple or complex network architecture). In GRAPH2VEC, one default option is to use one or multiple GRAPH_CONV layers, followed by a GRAPH_POOL and a GRAPH_READOUT. Besides, in conjunction with a Multi-Layer Perceptron (MLP) or other ML layers, this architecture can transform the graph data into a form that we can use in calculating the loss for learning. In GRAPH2VEC, we reserve the flexibility for customization, so users may also choose to combine these components in a way that is effective for their tasks.

III-B3 Trainer and Evaluator

The Trainer module takes training datasets, validation datasets, and a set of hyperparameter configurations to train a GNN model. HW2VEC currently supports two types of Trainer, graph-trainer and graph-pair-trainer. To be more specific, graph-trainer uses GRAPH2VEC’s model to perform graph classification learning and evaluation, while graph-pair-trainer considers pairs of graphs, calculates their similarities, and ultimately performs graph similarity learning and evaluation. Some low-level tasks are also handled by the Trainer module, such as caching the best model weights, as evaluated on the validation set, to disk or performing mini-step testing. Once the training is finished, the Evaluator module plots the training loss and commonly used metrics in ML-based hardware security applications. To facilitate the analysis of the results, HW2VEC also provides utilities to visualize the embeddings of hardware designs with t-SNE based dimensionality reduction [van2008visualizing]. Besides, HW2VEC provides multiple exporting functionalities so that the learned embeddings can be presented in standardized formats, and users can also choose other third-party tools such as Embedding Projector [smilkov2016embedding] to analyze the embeddings.
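The pairwise comparison performed in graph similarity learning can be illustrated on two already-computed graph embeddings. The text above does not name the similarity measure, so this sketch assumes cosine similarity and a fixed decision threshold; both choices, along with the function names, are hypothetical.

```python
import math


def cosine_similarity(h1, h2):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm1 = math.sqrt(sum(a * a for a in h1))
    norm2 = math.sqrt(sum(b * b for b in h2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm1 * norm2)


def is_similar_pair(h1, h2, threshold=0.5):
    """Flag a pair of designs as similar when the score exceeds the threshold."""
    return cosine_similarity(h1, h2) > threshold


# Hypothetical embeddings of three hardware designs.
h_design_a = [1.0, 2.0]
h_design_b = [2.0, 4.0]   # same direction as design A
h_design_c = [2.0, -1.0]  # orthogonal to design A

score_ab = cosine_similarity(h_design_a, h_design_b)
score_ac = cosine_similarity(h_design_a, h_design_c)
```

In a similarity-learning setup the threshold would typically be tuned on a validation set rather than fixed in advance.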

IV HW2VEC Use-cases

In this section, we describe HW2VEC use-cases. First, Section IV-A exhibits a fundamental use-case in which a hardware design is converted into a graph and then into a fixed-length embedding. Next, the use-cases of HW2VEC for two hardware security applications (detecting hardware Trojans and hardware IP piracy) are described in Section IV-B and Section IV-C, respectively.

IV-A Use-case 1: Converting a Hardware Design to a Graph Embedding

The first use-case demonstrates the transformation of a hardware design $p$ into a graph $g$ and then into an embedding $h_g$. As Algorithm 1 shows, HW2GRAPH uses the preprocessing (PRE_PROC), graph generation (GRAPH_GEN), and post-processing (POST_PROC) modules, which are detailed in Section III-A, to convert each hardware design into the corresponding graph. The graph $g$ is fed to GRAPH2VEC, which uses Data Processing (DATA_PROC) to generate the node embeddings $X$ and the adjacency information $A$. Then, $X$ and $A$ are processed through GRAPH_CONV, GRAPH_POOL, and GRAPH_READOUT to generate the graph embedding $h_g$. The resulting $h_g$ can be further inspected with the utilities of Evaluator (see Section III-B3).

Input: A hardware design program p.
Output: A graph embedding h_g for p.
def HW2GRAPH(p):
      p' ← Pre_Proc(p); g ← Graph_Gen(p'); g' ← Post_Proc(g);
      return g';
def GRAPH2VEC(g):
      X, A ← Data_Proc(g);
      X_prop ← GRAPH_CONV(X, A);
      X_pool, A_pool ← GRAPH_POOL(X_prop, A);
      h_g ← GRAPH_READOUT(X_pool);
      return h_g;
Algorithm 1 Use-case - HW2VEC

In HW2VEC, we provide Algorithm 1’s implementation in use_case_1.py of our repository.
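To make the DATA_PROC step of Algorithm 1 more concrete, the following sketch turns a tiny hand-written graph into a feature matrix and an adjacency matrix. The encoding is an assumption made for illustration (one-hot node types, a dense directed adjacency matrix, edges pointing from source to destination), and `data_proc` is a hypothetical helper rather than the function used in use_case_1.py.

```python
from typing import Dict, List, Tuple


def data_proc(node_types: Dict[str, str], edges: List[Tuple[str, str]]):
    """Convert a typed, directed graph into (nodes, vocab, X, A)."""
    nodes = sorted(node_types)                 # fixed node ordering
    vocab = sorted(set(node_types.values()))   # fixed node-type ordering
    index = {name: i for i, name in enumerate(nodes)}
    # X: one row per node, one-hot over the node-type vocabulary.
    x = [[1.0 if node_types[n] == t else 0.0 for t in vocab] for n in nodes]
    # A: a[i][j] = 1 when there is an edge from node i to node j.
    a = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        a[index[src]][index[dst]] = 1
    return nodes, vocab, x, a


# Toy graph for `assign out = a & b;` (hand-written, not HW2GRAPH output).
node_types = {"a": "input", "b": "input", "and0": "and", "out": "output"}
edges = [("a", "and0"), ("b", "and0"), ("and0", "out")]
nodes, vocab, X, A = data_proc(node_types, edges)
```

The resulting `X` and `A` are the kind of inputs that a graph convolution layer consumes, regardless of how many nodes the design produces.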

IV-B Use-case 2: Hardware Trojan Detection

In this use-case, we demonstrate how to use HW2VEC to detect Hardware Trojans (HTs), which have been a major hardware security challenge for many years. An HT is an intentional, malicious modification of a circuit by an attacker [rostami2013hardware]. The ability to detect HTs at an early stage (particularly at the RTL) is crucial, as removing HTs at later stages can be very expensive. The majority of existing solutions rely on a golden HT-free reference or cannot generalize detection to previously unseen HTs. [gnn4tj2021] proposes a GNN-based approach to model the circuit's behavior and identify the presence of HTs.

Input: A hardware design program p.
Output: A label indicating whether p contains a Hardware Trojan.
def use_case_2(p):
      g ← HW2GRAPH(p); h_g ← GRAPH2VEC(g); ŷ ← MLP(h_g);
      if ŷ is the Trojan class then
            return Trojan;
      else
            return Non_Trojan;
Algorithm 2 Use-case - Hardware Trojan Detection

To realize [gnn4tj2021] in HW2VEC, we first use HW2GRAPH to convert each hardware design $p$ into a graph $g$. Then, we transform each $g$ into a graph embedding $h_g$. Lastly, $h_g$ is used to make a prediction $\hat{y}$ with an MLP layer. To train the model, the cross-entropy loss is calculated collectively for all the graphs in the training set (see Equation 8).

$$H(Y, \hat{Y}) = -\sum_{i} y_i \log(\hat{y}_i) \tag{8}$$

where $H$ is the loss function, $Y$ stands for the set of ground-truth labels (either Trojan or Non_Trojan), and $\hat{Y}$ represents the corresponding set of predictions. Once the model is trained by minimizing $H$, we use it with Algorithm 2 to perform HT detection (a pre-trained model can also be used). In practice, we provide an implementation in use_case_2.py in our repository.
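The training objective and the final decision of Algorithm 2 can be sketched in a few lines of plain Python. The sketch assumes that the MLP outputs two raw scores per design, that a softmax turns them into class probabilities, and that index 0 corresponds to Trojan; the class order and helper names are illustrative choices, not taken from use_case_2.py.

```python
import math
from typing import List

LABELS = ["Trojan", "Non_Trojan"]  # assumed class order


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over raw scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true: List[int], logits_batch: List[List[float]]) -> float:
    """Sum over the batch of -log(probability assigned to the true class)."""
    loss = 0.0
    for label, logits in zip(y_true, logits_batch):
        loss -= math.log(softmax(logits)[label])
    return loss


def predict(logits: List[float]) -> str:
    """Return the label with the highest predicted probability."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]


# Two designs with made-up MLP outputs; the first is labelled Trojan (index 0).
batch_loss = cross_entropy([0, 1], [[2.0, -1.0], [0.5, 1.5]])
decision = predict([2.0, -1.0])
```

With one-hot labels, the sum over classes in the cross-entropy reduces to the negative log-probability of the correct class, which is what `cross_entropy` accumulates.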

IV-C Use-case 3: Hardware IP Piracy Detection

This use-case demonstrates how to leverage HW2VEC to confront another major hardware security challenge – determining whether one of the two hardware designs is stolen from the other or not. The IC supply chain has been so globalized that it exposes the IP providers to theft and illegal IP redistribution. One state-of-the-art countermeasure embeds the signatures of IP owners on hardware designs (i.e., watermarking or fingerprinting), but it causes additional hardware overhead during the manufacturing. Therefore, [gnn4ip2021] addresses IP piracy by assessing the similarities between hardware designs with a GNN-based approach. Their approach models the behavior of a hardware design (in RTL or GLN) in graph representations.

Input: A pair of hardware design programs p_1, p_2.
Output: A label indicating whether p_1, p_2 is piracy.
def use_case_3(p_1, p_2):
      g_1, g_2 ← HW2GRAPH(p_1), HW2GRAPH(p_2);
      h_g1, h_g2 ← GRAPH2VEC(g_1), GRAPH2VEC(g_2);
      ŷ ← Cosine_Sim(h_g1, h_g2);
      if ŷ > δ then
            return Piracy;
      else
            return Non_Piracy;
use_case_3(p_1, p_2);
Algorithm 3 Use-case - Hardware IP Piracy Detection

To implement [gnn4ip2021], the GNN model has to be trained with a graph-pair classification trainer in GRAPH2VEC. The first step is to use HW2GRAPH to convert a pair of circuit designs $p_1$, $p_2$ into a pair of graphs $g_1$, $g_2$. Then, GRAPH2VEC transforms both $g_1$ and $g_2$ into graph embeddings $h_{g_1}$, $h_{g_2}$. To train this GNN model for assessing the similarity of $h_{g_1}$ and $h_{g_2}$, a cosine similarity is computed as the final prediction of piracy, denoted as $\hat{y}$. The loss between a prediction $\hat{y}$ and a ground-truth label $y$ is calculated as Equation 9 shows. Lastly, the final loss is computed collectively with a loss function $H$ for all the graphs in the training set (see Equation 10).

$$G(y, \hat{y}) = \begin{cases} 1 - \hat{y}, & \text{if } y = 1 \\ \max(0, \hat{y} - \text{margin}), & \text{if } y = -1 \end{cases} \tag{9}$$

$$H(Y, \hat{Y}) = \sum_{i} G(y_i, \hat{y}_i) \tag{10}$$

where $Y$ stands for the set of ground-truth labels (either Piracy or Non_Piracy) and $\hat{Y}$ represents the corresponding set of predictions. The margin is a constant that prevents the learned embedding from becoming distorted (always set to 0.5 in [gnn4ip2021]). Once trained, we use this model and Algorithm 3 with $\delta$, a decision boundary used for making the final judgment, to detect piracy. In practice, we provide the implementation of Algorithm 3 in use_case_3.py.
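The similarity score, the margin-based pair loss, and the thresholded decision can be sketched as follows. The sketch assumes labels encoded as +1 (Piracy) and -1 (Non_Piracy) and uses the margin of 0.5 mentioned above; the value passed as the decision boundary `delta` is an arbitrary illustrative choice, and the helper names are hypothetical.

```python
import math
from typing import List


def cosine_sim(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two embeddings (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_loss(y: int, y_hat: float, margin: float = 0.5) -> float:
    """Similar pairs are pulled toward 1; dissimilar pairs are pushed below the margin."""
    if y == 1:
        return 1.0 - y_hat
    return max(0.0, y_hat - margin)


def total_loss(labels: List[int], sims: List[float], margin: float = 0.5) -> float:
    """Loss accumulated over all pairs in the training set."""
    return sum(pair_loss(y, s, margin) for y, s in zip(labels, sims))


def detect(h1: List[float], h2: List[float], delta: float) -> str:
    """Apply the decision boundary delta to the similarity score."""
    return "Piracy" if cosine_sim(h1, h2) > delta else "Non_Piracy"


# Made-up embeddings; delta=0.5 is an assumed threshold for this example.
verdict = detect([1.0, 0.0], [1.0, 0.1], delta=0.5)
```

Note that dissimilar pairs whose similarity already lies below the margin contribute no loss, so training effort concentrates on pairs that are still confusable.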

V Experimental Results

In this section, we evaluate HW2VEC through various experiments using the use-case implementations described earlier.

V-A Dataset Preparation

For evaluation, we prepare one RTL dataset for HT detection (TJ-RTL) and both RTL and GLN datasets (IP-RTL and IP-GLN) for IP piracy detection.

V-A1 The TJ-RTL dataset

We construct the TJ-RTL dataset by gathering hardware designs with or without HTs from the Trust-Hub.org benchmark [tehranipoor2016trusthub]. From Trust-Hub, we collect three base circuits, AES, PIC, and RS232, and insert 34 varied types of HTs into them. We also include these HTs as standalone instances in the TJ-RTL dataset. Furthermore, we insert these standalone HTs into two other circuits (DES and RC5) and include the resulting circuits to expand the TJ-RTL dataset. Among the five base circuits, AES, DES, and RC5 are cryptographic cores that encrypt the input plaintext into the ciphertext based on a secret key. For these circuits, the inserted HTs can leak sensitive information (e.g., the secret key) via side-channels such as power and RF radiation, or degrade the performance of their host circuits by increasing the power consumption and draining the power supply. RS232 is an implementation of the UART communication channel; the HT attacks on RS232 can affect the functionality of either the transmitter or the receiver, or can interrupt or disable the communication between them. The PIC16F84 is a well-known Peripheral Interface Controller (PIC) microcontroller, and the HTs for PIC tamper with its functionality and manipulate the program counter register. Lastly, we create the graph datasets, DFG-TJ-RTL and AST-TJ-RTL, in which each graph instance is annotated with a Trojan or Non_Trojan label.

V-A2 The IP-RTL and IP-GLN datasets

To construct the datasets for evaluating piracy detection, we gather RTL and GLN hardware designs in Verilog format. The RTL dataset includes common hardware designs, such as single-cycle and pipelined implementations of the MIPS processor, which are either derived from open-source hardware designs available on the Internet or designed by a group of in-house designers who were given the same specification to implement in Verilog. The GLN dataset includes the ISCAS'85 benchmark [hansen1999unveiling], which contains 7 different hardware designs (c432, c499, c880, c1355, c1908, c6288, c7552), and their obfuscated instances derived from Trust-Hub. Obfuscation complicates the circuit and confuses reverse engineering but does not change the behavior of the circuit. Our collection comprises 50 distinct circuit designs and several hardware instances for each circuit design, adding up to 143 GLN and 390 RTL codes. We form a graph-pair dataset of 19,094 similar pairs and 66,631 different pairs, and dedicate 20% of these 85,725 pairs to testing and the rest to training. This dataset comprises pairs of hardware designs, labelled as piracy (positive) or no-piracy (negative).

V-B HW2VEC Evaluation: Hardware Trojan Detection

Here, we evaluate the capability of HW2VEC in identifying the existence of HTs in hardware designs. We leverage the implementation mentioned in Section IV-B. As for hyperparameters, we follow the best setting used in [gnn4tj2021], which is stored as a preset in a YAML configuration file. For performance metrics, we count the True Positives ($TP$), False Negatives ($FN$), and False Positives ($FP$) to derive Precision $P = TP/(TP+FP)$ and Recall $R = TP/(TP+FN)$. $R$ manifests the percentage of HT-infested samples that the model can identify. As the number of HT-free samples incorrectly classified as HT-infested is also critical, we compute $P$, which indicates what percentage of the samples that the model classifies as HT-infested actually contain HTs.

The $F_1$ score is the harmonic mean of precision and recall, which summarizes both in a single number, calculated as

$$F_1 = \frac{2 \times P \times R}{P + R}$$


To demonstrate whether the learned model can generalize its knowledge to unknown or unseen circuits, we perform a variant of leave-one-out cross-validation. We perform a train-test split on the TJ-RTL dataset by leaving one base circuit benchmark in the testing set and using the remaining circuits to train the model. We repeat this process for each base circuit and average the metrics we acquire from evaluating each testing set. The result is presented in Table I, indicating that HW2VEC can reproduce results comparable to [gnn4tj2021] in terms of $F_1$ score (0.926 versus 0.940) if we use DFG as the graph representation. The difference in performance can be due to the use of different datasets. When using AST as the graph representation for detecting HTs, HW2VEC performs worse in terms of $F_1$ score, indicating that DFG is a better graph representation because it captures the data flow information instead of simply the syntactic information of a hardware design code. All in all, these results demonstrate that HW2VEC can be leveraged for studying HT detection at design phases.

| Method | Graph | Dataset | Precision | Recall | F1 |
|---|---|---|---|---|---|
| HW2VEC | DFG | RTL | 0.87334 | 0.98572 | 0.92596 |
| HW2VEC | AST | RTL | 0.90288 | 0.8 | 0.8453 |
| [gnn4tj2021] | DFG | RTL | 0.923 | 0.966 | 0.940 |

TABLE I: The performance of HT detection using HW2VEC.
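The evaluation protocol of this subsection, holding out one base circuit per fold and averaging precision, recall, and F1 over the folds, can be sketched as below. The sample encoding (a base-circuit name paired with a 0/1 label) and the helper names are assumptions made for illustration; model training and prediction are deliberately left out.

```python
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[str, int]  # (base circuit, label) with 1 = Trojan, 0 = Non_Trojan


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from TP, FP, and FN counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def leave_one_circuit_out(samples: List[Sample]) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out, train, test), holding out one base circuit per fold."""
    for held_out in sorted({circuit for circuit, _ in samples}):
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


def average_metrics(per_fold: List[Tuple[float, float, float]]) -> Tuple[float, ...]:
    """Average each metric over the folds."""
    return tuple(sum(m[i] for m in per_fold) / len(per_fold) for i in range(3))


# Toy data: labels only; a real run would train on `train` and predict on `test`.
samples = [("AES", 1), ("AES", 0), ("PIC", 1), ("RS232", 0)]
folds = list(leave_one_circuit_out(samples))
```

Because the metrics are averaged fold by fold, the averaged F1 need not equal the F1 that would be computed from the averaged precision and recall.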

V-C HW2VEC Evaluation: Hardware IP Piracy Detection

Besides the capability of HT detection, we also evaluate the power of HW2VEC in detecting IP piracy. We leverage the usage example mentioned in Section IV-C, which examines the cosine-similarity score for each hardware design pair and produces the final prediction with the decision boundary. Using the IP-RTL dataset and the IP-GLN dataset (mentioned in Section V-A), we generate graph-pair datasets by annotating pairs of hardware designs that belong to the same hardware category as Similar and pairs that belong to different categories as Dissimilar. We perform a train-test split on the dataset so that 80% of the pairs are used to train the model. As evaluation metrics, we compute the accuracy of detecting hardware IP piracy, which expresses the ratio of correctly predicted samples, and the $F_1$ score. We refer to [gnn4ip2021] for the selection of hyperparameters (stored in a YAML file).
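The pair annotation and the 80/20 split described above can be sketched as follows. The sketch assumes that every unordered pair of designs is enumerated and that Similar and Dissimilar are encoded as +1 and -1; how the pairs were actually sampled for the reported dataset is not stated here, and the design names are placeholders.

```python
import random
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str, int]  # (design A, design B, +1 Similar / -1 Dissimilar)


def make_pairs(designs: Dict[str, str]) -> List[Pair]:
    """Label each unordered pair by whether both designs share a category."""
    pairs = []
    for a, b in combinations(sorted(designs), 2):
        label = 1 if designs[a] == designs[b] else -1
        pairs.append((a, b, label))
    return pairs


def train_test_split(pairs: List[Pair], train_ratio: float = 0.8, seed: int = 0):
    """Shuffle deterministically, then cut at the requested ratio."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Placeholder instances mapped to their hardware category.
designs = {
    "mips_a": "mips", "mips_b": "mips",
    "aes_a": "aes", "aes_b": "aes",
    "c432_a": "c432",
}
pairs = make_pairs(designs)
train_pairs, test_pairs = train_test_split(pairs)
```

Dissimilar pairs naturally outnumber similar ones in such a construction, which matches the imbalance between the 66,631 different pairs and the 19,094 similar pairs reported in Section V-A.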

The result is presented in Table II, indicating that HW2VEC can reproduce results comparable to [gnn4ip2021] in terms of piracy detection accuracy. When using DFG as the graph representation, HW2VEC underperforms [gnn4ip2021] by about 3% at the RTL (0.9438 versus 0.9721) and outperforms [gnn4ip2021] by 4.2% at the GLN level (0.9882 versus 0.9461). Consistent with Section V-B, Table II also shows that using AST as the graph representation can lead to worse performance than using DFG. Figure 4 visualizes the graph embeddings that HW2VEC exports for every processed hardware design, allowing users to inspect the results manually. For example, by inspecting Figure 4, we may find a clear separation between mips_single_cycle and AES. HW2VEC could likely perform better with further fine-tuning. However, the evaluation aims to demonstrate that HW2VEC can help practitioners study the problem of IP piracy at the RTL and GLN levels.

| Method | Graph | Dataset | Accuracy | F1 |
|---|---|---|---|---|
| HW2VEC | DFG | RTL | 0.9438 | 0.9277 |
| HW2VEC | DFG | GLN | 0.9882 | 0.9652 |
| HW2VEC | AST | RTL | 0.9358 | 0.9183 |
| [gnn4ip2021] | DFG | RTL | 0.9721 | - |
| [gnn4ip2021] | DFG | GLN | 0.9461 | - |

TABLE II: The results of detecting IP piracy with HW2VEC.
Fig. 4: The embedding visualization with 3D t-SNE.

V-D HW2VEC Evaluation: Timing

To evaluate the time required for training and testing, we test the models on a server with NVIDIA TITAN-XP and NVIDIA GeForce GTX 1080 graphics cards. Table III indicates that training and inference each take less than 15 milliseconds, and that training takes longer than inference because it includes back-propagation. As HW2VEC aims to serve as a research tool, its users must be able to evaluate their applications within a reasonable time. We believe that the time spent by the graph learning pipelines of HW2VEC is acceptable for conducting research. For practical deployment, the actual timing depends on the computation power of the hosting devices and the complexity of the models for the applications. If users need optimized performance for real-time applications, they can implement the models in performance-focused programming languages (C or C++) or ML frameworks (e.g., TensorFlow) using the best model settings found with HW2VEC. As for specialized hardware that can accelerate the processing of GNNs, it is still an open challenge, as indicated in [abadal2020computing].

Table IV indicates that HW2VEC spends 1.98 seconds on average converting the raw hardware code into an AST. Although [han2019hardware] takes 1.37 seconds on average per hardware code, it requires domain knowledge to find a deterministic way to perform feature extraction. For DFG extraction, HW2VEC takes 244.58 seconds per graph on average, as it requires recursive traversals to construct the whole data flow. In our datasets, AES and DES are relatively more complex, so HW2VEC takes 472.46 seconds on average to process them, while the rest of the data instances take 16.70 seconds on average. HW2VEC is clearly slower in DFG extraction, but manual feature engineering possibly requires a much longer time. In design phases, even for an experienced hardware designer, it can take 6-9 months to prototype a complex hardware design [teel2017how], so the time taken by HW2VEC is acceptable and does not slow down the design process. Moreover, as the first open-source tool in the field, HW2VEC will keep evolving and embrace contributions from the open-source community.

training time 10.5 (ms) 13.5 (ms)
testing time 6.8 (ms) 12.4 (ms)
TABLE III: The time profiling for training/inference.
# of node 7573.58 7616.16 971.01
# of edge 8938.11 9495.97 970.01
Exec time 244.58 (s) 14.61 (s) 1.98 (s)
TABLE IV: The graph extraction time profiling. For TJ-DFG-RTL, the hardware AES and DES jointly take 472.46 seconds on average for DFG extraction while the rest of data instances take 16.7 seconds on average.
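As a consistency check on the DFG extraction figures, one can ask what share of the instances AES and DES must account for. The calculation below assumes that the overall average (244.58 s) is the instance-weighted mean of the AES/DES average (472.46 s) and the average of the remaining instances (16.70 s); the helper name is hypothetical.

```python
def group_fraction(overall_mean: float, mean_a: float, mean_b: float) -> float:
    """Solve overall = f * mean_a + (1 - f) * mean_b for the fraction f of group A."""
    return (overall_mean - mean_b) / (mean_a - mean_b)


# Share of AES/DES instances implied by the reported averages (about 0.5).
aes_des_share = group_fraction(244.58, 472.46, 16.70)
```

Under that assumption, roughly half of the DFG instances are AES or DES designs, which explains why the overall average sits midway between the two group averages.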

V-E HW2VEC Applicability

In Section V-B and Section V-C, we have discussed the performance of the GNN-based approach in resolving two hardware security problems: hardware Trojan detection and IP piracy detection. In Section V-B, our evaluation shows that HW2VEC can successfully be leveraged to perform HT detection on hardware designs, particularly on unseen ones, without the assistance of a golden HT-free reference. The capability to model hardware behaviors can be attributed to two factors: the use of a natural representation of the hardware design (e.g., DFG), and the use of a GNN-based method that captures both the structural and the semantic information from the DFG and correlates this information with the final HT labels. Similarly, Section V-C indicates that HW2VEC can be utilized to assess the similarities between circuits and thus can be a countermeasure for IP piracy. The use of a graph representation for a hardware design and a Siamese GNN-based network architecture are the keys in [gnn4ip2021] to performing IP piracy detection at both the RTL and GLN levels. For other hardware security applications, the flexible modules provided by HW2VEC (Trainer and Evaluator) can be adapted easily to different problem settings. For example, by adjusting the Trainer to train the GNN models for node classification, HW2VEC can be adapted to localize the HT(s) or hardware bug(s) that exist in the hardware designs. Also, as the idea of Transfer Learning suggests [torrey2010transfer], the cached models provided by HW2VEC can be used to learn new hardware design related tasks by transferring knowledge from a related task that has already been learned.

VI Conclusion

As technological advancements continue to grow, the fights between attackers and defenders will rise in complexity and severity. To contribute to the hardware security research community, we propose HW2VEC: a graph learning tool for automating hardware security. HW2VEC provides an automated pipeline for hardware security practitioners to extract graph representations from a hardware design in either RTL or GLN. Besides, the toolbox of HW2VEC allows users to realize their hardware security applications with flexibility. Our evaluation shows that HW2VEC can be leveraged and integrated for counteracting two critical hardware security threats: Hardware Trojan Detection and IP Piracy Detection. Lastly, as discussed in this paper, we anticipate that HW2VEC can provide more straightforward access for both practitioners and researchers to apply graph learning approaches to hardware security applications.