
Using a Semantic Knowledge Base to Improve the Management of Security Reports in Industrial DevOps Projects

Integrating security activities into the software development lifecycle to detect security flaws is essential for any project. These activities produce reports that must be managed and looped back to project stakeholders, such as developers, to enable security improvements. This so-called Feedback Loop is a crucial part of any project and is required by various industrial security standards and models. However, operating this loop presents a variety of challenges. They range from ensuring that feedback data is of sufficient quality, through providing different stakeholders with the information they need, to the enormous effort required to manage the reports. In this paper, we propose a novel approach that treats findings from security activity reports as belief in a Knowledge Base (KB). By utilizing continuous logical inferences, we derive the information practitioners need and address existing challenges in the industry. This approach is currently being evaluated in industrial DevOps projects, using data from continuous security testing.


1. Introduction

Automating security activities like periodic testing for vulnerabilities and flaws is essential in industrial projects that use DevOps techniques (Moyón et al., 2020; Kim, 2011) to produce software products in domains with high security-related demands. As a result, new data about the software's security is continuously generated, revealing shortcomings of the software and new requirements. In order to improve the product, reports must be fed back into the development cycle and be addressed by developers (Simpson, 2014). This so-called Feedback Loop is demanded by various standards and industry best practices (IEC, 2018; Migues et al., 2020).

In practice, however, implementing the Feedback Loop presents various challenges. The first challenge is the quality of the reports. The vast amount of data produced by security activities varies in format, content, perspective, assumptions, and evaluation (Welberg, 2008), which necessitates data processing. Issues like False Positives are common in reports (Nadeem et al., 2012), reducing their reliability. The second challenge is how the data is utilized. The correct interpretation of the security activity data is essential for the subsequent actions, as the data itself fuels project decisions and represents the software's security level. Moreover, each project has its own demands, e.g., regarding standards compliance. Consequently, the produced information must meet high quality demands and be customized for each project. The third challenge is that data from security activities is only half of what is needed. Inputs from security experts, customer opinions, or vulnerability databases are essential to correctly estimate the impact of findings and to present a valid representation of reality. Finally, performing the management manually to collate actionable information is neither feasible in industrial software development projects nor in line with the DevOps mindset, where automation of tasks is crucial (Leite et al., 2019). Hence, we see a necessity to address the question:

How can the Feedback Loop for security reports in industrial DevOps projects be optimized?

An optimized process should be faster, customizable, more automated with as little manual effort as possible, highly reliable, and comprehensible for the project team.

2. Using a Semantic Knowledge Base

2.1. Theory

We propose the usage of a semantic KB to address the challenges identified above. A KB comprises primary information, logically interconnected by database semantics and stored free of constraints in a metastructural database. In contrast to a database, a KB has primary methods of data processing, which allow the continuous generation of new information based on existing data (Krótkiewicz et al., 2018, 2016). KBs have been applied in various areas, including the management of sensor data (Nambi et al., 2014), the elicitation of high-quality requirements (Kaiya and Saeki, 2006), and even vulnerability management (Wang and Guo, 2009). In contrast to existing approaches, we apply KBs to the domain of secure software engineering to manage security reports.

However, this requires multiple changes to existing concepts for the KB to function in this use case. First, we consider the content of the KB as belief instead of knowledge, because the data is not fully reliable. In particular, our data sources may provide contradictory information. Consequently, belief must be revisable, including both explicit belief (provided from outside the KB) and derived belief (derived by the KB). Moreover, the inference procedure must be customized to each project. This implies that each KB has to deal with belief and inference rules being incrementally added or changed throughout the project. As each KB is unique to its project, a traditional approach using a static ontology is not feasible for our use case, especially when considering changing inference rules. Instead, we enable the incremental creation of a KB for each project. To ensure a consistent KB, we monitor all changes to the KB and identify contradictory information. If conflicting data is identified, the conflict is resolved by considering external human input as more reliable (e.g., a False Positive identification). Any belief derived from the formerly contradictory information is revised and re-calculated.
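The revision behavior described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Belief`, `BeliefBase`, `RELIABILITY`, and `open_issue_rule`, the source labels, and the key naming scheme are all assumptions made for illustration. For simplicity, the sketch recomputes every derived belief after each change, whereas the approach described above only revises belief derived from the affected information.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    key: str                  # what the belief is about, e.g. "f1.reported"
    value: object
    source: str               # "tool", "human", or "derived" (assumed labels)
    derived_from: tuple = ()  # keys of the beliefs this one was inferred from


# Assumed reliability ranking: human input outranks tool output.
RELIABILITY = {"derived": 0, "tool": 1, "human": 2}


class BeliefBase:
    def __init__(self):
        self.beliefs = {}  # key -> Belief
        self.rules = []    # callables taking the BeliefBase, yielding Beliefs

    def add_explicit(self, belief):
        """Add belief from outside the KB; on contradiction, the more
        reliable source wins. Returns False if the new belief is rejected."""
        current = self.beliefs.get(belief.key)
        if current is not None and current.value != belief.value:
            if RELIABILITY[current.source] > RELIABILITY[belief.source]:
                return False
        self.beliefs[belief.key] = belief
        self._revise()
        return True

    def _revise(self):
        """Discard all derived belief and re-run the rules to a fixed point."""
        self.beliefs = {
            k: b for k, b in self.beliefs.items() if b.source != "derived"
        }
        changed = True
        while changed:
            changed = False
            for rule in self.rules:
                for b in rule(self):
                    existing = self.beliefs.get(b.key)
                    if existing == b:
                        continue
                    if existing is not None and existing.source != "derived":
                        continue  # derived belief never overrides explicit belief
                    self.beliefs[b.key] = b
                    changed = True


def open_issue_rule(kb):
    """Hypothetical rule: a reported finding is an open issue unless it is
    believed to be a False Positive."""
    for key, b in list(kb.beliefs.items()):
        if key.endswith(".reported") and b.value is True:
            fid = key.rsplit(".", 1)[0]
            fp = kb.beliefs.get(f"{fid}.false_positive")
            if fp is None or fp.value is not True:
                yield Belief(f"{fid}.open_issue", True, "derived", (key,))


kb = BeliefBase()
kb.rules.append(open_issue_rule)
kb.add_explicit(Belief("f1.reported", True, "tool"))
has_issue_before = "f1.open_issue" in kb.beliefs
kb.add_explicit(Belief("f1.false_positive", True, "human"))
has_issue_after = "f1.open_issue" in kb.beliefs
rejected = not kb.add_explicit(Belief("f1.false_positive", False, "tool"))
```

In this sketch, the derived belief `f1.open_issue` disappears once a human marks the finding as a False Positive, and a later contradicting tool report is rejected because the human input is ranked as more reliable.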

With this approach, a semantic KB can address several challenges of industrial security report management. Generating data automatically with pre-written inference rules ensures high reliability, high comprehensibility, and project customization, and it reduces the scope for human error. Automating the report management reduces the manual effort and speeds up processing.

2.2. Practice

In order to test our theoretical concept, we implemented it for industrial software development projects. We realized our concepts of belief, inference rules, continuous inference, and metastructural data storage in the components depicted in Figure 1.

Figure 1. Component Overview of Example KB

The Data Storage component, which comprises rules and belief, is implemented using the Elasticsearch search engine, which allows us to perform advanced queries on the data. Belief is stored in Elasticsearch documents. Inference rules, however, are written in Python code. The Inference Engine implements the step in which our inference rules are applied to the current content of the KB to derive new information. With the Logical Core, we ensure consistency when new beliefs are added or existing beliefs are revised. These components are implemented in Python, using a self-developed logic to ensure consistency within the KB. Investigating the connections to incremental logic programming approaches, such as Differential Datalog (VMware, 2022), is an interesting direction for future work.

In practice, each project has to customize the information contained in the KB by writing documents for belief and inference rules. Based on our experience, certain inference rules are necessary for any project. These include a parser for security reports, a deduplication of findings, a validation of findings using human expertise, and a prioritization of the resulting issues. In most cases, these inference rules are streamlined, meaning that they build upon each other in a pipeline-like manner. To avoid artificially inflating the KB, we restrict new inferences solely to those that might be invalidated by external input (e.g., an incorrect deduplication).
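The pipeline-like arrangement of rules can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the JSON report layout, the severity scale in `SEVERITY_RANK`, and all function names are hypothetical, and deduplication is left out to keep the sketch short.

```python
import json

# Assumed severity scale; real projects would define their own.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def parse_report(raw):
    """Parser rule: turn a raw JSON report (hypothetical layout) into findings."""
    report = json.loads(raw)
    return [
        {
            "id": f"{report['tool']}-{i}",
            "title": item["title"].strip(),
            "severity": item.get("severity", "low").lower(),
        }
        for i, item in enumerate(report["findings"])
    ]


def validate(findings, false_positives):
    """Validation rule: drop findings a human marked as False Positive."""
    return [f for f in findings if f["id"] not in false_positives]


def prioritize(findings):
    """Prioritization rule: most severe findings first."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_RANK.get(f["severity"], len(SEVERITY_RANK)),
    )


def run_pipeline(raw_reports, false_positives):
    """Apply the rules in order; each rule builds on the previous result."""
    findings = [f for raw in raw_reports for f in parse_report(raw)]
    return prioritize(validate(findings, false_positives))


raw = json.dumps({
    "tool": "scanner",
    "findings": [
        {"title": "Weak cipher", "severity": "Medium"},
        {"title": "SQL injection", "severity": "Critical"},
        {"title": "Debug flag", "severity": "low"},
    ],
})
issues = run_pipeline([raw], false_positives={"scanner-2"})
titles = [issue["title"] for issue in issues]
```

Here the human-supplied set of False Positive identifiers is the external input, and each stage only consumes the output of the stage before it.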

Deduplication, for example, works on findings that were previously parsed from the original reports. During the execution of this rule, findings with similar values for title or description are combined. As this could incorrectly connect two findings, external input correcting potentially flawed inferences is considered during execution. In such cases, the Logical Core investigates whether a revision of belief is necessary. We are currently using this KB in industrial software development projects to manage reports from automated security testing, in order to evaluate its relevance and usefulness.
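A deduplication rule of this kind might look like the sketch below. The similarity measure (`difflib.SequenceMatcher` from the Python standard library), the threshold of 0.8, and the `not_duplicates` format for human corrections are assumptions chosen for illustration; the source does not specify how similarity is computed.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Case-insensitive string similarity check (assumed measure and threshold)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def deduplicate(findings, not_duplicates=frozenset(), threshold=0.8):
    """Group findings with similar titles.

    findings: list of (id, title) tuples.
    not_duplicates: human corrections, a set of frozenset({id_a, id_b})
    pairs that must never end up in the same group.
    """
    groups = []
    for fid, title in findings:
        for group in groups:
            if all(
                frozenset({fid, other_id}) not in not_duplicates
                and similar(title, other_title, threshold)
                for other_id, other_title in group
            ):
                group.append((fid, title))
                break
        else:
            # No compatible group found: the finding starts a new one.
            groups.append([(fid, title)])
    return groups


findings = [
    ("a", "SQL injection in login form"),
    ("b", "SQL Injection in login form."),
    ("c", "Outdated OpenSSL version"),
]
merged = deduplicate(findings)
corrected = deduplicate(findings, not_duplicates={frozenset({"a", "b"})})
merged_ids = [[fid for fid, _ in group] for group in merged]
corrected_ids = [[fid for fid, _ in group] for group in corrected]
```

Without corrections, findings `a` and `b` are grouped together; once a human states that they are distinct, the rule keeps them apart on the next run.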

3. Conclusion and Future Work

The management of reports produced by security activities in industrial DevOps projects is essential in every domain. In this paper, we suggest a novel approach of using a semantic KB for this use case. We indicate necessary changes to existing KB concepts and introduce our concept implementation. We believe that utilizing logical inferences in combination with the reliability of a KB is a promising approach for usage in industrial software engineering projects. Substantiating this belief with a long-term evaluation in a realistic setting will be the core activity of future work.

References

  • I. E. C. (IEC) (2018) 62443-4-1. Security for industrial automation and control systems Part 4-1 Product security development life-cycle requirements. ISBN 978-2-8322-5239-0.
  • H. Kaiya and M. Saeki (2006) Using domain ontology as domain knowledge for requirements elicitation. In 14th IEEE International Requirements Engineering Conference (RE'06), pp. 189–198.
  • G. Kim (2011)
  • M. Krótkiewicz, K. Wojtkiewicz, M. Jodłowiec, and W. Pokuta (2016) Semantic knowledge base: quantifiers and multiplicity in extended semantic networks module. In Knowledge Engineering and Semantic Web, A. Ngonga Ngomo and P. Křemen (Eds.), Cham, pp. 173–187. ISBN 978-3-319-45880-9.
  • M. Krótkiewicz, K. Wojtkiewicz, and M. Jodłowiec (2018) Towards semantic knowledge base definition. In Biomedical Engineering and Neuroscience, W. P. Hunek and S. Paszkiel (Eds.), Cham, pp. 218–239. ISBN 978-3-319-75025-5.
  • L. Leite, C. Rocha, F. Kon, D. Milojicic, and P. Meirelles (2019) A survey of DevOps concepts and challenges. ACM Comput. Surv. 52. ISSN 0360-0300.
  • S. Migues, J. Steven, and M. Ware (2020)
  • F. Moyón, R. Soares, M. Pinto-Albuquerque, D. Mendez, and K. Beckers (2020) Integration of security standards in DevOps pipelines: an industry case study. Cham, pp. 434–452. ISBN 978-3-030-64148-1.
  • M. Nadeem, B. J. Williams, and E. B. Allen (2012) High false positive detection of security vulnerabilities: a case study. New York, NY, USA, pp. 359–360. ISBN 9781450312035.
  • S. N. A. U. Nambi, C. Sarkar, R. V. Prasad, and A. Rahim (2014) A unified semantic knowledge base for IoT. In 2014 IEEE World Forum on Internet of Things (WF-IoT), pp. 575–580.
  • S. Simpson (2014) SAFECode whitepaper: fundamental practices for secure software development 2nd edition. pp. 1–32. ISBN 978-3-658-06708-3.
  • VMware (2022) Differential Datalog (DDlog). Original date: 2018-03-20.
  • J. A. Wang and M. Guo (2009) Security data mining in an ontology for vulnerability management. pp. 597–603.
  • S. M. Welberg (2008) Vulnerability management tools for COTS software - a comparison. CTIT Technical Report Series, Centre for Telematics and Information Technology (CTIT), Netherlands.