Existence of Stack Overflow Vulnerabilities in Well-known Open Source Projects

10/31/2019
by   Md Masudur Rahman, et al.

A stack overflow occurs when a program or process tries to store more data in a buffer located on the stack than the buffer was intended to hold. If the affected program runs with special privileges or accepts data from untrusted network hosts (e.g., a web server), the overflow is a potential security vulnerability. By overflowing a stack buffer, an attacker can corrupt the stack in such a way as to inject executable code into the running program and take control of the process. This is one of the easiest and most reliable methods for attackers to gain unauthorized access to a computer. In this paper, we show how a stack overflow occurs and that many open source projects, such as Linux, Git, and PHP, contain code in which it is possible to overflow the stack and inject malicious code that disrupts the normal execution of a process. In addition, this paper raises awareness of the need to avoid writing code that is a potential source of stack overflow attacks.

I Introduction

The number of buffer overflow vulnerabilities has increased over the last ten years [5], and buffer overflows have been the most common form of security vulnerability in the recent past. Moreover, they dominate the area of remote network penetration vulnerabilities, where an anonymous Internet user seeks to gain partial or total control of a host. These attacks represent one of the most serious security threats because they enable anyone to take control of a host. The C programming language provides built-in functions, such as gets(), strcpy(), and strcat(), through which a stack overflow attack is possible. This attack is known as smashing the stack [6].
A buffer is simply a contiguous block of computer memory that holds multiple instances of the same data type. In C, a buffer is usually an array. Like all variables in C, arrays can be declared either static or dynamic. Static variables are allocated at load time in the data segment, and dynamic variables are allocated at run time on the stack. To overflow is to flow, or fill, over the top or bounds of the buffer. We concern ourselves only with the overflow of dynamic buffers, which is known as a stack-based buffer overflow [10].
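The difference between the two kinds of buffer can be sketched in a few lines of C. This is a minimal illustration, not code from the paper; the names BUF_LEN, call_log, record_call, and local_buffer_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 16  /* hypothetical size chosen for illustration */

/* Static buffer: a single copy exists for the whole run of the program,
   and it is zero-initialised before main() starts. */
static int call_log[BUF_LEN];

/* Marks one more slot of the static buffer and returns how many slots
   have been marked so far. The count survives between calls because it
   has static storage as well. */
int record_call(void)
{
    static int count = 0;
    if (count < BUF_LEN)
        call_log[count++] = 1;
    assert(count <= BUF_LEN);
    return count;
}

/* Dynamic (automatic) buffer: created in this function's stack frame on
   every call and gone when the function returns. Returns its size in bytes. */
size_t local_buffer_bytes(void)
{
    char buffer[BUF_LEN];
    buffer[0] = '\0';
    return sizeof buffer;
}
```

Calling record_call() twice returns 1 and then 2, because the static state persists, whereas the array inside local_buffer_bytes() is a fresh 16-byte stack buffer on each call. Only buffers of the second kind are the subject of this paper.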

I-A Memory Organization of a Process

To understand stack buffers, we first need to understand how a process is organized in memory. A process is a program in execution [11], and its execution must progress in a sequential fashion. A process needs certain resources, including CPU time, memory, files, and I/O devices, to accomplish its task. Processes are divided into three segments [4]: Text, Data, and Stack, as shown in Figure 1.
The text region is fixed by the program. It is also known as the code segment because it holds the program's executable instructions. This region is normally marked read-only, so any attempt to write to it results in a segmentation violation. The data region contains initialized data; global and static variables are stored in this region. The BSS segment holds uninitialized data, which is set to zero at load time of the process.

Fig. 1: Process Memory Organization

The heap is dynamically allocated memory that holds objects obtained through malloc()-style allocation. It is managed by the operating system or the memory manager library. Memory on the heap is allocated, deallocated, and resized regularly during program execution.
The stack contains the frames of function calls, including their arguments and local variables. It grows and shrinks dynamically at run time.
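As a rough illustration of the regions described above, the following C sketch places one object in each of them. It is a sketch under the usual assumptions about a hosted C implementation; the names initialized_global, uninitialized_global, and heap_round_trip are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

int initialized_global = 42;  /* data region: initialised global data */
int uninitialized_global;     /* BSS: set to zero at load time */

/* The machine code of this function is stored in the text region.
   Stores value in a heap cell, reads it back into *out, and frees the
   cell. Returns 1 on success and 0 if the allocation fails. */
int heap_round_trip(int value, int *out)
{
    int *cell;                  /* stack: local variable in this frame */

    assert(out != NULL);
    cell = malloc(sizeof *cell);  /* heap: allocated at run time */
    if (cell == NULL)
        return 0;
    *cell = value;
    *out = *cell;
    free(cell);                 /* heap memory is released explicitly */
    return 1;
}
```

Reading uninitialized_global yields 0 without any assignment, which reflects the zero fill of the BSS segment, while the pointer cell and the parameter value live in the stack frame of heap_round_trip for the duration of the call.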

I-B Various Uses of Stack and Its Organization

A stack is a contiguous block of memory containing data, which resizes dynamically according to the needs of the process. A register called the stack pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address, and its size is adjusted dynamically by the kernel at run time. Several operations are defined on stacks; the two most important, PUSH and POP, are implemented as CPU instructions. PUSH adds an element at the top of the stack. POP, in contrast, reduces the stack size by one by removing the element at the top of the stack. A stack therefore has the property that the last object placed on it is the first object removed. This property is commonly referred to as last in, first out (LIFO).
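The LIFO behaviour of PUSH and POP can be sketched with a small array-backed stack in C. This models the abstract data structure only, not the hardware stack; the type IntStack and the functions stack_init, stack_push, and stack_pop are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 8  /* hypothetical fixed capacity */

typedef struct {
    int items[STACK_CAPACITY];
    size_t top;  /* number of items currently on the stack */
} IntStack;

void stack_init(IntStack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* PUSH: adds value at the top. Returns 1 on success, 0 if the stack is
   full, so the backing array is never written out of bounds. */
int stack_push(IntStack *s, int value)
{
    if (s->top == STACK_CAPACITY)
        return 0;
    s->items[s->top++] = value;
    return 1;
}

/* POP: removes the most recently pushed value and stores it in *out.
   Returns 1 on success, 0 if the stack is empty. */
int stack_pop(IntStack *s, int *out)
{
    if (s->top == 0)
        return 0;
    *out = s->items[--s->top];
    return 1;
}
```

Pushing 1 and then 2 and popping twice yields 2 and then 1: the last value in is the first value out. Note that stack_push checks the capacity before writing, which is exactly the kind of bounds check whose absence causes the overflows discussed later.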
Depending on the implementation, the stack grows either down (towards lower memory addresses) or up. In our examples we use a stack that grows down. This is the way the stack grows on many computers, including those with Intel, Motorola, SPARC, and MIPS processors. What the stack pointer (SP) points to is also implementation dependent: it may point to the last address on the stack or to the next free address after the stack. For our discussion, we assume that it points to the last address on the stack.
The stack consists of logical stack frames that are pushed when a function is called and popped when it returns. A frame is a way to localize information about subroutines. A stack frame contains the parameters to a function, its local variables, and the data necessary to recover the previous stack frame, including the value of the instruction pointer at the time of the function call.

Fig. 2: Stack structure for a function

When a subroutine is called, all this information is pushed onto the stack in a specific order. When the function returns, these values are popped off the stack, and the space is reclaimed for later use by a different function call. Subroutines also use the stack as storage for local variables. The memory organization of the stack for a function call is shown in Figure 2. Parameters are addressed at positive offsets from the return address (ra-register), and local variables at negative offsets from it.

I-C Our Contribution

The contributions of the paper are as follows:

  1. Show how stack overflow occurs in a program.

  2. Find stack overflow vulnerabilities in renowned open source projects: Linux, Git, and PHP.

  3. Raise awareness of the need to avoid vulnerable code.

II Related Work

Buffer overflows have been the most common form of security vulnerability in the last ten years. Moreover, buffer overflow vulnerabilities dominate the area of remote network penetration vulnerabilities, where an anonymous Internet user seeks to gain partial or total control of a host, which leads to severe security threats. Existing research shows how buffer overflow attacks occur and proposes some remedies.
Buffer overflow attacks form a substantial portion of all security attacks simply because buffer overflow vulnerabilities are so common and so easy to exploit [6]. In that paper, Aleph One shows how stack overflow attacks occur and how malicious code is injected to take control of the system. Our work is inspired by Aleph One's work.
Crispin Cowan et al. surveyed the various types of buffer overflow vulnerabilities and attacks, as well as the defensive measures that mitigate them, including the StackGuard method [5]. They then considered which combinations of techniques could eliminate the problem of buffer overflow vulnerabilities while preserving the functionality and performance of existing systems.
The Immunix project has developed the StackGuard defensive mechanism [5, 7], which has been shown to be highly effective at resisting attacks without compromising system compatibility or performance [9, 8].
The existing works on buffer overflow show how an attacker overflows a buffer or the stack to gain control of a system and inject harmful code. However, to our knowledge, none of them looks for such vulnerabilities in existing, widely used open source projects such as Linux, Git, PHP, and many others. Our research tries to find vulnerable code in renowned open source projects in which it is possible to overflow the stack or a buffer to harm the system. In addition, this research raises awareness of the need to remove those vulnerable code portions from these systems and, in future projects, to avoid the code and functions that cause stack overflows.

III Experimental Analysis of Stack Overflow

A stack overflow occurs when a program writes more data to a buffer located on the stack than the buffer can hold. Since buffers are created to contain a finite amount of data, the extra information, which has to go somewhere, can overflow into adjacent memory, corrupting or overwriting the valid data held there. Although it may occur accidentally through programming error, stack overflow is an increasingly common type of security attack on data integrity. In stack overflow attacks, the extra data may contain code designed to trigger specific actions, in effect sending new instructions to the attacked computer that could, for example, damage the user's files, change data, disclose confidential information, or even take control of the attacked computer. Buffer overflow attacks are said to have arisen because the C programming language supplied the framework and poor programming practices supplied the vulnerability.

Fig. 3: Example of a Typical Stack Overflow

Figure 3 shows a function with a typical buffer overflow coding error. The function copies a supplied string without bounds checking by using strcpy() instead of strncpy(). Running this program results in a segmentation violation. Let us look at the structure of the stack, shown in Figure 4, when the function is called.

Fig. 4: Stack Structure for the Function of Figure 3

What is the problem with the code? Why do we get a segmentation violation? The answer is simple: strcpy() copies the contents of the source string into buffer[] until a null character is found in the string. As we can see, buffer[] is much smaller than the source string: buffer[] is 16 bytes long, and we are trying to stuff it with 256 bytes. This means that the 240 bytes after buffer[] on the stack are overwritten. This includes the SFP (Stack Frame Pointer), the RET (return address), and even the function's string argument (see Figure 4). We had filled the source string with the character 'A', whose hexadecimal value is 0x41. That means that the return address is now 0x41414141, which is outside the process address space. That is why, when the function returns and tries to read the next instruction from that address, we get a segmentation violation. So a buffer overflow allows us to change the return address of a function and, in this way, the flow of execution of the program.
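For contrast with the unchecked copy analysed above, the following C sketch shows a bounded copy that never writes past the destination. It is not the code of Figure 3; the function name copy_bounded and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies as much of src as fits into dst (dst_size bytes, including the
   terminating null character) and always terminates dst when dst_size is
   non-zero. Returns the number of source characters that did not fit;
   0 means the whole string was copied. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t src_len;
    size_t n;

    assert(dst != NULL && src != NULL);
    src_len = strlen(src);
    if (dst_size == 0)
        return src_len;
    n = src_len < dst_size - 1 ? src_len : dst_size - 1;
    memcpy(dst, src, n);  /* at most dst_size - 1 bytes are written */
    dst[n] = '\0';
    return src_len - n;
}
```

With a 16-byte destination and a 256-byte source array holding 255 'A' characters plus the terminator, copy_bounded() stores 15 characters and reports 240 characters left over, instead of letting them overwrite the saved frame pointer and the return address. A non-zero return value lets the caller treat truncation as an error rather than continuing silently.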

III-A Functions with Stack Overflow Vulnerabilities

Like strcpy(), several widely used built-in C functions are vulnerable. The standard C library provides a number of functions for copying or appending strings that perform no boundary checking. These include strcat(), strcpy(), sprintf(), and vsprintf(). These functions operate on null-terminated strings and do not check for overflow of the receiving string. gets() is a function that reads a line from stdin into a buffer until either a terminating newline or EOF (end of file) is found. It performs no checks for buffer overflows. The scanf() family of functions can also be a problem if you are matching a sequence of non-white-space characters (%s), or matching a non-empty sequence of characters from a specified set (%[]), the array pointed to by the char pointer is not large enough to accept the whole sequence of characters, and you have not specified the optional maximum field width. If the target of any of these functions is a buffer of static size, and its other argument was somehow derived from user input, then there is a good possibility that a buffer overflow can be exploited.
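Each of the functions listed above has a bounded counterpart in the standard C library. The following sketch shows one possible use of snprintf(), a field width in sscanf(), and fgets(); the wrapper names format_user, parse_word, and read_line are hypothetical and are not taken from the projects studied here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* snprintf() in place of sprintf(): never writes more than dst_size
   bytes. Returns 1 if the whole text fit, 0 if it was truncated. */
int format_user(char *dst, size_t dst_size, const char *name)
{
    int needed;

    assert(dst != NULL && name != NULL);
    needed = snprintf(dst, dst_size, "user=%s", name);
    return needed >= 0 && (size_t)needed < dst_size;
}

/* sscanf() with a maximum field width in place of a bare %s: at most
   15 characters plus the terminator are stored in word. */
int parse_word(const char *input, char word[16])
{
    return sscanf(input, "%15s", word) == 1;
}

/* fgets() in place of gets(): reads at most dst_size - 1 characters and
   strips the trailing newline. Returns 1 on success, 0 on EOF or error. */
int read_line(FILE *in, char *dst, size_t dst_size)
{
    if (dst_size == 0 || dst_size > INT_MAX)
        return 0;
    if (fgets(dst, (int)dst_size, in) == NULL)
        return 0;
    dst[strcspn(dst, "\n")] = '\0';
    return 1;
}
```

For example, format_user() with an 8-byte buffer accepts the name "al" but reports truncation for "alice", and parse_word() stores only the first 15 characters of a 32-character input. In all three wrappers the size of the destination is passed explicitly, so input derived from a user cannot run past the end of the buffer.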

III-B Stack Overflow Vulnerabilities in Open Source Projects

In this paper, we show that many renowned open source projects developed in the C programming language contain such stack overflow vulnerabilities. More specifically, we show stack overflow vulnerabilities in the source code of Linux [1], Git [2], and PHP [3].

III-B1 Vulnerable Codes in Linux System

Linux is a well-known operating system that is widely used by software developers and programmers. However, because it is developed in C, it contains some vulnerable functions. As a result, an attacker can take control of the system by overflowing its stack. Figure 5 shows such vulnerable code, found in the file srm_puts.c of the Linux source project [1]. It is vulnerable because the function takes a character pointer as a parameter, so it is possible to call the function with a very large argument for that parameter.

Fig. 5: Vulnerable Code in Linux: srm_puts.c file

Some other examples are shown in Figure 6, where the vulnerable code snippets are a function call and a parameter. In Figure 7, the parameters and a statement inside the loop could also make stack overflow attacks possible.

Fig. 6: Vulnerable Code in Linux: srm_printk.c file

Fig. 7: Vulnerable Code in Linux: stdio2.c file

III-B2 Vulnerable Codes in Git System

Git is a version control system that is widely used for software development and other version control tasks. It is a distributed revision control system with an emphasis on speed, data integrity, and support for distributed, non-linear workflows. Git is developed in C, and some stack overflow vulnerabilities are therefore found in the Git project [2], as shown in Figures 8, 9, and 10.

Fig. 8: Vulnerable Code in Git: basename.c file

In Figure 8, if malicious input is passed through the function's char argument, it could be possible to jump to the attacker's instructions through a statement in the function.

Fig. 9: Vulnerable Code in Git: sha1.c file

Fig. 10: Vulnerable Code in Git: terminal.c file

In Figure 9, it could be possible to overflow the char parameter of the function, and in Figure 10 the vulnerable statement is fputs(prompt, output_fh).

III-B3 Vulnerable Codes in PHP System

PHP is a server-side scripting language and a powerful tool for making dynamic and interactive Web pages. PHP is widely used open source software developed in the C programming language. However, it also contains some vulnerable code, examples of which are shown in Figures 11, 12, and 13. In Figure 11, we found a call to C's built-in function strcpy(buf, tmp), which is severely vulnerable to stack overflow.

Fig. 11: Vulnerable Code in PHP: reentrancy.c file

Fig. 12: Vulnerable Code in PHP: php_sprintf.c file

In Figure 12, a vulnerable function is used. In Figure 13, the return statement is vulnerable because an attacker could dominate its return control through a parameter and thereby gain control.

Fig. 13: Vulnerable Code in PHP: fnmatch.c file

Like the examples above, much more vulnerable code exists in the Linux, Git, and PHP systems. Such code should be replaced with robust code, and vulnerable built-in C functions should be avoided.

IV Conclusion

A stack overflow attack, or smashing the stack, exploits such a severe security vulnerability that an attacker can easily subvert the control flow of the actual program and gain control of the computer running the process that contains the vulnerable code. This paper has shown the stack overflow vulnerabilities that arise when built-in C functions are used carelessly and without bounds checking of the buffer. Moreover, many popular open source projects contain such vulnerabilities, which have also been described in this paper. We hope that this paper raises awareness among developers of the need to build applications that are secure against stack smashing. Developers therefore have to pay extra attention when they use built-in C functions such as gets(), strcpy(), and strcat().

References

  • [1] “GitHub - torvalds/linux: Linux kernel source tree”, https://github.com/torvalds/linux, Online; accessed 06 May, 2016.
  • [2] “GitHub - git/git: Git Source Code Mirror”, https://github.com/git/git, Online; accessed 06 May, 2016.
  • [3] “GitHub - php/php-src: The PHP Interpreter”, https://github.com/php/php-src, Online; accessed 06 May, 2016.
  • [4] “Data segment - Wikipedia”, https://en.wikipedia.org/wiki/Data_segment, Online; accessed 06 May, 2018.
  • [5] Cowan, Crispin, et al. “Buffer overflows: Attacks and defenses for the vulnerability of the decade.” DARPA Information Survivability Conference and Exposition, 2000. DISCEX’00. Proceedings. Vol. 2. IEEE, 2000.
  • [6] One, Aleph. “Smashing the stack for fun and profit.” Phrack. Vol. 7. 1996.
  • [7] Cowan, Crispin, et al. “StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks.” Usenix Security. Vol. 98. 1998.
  • [8] Cowan, Crispin, and Calton Pu. “Survivability from a Sow’s ear: The retrofit security requirement.” Proceedings of the 1998 Information Survivability Workshop. 1998.
  • [9] Cowan, Crispin, et al. “Protecting systems from stack smashing attacks with StackGuard.” Linux Expo. 1999.
  • [10] Litchfield, David. “Defeating the stack based buffer overflow prevention mechanism of Microsoft Windows 2003 Server.” 2003.
  • [11] Silberschatz, Abraham, J. L. Peterson, and P. B. Galvin. “Operating systems.” John Wiley & Sons, 1991.