
Difference between revisions of "Buffer overflow"

From NetSec
 

Revision as of 22:00, 4 September 2011

A buffer overflow, or buffer overrun, is a software error triggered when a program does not adequately control the amount of data copied into a buffer. If that amount exceeds the buffer's allocated capacity, the remaining bytes are written to adjacent memory, overwriting its original contents. This may lead to arbitrary code execution and allow access to a vulnerable system.

Description

As an analogy, consider an Alzheimer's patient who is confronted with a particular set of circumstances and tries to remember what to do in that situation. The patient may remember the wrong thing and therefore do something different. If a psychologist had inserted false memories, so that the patient remembered what the psychologist wanted and acted on the psychologist's instructions, the psychologist would be controlling the patient. The same holds for a computer. A computer receives input, looks up in memory what to do with that input, and then does it. If an attacker on the internet can control the memory of a computer, the computer will look up the wrong thing to do and do it, because it cannot tell the difference. This is what happens during a buffer overflow attack.

The memory of a computer is much like a post office. Each piece of mail goes to a P.O. box, and each P.O. box can hold only one piece of mail at a time. Suppose that the post office representing the computer's memory has 500 P.O. boxes. Boxes 1-200 are for data that the user sends into the computer, and boxes 201-500 hold instructions for what to do with that data. What happens if a user sends in 300 pieces of mail? A secure program would tell the user "I can only hold 200 pieces, I'm not taking any more mail", but an insecure program would simply store all the data in boxes 1-300, overwriting the instructions in boxes 201-300. When the computer next looks up what to do, it lands on P.O. box 201. If the user is an attacker, box 201 now holds whatever instructions the attacker chose to send. This is why the buffer overflow is such a dangerous vulnerability. Although it is a dying attack vector, the buffer overflow is still prominent today.
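The post office can be modelled in a few lines of C. This is only a sketch of the analogy, not real exploit code: the names `deliver_checked` and `deliver_unchecked` are made up for illustration, and the whole post office is a single 500-byte array so that the "overflow" stays inside memory the program owns and the example remains well-defined.

```c
#include <stddef.h>
#include <string.h>

#define TOTAL_BOXES 500  /* the whole post office */
#define DATA_BOXES  200  /* boxes 1-200 (indexes 0-199) hold user data */

/* Secure program: refuses mail that does not fit in the data boxes.
   Returns 0 on success, -1 if the delivery was rejected. */
int deliver_checked(unsigned char office[TOTAL_BOXES],
                    const unsigned char *mail, size_t count)
{
    if (count > DATA_BOXES)
        return -1;
    memcpy(office, mail, count);
    return 0;
}

/* Insecure program: stores everything it is given, so mail beyond
   piece 200 lands in the instruction boxes. The clamp to TOTAL_BOXES
   exists only to keep this simulation inside its own array.
   Returns the number of boxes written. */
size_t deliver_unchecked(unsigned char office[TOTAL_BOXES],
                         const unsigned char *mail, size_t count)
{
    if (count > TOTAL_BOXES)
        count = TOTAL_BOXES;
    memcpy(office, mail, count);
    return count;
}
```

Delivering 300 pieces with `deliver_checked` leaves box 201 (index 200) untouched, while `deliver_unchecked` replaces the contents of boxes 201-300 with the sender's data.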

In practice, the computer uses a return address to remember where its instructions are. Suppose P.O. box 201 contains the return address. If an attacker fills P.O. boxes 1-201 and changes the return address in box 201 to point at P.O. box 1, the computer will execute the attacker's data instead of just keeping it in memory. The attacker therefore has to know enough about the system to know which address the malicious instructions will be stored at; otherwise the attacker will not know the correct return address to put into P.O. box 201. The attacker must have precise aim, or the attack will be unsuccessful.
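The role of the return address can be sketched the same way. In this toy model (all names are hypothetical, and box numbers stand in for real memory addresses), each box holds one number, box 201 sits at zero-based index 200, and `next_box` plays the part of the computer looking up where to continue.

```c
#include <stddef.h>

#define TOTAL_BOXES 500
#define RETURN_BOX  200  /* zero-based index of P.O. box 201 */

/* Insecure delivery: copies every piece of mail it is given.
   The clamp to TOTAL_BOXES only keeps the simulation in bounds. */
void fill_boxes(unsigned office[TOTAL_BOXES],
                const unsigned *mail, size_t count)
{
    for (size_t i = 0; i < count && i < TOTAL_BOXES; i++)
        office[i] = mail[i];
}

/* The "computer" reads the return box to learn which box
   holds the instructions it should run next. */
unsigned next_box(const unsigned office[TOTAL_BOXES])
{
    return office[RETURN_BOX];
}
```

If the return box initially points at box 251 (index 250) and 200 pieces are delivered, `next_box` still returns 250. Delivering a 201st piece whose value is 0 makes `next_box` return 0, so the computer would "run" whatever was delivered into box 1.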

Defenses

Multiple defenses have been built into the runtime environment in an attempt to prevent buffer overflows from being exploited. One of the most recent is ASLR, which stands for Address Space Layout Randomization. With ASLR, the address space a program lives in changes every time the computer reboots and every time the program runs. In terms of the mailbox analogy, the return address is never in the same mailbox twice. The goal is to prevent an attacker from performing a buffer overflow exploit, because the attacker can never aim properly. Unfortunately, attackers have discovered a technique called "Magic Numbers", which tricks a program's error handler and allows the attacker to aim the attack correctly without having to know a return address.
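One way to observe ASLR is to print where a program's data ends up and compare the values between runs. The helper below is a minimal sketch with a made-up name, `print_layout`; it prints the address of one stack variable and one heap allocation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Prints the address of a local (stack) variable and of a heap
   allocation. Returns 0 on success, -1 if the allocation failed. */
int print_layout(FILE *out)
{
    int on_stack = 0;
    int *on_heap = malloc(sizeof *on_heap);

    if (on_heap == NULL)
        return -1;

    fprintf(out, "stack: %p\nheap:  %p\n",
            (void *)&on_stack, (void *)on_heap);
    free(on_heap);
    return 0;
}
```

Calling `print_layout(stdout)` from `main` and running the program several times should show different addresses on each run when ASLR is active, and the same addresses when it is disabled. The exact values depend on the operating system and its settings, so no sample output is given here.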

Another defense mechanism is DEP, which stands for Data Execution Prevention. DEP marks the memory regions that hold data as non-executable, so that even if the return address is changed to point into the same memory space as the data, machine code (the code that buffer overflow payloads are written in) placed there will not run. To combat this defense mechanism, attackers have developed ASCII and polymorphic ASCII machine code, which looks like normal user input instead of machine code.

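A further defense mechanism is StackGuard. As originally published, StackGuard is a compiler extension that places a known value, called a canary, next to the return address and checks it before a function returns, so an overflow that reaches the return address is detected before the changed address is used. Related host-based protections attempt to identify what code running from data within the buffer (or the data segment) could do, and then prevent the application from calling external functions in shared objects from inside the buffer. A version of this has been implemented in Cisco Security Agent, or CSA.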
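The canary idea behind StackGuard can be illustrated with a simulated stack frame. This is a sketch under stated assumptions, not how a compiler actually lays out a frame: the `struct frame` layout, the function names, and the fixed `CANARY` value are all invented for illustration (real canaries are normally chosen at random when the process starts), and the frame is an ordinary struct so the simulated overflow stays within one object.

```c
#include <stddef.h>
#include <string.h>

#define CANARY 0xC0FFEE42u  /* hypothetical fixed value for the demo */

struct frame {
    char     buffer[16];   /* local buffer, filled from user input */
    unsigned canary;       /* guard value placed after the buffer */
    unsigned return_box;   /* stand-in for the saved return address */
};

void frame_init(struct frame *f, unsigned return_box)
{
    memset(f->buffer, 0, sizeof f->buffer);
    f->canary = CANARY;
    f->return_box = return_box;
}

/* Unchecked write starting at the buffer. The clamp to the size of
   the struct only keeps the simulation itself well-defined. */
void frame_write(struct frame *f, const void *data, size_t len)
{
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy(f, data, len);
}

/* Returns 1 if the canary is unchanged, 0 if it was overwritten. */
int frame_canary_intact(const struct frame *f)
{
    return f->canary == CANARY;
}
```

Writing 16 bytes leaves the canary intact. Writing enough bytes to reach `return_box` necessarily overwrites the canary first, so a check of `frame_canary_intact` before "returning" catches the overflow.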

With CSA, ASLR, and operating-system-supplied DEP all in place, successfully performing a buffer overflow exploit against a system is extremely difficult. Any attacker who gets far enough for CSA to catch the attempt is already very advanced. To successfully subvert ASLR, DEP and StackGuard, one must use polymorphic ASCII shellcode: machine code that modifies itself, looks like standard user input, and has all of its own functions built into its own code. The return address, however, must be supplied as raw binary bytes rather than as printable text, so it will usually display as odd characters such as squares or strange symbols. In a real buffer overflow exploit attempt, the IDS or HIDS context buffer will show four squares or symbols at the end on a 32-bit system, and eight on a 64-bit system.
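The four-versus-eight observation follows from the size of an address. The sketch below shows why the trailing bytes tend to be unprintable; the function names are made up, the little-endian byte order is an assumption (it matches x86 systems), and the address used in the usage note is an arbitrary example value.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the low `width` bytes of addr in little-endian order.
   width must be 4 (32-bit address) or 8 (64-bit address). */
void encode_address(uint64_t addr, unsigned char *out, size_t width)
{
    for (size_t i = 0; i < width; i++)
        out[i] = (unsigned char)(addr >> (8 * i));
}

/* Counts the bytes that have no printable representation in the
   current locale and so would show up as squares or symbols. */
size_t count_unprintable(const unsigned char *bytes, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (!isprint(bytes[i]))
            n++;
    return n;
}
```

For the example address `0xbffff6c0` encoded with a width of 4, the bytes are `c0 f6 ff bf`, and none of them is printable in the default C locale. Not every address byte is unprintable in general, which is why the symbols are a strong hint rather than proof.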

Maximum Effectiveness

Attackers and pen-testers alike sometimes use what is called second stage shellcode. Firewall rules often block all outgoing connections from a server machine and all incoming connections except those on the server's designated port. Second stage shellcode works around this by first finding the connection that the exploit arrived on, and then sending the output of the arbitrary functions back along that same connection. Because no new connection is opened, the firewall has nothing to block.

Buffer overflows can be used remotely to gain partial or total system access, or locally to escalate privileges inside the operating system in order to gain system or root level access. The greatest threat is a buffer overflow that the security world has never seen before, used in what is called a "zero-day attack". Zero-day, or 0day, attacks are the most devastating to the security industry: they fuel worms and viruses, and sometimes lead to hundreds of thousands of systems being compromised in a single day.

Causes

Buffer overflows exist because of insecure programming. A programmer should be able to check input to the data segment with relative ease. However, coders are often either ignorant of the problem or overlook the flaw, and sometimes a disgruntled employee might even code the vulnerability into an application deliberately, for personal gain after the application goes into production.
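The check itself is small. As a minimal sketch (the helper name `copy_checked` is invented here and is not a standard library function), a copy routine can compare the input length against the destination size and refuse input that does not fit:

```c
#include <stddef.h>
#include <string.h>

/* Copies the string src into dst only if it fits, including the
   terminating '\0'. Returns 0 on success, or -1 if src is too long,
   in which case dst is left unchanged. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;

    memcpy(dst, src, len + 1);  /* len + 1 also copies the '\0' */
    return 0;
}
```

With an 8-byte buffer, `copy_checked(buf, sizeof buf, "short")` succeeds, while a string of 8 or more characters is rejected instead of spilling into adjacent memory. Rejecting oversized input, rather than silently truncating it, is a design choice made for this example; either approach keeps the write within the buffer.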