
Difference between revisions of "Shellcode/Parsing"

From NetSec

Revision as of 04:59, 22 November 2012

Binary format parsing

A runtime linker parses either the PE (Portable Executable) or the ELF (Executable and Linkable Format) executable format to identify function pointers. This is useful when writing code that must link against different versions of the same shared library. For example, 32-bit and 64-bit Linux system calls have different numbers, so code that links itself at runtime can run despite this limitation. Each shared library format has its own export table for functions accessible to third-party applications, and this table is the best starting point when writing version-independent code.

Self-linking (or runtime-linking) shellcode refers to machine code's ability to use functions that are already present in memory, as opposed to carrying all of its functionality within itself. In general terms, a runtime linker is composed of two parts. The first part must be able to isolate the base pointer of any given library loaded into memory. The second part must be able to parse the library and return the memory address (pointer) of the start of any given function.

This is called self-linking shellcode or self-linking machine code because it does not rely on being linked with any kernel. Instead, it finds the functionality it needs within the run-time environment and calls already existing functions out of memory. This saves the programmer time and code size, and potentially even allows the programmer to write a cross-OS machine code application that uses the built-in functionality of the operating system by linking itself, instead of relying on an external linker to both link and format the binary properly.

Header Diagrams

  • Diagram of a 64-bit ELF Header:
       ELF format information      = 0x0  - 0xf
       Entry-point                 = 0x18 - 0x1f
       Start of section headers    = 0x28 - 0x2f
       Size of each section header = 0x3a - 0x3b
       Number of section headers   = 0x3c - 0x3d


  • Diagram of a 64-bit section header: (length defined in ELF header)
         [0x0-0x3]     shstrtab offset for section name.
                       shstrtab is defined between the end of
                       .text and the beginning of the section
                       headers
         [0x4-0x7]     section type - 0 is null, 1 is progbits, 2 is symtab, 3 is strtab
         [0x8-0xf]     section flags
         [0x10-0x17]   section address
         [0x18-0x1f]   section offset
         [0x20-0x27]   section size
         [0x28-0x2b]   Section Link
         [0x2c-0x2f]   Section Info
         [0x30-0x37]   Section Align
         [0x38-0x3f]   Section EntSize
  • Diagram of a 64-bit symbol table entry: (0x18 bytes in length)
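         [0x0-0x3]    Name offset into the associated string table
         [0x4]        Info (binding in the high 4 bits, type in the low 4 bits)
         [0x5]        Other (visibility)
         [0x6-0x7]    Ndx (section index)
         [0x8-0xf]    Symbol value (function pointer, data pointer, etc.)
         [0x10-0x17]  Symbol size (0 for assembly labels with no .size directive)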
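
The three layouts above can be written down as fixed-size records. The following is a minimal sketch in Python using the standard struct module, assuming a little-endian ELF64 image held in a bytes object. The function names parse_header, parse_section_header and parse_symbol are hypothetical helpers introduced for illustration, and the field order follows the ELF64 specification rather than anything specific to this page.

```python
import struct

# Little-endian layouts; field order follows the ELF64 specification.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 0x40 bytes
ELF64_SECTION = struct.Struct("<IIQQQQIIQQ")       # 0x40 bytes
ELF64_SYMBOL = struct.Struct("<IBBHQQ")            # 0x18 bytes


def parse_header(image):
    """Return the ELF header fields listed in the header diagram."""
    fields = ELF64_HEADER.unpack_from(image, 0)
    return {
        "entry": fields[4],       # 0x18 - 0x1f
        "shoff": fields[6],       # 0x28 - 0x2f
        "shentsize": fields[11],  # 0x3a - 0x3b
        "shnum": fields[12],      # 0x3c - 0x3d
    }


def parse_section_header(image, offset):
    """Return the type, file offset, size and link of one section header."""
    fields = ELF64_SECTION.unpack_from(image, offset)
    return {"type": fields[1], "offset": fields[4],
            "size": fields[5], "link": fields[6]}


def parse_symbol(image, offset):
    """Return the decoded fields of one 0x18-byte symbol table entry."""
    name, info, other, shndx, value, size = ELF64_SYMBOL.unpack_from(image, offset)
    return {"name": name, "bind": info >> 4, "type": info & 0xF,
            "shndx": shndx, "value": value, "size": size}
```

The record sizes (0x40, 0x40 and 0x18 bytes) match the lengths quoted in the diagrams, which is a quick sanity check on the format strings.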

Example: Printing symbol names

It is relatively trivial to find your image base at runtime with a few assembly instructions, but more difficult to actually parse the ELF image. The following assembly program (not shellcode) dumps its own symbols. It is unstable: it performs no error or size checking.


  • We need a pointer to newlines for later
 
startup:
 xor %r15, %r15
 push $0x0a0a0a
 mov %rsp, %r15
 


  • Get the location of currently executing code so we can calculate the base pointer
 
 call getpc   # getpc (not shown in this listing) returns the address of the dec %rax on the next line in %rax,
              # for example by loading its own return address: mov (%rsp), %rax ; ret
 dec %rax
 xor %rcx, %rcx
 push $0x2
 pop %rsi
 
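  • We build a loop to determine the base pointer of our file. We know that all ELF files start with the magic bytes 0x7f "ELF" (0x464c457f when read as a little-endian 32-bit value), so we step backward one byte at a time until we find them: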
 
find_header:
 cmpl $0x464c457f, (%rax,%rcx,4)   # Did we find our ELF base pointer?
 je find_sections
 dec %rax
 jmp find_header
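
The backward scan can be modeled outside of assembly. This is a minimal sketch in Python, assuming the process memory is represented by a bytes object and address is an index into it. The name find_image_base is hypothetical, and unlike the assembly loop this version stops at index 0 instead of running off the start of mapped memory.

```python
ELF_MAGIC = b"\x7fELF"


def find_image_base(memory, address):
    """Step backward one byte at a time from address until the ELF magic is found."""
    while address >= 0:
        if memory[address:address + 4] == ELF_MAGIC:
            return address
        address -= 1
    return None  # the assembly has no such bound and would fault instead
```

As listed, the assembly starts its scan from an address below find_header and only moves downward, so it does not encounter the copy of the magic value that is encoded as the immediate operand of the cmpl instruction itself.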
 
  • Extract the section header offset from the ELF header
 
find_sections:
 # %rax now = base pointer of ELF image.
 xor %rbx, %rbx
 add $0x28, %bl
 xorl (%rax,%rbx,1), %ecx             # %rcx = offset to section headers
 addq %rax, %rcx                      # %rcx = absolute address to section headers
 
  • Iterate through the section headers, looking for a symbol table section header
 
 # each section header is 0x40 bytes in length.
next_section:
 xor %rbx, %rbx
 xor %rbp, %rbp
 add $0x40, %rcx
 # %rcx now = address of the next section header (the null header at index 0 is skipped)
 add $0x04, %bl
 xor (%rcx,%rbx,1), %ebp              # %rbp now contains type
 cmp $0x02,  %bpl
 jne next_section
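
The same search can be sketched in Python, assuming the image is a bytes object and that shoff, shnum and shentsize were already read from the ELF header. The name find_symtab_header is hypothetical. The sketch bounds the search by the number of section headers, which the assembly loop does not do.

```python
import struct

SHT_SYMTAB = 2  # section type 2 is symtab


def find_symtab_header(image, shoff, shnum, shentsize=0x40):
    """Return the offset of the first symtab section header, or None."""
    # Start at index 1: like the assembly, skip the null header at index 0.
    for index in range(1, shnum):
        offset = shoff + index * shentsize
        (sh_type,) = struct.unpack_from("<I", image, offset + 0x04)
        if sh_type == SHT_SYMTAB:
            return offset
    return None
```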
 
  • The next header is the string table section header
 
found_symbols:
 xor %r8, %r8
 mov %rcx, %r8                        # %rcx = pointer to top of symbol section header
 add $0x40, %r8                       # %r8  = pointer to top of string table section header
 
  • Get the addresses to the actual symbol table and string table
 
 xor %rbx, %rbx
 xor $0x18, %bl                      # the section's file offset is stored 0x18 bytes into its header
 xor %r9, %r9
 xor %r10, %r10
 xor (%rcx,%rbx,1), %r9              # %r9  = file offset of the symbol table
 xor (%r8,%rbx,1), %r10              # %r10 = file offset of the string table
 addq %rax, %r9                      # %r9 should now point to the first symbol
 addq %rax, %r10                     # %r10 should now point to the first string
 addq $0x18, %r9                     # skip the null symbol at index 0
 
  • Iterate through the symbol table, extracting string pointers:
 
next_symbol:
 addq $0x18,%r9
 xor %rcx, %rcx
 xor %rbp, %rbp
 xor %rdi, %rdi
 xor (%r9,%rcx,1), %ebp              # %rbp now contains string offset.
 cmp %rbp, %rdi
 je next_symbol
 
  • Call strlen() on the string pointers for write()
 
print_symbol_name:
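 mov %rbp, %rsi
 addq %r10, %rsi                     # %rsi should now be a pointer to a string
 push $0x01
 pop %rax                            # %rax = 1 (write)
 push %rax
 pop %rdi                            # %rdi = 1 (stdout)
 call strlen                         # %rdx = length of the string at %rsi
 syscall
 jmp write_to_terminal               # without this jump, execution falls through into strlen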
 
  • Our strlen:
 
strlen:
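 xor %rdx, %rdx                      # %rdx = length counter
 
next_byte:
 inc %rdx
 cmpb $0x00, (%rsi,%rdx,1)           # the first byte is never tested; names reaching here are non-empty
 jne next_byte
 ret                                 # %rdx = string length, used as the write() count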
 
  • Write the newline bytes pushed at startup to the terminal, then move on to the next symbol:
 
write_to_terminal:
 push $0x01
 pop %rax
 push %rax
 pop %rdi
 push $0x02
 pop %rdx
 push %r15
 pop %rsi
 syscall
 jmp next_symbol
 
 [user@host ~]$ ./test_parser 
 startup
 
 getpc
 
 find_header
 
 find_sections
 
 next_section
 
 found_symbols
 
 next_symbol
 
 print_symbol_name
 
 strlen
 
 next_byte
 
 _start
 
 __bss_start
 
 _edata
 
 _end
 
 Segmentation fault
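
For comparison, the whole walk can be sketched in Python over an ELF64 image held in a bytes object. This is an illustration under stated assumptions, not a translation of the assembly: the name symbol_names is hypothetical, the string table is located through the symbol table header's link field (0x28) rather than by assuming it is the next header, and the loop stops at the section size instead of reading past the end of the table, which is presumably why the run above ends in a segmentation fault.

```python
import struct


def symbol_names(image):
    """Return the non-empty symbol names found in an ELF64 image."""
    (shoff,) = struct.unpack_from("<Q", image, 0x28)
    shentsize, shnum = struct.unpack_from("<HH", image, 0x3a)

    # Find the first section header whose type is 2 (symtab).
    for index in range(1, shnum):
        header = shoff + index * shentsize
        (sh_type,) = struct.unpack_from("<I", image, header + 0x04)
        if sh_type == 2:
            break
    else:
        return []

    sym_off, sym_size = struct.unpack_from("<QQ", image, header + 0x18)
    # The link field holds the index of the associated string table header.
    (link,) = struct.unpack_from("<I", image, header + 0x28)
    (str_off,) = struct.unpack_from("<Q", image, shoff + link * shentsize + 0x18)

    names = []
    # Skip the null symbol at index 0, then visit each 0x18-byte entry.
    for entry in range(sym_off + 0x18, sym_off + sym_size, 0x18):
        (name_off,) = struct.unpack_from("<I", image, entry)
        if name_off == 0:
            continue  # unnamed symbol, same test as the assembly
        start = str_off + name_off
        end = image.index(b"\x00", start)
        names.append(image[start:end].decode("ascii"))
    return names
```

Reading a real binary would look like symbol_names(open(path, "rb").read()), where path is whatever file is being inspected. A stripped binary has no symtab section, in which case the function returns an empty list.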