A post about shellcode

Jun 10, 2018

So you may ask yourself, like I did: what is shellcode? The deeper you dig, the more questions you will ask yourself; at least that's how it went for me. I am not a shellcode guru, but I would like to share my knowledge, because when I write or teach about something I understand it better (I think). This is not a giant introduction explaining everything; it's more a journey of digging into shellcode. To answer the first question: shellcode is nothing more than a bunch of binary code like this:

\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05

But what is shellcode really? And what the hell does the above code do?! I'll answer the second question directly: it spawns a shell on a 64-bit Linux system. Now we get a hint: so shellcode is operating system and CPU architecture bound? Yup, that's right, shellcode is operating system and CPU architecture specific. Now what is the code above? The code is binary, represented here as hexadecimal escape sequences. I didn't get it at first. I was like: "this is hex, I'll just parse this with a hex parser", but each \xNN is one raw byte, and many of those bytes have no printable ASCII representation. So what is shellcode really? Shellcode is machine code: binary data that a CPU can read directly. In the end it's just 0s and 1s, and when you feed a CPU this data, it reads it and executes whatever it says. I don't believe you, show me the code!

Ok here it is:

#include <stdio.h>

char *code = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";

int main () {

    int (*ret)() = (int(*)()) code;
    ret();

    return 0;

}

Compile it with gcc and run it:

$ gcc shell.c -o shell
$ ./shell

You will need to execute it on a 64-bit Linux system. If executed you will see it spawned /bin/sh.

How the line int (*ret)() = (int(*)()) code; really works took me a while to figure out. It declares ret as a pointer to a function returning int and makes it point at the address of the character array "code". I want to say execute here, but that line does not execute the shellcode; it only stores the address. The call ret(); on the next line is what makes the CPU jump to that address and run the bytes as instructions. But now we go deeper into the shellcode. Let's analyze that bunch of binary and see if there is something readable in it. The printable, non-space ASCII range goes from 21 to 7E hex, or 33 to 126 decimal, see man ascii:

#include <stdio.h>
#include <string.h>

char * code = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";

int main () {

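    size_t len = strlen(code); /* fine here: the shellcode contains no 0x00 bytes */

    for (size_t i = 0; i < len; i++) {

        /* compare as unsigned so bytes >= 0x80 are not treated as negative */
        unsigned char c = (unsigned char) code[i];

        /* printable, non-space ASCII: 0x21 ('!') to 0x7e ('~') */
        if (c >= 0x21 && c <= 0x7e) {

            printf("%c", c);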

        }

    }

    printf("\n");

    return 0;

}

The output of the previous is:

RH/bin//shSH<$;

So we see something like "/bin//sh". Interesting, but this is not the right way to go. A better approach is to look at the assembly code, which is how I wrote the shellcode in the first place.

Disassembly of section .text:

0000000000400080 <_start>:
  400080:   f7 e6                   mul    esi
  400082:   52                      push   rdx
  400083:   48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f2f6e69622f
  40008a:   2f 73 68 
  40008d:   53                      push   rbx
  40008e:   48 8d 3c 24             lea    rdi,[rsp]
  400092:   b0 3b                   mov    al,0x3b
  400094:   0f 05                   syscall 

What the shit is "0x68732f2f6e69622f"?? Yeah, that was my first reaction too. It is actually super straightforward: remember "/bin//sh" from 10 seconds ago? Well, that is that:

                    | Let's take a look at this piece |
\xf7\xe6\x52\x48\xbb \x2f\x62\x69\x6e\x2f\x2f\x73\x68 \x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05

\x2f \x62 \x69 \x6e \x2f \x2f \x73 \x68

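Now there is something called little endian, which I always forget about. On x86-64 the least significant byte of a value is stored first in memory, so to read the 64-bit value you need to reverse the order of its 8 bytes:

\x68 \x73 \x2f \x2f \x6e \x69 \x62 \x2f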

0x68732f2f6e69622f

hs//nib/

which if we reverse is /bin//sh

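But why /bin//sh and not /bin/sh? "/bin/sh" is only 7 bytes, and the movabs instruction loads exactly 8 bytes into rbx. Padding with a zero byte would put a 0x00 in the middle of the shellcode, which breaks it as soon as it is handled as a C string (strlen in the program above would stop there, for example). The second "/" fills the gap instead, and Linux treats repeated slashes in a path as a single one. There is another magic value in the assembly: "0x3b", which is hex for 59. That is not an opcode but the number of the Linux system call execve on x86-64; see https://filippo.io/linux-syscall-table/ for a complete list of syscall numbers.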

Better commented assembly code:

Disassembly of section .text:

0000000000400080 <_start>:
  400080:   f7 e6                   mul    esi ; edx:eax = eax * esi, which zeroes rax and rdx as long as eax or esi is 0
  400082:   52                      push   rdx ; push rdx (0) to the stack, this becomes the null terminator of the string
  400083:   48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f2f6e69622f ; "/bin//sh" as a little-endian 64-bit value
  40008a:   2f 73 68 
  40008d:   53                      push   rbx ; push the string "/bin//sh" to the stack

; load the address in the stack pointer, which now points to "/bin//sh", into rdi (first argument of execve)
  40008e:   48 8d 3c 24             lea    rdi,[rsp]

; load the value 59 into register al (the low byte of rax), rax holds the system call number
  400092:   b0 3b                   mov    al,0x3b 

; execute the system call selected by rax: execve
  400094:   0f 05                   syscall

Probably some people will ask: "But how can I hack with this?". Well there are 2 ways that I know of:

1