Hacking function main
March 30, 2017
I found this post in which a guy, called James Rowe, successfully compiled a program in which he defined a variable called main
and made it work as if there was an actually int main(){...}
. So, before I finished to read the whole post I decided to try it by myself.
Last month I solved a buffer bomb
similar to this one described here. Basically I used a buffer overflow to write directly on specific memory locations.
The buffer overflow and my knowledge in linking, assembly and virtual memory gave me the confidence to try the something similar to James’s post.
The problem
I want to print on my screen holi boli
by using a code in which I only declare and define a variable called main
. Note this code only works on a x64 Linux machine
You can find the programs I coded here.
The result
I ended up with this code:
const char main[] ={
0x55,
0x48, 0x89, 0xe5,
0xb8, 0x01, 0x00, 0x00 , 0x00,
0xbb, 0x01, 0x00, 0x00 , 0x00,
0xbe, 0x81, 0x05, 0x40 , 0x00,
0xba, 0x0a, 0x00, 0x00 , 0x00,
0x0f, 0x05,
0xb8, 0x3c, 0x00, 0x00 , 0x00,
0x0f, 0x05
// <message>
0x68, 0x6f, 0x6c, 0x69, 0x20,
0x62,
0x6f,
0x6c,
0x69, 0x0a, 0xb8, 0x00, 0x00, 0x00,
0x00, 0x5d, 0xc3,
0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00,
0x00
};
Of course the output is holi boli
.
The process
First attempt
The first thing I did was to code what I wanted to achieve.
// baseline.c
#include <stdio.h>
int main(){
printf("holi boli \n");
}
This code is straightforward. I compiled gcc -W -Wall test1.c -o run
and I run objdump -D run
to see what was the assembly code related to main
.
00000000004004f6 <main>:
4004f6: 55 push %rbp
4004f7: 48 89 e5 mov %rsp,%rbp
4004fa: bf 94 05 40 00 mov $0x400594,%edi
4004ff: e8 ec fe ff ff callq 4003f0 <puts@plt>
400504: b8 00 00 00 00 mov $0x0,%eax
400509: 5d pop %rbp
40050a: c3 retq
40050b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
From now on I will only show the memory locations and opcode not the assembly instructions given by objdump -D
, unless it is necessary.
Based on the concept behind buffer overflow, I can use char
arrays to write directly into memory. Thus, I naively did the following:
//test1.c
char main[] = { 0x55,
0x48, 0x89, 0xe5,
0xbf, 0x94, 0x05, 0x40, 0x00,
0xe8, 0xec, 0xfe, 0xff, 0xff,
0xb8, 0x00, 0x00, 0x00, 0x00,
0x5d,
0xc3,
0x0f, 0x1f, 0x44, 0x00, 0x00,
};
I compiled and this happened Illegal instruction (core dumped)
, a segmentation fault. It was kind of expected something similar so I used objdump -D
to see what was under the hood.
Disassembly of section .data:
0000000000601020 <__data_start>:
...
0000000000601028 <__dso_handle>:
...
0000000000601030 <main>:
601030: 55
601031: 48 89 e5
601034: bf 94 05 40 00
601039: e8 ec fe ff ff callq 4003f0 <puts@plt>
60103e: b8 00 00 00 00
601043: 5d
601044: c3
601045: 0f 1f 44 00 00
The key word here is .data
. Yes my friend, main
in the .data
section. For more information about ELF sections I wrote a post about ELF files. But if we wanted char main
to work as if it were “true” code, main
should be in .text
section or .rodata
section.
Second attempt
I added const
before char main
to move from .data
to .rodata
.
//test2.c
const char main[] = { 0x55,
0x48, 0x89, 0xe5,
0xbf, 0x94, 0x05, 0x40, 0x00,
0xe8, 0xec, 0xfe, 0xff, 0xff,
0xb8, 0x00, 0x00, 0x00, 0x00,
0x5d,
0xc3,
0x0f, 0x1f, 0x44, 0x00, 0x00,
};
I run the code and I got Illegal instruction (core dumped)
. Another infamous segmentation fault.
And of course, as you was expected, I used objdump -D
:
Disassembly of section .rodata:
0000000000400530 <_IO_stdin_used>:
400530: 01 00
400532: 02 00
...
0000000000400540 <main>:
400540: 55
400541: 48 89 e5
400544: bf 94 05 40 00
400549: e8 ec fe ff ff
40054e: b8 00 00 00 00
400553: 5d
400554: c3
400555: 0f 1f 44 00 00
Nothing seems to be wrong. So I used baseline.c
to see what could happen, and I noticed this 4004ff: e8 ec fe ff ff callq 4003f0 <puts@plt>
.
Let’s check in which address is the GOT .plt
for printf
(puts@plt
)… Oohhh shit!, obviously the linker didn’t link printf
since there is no reference to that function. Let’s try another approach :)
The GOT or global offset table is used to link functions defined in shared libraries in such a way that instead of loading the whole library, the compiler only uses the function that have references in the code. To learn more you can google it. Maybe I will write a post about it, when it happens I will update this post.
The GOT .plt
for printf
in baseline.c
is as follow:
00000000004003f0 <puts@plt>:
4003f0: ff 25 22 0c 20 00 jmpq *0x200c22(%rip)
; 601018 <_GLOBAL_OFFSET_TABLE_+0x18>
4003f6: 68 00 00 00 00 pushq $0x0
4003fb: e9 e0 ff ff ff jmpq 4003e0 <_init+0x18>
Third attempt
I got stuck here, so I supposed that James had the same problem. I read a little more his post and he used a syscall to write on screen. I read this post, suggested by James, and followed the instructions.
System call used assembly instructions, I wrote an assembly program to see whether a syscall
would help me. I ended up with the code test3.c
. I added comments to make the code easy to understand.
//test3.c
int main(){
__asm__ (
/* 1 is the syscall number to
write on 64bit */
"movl $1, %eax;\n"
/* 1 is stdout and is the
first argument */
"movl $1, %ebx;\n"
/* load the address of string
into the second argument*/
"movl $message, %esi;\n"
/* third argument is the length
of the string to print*/
"movl $10, %edx;\n"
"syscall;\n"
"movl $60,%eax;\n" // call exit
"syscall;\n"
/* Store the "holi boli\n"
inside the main function */
"message: .ascii \"holi boli\\n\";"
);
return 0;
}
I compiled the code and guess what? it worked!!! It printed holi boli
on the screen.
Fourth attempt:
I used objdump -D
to get the opcode:
00000000004004a6 <main>:
4004a6: 55 push %rbp
4004a7: 48 89 e5 mov %rsp,%rbp
4004aa: b8 01 00 00 00 mov $0x1,%eax
4004af: bb 01 00 00 00 mov $0x1,%ebx
4004b4: be c7 04 40 00 mov $0x4004c7,%esi
4004b9: ba 0a 00 00 00 mov $0xa,%edx
4004be: 0f 05 syscall
4004c0: b8 3c 00 00 00 mov $0x3c,%eax
4004c5: 0f 05 syscall
If you don’t notice, the line "movl $message, %esi;\n"
in test3.c
has been replaced by mov $0x4004c7,%esi
. This means that the message holi boli
is somewhere else, in this case the message starts at 0x4004c7
.
00000000004004c7 <message>:
4004c7: 68 6f 6c 69 20
4004cc: 62
4004cd: 6f
4004ce: 6c
4004cf: 69 0a b8 00 00 00
4004d5: 00 5d c3
4004d8: 0f 1f 84 00 00 00 00
4004df: 00
I naively joined both parts as follows to see the new memory address where the message will be allocate.
//test4.c
const char main[] ={
0x55,
0x48, 0x89, 0xe5,
0xb8, 0x01, 0x00, 0x00, 0x00,
0xbb, 0x01, 0x00, 0x00, 0x00,
0xbe, 0xc7, 0x04, 0x40, 0x00,
0xba, 0x0a, 0x00, 0x00, 0x00,
0x0f, 0x05,
0xb8, 0x3c, 0x00, 0x00, 0x00,
0x0f, 0x05
// <message>
0x68, 0x6f, 0x6c, 0x69, 0x20,
0x62,
0x6f,
0x6c,
0x69, 0x0a, 0xb8, 0x00, 0x00, 0x00,
0x00, 0x5d, 0xc3,
0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00,
0x00
};
The output was SI��I��
. This means two things. First, I am doing a syscall
and second, as I expected, the program is reading the wrong section.
Success
With the help of objdump
I get the correct memory address for the message.
0000000000400560 <main>:
400560: 55
400561: 48 89 e5
400564: b8 01 00 00 00
400569: bb 01 00 00 00
40056e: be c7 04 40 00
400573: ba 0a 00 00 00
400578: 0f 05
40057a: b8 3c 00 00 00
; syscall
40057f: 0f 05
; message
400581: 68 6f 6c 69 20
400586: 62
400587: 6f
400588: 6c
400589: 69 0a b8 00 00 00
40058f: 00 5d c3
400592: 0f 1f 84 00 00 00 00
400599: 00
So the new address is 0x400581
. I updated the code.
const char main[] ={
0x55,
0x48, 0x89, 0xe5,
0xb8 , 0x01 , 0x00 , 0x00 , 0x00,
0xbb , 0x01 , 0x00 , 0x00 , 0x00,
// update of the address
0xbe , 0x81 , 0x05 , 0x40 , 0x00,
0xba , 0x0a , 0x00 , 0x00 , 0x00,
0x0f , 0x05 ,
0xb8 , 0x3c , 0x00 , 0x00 , 0x00,
0x0f , 0x05,
// <message>
0x68, 0x6f , 0x6c , 0x69 , 0x20 ,
0x62,
0x6f,
0x6c,
0x69 , 0x0a , 0xb8 , 0x00 , 0x00 , 0x00,
0x00 , 0x5d , 0xc3 ,
0x0f , 0x1f , 0x84 , 0x00 , 0x00 , 0x00, 0x00,
0x00
};
I run the last code and it was a success! Woo-hoo!
Final thoughts
- It was tricky, but the basics will never betray you.
- I could use
gdb
to perform all the disassembles, but the code wasn’t very long soobjdump
andvim
made the magic for me.
Share it!
Comments powered by Talkyard.