Note: In this lab, you will gain firsthand experience with one of the methods commonly used to exploit security weaknesses in operating systems and network servers. Our purpose is to help you learn about the runtime operation of programs and to understand the nature of this form of security weakness so that you can avoid it when you write system code. We do not condone the use of these or any other forms of attack to gain unauthorized access to system resources. There are criminal statutes governing such activities.
A cookie is a string of eight hexadecimal digits that is (with high probability) unique to your team. You can generate your cookie with the makecookie program, giving your team name as the argument. For example:
unix> ./makecookie zhong+white
0x3c585621
In four of your five buffer attacks, your objective will be to make your cookie show up in places where it ordinarily would not.
1 int getbuf()
2 {
3     char buf[12];
4     Gets(buf);
5     return 1;
6 }
The function Gets() is similar to the standard library function gets(): it reads a string from
standard input (terminated by ‘\n’ or end-of-file) and stores it (along with a null terminator) at the
specified destination. In this code, the destination is an array buf having sufficient space for 12
characters.
Neither Gets() nor gets() has any way to determine whether there is enough space at the destination to store the entire string. Instead, they simply copy the entire string, possibly overrunning the bounds of the storage allocated at the destination. If the string typed by the user to getbuf() is no more than 11 characters long (leaving room for the null terminator in the 12-byte array), it is clear that getbuf() will return 1, as shown by the following execution example:
unix> ./bufbomb
Type string: howdy doody
Dud: getbuf returned 0x1
Typically an error occurs if we type a longer string:
unix> ./bufbomb
Type string: This string is much too long
Ouch!: You caused a segmentation fault!
As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed bufbomb so that it does more interesting things. These are called exploit strings.
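Since the stated purpose of the lab is to help you avoid this weakness in your own code, the following minimal sketch shows one way to read a line with an explicit bound. The helper name read_line_bounded is hypothetical and is not part of bufbomb; it relies only on the standard fgets() function, which, unlike gets(), is told the size of the destination.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of bufbomb: read one line into dst without
 * ever writing more than cap bytes (including the null terminator).
 * Returns the number of characters stored, or -1 on end-of-file or error. */
int read_line_bounded(char *dst, size_t cap, FILE *in)
{
    assert(dst != NULL && cap > 0);
    if (fgets(dst, (int)cap, in) == NULL)
        return -1;
    size_t len = strlen(dst);
    if (len > 0 && dst[len - 1] == '\n') {
        dst[--len] = '\0';              /* drop the newline, as gets() does */
    } else {
        /* The line did not fit: discard the rest instead of overrunning dst. */
        int c;
        while ((c = fgetc(in)) != '\n' && c != EOF)
            ;
    }
    return (int)len;
}
```

With a 12-byte destination, this sketch stores at most 11 characters of any input line and silently drops the remainder, so an overlong line is truncated rather than written past the end of the array.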
bufbomb takes several different command line arguments:
If you generate a hex-formatted exploit string in the file exploit.txt, you can apply the raw string to bufbomb in several different ways:
unix> cat exploit.txt | ./sendstring | ./bufbomb -t zhong+white
unix> ./sendstring < exploit.txt > exploit-raw.txt
unix> ./bufbomb -t zhong+white < exploit-raw.txt
This approach can also be used when running bufbomb from within GDB:
unix> gdb bufbomb
(gdb) run -t zhong+white < exploit-raw.txt
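The source of sendstring is not shown in this handout. As a rough illustration of what converting a hex-formatted string to a raw string involves, here is a sketch that assumes the input consists of two-digit hex values separated by whitespace. The function names hex_digit_value and hex_to_raw are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
int hex_digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert text such as "68 ef cd ab 89" into raw bytes (assumed format:
 * pairs of hex digits separated by whitespace).
 * Returns the number of bytes written, or -1 on malformed input or if
 * more than cap bytes would be needed. */
long hex_to_raw(const char *text, unsigned char *out, size_t cap)
{
    assert(text != NULL && out != NULL);
    size_t n = 0;
    while (*text != '\0') {
        if (isspace((unsigned char)*text)) { text++; continue; }
        int hi = hex_digit_value((unsigned char)text[0]);
        int lo = (hi < 0) ? -1 : hex_digit_value((unsigned char)text[1]);
        if (hi < 0 || lo < 0 || n == cap)
            return -1;
        out[n++] = (unsigned char)(hi * 16 + lo);
        text += 2;
    }
    return (long)n;
}
```

The raw bytes are what bufbomb actually reads; the hex-formatted file is only a convenient way to write byte values that cannot be typed on a keyboard.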
When you correctly solve one of the levels, run the bomb with the “-s” option on cs367.vsnet.gmu.edu. This will automatically send an email notification to our grading server. The server will test your exploit string to make sure it really works, and it will update the lab web page indicating that your team (listed by cookie) has completed this level. Unlike the bomb lab, there is no penalty for making mistakes in this lab. Feel free to fire away at bufbomb with any string you like.
1 void test()
2 {
3     int val;
4     volatile int local = 0xdeadbeef;
5     entry_check(3); /* Make sure entered this function properly */
6     val = getbuf();
7     /* Check for corrupted stack */
8     if (local != 0xdeadbeef) {
9         printf("Sabotaged!: the stack has been corrupted\n");
10     }
11     else if (val == cookie) {
12         printf("Boom!: getbuf returned 0x%x\n", val);
13         validate(3);
14     }
15     else {
16         printf("Dud: getbuf returned 0x%x\n", val);
17     }
18 }
When getbuf() executes its return statement (line 5 of getbuf()), the program ordinarily resumes
execution within function test() (at line 8 of this function). Within the file bufbomb, there is a
function smoke() having the following C code:
void smoke()
{
    entry_check(0); /* Make sure entered this function properly */
    printf("Smoke!: You called smoke()\n");
    validate(0);
    exit(0);
}
Your task is to get bufbomb to execute the code for smoke() when getbuf() executes its return
statement, rather than returning to test(). You can do this by supplying an exploit string that
overwrites the stored return pointer in the stack frame for getbuf() with the address of the first
instruction in smoke(). Note that your exploit string may also corrupt other parts of the stack state,
but this will not cause a problem, since smoke() causes the program to exit directly.
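The general shape of such a string can be sketched as filler bytes followed by the new return address. In the sketch below, the helper name build_return_overwrite is hypothetical, and both pad_len (the distance from the start of buf to the stored return pointer) and target are assumptions that you would have to determine yourself from the disassembly of bufbomb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout helper. pad_len is the assumed distance in bytes from
 * the start of buf to the stored return pointer; target is the address to
 * return to. Neither value is given here.
 * Returns the total length written, or 0 if out is too small. */
size_t build_return_overwrite(unsigned char *out, size_t cap,
                              size_t pad_len, uint32_t target)
{
    assert(out != NULL);
    if (cap < 4 || pad_len > cap - 4)
        return 0;
    memset(out, 0x41, pad_len);          /* filler; any byte except '\n' */
    for (int i = 0; i < 4; i++)          /* little endian: low byte first */
        out[pad_len + i] = (unsigned char)(target >> (8 * i));
    return pad_len + 4;
}
```

Because Gets() stops reading at '\n', a string built this way is cut short if any of its bytes, including a byte of the address, has the value 0x0a.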
Some Advice:
void fizz(int val)
{
    entry_check(1); /* Make sure entered this function properly */
    if (val == cookie) {
        printf("Fizz!: You called fizz(0x%x)\n", val);
        validate(1);
    }
    else
        printf("Misfire: You called fizz(0x%x)\n", val);
    exit(0);
}
Similar to Level 0, your task is to get bufbomb to execute the code for fizz() rather than returning to
test. In this case, however, you must make it appear to fizz() as if you have passed your cookie as its
argument. You can do this by encoding your cookie in the appropriate place within your exploit
string.
Note that the program won’t really call fizz(); it will simply execute its code. This has important implications for where on the stack you want to place your cookie.
Within the file bufbomb there is a function bang() having the following C code:
int global_value = 0;
void bang(int val)
{
    entry_check(2); /* Make sure entered this function properly */
    if (global_value == cookie) {
        printf("Bang!: You set global_value to 0x%x\n", global_value);
        validate(2);
    }
    else
        printf("Misfire: global_value = 0x%x\n", global_value);
    exit(0);
}
Similar to Levels 0 and 1, your task is to get bufbomb to execute the code for bang() rather than
returning to test. Before this, however, you must set global variable global_value to your team’s
cookie. Your exploit code should set global_value, push the address of bang() on the stack, and
then execute a return instruction to cause a jump to the code for bang().
Some Advice:
The most sophisticated form of buffer overflow attack causes the program to execute some exploit code that patches up the stack and makes the program return to the original calling function (test() in this case). The calling function is oblivious to the attack. This style of attack is tricky, though, since you must: 1) get machine code onto the stack, 2) set the return pointer to the start of this code, and 3) undo the corruptions made to the stack state.
Your job for this level is to supply an exploit string that will cause getbuf() to return your cookie back to test, rather than the value 1. You can see in the code for test that this will cause the program to go “Boom!.” Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to test.
Some Advice:
The next level is for those who want to push themselves beyond our baseline expectations for the course, and who want to face a challenge in designing buffer overflow attacks that arises in real life. This part of the assignment counts for only 25 points, even though it requires a fair amount of work, so don’t do it just for the points.
From one run to another, especially by different users, the exact stack positions used by a given procedure will vary. One reason for this variation is that the values of all environment variables are placed near the base of the stack when a program starts executing. Environment variables are stored as strings, requiring different amounts of storage depending on their values. Thus, the stack space allocated for a given user depends on the settings of his or her environment variables. Stack positions also differ when running a program under gdb, since gdb uses stack space for some of its own state.
In the code that calls getbuf(), we have incorporated features that stabilize the stack, so that the position of getbuf()’s stack frame will be consistent between runs. This made it possible for you to write an exploit string knowing the exact starting address of buf and the exact saved value of %ebp.
If you tried to use such an exploit on a normal program, you would find that it works sometimes but causes segmentation faults at other times. Hence the name “dynamite”: an explosive developed by Alfred Nobel that contains stabilizing elements to make it less prone to unexpected explosions.
For this level, we have gone the opposite direction, making the stack positions even less stable than they normally are. Hence the name “nitroglycerin”—an explosive that is notoriously unstable. When you run bufbomb with the command line flag “-n,” it will run in “Nitro” mode. Rather than calling the function getbuf(), the program calls a slightly different function getbufn():
int getbufn()
{
    char buf[512];
    Gets(buf);
    return 1;
}
This function is similar to getbuf(), except that it has a buffer of 512 characters. You will need this
additional space to create a reliable exploit. The code that calls getbufn() first allocates a random
amount of storage on the stack (using library function alloca()) that ranges between 0 and 127
bytes. Thus, if you were to sample the value of %ebp during two successive executions of
getbufn(), you would find they differ by as much as ±127.
In addition, when run in Nitro mode, bufbomb requires you to supply your string 5 times, and it will execute getbufn() 5 times, each with a different stack offset. Your exploit string must make it return your cookie each of these times.
Your task is identical to the task for the Dynamite level. Once again, your job for this level is to supply an exploit string that will cause getbufn() to return your cookie back to test, rather than the value 1. You can see in the code for test that this will cause the program to go “KABOOM!.” Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to testn().
Some Advice:
unix> cat exploit.txt | ./sendstring -n 5 | ./bufbomb -n -t bovik
You must use the same string for all 5 executions of getbufn(). Otherwise it will fail the testing code used by our grading server.
The URL for the lab web page is:
# Example of hand-generated assembly code
    pushl $0x89abcdef     # Push value onto stack
    addl $17,%eax         # Add 17 to %eax
    .align 4              # Following will be aligned on multiple of 4
    .long 0xfedcba98      # A 4-byte constant
    .long 0x00000000      # Padding
The code can contain a mixture of instructions and data. Anything to the right of a ‘#’ character is a comment. We have added an extra word of all 0s to work around a shortcoming in objdump to be described shortly. We can now assemble and disassemble this file:
unix> gcc -c example.s
unix> objdump -d example.o > example.d
The generated file example.d contains the following lines:
   0: 68 ef cd ab 89    push $0x89abcdef
   5: 83 c0 11          add $0x11,%eax
   8: 98                cwtl                 Objdump tries to interpret
   9: ba dc fe 00 00    mov $0xfedc,%edx     these as instructions
Each line shows a single instruction. The number on the left indicates the starting address (starting with 0), while the hex digits after the ‘:’ character indicate the byte codes for the instruction. Thus, we can see that the instruction pushl $0x89ABCDEF has hex-formatted byte code 68 ef cd ab 89.
Starting at address 8, the disassembler gets confused. It tries to interpret the bytes in the file example.o as instructions, but these bytes actually correspond to data. Note, however, that if we read off the 4 bytes starting at address 8 we get: 98 ba dc fe. This is a byte-reversed version of the data word 0xFEDCBA98. This byte reversal represents the proper way to supply the bytes as a string, since a little endian machine lists the least significant byte first. Note also that objdump generated only two of the four bytes at the end with value 00. Had we not added this padding, objdump would have become even more confused and would not have emitted all of the bytes we want.
Finally, we can read off the byte sequence for our code (omitting the final 0’s) as:
68 ef cd ab 89 83 c0 11 98 ba dc fe
int main() {
    asm (
        "mov %ebx, %ebx\n\t" /* statement 1 */
        "mov %ebx, %eax\n\t" /* statement 2 */
        "mov %eax, %edx"     /* statement 3 */
    );
    return 0;
}
Note the “\n\t” at the end of each statement except the last.