Misplaced Pages

Address space layout randomization

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

This is an old revision of this page, as edited by ArchiSchmedes (talk | contribs) at 09:27, 22 July 2008 (Effectiveness: added reference to 'Information entropy'). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Revision as of 09:27, 22 July 2008 by ArchiSchmedes (talk | contribs) (Effectiveness: added reference to 'Information entropy')(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

Address space layout randomization (ASLR) is a computer security technique which involves randomly arranging the positions of key data areas, usually including the base of the executable and position of libraries, heap, and stack, in a process's address space.

Benefits

Address space randomization hinders some types of security attack by preventing an attacker being able to easily predict target addresses. For example attackers trying to execute return-to-libc attacks must locate the code to be executed; while other attackers trying to execute shellcode injected on the stack have to first find the stack. In both cases, the related memory addresses are obscured from the attackers; these values have to be guessed, and a mistaken guess is not usually recoverable due to the application crashing.

Effectiveness

Address space layout randomization relies on the low chance of an attacker guessing where randomly placed areas are located; security is increased by increasing the search space. Thus, address space randomization is more effective when more entropy is present in the random offsets. Entropy is increased by either raising the amount of virtual memory area space the randomization occurs over, or reducing the period the randomization occurs over; the period is typically implemented as small as possible, so most systems must increase VMA space randomization.

To defeat the randomization, an attacker must successfully guess the positions of all areas he is attacking. For data areas such as stack and heap, where custom code or useful data can be loaded, more than one state can be attacked by using NOP slides for code or repeated copies of data; this allows an attack to succeed if the area is randomized to one of a handful of values. In contrast, code areas such as library base and main executable need to be discovered exactly. Often these areas are mixed, for example stack frames are injected onto the stack and a library is returned into.

To begin, let us declare the following variables:

E s = entropy bits of stack top {\displaystyle E_{s}={\mbox{entropy bits of stack top}}\,}
E m = entropy bits of mmap() base {\displaystyle E_{m}={\mbox{entropy bits of mmap() base}}\,}
E x = entropy bits of main executable base {\displaystyle E_{x}={\mbox{entropy bits of main executable base}}\,}
E h = entropy bits of heap base {\displaystyle E_{h}={\mbox{entropy bits of heap base}}\,}
A s = attacked bits per attempt of stack entropy {\displaystyle A_{s}={\mbox{attacked bits per attempt of stack entropy}}\,}
A m = attacked bits per attempt of mmap() base entropy {\displaystyle A_{m}={\mbox{attacked bits per attempt of mmap() base entropy}}\,}
A x = attacked bits per attempt of main executable entropy {\displaystyle A_{x}={\mbox{attacked bits per attempt of main executable entropy}}\,}
A h = attacked bits per attempt of heap base entropy {\displaystyle A_{h}={\mbox{attacked bits per attempt of heap base entropy}}\,}
α = attempts made {\displaystyle \alpha ={\mbox{attempts made}}\,}
N = E s A s + E m A m + E x A x + E h A h {\displaystyle N=E_{s}-A_{s}+E_{m}-A_{m}+E_{x}-A_{x}+E_{h}-A_{h}\,}

To calculate the probability of an attacker succeeding, we have to assume a number of attempts α {\displaystyle \alpha } are to be carried out without being interrupted by a signature-based IPS, law enforcement, or other factor; in the case of brute forcing, the daemon cannot be restarted. We also have to figure out how many bits are relevant and how many are being attacked in each attempt, leaving however many bits the attacker has to defeat.

The following formulas represent the probability of success for a given set of α {\displaystyle \alpha \,} attempts on N {\displaystyle N} bits of entropy.

g ( α ) = isolated guessing; address space is re-randomized after each attempt {\displaystyle g\left(\alpha \,\right)={\mbox{isolated guessing; address space is re-randomized after each attempt}}\,}
g ( α ) = 1 ( 1 2 N ) α : 0 α {\displaystyle g\left(\alpha \,\right)=1-{\left(1-{2^{-N}}\right)^{\alpha }\,}:0\leq \,\alpha \,}
b ( α ) = systematic brute forcing on copies of the program with the same address space {\displaystyle b\left(\alpha \,\right)={\mbox{systematic brute forcing on copies of the program with the same address space}}}
b ( α ) = α 2 N : 0 α 2 N {\displaystyle b\left(\alpha \,\right)={\frac {\alpha \,}{2^{N}}}:0\leq \,\alpha \,\leq \,{2^{N}}}

In many systems, 2 N {\displaystyle 2^{N}} can be in the thousands or millions; on modern 64-bit systems, these numbers typically reach the millions at least. For 32-bit systems at 2004 computer speeds which have 16 bits for address randomisation, Shacham and co workers state "... 16 bits of address randomization can be defeated by a brute force attack within minutes." It should be noted that the authors' statement depends on the ability to attack the same application multiple times without any delay. Proper implementations of ASLR, like that included in grsecurity, provide several methods to make such brute force attacks infeasible. One method involves preventing an executable from executing for a configurable amount of time if it has crashed a certain number of times.

Some systems implement Library Load Order Randomization, a form of ASLR where the order in which libraries are loaded is randomized. This supplies very little entropy. An approximation of the number of bits of entropy supplied per needed library is shown below; this does not yet account for varied library sizes, so the actual entropy gained is really somewhat higher. Note that attackers usually need only one library; the math is more complex with multiple libraries, and shown below as well. Note that the case of an attacker using only one library is a simplification of the more complex formula for l = 1 {\displaystyle l=1} .

l = number of libraries loaded {\displaystyle l={\mbox{number of libraries loaded}}}
β = number of libraries used by the attacker {\displaystyle \beta \,={\mbox{number of libraries used by the attacker}}}
E m = log 2 ( n ) : β = 1 , l 1 {\displaystyle E_{m}=\log _{2}\left(n\right):\beta \,=1,l\geq \,1}
E m = i = l l ( β 1 ) log 2 ( i ) : β 1 , l 1 {\displaystyle E_{m}=\sum _{i=l}^{l-\left(\beta \,-1\right)}\log _{2}\left(i\right):\beta \,\geq \,1,l\geq \,1}

These values tend to be low even for large values of l {\displaystyle l} , most importantly since attackers typically can use only the C standard library and thus it can often be assumed β = 1 {\displaystyle \beta \,=1} . Interestingly, however, even for a small number of libraries there are a few bits of entropy gained here; it is thus potentially interesting to combine library load order randomization with VMA address randomization to gain a few extra bits of entropy. Note that these extra bits of entropy will not apply to other mmap() segments, only libraries.

Reducing entropy

There are several ways for an attacker to reduce the entropy present in a randomized address space, ranging from simple information leaks to attacking multiple bits of entropy per attack. There is little that can be done about this.

It is possible to leak information about memory layout using format string vulnerabilities. Format string functions such as printf() use a variable argument list to do their job; format specifiers describe what the argument list looks like. Because of the way arguments are typically passed, each format specifier moves closer to the top of the stack frame. Eventually, the return pointer and stack frame pointer can be extracted, revealing the address of a vulnerable library and the address of a known stack frame; this can completely eliminate library and stack randomization as an obstacle to an attacker.

It is also possible to decrease entropy in the stack or heap. The stack typically must be aligned to 16 bytes, and so this is the smallest possible randomization interval; while the heap must be page-aligned, typically 4096 bytes. When attempting an attack, it is possible to align duplicate attacks with these intervals; a NOP slide may be used with shellcode injection, and the string '/bin/sh' can be replaced with '////////bin/sh' for an arbitrary number of slashes when attempting to return to system(). The number of bits removed is exactly l o g 2 ( n ) {\displaystyle log_{2}\left(n\right)} for n {\displaystyle n} intervals attacked.

Such decreases are limited due to the amount of data that can be stuffed in the stack or heap. The stack, for example, is typically limited to 8MiB and grows to much less; this allows for at most 19 bits, although a more conservative estimate would be around 8-10 bits corresponding to 4-16KiB of stack stuffing. The heap on the other hand is limited by the behavior of the memory allocator; in the case of glibc, allocations above 128KiB are created using mmap(), limiting attackers to 5 bits of reduction. This is also a limiting factor when brute forcing; although the number of attacks to perform can be reduced, the size of the attacks is increased enough that the behavior could in some circumstances become anomalous to intrusion detection systems.

History

The first design and implementation (and indeed the coining of the term ASLR) was made public in July, 2001 by the PaX project. It remains the most complete implementation, providing also kernel stack randomization from October 2002 onward. It also continues to provide the most entropy for each randomized layout compared to other implementations.

Implementations

Several mainstream, general purpose operating systems implement ASLR.

In Linux, a weak form of ASLR has been enabled by default since kernel version 2.6.12. The PaX and ExecShield patchsets to the Linux kernel provide more complete implementations. Various Linux distributions including Adamantix, Hardened Gentoo, and Hardened Linux From Scratch come with PaX's implementation of ASLR by default.

The Exec Shield patch supplies 19 bits of stack entropy on a period of 16 bytes; and 8 bits of mmap() base randomization on a period of 1 page of 4096 bytes. This places the stack base in an area 8MiB wide containing 524288 possible positions; and the mmap() base in an area 1MiB wide containing 256 possible positions.

The prelink tool implements randomization at prelink time rather than runtime, due to a deficiency of the design of prelink. The goal of prelink is to handle relocating libraries before the dynamic linker has to, which allows the relocation to occur once for many runs of the program. Because of this, real address space randomization would defeat the purpose of prelinking.

Microsoft's Windows Vista and Windows Server 2008 have ASLR enabled by default, although only for executables which are specifically linked to be ASLR enabled.

OpenBSD also supports ASLR.

Apple introduced randomization of some library offsets in Mac OS X v10.5, presumably as a stepping stone to fully implementing ASLR at a later date. Their implementation does not provide complete protection against attacks which ASLR is designed to defeat.

References

  1. On the Effectiveness of Address-Space Randomization,Shacham, H. and Page, M. and Pfaff, B. and Goh, E.J. and Modadugu, N. and Boneh, D,Proceedings of the 11th ACM conference on Computer and communications security,pp 298--307, 2004
  2. Comparison of PaX to ExecShield and W^X
  3. See "Library Randomization" at http://www.apple.com/macosx/features/300.html#security
  4. Quick Leopard Update | securosis.com
  5. Matasano Chargen » A Roundup Of Leopard Security Features
  6. Matasano Chargen » What We’ve Since Learned About Leopard Security Features
  7. TippingPoint | DVLabs | New Leopard Security Features - Part I: ASLR

External links

Categories: