Base Knowledge
Last updated
Last updated
Learn it from here. It’s an excellent course.
Excellent article about the structure of the heap:
At the highest level, there are arenas, which are collections of memory. The purpose of arenas is to have separate memory spaces for different threads. That means that different threads can access their own memory spaces, and don’t have to share the same space. Doing it this way is faster, because otherwise the shared memory would have to be locked and freed when used by each thread.
There is a maximum limit on the amount of arenas, though.
For 32 bit systems:
Number of arenas = 2 * number of cores.
For 64 bit systems:
Number of arenas = 8 * number of cores.
When the limit for the amount of arenas is reached, then arenas start to become shared between threads. Once again, threads will have to lock and free their arenas when they want to use them. But at least every thread doesn’t have to completely lock the whole memory space, just one arena.
As a side note, once a thread chooses an arena for the first time, then that thread will keep using that same arena in the future as well afaik.
Note: There are some differences between the main arena and a thread’s arena. Read the article to find out more.
Heaps are a part of arenas. To begin with every arena contains only one heap. However, when this heap segment runs out of space, a new heap (non contiguous region) gets mmap’d to this arena.
Note: Heaps (as well as arenas) have headers. And there are differences between a main arena’s heap and a thread arena’s heap. Read the article to find out more.
Heaps are divided into chunks.
A chunk found inside a heap segment can be one of the below types:
Allocated chunk
Free chunk
Top chunk
Last Remainder chunk
Allocated Chunk
A chunk consists of two parts:
An chunk header (8 bytes for an allocated chunk)
User data - data which the program wants to store on the heap.
The chunk header in turn consists of two fields:
prev_size: If the previous chunk is free, this field contains the size of the previous chunk. Else if the previous chunk is allocated, this field contains the previous chunk’s user data.
When testing this out myself, then after freeing the previous chunk, this field did not contain the size of the previous chunk, and was 0x0. I think either it gets set on malloc() (and not on free()), or there’s some condition that has to be met. Probably it gets set on malloc() though.
size: This field contains the size of this allocated chunk. The last 3 bits of this field contain flag information.
Flags:
PREV_INUSE (P) – This bit is set when the previous chunk is allocated.
IS_MMAPPED (M) – This bit is set when the chunk is mmap’d.
NON_MAIN_ARENA (N) – This bit is set when this chunk belongs to a thread arena.
Example:
The program wants to allocate 8 bytes using malloc()
It allocates 8 bytes for the user, plus 8 bytes for the chunk header. Let’s say the previous chunk is also allocated. Therefore, the value of the size chunk header is 0x8 + 0x8 + 0x1 = 0x11.
Note: The amount of bytes allocated to user data is always padded to a multiple of a number (multiple of 8, for example). Therefore, the size of the chunk is always a multiple of that number. This means you can simply ignore the flag bits if you want to know the size of the chunk.
Note: Other fields of malloc_chunk (like fd, bk) are not used for allocated chunks. Hence in place of these fields, user data is stored.
Free Chunk
Fields:
prev_size: No two free chunks can be adjacent together. When both the chunks are free, it gets combined into one single free chunk. Hence always the previous chunk to this freed chunk would be allocated and therefore prev_size contains previous chunk’s user data.
size: This field contains the size of this free chunk.
fd: Forward pointer – Points to the next chunk in the same bin (and NOT to the next chunk present in physical memory). To read about what bins are, read the Bin chapter.
bk: Backward pointer – Points to previous chunk in the same bin (and NOT to the previous chunk present in physical memory). Will not be set with fastbins.
Top Chunk
The top chunk is at the top of the heap, after all the allocated regions. It’s unallocated memory which doesn’t belong to any bin. The chunk header has a size field, which contains the length of the chunk.
When no free blocks are available, then memory is taken from this chunk, and the top chunk grows smaller.
Last Remainder Chunk
Not that important to know. Read the article to find out.
Bins are the free list data structures (basically a linked list). They are used to hold free chunks.
Based on chunk sizes, different bins are available:
Fast bin
Unsorted bin
Small bin
Large bin
There are also certain data structures used to hold these bins:
fastbinsY: This array holds fast bins.
bins: This array holds unsorted, small and large bins. In total there are 126 bins and it's divided as follows:
Bin 1 – Unsorted bin
Bin 2 to Bin 63 – Small bin
Bin 64 to Bin 126 – Large bin
Read the article if you want to find out how all these bins work and how they are different.
LiveOverflow has videos on this subject:
https://www.mathyvanhoef.com/2013/02/understanding-heap-exploiting-heap.html?m=1
https://ctf-wiki.github.io/ctf-wiki/pwn/linux/glibc-heap/heap_structure/
https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/
Note: this applies to x86.
Malloc will allocate space on the heap.
If you do malloc(8), then this is what happens according to the most common implementation (dlmalloc):
Malloc will allocate 8 bytes on the heap. It will return the address of the start of the 8 bytes to EAX.
Malloc does this using brk or mmap syscalls.
Malloc gets more space than it actually needs, so it can use that space during subsequent mallocs, and does not need to make a separate syscall each time.
Note that most allocators will round up the total size and/or the start of your part of the memory to a multiple of bytes (e.g. on a 64-bit system it may align the data to a multiple of 64 bits (8 bytes) as accessing data from non-aligned addresses can be more difficult and inefficient for the processor/bus), so you may also end up with some "padding" (unused bytes). Also probably this is useful for the flags (read more about the flags at Allocated chunk).
Malloc needs to know how many bytes have been allocated. , malloc will store this information before the allocated bytes in a “chunk header”.
The “chunk header” is 8 bytes long and consists of two 4-byte segments.
The first part of the chunk header, prev_size, is 4 bytes.
If the previous chunk is free, then it’s set to the size of the previous chunk. (afaik)
Otherwise it contains user data from the previous chunk. User data is whatever data the programmer wanted to store on the heap.
The second part of the chunk header stores two things (actually more, look at the Heap structure paragraph):
Whether the previous chunk is used, and not free. If the previous chunk is used, then the lowest bit of the chunk is 1. This is necessary for the free() algorithm. In the case of the first malloc, where there is no previous chunk, the lowest bit is also 1.
The size of the space we are taking up on the heap. We allocated 8 bytes and the chunk is 4 + 4 bytes. So therefore we are taking up 0x8 + 0x4 + 0x4 = 0x10 bytes
Putting the two factors together, the value of the chunk for a malloc(8) would probably be 0x11 (0x10 with the lowest bit being 1 is 0x11).
There is also a pointer which stores the location of the latest free memory. It will get updated as you use malloc() and free(). Kind of like the stack pointer, but for the heap.
Note that after the last malloc, there is one extra chunk. That chunk stores the length of the remaining free heap memory (aka the “wilderness”). Picture for reference:
Note: The above is a simplified explanation. Today’s implementation is based on dlmalloc, but isn’t really dlmalloc anymore. It’s referred to as ptmalloc now.
https://www.youtube.com/watch?v=ZHghwsTRyzQ
When you free a malloc’ed area, then:
The first word of the allocated area is replaced with the address of the previous free chunk if there is one available.
This is because the free chunks are in a linked list.
If there is no other free chunk in the list, then the value is 0x0. This situation occurs during the very first free(), for example.
The chunk header remains the same in the default case.
Note that:
In a fastbin, the next chunk’s PREV_INUSE bit doesn’t get updated, because it’s not necessary for the implementation. Also it’s single-linked, instead of double-linked (it doesn’t have both a forward and backward pointer, only the forward pointer, fd) The reason is that they are not consolidated (combined) with other free chunks like normal chunks. Instead, malloc_consolidate releases all free chunks in fastbins and consolidates them with other free chunks.
With larger chunks, the algorithm does a bit more housekeeping and cleans everything up nicer.