Heap Exploitation

Heap exploitation techniques. Still learning. Note that all the information below is in the context of glibc's implementation.

Much of the content below has been taken from multpile sites and we have done our best to acknowledge them wherever we have used their content. The sites are listed below:

Heap

Heap is a memory area used to dynamicly allocate memory using malloc. It is different from stack as the memory stays there until it is freed, unlike the local variables on stack which are removed as soon as the function returns.
Memory on the heap is allocated as "chunks" which are managed by malloc. Chunks can be allocated by a call to malloc(size_t size) and when no longer needed can be free with free(void * chunk_ptr).

How allocation works

Heap chunks look like this:

// INTERNAL_SIZE_T is same as size_t
struct malloc_chunk {
  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk, if it is free. */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;                /* double links -- used only if this chunk is free. */
  struct malloc_chunk* bk;
  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if this chunk is free. */
  struct malloc_chunk* bk_nextsize;
};
typedef struct malloc_chunk* mchunkptr;

This, in memory, basically looks like the following:

+--------------------+--------------------+
|      PREV_SIZE     |    SIZE (A|M|P)    |
+--------------------+--------------------+
|         FD         |         BK         |
+--------------------+--------------------+
|    FD_NEXTSIZE     |    BK_NEXTSIZE     |
+--------------------+--------------------+
|                                         |
|                USER DATA                |
|                                         |
+-----------------------------------------+

The size of chunks is a multiple of 16. The A, M and P represent the last 3 bits of the size, which stand for Allocated Arena, Mmap'd chunk, and Previous chunk in use respectively.

For a free chunk, all fields maybe used, but for a chunk in use, the fd, bk, fd_nextsize and bk_nextsize fields only store user data and don't serve any other function.

The PREV_SIZE field of the chunk may overlap with the user data of the previous chunk, thus if the current chunk is free then the PREV_SIZE field holds the size of the previous chunk, but if the current chunk is in use then it holds user data.

Top chunk: Top chunk or the wilderness is the large chunk which covers the rest of the heap. Other chunks are allocated by splitting the top chunk if not available in any bin (used as a last resort).

Bins

When chunks are freed, they either end up in the tcache, the fastbin, small or large bin or the unsorted bin. They may also be coalesced or consolidated with the top chunk if they are adjacent to it, or with a previous or next chunk if it is free.

There are 5 type of bins: 62 small bins, 63 large bins, 1 unsorted bin, 10 fast bins and 64 tcache bins per thread.

The small, large, and unsorted bins all live together in the same array in the heap manager’s source code. Index 0 is unused, 1 is the unsorted bin, bins 2-64 are small bins and bins 65-127 are large bins.

Source

Fast bins

There are 10 fast bins. Each of these bins maintains a single linked list. Addition and deletion happen from the front of this list (LIFO manner).

Each bin has chunks of the same size. The 10 bins each have chunks of sizes: 16, 24, 32, 40, 48, 56, 64, 72, 80 and 88. Sizes mentioned here include metadata as well.

Fast bin chunks do not coalesce or consolidate together.

Source

Unsorted bin

There is only 1 unsorted bin. Small and large chunks, when freed, end up in this bin. The primary purpose of this bin is to act as a cache layer (kind of) to speed up allocation and deallocation requests. Source

The chunk freed, if it does not go into a fast bin or tcache list, is first put in the unsorted bin. Upon the next call to malloc, if it finds a chunk in the unsorted bin which fits the size requirement, it returns that. Else, it just puts the chunks in their corresponding small or large bins.

Small bins

There are 62 of them, and each small bin stores chunks that are all the same fixed size. Every chunk less than 512 bytes on 32-bit systems (or than 1024 bytes on 64-bit systems) has a corresponding small bin. Since each small bin stores only one size of chunk, they are automatically ordered, so insertion and removal of entries on these lists is incredibly fast. Their size ranges from 16, 24, 32, ..., 504. Source

Large bins

Each of the 63 “large bins” operates mostly the same way as small bins, but instead of storing chunks with a fixed size, they instead store chunks within a size range. These are used for chunks over 512 bytes. Source

Tcache (per-thread cache) bins

Per-thread caching speeds up allocations by having per-thread bins of small chunks ready-to-go. That way, when a thread requests a chunk, if the thread has a chunk available on its tcache, it can service the allocation without ever needing to wait on a heap lock.

By default, each thread has 64 singly-linked tcache bins. Each bin contains a maximum of 7 same-size chunks ranging from 24 to 1032 bytes on 64-bit systems and 12 to 516 bytes on 32-bit systems

When a chunk is freed, the heap manager sees if the chunk will fit into a tcache bin corresponding to the chunk size. Like the fast-bin, chunks on the tcache bin are considered "in use", and won't be merged with neighboring freed chunks.

If the tcache for that chunk size is full (or the chunk is too big for a tcache bin), the heap manager reverts to our old slow-path strategy of obtaining the heap lock and then processing the chunk as before.

Source

Putting it together

Strategy used by malloc:

If the size corresponds with a tcache bin and there is a tcache chunk available, return that immediately.
If the request is enormous allocate a chunk off-heap via mmap. 3/ Otherwise we obtain the arena heap lock and then perform the following strategies, in order:
1. Try the fastbin/smallbin recycling strategy
  - If a corresponding fast bin exists, try and find a chunk from there (and also opportunistically prefill the tcache with entries from the fast bin).
  - Otherwise, if a corresponding small bin exists, allocate from there (opportunistically prefilling the tcache as we go).
2. Resolve all the deferred frees
  - Otherwise “truly free” the entries in the fast-bins and move their consolidated chunks to the unsorted bin.
  - Go through each entry in the unsorted bin. If it is suitable, stop. Otherwise, put the unsorted entry on its corresponding small/large bin as we go (possibly promoting small entries to the tcache as we go).
3. Default back to the basic recycling strategy
  - If the chunk size corresponds with a large bin, search the corresponding large bin now.
4. Create a new chunk from scratch
  - Otherwise, there are no chunks available, so try and get a chunk from the top of the heap.
  - If the top of the heap is not big enough, extend it using sbrk.
  - If the top of the heap can’t be extended because we ran into something else in the address space, create a discontinuous extension using mmap and allocate from there
5. If all else fails, return NULL.

And the corresponding free strategy:

If the pointer is NULL, the C standard defines the behavior as “do nothing”.
Otherwise, convert the pointer back to a chunk by subtracting the size of the chunk metadata.
Perform a few sanity checks on the chunk, and abort if the sanity checks fail.
If the chunk fits into a tcache bin, store it there.
If the chunk has the M bit set, give it back to the operating system via munmap.
Otherwise we obtain the arena heap lock and then:
1. If the chunk fits into a fastbin, put it on the corresponding fastbin, and we’re done.
2. If the chunk is > 64KB, consolidate the fastbins immediately and put the resulting merged chunks on the unsorted bin.
3. Merge the chunk backwards and forwards with neighboring freed chunks in the small, large, and unsorted bins.
4. If the resulting chunk lies at the top of the heap, merge it into the top of the heap rather than storing it in a bin.
5. Otherwise store it in the unsorted bin. (Malloc will later do the work to put entries from the unsorted bin into the small or large bins).

Source