Introduction to Heap Internals

In this post we will cover a few essential topics of Heap Memory. We will learn about the heap creation, the structure of heap memory and different exploitation techniques. This post is mainly based on glibc's heap implementation. We recommend reading “Doug Lee malloc implementation’” to have a better understanding of 'dlmalloc()'. To avoid confusion, we have articulated the post in layman's terms instead of professional jargon.

What is a Heap?

Heap is a region of memory allotted to every program at the runtime to allocate variables whose size are not deterministic at compile time. This memory is global which means it can be accessed and modified from anywhere within a program using 'pointers'.

Wait…. why do we even need a Heap when we have ‘Stack’ segment?

Whenever we run a program, each thread will have a stack region where all the local variables are stored. But what if the variables are too large to be stored on stack? This may leave your program unstable with a possibility of crashing so we need another section of writable memory to store large variables. To manipulate Heap, there are two main functions: 'malloc()' and 'free()'. Pointers are used to allocate the memory using these functions.

Observing the above source code, we see that the function allocates 25 bytes of memory dynamically using the function 'malloc()' and then frees the memory at the end of the routine using the function 'free()'. As was previously mentioned, we need a pointer to allocate memory on heap (which is '*buffer' in this case).

Malloc: void* malloc(size_t size) Allocates a reasonable amount of memory, creates a heap segment and returns the caller a pointer to a memory region.As the 'size_t' is an unsigned type, the value of 'size' cannot be a negative value. However, when we pass a negative value to the function, it requests a huge amount of memory which is not possible in most cases. This leads to an undefined behaviorFree: void free(void *ptr) Releases the chunk of memory pointed to by ptr, that had been previously allocated using malloc or a related routine such as realloc(). The most common issues are Double Free, Use-After-Free due to the incorrect usage of this function in the program.

How the Heap works

Before diving into heap internals, we should know the definitions of some significant terms that will be used throughout the post:

start_brk: pointer which denotes the beginning of the heap segment.
brk: pointer that denotes the end of the heap segment.
end_data: pointer which sets the end of the data segment.

Reference:https://elixir.bootlin.com/linux/v3.8/source/include/linux/mm_types.h#L364‍

The system calls brk(), sbrk() and mmap() are used by malloc to request memory from the OS.

<pbrk(): int brk(void *addr) Initializes 'end_data' to the value specified by addr. 'start_brk' to the next available segment after the data segment. And 'brk' = start_brk, initially start (start_brk) and end of heap segment (brk) would point to same location.

In order to expand the dimensions of heap segment, an additional glibc function 'sbrk()' is used.

‍sbrk(): void *sbrk(intptr_t increment)This is a wrapper function around brk() and increments the program break by increment bytes. Also, this function can be used to find the current location of the program break by passing argument '0' to the function.

‍mmap(): void * mmap (void *address, size_t length, int protect, int flags, int filedes, off_t offset) mmap is used by malloc() to create a private anonymous mapping segment. The primary purpose of private anonymous mapping is to allocate new memory (zero filled). The mmap uses the flags 'MAP_ANON' & 'MAP_PRIVATE'.

Now, let's try to understand how glibc manages memory internally. Knowing a few key concepts/data structures will be helpful to perform Heap Exploitation which will be covered in the next article.

Arena: Arena is a contiguous piece of memory. There are two types of arenas and the major difference between the main arena and a dynamically created arena is how they acquire memory from kernel. Only one main arena exists for a process and it calls sbrk() which usually starts immediately after the executable’s data section. On the other hand, a dynamic arena (Non-main arena) uses mmap() to manage the memory fetched from kernel.

‍Chunk: The Chunk is the bottom entity and it is the smallest unit of memory management in dynamic allocation. A chunk divides a large region of memory ('heap') into chunks of various sizes. When a 'free()' function is called, the chunk is marked as free and its reference will be placed in one of the data structures 'fastbinY' and 'bin'. As per the glibc wiki, "freeing" memory does not actually return it to the operating system for other applications to use instead it recycles these chunks to serve future malloc calls.

‍fastbinsY: This is an array of LIFO single linked lists holding recently freed chunks denominated fast chunks (chunks of size 16 - 80 bytes known as fast chunk)

‍bins: This array holds unsorted, small and large bins. The bins holding fast chunks are known as fast bins. Among all the bins, fast bins are faster in memory allocation and deal location Bins are the data structures which are used to hold free chunks of specific sizes depending on index. Based on chunk sizes, different bins are available

Fast bin
Unsorted bin
Small bin
Large bin

Fast bin: There are 10 fast bins. Each of these bins maintains a single linked list. Insertion and deletion happen from the top of this list.

‍Unsorted bin: The primary purpose of this bin is improve the efficiency of allocating and deallocating requests. There is only 1 unsorted bin. On freeing, the chunks end up in this bin.

‍Small bin: Small bins are faster than large bins but slower than fast bins. Each bin maintains a doubly-linked list and there are 62 small bins.

‍Large bin: A particular large bin has chunks of different sizes, sorted in decreasing order There are 63 large bins. like small bin, it maintains a doubly-linked list.

The internals of memory allocation and deallocation varies from one OS to the other. There are few good books to refer for the better understanding of Heap and its implementations. In the next few posts we will learn about Heap Overflows and few exploitation techniques.

‍Reference: https://sourceware.org/glibc/wiki/MallocInternals

Credit: ACE Team - Loginsoft

About Loginsoft

For over 20 years, leading companies in Telecom, Cybersecurity, Healthcare, Banking, New Media, and more have come to rely on Loginsoft as a trusted resource for technology talent. From startups, to product and enterprises rely on our services. Whether Onsite, Offsite, or Offshore, we deliver. With a track record of successful partnerships with leading technology companies globally, and specifically in the past 6 years with Cybersecurity product companies, Loginsoft offers a comprehensive range of security offerings, including Software Supply Chain, Vulnerability Management, Threat Intelligence, Cloud Security, Cybersecurity Platform Integrations, creating content packs for Cloud SIEM, Logs onboarding and more. Our commitment to innovation and expertise has positioned us as a trusted player in the cybersecurity space. Loginsoft continues to provide traditional IT services which include Software development & Support, QA automation, Data Science & AI, etc.

Expertise in Integrations with Threat Intelligence and Security Products: Built more than 250+ integrations with leading TIP, SIEM, SOAR, and Ticketing Platforms such as Cortex XSOAR, Anomali, ThreatQ, Splunk, IBM QRadar & Resilient, Microsoft Azure Sentinel, ServiceNow, Swimlane, Siemplify, MISP, Maltego, Cryptocurrency Digital Exchange Platforms, CISCO, Datadog, Symantec, Carbonblack, F5, Fortinet, and so on. Loginsoft is a partner with industry leading technology vendors Palo Alto, Splunk, Elastic, IBM Security, etc.

In addition, Loginsoft offers Research as a service: We're more than just experts in cybersecurity; we're your accredited in-house research team focused on unraveling the complexities of cybersecurity and future technologies. From Application Security to Threat Research, our seasoned professionals have cultivated expertise in every facet of the field. We've earned the trust of over 20 security platform companies, who count on our research and analysis to strengthen their cybersecurity solutions.

Interested to learn more? Let’s start a conversation.

Introduction to Heap Internals

What is a Heap?

How the Heap works

About Loginsoft

Latest Articles

Ghost Ransomware Playbook: How it Works, What it uses and How to prevent it

IPFIX Data Export: From Research to Successful Log Collection

Why Healthcare sector remains as the top target for cyberattacks

Introduction to Heap Internals

What is a Heap?

How the Heap works

About Loginsoft

Get notified

Latest Articles

Ghost Ransomware Playbook: How it Works, What it uses and How to prevent it

IPFIX Data Export: From Research to Successful Log Collection

Why Healthcare sector remains as the top target for cyberattacks