A Simple Memory Manager | Programming from the Ground Up

Here I will show you a simple memory manager. It is very primitive but it shows the principles quite well. As usual, I will give you the program first for you to look through. Afterwards will follow an in-depth explanation. It looks long, but it is mostly comments.

  #PURPOSE: Program to manage memory usage - allocates  #         and deallocates memory as requested  #  #NOTES:   The programs using these routines will ask  #         for a certain size of memory.  We actually  #         use more than that size, but we put it  #         at the beginning, before the pointer  #         we hand back.  We add a size field and  #         an AVAILABLE/UNAVAILABLE marker.  So, the  #         memory looks like this  #  # #########################################################  # #Available Marker#Size of memory#Actual memory locations#  # #########################################################  #                                  ^--Returned pointer  #                                     points here  #         The pointer we return only points to the actual  #         locations requested to make it easier for the  #         calling program. It also allows us to change our  #         structure without the calling program having to  #         change at all.  .section .data  #######GLOBAL VARIABLES########  #This points to the beginning of the memory we are managing  heap_begin:   .long 0  #This points to one location past the memory we are managing  current_break:   .long 0  ######STRUCTURE INFORMATION####   #size of space for memory region header   .equ HEADER_SIZE, 8   #Location of the "available" flag in the header   .equ HDR_AVAIL_OFFSET, 0   #Location of the size field in the header   .equ HDR_SIZE_OFFSET, 4  ###########CONSTANTS###########   .equ UNAVAILABLE, 0 #This is the number we will use to mark                       #space that has been given out   .equ AVAILABLE, 1   #This is the number we will use to mark                       #space that has been returned, and is                       #available for giving   .equ SYS_BRK, 45    #system call number for the break                       #system call   .equ LINUX_SYSCALL, 0x80 #make system calls easier to read   .section .text ##########FUNCTIONS############  ##allocate_init##  #PURPOSE: call this function to initialize the  #         functions (specifically, this sets heap_begin and  #         current_break).  This has no parameters and no  #         return value.  .globl allocate_init  .type allocate_init, @function allocate_init:  pushl %ebp                   #standard function stuff  movl  %esp, %ebp  #If the brk system call is called with 0 in %ebx, it  #returns the last valid usable address  movl  $SYS_BRK, %eax        #find out where the break is  movl  $0, %ebx  int   $LINUX_SYSCALL  incl  %eax                  #%eax now has the last valid                              #address, and we want the                              #memory location after that  movl  %eax, current_break   #store the current break  movl  %eax, heap_begin      #store the current break as our                              #first address. This will cause                              #the allocate function to get                              #more memory from Linux the                              #first time it is run  movl  %ebp, %esp            #exit the function  popl  %ebp  ret #####END OF FUNCTION#######  ##allocate##  #PURPOSE:    This function is used to grab a section of  #            memory. It checks to see if there are any  #            free blocks, and, if not, it asks Linux  #            for a new one.  #  #PARAMETERS: This function has one parameter - the size  #            of the memory block we want to allocate  #  #RETURN VALUE:  #            This function returns the address of the  #            allocated memory in %eax.  If there is no  #            memory available, it will return 0 in %eax  #  ######PROCESSING########  #Variables used:  #  #   %ecx - hold the size of the requested memory  #          (first/only parameter)  #   %eax - current memory region being examined  #   %ebx - current break position  #   %edx - size of current memory region  #  #We scan through each memory region starting with  #heap_begin. We look at the size of each one, and if  #it has been allocated.  If it's big enough for the  #requested size, and its available, it grabs that one.  #If it does not find a region large enough, it asks  #Linux for more memory.  In that case, it moves  #current_break up  .globl allocate  .type allocate, @function  .equ ST_MEM_SIZE,  8 #stack position of the memory size                       #to allocate allocate:  pushl %ebp                   #standard function stuff  movl  %esp, %ebp  movl  ST_MEM_SIZE(%ebp), %ecx #%ecx will hold the size                 #we are looking for (which is the first                 #and only parameter)  movl  heap_begin, %eax      #%eax will hold the current                              #search location  movl  current_break, %ebx   #%ebx will hold the current                              #break alloc_loop_begin:               #here we iterate through each                                 #memory region  cmpl  %ebx, %eax        #need more memory if these are equal  je    move_break  #grab the size of this memory  movl  HDR_SIZE_OFFSET(%eax), %edx  #If the space is unavailable, go to the  cmpl  $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  je    next_location         #next one  cmpl  %edx, %ecx     #If the space is available, compare  jle   allocate_here  #the size to the needed size.  If its                       #big enough, go to allocate_here next_location:  addl  $HEADER_SIZE, %eax #The total size of the memory  addl  %edx, %eax         #region is the sum of the size                           #requested (currently stored                           #in %edx), plus another 8 bytes                           #for the header (4 for the                           #AVAILABLE/UNAVAILABLE flag,                           #and 4 for the size of the                           #region). So, adding %edx and $8                           #to %eax will get the address                           #of the next memory region  jmp   alloc_loop_begin      #go look at the next location allocate_here:                      #if we've made it here,                                     #that means that the                              #region header of the region                              #to allocate is in %eax  #mark space as unavailable  movl  $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  addl  $HEADER_SIZE, %eax    #move %eax past the header to                              #the usable memory (since                              #that's what we return)  movl  %ebp, %esp            #return from the function  popl  %ebp  ret move_break:                      #if we've made it here, that                                  #means that we have exhausted                                  #all addressable memory, and                                  #we need to ask for more.                           #%ebx holds the current                           #endpoint of the data,                                  #and %ecx holds its size                           #we need to increase %ebx to                           #where we _want_ memory                           #to end, so we  addl  $HEADER_SIZE, %ebx #add space for the headers                           #structure  addl  %ecx, %ebx         #add space to the break for                           #the data requested                           #now its time to ask Linux                           #for more memory  pushl %eax               #save needed registers  pushl %ecx  pushl %ebx  movl  $SYS_BRK, %eax     #reset the break (%ebx has                           #the requested break point)  int   $LINUX_SYSCALL                 #under normal conditions, this should                 #return the new break in %eax, which                 #will be either 0 if it fails, or                 #it will be equal to or larger than                 #we asked for.  We don't care                 #in this program where it actually                 #sets the break, so as long as %eax                 #isn't 0, we don't care what it is  cmpl  $0, %eax           #check for error conditions  je    error  popl  %ebx               #restore saved registers  popl  %ecx  popl  %eax  #set this memory as unavailable, since we're about to  #give it away  movl  $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  #set the size of the memory  movl  %ecx, HDR_SIZE_OFFSET(%eax)  #move %eax to the actual start of usable memory.  #%eax now holds the return value  addl  $HEADER_SIZE, %eax  movl  %ebx, current_break   #save the new break  movl  %ebp, %esp            #return the function  popl  %ebp  ret error:  movl  $0, %eax              #on error, we return zero  movl  %ebp, %esp  popl  %ebp  ret ########END OF FUNCTION########  ##deallocate##  #PURPOSE:  #    The purpose of this function is to give back  #    a region of memory to the pool after we're done  #    using it.  #  #PARAMETERS:  #    The only parameter is the address of the memory  #    we want to return to the memory pool.  #  #RETURN VALUE:  #    There is no return value  #PROCESSING:  #    If you remember, we actually hand the program the  #    start of the memory that they can use, which is  #    8 storage locations after the actual start of the  #    memory region.  All we have to do is go back  #    8 locations and mark that memory as available,  #    so that the allocate function knows it can use it.  .globl deallocate  .type deallocate, @function  #stack position of the memory region to free  .equ ST_MEMORY_SEG, 4 deallocate:  #since the function is so simple, we  #don't need any of the fancy function stuff  #get the address of the memory to free  #(normally this is 8(%ebp), but since  #we didn't push %ebp or move %esp to  #%ebp, we can just do 4(%esp)  movl  ST_MEMORY_SEG(%esp), %eax  #get the pointer to the real beginning of the memory  subl  $HEADER_SIZE, %eax  #mark it as available  movl  $AVAILABLE, HDR_AVAIL_OFFSET(%eax)  #return  ret ########END OF FUNCTION##########

The first thing to notice is that there is no _start symbol. The reason is that this is just a set of functions. A memory manager by itself is not a full program - it doesn't do anything. It is simply a utility to be used by other programs.

To assemble the program, do the following:

 as alloc.s -o alloc.o

Okay, now let's look at the code.

Variables and Constants

At the beginning of the program, we have two locations set up:

 heap_begin:  .long 0 current_break:  .long 0

Remember, the section of memory being managed is commonly referred to as the heap. When we assemble the program, we have no idea where the beginning of the heap is, nor where the current break is. Therefore, we reserve space for their addresses, but just fill them with a 0 for the time being.

Next we have a set of constants to define the structure of the heap. The way this memory manager works is that before each region of memory allocated, we will have a short record describing the memory. This record has a word reserved for the available flag and a word for the region's size. The actual memory allocated immediately follows this record. The available flag is used to mark whether this region is available for allocations, or if it is currently in use. The size field lets us know both whether or not this region is big enough for an allocation request, as well as the location of the next memory region. The following constants describe this record:

  .equ HEADER_SIZE, 8  .equ HDR_AVAIL_OFFSET, 0  .equ HDR_SIZE_OFFSET, 4

This says that the header is 8 bytes total, the available flag is offset 0 bytes from the beginning, and the size field is offset 4 bytes from the beginning. If we are careful to always use these constants, then we protect ourselves from having to do too much work if we later decide to add more information to the header.

The values that we will use for our available field are either 0 for unavailable, or 1 for available. To make this easier to read, we have the following definitions:

  .equ UNAVAILABLE, 0  .equ AVAILABLE, 1

Finally, we have our Linux system call definitions:

  .equ BRK, 45  .equ LINUX_SYSCALL, 0x80

The allocate_init Function

Okay, this is a simple function. All it does is set up the heap_begin and current_break variables we discussed earlier. So, if you remember the discussion earlier, the current break can be found using the brk system call. So, the function starts like this:

  pushl %ebp  movl  %esp, %ebp  movl  $SYS_BRK, %eax  movl  $0, %ebx  int   $LINUX_SYSCALL

Anyway, after int $LINUX_SYSCALL, %eax holds the last valid address. We actually want the first invalid address instead of the last valid address, so we just increment %eax. Then we move that value to the heap_begin and current_break locations. Then we leave the function. The code looks like this:

  incl  %eax  movl  %eax, current_break  movl  %eax, heap_begin  movl  %ebp, %esp  popl  %ebp  ret

The heap consists of the memory between heap_begin and current_break, so this says that we start off with a heap of zero bytes. Our allocate function will then extend the heap as much as it needs to when it is called.

The allocate Function

This is the doozy function. Let's start by looking at an outline of the function:

Start at the beginning of the heap.
Check to see if we're at the end of the heap.
If we are at the end of the heap, grab the memory we need from Linux, mark it as "unavailable" and return it. If Linux won't give us any more, return a 0.
If the current memory region is marked "unavailable", go to the next one, and go back to step 2.
If the current memory region is too small to hold the requested amount of space, go back to step 2.
If the memory region is available and large enough, mark it as "unavailable" and return it.

Now, look back through the code with this in mind. Be sure to read the comments so you'll know which register holds which value.

Now that you've looked back through the code, let's examine it one line at a time. We start off like this:

  pushl %ebp  movl  %esp, %ebp  movl  ST_MEM_SIZE(%ebp), %ecx  movl  heap_begin, %eax  movl  current_break, %ebx

This part initializes all of our registers. The first two lines are standard function stuff. The next move pulls the size of the memory to allocate off of the stack. This is our only function parameter. After that, it moves the beginning heap address and the end of the heap into registers. I am now ready to do processing.

The next section is marked alloc_loop_begin. In this loop we are going to examine memory regions until we either find an open memory region or determine that we need more memory. Our first instructions check to see if we need more memory:

  cmpl %ebx, %eax  je   move_break

%eax holds the current memory region being examined and %ebx holds the location past the end of the heap. Therefore if the next region to be examined is past the end of the heap, it means we need more memory to allocate a region of this size. Let's skip down to move_break and see what happens there:

 move_break:  addl  $HEADER_SIZE, %ebx  addl  %ecx, %ebx  pushl %eax  pushl %ecx  pushl %ebx  movl  $SYS_BRK, %eax  int   $LINUX_SYSCALL

When we reach this point in the code, %ebx holds where we want the next region of memory to be. So, we add our header size and region size to %ebx, and that's where we want the system break to be. We then push all the registers we want to save on the stack, and call the brk system call. After that we check for errors:

  cmpl  $0, %eax  je    error

If there were no errors we pop the registers back off the stack, mark the memory as unavailable, record the size of the memory, and make sure %eax points to the start of usable memory (which is after the header).

  popl  %ebx  popl  %ecx  popl  %eax  movl  $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  movl  %ecx, HDR_SIZE_OFFSET(%eax)  addl  $HEADER_SIZE, %eax

Then we store the new program break and return the pointer to the allocated memory.

  movl  %ebx, current_break  movl  %ebp, %esp  popl  %ebp  ret

The error code just returns 0 in %eax, so we won't discuss it.

Let's go back look at the rest of the loop. What happens if the current memory being looked at isn't past the end of the heap? Well, let's look.

  movl HDR_SIZE_OFFSET(%eax), %edx  cmpl $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  je   next_location

This first grabs the size of the memory region and puts it in %edx. Then it looks at the available flag to see if it is set to UNAVAILABLE. If so, that means that memory region is in use, so we'll have to skip over it. So, if the available flag is set to UNAVAILABLE, you go to the code labeled next_location. If the available flag is set to AVAILABLE, then we keep on going.

Let's say that the space was available, and so we keep going. Then we check to see if this space is big enough to hold the requested amount of memory. The size of this region is being held in %edx, so we do this:

  cmpl  %edx, %ecx  jle   allocate_here

If the requested size is less than or equal to the current region's size, we can use this block. It doesn't matter if the current region is larger than requested, because the extra space will just be unused. So, let's jump down to allocate_here and see what happens:

  movl  $UNAVAILABLE, HDR_AVAIL_OFFSET(%eax)  addl  $HEADER_SIZE, %eax  movl  %ebp, %esp  popl  %ebp  ret

It marks the memory as being unavailable. Then it moves the pointer %eax past the header, and uses it as the return value for the function. Remember, the person using this function doesn't need to even know about our memory header record. They just need a pointer to usable memory.

Okay, so let's say the region wasn't big enough. What then? Well, we would then be at the code labeled next_location. This section of code is used any time that we figure out that the current memory region won't work for allocating memory. All it does is advance %eax to the next possible memory region, and goes back to the beginning of the loop. Remember that %edx is holding the size of the current memory region, and HEADER_SIZE is the symbol for the size of the memory region's header. So this code will move us to the next memory region:

  addl  $HEADER_SIZE, %eax  addl  %edx, %eax  jmp   alloc_loop_begin

And now the function runs another loop.

Whenever you have a loop, you must make sure that it will always end. The best way to do that is to examine all of the possibilities, and make sure that all of them eventually lead to the loop ending. In our case, we have the following possibilities:

We will reach the end of the heap
We will find a memory region that's available and large enough
We will go to the next location

The first two items are conditions that will cause the loop to end. The third one will keep it going. However, even if we never find an open region, we will eventually reach the end of the heap, because it is a finite size. Therefore, we know that no matter which condition is true, the loop has to eventually hit a terminating condition.

The deallocate Function

The deallocate function is much easier than the allocate one. That's because it doesn't have to do any searching at all. It can just mark the current memory region as AVAILABLE, and allocate will find it next time it is called. So we have:

  movl  ST_MEMORY_SEG(%esp), %eax  subl  $HEADER_SIZE, %eax  movl  $AVAILABLE, HDR_AVAIL_OFFSET(%eax)  ret

In this function, we don't have to save %ebp or %esp since we're not changing them, nor do we have to restore them at the end. All we're doing is reading the address of the memory region from the stack, backing up to the beginning of the header, and marking the region as available. This function has no return value, so we don't care what we leave in %eax.

Performance Issues and Other Problems

Our simplistic memory manager is not really useful for anything more than an academic exercise. This section looks at the problems with such a simplistic allocator.

The biggest problem here is speed. Now, if there are only a few allocations made, then speed won't be a big issue. But think about what happens if you make a thousand allocations. On allocation number 1000, you have to search through 999 memory regions to find that you have to request more memory. As you can see, that's getting pretty slow. In addition, remember that Linux can keep pages of memory on disk instead of in memory. So, since you have to go through every piece of memory your program's memory, that means that Linux has to load every part of memory that's currently on disk to check to see if it is available. You can see how this could get really, really slow. ^[6] This method is said to run in linear time, which means that every element you have to manage makes your program take longer. A program that runs in constant time takes the same amount of time no matter how many elements you are managing. Take the deallocate function, for instance. It only runs 4 instructions, no matter how many elements we are managing, or where they are in memory. In fact, although our allocate function is one of the slowest of all memory managers, the deallocate function is one of the fastest.

Another performance problem is the number of times we're calling the brk system call. System calls take a long time. They aren't like functions, because the processor has to switch modes. Your program isn't allowed to map itself memory, but the Linux kernel is. So, the processor has to switch into kernel mode, then Linux maps the memory, and then switches back to user mode for your application to continue running. This is also called a context switch. Context switches are relatively slow on x86 processors. Generally, you should avoid calling the kernel unless you really need to.

Another problem that we have is that we aren't recording where Linux actually sets the break. Previously we mentioned that Linux might actually set the break past where we requested it. In this program, we don't even look at where Linux actually sets the break - we just assume it sets it where we requested. That's not really a bug, but it will lead to unnecessary brk system calls when we already have the memory mapped in.

Another problem we have is that if we are looking for a 5-byte region of memory, and the first open one we come to is 1000 bytes, we will simply mark the whole thing as allocated and return it. This leaves 995 bytes of unused, but allocated, memory. It would be nice in such situations to break it apart so the other 995 bytes can be used later. It would also be nice to combine consecutive free spaces when looking for large allocations.

^[6]This is why adding more memory to your computer makes it run faster. The more memory your computer has, the less it puts on disk, so it doesn't have to always be interrupting your programs to retreive pages off the disk.