Memory

This web page is about C and memory - how it uses it, and where it comes from.

Because it is what I am using, the page is based on GNU C running on Linux 2.4 kernel running on an Intel x86 processor with a 32 bit address bus. Some of the information is general, some of the information is specific to this OS and hardware.

C and its types of memory

C recognises at least five different ways of acquiring or using memory - the first four are known as storage classes, and are

⇒ static: static memory is allocated by the kernel when the programme is requested - anything put into static memory stays there for the whole time that the programme is running, and it is only released when the programme closes - static storage is typically used for global variables and global fixed length arrays
⇒ auto: automatic storage is contained within a block of memory allocated by the kernel when the programme is requested, and this block of memory is called the stack - each programme has its own exclusive stack, which can grow up to a fixed maximum size - the programme is free to use the stack as it needs to - it is used for things like local variables, which only exist within a single function or block of code
⇒ register: this storage class asks for registers rather than memory to be used for storage - registers can provide fast response times, however it is entirely up to the compiler whether a register or normal memory is allocated - if the compiler doesn't comply with a register storage request, then the data is stored in automatic storage as above
⇒ extern: an extension of static, which allows the contents to be "seen" across different files, which are subsequently linked during the final part of the compilation process

Storage classes are a way of describing for how long a memory location is allocated to an object - it is referred to as its lifetime. So the lifetime of the static storage class is the duration of the programme run time. The lifetime of the auto storage class is the time that the function is running - when the function returns, the lifetime expires.

The fifth way that C can acquire memory is through dynamic memory allocation. When a programme is requested, only the above types of storage are provided by the kernel.

However if a programme needs more storage space during its run time, it can request more memory through a GNU C library function call, and the library in turn asks the kernel for more memory through a system call when it needs to.

This storage space is allocated out of unused memory. This memory space is known as the heap - the kernel supplies it, and the C library keeps track of which parts of it are in use.

How memory is organised

We need to look at the source of the various types of memory that C knows about. As far as I am aware, the following is not specific to C, it is quite general, however it is based on a 32 bit address bus.

A 32 bit address bus can address 2³² memory address locations - or 4Gb of memory. It is of course only quite recently that PCs have been produced that actually contain 4Gb of RAM - early Pentium processors would probably have around 32 to 256Mb of RAM to play with.

I have recollections that Windows NT crawled on 32Mb, worked much better on 64Mb, and a wee bit better on 128Mb - it was a case of diminishing returns. The various versions of Linux that were around at the time more or less matched MS Windows requirements.

So memory management was and is widely used - not just in x86 processors, but across a wide range of processors. The basic idea is that the processor contains a memory management unit which creates a virtual memory pile of 4Gb, and the kernel maintains a look-up table to match virtual memory addresses with the actual RAM. Typically, the amount of RAM is much less than 4Gb, so the virtual memory addresses that cannot be matched to actual addresses in RAM are matched to address equivalents on a hard drive - in Linux these address equivalents may be on a separate SWAP partition, or they may be within files on the main OS partition.

Usually the matching is done in blocks of addresses, rather than individually - the blocks are known as pages, and in Linux on x86, they are typically 4096 bytes ( 4Kb ) in size.

It is the responsibility of the kernel to maintain and update the paging look-up table in response to application demands.

Paging in blocks of memory is necessary to avoid having a paging look-up table that would require to have 2³² entries if individual addresses were cross referenced.

In Linux, within the physical RAM, the operating system takes up the bottom section of the available addresses.

However for the virtual memory, the layout is not so clear, as I have seen some differing schemes suggested in various websites.

Some of the sites suggest that within the virtual memory pile, the operating system is allocated the top 1Gb of space, and the applications are allocated space within the 3Gb below. So in a sort of diagramatic form, the layout would be something like -


	4Gb

         |      operating system executable code

         |      operating system global variables

         |      operating system stack space

	3Gb

         |               unused  -  part of the heap

         |
                application executable code, literals and constants  -  the text segment
         |
                application global variables  - static memory allocation  -  the data segment
         |
                application local variables  -  the stack  -  the stack segment
         |

         |              unused  -  part of the heap

         |
                application executable code, literals and constants  -  the text segment
         |
                application global variable  -  static memory allocation  -  the data segment
         |
                application local variables  -  the stack  -  the stack segment
         |

         |              unused  -  part of the heap

        0Gb

However another website described it all in a significantly different way -


	4Gb

         |      application local variables  -  the stack

         |

         |               unused

         |
 
         |      application heap

         |      application global variables  -  static memory allocation  -  the bss segment

         |      application literals and constants                         -  the data segment

         |      application executable code, literals and constants        -  the text segment

        0Gb

As you can see, this is quite different from the previously shown layout.

In view of the differences between these two descriptions, I wrote a C programme to try and identify what goes where. Here is the programme -


        #include <stdio.h>

        int a = 0;                                      // three global variables - static storage class
        int b = 0;
        int c = 0;

        int main(void)
        {
                // %p needs a (void *) pointer - the cast to unsigned long gives the decimal value

                printf(" \n \n \t These are global variables with static storage class :- \n ");
                printf(" \n \t \t address of a - hex = %p ---- decimal = %lu \n", (void *)&a, (unsigned long)&a);
                printf(" \n \t \t address of b - hex = %p ---- decimal = %lu \n", (void *)&b, (unsigned long)&b);
                printf(" \n \t \t address of c - hex = %p ---- decimal = %lu \n", (void *)&c, (unsigned long)&c);

                int d = 0;                              // three local variables - auto storage class
                int e = 0;
                int f = 0;

                printf(" \n \t These are local variables with auto storage class :- \n ");
                printf(" \n \t \t address of d - hex = %p ---- decimal = %lu \n", (void *)&d, (unsigned long)&d);
                printf(" \n \t \t address of e - hex = %p ---- decimal = %lu \n", (void *)&e, (unsigned long)&e);
                printf(" \n \t \t address of f - hex = %p ---- decimal = %lu \n \n", (void *)&f, (unsigned long)&f);

                return(0);
        }

As you can see, it first of all defines three global variables, then prints out their virtual memory addresses. Then it defines three local variables, and then prints out their memory addresses as well. All the addresses are shown both in hex and in decimal. When it is run, the result is -


 	 These are global variables with static storage class :- 
  
 	 	 address of a - hex = 0x804a01c ---- decimal = 134,520,860 
 
 	 	 address of b - hex = 0x804a020 ---- decimal = 134,520,864 
 
 	 	 address of c - hex = 0x804a024 ---- decimal = 134,520,868 
 
 	 These are local variables with auto storage class :- 
  
 	 	 address of d - hex = 0xbfeeeff0 ---- decimal = 3,220,107,248 
 
 	 	 address of e - hex = 0xbfeeefec ---- decimal = 3,220,107,244 
 
 	 	 address of f - hex = 0xbfeeefe8 ---- decimal = 3,220,107,240

So the global variables in the data segment are low down in memory - ie, around 134Mb up from base, and successively allocated memory locations climb upwards.

The local variables in the stack are high up in memory, around 3.2Gb, and their successively allocated memory locations climb downwards.

Doing a little more diagnosis of the executable with the size command provides the following information -


        > size 16.cout
 
            text     data    bss     dec    hex    filename
            1584     264     20     1868    74c	    16.cout

So this confirms the existence of the three segments - text, data, and bss.

I then looked at two processes running on my Linux pc, the bash shell, and gedit, using the pmap command. This showed ( in amongst a huge amount of other information ) the starting address of the application machine code, the start of the heap, and the start of the stack, for each application, and I got the following results :



                           start              heap             stack
                          --------           ------           -------
            
        gedit            0x08048000        0x081eb000        0xbfdfe000

        bash shell       0xb7b74000        0xb7f68000        0xbff4f000

In order to make these figures easier to understand, here they are again, but I've converted them into decimal.



                           start              heap             stack
                          --------           ------           -------
            
        gedit           134,512,640       136,228,864      3,219,120,128

        bash shell     3,082,240,000     3,086,385,152     3,220,500,480

So gedit is placed low down in virtual memory, and so is its heap - but its stack is right up high, just below the 3Gb boundary ( 0xc0000000 ).

The bash shell figures are high up in virtual memory, along with its heap - and its stack is at much the same address as the gedit stack, and incidentally, just above the stack addresses that were used by the C programme.

It is tempting to conclude that the bash shell process is seen as part of the operating system. However each process has its own separate virtual address space, so the same virtual address in two processes refers to different memory, and bash is an ordinary user programme just as gedit is. The high start address for bash is more likely to come from where that particular mapping was loaded - shared libraries, for example, are mapped high up, not far below the stack.

After all that, it looks as if the actual layout is a mixture of the above two descriptions, so possibly looks like -


	4Gb

         |      operating system executable code

         |      operating system global variables

         |      operating system stack space

         |      application local variables  -  the stacks

        3Gb

         |

         |               unused

         |
 
         |      application heap

         |      application global variables  -  static memory allocation  -  the bss segment

         |      application literals and constants                         -  the data segment

         |      application executable code                                -  the text segment

        0Gb

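Each process gets its own copy of this virtual memory layout, so other user applications do not have to be fitted into the unused space - the kernel keeps a separate paging look-up table for each process.

When an application closes, the operating system removes the three segments, the heap, and the stack from memory, and the RAM that they occupied becomes available for other processes.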

As an aside, I believe that current thinking is to put the literals and constants into the text segment, along with the programme machine code, as that way they benefit from the Read-Only property of the text segment. The bss segment is somewhat historical, and tends to be considered as part of the data segment now.

Bear in mind that this layout is specifically for Linux running on an x86 processor with a 32 bit address bus. The layout on other configurations could well be different.

The stacks

Each stack is used in a way which is a bit different from the other sections of memory - so here are some of its characteristics.

⇒ used for local variables and parameters for the automatic storage class, so the storage lifetime is only whilst the function or code block is running
⇒ its use is under the control of the programme
⇒ the stack remembers the order in which functions are called so that function returns occur correctly
⇒ local variables / parameters are "pushed" onto the stack when a function is called
⇒ they are then "popped" off again when the function returns
⇒ the stack therefore operates in "First In Last Out" mode
⇒ the amount of the stack in use changes dynamically as the programme runs
⇒ the stack has a maximum size set by the kernel when the programme is being set up to run
⇒ because of its fixed maximum size, it is vulnerable to stack overflow problems, if the programme tries to use too much of it

Dynamic memory allocation

When a programme is being written, and when a programme is being set up to run, the programmer and the kernel may not know how much data is going to be processed by the programme. So in addition to the above four storage classes, the data segments, and the stack, C allows for dynamic memory allocation during the programme run time.

The programme can request an additional amount of storage. This is done using functions provided within the C libraries, which make system calls to the kernel when they need more memory - in GNU C they are declared in the <stdlib.h> header file - this header file declares four functions associated with dynamic memory allocation -

malloc() - allocates a new block of memory
calloc() - also for allocating a new block of memory
realloc() - resize an allocated memory block
free() - releases allocated memory block

The additional memory locations are provided within the heap, which is shown in the third layout diagram shown above. Each programme has its own heap, however the memory for the heap is supplied by the kernel, and the bookkeeping inside it is done by the C library rather than directly by the programme.

The memory blocks provided by malloc() or calloc() are always contiguous - ie, there are no breaks in them.

The only way to refer to dynamically allocated memory is through a pointer - malloc() and calloc() return the address of the new block, and it is stored in a pointer.

Using dynamic memory allocation takes up a fair bit more time than using static or automatic storage, because there is a lot more processing required.

Time for an example - here is a small programme that defines a pointer p_1, uses malloc() to point it to a dynamic memory location, then uses free() to release the memory again.


        #include <stdio.h>

        #include <stdlib.h>


        int main()

              {


        int *p_1;                                                 // declares pointer p_1


        p_1 = (int*)malloc(sizeof(int));                          // this calls malloc for a block of memory
                                                                  // to hold the size of int


        if (p_1 == NULL)                                          // tests to see if malloc has allocated 
                                                                  // the memory - if p_1 is NULL, then malloc
                                                                  // hasn't given allocation 

                {

                printf(" \n \n ERROR: out of memory \n \n ");       

                return 1;                                         // returns a 1 to indicate an error state

                }

         *p_1 = 100;                                              // p_1 is a pointer to a memory location -
                                                                  // we can now put a value into that location

         printf(" \n \n \t the contents of p_1 = %d \n \n", *p_1);

         free(p_1);                                               // releases memory space used by p_1

         return(0);

                }

When it is compiled and run, it produces the result


         the contents of p_1 = 100

The contents of each of the memory locations within the block of memory allocated by malloc() are indeterminate - they could hold anything. This is in contrast to the memory locations for variables with static storage, which are preset to "0" ( local variables with auto storage are indeterminate too, until they are given a value ). So to do anything sensible with the allocated memory, the contents have to be subsequently defined, as shown above in the programme.

The argument for malloc() can contain a multiplication, if say you wanted memory space allocated for more than one integer - so for example, if you wanted space for 10 integers, it would be written


        p_1 = (int*)malloc(10 * sizeof(int));                   // this calls malloc() for a block of memory
                                                                // to hold the size of 10 ints

Using calloc() is quite similar to using malloc(). However there are two differences - the first is that the argument is written a bit differently -


        p_1 = (int*)calloc(10, sizeof(int));                    // this calls calloc() for a block of memory
                                                                // to hold the size of 10 ints

A bit of a trap with calloc() is that even if you only want memory space for a single int, you have to have the same format - ie -


        p_1 = (int*)calloc(1, sizeof(int));                     // this calls calloc() for a block of memory
                                                                // to hold the size of int - the "1," is required

The second difference between calloc() and malloc() is that calloc() initialises every location within the allocated block of memory to "0". This is useful in for example arrays, where the programme is going to go on and do stuff to the values in the arrays.

sizeof and size_t

A bit of a confusion factor in the above description of malloc() and calloc() is the use of the word "sizeof".

sizeof is an operator that provides the size of something - so using


        ....sizeof(something)....

will provide the exact storage size of (something) in bytes.

(something) can be a (type) - eg, (int), or (double) - or it can be the name of an object.

sizeof seems to be reasonably easy to understand - another word that is (I think ! ) much harder to understand is size_t.

size_t is often associated with sizeof, and can sometimes also be used in the arguments for malloc() and calloc().

size_t is a sort of type - just as int, double, long, char, are all types.

However it isn't as simple as them - because the C standard doesn't lay down an established figure for the number of bytes that size_t occupies. It is defined by the compiler and its libraries ( in the <stddef.h> header file ) to suit the hardware, and it is the type of the result that sizeof gives.

It is always an unsigned integer type, so it cannot be negative, and it is big enough to hold the size in bytes of any object.

On my Linux machine, size_t occupies 4 bytes, just as int does.

I can use it to define the type of a variable, just as I would use int or double or long, etc -


        size_t var_1 = 0;

and 4 bytes are allocated to var_1. On other platforms the size can differ - on most 64 bit systems size_t occupies 8 bytes - so code should not assume that it is the same size as int.

Malloc and the heap

Higher up the page, I showed how static and auto storage classes utilised their apportioned chunks of memory. Doing the same thing with dynamic memory allocation is a bit more difficult, because - it's dynamic - and it disappears again before you have time to have a look at it.

So after a lot of fun, I evolved a way of doing it, and here is the programme that enables me to see the heap dynamically. For simplicity, I have not bothered to test the values returned by malloc(), and I haven't put any values into the memory locations.


        #include <stdio.h>

        #include <stdlib.h>


        int main()

              {

        int *p_1;  

        int *p_2;

        int *p_3;


        p_1 = (int*)malloc(sizeof(int)); 

        p_2 = (int*)malloc(sizeof(int));

        p_3 = (int*)malloc(sizeof(int));  


        printf(" \n ");

        printf(" \n \t contents of p_1 = %p \n", (void *)p_1 );

        printf(" \n \t contents of p_2 = %p \n", (void *)p_2 );

        printf(" \n \t contents of p_3 = %p \n", (void *)p_3 );

        printf(" \n ");


        system("gnome-terminal");

        getchar();


        free(p_1);

        free(p_2);

        free(p_3);

        return(0);

                }

The programme

declares three pointers
requests a chunk of memory for each of them
prints out the contents of each of the pointers - ie, the memory addresses they contain
it then calls for Linux to start up a command line window
it then halts the programme whilst waiting for a keyboard response
after I have come back to the programme, it deletes the memory allocations, and returns.

Because the C programme has halted whilst waiting for a keyboard response, as far as Linux is concerned, the C programme is still a running process - so using the command line window, I could use ps to get a pid, then use pmap to see the memory allocation. Here it is -


        4634: 19.cout

        START       SIZE     RSS     PSS   DIRTY PERM MAPPING

        08048000      4K      4K      4K      0K r-xp /stuff/c-stuff/19.cout
        08049000      4K      4K      4K      4K r--p /stuff/c-stuff/19.cout
        0804a000      4K      4K      4K      4K rw-p /stuff/c-stuff/19.cout
        0804b000    132K      4K      4K      4K rw-p [heap]
        b7f62000      4K      4K      4K      4K rw-p [anon]
        b7f63000   1268K    236K      5K      0K r-xp /lib/libc-2.8.so
        b80a0000      8K      8K      8K      8K r--p /lib/libc-2.8.so
        b80a2000      4K      4K      4K      4K rw-p /lib/libc-2.8.so
        b80a3000     16K     12K     12K     12K rw-p [anon]
        b80b6000      8K      4K      4K      4K rw-p [anon]
        b80b8000    108K     92K      1K      0K r-xp /lib/ld-2.8.so
        b80d3000      4K      4K      4K      4K r--p /lib/ld-2.8.so
        b80d4000      4K      4K      4K      4K rw-p /lib/ld-2.8.so
        bfcbf000     84K     12K     12K     12K rw-p [stack]
        ffffe000      4K      0K      0K      0K r-xp [vdso]

        Total:     1656K    396K     74K     64K

        256K writable-private, 1400K readonly-private, 0K shared, and 396K referenced

From this we can see that the programme itself, ie, the text segment, starts at 0x08048000, the heap starts at 0x0804b000, and the heap has been allocated 132Kb.

Now of course the C programme also printed out the contents of the three pointers - ie, the addresses of the three allocated blocks - and the result of this is


        contents of p_1 = 0x0804b008

        contents of p_2 = 0x0804b018

        contents of p_3 = 0x0804b028

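So we can see that the three blocks are 16 bytes apart within the heap, even though only 4 bytes were requested for each - the allocator rounds each request up to a minimum block size and keeps some bookkeeping information alongside it. This is also why the first block starts 8 bytes above the start of the heap at 0x0804b000.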

Neat eh !

website design by ron-t
