This web page is about C and memory - how it uses it, and where it comes from.
Because it is what I am using, the page is based on GNU C running on a Linux 2.4 kernel, running on an Intel x86 processor with a 32 bit address bus. Some of the information is general, some of the information is specific to this OS and hardware.
C recognises at least five different ways of acquiring or using memory - the first four are known as storage classes, and are auto, register, static and extern.
Storage classes are a way of describing for how long a memory location is allocated to an object - it is referred to as its lifetime. So the lifetime of the static storage class is the duration of the programme run time. The lifetime of the auto storage class is the time that the function is running - when the function returns, the lifetime expires.
The fifth way that C can acquire memory is through dynamic memory allocation. When a programme is started, only the above types of storage are provided by the kernel.
However if a programme needs more storage space during its run time, it can request more memory from the kernel, through a system call using a GNU C library function call.
This storage space is allocated by the kernel out of unused memory. The area it is taken from is known as the heap, and is controlled by the kernel.
We need to look at the source of the various types of memory that C knows about. As far as I am aware, the following is not specific to C, it is quite general, however it is based on a 32 bit address bus.
A 32 bit address bus can address 2^32 memory locations - or 4Gb of memory. It is of course only quite recently that PCs have been produced that actually contain 4Gb of RAM - early Pentium processors would probably have had around 32 to 256Mb of RAM to play with.
I have recollections that Windows NT crawled on 32Mb, worked much better on 64Mb, and a wee bit better on 128Mb - it was a case of diminishing returns. The various versions of Linux that were around at the time more or less matched MS Windows requirements.
So memory management was and is widely used - not just in x86 processors, but across a wide range of processors. The basic idea is that the processor contains a memory management unit which presents a virtual memory pile of 4Gb, and the kernel maintains a look-up table to match virtual memory addresses with the actual RAM. Typically, the amount of RAM is much less than 4Gb, so the virtual memory addresses that cannot be matched to actual addresses in RAM are matched to address equivalents on a hard drive - in Linux these address equivalents may be on a separate swap partition, or they may be within a swap file on the main OS partition.
Usually the matching is done in blocks of addresses, rather than individually - the blocks are known as pages, and in Linux on x86, they are typically 4096 bytes (4Kb) in size.
It is the responsibility of the kernel to maintain and update the paging look-up table in response to application demands.
Paging in blocks of memory is necessary to avoid having a paging look-up table that would need 2^32 entries if individual addresses were cross referenced.
In Linux, within the physical RAM, the operating system takes up the bottom section of the available addresses.
However for the virtual memory, the layout is not so clear, as I have seen some differing schemes suggested in various websites.
Some of the sites suggest that within the virtual memory pile, the operating system is allocated the top 1Gb of space, and the applications are allocated space within the 3Gb below. So in a sort of diagrammatic form, the layout would be something like -
4Gb
 |  operating system executable code
 |  operating system global variables
 |  operating system stack space
3Gb
 |  unused - part of the heap
 |  application executable code, literals and constants - the text segment
 |  application global variables - static memory allocation - the data segment
 |  application local variables - the stack - the stack segment
 |
 |  unused - part of the heap
 |  application executable code, literals and constants - the text segment
 |  application global variable - static memory allocation - the data segment
 |  application local variables - the stack - the stack segment
 |
 |  unused - part of the heap
0Gb
However another website described it all in a significantly different way -
4Gb
 |  application local variables - the stack
 |
 |  unused
 |
 |  application heap
 |  application global variables - static memory allocation - the bss segment
 |  application literals and constants - the data segment
 |  application executable code, literals and constants - the text segment
0Gb
As you can see, this is quite different from the previously shown layout.
In view of the differences between these two descriptions, I wrote a C programme to try and identify what goes where. Here is the script -
#include <stdio.h>

int a = 0;
int b = 0;
int c = 0;

int main(void)
{
    // the addresses are cast to unsigned long so that they match
    // the %lx and %lu formats
    printf(" \n \n \t These are global variables with static storage class :- \n ");
    printf(" \n \t \t address of a - hex = 0x%lx ---- decimal = %lu \n", (unsigned long)&a, (unsigned long)&a);
    printf(" \n \t \t address of b - hex = 0x%lx ---- decimal = %lu \n", (unsigned long)&b, (unsigned long)&b);
    printf(" \n \t \t address of c - hex = 0x%lx ---- decimal = %lu \n", (unsigned long)&c, (unsigned long)&c);

    int d = 0;
    int e = 0;
    int f = 0;

    printf(" \n \t These are local variables with auto storage class :- \n ");
    printf(" \n \t \t address of d - hex = 0x%lx ---- decimal = %lu \n", (unsigned long)&d, (unsigned long)&d);
    printf(" \n \t \t address of e - hex = 0x%lx ---- decimal = %lu \n", (unsigned long)&e, (unsigned long)&e);
    printf(" \n \t \t address of f - hex = 0x%lx ---- decimal = %lu \n \n", (unsigned long)&f, (unsigned long)&f);

    return(0);
}
As you can see, it first of all defines three global variables, then prints out their virtual memory addresses. Then it defines three local variables, and then prints out their memory addresses as well. All the addresses are shown both in hex and in decimal. When it is run, the result is -
These are global variables with static storage class :-

        address of a - hex = 0x804a01c ---- decimal = 134,520,860
        address of b - hex = 0x804a020 ---- decimal = 134,520,864
        address of c - hex = 0x804a024 ---- decimal = 134,520,868

These are local variables with auto storage class :-

        address of d - hex = 0xbfeeeff0 ---- decimal = 3,220,107,248
        address of e - hex = 0xbfeeefec ---- decimal = 3,220,107,244
        address of f - hex = 0xbfeeefe8 ---- decimal = 3,220,107,240
So the global variables in the data segment are low down in memory - ie, around 134Mb up from base, and successively allocated memory locations climb upwards.
The local variables in the stack are high up in memory, around 3.2Gb, and their successively allocated memory locations climb downwards.
Doing a little more diagnosis of the executable with the size command provides the following information -
> size 16.cout
   text    data     bss     dec     hex filename
   1584     264      20    1868     74c 16.cout
So this confirms the existence of the three segments - text, data, and bss.
I then looked at two processes running on my Linux pc, the bash shell, and gedit, using the pmap command. This showed ( in amongst a huge amount of other information ) the starting address of the application machine code, the start of the heap, and the start of the stack, for each application, and I got the following results :
                start         heap          stack
                --------      ------        -------
gedit           0x08048000    0x081eb000    0xbfdfe000
bash shell      0xb7b74000    0xb7f68000    0xbff4f000

In order to make these figures easier to understand, here they are again, but I've converted them into decimal.

                start            heap             stack
                --------         ------           -------
gedit           134,512,640      136,228,864      3,219,120,128
bash shell      3,082,240,000    3,086,385,152    3,220,500,480
So gedit is placed low down in virtual memory, and so is its heap - but its stack is right up above 3.2Gb.
The bash shell is high up in virtual memory, along with its heap - and its stack is just along from the gedit stack, and incidentally, just above the stack space that was allocated to the C programme.
At first sight this suggests that the bash shell process is seen as part of the operating system, and that gedit is seen as a user programme. However the 3Gb mark is address 0xc0000000, or 3,221,225,472 in decimal, and every address in the table, including both stacks, is just below that - so all of it is application space. The region just under the stacks is where Linux normally maps shared libraries, so the bash figures that I picked out of the pmap listing may well belong to one of those mappings rather than to the bash executable itself.
After all that, it looks as if the actual layout is a mixture of the above two descriptions, so possibly looks like -
4Gb
 |  operating system executable code
 |  operating system global variables
 |  operating system stack space
3Gb
 |  application local variables - the stacks
 |
 |  unused
 |
 |  application heap
 |  application global variables - static memory allocation - the bss segment
 |  application literals and constants - the data segment
 |  application executable code - the text segment
0Gb
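Other user applications are not fitted into the unused space - the kernel gives each process its own virtual memory pile, with its own look-up table, which is why two different programmes can both have their code starting at the same address, such as 0x08048000.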
When an application closes, the operating system removes the three segments, the heap, and the stack from memory. This now becomes more unused virtual memory.
As an aside, I believe that current thinking is to put the literals and constants into the text segment, along with the programme machine code, as that way they benefit from the Read-Only property of the text segment. The bss segment is somewhat historical, and tends to be considered as part of the data segment now.
Bear in mind that this layout is specifically for Linux running on an x86 processor with a 32 bit address bus. The layout on other configurations could well be different.
The stack is used in a way which is a bit different from the other sections of memory - so here are some of its characteristics.
When a programme is being written, and when a programme is being set up to run, the programmer and the kernel may not know how much data is going to be processed by the programme. So in addition to the above four storage classes, the data segments, and the stack, C allows for dynamic memory allocation during the programme run time.
The programme can request from the kernel an additional amount of storage. This is done through system calls, using functions provided within the C libraries - in GNU C they are declared within the <stdlib.h> header file - this header file includes four functions associated with dynamic memory allocation - malloc(), calloc(), realloc() and free().
The additional memory locations are provided within the heap, which is shown in the third layout diagram shown above. Each programme has its own heap, however the heap is under the control of the kernel, not under the control of the programme.
The memory blocks provided by malloc() or calloc() are always contiguous - ie, there are no breaks in them.
The only way to refer to dynamically allocated memory is through a pointer - it is this pointer that is set up when using malloc() and calloc().
Using dynamic memory allocation takes up a fair bit more time than using static or automatic storage, because there is a lot more processing required.
Time for an example - here is a small programme that defines a pointer p_1, uses malloc() to point it to a dynamic memory location, then uses free() to release the memory again.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *p_1;                           // declares pointer p_1

    p_1 = (int*)malloc(sizeof(int));    // this calls malloc for a block of memory
                                        // to hold the size of int

    if (p_1 == 0)                       // tests to see if malloc has allocated
                                        // the memory - if p_1 = 0, then malloc
                                        // hasn't given allocation
    {
        printf(" \n \n ERROR: out of memory \n \n ");
        return 1;                       // returns a 1 to indicate an error state
    }

    *p_1 = 100;                         // p_1 is a pointer to a memory location -
                                        // we can now put a value into that location

    printf(" \n \n \t the contents of p_1 = %d \n \n", *p_1);

    free(p_1);                          // releases memory space used by p_1

    return(0);
}
When it is compiled and run, it produces the result
the contents of p_1 = 100
The contents of each of the memory locations within the block of memory allocated by malloc() are indeterminate - they hold whatever happened to be left there. This is in contrast to the memory locations for global and static variables, which are preset to "0" (auto variables, like malloc() memory, start out indeterminate unless they are given a value). So to do anything sensible with the allocated memory, the contents have to be subsequently defined, as shown above in the script.
The argument for malloc() can contain a multiplication, if say you wanted memory space allocated for more than one integer - so for example, if you wanted space for 10 integers, it would be written
p_1 = (int*)malloc(10 * sizeof(int));   // this calls malloc() for a block of memory
                                        // to hold 10 ints
Using calloc() is quite similar to using malloc(). However there are two differences - the first is that the argument is written a bit differently -
p_1 = (int*)calloc(10, sizeof(int));    // this calls calloc() for a block of memory
                                        // to hold 10 ints
A bit of a trap with calloc() is that even if you only want memory space for a single int, you have to have the same format - ie -
p_1 = (int*)calloc(1, sizeof(int));     // this calls calloc() for a block of memory
                                        // to hold the size of int - the "1," is required
The second difference between calloc() and malloc() is that calloc() initialises every location within the allocated block of memory to "0". This is useful in for example arrays, where the programme is going to go on and do stuff to the values in the arrays.
A bit of a confusion factor in the above description of malloc() and calloc() is the use of the word "sizeof".
sizeof is an operator that provides the size of something - so using
....sizeof(something)....
will provide the exact storage size of (something) in bytes.
(something) can be a (type) - eg, (int), or (double) - or it can be the name of an object.
sizeof seems to be reasonably easy to understand - another word that is (I think ! ) much harder to understand is size_t.
size_t is often associated with sizeof, and can sometimes also be used in the arguments for malloc() and calloc().
size_t is a sort of type - just as int, double, long, char, are all types.
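However it isn't as simple as them - because the C standard doesn't lay down an established figure for the number of bytes that size_t occupies. It is an unsigned integer type, defined in <stddef.h> (and also made available by <stdio.h> and <stdlib.h>), and it is the type of the value that sizeof produces. The compiler sets it up to suit the hardware, so that it is big enough to hold the size of any object.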
It is always and specifically a measure of a number of bytes of memory. It cannot be negative, so it is always unsigned.
On my Linux machine, size_t occupies 4 bytes, just as int does.
I can use it to define the type of a variable, just as I would use int or double or long, etc -
size_t var_1 = 0;
and 4 bytes are allocated to var_1. However on other machines the result can be different - on a 64 bit system size_t typically occupies 8 bytes - so its size should not be taken for granted.
Higher up the page, I showed how static and auto storage classes utilised their apportioned chunks of memory. Doing the same thing with dynamic memory allocation is a bit more difficult, because - it's dynamic - and it disappears again before you have time to have a look at it.
So after a lot of fun, I evolved a way of doing it, and here is the script that enables me to see the heap dynamically. For simplicity, I have not bothered to do any testing, and I haven't put any values into the memory locations.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p_1;
    int *p_2;
    int *p_3;

    p_1 = (int*)malloc(sizeof(int));
    p_2 = (int*)malloc(sizeof(int));
    p_3 = (int*)malloc(sizeof(int));

    // %p expects a pointer to void, hence the casts
    printf(" \n ");
    printf(" \n \t contents of p_1 = %p \n", (void*)p_1 );
    printf(" \n \t contents of p_2 = %p \n", (void*)p_2 );
    printf(" \n \t contents of p_3 = %p \n", (void*)p_3 );
    printf(" \n ");

    system("gnome-terminal");           // opens a second command line window
    getchar();                          // halts here until a key is pressed

    free(p_1);
    free(p_2);
    free(p_3);

    return(0);
}
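The programme allocates three int-sized blocks with malloc(), prints the addresses held in the three pointers, opens a new command line window with system(), and then waits at getchar() for a key press before it frees the memory again.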
Because the C programme has halted whilst waiting for a keyboard response, as far as Linux is concerned, the C programme is still a running process - so using the command line window, I could use ps to get a pid, then use pmap to see the memory allocation. Here it is -
4634:   19.cout
START       SIZE     RSS     PSS   DIRTY PERM MAPPING
08048000      4K      4K      4K      0K r-xp /stuff/c-stuff/19.cout
08049000      4K      4K      4K      4K r--p /stuff/c-stuff/19.cout
0804a000      4K      4K      4K      4K rw-p /stuff/c-stuff/19.cout
0804b000    132K      4K      4K      4K rw-p [heap]
b7f62000      4K      4K      4K      4K rw-p [anon]
b7f63000   1268K    236K      5K      0K r-xp /lib/libc-2.8.so
b80a0000      8K      8K      8K      8K r--p /lib/libc-2.8.so
b80a2000      4K      4K      4K      4K rw-p /lib/libc-2.8.so
b80a3000     16K     12K     12K     12K rw-p [anon]
b80b6000      8K      4K      4K      4K rw-p [anon]
b80b8000    108K     92K      1K      0K r-xp /lib/ld-2.8.so
b80d3000      4K      4K      4K      4K r--p /lib/ld-2.8.so
b80d4000      4K      4K      4K      4K rw-p /lib/ld-2.8.so
bfcbf000     84K     12K     12K     12K rw-p [stack]
ffffe000      4K      0K      0K      0K r-xp [vdso]
Total:     1656K    396K     74K     64K

256K writable-private, 1400K readonly-private, 0K shared, and 396K referenced
From this we can see that the programme itself, ie, the text segment, starts at 0x08048000, the heap starts at 0x0804b000, and the heap has been allocated 132Kb.
Now of course the C programme also printed out the addresses held in the three pointer variables, and the result of this is

        contents of p_1 = 0x0804b008
        contents of p_2 = 0x0804b018
        contents of p_3 = 0x0804b028

So we can see that the three blocks sit right at the start of the heap, and that they are spaced 16 bytes apart, even though only 4 bytes were asked for each time - the rest is taken up by malloc()'s own bookkeeping and alignment.
Neat eh !