
Experimenting with hugepages to reduce CPU and memory resources
For processes that use a lot of memory, the virtual-to-physical memory mapping (i.e. the page table) needs to hold a large number of mappings (called page table entries, or PTEs) and may grow very large. A very large page table claims extra resources on the system: more memory is needed to hold the page table itself, and more CPU cycles are spent walking it. The system may therefore benefit from keeping the number of page table entries to a minimum.
This is where hugepages come in handy. Using hugepages, we increase the size of the memory chunks allocated by the process, thus reducing the number of memory mappings the process needs. Instead of one page table entry mapping 4 kB of data (the default page size on many systems), each entry may map for example 2 MB of data (the hugepage size on the server used below).
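To put rough numbers on that: mapping the 19 GB (20,401,094,656 bytes) allocated in the experiment below with 4 kB pages takes on the order of five million page table entries, while 2 MB hugepages cover the same amount of memory with fewer than 10,000 entries.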
To test how a process's memory usage affects the page table on one of our Linux servers at work, I created two tiny applications written in C. The first allocates 20 GB of memory using regular memory allocations, and the second allocates a comparable amount (19 GB) using hugepages.
For simplicity, we’ll start with the latter program, hereafter referred to as the hugepages application. To prepare the server for using hugepages, I added this line to /etc/sysctl.conf:
[root@server ~]# grep hugepages /etc/sysctl.conf
vm.nr_hugepages=10000
The current (default) regular page size and the hugepage size were found as follows:
[root@server ~]# getconf PAGESIZE
4096
[root@server ~]# grep -i hugepagesize /proc/meminfo
Hugepagesize:       2048 kB
Based on the default hugepage size of the system, the number of hugepages allocated (10,000 × 2 MB = 20 GB) would set me up with enough hugepages to map the entire hugepages application.
After a reboot, I could see that the hugepages had been preallocated by the system, and that the 20 GB of memory was already reserved:
[root@server ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         72439      20110      52328          0          0         12
-/+ buffers/cache:      20097      52342
Swap:         7999          0       7999
At this point, before running the application, the page table size on the system was relatively small:
[root@server ~]# grep PageTables /proc/meminfo
PageTables:         2796 kB
(Note that this number also includes the page table entries for the other processes running on the server at the time. I didn't calculate the exact number.)
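As an aside, if one wants the page table size of a single process rather than the system-wide total, the kernel exposes it as the VmPTE field in /proc/&lt;pid&gt;/status. Here is a minimal sketch of how a process could print its own value (this little helper is my illustration, not part of the experiment):

#include <stdio.h>
#include <string.h>

/* Print this process's own page table size, as reported by the kernel
   in the VmPTE line of /proc/self/status. Purely illustrative. */
int main(void)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "VmPTE:", 6) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}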
This is the contents of memory_hugepages.c, the hugepages application I ran in my experiment:
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <math.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

int main()
{
    int no_of_GB = 19;
    uint64_t memsize = no_of_GB * pow(2,30);
    int sleeptime = 600;
    uint64_t i;

    /* Back the mapping with a file on the hugetlbfs mount, so the
       allocation is served from the preallocated hugepage pool. */
    int fd = open("/mnt/hugetablefs/fileA", O_CREAT|O_RDWR, 0600);

    printf ("Allocating %d GB (%llu bytes) of memory...\n", no_of_GB, memsize);
    char *my_string = mmap(0, memsize, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    /* Touch every byte so the memory is actually faulted in. */
    for (i = 0; i < memsize; i++) {
        my_string[i] = 'a';
    }

    printf ("Sleeping %d seconds...\n", sleeptime);
    sleep(sleeptime);
}
As one can see, the application allocates 19 GB of memory, fills it with dummy data, and then sleeps for ten minutes so that I could retrieve the page table data I needed before the application exits and the memory is freed.
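Note that the program relies on /mnt/hugetablefs being an existing hugetlbfs mount (created with something along the lines of mount -t hugetlbfs none /mnt/hugetablefs; that step is not shown here). On reasonably recent kernels (2.6.32 and later), a variant that should work without the backing file is to request hugepages directly with MAP_ANONYMOUS | MAP_HUGETLB. A minimal, untested sketch of that approach:

#include <stdio.h>
#include <inttypes.h>
#include <unistd.h>
#include <sys/mman.h>

/* Sketch: allocate 19 GB backed by hugepages using an anonymous
   MAP_HUGETLB mapping instead of a file on hugetlbfs. The length is a
   multiple of the 2 MB hugepage size, and the mapping will fail if the
   preallocated hugepage pool is too small. */
int main(void)
{
    uint64_t memsize = 19ULL << 30;   /* 19 GB, as in the experiment */
    uint64_t i;

    char *p = mmap(NULL, memsize, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    for (i = 0; i < memsize; i++)
        p[i] = 'a';                   /* touch the memory so it is faulted in */

    sleep(600);
    return 0;
}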
I compiled and ran the hugepages application like this:
[root@server ~]# gcc -o memory memory_hugepages.c && ./memory
Allocating 19 GB (20401094656 bytes) of memory...
Sleeping 600 seconds...
…and as soon as the memory had been allocated I could see that both the size of the page table and the amount of used memory were approximately the same as before:
[root@server ~]# grep PageTables /proc/meminfo
PageTables:         2796 kB
[root@server ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         72439      20119      52320          0          1         20
-/+ buffers/cache:      20097      52341
Swap:         7999          0       7999
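This is what the arithmetic predicts: 19 GB mapped in 2 MB chunks is fewer than 10,000 page table entries, which at 8 bytes apiece amounts to well under 100 kB of page table data, far too little to stand out next to the 2,796 kB already in use by the rest of the system.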
Next, I repeated the experiment, this time running the application that uses regular memory allocations. This is the contents of memory_normal.c:
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <math.h>
#include <unistd.h>

int main()
{
    int no_of_GB = 20;
    uint64_t memsize = no_of_GB * pow(2,30);
    int sleeptime = 600;
    uint64_t i;

    printf ("Allocating %d GB (%llu bytes) of memory...\n", no_of_GB, memsize);

    /* Regular allocation: the memory gets mapped with normal 4 kB pages. */
    char *my_string = malloc(memsize);

    /* Touch every byte so the memory is actually faulted in. */
    for (i = 0; i < memsize; i++) {
        my_string[i] = 'a';
    }

    printf ("Sleeping %d seconds...\n", sleeptime);
    sleep(sleeptime);
}
I ran the application, and after it had allocated the memory (20 GB this time, compared to 19 GB for the hugepages application), I could see that the page table size had grown by almost 40 MB (subtract the initial page table size from the number below to find the exact increase):
[root@server ~]# grep PageTables /proc/meminfo
PageTables:        41272 kB
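That increase is in the ballpark of what one would expect: 20 GB mapped in 4 kB pages is roughly five million page table entries, and at 8 bytes per entry that alone is around 40 MB of page tables for this single process.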
As we can see, mapping this much memory with regular memory allocations consumes far more page table resources than mapping roughly the same amount of memory with hugepages. So if you are running applications that use lots and lots of memory, you may want to consider hugepages, provided the application supports this feature.