I have two virtual machines running on the exact same hardware. One of them runs SLES12SP5 (kernel 4.12) and the other runs Ubuntu 20.04 (kernel 5.4). I see similar results with SLES15SP2, but I don't have one at hand at the moment. I have a really simple C program which measures writing into allocated memory pages and copying them. This program is WAY faster on SLES than on Ubuntu and I can't figure out why. Unfortunately I can't activate the performance counters on the ESX host because of the cluster configuration; maybe that would help to find out what's happening here.
Okay, so here is the program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef __linux__
#include <sys/time.h>
#elif defined(_WIN32)
#include <sys/timeb.h>
#endif

/* Wall-clock time in seconds, truncated to millisecond resolution. */
#ifdef __linux__
double get_time_insec() {
    struct timeval tstruct;
    gettimeofday(&tstruct, NULL);
    long long milliseconds = tstruct.tv_sec*1000LL + tstruct.tv_usec/1000;
    return (double)milliseconds/1000;
}
#elif defined(_WIN32)
double get_time_insec() {
    struct timeb tstruct;
    ftime(&tstruct);
    return (double)tstruct.time + ((double)tstruct.millitm / (double)1000);
}
#endif

/* Parenthesized so the macro expands safely inside larger expressions. */
#define PAGESIZE (8 * 1024)

int main(int argc, char *argv[]) {
    int i = 0;
    int x = 0;
    double startval = 0;
    int pagenum = 0;
    int offset = 0;
    int numOfPages = 0;
    int numOfTimes = 0;
    size_t allocBytes = 0;

    if (argc != 3) {
        printf("Usage %s numOfPages numOfTimes\n", argv[0]);
        return -1;
    }
    numOfPages = atoi(argv[1]);
    numOfTimes = atoi(argv[2]);
    /* Twice the requested size; computed in size_t to avoid int overflow. */
    allocBytes = (size_t)numOfPages * PAGESIZE * 2;

    // Allocate memory pages
    printf("Allocating %zu Bytes\n", allocBytes);
    char *mymem = malloc(allocBytes);
    if (mymem == NULL) {
        printf("Allocation failed!\n");
        return -1;
    }

    // Fill the first half of the pages with text, 32 bytes at a time
    printf("Filling %d Pages %d times...", numOfPages, numOfTimes);
    fflush(stdout);
    startval = get_time_insec();
    for (x = 0; x < numOfTimes; x++) {
        for (pagenum = 0; pagenum < numOfPages; pagenum++) {
            for (i = 0; i < PAGESIZE; i += 32) {
                offset = (pagenum * PAGESIZE) + i;
                memcpy(mymem + offset, "ABCDEFGHIJKLMNOPQRSTUVWXYZ123456", 32);
            }
        }
    }
    printf("Time taken: %.6f sec\n", get_time_insec() - startval);

    // And now copy them to the next half of pages.
    // Note: the loop below copies all numOfPages pages per pass,
    // although the message reports numOfPages/2.
    printf("Copying %d Pages %d times...", numOfPages/2, numOfTimes);
    fflush(stdout);
    startval = get_time_insec();
    for (x = 0; x < numOfTimes; x++) {
        for (pagenum = 0; pagenum < numOfPages; pagenum++) {
            memcpy((mymem + (numOfPages/2) * PAGESIZE) + (pagenum * PAGESIZE),
                   mymem + (pagenum * PAGESIZE),
                   PAGESIZE);
        }
    }
    printf("Time taken: %.6f sec\n", get_time_insec() - startval);

    free(mymem);
    return 0;
}
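Since the timer in the program truncates to whole milliseconds and follows the wall clock, a monotonic timer may give cleaner numbers. The following is only a minimal sketch, assuming a POSIX system that provides `clock_gettime` with `CLOCK_MONOTONIC`; the name `get_time_insec_monotonic` is made up for illustration and is not part of the program above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for clock_gettime on strict compilers */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical replacement for get_time_insec() on Linux:
   seconds from a monotonic clock, with nanosecond fields. */
double get_time_insec_monotonic(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);  /* assumption: the monotonic clock is available */
    (void)rc;         /* silence the unused warning when NDEBUG is set */
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

It could be swapped in for the Linux branch of `get_time_insec()` without touching the rest of the program; only differences between two readings are meaningful, not the absolute value.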
And these are the results for running it with "2000 200":
Ubuntu:
Filling 2000 Pages 200 times...Time taken: 0.606000 sec
Copying 1000 Pages 200 times...Time taken: 0.921000 sec
SLES:
Filling 2000 Pages 200 times...Time taken: 0.513000 sec
Copying 1000 Pages 200 times...Time taken: 0.479000 sec
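To put these timings in perspective, here is a small sketch that converts them into throughput and time ratios. It assumes that each phase touches 2000 pages of 8 KiB per pass (the copy loop in the program iterates over all `numOfPages` pages, even though its message prints half that number) and that 1 MB means 10^6 bytes; the function names are made up for illustration. With these assumptions the fill phase differs by a factor of about 1.18, and the copy phase by about 1.92.

```c
#include <assert.h>
#include <stdio.h>

#define PAGESIZE (8 * 1024)  /* same page size as in the test program */

/* Bytes written by one phase: every pass touches num_pages pages. */
double bytes_per_phase(int num_pages, int num_times) {
    return (double)num_pages * PAGESIZE * num_times;
}

/* Throughput in MB/s, with 1 MB taken as 10^6 bytes (assumption). */
double mb_per_sec(double bytes, double seconds) {
    assert(seconds > 0.0);
    return bytes / seconds / 1e6;
}

/* Prints throughput on both systems and the Ubuntu/SLES time ratio. */
void report_phase(const char *phase, double bytes,
                  double ubuntu_sec, double sles_sec) {
    printf("%-4s Ubuntu %.0f MB/s, SLES %.0f MB/s, time ratio %.2f\n",
           phase, mb_per_sec(bytes, ubuntu_sec),
           mb_per_sec(bytes, sles_sec), ubuntu_sec / sles_sec);
}

/* Hypothetical driver fed with the timings quoted above. */
void report_measurements(void) {
    double bytes = bytes_per_phase(2000, 200);
    report_phase("fill", bytes, 0.606, 0.513);
    report_phase("copy", bytes, 0.921, 0.479);
}
```

Calling `report_measurements()` from a `main` function prints one line per phase.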
I can't get my head around why SLES is roughly twice as fast as Ubuntu at copying. I thought that writing into memory is an operation where the OS doesn't do anything, apart from allocating the pages in the MMU beforehand. So why exactly could it be that SLES is twice as fast? Is there any kernel parameter involved? Any parameter in the virtual memory subsystem?
I would love to solve this, as it has been bugging me for weeks now!
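One way to check the assumption that the OS is only involved when the pages are first set up would be to time the first write to a fresh buffer separately from a second write to the same buffer. The following is a minimal sketch, assuming a POSIX system with `clock_gettime`; the names `now_sec` and `time_first_and_second_touch` are made up for illustration, and it says nothing yet about which system would behave differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for clock_gettime on strict compilers */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Seconds from a monotonic clock. */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Writes a freshly allocated buffer twice and reports both durations.
   Returns 0 on success, -1 if the allocation fails. */
int time_first_and_second_touch(size_t bytes,
                                double *first_sec, double *second_sec) {
    assert(bytes > 0 && first_sec != NULL && second_sec != NULL);

    /* Calling memset through a volatile pointer keeps the compiler
       from merging or dropping the two writes. */
    void *(*volatile fill)(void *, int, size_t) = memset;

    char *buf = malloc(bytes);
    if (buf == NULL) {
        return -1;
    }

    double t0 = now_sec();
    fill(buf, 'A', bytes);  /* first touch of every page */
    double t1 = now_sec();
    fill(buf, 'B', bytes);  /* same pages, already touched once */
    double t2 = now_sec();

    free(buf);
    *first_sec = t1 - t0;
    *second_sec = t2 - t1;
    return 0;
}
```

If the first duration is much larger than the second on one system but not on the other, the difference would point at page setup rather than at the copy itself.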
Thomas