Category : memory-alignment

env: x86-64; linux-centos; 8-cpu-core. To test false-sharing performance, I wrote C++ code like this: volatile int32_t a; volatile int32_t b; int64_t p1[7]; volatile int64_t c; int64_t p2[7]; volatile int64_t d; void thread1(int param) { auto start = chrono::high_resolution_clock::now(); for (size_t i = 0; i < 1000000000; ++i) { a = i % 512; ..
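
For readers comparing layouts, here is a minimal sketch of one common alternative to manual padding arrays: giving each counter its own cache line with `alignas`. It is not the poster's code. The 64-byte line size, the struct names, and the `hammer` helper are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>

// Assumption: 64-byte cache lines, which is typical for x86-64.
constexpr std::size_t kCacheLine = 64;

// Counters packed side by side: both will usually share one cache line.
struct PackedCounters {
    std::atomic<std::int64_t> a{0};
    std::atomic<std::int64_t> b{0};
};

// Each counter starts on its own cache line, so two writers do not
// invalidate each other's line.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<std::int64_t> a{0};
    alignas(kCacheLine) std::atomic<std::int64_t> b{0};
};

static_assert(alignof(PaddedCounters) == kCacheLine, "struct is line-aligned");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine, "one line per counter");

// Hypothetical helper: two threads, each incrementing only its own counter.
template <typename Counters>
void hammer(Counters& c, std::int64_t iterations) {
    std::thread t1([&c, iterations] {
        for (std::int64_t i = 0; i < iterations; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&c, iterations] {
        for (std::int64_t i = 0; i < iterations; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
}
```

Timing `hammer` on a `PackedCounters` and on a `PaddedCounters` instance with `chrono`, as in the excerpt, would be one way to compare the two layouts; build with thread support enabled (for example `-pthread` on GCC).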


I have two arrays: y_train, which is a 1D array, and x_train, which is a 2D array. I need to dynamically allocate both arrays using posix_memalign. I did that correctly for y_train, converting int y_train[4344] into the following code: int* Y_train; posix_memalign((void**)(&Y_train), 64, sizeof(int) * 4344); Now, I want to convert int x_train[4344][20]; in ..
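
One possible way to handle the 2D case is to allocate a single contiguous block and view it through a pointer to rows, so that `x[i][j]` indexing still works. This is a sketch, not the accepted answer: the names `kRows`, `kCols`, `Row`, and `alloc_x_train` are hypothetical, and the dimensions are taken from the excerpt.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX)

constexpr std::size_t kRows = 4344;
constexpr std::size_t kCols = 20;

// One row of the 2D array; a Row* can be indexed as x[i][j].
using Row = int[kCols];

// Hypothetical helper: allocates int[kRows][kCols] as one contiguous block
// whose base address is 64-byte aligned. Returns nullptr on failure.
// The caller releases the block with free().
Row* alloc_x_train() {
    void* raw = nullptr;
    if (posix_memalign(&raw, 64, sizeof(int) * kRows * kCols) != 0) {
        return nullptr;
    }
    return static_cast<Row*>(raw);
}
```

Only the base address is aligned here. With 4-byte `int`, each row is 80 bytes, so individual rows do not all start on a 64-byte boundary; padding the row length would be needed if per-row alignment matters.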


I need a portable way to determine the alignment requirements of a structure, where portability includes legacy versions of GCC. Parts of the project are stuck on embedded platforms that support only pre-C++11 standards, with GCC as early as v.3.6. There is a non-ISO __alignof__ (a macro? a function?) analog of the C++11 operator alignof which I can use, but ..
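
GCC documents `__alignof__` as a keyword whose syntax mirrors `sizeof`, so it is neither a macro nor a function. A minimal sketch of a wrapper that selects between the standard operator and the extension follows; the macro name `MY_ALIGNOF` and the `AlignProbe` fallback are hypothetical, and the fallback relies on `offsetof`, which pre-C++11 standards only define for POD types.

```cpp
#include <cstddef>  // offsetof, std::size_t

#if __cplusplus >= 201103L
   // C++11 and later: use the standard operator.
#  define MY_ALIGNOF(T) alignof(T)
#elif defined(__GNUC__)
   // Older GCC: use the compiler extension keyword.
#  define MY_ALIGNOF(T) __alignof__(T)
#else
   // Fallback: measure where T lands when placed after a single char.
   // Only well-defined for POD types in pre-C++11 code.
   template <typename T>
   struct AlignProbe {
       char c;
       T value;
   };
#  define MY_ALIGNOF(T) offsetof(AlignProbe<T>, value)
#endif
```

A call such as `MY_ALIGNOF(MyStruct)` then yields a constant usable wherever `sizeof(MyStruct)` would be.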


I may be misunderstanding how cache fetches work, but I’m curious if there are any compiler optimizations for aligning small functions that are not inlined. If the cache-line-size is 64 bits on a given machine, would it make sense to have function pointers to functions that are smaller than 64 bits be aligned within a ..
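
As a concrete reference point, GCC and Clang accept an `aligned` attribute on function definitions, and GCC also offers the `-falign-functions` option for whole translation units. The sketch below assumes one of those compilers and a 64-byte boundary; the function name `tiny_add` is hypothetical, and whether this helps performance is not something the sketch demonstrates.

```cpp
#include <cstdint>

// Assumption: GCC or Clang. Requests that the function's first instruction
// be placed on a 64-byte boundary.
__attribute__((aligned(64), noinline))
int tiny_add(int x) {
    return x + 1;
}

// Reports whether a function's entry address falls on the given boundary.
bool entry_is_aligned(int (*fn)(int), std::uintptr_t boundary) {
    return reinterpret_cast<std::uintptr_t>(fn) % boundary == 0;
}
```

Calling `entry_is_aligned(&tiny_add, 64)` is a quick way to confirm that the toolchain honored the request.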
