I’m writing a C++ program that solves PDEs and algebraic equations on networks.
The Eigen library shoulders the biggest part of the work by solving many sparse linear systems with LU decomposition.
Since performance is always nice, I experimented with compiler options. I’m using
g++ -O3 -DNDEBUG -flto -fno-fat-lto-objects -std=c++17
as performance-related compiler options. I then added the
-march=native option and found that
execution time increased by approximately 6% on average (measured with GNU time, about 10 runs per configuration; there was almost no variance in either configuration).
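For reference, a comparison like this can also be made in-process, which excludes process start-up and I/O from the measurement. The following is only a minimal sketch under assumptions: `TimingStats`, `summarize`, and `time_runs` are hypothetical helper names, not part of Eigen or of the program described here, and the callable passed to `time_runs` stands in for whatever solver call is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary statistics over repeated timings, in milliseconds.
struct TimingStats {
    double mean_ms;
    double stddev_ms;  // sample standard deviation (n - 1 in the denominator)
};

// Hypothetical helper: mean and sample standard deviation of the samples.
inline TimingStats summarize(const std::vector<double>& samples_ms) {
    const std::size_t n = samples_ms.size();
    if (n == 0) return {0.0, 0.0};
    double sum = 0.0;
    for (double s : samples_ms) sum += s;
    const double mean = sum / static_cast<double>(n);
    if (n < 2) return {mean, 0.0};
    double sq = 0.0;
    for (double s : samples_ms) sq += (s - mean) * (s - mean);
    return {mean, std::sqrt(sq / static_cast<double>(n - 1))};
}

// Hypothetical helper: call `work` `runs` times and time each call
// with a monotonic clock, then summarize the per-call durations.
template <typename Work>
TimingStats time_runs(Work&& work, int runs) {
    assert(runs >= 0 && "runs must be non-negative");
    std::vector<double> samples_ms;
    samples_ms.reserve(static_cast<std::size_t>(runs > 0 ? runs : 0));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();  // placeholder for the code under test
        const auto stop = std::chrono::steady_clock::now();
        samples_ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return summarize(samples_ms);
}
```

Building the same source once with and once without -march=native and comparing the two `mean_ms` values against `stddev_ms` would show whether a difference of about 6% is outside the run-to-run noise. Note that `assert` is compiled out under -DNDEBUG.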
What are possible (or, preferably, likely) reasons for such an observation?
I guess the output of lscpu might be useful here, so here it is:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       39 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               78
Model name:          Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
Stepping:            3
CPU MHz:             800.026
CPU max MHz:         3100.0000
CPU min MHz:         400.0000
BogoMIPS:            5199.98
Virtualization:      VT-x
L1d cache:           64 KiB
L1i cache:           64 KiB
L2 cache:            512 KiB
L3 cache:            4 MiB