OpenMP runs slower than single-threaded code

  c++, multithreading, openmp

I am trying to learn OpenMP, but my code runs slower with OpenMP than without it. There have been various posts about this, but none seems to apply to my issue.
I have created a simple program that illustrates the point using ‘omp parallel for’. When I run it, I get the following timings:

   No OMP:       0.0109663 sec
   Parallel for: 0.0076869 sec (1 thread)
   Parallel for: 0.0151231 sec (2 threads)
   Parallel for: 0.0169528 sec (4 threads)
   Parallel for: 0.0150955 sec (8 threads)

With 2 to 8 threads the loop takes roughly twice as long as the single-threaded parallel-for run, and about 1.4 to 1.5 times as long as the version without OpenMP. Clearly this is not what I expected.

I am using Visual Studio Express 2015, and it doesn’t matter whether the optimizer is on or off. I have set /openmp on the compiler command line, and I believe I have set the shared and private clauses correctly.
I am initializing an array with 1,000,000 entries, so the initial overhead of setting up the parallel threads should not be the issue. I have an Intel i7 with 8 cores.
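One way to check whether one-time startup cost really is negligible is to time the same fill several times in a row and compare the first run with the later ones. The following is only a minimal sketch under my own assumptions: `timed_fill` and `repeated_timings` are hypothetical helpers that are not part of the program below, `std::chrono` is swapped in for `omp_get_wtime()`, and a `std::vector` (which is zero-initialized when created) stands in for the `new double[n]` buffer.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper (not from the original program): fills the buffer with
// the same values as the test loop and returns the elapsed time in seconds.
double timed_fill(std::vector<double>& a, int threads)
{
    assert(threads > 0);
    const int len = static_cast<int>(a.size());
    const auto start = std::chrono::steady_clock::now();
    // Without /openmp (or -fopenmp) this pragma is ignored and the loop runs serially.
#pragma omp parallel for num_threads(threads)
    for (int la = 0; la < len; ++la)
        a[la] = la * 2 + 1;
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Runs the same fill `reps` times on one buffer, so the first run can be
// compared with the runs that follow it.
std::vector<double> repeated_timings(int len, int threads, int reps)
{
    std::vector<double> a(len);
    std::vector<double> seconds;
    for (int r = 0; r < reps; ++r)
        seconds.push_back(timed_fill(a, threads));
    return seconds;
}
```

Calling `repeated_timings(1000 * 1000, 4, 5)` would give five timings for four threads. If only the first one is slow, the cost is a one-time effect rather than something paid per loop.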

Code: I have two functions, testparallelfor() and testnoomp(). The function names should be self-explanatory.
The statement ++th[omp_get_thread_num()]; just counts how many loop iterations each thread gets. The result is the same even if I comment that statement out.
I have also tried using a static array, double a[1000*1000], to see whether the issue is the dynamic heap allocation of a.

#include <iostream>
#include <omp.h>

using namespace std;

static int th[8];

void reset_th()
{
    int i;
    for (i = 0; i < 8; ++i)
        th[i] = -1;
}

void out_th()
{
    int i;
    cout << "Threads ";
    for (i = 0; i < 8; ++i)
        cout << i << ":" << th[i] + 1 << ", ";
    cout << endl;
}

void testparallelfor(int len, int no)
    {
    const int n = 1000 * 1000;
    double tw;
    double *a = new double[n];

    reset_th();
    tw = omp_get_wtime();
#pragma omp parallel shared(a, len, th) num_threads(no) if (len > 1000)
    {
#pragma omp for 
        for (int la = 0; la < len; ++la)
        {
            ++th[omp_get_thread_num()];
            a[la] = la * 2 + 1; 
        }
    }

    tw = omp_get_wtime() - tw;
    cout << "Parallel for " << tw << "sec" << endl;
    out_th();
    delete[] a;   // release the buffer allocated above
    }

void testnoomp(int len)
{
    int n = 1000 * 1000;
    double tw;
    double *a = new double[n];

    reset_th();
    tw = omp_get_wtime();
    for (int la = 0; la < len; ++la)
        {
        ++th[omp_get_thread_num()];
        a[la] = la * 2 + 1; 
        }

    tw = omp_get_wtime() - tw;
    cout << "No OMP " << tw << "sec" << endl;
    out_th();
    delete[] a;   // release the buffer allocated above
}

int main()
    {
    int n = 1000*1000;

    testnoomp(n);               // no OpenMP 
    for(int i=1; i<=8; i*=2)
        testparallelfor(n, i);   // i is the number of threads to be used

    cout << endl;
    return 0;
    }
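For comparison, the per-thread iteration count could also be collected without every thread writing to the shared th[] array on each iteration. This is a minimal sketch, not what the program above does: `count_iterations` is a hypothetical helper, and it assumes that a thread-private counter written back once per thread is an acceptable substitute for the shared counter.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: returns how many loop iterations each thread executed.
std::vector<int> count_iterations(int len, int threads)
{
    assert(threads > 0);
    std::vector<int> counts(threads, 0);
#pragma omp parallel num_threads(threads)
    {
        int local = 0;   // declared inside the region, so private to each thread
#pragma omp for
        for (int la = 0; la < len; ++la)
            ++local;
#ifdef _OPENMP
        const int id = omp_get_thread_num();
#else
        const int id = 0;   // built without OpenMP: a single thread does all the work
#endif
        counts[id] = local;   // a single shared write per thread
    }
    return counts;
}
```

The counts always add up to len, whatever number of threads the runtime actually provides.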

Any help or insight would be appreciated.

Source: Windows Questions C++
