Why is the std::function call so slow? Is such an implementation viable given that performance is important? [closed]

  c++, functional-programming, performance

I found that my application is very slow.
Here is an example of my implementation.
What exactly the functions of MyClass compute does not matter; they exist only for testing.

#include <chrono>
#include <cmath>
#include <functional>
#include <iostream>
#include <vector>

class MyFunctor
{
public:
    //MyFunctor(double (*func)()) : _func(func) {}; // can't do it
    MyFunctor(std::function<double(void)> func): _func(func) {};
    ~MyFunctor() {};

    double operator()() { return _func(); };

private:
    //double (*_func)();
    std::function<double(void)> _func;
    /*rest implementation*/
};


class MyClass1
{
    double _b{};
    double _c{};
    double _a{};

    std::vector<MyFunctor> _functors{};

    double func() { return (std::sin(_a) + std::cos(_b)); };
    double anotherFunc() { return _a * _a + _b * _b; };

public:
    MyClass1(double a) : _a(a), _b(a), _c(a)
    {
        _functors.push_back(MyFunctor{ [this]() {return func(); } });
        _functors.push_back(MyFunctor{ [this]() {return func() * func(); } });
        _functors.push_back(MyFunctor{ [this]() {return _a*anotherFunc(); } });
        _functors.push_back(MyFunctor{ [this]() {return _a + _b + _c; } });
    }

    std::vector<MyFunctor> getFunctors() { return _functors; }

    // FOR PERFORMANCE COMPARISON (do the same as functors)
    double F1() { return func(); };
    double F2() { return func() * func(); };
    double F3() { return _a*anotherFunc(); };
    double F4() { return _a + _b + _c; };

    ~MyClass1() {}
};

/*
 * .. i have many classes MyClass2, MyClass3, ... with getFunctors() function
 */

int main()
{
    using std::chrono::duration_cast;
    using clock_t = std::chrono::high_resolution_clock;
    using millisecond_t = std::chrono::duration<double, std::milli>;
    std::chrono::time_point<clock_t> tic;
    double elapsed;

    constexpr int N = 1000'000;

    MyClass1 C{ 1 };
    MyClass1 C2{ 2 };
    //MyClass2 D{ 3 }; // etc..

    auto functors = C.getFunctors(); // can also contain functors of other objects


    /* TEST FUNCTORS */
    double sum{};
    sum = 0.0;
    tic = clock_t::now();
    for (size_t i = 0; i < N; i++)
    {
        sum = 0;

        auto iter = functors.begin();
        while (iter != functors.end())
        {
            sum += (*iter)();
            ++iter;
        }
    }
    elapsed = duration_cast<millisecond_t>(clock_t::now() - tic).count();
    std::cout << "\t(functors) elapsed: " << elapsed << " millisec\t";
    std::cout << sum << "\n";


    /* TEST MEMBERS */
    sum = 0.0;
    tic = clock_t::now();
    for (size_t i = 0; i < N; i++)
    {
        sum = C.F1() + C.F2() + C.F3() + C.F4();
    }
    elapsed = duration_cast<millisecond_t>(clock_t::now() - tic).count();
    std::cout << "\t(members) elapsed: " << elapsed << " millisec\t";
    std::cout << sum << "\n";

    return 0;
}

Output:

(functors) elapsed: 117.828 millisec    8.29107 
(members) elapsed: 1.4552 millisec      8.29107 

In this case, calling functors is ~100 times slower than calling similar class members.
Is there another faster way to get a vector of functors? Thank you

EDIT:

I changed the code so that the functors are called without creating temporary objects, but this did not solve the problem.

When compiled with -O3, the results look something like this:

(functors) elapsed: 104.97 millisec    8.29107 
(members) elapsed: 0.0008 millisec      8.29107 

MSVC and GCC give similar results.
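A member-loop time of 0.0008 ms for 1,000,000 iterations is well under a nanosecond per iteration, which suggests the optimizer removed most of that loop (its body is loop-invariant and `sum` is overwritten on every pass), so the two timings may not be comparable. A minimal sketch of a timing helper that at least keeps the loop itself alive is shown here. The name `timeLoop` and the use of `std::chrono::steady_clock` are assumptions for illustration, not part of the original code.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: call `f` n times, accumulate the results, and
// return the elapsed time in milliseconds. The accumulated sum is
// written to `total` so the caller can print or check it.
template <class F>
double timeLoop(std::size_t n, F&& f, double& total)
{
    using clock_type = std::chrono::steady_clock;

    // volatile forces every load and store of `sink` to be performed,
    // so the loop cannot be deleted outright. The compiler may still
    // hoist f() out of the loop if it can prove the call is invariant.
    volatile double sink = 0.0;

    const auto tic = clock_type::now();
    for (std::size_t i = 0; i < n; ++i)
        sink = sink + f();
    const auto toc = clock_type::now();

    total = sink;
    return std::chrono::duration<double, std::milli>(toc - tic).count();
}
```

Both variants could then be timed through the same helper, for example `timeLoop(N, [&] { return C.F1() + C.F2() + C.F3() + C.F4(); }, sum)` for the members and a lambda that iterates over `functors` for the other case.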

In my application, each functor has multiple flags, and I need to get a vector of doubles holding the values of those functors, selected by the values of the flags. Let’s say each functor has an active flag, and I need to get a vector of doubles for all functors for which active = true or active = false.
To reformulate the question: is such an implementation viable given that performance is important?
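To make the requirement concrete, here is a minimal sketch of what is meant. `FlaggedFunctor` and `collect` are hypothetical names used only for illustration, and only a single `active` flag is modeled; the real class has several flags and more members.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for MyFunctor, reduced to the callable and one flag.
struct FlaggedFunctor
{
    std::function<double()> func;
    bool active{ true };

    double operator()() const { return func(); }
};

// Evaluate every functor whose `active` flag equals `wanted` and
// return the results in their original order.
std::vector<double> collect(const std::vector<FlaggedFunctor>& functors, bool wanted)
{
    std::vector<double> values;
    values.reserve(functors.size()); // upper bound on the result size
    for (const auto& f : functors)
    {
        if (f.active == wanted)
            values.push_back(f());
    }
    return values;
}
```

Calling `collect(functors, true)` would return the values of the active functors, and `collect(functors, false)` those of the inactive ones. Each selected functor is still invoked through `std::function`, so this sketch has the same per-call cost as the benchmark above.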

Source: Windows Questions C++
