Weird crashes in C++ Project on thread.join()

  c++, multithreading

I have written a chess engine which plays at a high level. Unfortunately, it is known for crashing. A few people are trying to figure out the reason for those crashes. The code can be found here. We tracked all the crashes we encountered down to one file, uci.cpp, which deals with all input from and output to the user.

The job of uci.cpp/h is to implement the Universal Chess Interface (UCI) protocol. For that purpose we have a global Board object which represents a position. To be able to receive a stop command while the engine searches a position, we moved the search into its own thread. We use searchThread.joinable() to check whether a search is still running, and we call searchThread.join() before we start a new search to make sure we never have multiple searches running.
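
To illustrate the setup described above without the repository at hand, here is a minimal sketch of the pattern. It is not our actual code: only searchThread, search_stop and uci_stop are real names, while stopRequested, search_loop and uci_go are hypothetical stand-ins.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stop flag; the real engine signals the search differently.
std::atomic<bool> stopRequested{false};
std::thread       searchThread;

// Placeholder for the real search: it only polls the stop flag.
void search_loop() {
    while (!stopRequested.load()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

void search_stop() { stopRequested.store(true); }

void uci_stop() {
    search_stop();
    if (searchThread.joinable()) {
        searchThread.join();
    }
}

// Hypothetical "go" handler: waits for any previous search, then starts a new one.
void uci_go() {
    uci_stop();
    stopRequested.store(false);
    searchThread = std::thread(search_loop);
}
```

In this sketch both uci_go() and uci_stop() are assumed to be called only from the thread that reads the UCI input.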

Someone has sent us the list of crashes he was able to provoke on his machine:

1a) general protection fault on libstdc++.so.6.0.28, thread::join crash
1b) segfault on libpthread-2.31.so (replace 2.31 by your libc version), also thread::join crash
2a) trap stack segment on the binary itself, delete board crash
2b) segfault on the binary itself, delete board crash

Error type 1a/1b: thread.join()

The first two crash types are both related to calling searchThread.join(), although we check whether it is joinable.

void uci_stop() {
    search_stop();
    if (searchThread.joinable()) {
        searchThread.join();
    }
}

What could cause .join() to fail?
We analysed our code a while ago and were not able to find any memory leaks. As far as we know, all data that is allocated within the searchThread is also deleted.


Error type 2a/2b: delete board;

We use a global pointer for the board, and in very rare instances deleting that pointer fails.
I found a potential solution: allocating the board object statically as a global, which means I wouldn't have to use new and delete anymore.

// old:
Board*      board;

// new:
Board       board{""};

We think that this solves the problem, and it is not the main part of this question, although we are still curious why delete board could fail in some instances.

Crashes have occurred in only roughly 2% of all games. With a game taking about 60 moves, thread.join() fails about 1 in 1000 times.

Source: Windows Questions C++
