Boost:ASIO acceptor stacks sockets in CLOSE_WAIT state after hours/days

  boost-asio, c++, tcp

I have a server application serving thousands of clients for basic TCP operations. It accepts both plain text and TLS connections.

Some of the most sollicitated servers are facing a strange issue since few months.
After few hours or days, there are many sockets blocked in CLOSE_WAIT state till server stop serving requests due to the lack of file descriptors.

Sockets graphs here :
https://ibb.co/WsvS1D9

We tried to add logging so we can see that sockets are blocked on the first read operation (for plain text sockets) or on the handshake (for TLS) sockets.
When this happens, the async_read_until and async_handshake function never finish and the associated deadline_timer callback is never called. It’s like the async method is blocking the thread execution!

Connection starts from acceptor

Log::writeLine("("+ StringUtils::printPointer(static_cast<void *>(this)) + ") [FD:"+
                       StringUtils::toString(socket().native_handle())+"] "+" Connection opened", Log::LEVEL_INFO);

/// Set a deadline for the read operation.
inout_deadline_.expires_from_now(boost::posix_time::seconds(TIMEOUT));

/// initiate calls to timer check
inout_deadline_.async_wait(
        strand_.wrap(boost::bind(&TcpServerConnection::check_deadline,
                                 shared_from_this(), boost::asio::placeholders::error, &inout_deadline_)));

Log::writeLine("("+ StringUtils::printPointer(static_cast<void *>(this)) + ") [FD:"+
               StringUtils::toString(socket().native_handle())+"] "+" async_wait initialized", Log::LEVEL_INFO);

if (useTLS_) {
    /// Start TLS handshake
    Log::writeLine("("+ StringUtils::printPointer(static_cast<void *>(this)) + ") [FD:"+
                   StringUtils::toString(socket().native_handle())+"] "+" async_handshake initialized", Log::LEVEL_INFO);

    socket_.async_handshake(boost::asio::ssl::stream_base::server,
                            strand_.wrap(
                                    boost::bind(&TcpServerConnection::handle_handshake, shared_from_this(),
                                                boost::asio::placeholders::error)));
} else {
    startReadLine();
}

In startReadLine() :

inout_deadline_.expires_from_now(boost::posix_time::seconds(TIMEOUT));

Log::writeLine("("+ StringUtils::printPointer(static_cast<void *>(this)) + ") [FD:"+
               StringUtils::toString(socket().native_handle())+"] "+" async_read_until initialized", Log::LEVEL_INFO);
if(useTLS_) {
    boost::asio::async_read_until(socket_, lineBuffer_, lineDelimiter_,
      strand_.wrap(
      boost::bind(&TcpServerConnection::handle_read_line, shared_from_this(),
              boost::asio::placeholders::error,
              boost::asio::placeholders::bytes_transferred,
              false)));
} else {
    boost::asio::async_read_until(socket_.next_layer(), lineBuffer_, lineDelimiter_,
      strand_.wrap(
      boost::bind(&TcpServerConnection::handle_read_line, shared_from_this(),
              boost::asio::placeholders::error,
              boost::asio::placeholders::bytes_transferred,
              false)));
}

Below are 2 examples of log we get for sockets in CLOSE_WAIT state :

PLAIN SOCKET

Jan  4 18:30:28 INFO: (0x7efcf84865b0) [FD:831] Incoming connection: SOCKET RBUF=262142 SBUF=262142
Jan  4 18:30:28 INFO: (0x7efcf84865b0) [FD:831]  Connection opened
Jan  4 18:30:28 INFO: (0x7efcf84865b0) [FD:831]  async_wait initialized
Jan  4 18:30:28 INFO: (0x7efcf84865b0) [FD:831]  startReadLine() called!
Jan  4 18:30:28 INFO: (0x7efcf84865b0) [FD:831]  async_read_until initialized

handle_read_line() and check_deadline() are never called

TLS SOCKET

Jan  4 17:33:06 INFO: (0x7efcfc14a7f0) [FD:513] Incoming connection: SOCKET RBUF=262142 SBUF=262142
Jan  4 17:33:06 INFO: (0x7efcfc14a7f0) [FD:513]  Connection opened
Jan  4 17:33:06 INFO: (0x7efcfc14a7f0) [FD:513]  async_wait initialized
Jan  4 17:33:06 INFO: (0x7efcfc14a7f0) [FD:513]  async_handshake initialized

handle_handshake() and check_deadline() are never called

We tried to update to last version of Boost (1.75) and OpenSSL (1.1.1i), nothing changed.

All help will be much appreciated 🙂

Source: Windows Questions C++

LEAVE A COMMENT