How to use Apache Arrow to write files in Parquet format on Windows using C++

  apache-arrow-cpp, c++, cmake, parquet, windows

I’m trying to write Parquet files on Windows using C++.
I followed the instructions I found here and chose the "Using conda-forge for build dependencies" and "Building using Visual Studio (MSVC) Solution Files" approaches.

In contrast to the article on the page mentioned before, my calls to cmake look like this:

cmake .. -G "Visual Studio 16 2019" -A x64 ^
        -DARROW_BUILD_TESTS=OFF ^
        -DARROW_PARQUET=ON ^
        -DARROW_BUILD_SHARED=OFF ^
        -DARROW_BUILD_STATIC=ON ^
        -DARROW_DEPENDENCY_SOURCE=AUTO
cmake --build . --config Release

I want to use Parquet/Arrow as a static library, so I set -DARROW_BUILD_SHARED=OFF and -DARROW_BUILD_STATIC=ON.
In the "build" folder created after running cmake, I built the Parquet and Arrow INSTALL projects (build\src\parquet\INSTALL.vcxproj, build\src\arrow\INSTALL.vcxproj) with Visual Studio 2019. As a result, a folder structure was created under C:\Program Files\arrow\lib, including arrow_static.lib and parquet_static.lib.
All related header files can be found under C:\Program Files\arrow\include\arrow and C:\Program Files\arrow\include\parquet.

I then set up a new C++ project (Release Build | x64) in Visual Studio, referencing the previously built static libs and include directories.
In the project settings under "C/C++->Preprocessor->Preprocessor Definitions" I added PARQUET_STATIC and ARROW_STATIC.
For a first test, I use the "reader_writer" example from the Apache Arrow GitHub repo.

Now, if I build the reader_writer example, I receive multiple linker errors (LNK2001) like the following:

Error   LNK2001 unresolved external symbol "public: virtual char const * __cdecl apache::thrift::transport::TTransportException::what(void)const " (?what@TTransportException@transport@thrift@apache@@UEBAPEBDXZ)  ReaderWriterDemo    D:\..\ReaderWriterDemo\parquet_static.lib(column_writer.obj)

All of the errors refer to "thrift" at some point.
I was under the impression that when using "conda-forge for build dependencies" (see first link above), all required dependencies would be available and somehow integrated into the solutions built by cmake.

When running cmake .. -G "Visual Studio 16 2019" -A x64, I can see the following in the log output, though:

-- Checking for module 'thrift'
-- Found thrift, version 0.15.0
-- Found Thrift: D:/Programs/Miniconda3/envs/arrow-dev/Library/lib/thriftmd.lib (found suitable version "0.15.0", minimum required is "0.11.0")
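
One possible reading of this log is that Thrift is taken from the conda environment as a separate library rather than compiled into parquet_static.lib. A static library does not carry its own dependencies, so the final executable would have to link Thrift itself. Below is a minimal sketch of how that could be expressed in the test project's source, assuming MSVC, assuming thriftmd.lib from the log is the library Parquet was built against, and assuming the library directories are already on the linker search path. It is an illustration of the idea, not a verified fix.

```cpp
// Sketch only; MSVC-specific. Library names are taken from the post and the
// CMake log. The directories containing them are assumed to be listed under
// "Linker -> General -> Additional Library Directories".

// Equivalent to the "Preprocessor Definitions" entries in the project
// settings; these must be visible before any Arrow or Parquet header.
#ifndef ARROW_STATIC
#define ARROW_STATIC
#endif
#ifndef PARQUET_STATIC
#define PARQUET_STATIC
#endif

#ifdef _MSC_VER
// Ask the MSVC linker to pull in the static libraries by name.
#pragma comment(lib, "parquet_static.lib")
#pragma comment(lib, "arrow_static.lib")
// Assumption: the Thrift library reported by CMake is the one that
// parquet_static.lib expects at link time.
#pragma comment(lib, "thriftmd.lib")
#endif

// Compile-time check that both macros are defined at this point.
#if defined(ARROW_STATIC) && defined(PARQUET_STATIC)
constexpr bool kStaticMacrosDefined = true;
#else
constexpr bool kStaticMacrosDefined = false;
#endif
static_assert(kStaticMacrosDefined,
              "ARROW_STATIC and PARQUET_STATIC must both be defined");
```

The same effect can be had by listing the libraries under "Linker -> Input -> Additional Dependencies". Other libraries from the conda environment may turn out to be needed as well; the unresolved symbols reported by the linker would indicate which ones.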

My attempts to build the thrift library separately using the resources found here weren't successful; the information regarding Windows seems to be incomplete or outdated.

I assume that it should be possible to set up the Arrow/Parquet libs using only resources from https://github.com/apache/arrow/.

Perhaps some of you have already gone through the same process and can give me a hint as to what I may have missed.

Source: Windows Questions
