Opening a parquet directory

  apache-arrow, c++, parquet

I’m trying to open a Hive-partitioned Parquet dataset, which is essentially a nested directory tree with many small Parquet fragments at the bottom level. In Python, I can just call read_table on the top-level directory (its name ends in .parquet), and everything is handled automatically. For a single file in C++, I can read from a std::shared_ptr<arrow::io::ReadableFile> instance, but of course that does not work on a directory.

So, how can I read a directory-based Parquet dataset in C++?

Source: Windows Questions C++
