How to Retrieve a Scalar Value from a Compute Function in Apache Arrow

  apache-arrow, c++

I am looping over the elements of an Arrow Array and trying to apply a compute function to each scalar that will tell me the year, month, day, etc. of each element. The code looks something like this:

arrow::NumericArray<arrow::Date32Type> array = {...};
for (int64_t i = 0; i < array.length(); i++) {
  // GetScalar boxes element i as an arrow::Scalar
  arrow::Result<std::shared_ptr<arrow::Scalar>> result = array.GetScalar(i);
  if (!result.ok()) {
    // TODO: handle error
  }

  // The shared_ptr<Scalar> converts implicitly to an arrow::Datum argument
  arrow::Result<arrow::Datum> year = arrow::compute::Year(*result);
}

However, I am not clear on how to extract the actual int64_t value from the result of the arrow::compute::Year call. I have tried things like:

const std::shared_ptr<int64_t> val = year.ValueOrDie();

>>> error: conversion from 'arrow::Datum' to non-scalar type 'const std::shared_ptr<long int>' requested

I’ve also tried assigning the result directly to an int64_t, which fails with: error: cannot convert 'arrow::Datum' to 'int64_t'

I didn’t see any method of the Datum class that would otherwise return a scalar value in the primitive type that I think arrow::compute::Year should be returning. Any idea what I might be misunderstanding with the Datum / Scalar / Compute APIs?

Source: Windows Questions C++
