I spent all of yesterday reading about how to use managed (unified) memory arrays in a CUDA program (using the book Professional CUDA Programming) and practiced some of the sample code (although I still have doubts about the profiler output). I am now ready to apply it to my program, which uses both a CUDA kernel and some OpenCV functions.
I have several questions, but let me address here the first one.
```cpp
cv::Mat h_image;
h_image = cv::imread(dirname + image_filenames[ni], cv::IMREAD_GRAYSCALE);

cv::cuda::GpuMat d_image;
// 2. Upload the Image
d_image.upload(h_image);
```
So I have an image read with `imread`, and I upload it to device memory.
How can I use unified memory for this?
In theory, to use unified memory I can write (with float arrays)

```cpp
float *A;
cudaMallocManaged((void **)&A, nBytes);
```
or even (and I prefer this)
```cpp
__device__ __managed__ float A;
```
Is there a way to do something similar with Mats and GpuMats?
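For context, here is a sketch of the direction I am considering. It relies on my understanding (untested) that both the `cv::Mat` and `cv::cuda::GpuMat` constructors accept an external data pointer without copying, so both could be wrapped around the same `cudaMallocManaged` buffer; the sizes and type here are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <cuda_runtime.h>

int main() {
    const int rows = 480, cols = 640;  // placeholder dimensions
    unsigned char *data = nullptr;

    // One managed buffer, visible to both host and device.
    cudaMallocManaged(&data, rows * cols * sizeof(unsigned char));

    // Wrap the same pointer in a host Mat and a device GpuMat.
    // Neither constructor copies; both just reference the buffer.
    cv::Mat h_image(rows, cols, CV_8UC1, data);
    cv::cuda::GpuMat d_image(rows, cols, CV_8UC1, data);

    // ... use h_image with CPU-side OpenCV functions and d_image
    // with CUDA kernels, synchronizing between the two ...

    cudaDeviceSynchronize();
    cudaFree(data);  // the Mats do not own the buffer
    return 0;
}
```

One complication I see: `imread` allocates its own host buffer, so the pixel data would still have to be copied into the managed buffer once after reading, and the `GpuMat` step/pitch would have to match the `Mat` layout. I don't know if this is the intended way to combine unified memory with OpenCV, hence the question.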