Category : cuda

Say I have an array of numbers on the device (CUDA), something like float *d_x; cudaMalloc(&d_x, N*sizeof(float)); where d_x will hold something like [0,0,3,0,3,0,3,1,5,1,0]. I am performing two operations on the array. The details are unimportant, but the first operation acts as a sort of preprocessing, permuting the values of d_x and returning an index, ..


I have a vector of MyElement, which is defined as follows: struct MyElement { int count; int prefixSum; }; I would like to perform an in-place exclusive_scan over count, writing the result to prefixSum and leaving count unchanged. Is that possible using Thrust? As an example, for the following input (prefixSum is initialized with zeros): ..


#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <iostream>
using std::cout;

__global__ void average1(int n, float3* a, float* output)
{
    const float third = double(1) / double(3);
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int gridSize = blockDim.x * gridDim.x;
    while (i < n) {
        output[i] = (a[i].x + a[i].y + a[i].z) * third;
        i += gridSize;
        ..
