Error: identifier "blockIdx" is undefined
My setup for CUDA
Visual Studio 2010 and 2008 SP1 (required by CUDA), Parallel Nsight 1.51, CUDA 4.0 RC (or 3.2), and Thrust.
Basically, I followed the guide at: http://www.ademiller.com/blogs/tech/2011/03/using-cuda-and-thrust-with-visual-studio-2010/
I then proceeded to compile successfully without error messages.
I then tried more CUDA code examples from the web, and these errors surfaced in Visual Studio. The code still compiles successfully without error messages, but the errors are highlighted in the editor:
- "Error:identifer "blockIdx" is undfined."
- "Error:identifer "blockDim" is undfined."
- "Error:identifer "threadIdx" is undfined."
Here's the screenshot.
http://i.imgur.com/RVBfW.png
Should I be concerned? Is it a Visual Studio bug, or is my setup configuration wrong? Any help is appreciated. Thanks, guys!
P.S. I'm very new to both Visual Studio and CUDA.
// incrementArray.cu
#include "Hello.h"
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <cuda.h>

void incrementArrayOnHost(float *a, int N)
{
    int i;
    for (i = 0; i < N; i++) a[i] = a[i] + 1.f;
}

__global__ void incrementArrayOnDevice(float *a, int N)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    if (idx < N) a[idx] = a[idx] + 1.f;
}

int main(void)
{
    float *a_h, *b_h;   // pointers to host memory
    float *a_d;         // pointer to device memory
    int i, N = 10;
    size_t size = N*sizeof(float);

    // allocate arrays on host
    a_h = (float *)malloc(size);
    b_h = (float *)malloc(size);

    // allocate array on device
    cudaMalloc((void **) &a_d, size);

    // initialization of host data
    for (i = 0; i < N; i++) a_h[i] = (float)i;

    // copy data from host to device
    cudaMemcpy(a_d, a_h, sizeof(float)*N, cudaMemcpyHostToDevice);

    // do calculation on host
    incrementArrayOnHost(a_h, N);

    // do calculation on device:
    // Part 1 of 2. Compute execution configuration
    int blockSize = 4;
    int nBlocks = N/blockSize + (N%blockSize == 0 ? 0 : 1);

    // Part 2 of 2. Call incrementArrayOnDevice kernel
    incrementArrayOnDevice <<< nBlocks, blockSize >>> (a_d, N);

    // Retrieve result from device and store in b_h
    cudaMemcpy(b_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);

    // check results
    for (i = 0; i < N; i++) assert(a_h[i] == b_h[i]);

    // cleanup
    free(a_h); free(b_h); cudaFree(a_d);
    return 0;
}
It's just a keyword IntelliSense problem in Visual Studio itself. The code builds successfully because VS hands the actual compilation off to NVCC, which does recognize these keywords. Under VS2010 you can simply add the following include to fix the highlighting:
#include "device_launch_parameters.h"
The code compiles correctly; it is Visual Studio's IntelliSense that is trying to parse the code and catch errors on its own.
The trick I usually use is to have a "hacked" header file which defines all CUDA-specific symbols (threadIdx, __device__, etc.) and then include it in the .cu file like this:
#ifndef __CUDACC__
#include "myhack.h"
#endif
This way, IntelliSense will read in myhack.h and won't complain about CUDA stuff. The real nvcc compiler will recognise the __CUDACC__ macro and won't read the hack file.
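The exact contents of the hack header are up to you; as a rough sketch (the struct name and definitions below are only an assumption, not anything from the real CUDA headers), it can stub out the qualifiers and declare dummy built-in variables:

// myhack.h -- seen only by IntelliSense; nvcc never reads it because of the
// #ifndef __CUDACC__ guard in the .cu file.
#pragma once

// Stub out the CUDA function/variable qualifiers so the host parser accepts them.
#define __global__
#define __device__
#define __host__
#define __shared__
#define __constant__

// Dummy stand-ins for the built-in index/dimension variables.
struct fake_dim3 { unsigned int x, y, z; };
extern fake_dim3 threadIdx;
extern fake_dim3 blockIdx;
extern fake_dim3 blockDim;
extern fake_dim3 gridDim;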
Further to CygnusX1's answer, follow these directions to add CUDA keywords like blockDim to your usertype.dat file for Visual Studio 2010.
That should eliminate Intellisense errors for those keywords.
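usertype.dat is just a plain list of keywords, one per line, so the CUDA-related entries might look something like this (a sketch only; add whichever identifiers you actually use):

__global__
__device__
__host__
__shared__
__constant__
threadIdx
blockIdx
blockDim
gridDim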