Error: identifier "blockIdx" is undefined

My CUDA setup:

Visual Studio 2010 and 2008 SP1 (required by CUDA), Parallel Nsight 1.51, CUDA 4.0 RC or 3.2, and Thrust.

Basically, I followed the guide at: http://www.ademiller.com/blogs/tech/2011/03/using-cuda-and-thrust-with-visual-studio-2010/

I then proceeded to compile successfully, without error messages.

So I tried more CUDA code examples from the web. These errors surfaced in Visual Studio. I can still compile successfully without error messages, but the errors are visually highlighted in the editor:

  • "Error: identifier "blockIdx" is undefined."
  • "Error: identifier "blockDim" is undefined."
  • "Error: identifier "threadIdx" is undefined."

Here's the screenshot.

http://i.imgur.com/RVBfW.png

Should I be concerned? Is it a Visual Studio bug, or is my setup configuration wrong? Any help is appreciated. Thanks, guys!

P.S. I'm very new to both Visual Studio and CUDA.

// incrementArray.cu
#include "Hello.h"
#include <stdio.h>
#include <assert.h>
#include <cuda.h>
void incrementArrayOnHost(float *a, int N)
{
  int i;
  for (i=0; i < N; i++) a[i] = a[i]+1.f;
}
__global__ void incrementArrayOnDevice(float *a, int N)
{
  int idx = blockIdx.x*blockDim.x + threadIdx.x;
  if (idx<N) a[idx] = a[idx]+1.f;
}
int main(void)
{
  float *a_h, *b_h;           // pointers to host memory
  float *a_d;                 // pointer to device memory
  int i, N = 10;
  size_t size = N*sizeof(float);
  // allocate arrays on host
  a_h = (float *)malloc(size);
  b_h = (float *)malloc(size);
  // allocate array on device 
  cudaMalloc((void **) &a_d, size);
  // initialization of host data
  for (i=0; i<N; i++) a_h[i] = (float)i;
  // copy data from host to device
  cudaMemcpy(a_d, a_h, sizeof(float)*N, cudaMemcpyHostToDevice);
  // do calculation on host
  incrementArrayOnHost(a_h, N);
  // do calculation on device:
  // Part 1 of 2. Compute execution configuration
  int blockSize = 4;
  int nBlocks = N/blockSize + (N%blockSize == 0?0:1);
  // Part 2 of 2. Call incrementArrayOnDevice kernel 
  incrementArrayOnDevice <<< nBlocks, blockSize >>> (a_d, N);
  // Retrieve result from device and store in b_h
  cudaMemcpy(b_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);
  // check results
  for (i=0; i<N; i++) assert(a_h[i] == b_h[i]);
  // cleanup
  free(a_h); free(b_h); cudaFree(a_d); 

  return 0;
}

It's just a keyword IntelliSense problem caused by Visual Studio itself. The code can still be built successfully because VS hands the build off to NVCC, which does recognize these keywords. Under VS2010 you can solve the problem by adding the following include:

 #include "device_launch_parameters.h"
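In context, the top of the question's incrementArray.cu would then look something like this (device_launch_parameters.h declares the built-in variables for the host-side parser; nvcc itself does not need it):

```cuda
// incrementArray.cu
#include <stdio.h>
#include <assert.h>
#include <cuda.h>
#include "device_launch_parameters.h"  // lets IntelliSense resolve threadIdx/blockIdx/blockDim

__global__ void incrementArrayOnDevice(float *a, int N)
{
  int idx = blockIdx.x*blockDim.x + threadIdx.x;
  if (idx < N) a[idx] = a[idx] + 1.f;
}
```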

The code compiles correctly; it is Visual IntelliSense that tries to parse the code and catch errors on its own. The trick I usually use is to have a "hacked" header file that defines all the CUDA-specific symbols (threadIdx, __device__, etc.) and then include it in the .cu file like this:

#ifndef __CUDACC__
#include "myhack.h"
#endif

This way, IntelliSense will read in myhack.h and won't complain about the CUDA symbols. The real nvcc compiler defines the __CUDACC__ macro, so it never reads the hack file.


Further to CygnusX1's answer, follow these directions to add CUDA keywords like blockDim to your usertype.dat file for Visual Studio 2010.

That should eliminate Intellisense errors for those keywords.
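For reference, usertype.dat is a plain text file (typically placed in the IDE folder next to devenv.exe) with one keyword per line; a CUDA entry set might look like this (the exact keyword list is up to you):

```
blockIdx
blockDim
threadIdx
gridDim
warpSize
__global__
__device__
__shared__
__constant__
```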