Difference between global and device functions
Can anyone describe the differences between __global__ and __device__?
When should I use __device__, and when should I use __global__?
Solution 1:
Global functions are also called "kernels". They are the functions that you can call from the host side using CUDA kernel launch syntax (<<<...>>>).
Device functions can only be called from other device functions or from global functions. __device__ functions cannot be called from host code.
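As a minimal sketch of that split (the names square, squareAll and squareOnDevice are made up for illustration, and the launch configuration of 256 threads per block is just an assumption), a kernel launched from the host can call a __device__ helper like this:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device function: callable only from kernels or other device functions.
__device__ float square(float x) {
    return x * x;
}

// Kernel (__global__): launched from the host with <<<...>>>, runs on the GPU.
__global__ void squareAll(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = square(data[i]);  // device-to-device call, no launch syntax
    }
}

// Host helper (hypothetical): copies data to the GPU, launches the kernel,
// and copies the squared values back into the same host array.
cudaError_t squareOnDevice(float *host, int n) {
    float *dev = nullptr;
    size_t bytes = n * sizeof(float);
    cudaError_t err = cudaMalloc(&dev, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        squareAll<<<blocks, threads>>>(dev, n);
        err = cudaGetLastError();                  // catches launch errors
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err;
}

int main() {
    float values[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    cudaError_t err = squareOnDevice(values, 4);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < 4; ++i) printf("%g ", values[i]);
    printf("\n");
    return 0;
}
```

Calling square(1.0f) directly from main would not compile, because the function only exists in device code.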
Solution 2:
The differences between __device__ and __global__ functions are:
__device__ functions can be called only from the device, and are executed only on the device.
__global__ functions can be called from the host, and are executed on the device.
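Therefore, you call __device__ functions from kernel functions, and you don't have to specify a launch configuration for them. You can also "overload" a function across host and device, e.g. have a host void foo(void) and a device __device__ void foo(void): one is executed on the host and can only be called from a host function, the other is executed on the device and can only be called from a device or kernel function. Whether two separate definitions with the same signature are accepted depends on the compiler; with nvcc the usual route is a single definition marked __host__ __device__, which is compiled once for each side.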
See also http://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions, which I found useful.
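A rough sketch of having the same function on both sides follows. Note that it uses one definition marked __host__ __device__ instead of two separate definitions, because as far as I know nvcc does not accept two functions that differ only in execution space. The names runsOnDevice, probe and deviceAnswer are hypothetical; __CUDA_ARCH__ is the macro nvcc defines only while compiling device code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One definition, compiled twice: once for the host, once for the device.
__host__ __device__ int runsOnDevice() {
#ifdef __CUDA_ARCH__
    return 1;   // this branch is compiled into the device version
#else
    return 0;   // this branch is compiled into the host version
#endif
}

// Kernel that stores the device-side answer in GPU memory.
__global__ void probe(int *out) {
    *out = runsOnDevice();
}

// Host helper (hypothetical): launches the kernel and returns what the
// device version reported, or -1 if a CUDA call failed.
int deviceAnswer() {
    int *dev = nullptr;
    int host = -1;
    if (cudaMalloc(&dev, sizeof(int)) != cudaSuccess) return -1;
    probe<<<1, 1>>>(dev);
    if (cudaMemcpy(&host, dev, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
        host = -1;
    }
    cudaFree(dev);
    return host;
}

int main() {
    printf("called from host:   %d\n", runsOnDevice());
    printf("called from kernel: %d\n", deviceAnswer());
    return 0;
}
```

The host call and the kernel call go through the same source-level function, but each side runs its own compiled version.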
Solution 3:
- __global__ - Runs on the GPU, called from the CPU or the GPU*. Launched with <<<dim3>>> arguments.
- __device__ - Runs on the GPU, called from the GPU. Can be used with variables too.
- __host__ - Runs on the CPU, called from the CPU.
*) __global__ functions can be called from other __global__ functions starting with compute capability 3.5.
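To illustrate the remark that __device__ also applies to variables, here is a small sketch, assuming a working CUDA device. The names counter, addToCounter and runCounterDemo are invented for the example; cudaMemcpyToSymbol and cudaMemcpyFromSymbol are the runtime calls for reading and writing such a variable from the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __device__ on a variable places it in GPU global memory.
__device__ int counter;

// Each thread adds `amount` to the device-side counter.
__global__ void addToCounter(int amount) {
    atomicAdd(&counter, amount);
}

// Host helper (hypothetical): resets the counter, launches `threads`
// threads in one block, and reads the final value back to the host.
int runCounterDemo(int threads, int amount) {
    int value = 0;
    cudaMemcpyToSymbol(counter, &value, sizeof(int));    // host -> device variable
    addToCounter<<<1, threads>>>(amount);
    cudaMemcpyFromSymbol(&value, counter, sizeof(int));  // device variable -> host
    return value;
}

int main() {
    // 32 threads each add 2 to the counter.
    printf("counter = %d\n", runCounterDemo(32, 2));
    return 0;
}
```

The host cannot read counter directly by name; it has to go through the symbol copy functions, while kernels and device functions use it like an ordinary global.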