There are existing R packages for CUDA. However, if you need to write your own custom parallel code for an NVIDIA GPU and call it from R, you can do so with the CUDA Toolkit. This post demonstrates a sample function that approximates the value of Pi using a GPU-accelerated Monte Carlo method. The sample is built with Visual Studio 2010, but the Toolkit is supported on Linux platforms as well. It is assumed that Visual Studio is already integrated with the CUDA Toolkit.
The first thing to do is to create a New Project using the Win32 Console Application template, then choose DLL as the application type together with the Empty project option.
Next, apply some standard project customizations, including:
CUDA Build Customization: enable it for the project.
CUDA Runtime: select the Shared/dynamic CUDA runtime library.
Project Dependencies: since the CUDA code in this example uses cuRAND for the Monte Carlo simulation, the corresponding library must be included, or else linking will fail.
Now it is time to code. Only a single .cu file is needed, written like any standard CUDA source file. It is important to declare the exported function as extern "C" so that R can call it.
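As a rough illustration of what such a .cu file might look like, here is a minimal sketch, not the original code from this post. It assumes the cuRAND host API is used (which is consistent with having to link the cuRAND library), and the names pi_mc, count_hits, n_points, seed and EXPORT, as well as the 64 x 256 launch configuration, are hypothetical choices made for this example. The estimate is 4 times the fraction of random points in the unit square that land inside the quarter circle.

```cuda
#include <cuda_runtime.h>
#include <curand.h>

// On Windows the function must be exported from the DLL (EXPORT is a name chosen for this sketch).
#ifdef _WIN32
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

// Count the points (x[i], y[i]) that fall inside the unit quarter circle.
__global__ void count_hits(const float *x, const float *y, int n, unsigned int *hits)
{
    unsigned int local = 0;
    // Grid-stride loop, so any launch configuration covers all n points.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += gridDim.x * blockDim.x) {
        if (x[i] * x[i] + y[i] * y[i] <= 1.0f) ++local;
    }
    atomicAdd(hits, local);  // one atomic update per thread
}

// Hypothetical entry point for R's .C(): every argument arrives as a pointer,
// and the result is written back through 'result'.
extern "C" EXPORT void pi_mc(int *n_points, int *seed, double *result)
{
    int n = *n_points;
    float *d_xy = NULL;            // x coordinates followed by y coordinates
    unsigned int *d_hits = NULL;
    unsigned int h_hits = 0;
    curandGenerator_t gen;

    *result = -1.0;                // stays negative if anything fails
    if (n <= 0) return;

    if (cudaMalloc((void **)&d_xy, 2 * (size_t)n * sizeof(float)) != cudaSuccess) return;
    if (cudaMalloc((void **)&d_hits, sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_xy);
        return;
    }
    cudaMemset(d_hits, 0, sizeof(unsigned int));

    if (curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT) == CURAND_STATUS_SUCCESS) {
        curandSetPseudoRandomGeneratorSeed(gen, (unsigned long long)*seed);
        // Fill the buffer with 2 * n uniform random numbers on the device.
        if (curandGenerateUniform(gen, d_xy, 2 * (size_t)n) == CURAND_STATUS_SUCCESS) {
            count_hits<<<64, 256>>>(d_xy, d_xy + n, n, d_hits);
            // cudaMemcpy waits for the kernel on the default stream to finish.
            if (cudaMemcpy(&h_hits, d_hits, sizeof(h_hits), cudaMemcpyDeviceToHost) == cudaSuccess)
                *result = 4.0 * (double)h_hits / (double)n;
        }
        curandDestroyGenerator(gen);
    }
    cudaFree(d_hits);
    cudaFree(d_xy);
}
```

With a function shaped like this, the call from R would look something like .C("pi_mc", as.integer(1e6), as.integer(42), result = double(1))$result once the DLL has been loaded. Every argument is a pointer and the return type is void because that is how the .C interface passes data.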
After a successful build, a DLL containing the CUDA code is created. This DLL is then loaded into R so that the function can be called.
Then start R and issue the dyn.load command to load the DLL into the running environment. A “wrapper” R function makes calling the CUDA code easier. Notice that at the heart of such a wrapper is the .C function.
Last but not least, the CUDA Toolkit comes with a visual profiler that can be used to profile the performance of the NVIDIA GPU. It can be launched from the GUI or from the command line. Note that the command line profiler must be started before R, or it might not be able to profile properly.
The GUI profiler is equipped with a nice interface to show performance statistics.