I will probably need this so …

Ivan's blog

This post covers a few points about how to measure time in CUDA.

Reading the CUDA documentation, you will find two ways:

  • cutStartTimer(myTimer)
  • Events

Events are a bit more sophisticated and, if your code uses asynchronous kernels, you must use them. But how can you know whether a piece of code has an asynchronous kernel or not?

For code to be asynchronous, the programmer must create streams and transfer the input data to the device using the instruction:
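As a rough illustration of the event-based approach, here is a minimal sketch. It is not taken from the original post: the kernel `scale` and the buffer size are made up for the example, and only the event calls themselves are standard CUDA runtime functions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to time.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;               // assumed problem size
    float *d_data = NULL;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);           // enqueue the start marker in stream 0
    scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);            // enqueue the stop marker in stream 0
    cudaEventSynchronize(stop);          // block the host until 'stop' is reached

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // elapsed time in milliseconds
    printf("Time for the kernel: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Because both events are recorded on the device side of the same stream, the measured interval does not depend on when the host happens to return from the kernel launch.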

cudaMemcpyAsync

In conclusion, if the code contains no instructions such as ‘cudaStreamCreate’ or ‘cudaMemcpyAsync’, you can assume that it is synchronous, which simplifies the measurements.
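To make that check concrete, this is roughly what asynchronous code looks like. It is a sketch with made-up names (`work`, `h_in`, `d_in`), not code from the original post. It assumes the host buffer is allocated with cudaMallocHost, since cudaMemcpyAsync needs page-locked host memory to actually overlap with host work.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real computation.
__global__ void work(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main()
{
    const int n = 1 << 20;               // assumed problem size
    const size_t bytes = n * sizeof(float);

    float *h_in = NULL, *d_in = NULL;
    cudaMallocHost(&h_in, bytes);        // page-locked host buffer
    cudaMalloc(&d_in, bytes);
    for (int i = 0; i < n; ++i) h_in[i] = 0.0f;

    cudaStream_t stream;
    cudaStreamCreate(&stream);           // first telltale sign of asynchronous code

    // Second telltale sign: copies and the kernel are queued on the stream
    // and return to the host immediately.
    cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, stream);
    work<<<(n + 255) / 256, 256, 0, stream>>>(d_in, n);
    cudaMemcpyAsync(h_in, d_in, bytes, cudaMemcpyDeviceToHost, stream);

    cudaStreamSynchronize(stream);       // wait for everything queued on the stream

    cudaStreamDestroy(stream);
    cudaFree(d_in);
    cudaFreeHost(h_in);
    return 0;
}
```

If your code looks like this, a host-side timer around the calls would mostly measure how long it takes to queue the work, which is why events are the safer choice here.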

Measuring with the cut{Start|Stop}Timer

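It is very important to call cudaThreadSynchronize() before stopping the timer to avoid erroneous measurements: a kernel launch returns control to the host before the kernel has finished, so without the synchronization the timer would only measure the launch itself. In newer CUDA releases cudaThreadSynchronize() is deprecated in favor of cudaDeviceSynchronize().

The code is below: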

The output:

[ivan@machine]$ ./timer
Device name : Tesla C2050
Time for selecting the device: 3423.731934 ms
Time for the kernel: 0.068000…
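The cut{Start|Stop}Timer functions come from the CUTIL helper library that shipped with older CUDA SDK samples, so they may not be available in your toolkit. As a stand-in, the same synchronize-before-stopping pattern can be sketched with a plain host timer. This is not the author's listing: std::chrono replaces the CUTIL timer, cudaDeviceSynchronize() replaces cudaThreadSynchronize(), and the kernel `fill` is hypothetical.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to time.
__global__ void fill(float *data, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

int main()
{
    typedef std::chrono::steady_clock Clock;   // stand-in for the CUTIL timer

    const int n = 1 << 20;                     // assumed problem size
    float *d_data = NULL;
    cudaMalloc(&d_data, n * sizeof(float));    // also forces context creation

    cudaDeviceSynchronize();                   // make sure nothing is pending
    Clock::time_point t0 = Clock::now();

    fill<<<(n + 255) / 256, 256>>>(d_data, n, 1.0f);
    cudaDeviceSynchronize();                   // wait for the kernel before stopping

    Clock::time_point t1 = Clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    printf("Time for the kernel: %f ms\n", ms);

    cudaFree(d_data);
    return 0;
}
```

Removing the second cudaDeviceSynchronize() is an easy way to see the problem: the reported time then typically reflects only the launch overhead, not the kernel.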
