3.7. Occupancy

This section describes the occupancy calculation functions of the CUDA runtime application programming interface.

In addition to the occupancy calculator function (cudaOccupancyMaxActiveBlocksPerMultiprocessor), there are C++-only occupancy-based launch configuration functions, which are documented in the C++ API Routines module.

See cudaOccupancyMaxPotentialBlockSize (C++ API) and cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API).
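As an illustration of the C++ launch configuration functions mentioned above, the following is a minimal sketch that asks cudaOccupancyMaxPotentialBlockSize for a block size and then derives the grid size from the problem size. The kernel fillKernel, the helper gridSizeFor, and the element count are hypothetical names and values chosen for this example only; a dynamic shared memory size of 0 and a block size limit of 0 (no limit) are assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only for illustration.
__global__ void fillKernel(int* data, int value, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = value;
    }
}

// Hypothetical helper: number of blocks needed to cover n elements
// (ceiling division).
int gridSizeFor(int n, int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

int main()
{
    const int n = 1 << 20;   // assumed problem size
    int minGridSize = 0;     // minimum grid size for full device occupancy
    int blockSize = 0;       // suggested block size

    // Assumption: no dynamic shared memory (0) and no block size limit (0).
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, fillKernel, 0, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "occupancy query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int* d_data = nullptr;
    err = cudaMalloc(&d_data, n * sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Launch with the suggested block size; the grid covers all n elements.
    int gridSize = gridSizeFor(n, blockSize);
    fillKernel<<<gridSize, blockSize>>>(d_data, 7, n);
    err = cudaDeviceSynchronize();
    cudaFree(d_data);

    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("blockSize = %d, gridSize = %d, minGridSize = %d\n",
           blockSize, gridSize, minGridSize);
    return 0;
}
```

The suggested block size is a heuristic starting point based on occupancy alone; it does not guarantee the best performance for a given kernel.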

Functions

cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor ( int* numBlocks, const void* func, int blockSize, size_t dynamicSMemSize )
Returns occupancy for a device function.

Parameters
numBlocks
- Returned occupancy
func
- Kernel function for which occupancy is calculated
blockSize
- Block size the kernel is intended to be launched with
dynamicSMemSize
- Per-block dynamic shared memory usage intended, in bytes
Description

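Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device function func, assuming it is launched with blockSize threads per block and dynamicSMemSize bytes of dynamic shared memory per block.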
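The following is a minimal sketch of how the returned block count might be turned into a theoretical occupancy figure, by comparing the resident warps it implies with the maximum number of warps a multiprocessor can hold. The kernel scaleKernel and the helper theoreticalOccupancy are hypothetical names introduced for this example; the block size of 128 and the absence of dynamic shared memory are assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only for illustration.
__global__ void scaleKernel(float* data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical helper: returns theoretical occupancy in [0, 1] for
// scaleKernel on the current device, or -1.0f if any CUDA call fails.
float theoreticalOccupancy(int blockSize, size_t dynamicSMemSize)
{
    int numBlocks = 0;
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, (const void*)scaleKernel, blockSize, dynamicSMemSize);
    if (err != cudaSuccess) {
        return -1.0f;
    }

    int device = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return -1.0f;
    }

    // Warps per block, rounded up in case blockSize is not a multiple
    // of the warp size.
    int warpsPerBlock = (blockSize + prop.warpSize - 1) / prop.warpSize;
    int activeWarps = numBlocks * warpsPerBlock;
    int maxWarps = prop.maxThreadsPerMultiProcessor / prop.warpSize;

    return (float)activeWarps / (float)maxWarps;
}

int main()
{
    const int blockSize = 128;  // assumed launch block size
    float occupancy = theoreticalOccupancy(blockSize, 0);
    if (occupancy < 0.0f) {
        fprintf(stderr, "occupancy query failed\n");
        return 1;
    }
    printf("Theoretical occupancy at blockSize %d: %.1f%%\n",
           blockSize, occupancy * 100.0f);
    return 0;
}
```

The value computed here is an upper bound derived from resource limits; achieved occupancy at run time can be lower.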

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)