KokkosNDLambdaWrapperReduction< dim, FUN > Struct Template Reference

Functor that wraps a lambda for use in a reduction, so that a variadic lambda can be called on an NVIDIA GPU, where CUDA appears unable to expand the variadic arguments directly.

#include <kokkos.hh>

Public Member Functions
KOKKOS_FUNCTION	KokkosNDLambdaWrapperReduction (const FUN &_fun)

template<typename... Args> requires (sizeof...(Args) == dim + 1)
KOKKOS_FORCEINLINE_FUNCTION void	operator() (Args &&...args) const

Public Attributes
FUN	fun

Private Member Functions
template<size_t... Is, typename... Args>
KOKKOS_FORCEINLINE_FUNCTION void	impl (device::integer_sequence< size_t, Is... >, Args &&...args) const

Detailed Description

template<int dim, typename FUN>
struct DiFfRG::KokkosNDLambdaWrapperReduction< dim, FUN >

This functor wraps a lambda for use in a reduction. It is needed when calling a variadic lambda on an NVIDIA GPU: CUDA appears unable to expand the variadic arguments, whereas the direct approach works with OpenMP or serial compilation. To work around this limitation, KokkosNDLambdaWrapperReduction packs the indices into an array. It uses compile-time index sequences to extract the first dim arguments as indices and the last argument as the reduction value, which avoids the overhead of recursive tuple_first/tuple_cat calls on every GPU thread.
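
The index-sequence extraction described above can be illustrated with a minimal host-only sketch. It is not the DiFfRG implementation: the name NDLambdaWrapperReductionSketch and the helper sum_products_2x3 are hypothetical, std::index_sequence stands in for device::integer_sequence, the Kokkos function macros are omitted, and the assumed lambda signature (an index array followed by a reference to the reduction value) is a guess at how the packed indices are forwarded.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical host-only stand-in for the wrapper; not the DiFfRG code.
template <int dim, typename FUN>
struct NDLambdaWrapperReductionSketch {
  FUN fun;

  explicit NDLambdaWrapperReductionSketch(const FUN &_fun) : fun(_fun) {}

  // Takes dim indices followed by one reduction value.
  template <typename... Args>
  void operator()(Args &&...args) const {
    static_assert(sizeof...(Args) == dim + 1,
                  "expected dim indices plus one reduction value");
    impl(std::make_index_sequence<dim>{}, std::forward<Args>(args)...);
  }

private:
  template <std::size_t... Is, typename... Args>
  void impl(std::index_sequence<Is...>, Args &&...args) const {
    // Tuple of references, so no argument is copied.
    auto refs = std::forward_as_tuple(std::forward<Args>(args)...);
    // The first dim arguments are packed into the index array.
    const std::array<std::size_t, dim> idx{
        {static_cast<std::size_t>(std::get<Is>(refs))...}};
    // The last argument is the reduction value (assumed lambda signature).
    fun(idx, std::get<dim>(refs));
  }
};

// Hypothetical usage: accumulate i * j over a 2 x 3 index grid on the host.
inline double sum_products_2x3() {
  auto body = [](const std::array<std::size_t, 2> &idx, double &update) {
    update += static_cast<double>(idx[0] * idx[1]);
  };
  NDLambdaWrapperReductionSketch<2, decltype(body)> wrapper(body);
  double result = 0.0;
  for (std::size_t i = 0; i < 2; ++i)
    for (std::size_t j = 0; j < 3; ++j)
      wrapper(i, j, result);
  return result;
}
```

In this sketch the pack expansion over Is selects the indices in a single step, so no recursive tuple helper is instantiated per call. In real use the wrapper would presumably be handed to a Kokkos reduction rather than called from nested host loops.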

Template Parameters

dim	Number of index arguments; operator() takes dim + 1 arguments (the dim indices followed by the reduction value)
FUN	Type of the wrapped lambda, to which the indices are forwarded

Constructor & Destructor Documentation