CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers

Directories | |
| directory | kernel |
| directory | thread |
Files | |
| file | batched_reduction.h [code] |
| Implements a software-pipelined efficient batched reduction. D = alpha * Reduction(A) + beta * C. | |
| file | batched_reduction_traits.h [code] |
| Defines structural properties of complete batched reduction. D = alpha * Reduction(A) + beta * C. | |
| file | reduction/threadblock_swizzle.h [code] |
| Defines functors for mapping blockIdx to partitions of the batched reduction computation. | |
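
The formula D = alpha * Reduction(A) + beta * C quoted in the file descriptions above can be illustrated with a small host-side reference. This is only a sketch and not CUTLASS code: the function name `reference_batched_reduction` is hypothetical, and it assumes that Reduction(A) is an element-wise sum over a number of equally sized partitions of A stored back to back, which the listing above does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not part of CUTLASS.
// Assumption: `a` holds `partitions` slices of c.size() values each,
// stored back to back, and Reduction(A) sums those slices element-wise.
std::vector<float> reference_batched_reduction(
    const std::vector<float>& a,
    const std::vector<float>& c,
    std::size_t partitions,
    float alpha,
    float beta) {
  const std::size_t elements = c.size();
  assert(a.size() == partitions * elements);

  std::vector<float> d(elements);
  for (std::size_t i = 0; i < elements; ++i) {
    float sum = 0.0f;  // Reduction(A): sum element i across all partitions
    for (std::size_t p = 0; p < partitions; ++p) {
      sum += a[p * elements + i];
    }
    d[i] = alpha * sum + beta * c[i];  // D = alpha * Reduction(A) + beta * C
  }
  return d;
}
```

A reference like this may be useful for checking the output of a device kernel on small inputs; it makes no attempt to model the software pipelining or the blockIdx-to-partition mapping described in the headers.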