Contents
  1. cuFFT(rocFFT)
  2. cuBLAS(hipblas, rocblas)
  3. cuRAND(rocrand)
  4. cuSPARSE(hipsparse, rocsparse)
  5. cuSOLVER(rocsolver)
  6. Thrust(hip-thrust)
  7. Eigen
  8. Others
    8.1. cudart(NVIDIA CUDA Runtime Library)
    8.2. NVML(NVIDIA Management Library)

This post introduces the CUDA math libraries. The name in parentheses after each one is its ROCm counterpart.

cuFFT(rocFFT)

cuBLAS(hipblas, rocblas)

Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication.

These routines are the de facto standard low-level routines for linear algebra libraries, and they have bindings for both C and Fortran.

Notation: x and y are vectors; A, B, and C are matrices; alpha and beta are scalar constants.

  • Level 1: the routines from the original presentation of BLAS (1979), which defined only vector operations on strided arrays: dot products, vector norms, and a generalized vector addition (axpy) of the form

    y = alpha * x + y
  • Level 2: matrix-vector operations, including a generalized matrix-vector multiplication (gemv):

    y = alpha * A * x + beta * y
  • Level 3: published in 1990, matrix-matrix operations, including a "general matrix multiplication" (gemm):

    C = alpha * A * B + beta * C
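
To make the three formulas concrete, here is a minimal CPU reference sketch in plain C++. It is not the cuBLAS or rocBLAS API: the `Matrix` type and the function names `axpy`, `gemv`, and `gemm` are hypothetical helpers that only mirror the BLAS routine names, and a row-major dense layout is assumed for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dense row-major matrix; a hypothetical helper type for this sketch only.
struct Matrix {
    std::size_t rows, cols;
    Vec data;  // element (i, j) is stored at data[i * cols + j]
    double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Level 1 (axpy): y = alpha * x + y
void axpy(double alpha, const Vec& x, Vec& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += alpha * x[i];
}

// Level 2 (gemv): y = alpha * A * x + beta * y
void gemv(double alpha, const Matrix& A, const Vec& x, double beta, Vec& y) {
    assert(A.cols == x.size() && A.rows == y.size());
    for (std::size_t i = 0; i < A.rows; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < A.cols; ++j) acc += A.at(i, j) * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}

// Level 3 (gemm): C = alpha * A * B + beta * C
void gemm(double alpha, const Matrix& A, const Matrix& B, double beta, Matrix& C) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    for (std::size_t i = 0; i < C.rows; ++i) {
        for (std::size_t j = 0; j < C.cols; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < A.cols; ++k) acc += A.at(i, k) * B.at(k, j);
            C.at(i, j) = alpha * acc + beta * C.at(i, j);
        }
    }
}
```

The loop nesting grows by one with each level: one loop for axpy, two for gemv, three for gemm. Higher levels therefore do more arithmetic per element loaded, which is why they offer the most room for GPU acceleration.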


cuRAND(rocrand)

The cuRAND library provides facilities that focus on the simple and efficient generation of high-quality pseudorandom and quasirandom numbers. A pseudorandom sequence of numbers satisfies most of the statistical properties of a truly random sequence but is generated by a deterministic algorithm. A quasirandom sequence of n-dimensional points is generated by a deterministic algorithm designed to fill an n-dimensional space evenly.

It consists of two pieces:

  • host (CPU) library, header /include/curand.h: called from host code to generate random numbers
  • device (GPU) header, /include/curand_kernel.h: device functions that generate random numbers inside kernels
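
The difference between pseudorandom and quasirandom sequences described above can be sketched on the CPU with the C++ standard library. This is an illustration only and does not use cuRAND: `std::mt19937` stands in for a pseudorandom generator, and a base-2 van der Corput sequence stands in for a one-dimensional quasirandom generator. The function names `pseudorandom` and `van_der_corput` are hypothetical.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Pseudorandom: a deterministic algorithm driven by a seed.
// The same seed always reproduces the same sequence.
std::vector<std::uint32_t> pseudorandom(std::uint32_t seed, int n) {
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> out;
    out.reserve(n);
    for (int i = 0; i < n; ++i) out.push_back(static_cast<std::uint32_t>(gen()));
    return out;
}

// Quasirandom: the i-th point of the base-2 van der Corput sequence,
// obtained by mirroring the binary digits of i around the binary point.
// Successive points fill [0, 1) evenly: 0.5, 0.25, 0.75, 0.125, ...
double van_der_corput(unsigned i) {
    double value = 0.0;
    double weight = 0.5;
    while (i > 0) {
        if (i & 1u) value += weight;
        i >>= 1;
        weight *= 0.5;
    }
    return value;
}
```

Both functions are deterministic. The pseudorandom output is built to look statistically random, while the quasirandom points are built to cover the interval evenly.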

cuSPARSE(hipsparse, rocsparse)

The cuSPARSE library contains a set of basic linear algebra subroutines used for handling sparse matrices.
It is implemented on top of the NVIDIA CUDA runtime (which is part of the CUDA Toolkit) and is designed to be called from C and C++.
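
As a rough illustration of what a sparse subroutine computes, the sketch below multiplies a sparse matrix by a dense vector on the CPU. It assumes compressed sparse row (CSR) storage, one common sparse format. The `CsrMatrix` type, its field names, and `spmv` are hypothetical and are not the cuSPARSE API.

```cpp
#include <cstddef>
#include <vector>

// Compressed sparse row (CSR) storage; names are hypothetical.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i owns entries [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// y = A * x, visiting only the stored (nonzero) entries of A.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            y[i] += A.values[k] * x[A.col_idx[k]];
        }
    }
    return y;
}
```

The work is proportional to the number of stored entries rather than to rows times columns, which is the point of a dedicated sparse library.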


cuSOLVER(rocsolver)

The intent of cuSolver is to provide useful LAPACK-like features.
The cuSolver library is a high-level package based on the cuBLAS and cuSPARSE libraries. It combines three separate libraries under a single umbrella, each of which can be used independently or in concert with other toolkit libraries.

  • cuSolverDN: deals with dense matrix factorization and solve routines such as LU, QR, SVD and LDLT
  • cuSolverSP: a new set of sparse routines based on a sparse QR factorization
  • cuSolverRF: a sparse re-factorization package that can provide very good performance when solving a sequence of matrices where only the coefficients are changed but the sparsity pattern remains the same.
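
The "factorize, then solve" pattern behind these routines can be sketched with a small dense LU on the CPU. This is a simplified illustration, not the cuSolver API: `lu_factor` and `lu_solve` are hypothetical names, and the sketch omits the pivoting that production LAPACK-style routines use, so it assumes every pivot is nonzero.

```cpp
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// In-place Doolittle LU without pivoting (assumes nonzero pivots).
// Afterwards the strict lower triangle holds L (unit diagonal implied)
// and the upper triangle, including the diagonal, holds U.
void lu_factor(Mat& a) {
    const std::size_t n = a.size();
    for (std::size_t k = 0; k < n; ++k) {
        for (std::size_t i = k + 1; i < n; ++i) {
            a[i][k] /= a[k][k];  // multiplier l_ik
            for (std::size_t j = k + 1; j < n; ++j) a[i][j] -= a[i][k] * a[k][j];
        }
    }
}

// Solve A x = b using the packed factors produced by lu_factor.
std::vector<double> lu_solve(const Mat& lu, std::vector<double> b) {
    const std::size_t n = lu.size();
    // Forward substitution: L y = b (y overwrites b).
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < i; ++j) b[i] -= lu[i][j] * b[j];
    // Backward substitution: U x = y (x overwrites b).
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = i + 1; j < n; ++j) b[i] -= lu[i][j] * b[j];
        b[i] /= lu[i][i];
    }
    return b;
}
```

Once the factors exist, each additional right-hand side costs only the two triangular solves. That is the same motivation for reusing work across a sequence of related systems.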


Thrust(hip-thrust)

Thrust is a C++ template library of parallel algorithms for parallel platforms, modeled on the C++ Standard Template Library (STL).
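
To show the resemblance, the sketch below is written with the plain STL so that it builds without a GPU toolchain. The comments name the Thrust algorithm that takes the same iterator-style arguments. The function `sorted_sum_of_squares` is a hypothetical example. A Thrust version would, as an assumption about typical usage, swap `std::vector` for a container such as `thrust::device_vector` and replace the lambda with a device-callable functor.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sorts v in place, then returns the sum of the squares of its elements.
int sorted_sum_of_squares(std::vector<int>& v) {
    std::sort(v.begin(), v.end());                        // Thrust counterpart: thrust::sort
    std::vector<int> squares(v.size());
    std::transform(v.begin(), v.end(), squares.begin(),   // Thrust counterpart: thrust::transform
                   [](int x) { return x * x; });
    return std::accumulate(squares.begin(), squares.end(), 0);  // Thrust counterpart: thrust::reduce
}
```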

Eigen

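Eigen is a high-level C++ library that efficiently supports linear algebra, matrix and vector operations, numerical analysis, and related algorithms.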

Others

cudart(NVIDIA CUDA Runtime Library)

  • https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html

    3.2. CUDA C Runtime
    The runtime is implemented in the cudart library, which is linked to the application, either statically via cudart.lib or libcudart.a, or dynamically via cudart.dll or libcudart.so. Applications that require cudart.dll and/or cudart.so for dynamic linking typically include them as part of the application installation package. It is only safe to pass the address of CUDA runtime symbols between components that link to the same instance of the CUDA runtime.

  • Ubuntu 18.04

    libcudart9.1/bionic 9.1.85-3ubuntu1 amd64
    NVIDIA CUDA Runtime Library

NVML(NVIDIA Management Library)

NVML is a C-based programmatic interface for monitoring and managing various states within NVIDIA Tesla GPUs.
It is intended to be a platform for building 3rd party applications, and is also the underlying library for the NVIDIA-supported nvidia-smi tool.

