NVIDIA Cloud Native

docs, youtube, youtube developer

Graphics cards

Name    Specifications    Comments
T4
T400                      Affordable
V100
A100

GPUs on Kubernetes

NVIDIA Multi-Instance GPU (MIG)

Multi-Instance GPU (MIG) expands the performance and value of NVIDIA H100, A100, and A30 Tensor Core GPUs. MIG can partition the GPU into as many as seven instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores.
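As a rough illustration of the partitioning described above, the sketch below prints the `nvidia-smi` commands that would enable MIG mode on one GPU and create two instances. `GPU_INDEX` and `PROFILE_IDS` are hypothetical variable names, and the profile IDs are an assumption (19 corresponds to `1g.5gb` on an A100 40GB); valid IDs differ per GPU model, so check the `-lgip` listing before creating anything.

```shell
#!/bin/sh
# Sketch only: GPU_INDEX and PROFILE_IDS are hypothetical names, and the
# profile IDs are an assumption (19 is 1g.5gb on an A100 40GB).
GPU_INDEX="0"
PROFILE_IDS="19,19"

# Print the nvidia-smi commands rather than running them, so the plan can
# be reviewed first; pipe the output to `sudo sh` to apply it.
mig_plan() {
  # enable MIG mode on the selected GPU (may require a GPU reset)
  echo "nvidia-smi -i ${GPU_INDEX} -mig 1"
  # list the GPU instance profiles this GPU supports
  echo "nvidia-smi mig -i ${GPU_INDEX} -lgip"
  # create the GPU instances and their default compute instances
  echo "nvidia-smi mig -i ${GPU_INDEX} -cgi ${PROFILE_IDS} -C"
  # list the resulting MIG devices
  echo "nvidia-smi -L"
}

mig_plan
```

Printing the plan instead of executing it keeps the example safe to run on a machine without a MIG-capable GPU; on a real node, `mig_plan | sudo sh` would apply it.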

product

NVIDIA GPU Cloud (NGC)

Components

NVIDIA Container Runtime

code

NVIDIA Container Toolkit

code

NVIDIA DCGM-Exporter

DCGM-Exporter exposes GPU metrics to Prometheus, leveraging NVIDIA Data Center GPU Manager (DCGM).
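To check what the exporter publishes, one option is to scrape its metrics endpoint by hand and filter for a single metric. The sketch below assumes the exporter listens on port 9400 on localhost (its usual default, but your deployment may differ); `METRICS_URL` and `gpu_util` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: METRICS_URL is an assumption; 9400 is the exporter's usual
# default port, adjust it if your deployment differs.
METRICS_URL="http://localhost:9400/metrics"

# Read Prometheus text format on stdin and keep only the GPU utilization
# samples, dropping the "# HELP" / "# TYPE" comment lines.
gpu_util() {
  grep '^DCGM_FI_DEV_GPU_UTIL'
}

# Fetch and filter only if an exporter answers at METRICS_URL.
if curl -fsS --max-time 2 "$METRICS_URL" >/dev/null 2>&1; then
  curl -fsS "$METRICS_URL" | gpu_util
else
  echo "no exporter reachable at $METRICS_URL"
fi
```

In a cluster, something like `kubectl port-forward` to the exporter pod would typically be needed before the endpoint is reachable from a workstation.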

code, docs

NVIDIA GPU feature discovery

code

NVIDIA GPU Operator

code, docs

NVIDIA Device Plugin

code

Tutorials

NVIDIA GPU Operator in K3s

NVIDIA GPUs with SLES

ℹ Full official support should come in early 2023

  • Build and push a custom driver image
git clone https://gitlab.com/nvidia/container-images/driver.git && cd driver/sle15

docker build . -t path/to/your/repo/driver:515.65.01-sles15.3 \
  --build-arg DRIVER_VERSION=515.65.01 \
  --build-arg CUDA_VERSION=11.7.1 \
  --build-arg SLES_VERSION=15.3

docker push path/to/your/repo/driver:515.65.01-sles15.3
  • Deploy GPU Operator and specify custom driver image
helm install gpu-operator nvidia/gpu-operator \
  --create-namespace -n gpu-operator \
  --set driver.repository=path/to/your/repo
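The install command above presumes the `nvidia` Helm repository is already configured and leaves the driver version at the chart default. The sketch below adds the repository and pins the version to the image built earlier. It rests on an assumption worth verifying against the chart you use: that the operator composes the image reference as `<repository>/<image>:<version>-<os>`, so `driver.version` takes only the driver number. `DRIVER_REPO`, `DRIVER_VERSION` and `operator_install_plan` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical variable names; the values mirror the build commands above.
DRIVER_REPO="path/to/your/repo"
DRIVER_VERSION="515.65.01"

# Print the commands instead of running them so they can be reviewed;
# pipe the output to `sh` to execute against the current kube context.
operator_install_plan() {
  # register NVIDIA's Helm repository and refresh the index
  echo "helm repo add nvidia https://helm.ngc.nvidia.com/nvidia"
  echo "helm repo update"
  # install the operator, pointing it at the custom driver image
  echo "helm install gpu-operator nvidia/gpu-operator --create-namespace -n gpu-operator --set driver.repository=${DRIVER_REPO} --set driver.version=${DRIVER_VERSION}"
  # check that the operator and driver pods come up
  echo "kubectl get pods -n gpu-operator"
}

operator_install_plan
```

Keeping the version in one variable makes it harder for the pushed image tag and the value given to the operator to drift apart.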