Student Coding
Resources

Links for NGC Tools

HPC SDK Blog/Customer Presentation/GTC presentation

Cool tools - NGC Container Tools (the articles contain links to the GitHub Repos):

NGC Module Files – TACC Lmod-based container user tool

NGC Container Replicator – download NGC Docker containers and convert them to Singularity, with a local repository

HPC Container Maker – Python-based, recipe-driven tool for creating containers for HPC/AI applications

NVIDIA

NVIDIA Sign-in

SDK List

Curated BIG SDK list

MIG Mode Notes

MIG Graphics capabilities

“Users should note the following considerations when the A100 is in MIG mode:

  • No graphics APIs are supported (e.g. OpenGL, Vulkan etc.)
  • No GPU to GPU P2P (either PCIe or NVLink) is supported
  • CUDA applications treat a Compute Instance and its parent GPU Instance as a single CUDA device. See this section on device enumeration by CUDA
  • CUDA IPC across GPU instances is not supported. CUDA IPC across Compute instances is supported
  • CUDA debugging (e.g. using cuda-gdb) and memory/race checking (e.g. using cuda-memcheck or compute-sanitizer) is supported
  • CUDA MPS is supported on top of MIG. The only limitation is that the maximum number of clients (48) is lowered proportionally to the Compute Instance size

GPUDirect RDMA is supported when used from GPU Instances”
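The MPS note above says the 48-client maximum shrinks in proportion to the Compute Instance size. A minimal Python sketch of that arithmetic follows. It assumes a full A100 exposes 7 compute slices in MIG mode and that the limit is rounded down; the quote does not state the rounding rule, so treat the numbers as rough estimates. The function name `estimated_mps_clients` is hypothetical and not part of any NVIDIA tool.

```python
# Rough estimate of the MPS client limit for a MIG Compute Instance.
# Assumptions (not stated in the quoted notes):
#   - a full A100 has 7 compute slices in MIG mode
#   - the proportional limit is rounded down (floor division)

FULL_GPU_MPS_CLIENTS = 48      # maximum quoted for a full GPU
FULL_GPU_COMPUTE_SLICES = 7    # assumed slice count for an A100


def estimated_mps_clients(compute_slices: int) -> int:
    """Scale the 48-client maximum by the share of compute slices (hypothetical helper)."""
    if not 1 <= compute_slices <= FULL_GPU_COMPUTE_SLICES:
        raise ValueError(
            f"compute_slices must be between 1 and {FULL_GPU_COMPUTE_SLICES}"
        )
    return FULL_GPU_MPS_CLIENTS * compute_slices // FULL_GPU_COMPUTE_SLICES


if __name__ == "__main__":
    for slices in (1, 2, 3, 4, 7):
        print(f"{slices} slice(s): about {estimated_mps_clients(slices)} MPS clients")
```

Under these assumptions a single-slice Compute Instance gets roughly 6 clients, while a 7-slice instance keeps the full 48. Check the actual limit against the MIG documentation for your driver version.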

NVIDIA Virtual Compute Server

NVIDIA Multi-Instance GPU and NVIDIA Virtual Compute Server

NVIDIA GitLab

XALT Blog

Maximizing Data Center Productivity with Application Workload Analysis

XALT GitHub

DCGM

NVIDIA Developer

Job Statistics with NVIDIA Data Center GPU Manager and SLURM

Data Center GPU Manager
