GPU energy - NVIDIA SMI - Component
What it does
This metric provider gets the current GPU power draw from the NVIDIA SMI software.
Classname
GpuEnergyNvidiaSmiComponentProvider
Metric Name
gpu_energy_nvidia_smi_component
Prerequisites & Installation
You must first install the CUDA Toolkit from NVIDIA so that the metric provider has the libraries and binaries it needs. The URL at the time of writing is: https://developer.nvidia.com/cuda-downloads
You need both:
- Base Installer
- Driver Installer
To check if the installation has succeeded you can run:
$ nvidia-smi -q
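If you want to run this check from a script before starting a measurement, a minimal sketch could look like the following. It only tests whether the binary is on the PATH, not whether the driver works. The function name tool_available is hypothetical and not part of the provider.

```python
import shutil


def tool_available(name: str = "nvidia-smi") -> bool:
    """Return True if an executable with this name is found on the PATH.

    Hypothetical helper for illustration; it does not prove that the
    driver is loaded, only that the binary is installed.
    """
    return shutil.which(name) is not None


if not tool_available():
    print("nvidia-smi not found; install the CUDA Toolkit and driver first")
```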
After the installation, your system can use language bindings for your matching CUDA version. Please check on our Measurement Cluster page which CUDA version is installed.
Debugging
If you cannot generate any output, first check whether your GPU is supported by NVIDIA CUDA on their list for CUDA support.
Then check with dmesg whether the kernel module was loaded correctly.
Sometimes a message like this appears:
The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:1081)
NVRM: installed in this system is not supported by open
NVRM: nvidia.ko because it does not include the required GPU
NVRM: System Processor (GSP).
In this case you should switch to the legacy kernel module.
Check in sudo dmesg whether the kernel module was loaded correctly, and then verify through cat /proc/driver/nvidia/version. See also the details on the NVIDIA support page.
Input Parameters
- args
  - -i: interval in milliseconds
By default the measurement interval is 100 ms.
./metric-provider-nvidia-smi-wrapper.sh -i 100
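If you start the wrapper from Python instead of a shell, a sketch along these lines may help. It assumes the script sits in the current directory under the name shown above. The helper names build_command and start_provider are hypothetical.

```python
import subprocess

# Path as used in the example above; adjust to where the script actually lives.
WRAPPER = "./metric-provider-nvidia-smi-wrapper.sh"


def build_command(interval_ms: int = 100, wrapper: str = WRAPPER) -> list[str]:
    """Build the argument list for the wrapper with the -i interval option."""
    if interval_ms <= 0:
        raise ValueError("interval must be a positive number of milliseconds")
    return [wrapper, "-i", str(interval_ms)]


def start_provider(interval_ms: int = 100) -> subprocess.Popen:
    """Start the wrapper and expose its Stdout as a text stream.

    Stderr is left attached to the parent process so errors stay visible.
    """
    return subprocess.Popen(
        build_command(interval_ms), stdout=subprocess.PIPE, text=True
    )
```

Calling start_provider(100) corresponds to the command line above. The returned process should be terminated by the caller once the measurement is finished.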
Output
This metric provider prints to Stdout a continuous stream of data. The format of the data is as follows:
TIMESTAMP READING
Where:
- TIMESTAMP: Unix timestamp, in microseconds
- READING: The power drawn by the GPU in milliwatts (Ex: 12230 for 12.23 Watts)
Any errors are printed to Stderr.
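To illustrate the format, here is a minimal parsing sketch that converts one output line into seconds and Watts. It assumes both fields are whitespace-separated integers, as in the example reading above. The names Sample and parse_line and the timestamp value are made up for illustration.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    seconds: float  # Unix time in seconds
    watts: float    # GPU power in Watts


def parse_line(line: str) -> Sample:
    """Parse one 'TIMESTAMP READING' line from the provider's Stdout."""
    timestamp_us, reading_mw = line.split()
    # Microseconds -> seconds, milliwatts -> Watts.
    return Sample(int(timestamp_us) / 1_000_000, int(reading_mw) / 1000)


# Hypothetical timestamp; 12230 is the example reading from above.
sample = parse_line("1700000000000000 12230")
print(sample.watts)
```

A line that does not contain exactly two integer fields raises a ValueError, which makes malformed output easy to notice.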
How it works
The provider uses the nvidia-smi tool to read the data.
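The following sketch shows one way such a reading could be taken. It is not the provider's actual implementation. It assumes nvidia-smi is queried with its --query-gpu=power.draw option, which reports power in Watts, and that only the first GPU is of interest. The function names are hypothetical.

```python
import subprocess
import time

# These nvidia-smi options exist; that the provider uses exactly these
# is an assumption made for this illustration.
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=power.draw",
    "--format=csv,noheader,nounits",
]


def format_reading(timestamp_us: int, watts_text: str) -> str:
    """Format a power value given in Watts as a 'TIMESTAMP READING' line.

    The reading is converted to milliwatts and rounded to an integer.
    """
    milliwatts = round(float(watts_text) * 1000)
    return f"{timestamp_us} {milliwatts}"


def sample_once() -> str:
    """Query nvidia-smi once and format the power draw of the first GPU."""
    result = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    first_gpu = result.stdout.strip().splitlines()[0]
    return format_reading(time.time_ns() // 1000, first_gpu)
```

On GPUs where nvidia-smi cannot report a power draw, the value is not a number and format_reading raises a ValueError. A real provider would have to handle that case.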