Install NIC and DPU Agents
Installing NetQ telemetry agents on your hosts with NVIDIA ConnectX adapters and NVIDIA BlueField data processing units (DPUs) allows you to track inventory data and statistics across devices. The DOCA Telemetry Service (DTS) is the agent that runs on hosts and DPUs to collect data.
- NIC telemetry for ConnectX adapters is supported only for on-premises NetQ deployments.
- ConnectX telemetry is supported on DTS version 1.14.2 and later.
Install DTS on ConnectX Hosts
To install and configure the DOCA Telemetry Service container on a host with ConnectX adapters, perform the following steps:
- Obtain the latest DTS container image path from the NGC catalog. Select Get Container and copy the image path.
- Run the DTS container with Docker on the host. Use the image path obtained in the previous step for the DTS_IMAGE variable and configure the IP address of your NetQ server for the -i option:
export DTS_IMAGE=nvcr.io/nvidia/doca/doca_telemetry:1.14.2-doca2.2.0-host
docker run -v "/opt/mellanox/doca/services/telemetry/config:/config" --rm --name doca-telemetry-init -ti $DTS_IMAGE /bin/bash -c "DTS_CONFIG_DIR=host_netq /usr/bin/telemetry-init.sh && /usr/bin/enable-fluent-forward.sh -i=10.10.10.1 -p=30001"
docker run -d --net=host \
--privileged \
-v "/opt/mellanox/doca/services/telemetry/config:/config" \
-v "/opt/mellanox/doca/services/telemetry/ipc_sockets:/tmp/ipc_sockets" \
-v "/opt/mellanox/doca/services/telemetry/data:/data" \
--rm --name doca-telemetry -it $DTS_IMAGE /usr/bin/telemetry-run.sh
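The NetQ server address is hard-coded in the quoted init command of the first docker run above. As a minimal sketch, a small shell function can build that string from parameters. The function name dts_init_command and its arguments are illustrative only and are not part of DTS or NetQ. The port defaults to 30001, the value used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: build the init command string passed to /bin/bash -c in the
# first docker run above. dts_init_command is a hypothetical helper name.
dts_init_command() {
  local netq_ip="$1"
  local netq_port="${2:-30001}"   # assumption: default to the port shown in this guide
  printf '%s -i=%s -p=%s\n' \
    'DTS_CONFIG_DIR=host_netq /usr/bin/telemetry-init.sh && /usr/bin/enable-fluent-forward.sh' \
    "$netq_ip" "$netq_port"
}

# Print the command for a NetQ server at 10.10.10.1.
dts_init_command 10.10.10.1
```

To use it, replace the quoted string in the first docker run with "$(dts_init_command 10.10.10.1)", substituting your own NetQ server IP address.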
Configure Prometheus Targets for ConnectX Adapters
The Prometheus adapter pod in NetQ collects statistics from ConnectX adapters in your network. To add adapters as a target for data collection, perform the following steps:
- On your NetQ VM, edit the targets-config ConfigMap with the kubectl edit cm targets-config command.
Add the host IP addresses to the targets array, preserving the YAML indentation. Separate multiple entries with commas and use port 9100 for each entry:
data:
  targets.json: |-
    [
      {
        "labels": {
          "job": "node"
        },
        "targets": [
          "10.10.10.10:9100","10.10.10.11:9100"
        ]
      }
    ]
- Restart the netq-prom-adapter pod.
Retrieve the current pod name with the kubectl get pods | grep netq-prom command:
cumulus@netq-server:~$ kubectl get pods | grep netq-prom
netq-prom-adapter-ffd9b874d-hxhbz 2/2 Running 0 3h50m
Restart the pod by deleting the running pod:
kubectl delete pod netq-prom-adapter-ffd9b874d-hxhbz
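The two steps above involve some hand-typing: the quoted, comma-separated target list and the generated pod name. The following is a minimal sketch of two shell helpers that reduce that typing. Both function names (prom_targets and adapter_pod_name) are hypothetical and are not NetQ commands. The second assumes the kubectl get pods output has the pod name in the first column, as in the listing above.

```shell
#!/usr/bin/env bash
# prom_targets (hypothetical helper): print host IPs as the quoted,
# comma-separated "host:9100" entries expected in the targets array.
prom_targets() {
  local port=9100 out="" host
  for host in "$@"; do
    out="${out:+$out,}\"${host}:${port}\""
  done
  printf '%s\n' "$out"
}

# adapter_pod_name (hypothetical helper): read `kubectl get pods` output
# on stdin and print the first pod name starting with netq-prom-adapter.
adapter_pod_name() {
  awk '$1 ~ /^netq-prom-adapter/ { print $1; exit }'
}

# Print the entries for the two example hosts used in this guide.
prom_targets 10.10.10.10 10.10.10.11
```

Paste the printed line into the targets array while editing the ConfigMap. To restart the pod without copying its name, one option is kubectl delete pod "$(kubectl get pods | adapter_pod_name)".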
Install DTS on DPUs
To install and configure the DOCA Telemetry Service (DTS) container on a DPU, perform the following steps:
- Obtain the DTS container image path from the NGC catalog. Select Get Container, then View all tags. Copy the 1.18.2-doca2.8.0-host image path.
- Remove any current DTS configurations using the following command:
sudo rm -rf /opt/mellanox/doca/services/telemetry/config
- Retrieve the container YAML configuration file onto the host. Use the path specified in the Adjusting the .yaml Configuration section in the NGC instructions. Copy it to /etc/kubelet.d/doca_telemetry_standalone.yaml:
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/doca/doca_container_configs/versions/2.0.2v1/files/configs/2.0.2/doca_telemetry.yaml -O /etc/kubelet.d/doca_telemetry_standalone.yaml
- Edit the image in both the containers and initContainers sections of the /etc/kubelet.d/doca_telemetry_standalone.yaml file to set the container image path retrieved in step 1.
- Edit the command in the initContainers section of the /etc/kubelet.d/doca_telemetry_standalone.yaml file to set the DTS_CONFIG_DIR parameter to inventory_netq. Configure the fluent forwarding -i option to your NetQ server IP address and the -p option to 30001:
initContainers:
...
command: ["/bin/bash", "-c", "DTS_CONFIG_DIR=inventory_netq /usr/bin/telemetry-init.sh && /usr/bin/enable-fluent-forward.sh -i=10.10.10.1 -p=30001"]
This step replaces the default configuration of command: ["/bin/bash", "-c", "/usr/bin/telemetry-init.sh && /usr/bin/enable-fluent-forward.sh"].
- Restart the DPE service with the service dpe restart command.
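The image edit in the steps above can also be scripted. The following is a rough sketch, assuming each image key in the downloaded file sits on its own line (optionally after a "- " list marker). That layout is an assumption about the file, and the function name render_with_image is hypothetical. The function writes the edited YAML to standard output rather than editing in place, so no backup file is left in /etc/kubelet.d.

```shell
#!/usr/bin/env bash
# render_with_image FILE IMAGE (hypothetical helper)
# Print FILE with the value of every "image:" line replaced by IMAGE.
# Assumes IMAGE contains no '#' or '&' characters, which would be
# misread by the sed expression below.
render_with_image() {
  local file="$1" image="$2"
  sed -E \
    "s#^([[:space:]]*(-[[:space:]]+)?image:[[:space:]]*).*#\1${image}#" \
    "$file"
}
```

One way to use it is to render to a temporary file, review the result, and then move it over /etc/kubelet.d/doca_telemetry_standalone.yaml before restarting the DPE service. The command line in the initContainers section still needs the edit described above.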