NTC Telemetry Solution
Opinionated Setup
This page documents the NTC opinionated installation in a generic sense.
The NTC Telemetry solution automates the creation of Telegraf configurations and pushes them to a Git repository. FluxCD then watches that repository and deploys changes based on the data in Git.
Components
-
Nautobot
Nautobot supplies the data for Telegraf configurations, provides Jobs that generate the configuration, provides relationships for viewing the configuration of a device, and pushes the configuration to Git.
Responsibility: Client with escalation to NTC Support
-
Templated Configuration
Templates generate the configuration, using JSON data from the Telegraf configuration so that large configurations can be produced quickly.
Responsibility: Client with escalation to NTC Support
-
GitOps for scalable deployment
With GitOps, Git holds a versioned history of the entire configuration.
Responsibility: NTC
-
Kubernetes
A Kubernetes backend provides scalable expansion.
Responsibility: Client with over-the-shoulder support by NTC. No proactive upgrades.
-
FluxCD
Updates the environment based on the Git repository.
Responsibility: NTC
-
Telegraf
Telegraf containers are deployed by FluxCD into the proper Kubernetes environment.
-
Vault
The solution uses an instance of HashiCorp Vault. This Vault instance is where secrets are stored and accessed by Kubernetes. It cannot be shared for use outside of the system, per HashiCorp licensing requirements.
Responsibility: NTC
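To make the templated-configuration component above more concrete, the following is a minimal sketch of rendering a Telegraf input block from JSON data. It is not the solution's actual template or Job code: the template text, the `render_ping_config` function, and the `addresses` and `count` keys are all assumptions made for illustration. Only the `[[inputs.ping]]` block with its `urls` and `count` options comes from Telegraf itself.

```python
import json
from string import Template

# Hypothetical template for a Telegraf ping input. The real templates
# used by the solution are not shown in this document.
PING_TEMPLATE = Template(
    "[[inputs.ping]]\n"
    "  urls = $urls\n"
    "  count = $count\n"
)


def render_ping_config(device_json: str) -> str:
    """Render a Telegraf ping input block from JSON data (assumed shape)."""
    data = json.loads(device_json)
    # json.dumps produces a double-quoted list, which is also valid TOML
    # for plain ASCII strings such as IP addresses.
    urls = json.dumps(data["addresses"])
    return PING_TEMPLATE.substitute(urls=urls, count=data.get("count", 1))


if __name__ == "__main__":
    sample = '{"addresses": ["192.0.2.10", "192.0.2.11"], "count": 3}'
    print(render_ping_config(sample))
```

Keeping the template separate from the data is what lets one template produce configuration for many devices. In the workflow described here, the rendered result would be committed to Git rather than printed.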
Workflow
A general flow involves the following components.
The first portion of the flow, on the left within Nautobot, is documented here. This workflow assumes that you need to add a new Telegraf plugin. You can skip the first step when working with an existing setup.
The second half of the workflow, within Kubernetes and FluxCD, is automated by the system and deploys whenever there is a change to the Git repository.
Common Activities
Adding a device to be monitored
Step 1: Add device to Nautobot
Step 2: Set the attributes needed for the device to match the Nautobot Dynamic Group
Step 3: Run the create agents job
Step 4: Run the Add Agent to Agent Group job
Step 5: Run the Add Agent to Agent Group job
Working with Nautobot Dynamic Groups
When adding a new device to be monitored, the controls are handled by the associated Dynamic Groups. Suppose a dynamic group is defined with a filter that requires the device to be located at the site atl, have a status of active, and have the tag monitoring-icmp applied. This allows the primary IP address to be monitored with ICMP regardless of device type, model, or any other attribute, as long as the device is active, located at the site atl, and has the appropriate tag.
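The matching logic described above can be sketched in a few lines of Python. This is only an illustration of how such a filter behaves, not Nautobot's implementation. The key names `site`, `status`, and `tag`, the `GROUP_FILTER` dictionary, and the `matches` helper are assumptions, and the real filter field names may differ between Nautobot versions.

```python
# Hypothetical filter mirroring the example in the text. The key names
# are assumptions and may differ between Nautobot versions.
GROUP_FILTER = {
    "site": ["atl"],
    "status": ["active"],
    "tag": ["monitoring-icmp"],
}


def matches(device: dict, group_filter: dict) -> bool:
    """Return True when the device satisfies every field in the filter."""
    for field, allowed in group_filter.items():
        value = device.get(field)
        # Treat single values (site, status) and lists (tags) the same way.
        values = value if isinstance(value, list) else [value]
        if not any(v in allowed for v in values):
            return False
    return True


if __name__ == "__main__":
    device = {
        "name": "atl-rtr-01",  # hypothetical device
        "site": "atl",
        "status": "active",
        "tag": ["monitoring-icmp", "core"],
    }
    print(matches(device, GROUP_FILTER))
```

Every field in the filter must match, so removing the tag or changing the status takes the device out of the group. Attributes the filter does not mention, such as the device model, have no effect.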
Dynamic groups are created using the various filters available within Nautobot and are documented further on the Nautobot documentation page.
Dynamic groups are found under Organization -> Dynamic Groups within the Nautobot UI.
Adding New Device with New Secret
Secrets are created manually within the GitOps repository. Each new secret credential requires its own file so that the secret can be loaded into the Kubernetes environment.
How to View Container Logs
Container logs are sent to the Loki instance, where they are available under the logs label network-metrics. To investigate a particular device that is having problems, navigate to the device in Nautobot and select the Telegraf tab. This gives the base container name. From there, navigate to the Loki Explore page within Grafana and start typing the pod name.
The pod name must be typed out, not pasted in. Rely on the auto-completion search.
The resulting logs give insight into what is or is not happening.
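For reference, the kind of query that ends up in the Explore page can be sketched as a LogQL stream selector. The helper below only illustrates the shape of such a query. The label names `logs` and `pod`, the function name, and the example pod prefix are assumptions, so check the labels that auto completion offers in your own Loki instance.

```python
import re


def build_log_query(pod_prefix: str, logs_value: str = "network-metrics") -> str:
    """Build a LogQL stream selector for pods whose name starts with pod_prefix.

    The label names "logs" and "pod" are assumptions for illustration.
    """
    # Accept only lowercase letters, digits, and hyphens so the prefix
    # needs no regex or string escaping inside the selector.
    if not re.fullmatch(r"[a-z0-9-]+", pod_prefix):
        raise ValueError(f"unexpected characters in pod prefix: {pod_prefix!r}")
    return f'{{logs="{logs_value}", pod=~"{pod_prefix}.*"}}'


if __name__ == "__main__":
    # Hypothetical base container name taken from the device's Telegraf tab.
    print(build_log_query("telegraf-atl-rtr-01"))
```

The regex match operator `=~` is used because the Telegraf tab gives only the base container name, while the full pod name typically carries a generated suffix.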