Network Automation Monitoring - TIG Stack Setup

Welcome to my network automation monitoring exploration series! These posts are mainly for me to understand the TIG stack, NetBox, Containerlab, StackStorm, and Docker better, and to document useful tips and tricks I discover. I hope you find them helpful and informative too.

We will build some Python scripts to experiment with network automation and gather SNMP and MDT (model-driven telemetry) statistics from our network devices.

Join me on this journey as I explore the details, and together we can discover the stack's potential.

This series will follow this format:

Setting up each container
Shifting to docker-compose
The network environment: Containerlab
Source of Truth: NetBox
Running scripts: Flask/FastAPI/Django
Building Workflows: StackStorm

Installing infrastructure for Docker containers

If you, like me, are on a Windows machine, you need to set up Docker before you can start building. Take a look at how to set up Docker on Windows to follow along.

To keep persistent data and maintain an organized setup, my final file structure will look like this:

$ pwd
/home/user/TIG

$ tree
.
├── grafana
│   ├── data
│   └── grafana.ini
├── influxdb
│   ├── config.yml
│   └── data
└── telegraf
    ├── code
    ├── telegraf-lab.docker
    └── telegraf.conf

InfluxDB

What is

A high-performance time-series database (TSDB), for storage and real-time analysis.

Build

We will first pull the InfluxDB Docker image.

$ docker pull influxdb:2

mkdir influxdb 
cd influxdb 
mkdir config
mkdir data 

docker run \
    --name influxdb_lab \
    -p 8086:8086 \
    -v "$PWD/data:/var/lib/influxdb2" \
    -v "$PWD/config:/etc/influxdb2" \
    -e DOCKER_INFLUXDB_INIT_MODE=setup \
    -e DOCKER_INFLUXDB_INIT_USERNAME=autonetops \
    -e DOCKER_INFLUXDB_INIT_PASSWORD=autonetops@123 \
    -e DOCKER_INFLUXDB_INIT_ORG=ORG_AUTONETOPS \
    -e DOCKER_INFLUXDB_INIT_BUCKET=LAB_AUTONETOPS \
    influxdb:2
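Since the command above runs in the foreground, a quick way to confirm that InfluxDB is ready is to poll its /health endpoint from a second terminal. The following is a minimal sketch, assuming curl is available on the host and the port mapping from the command above; the function name wait_for_influxdb and its default values are illustrative only.

```shell
# Poll the InfluxDB 2.x /health endpoint until it answers or we give up.
# The function name and its defaults are illustrative, not part of any tool.
wait_for_influxdb() {
    local url="${1:-http://localhost:8086/health}"
    local attempts="${2:-30}"
    local delay="${3:-2}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: return non-zero on HTTP errors, -s: no progress output
        if curl -fs "$url" >/dev/null 2>&1; then
            echo "InfluxDB answered after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "InfluxDB did not answer at $url" >&2
    return 1
}
```

Calling wait_for_influxdb with no arguments checks http://localhost:8086/health up to 30 times, two seconds apart.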

Telegraf

What is

An open-source agent used for collecting and sending metrics and events from databases, systems and network devices. It gathers data from a wide variety of sources through its numerous plugins, such as CPU, memory, network, and various application-specific plugins.

It works by utilizing plugins to gather data, which is then transmitted to destinations like databases for storage and analysis. The behavior and functionality of Telegraf are controlled via its configuration file (telegraf.conf). This file allows us to specify global settings and define input plugins for collecting data from specific sources, as well as output plugins to determine where the data should be sent, such as to an InfluxDB instance. By editing this configuration file, we can customize Telegraf to meet our specific monitoring and data collection needs.
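To make the three parts of the configuration file concrete, here is a small sketch that writes a stripped-down config with global agent settings, two input plugins, and one output plugin. The scratch file path and the choice of plugins are my own assumptions for illustration; the output goes to stdout instead of InfluxDB so the result is easy to inspect.

```shell
# Write a minimal Telegraf config to a scratch file (the path is a hypothetical choice).
conf_file="${TMPDIR:-/tmp}/telegraf-minimal.conf"

cat > "$conf_file" <<'EOF'
# Global settings: how often inputs are collected and outputs are flushed
[agent]
  interval = "10s"
  flush_interval = "10s"

# Input plugins: where the data comes from
[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[inputs.mem]]

# Output plugin: where the data goes (stdout here, for easy inspection)
[[outputs.file]]
  files = ["stdout"]
  data_format = "influx"
EOF

echo "Wrote $conf_file"
```

Mounting a file like this over /etc/telegraf/telegraf.conf, the same way the launch command later in this post does, should make Telegraf print CPU and memory metrics in line protocol format.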

Build

Let's set up Telegraf now. Same steps as before: create a directory for it, get the Docker image, and get the config file to modify and adapt to our environment.

cd ..
mkdir telegraf
cd telegraf
mkdir code
touch telegraf-lab.docker

Custom derivative image

In our tests, we intend to talk to the network and leverage some dependencies that are not there by default. It is recommended to create a custom derivative image to install any needed packages and commands.

So we will create a Dockerfile that installs them and save it as telegraf-lab.docker:

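FROM telegraf:latest
# Install Python, venv support, sudo, and an SSH server in a single layer
RUN apt-get update && \
    apt-get install -y python3 python3-pip python3-venv sudo openssh-server
# Create a virtual environment and install NAPALM into it
RUN python3 -m venv /venv
RUN /venv/bin/pip3 install napalm
# Runs only at build time: sshd is not started automatically in containers created from this image
RUN service ssh start
# Create a user for SSH access and add it to the sudo group
RUN useradd autonetops
RUN usermod -aG sudo autonetops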

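This installs Python, an SSH server, and the NAPALM Python module for interacting with the network, and creates a user we can use to SSH into the container. Keep in mind that a RUN instruction only executes while the image is being built, so the SSH service still has to be started inside the running container.

Then we build our Telegraf Docker image: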

$ pwd
/home/user/TIG/telegraf

# Build the derivative image
$ docker build -t telegraf_lab:latest - < telegraf-lab.docker

# Check our created image
$ docker images
REPOSITORY           TAG          IMAGE ID       CREATED              SIZE
telegraf_lab         latest       865987e9234c   About a minute ago   1.05GB
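Before moving on, it can be worth checking that the Python dependencies really ended up in the image. The sketch below runs pip from the /venv virtual environment created in the Dockerfile; napalm_installed is a hypothetical helper name, and the default tag is the one used in the build command above.

```shell
# Run pip from the image's virtual environment to confirm NAPALM is installed.
# napalm_installed is a hypothetical helper; the tag defaults to the image built above.
napalm_installed() {
    local image="${1:-telegraf_lab:latest}"
    # --entrypoint replaces the image's default entrypoint for this one-off command
    docker run --rm --entrypoint /venv/bin/pip3 "$image" show napalm
}
```

If the package is present, pip prints its metadata; if it is missing, pip exits with a non-zero status, so the function can also be used in a script condition.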

We will now configure Telegraf to communicate with InfluxDB. First, we need to know InfluxDB's address on the Docker network, so let's get it:

$ docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED          STATUS          PORTS                                            NAMES
94ffef1ca391   influxdb:latest   "/entrypoint.sh infl…"   20 minutes ago   Up 14 minutes   0.0.0.0:8086->8086/tcp, 0.0.0.0:8088->8088/tcp   influxdb

docker container inspect <hash|name>

... Omitted
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:02",
"Networks": {
    "bridge": {
        "IPAMConfig": null,
        "Links": null,
        "Aliases": null,
        "NetworkID": "7ac563150b9d473a64c1cfbd5c37e15318a2c1c91df355cf1106cca14a9fb7a6",
        "EndpointID": "0c65d36fa18c5a77f9770238853e41a91bf54957767a96f5c8f16c88a2ef8735",
        "Gateway": "172.17.0.1",
        "IPAddress": "172.17.0.2",
        "IPPrefixLen": 16,
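Instead of scrolling through the full inspect output, the address can be extracted directly with the -f (format) option of docker inspect, which takes a Go template. This is a small sketch; container_ip is a hypothetical helper name, and the argument is whatever name or ID docker ps shows for your InfluxDB container.

```shell
# Print the IP address of a container on the Docker network(s) it is attached to.
# container_ip is a hypothetical helper name; pass the container name or ID.
container_ip() {
    docker inspect \
        -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
        "$1"
}
```

For example, container_ip influxdb_lab would print only the address instead of the whole JSON document.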

Sample Configuration

The user is required to provide a valid configuration to use the image. A valid configuration has at least one input and one output plugin specified. The following will walk through the general steps to get going.

# Generate a sample telegraf.conf from the Docker image
docker run --rm telegraf telegraf config > telegraf.conf

Users can generate a sample configuration using the config subcommand. This will provide the user with a basic configuration with a handful of input plugins enabled to collect data from the system. However, the user will still need to configure at least one output before the file is ready. Look for [[outputs.influxdb_v2]] and edit with the proper influx information.
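As a rough guide to what the edited section can look like, the sketch below prints an [[outputs.influxdb_v2]] block filled in with this lab's values. Several things here are assumptions: influx_output_section is a hypothetical helper, the URL uses the container IP from the inspect output earlier, and the token is expected in an INFLUX_TOKEN environment variable that you would have to pass to the Telegraf container yourself (for example with -e). The token itself has to come from your InfluxDB instance.

```shell
# Print an [[outputs.influxdb_v2]] section for telegraf.conf.
# influx_output_section is a hypothetical helper; arguments: URL, org, bucket.
influx_output_section() {
    local url="$1" org="$2" bucket="$3"
    cat <<EOF
[[outputs.influxdb_v2]]
  urls = ["$url"]
  # The token is read from the INFLUX_TOKEN environment variable when Telegraf loads the file
  token = "\${INFLUX_TOKEN}"
  organization = "$org"
  bucket = "$bucket"
EOF
}

# Values from this lab: the container IP found with docker inspect and the
# org/bucket passed to the InfluxDB container at setup time.
influx_output_section "http://172.17.0.2:8086" "ORG_AUTONETOPS" "LAB_AUTONETOPS"
```

Keeping the token out of the file and in an environment variable is a design choice, not a requirement; pasting the token directly into the token field also works.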

Here is my telegraf.conf file for reference:

After editing the config file, we can launch Telegraf from the derivative image we built earlier:

sudo docker run -d -p 8125:8125 -p 8092:8092 -p 8094:8094 -p 57000:57000 -p 2022:22 \
    --name telegraf \
    -v "$PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro" \
    -v "$PWD/code:/code" \
    telegraf_lab:latest \
    telegraf --config /etc/telegraf/telegraf.conf

I got errors caused by the SNMP section of the config file. We can either comment out the SNMP plugin or add the MIBs to the container: install net-snmp through the derivative image's Dockerfile, or copy the MIBs from your host if you already have them:

sudo docker cp /usr/share/snmp/mibs/. telegraf:/usr/share/snmp/mibs/

Everything should work from here on.

Grafana

What is

Visualization tool to provide insightful dashboards and alerts based on the collected metrics.

Build

Let's get to the final part:

cd ..
mkdir grafana
cd grafana
mkdir data

sudo docker run -d -p 3000:3000 \
         --name=grafana_lab \
         --user root \
         --volume $PWD/data:/var/lib/grafana \
         grafana/grafana

💡 The default admin user credentials are admin/admin.

We also created a data folder for persistent storage of Grafana state, such as dashboards. Note that we run the Grafana container with --user root instead of the image's default user, which typically avoids permission errors when Grafana writes to the mounted data folder.

As we develop our graphs and plugins we want to be able to adjust what we build. We will start modifying the configuration file of Grafana, but first of all, we need to download it from the container:

docker cp grafana_lab:/etc/grafana/grafana.ini grafana.ini
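For edits to the copied file to take effect, the container has to read it back. One way to do that, sketched here under the assumption that you are happy to recreate the container, is to mount the local grafana.ini over the path it was copied from. restart_grafana_with_ini is a hypothetical helper name, and the defaults assume you run it from the grafana directory.

```shell
# Recreate the Grafana container with the local grafana.ini mounted into it.
# restart_grafana_with_ini is a hypothetical helper; arguments: ini path, data dir.
restart_grafana_with_ini() {
    local ini="${1:-$PWD/grafana.ini}"
    local data="${2:-$PWD/data}"
    # Remove the previous container, if any; the data directory on the host is kept
    docker rm -f grafana_lab >/dev/null 2>&1 || true
    docker run -d -p 3000:3000 \
        --name=grafana_lab \
        --user root \
        --volume "$data:/var/lib/grafana" \
        --volume "$ini:/etc/grafana/grafana.ini" \
        grafana/grafana
}
```

Because dashboards and other state live in the mounted data directory, removing and recreating the container this way should not lose them.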

What’s next?

Nice job. We have the entire environment up and running. Now, we need to start collecting information from our network and presenting it.

It was quite a bit of work, right? All this is just to set up the first part of the environment. It was worth it, though, because we learned more about the TIG stack and Docker. However, we can make things easier to set up. In the next post, let's use Docker Compose to help build this environment and continue expanding as we move forward.