Another look, start to finish

If you're the type of reader who skips most of the text in an article and only reads the figures and examples, then this is the section to actually read.

The network admin at my company is currently using MRTG to monitor the 4 interconnecting T1s we use for corporate connectivity. There are currently 6 different MRTG graphs (the 4 links, plus an aggregate and the Fast Ethernet interface on the router). I'd like to give him all this information on a single graph.

I want to grab the IF-MIB::ifInOctets and IF-MIB::ifOutOctets OIDs from the router, and I need to do it for the 4 T1 interfaces, the 1 FastEther and the 1 virtual multilink. So when I set up my RRD I'm going to need a Data Source for each in and out value (that's 12 total). As for RRAs, I want to hold the hourly average for 6 months and the last value at every step interval going back 2 weeks. This way I can provide a graph for the last month or more as well as a current graph. With a 300-second step there are 12 steps per hour, so that means 1 PDP per row for the LAST archive with 4032 records to hold 2 weeks (12 steps/hour * 24hrs * 14days), and 12 PDPs per row for the AVERAGE archive with 4320 records to hold our 6 months' worth of hourly averages (24hrs * 180days).
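The row counts fall straight out of the step size and the retention period. As a quick sanity check, the arithmetic can be sketched in a few lines of bash. The variable names are my own for illustration and are not anything rrdtool defines.

```shell
#!/bin/bash
# Sketch: derive the RRA row counts from the step and the desired retention.
step=300                             # seconds between updates (--step)
steps_per_hour=$(( 3600 / step ))    # primary data points (PDPs) per hour

# LAST archive: one row per step, kept for 14 days.
last_rows=$(( steps_per_hour * 24 * 14 ))

# AVERAGE archive: one row per hour (a full hour of steps consolidated),
# kept for 180 days.
avg_rows=$(( 24 * 180 ))

echo "RRA:LAST:0.5:1:${last_rows}"
echo "RRA:AVERAGE:0.5:${steps_per_hour}:${avg_rows}"
```

If you change the step, only the LAST row count and the AVERAGE steps-per-row change; the hourly archive still needs 24 rows per day.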

I'll set up the RRD in the following way:

Figure 5. Bandwidth RRD Creation Command

rrdtool create wan-thruput.rrd \
--start N --step 300 \
DS:Serial00_In:COUNTER:600:U:U \
DS:Serial01_In:COUNTER:600:U:U \
DS:Serial10_In:COUNTER:600:U:U \
DS:Serial11_In:COUNTER:600:U:U \
DS:FastEth20_In:COUNTER:600:U:U \
DS:MutliLink_In:COUNTER:600:U:U \
DS:Serial00_Out:COUNTER:600:U:U \
DS:Serial01_Out:COUNTER:600:U:U \
DS:Serial10_Out:COUNTER:600:U:U \
DS:Serial11_Out:COUNTER:600:U:U \
DS:FastEth20_Out:COUNTER:600:U:U \
DS:MutliLink_Out:COUNTER:600:U:U \
RRA:LAST:0.5:1:4032 \
RRA:AVERAGE:0.5:12:4320
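Twelve nearly identical DS lines are easy to mistype. Here is a minimal sketch, assuming you would rather generate them than type them. The function name build_ds_args is hypothetical, and the interface names are copied exactly as they appear in the command above so that they still match the names used when graphing.

```shell
#!/bin/bash
# Sketch: generate the 12 DS definitions instead of typing them by hand.
# Names are spelled exactly as in the create command above.
interfaces="Serial00 Serial01 Serial10 Serial11 FastEth20 MutliLink"
heartbeat=600   # seconds allowed between updates before a value is unknown

# Print one DS definition per line: all the In sources first, then all the Out.
build_ds_args() {
    local dir ifname
    for dir in In Out; do
        for ifname in $interfaces; do
            printf 'DS:%s_%s:COUNTER:%s:U:U\n' "$ifname" "$dir" "$heartbeat"
        done
    done
}

build_ds_args
```

The output can be pasted into the create command, or substituted in place of the DS lines with $(build_ds_args). The order matters, because rrdtool update expects its values in the order the Data Sources were defined.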

Now, time to start dropping data into it. I hacked around a bit and built the following script:

Figure 6. Bandwidth RRD Update Script

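#!/usr/local/bin/perl
### WAN RRD Script
use strict;
use warnings;

my $RRDTOOL    = "rrdtool";
my $TARGET_RRD = "wan-thruput.rrd";
my $SNMPGET    = "snmpget -v2c -Ovq -c public 10.1.2.3";

my (@T1_IN, @T1_OUT);

# Poll the in/out octet counters for ifIndex 1 through 11.
# Array index 0 holds ifIndex 1, index 1 holds ifIndex 2, and so on.
for (my $i = 1; $i <= 11; $i++) {
        my $in  = `${SNMPGET} IF-MIB::ifInOctets.${i}`;
        my $out = `${SNMPGET} IF-MIB::ifOutOctets.${i}`;

        chomp($in);
        chomp($out);

        push(@T1_IN,  $in);
        push(@T1_OUT, $out);
}

# Values must be listed in the same order as the DS definitions in the RRD:
# the four T1s, the FastEther, then the MultiLink; all Ins, then all Outs.
my @idx    = (0, 1, 2, 3, 6, 10);
my $values = join(":", "N", @T1_IN[@idx], @T1_OUT[@idx]);

system("${RRDTOOL} update ${TARGET_RRD} ${values}");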

I've taken the full pathnames out of the above script to make it fit on printed pages. The script grabs the input and output octet counters (an octet is a group of 8 bits, i.e. one byte) for interface indexes 1 through 11 and puts the six pairs we care about into our RRD.
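One thing the script does not do is check what snmpget actually returned. A minimal sketch of one way to guard against that, assuming you want any empty or non-numeric reply stored as unknown: rrdtool update accepts the letter U for an unknown value. The function names sanitize_counter and build_update_arg are hypothetical, and the sample readings are made up purely to show the behavior.

```shell
#!/bin/bash
# Print the value unchanged if it is a plain non-negative integer,
# otherwise print U, which rrdtool update treats as "unknown".
sanitize_counter() {
    case "$1" in
        ''|*[!0-9]*) echo "U" ;;
        *)           echo "$1" ;;
    esac
}

# Join all arguments into an "N:v1:v2:..." string for rrdtool update.
build_update_arg() {
    local arg="N" value
    for value in "$@"; do
        arg="${arg}:$(sanitize_counter "$value")"
    done
    echo "$arg"
}

# Made-up sample readings: two good counters, one empty reply, one error text.
build_update_arg 1200 3400 "" "No Such Instance"
# prints N:1200:3400:U:U
```

The same check is a one-line regular expression in Perl if you would rather keep everything in the update script.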

Now to graph it. First we'll create the current graph using the latest numbers, by invoking rrdtool graph. I'm using a shell script that can be executed from cron; it looks like this:

Figure 7. Bandwidth Graph Creation Script

#!/bin/bash

rrdtool graph wanoutput.png -a PNG \
--title="HS T1 Links" --vertical-label "Bytes" \
--height 150 \
'DEF:s00in=wan-thruput.rrd:Serial00_In:LAST' \
'DEF:s01in=wan-thruput.rrd:Serial01_In:LAST' \
'DEF:s10in=wan-thruput.rrd:Serial10_In:LAST' \
'DEF:s11in=wan-thruput.rrd:Serial11_In:LAST' \
'DEF:fe20in=wan-thruput.rrd:FastEth20_In:LAST' \
'DEF:mlin=wan-thruput.rrd:MutliLink_In:LAST' \
'DEF:s00out=wan-thruput.rrd:Serial00_Out:LAST' \
'DEF:s01out=wan-thruput.rrd:Serial01_Out:LAST' \
'DEF:s10out=wan-thruput.rrd:Serial10_Out:LAST' \
'DEF:s11out=wan-thruput.rrd:Serial11_Out:LAST' \
'DEF:fe20out=wan-thruput.rrd:FastEth20_Out:LAST' \
'DEF:mlout=wan-thruput.rrd:MutliLink_Out:LAST' \
'HRULE:193000#0000ff' \
'HRULE:386000#0000ff' \
'HRULE:579000#0000ff' \
'HRULE:772000#0000ff' \
'AREA:mlin#66FF99:MultiLink In' \
'AREA:mlout#FFFF33:MultiLink Out' \
'LINE1:fe20in#000000:FastEther In' \
'LINE1:fe20out#ff0000:FastEther Out' \
'GPRINT:s00in:LAST:XO T1 A In\: %6.0lf bytes ' \
'GPRINT:s01in:LAST:XO T1 B In\: %6.0lf bytes\j' \
'GPRINT:s10in:LAST:SBC T1 A In\: %6.0lf bytes' \
'GPRINT:s11in:LAST:SBC T1 B In\: %6.0lf bytes\j'

Again, for formatting reasons I've removed full paths from the above script. I highly suggest you use fully qualified paths for all files in your script.

So, in the above script we're using PNG as the output format for the graph and naming it "wanoutput.png". The graph is titled "HS T1 Links" and the plotting area is 150px high. We're also labeling the vertical axis "Bytes" for clarity; because the data sources are COUNTERs, the values plotted are really bytes per second. The DEF lines pull each of our data sources out of the RRD and give it a short name for later use in this graph, and they read from the LAST archive so that we get the most current info.

The HRULE directive is one we haven't looked at before. It simply draws a horizontal line on the output graph at the specified value. Here we're drawing HRULEs at intervals of 193,000 bytes per second (the 1.544 Mbit/s line rate of a T1 divided by 8) and coloring them blue. The idea of the HRULEs is to see quickly and easily how much of the total T1 bandwidth is being consumed by the multilink traffic.
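The four HRULE values in the script are just multiples of one T1's capacity in bytes per second. A small sketch of that arithmetic follows; the function name hrules is my own.

```shell
#!/bin/bash
# Sketch: compute the HRULE positions from the T1 line rate.
t1_bits_per_sec=1544000                       # T1 line rate: 1.544 Mbit/s
t1_bytes_per_sec=$(( t1_bits_per_sec / 8 ))   # 193000 bytes per second

# Print one HRULE argument for each of 1 through 4 bonded T1s.
hrules() {
    local n
    for n in 1 2 3 4; do
        echo "HRULE:$(( n * t1_bytes_per_sec ))#0000ff"
    done
}

hrules
```

This prints the same four HRULE arguments used in the graph script, from 193000 up to 772000.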

Then we define the AREAs and LINEs. I'm only graphing the MultiLink and FastEther traffic. If things are working properly, the thruput of the FastEther should always match the thruput of the MultiLink, so we use an AREA for the MultiLink and a LINE1 for the FastEther; you should then see the FastEther lines tracing the top of the MultiLink areas. It makes for quick reading. Please note that the order in which I've specified them is very important, because each line or area is drawn in the order you specify it, with later elements painted over earlier ones. So if the MultiLink Out area is larger than the MultiLink In area, you'll never see the In values, but you will still see the lines on top. You'll notice this in the output graph.

Finally, I use some GPRINT statements to put the thruput values of each individual T1 on the output graph. I chose not to actually graph these values because the graph just got way too crowded. In this case, we don't really care what the thruput of each T1 is anyway; we just want to know that they are all passing roughly the same amount of traffic, which means the MultiLink is working properly.

Now, I drop both the script for gathering the data and graphing the data into cron:

Figure 8. Crontab Invocation of Updating and Graphing

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/benr/RRD/WAN-RRD/wan-update.pl
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/benr/RRD/WAN-RRD/make_graph.sh
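Both jobs start in the same minute, so the graph may occasionally be drawn before the newest sample has been stored. One way around that, sketched here under the assumption that you are happy with a single cron entry, is a small wrapper that runs the two steps in order. The function name run_update_then_graph is hypothetical.

```shell
#!/bin/bash
# Sketch: run the update first, and draw the graph only after it has finished.
# Both arguments are commands (full paths to scripts, in practice).
run_update_then_graph() {
    local update_cmd="$1" graph_cmd="$2"
    if "$update_cmd"; then
        "$graph_cmd"
    else
        echo "update failed, skipping graph" >&2
        return 1
    fi
}

# Example call, using the paths from the crontab above:
# run_update_then_graph /home/benr/RRD/WAN-RRD/wan-update.pl \
#                       /home/benr/RRD/WAN-RRD/make_graph.sh
```

The failure branch only helps if the update script exits non-zero when something goes wrong; even if it never does, the ordering alone removes the race.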

After letting it run for a while, we look at the graph, and the following is the result.

Figure 9. 1 Day Bandwidth Utilization Graph

In the above output graph you can clearly see everything we want to know about our MultiLinked T1s. This is the output as seen on a Monday morning, so you'll notice that over the weekend there was very little traffic coming into our network across the T1s, but there was a fairly steady amount of traffic going out (apparently this is Microsoft Active Directory traffic). Because of the order in which we specified our AREA statements you don't see the green AREA (MultiLink In) at all, but you do see the FastEther Out line, which is drawn on top of both AREAs. We could fix this by reordering the AREAs and LINEs but, as you can see, during the week the traffic going out far exceeds the traffic coming in, making for better graphing.