element. As we are on a circle there is neither a beginning nor an end; you can
go on and on. After a while, all the available places will be used and
the process automatically reuses old locations. This way, the database
-will not grow in size and therefore requires no mainenance.
+will not grow in size and therefore requires no maintenance.
RRDtool works with Round Robin Databases (RRDs). It stores and retrieves
data from them.
-=head2 What data can be put into an RDD ?
+=head2 What data can be put into an RRD ?
You name it, it will probably fit. You should be able to measure some value
at several points in time and provide this information to RRDtool. If you
First of all: read it again! You may have missed something.
If you are unable to compile the sources and you have a fairly common
-OS, it will probably not be the fault of RRDtool. There may be precompiled
+OS, it will probably not be the fault of RRDtool. There may be pre-compiled
versions around on the Internet. If they come from trusted sources, get
one of those.
If on the other hand the program works but does not give you the
and it tells you that the car has moved 12345 KM until that moment.
At 12:10 you look again, it reads 12357 KM. This means you have
traveled 12 KM in five minutes. A scientist would translate that
-into meters per second and this makes a nice comparison towards the
+into meters per second and this makes a nice comparison toward the
problem of (bytes per five minutes) versus (bits per second).
We traveled 12 kilometers which is 12000 meters. We did that in five
this document. It holds one data source (DS) named "speed" that gets
built from a counter. This counter is read every five minutes (default)
In the same database two round robin archives (RRAs) are kept, one
-averages the data every time it is read (eg there's nothing to average)
+averages the data every time it is read (i.e., there's nothing to average)
and keeps 24 samples (24 times 5 minutes is 2 hours). The other averages
-6 values (half hour) and contains 10 of such averages (eg 5 hours)
+6 values (half an hour) and contains 10 such averages (i.e., 5 hours)
The remaining options will be discussed later on.
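The time span each archive covers follows from the step size, the number of
samples averaged per row, and the number of rows. Here is a minimal sketch of
that arithmetic in shell; the function name rra_hours is made up for this
illustration and is not part of RRDtool:

```shell
# Hypothetical helper (not an RRDtool command): hours covered by one RRA.
# $1 = step in seconds, $2 = samples averaged per row, $3 = number of rows
rra_hours() {
    echo $(( $1 * $2 * $3 / 3600 ))
}

rra_hours 300 1 24    # first RRA: every 5-minute sample kept, 24 rows
rra_hours 300 6 10    # second RRA: 6 samples averaged per row, 10 rows
```

The first call prints 2 and the second prints 5, matching the two hours and
five hours mentioned above.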
RRDtool works with special time stamps coming from the UNIX world.
It should return the following output:
- speed
-
- 920804700: NaN
- 920805000: 0.04
- 920805300: 0.02
- 920805600: 0.00
- 920805900: 0.00
- 920806200: 0.03
- 920806500: 0.03
- 920806800: 0.03
- 920807100: 0.02
- 920807400: 0.02
- 920807700: 0.02
- 920808000: 0.01
- 920808300: 0.02
- 920808600: 0.01
- 920808900: 0.00
- 920809200: NaN
+ speed
+
+ 920804700: nan
+ 920805000: 4.0000000000e-02
+ 920805300: 2.0000000000e-02
+ 920805600: 0.0000000000e+00
+ 920805900: 0.0000000000e+00
+ 920806200: 3.3333333333e-02
+ 920806500: 3.3333333333e-02
+ 920806800: 3.3333333333e-02
+ 920807100: 2.0000000000e-02
+ 920807400: 2.0000000000e-02
+ 920807700: 2.0000000000e-02
+ 920808000: 1.3333333333e-02
+ 920808300: 1.6666666667e-02
+ 920808600: 6.6666666667e-03
+ 920808900: 3.3333333333e-03
+ 920809200: nan
If it doesn't, something may be wrong. Perhaps your OS will print
"NaN" in a different form. It represents "Not A Number". If your OS
writes "U" or "UNKN" or something similar that's okay. If something
else is wrong, it will probably be due to an error you made (assuming
that my tutorial is correct of course :-). In that case: delete the
-database and try again.
+database and try again. Sometimes things change. This example used
+to provide numbers like "0.04" instead of "4.00000e-02". Those are
+really the same numbers, just written down differently. Don't be
+alarmed if a future version of RRDtool displays a slightly different
+form of output. The examples in this document are correct for version
+1.2.0.
What this output represents will become clear in the rest of the tutorial.
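If the plain and the exponent notation look unrelated, a quick check in the
shell may help. This sketch assumes a POSIX awk is installed; the helper name
show_both is invented for this example. It takes the first pair of odometer
readings (12345 and 12357 KM, 300 seconds apart) and prints the result in
both forms:

```shell
# Hypothetical helper: print (new - old) / seconds in both notations.
# $1 = old reading, $2 = new reading, $3 = seconds between the readings
show_both() {
    LC_ALL=C awk -v old="$1" -v new="$2" -v secs="$3" \
        'BEGIN { v = (new - old) / secs; printf "%.2f %.10e\n", v, v }'
}

show_both 12345 12357 300
```

This prints "0.04 4.0000000000e-02": one value, written down twice.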
magenta #FF00FF (mixed red with blue)
gray #555555 (one third of all components)
+Additionally you can add an alpha channel (transparency). The default
+will be "FF" which means non-transparent.
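As an illustration, the alpha value is written as two more hexadecimal digits
after the color. The sketch below only builds the argument string and does not
call RRDtool; the variable name myspeed and the helper with_alpha are
assumptions made for this example:

```shell
# Hypothetical helper: append an alpha byte to an RRGGBB color.
# $1 = color as RRGGBB, $2 = alpha as two hex digits (default FF, non-transparent)
with_alpha() {
    printf '#%s%s\n' "$1" "${2:-FF}"
}

echo "LINE2:myspeed$(with_alpha FF0000)"       # fully opaque red
echo "LINE2:myspeed$(with_alpha FF0000 80)"    # roughly half-transparent red
```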
+
The PNG you just created can be displayed using your favorite image
viewer. Web browsers will display the PNG via the URL
-"file://the/path/to/speed.png"
+"file:///the/path/to/speed.png"
=head2 Graphics with some math
When looking at the image, you notice that the horizontal axis is labeled
-12:10, 12:20, 12:30, 12:40 and 12:50. The two remaining times (12:00 and
-13:00) would not be displayed nicely so they are skipped.
+12:10, 12:20, 12:30, 12:40 and 12:50. Sometimes labels don't fit (12:00
+and 13:00 would be candidates), so they are skipped.
The vertical axis displays the range we entered. We provided kilometers
and when divided by 300 seconds, we get very small numbers. To be exact,
the first value was 12 (12357-12345) and divided by 300 this makes 0.04,
Now, all you have to do is measure the values regularly and update the
database. When you want to view the data, recreate the PNGs and make
sure to refresh them in your browser. (Note: just clicking reload may
-not be enough; Netscape in particular has a problem doing so and you'll
-need to click reload while pressing the shift key).
+not be enough, especially when proxies are involved. Try shift-reload
+or ctrl-F5).
=head2 Updates in Reality
Most people will use the counter that keeps track
of octets (bytes) transferred by a network device so we have to do just
that. We will start with a description of how to collect data.
-Some people will make a remark that there are tools that can do this data
+Some people will make a remark that there are tools which can do this data
collection for you. They are right! However, I feel it is important that
you understand they are not necessary. When you have to determine why
things went wrong you need to know how they work.
snmpget device password OID
+or
+
+ snmpget -v[version] -c[password] device OID
+
For device you substitute the name, or the IP address, of your device.
For password you use the "community read string" as it is called in the
SNMP world. For some devices the default of "public" might work, however
this can be disabled, altered or protected for privacy and security
reasons. Read the documentation that comes with your device or program.
-Then there is this third parameter, called OID, which means "object
-identifier".
+Then there is this parameter, called OID, which means "object identifier".
When you start to learn about SNMP it looks very confusing. It isn't
all that difficult when you look at the Management Information Base
Right, let's continue to the start of our OID: we had 1.3.6.1.2.1
From there, we are especially interested in the branch "interfaces"
-which has number 2 (eg 1.3.6.1.2.1.2 or 1.3.6.1.2.1.interfaces).
+which has number 2 (i.e., 1.3.6.1.2.1.2 or 1.3.6.1.2.1.interfaces).
First, we have to get some SNMP program. Check whether there is a
pre-compiled package available for your OS. This is the preferred way.
If not, you will have to get yourself the sources and compile those.
The Internet is full of sources, programs etc. Find information using
-a search engine or whatever you prefer. As a suggestion: look for
-CMU-SNMP. It is commonly used.
+a search engine or whatever you prefer.
Assume you got the program. First try to collect some data that is
available on most systems. Remember: there is a short name for the
part of the tree that interests us most in the world we live in!
-I will use the short version as I think this document is large enough
-as it is. If that doesn't work for you, prefix with .1.3.6.1.2.1 and
-try again. Also, Read The Fine Manual. Skip the parts you cannot
-understand yet, you should be able to find out how to start the
-program and use it.
+I will give an example which can be used on Fedora Core 3. If it
+doesn't work for you, work your way through the manual of snmp and
+adapt the example to make it work.
- snmpget myrouter public system.sysDescr.0
+ snmpget -v2c -c public myrouter system.sysDescr.0
The device should answer with a description of itself, perhaps empty.
Until you get a valid answer from a device, perhaps using a different
"password", or a different device, there is no point in continuing.
- snmpget myrouter public interfaces.ifNumber.0
+ snmpget -v2c -c public myrouter interfaces.ifNumber.0
Hopefully you get a number as a result, the number of interfaces.
If so, you can carry on and try a different program called "snmpwalk".
- snmpwalk myrouter public interfaces.ifTable.ifEntry.ifDescr
+ snmpwalk -v2c -c public myrouter interfaces.ifTable.ifEntry.ifDescr
If it returns with a list of interfaces, you're almost there.
Here's an example:
- [user@host /home/alex]$ snmpwalk cisco public 2.2.1.2
+ [user@host /home/alex]$ snmpwalk -v2c -c public cisco 2.2.1.2
interfaces.ifTable.ifEntry.ifDescr.1 = "BRI0: B-Channel 1"
interfaces.ifTable.ifEntry.ifDescr.2 = "BRI0: B-Channel 2"
On this Cisco equipment, I would like to monitor the "Ethernet0"
interface and see that it is number four. I try:
- [user@host /home/alex]$ snmpget cisco public 2.2.1.10.4 2.2.1.16.4
+ [user@host /home/alex]$ snmpget -v2c -c public cisco 2.2.1.10.4 2.2.1.16.4
interfaces.ifTable.ifEntry.ifInOctets.4 = 2290729126
interfaces.ifTable.ifEntry.ifOutOctets.4 = 1256486519
get a totally different picture. You would see the same values on the
average and maximum graphs (provided I measured every 300 seconds).
You would be able to see when I stopped, when I was in top gear, when
-I drove over fast hiways etc. The granularity of the data is much
+I drove over fast highways etc. The granularity of the data is much
higher, so you can see more. However, this takes 12 samples per hour,
or 288 values per day, so it would be too much to keep for a long
period of time. Therefore we average it, eventually to one value per
so called hot-spot and the exhaust. These values are not counters.
If I take the difference of the two samples and divide that by
300 seconds I would be asking for the temperature change per second.
-Hopefully this is zero! If not, the computerroom is on fire :)
+Hopefully this is zero! If not, the computer room is on fire :)
So, what can we do ? We can tell RRDtool to store the values we measure
directly as they are (this is not entirely true but close enough). The
=item *
-Line B is of type gauge. These are "real" values so they should match
+Line B is of type GAUGE. These are "real" values so they should match
what we put in: a sort of a wave.
=item *
interval is 297 seconds and also the counter increased by 297. Again
RRDtool alters the value and stores 300 as it should be.
- in the RDD in reality
+ in the RRD in reality
time+000: 0 delta="U" time+000: 0 delta="U"
time+300: 300 delta=300 time+300: 300 delta=300
DS:seconds:COUNTER:600:U:U \
RRA:AVERAGE:0.5:1:24
+Make a copy
+
for Unix: cp seconds1.rrd seconds2.rrd
for Dos: copy seconds1.rrd seconds2.rrd
for vms: how would I know :)
+Put in some data
+
rrdtool update seconds1.rrd \
920805000:000 920805300:300 920805600:600 920805900:900
rrdtool update seconds2.rrd \
920805000:000 920805300:300 920805603:603 920805900:900
+Create output
+
rrdtool graph seconds1.png \
--start 920804700 --end 920806200 \
--height 200 \
LINE2:seconds#0000FF \
AREA:unknown#FF0000
-Both graphs should show the same.
+View both images together (add them to your index.html file)
+and compare. Both graphs should show the same, despite the
+input being different.
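Why the two inputs give the same picture can be checked by hand. The following
is a simplified sketch, not RRDtool's actual code: it only computes the
per-second rate between consecutive updates, using a made-up helper called
rates and assuming a POSIX awk:

```shell
# Hypothetical helper: per-second rate between consecutive timestamp:value pairs.
rates() {
    printf '%s\n' "$@" | LC_ALL=C awk -F: '
        NR > 1 { printf "%.2f\n", ($2 - prev_v) / ($1 - prev_t) }
               { prev_t = $1; prev_v = $2 }'
}

rates 920805000:000 920805300:300 920805600:600 920805900:900
rates 920805000:000 920805300:300 920805603:603 920805900:900
```

Both calls print 1.00 three times: the counter advanced by exactly one per
second in either case, so the rates agree even though one update arrived
three seconds late.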
=head1 WRAPUP