From 7c016dfa001ae254bf4e18126f814ee8f0abd821 Mon Sep 17 00:00:00 2001 From: oetiker Date: Sun, 4 Mar 2001 13:01:56 +0000 Subject: [PATCH] Aberrant Behavior Detection support. A brief overview added to rrdtool.pod. Major updates to rrd_update.c, rrd_create.c. Minor update to other core files. This is backwards compatible! But new files using the Aberrant stuff are not readable by old rrdtool versions. See http://cricket.sourceforge.net/aberrant/rrd_hw.htm -- Jake Brutlag git-svn-id: svn://svn.oetiker.ch/rrdtool/trunk/program@26 a5681a0c-68f1-0310-ab6d-d61299d08faa --- NEWS | 18 + doc/rrdcreate.pod | 204 ++++++++- doc/rrdgraph.pod | 66 ++- doc/rrdtool.pod | 57 +++ doc/rrdtune.pod | 70 +++- src/Makefile.am | 2 + src/fnv.h | 102 +++++ src/hash_32.c | 152 +++++++ src/rrd_create.c | 389 +++++++++++++++--- src/rrd_dump.c | 74 +++- src/rrd_format.h | 129 +++++- src/rrd_graph.c | 102 ++++- src/rrd_hw.c | 714 ++++++++++++++++++++++++++++++++ src/rrd_info.c | 68 ++- src/rrd_open.c | 18 +- src/rrd_restore.c | 84 +++- src/rrd_tool.h | 40 +- src/rrd_tune.c | 181 +++++++- src/rrd_update.c | 640 ++++++++++++++++++++++------- src/rrdupdate.c | 1184 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 20 files changed, 4003 insertions(+), 291 deletions(-) create mode 100644 NEWS create mode 100644 src/fnv.h create mode 100644 src/hash_32.c create mode 100644 src/rrd_hw.c create mode 100644 src/rrdupdate.c diff --git a/NEWS b/NEWS new file mode 100644 index 0000000..b9c3a64 --- /dev/null +++ b/NEWS @@ -0,0 +1,18 @@ +RRDTOOL NEWS +============ + +In this file I am noting the Major changes to rrdtool +for details check the cvs ChangeLog + +2001/03/21 Tobias Oetiker + Added Aberrant Patch from Jake Brutlag + From now on, new rrd files use version tag 0002. They can + NOT be read by the old 1.0.x rrdtools + + Jack: + Aberrant Behavior Detection support. A brief overview added to + rrdtool.pod. Major updates to rrd_update.c, rrd_create.c. Minor update to + other core files.
Updated documentation: rrdcreate.pod, rrdgraph.pod, + rrdtune.pod. This is backwards compatible! See + http://cricket.sourceforge.net/aberrant/rrd_hw.htm + diff --git a/doc/rrdcreate.pod b/doc/rrdcreate.pod index 8be0567..14b6f19 100644 --- a/doc/rrdcreate.pod +++ b/doc/rrdcreate.pod @@ -10,7 +10,7 @@ B B I S<[B<--start>|B<-b> I]> S<[B<--step>|B<-s> I]> S<[BIB<:>IB<:>IB<:>IB<:>I]> -S<[BIB<:>IB<:>IB<:>I]> +S<[BIB<:>I]> =head1 DESCRIPTION @@ -106,30 +106,174 @@ I -=item BIB<:>IB<:>IB<:>I - +=item BIB<:>I + The purpose of an B is to store data in the round robin archives -(B). An archive consists of a number of data values from all the -defined data-sources (B) and is defined with an B line. +(B). An archive consists of a number of data values or statistics for +each of the defined data-sources (B) and is defined with an B line. When data is entered into an B, it is first fit into time slots of the length defined with the B<-s> option becoming a I. -The data is also consolidated with the consolidation function (I) -of the archive. The following consolidation functions are defined: -B, B, B, B. +The data is also processed with the consolidation function (I) +of the archive. There are several consolidation functions that consolidate +primary data points via an aggregate function: B, B, B, B. +The format of the B line for these consolidation functions is: + +BIB<:>IB<:>IB<:>I I The xfiles factor defines what part of a consolidation interval may be made up from I<*UNKNOWN*> data while the consolidated value is still regarded as known. -I defines how many of these I are used to -build a I which then goes into the archive. +I defines how many of these I are used to build +a I which then goes into the archive. I defines how many generations of data values are kept in an B.
=back +=head1 Aberrant Behaviour detection with Holt-Winters forecasting + +by Jake Brutlag Ejakeb@corp.webtv.netE + +In addition to the aggregate functions, there is a set of specialized +functions that enable B to provide data smoothing (via the +Holt-Winters forecasting algorithm), confidence bands, and the flagging of +aberrant behavior in the data source time series: + +=over 4 + +=item BIB<:>IB<:>IB<:>IB<:>IB<:>I + +=item BIB<:>IB<:>IB<:>I + +=item BIB<:>IB<:>IB<:>I + +=item BIB<:>IB<:>I + +=item BIB<:>IB<:>IB<:>IB<:>I + +=back + +These B differ from the true consolidation functions in several ways. +First, each of the Bs is updated once for every primary data point. +Second, these B are interdependent. To generate real-time confidence +bounds, a matched set of HWPREDICT, SEASONAL, DEVSEASONAL, and +DEVPREDICT must exist. Generating smoothed values of the primary data points +requires both a HWPREDICT B and SEASONAL B. Aberrant behavior +detection requires FAILURES, HWPREDICT, DEVSEASONAL, and SEASONAL. + +The actual predicted, or smoothed, values are stored in the HWPREDICT +B. The predicted deviations are stored in DEVPREDICT (think of a standard +deviation which can be scaled to yield a confidence band). The FAILURES +B stores binary indicators. A 1 marks the indexed observation as a +failure; that is, the number of confidence bounds violations in the +preceding window of observations met or exceeded a specified threshold. An +example of using these B to graph confidence bounds and failures +appears in L. + +The SEASONAL and DEVSEASONAL B store the seasonal coefficients for the +Holt-Winters Forecasting algorithm and the seasonal deviations respectively. +There is one entry per observation time point in the seasonal cycle. For +example, if primary data points are generated every five minutes, and the +seasonal cycle is 1 day, both SEASONAL and DEVSEASONAL will have 288 rows.
+ +In order to simplify the creation for the novice user, in addition to +supporting explicit creation of the HWPREDICT, SEASONAL, DEVPREDICT, +DEVSEASONAL, and FAILURES B, the B create command supports +implicit creation of the other four when HWPREDICT is specified alone and +the final argument I is omitted. + +I specifies the length of the B prior to wrap around. Remember +that there is a one-to-one correspondence between primary data points and +entries in these RRAs. For the HWPREDICT CF, I should be larger than +the I. If the DEVPREDICT B is implicitly created, the +default number of rows is the same as the HWPREDICT I argument. If the +FAILURES B is implicitly created, I will be set to the I argument of the HWPREDICT B. Of course, the B +I command is available if these defaults are not sufficient and the +creator wishes to avoid explicit creation of the other specialized function +B. + +I specifies the number of primary data points in a seasonal +cycle. If SEASONAL and DEVSEASONAL are implicitly created, this argument for +those B is set automatically to the value specified by HWPREDICT. If +they are explicitly created, the creator should verify that all three +I arguments agree. + +I is the adaptation parameter of the intercept (or baseline) +coefficient in the Holt-Winters Forecasting algorithm. See L for a +description of this algorithm. I must lie between 0 and 1. A value +closer to 1 means that more recent observations carry greater weight in +predicting the baseline component of the forecast. A value closer to 0 means +that past history carries greater weight in predicting the baseline +component. + +I is the adaptation parameter of the slope (or linear trend) coefficient +in the Holt-Winters Forecasting algorithm. I must lie between 0 and 1 +and plays the same role as I with respect to the predicted linear +trend.
+ +I is the adaptation parameter of the seasonal coefficients in the +Holt-Winters Forecasting algorithm (HWPREDICT) or the adaptation parameter in +the exponential smoothing update of the seasonal deviations. It must lie +between 0 and 1. If the SEASONAL and DEVSEASONAL B are created +implicitly, they will both have the same value for I: the value +specified for the HWPREDICT I argument. Note that because there is +one seasonal coefficient (or deviation) for each time point during the +seasonal cycle, the adaptation rate is much slower than that of the baseline. Each +seasonal coefficient is only updated (or adapts) when the observed value +occurs at the offset in the seasonal cycle corresponding to that +coefficient. + +If SEASONAL and DEVSEASONAL B are created explicitly, I need not +be the same for both. Note that I can also be changed via the +B I command. + +I provides the links between related B. If HWPREDICT is +specified alone and the other B created implicitly, then there is no +need to worry about this argument. If B are created explicitly, then +pay careful attention to this argument. For each B which includes this +argument, there is a dependency between that B and another B. The +I argument is the 1-based index in the order of B creation +(that is, the order they appear in the I command). The dependent +B for each B requiring the I argument is listed here: + +=over 4 + +=item * + +HWPREDICT I is the index of the SEASONAL B. + +=item * + +SEASONAL I is the index of the HWPREDICT B. + +=item * + +DEVPREDICT I is the index of the DEVSEASONAL B. + +=item * + +DEVSEASONAL I is the index of the HWPREDICT B. + +=item * + +FAILURES I is the index of the DEVSEASONAL B. + +=back + +I is the minimum number of violations (observed values outside +the confidence bounds) within a window that constitutes a failure. If the +FAILURES B is implicitly created, the default value is 7. + +I is the number of time points in the window.
Specify an +integer greater than or equal to the threshold and less than or equal to 28. +The time interval this window represents depends on the interval between +primary data points. If the FAILURES B is implicitly created, the +default value is 9. + =head1 The HEARTBEAT and the STEP Here is an explanation by Don Baarda on the inner workings of rrdtool. @@ -232,6 +376,46 @@ every hour (12 * 300 seconds = 1 hour), for 100 days (2400 hours). The third and the fourth RRA's do the same with the for the maximum and average temperature, respectively. +=head1 EXAMPLE 2 + +C + +This example is a monitor of a router interface. The first B tracks the +traffic flow in octets; the second B generates the specialized +functions B for aberrant behavior detection. Note that the I +argument of HWPREDICT is missing, so the other B will implicitly be +created with default parameter values. In this example, the forecasting +algorithm baseline adapts quickly; in fact the most recent one hour of +observations (each at 5 minute intervals) accounts for 75% of the baseline +prediction. The linear trend forecast adapts much more slowly. Observations +made during the last day (at 288 observations per day) account for only +65% of the predicted linear trend. Note: these computations rely on an +exponential smoothing formula described in a forthcoming LISA 2000 paper. + +The seasonal cycle is one day (288 data points at 300 second intervals), and +the seasonal adaptation parameter will be set to 0.1. The RRD file will store 5 +days (1440 data points) of forecasts and deviation predictions before wrap +around. The file will store 1 day (a seasonal cycle) of 0-1 indicators in +the FAILURES B. + +The same RRD file and B are created with the following command, which explicitly +creates all specialized function B. + +C + +Of course, explicit creation need not replicate implicit creation; a number of arguments +could be changed.
+ =head1 AUTHOR Tobias Oetiker Eoetiker@ee.ethz.chE diff --git a/doc/rrdgraph.pod b/doc/rrdgraph.pod index 2e3a4f6..8e8038e 100644 --- a/doc/rrdgraph.pod +++ b/doc/rrdgraph.pod @@ -41,6 +41,7 @@ S<[BI