src/utils_vl_lookup.[ch]: Support selecting values by regex.

diff --git a/src/collectd.conf.pod b/src/collectd.conf.pod
index 7d287cc..1c8b7a4 100644
--- a/src/collectd.conf.pod
+++ b/src/collectd.conf.pod
@@ -7,11 +7,15 @@ collectd.conf - Configuration for the system statistics collection daemon B<coll
    BaseDir "/path/to/data/"
    PIDFile "/path/to/pidfile/collectd.pid"
    Server  "123.123.123.123" 12345
-
+  
    LoadPlugin cpu
    LoadPlugin load
+  
+  <LoadPlugin df>
+    Interval 3600
+  </LoadPlugin>
+  
    LoadPlugin ping
-
    <Plugin ping>
      Host "example.org"
      Host "provider.net"
@@ -25,22 +29,32 @@ controls which plugins to load. These plugins ultimately define collectd's
  behavior.
  
  The syntax of this config file is similar to the config file of the famous
-B<Apache Webserver>. Each line contains either a key-value-pair or a
-section-start or -end. Empty lines and everything after the hash-symbol `#' is
-ignored. Values are either string, enclosed in double-quotes,
-(floating-point-)numbers or a boolean expression, i.E<nbsp>e. either B<true> or
-B<false>. String containing of only alphanumeric characters and underscores do
-not need to be quoted. Lines may be wrapped by using `\' as the last character
-before the newline. This allows long lines to be split into multiple lines.
-Quoted strings may be wrapped as well. However, those are treated special in
-that whitespace at the beginning of the following lines will be ignored, which
-allows for nicely indenting the wrapped lines.
-
-The configuration is read and processed in order, i.E<nbsp>e. from top to
-bottom. So the plugins are loaded in the order listed in this config file. It
-is a good idea to load any logging plugins first in order to catch messages
-from plugins during configuration. Also, the C<LoadPlugin> option B<must> occur
-B<before> the C<E<lt>Plugin ...E<gt>> block.
+I<Apache> webserver. Each line contains either an option (a key and a list of
+one or more values) or a section-start or -end. Empty lines and everything
+after a non-quoted hash-symbol (C<#>) are ignored. I<Keys> are unquoted
+strings, consisting only of alphanumeric characters and the underscore (C<_>)
+character. Keys are handled case-insensitively by I<collectd> itself and all
+plugins included with it. I<Values> can either be an I<unquoted string>, a
+I<quoted string> (enclosed in double-quotes), a I<number> or a I<boolean>
+expression. I<Unquoted strings> consist of only alphanumeric characters and
+underscores (C<_>) and do not need to be quoted. I<Quoted strings> are
+enclosed in double quotes (C<">). You can use the backslash character (C<\>)
+to include double quotes as part of the string. I<Numbers> can be specified in
+decimal and floating point format (using a dot C<.> as decimal separator),
+hexadecimal when using the C<0x> prefix and octal with a leading zero (C<0>).
+I<Boolean> values are either B<true> or B<false>.
+
+Lines may be wrapped by using C<\> as the last character before the newline.
+This allows long lines to be split into multiple lines. Quoted strings may be
+wrapped as well. However, those are treated specially in that whitespace at the
+beginning of the following lines will be ignored, which allows for nicely
+indenting the wrapped lines.
+
+The configuration is read and processed in order, i.e. from top to bottom. So
+the plugins are loaded in the order listed in this config file. It is a good
+idea to load any logging plugins first in order to catch messages from plugins
+during configuration. Also, the C<LoadPlugin> option B<must> occur B<before>
+the appropriate C<E<lt>Plugin ...E<gt>> block.
  
  =head1 GLOBAL OPTIONS
  
@@ -63,6 +77,7 @@ options are allowed inside a B<LoadPlugin> block:
  
    <LoadPlugin perl>
      Globals true
+    Interval 10
    </LoadPlugin>
  
  =over 4
@@ -86,6 +101,12 @@ By default, this is disabled. As a special exception, if the plugin name is
  either C<perl> or C<python>, the default is changed to enabled in order to keep
  the average user from ever having to deal with this low level linking stuff.
  
+=item B<Interval> I<Seconds>
+
+Sets a plugin-specific interval for collecting metrics. This overrides the
+global B<Interval> setting. If a plugin provides its own support for specifying an
+interval, that setting will take precedence.
+
  =back
  
  =item B<Include> I<Path>
@@ -183,12 +204,128 @@ C<Plugin>-Section. Which options exist depends on the plugin used. Some plugins
  require external configuration, too. The C<apache plugin>, for example,
  required C<mod_status> to be configured in the webserver you're going to
  collect data from. These plugins are listed below as well, even if they don't
-require any configuration within collectd's configfile.
+require any configuration within collectd's configuration file.
  
  A list of all plugins and a short summary for each plugin can be found in the
  F<README> file shipped with the sourcecode and hopefully binary packets as
  well.
  
+=head2 Plugin C<aggregation>
+
+The I<Aggregation plugin> makes it possible to aggregate several values into
+one using aggregation functions such as I<sum>, I<average>, I<min> and I<max>.
+This can be put to a wide variety of uses, e.g. average and total CPU
+statistics for your entire fleet.
+
+The grouping is powerful but, as with many powerful tools, may be a bit
+difficult to wrap your head around. The grouping will therefore be
+demonstrated using an example: The average and sum of the CPU usage across
+all CPUs of each host is to be calculated.
+
+To select all the affected values for our example, set C<Plugin cpu> and
+C<Type cpu>. The other values are left unspecified, meaning "all values". The
+I<Host>, I<Plugin>, I<PluginInstance>, I<Type> and I<TypeInstance> options
+work as if they were specified in the C<WHERE> clause of a C<SELECT> SQL
+statement.
+
+  Plugin "cpu"
+  Type "cpu"
+
+Although the I<Host>, I<PluginInstance> (CPU number, i.e. 0, 1, 2, ...) and
+I<TypeInstance> (idle, user, system, ...) fields are left unspecified in the
+example, the intention is to have a new value for each host / type instance
+pair. This is achieved by "grouping" the values using the C<GroupBy> option.
+It can be specified multiple times to group by more than one field.
+
+  GroupBy "Host"
+  GroupBy "TypeInstance"
+
+We neither specify nor group by I<plugin instance> (the CPU number), so all
+metrics that differ in the CPU number only will be aggregated. Each
+aggregation needs I<at least one> such field, otherwise no aggregation would
+take place.
+
+The full example configuration looks like this:
+
+ <Plugin "aggregation">
+   <Aggregation>
+     Plugin "cpu"
+     Type "cpu"
+     
+     GroupBy "Host"
+     GroupBy "TypeInstance"
+     
+     CalculateSum true
+     CalculateAverage true
+   </Aggregation>
+ </Plugin>
+
+There are a couple of limitations you should be aware of:
+
+=over 4
+
+=item
+
+The I<Type> cannot be left unspecified, because it is not reasonable to add
+apples to oranges. Also, the internal lookup structure won't work if you try
+to group by type.
+
+=item
+
+There must be at least one unspecified, ungrouped field. Otherwise nothing
+will be aggregated.
+
+=back
+
+As you can see in the example above, each aggregation has its own
+B<Aggregation> block. You can have multiple aggregation blocks and aggregation
+blocks may match the same values, i.e. one value list can update multiple
+aggregations. The following options are valid inside B<Aggregation> blocks:
+
+=over 4
+
+=item B<Host> I<Host>
+
+=item B<Plugin> I<Plugin>
+
+=item B<PluginInstance> I<PluginInstance>
+
+=item B<Type> I<Type>
+
+=item B<TypeInstance> I<TypeInstance>
+
+Selects the value lists to be added to this aggregation. B<Type> must be a
+valid data set name, see L<types.db(5)> for details.
+
+If the string starts with and ends with a slash (C</>), the string is
+interpreted as a I<regular expression>. The regex flavor used is POSIX
+extended regular expressions as described in L<regex(7)>. Example usage:
+
+ Host "/^db[0-9]\\.example\\.com$/"
+
+=item B<GroupBy> B<Host>|B<Plugin>|B<PluginInstance>|B<TypeInstance>
+
+Group values by the specified field. The B<GroupBy> option may be repeated to
+group by multiple fields.
+
+=item B<CalculateNum> B<true>|B<false>
+
+=item B<CalculateSum> B<true>|B<false>
+
+=item B<CalculateAverage> B<true>|B<false>
+
+=item B<CalculateMinimum> B<true>|B<false>
+
+=item B<CalculateMaximum> B<true>|B<false>
+
+=item B<CalculateStddev> B<true>|B<false>
+
+Boolean options for enabling calculation of the number of value lists, their
+sum, average, minimum, maximum andE<nbsp>/ or standard deviation. All options
+are disabled by default.
+
+=back
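+
+As an illustrative sketch of the regular expression support (the host name
+pattern is a made-up example, not a recommendation), the following block
+should average the CPU usage over all CPUs of all hosts matching the pattern,
+grouped by type instance only:
+
+ <Plugin "aggregation">
+   <Aggregation>
+     Host "/^db[0-9]\\.example\\.com$/"
+     Plugin "cpu"
+     Type "cpu"
+     GroupBy "TypeInstance"
+     CalculateAverage true
+   </Aggregation>
+ </Plugin>
+
+Here I<PluginInstance> is neither specified nor grouped by, which satisfies
+the requirement of at least one unspecified, ungrouped field.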
+
  =head2 Plugin C<amqp>
  
  The I<AMQMP plugin> can be used to communicate with other instances of
@@ -210,6 +347,8 @@ possibly filtering or messages.
   #   Persistent false
   #   Format "command"
   #   StoreRates false
+ #   GraphitePrefix "collectd."
+ #   GraphiteEscapeChar "_"
     </Publish>
     
     # Receive values from an AMQP broker
@@ -310,6 +449,10 @@ If set to B<JSON>, the values are encoded in the I<JavaScript Object Notation>,
  an easy and straight forward exchange format. The C<Content-Type> header field
  will be set to C<application/json>.
  
+If set to B<Graphite>, values are encoded in the I<Graphite> format, which is
+"<metric> <value> <timestamp>\n". The C<Content-Type> header field will be set to
+C<text/graphite>.
+
  A subscribing client I<should> use the C<Content-Type> header field to
  determine how to decode the values. Currently, the I<AMQP plugin> itself can
  only decode the B<Command> format.
@@ -324,6 +467,25 @@ using the internal value cache.
  Please note that currently this option is only used if the B<Format> option has
  been set to B<JSON>.
  
+=item B<GraphitePrefix> (Publish and B<Format>=I<Graphite> only)
+
+A prefix can be added to the metric name when outputting in the I<Graphite> format.
+It's added before the I<Host> name.
+The metric name will be "<prefix><host><postfix><plugin><type><name>".
+
+=item B<GraphitePostfix> (Publish and B<Format>=I<Graphite> only)
+
+A postfix can be added to the metric name when outputting in the I<Graphite> format.
+It's added after the I<Host> name.
+The metric name will be "<prefix><host><postfix><plugin><type><name>".
+
+=item B<GraphiteEscapeChar> (Publish and B<Format>=I<Graphite> only)
+
+Specify a character to replace dots (.) in the host part of the metric name.
+In I<Graphite> metric names, dots are used as separators between different
+metric parts (host, plugin, type).
+Default is "_" (I<Underscore>).
+
  =back
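+
+A minimal sketch of a B<Publish> block using the I<Graphite> format. The
+block name C<graphite> is a placeholder assumption, and connection settings
+are omitted:
+
+ <Plugin "amqp">
+   <Publish "graphite">
+     Format "Graphite"
+     GraphitePrefix "collectd."
+     GraphiteEscapeChar "_"
+   </Publish>
+ </Plugin>
+
+With these settings, dots in the host part of the metric name are replaced by
+underscores and the resulting metric names start with C<collectd.>.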
  
  =head2 Plugin C<apache>
@@ -1939,6 +2101,17 @@ The C<memcached plugin> connects to a memcached server and queries statistics
  about cache utilization, memory and bandwidth used.
  L<http://www.danga.com/memcached/>
  
+ <Plugin "memcached">
+   <Instance "name">
+     Host "memcache.example.com"
+     Port 11211
+   </Instance>
+ </Plugin>
+
+The plugin configuration consists of one or more B<Instance> blocks which
+specify one I<memcached> connection each. Within the B<Instance> blocks, the
+following options are allowed:
+
  =over 4
  
  =item B<Host> I<Hostname>
@@ -1949,6 +2122,11 @@ Hostname to connect to. Defaults to B<127.0.0.1>.
  
  TCP-Port to connect to. Defaults to B<11211>.
  
+=item B<Socket> I<Path>
+
+Connect to I<memcached> using the UNIX domain socket at I<Path>. If this
+setting is given, the B<Host> and B<Port> settings are ignored.
+
  =back
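+
+For example, to query a I<memcached> instance via a UNIX domain socket (the
+instance name and the socket path below are placeholders, not defaults):
+
+ <Plugin "memcached">
+   <Instance "local">
+     Socket "/var/run/memcached.sock"
+   </Instance>
+ </Plugin>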
  
  =head2 Plugin C<modbus>
@@ -3043,6 +3221,16 @@ IP-address may be used in a filename it is recommended to disable reverse
  lookups. The default is to do reverse lookups to preserve backwards
  compatibility, though.
  
+=item B<IncludeUnitID> B<true>|B<false>
+
+When a peer is a refclock, include the unit ID in the I<type instance>.
+Defaults to B<false> for backward compatibility.
+
+If two refclock peers use the same driver and this is B<false>, the plugin will
+try to write simultaneous measurements from both to the same type instance.
+This will result in error messages in the log and only one set of measurements
+making it through.
+
  =back
  
  =head2 Plugin C<nut>
@@ -3487,7 +3675,8 @@ L<http://www.postgresql.org/docs/manuals/>.
      </Query>
  
      <Writer sqlstore>
-      Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8);"
+      Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8, $9);"
+      StoreRates true
      </Writer>
  
      <Database foo>
@@ -3510,6 +3699,7 @@ L<http://www.postgresql.org/docs/manuals/>.
      <Database qux>
        # ...
        Writer sqlstore
+      CommitInterval 10
      </Database>
    </Plugin>
  
@@ -3559,6 +3749,11 @@ used, the parameter expands to "localhost".
  
  The name of the database of the current connection.
  
+=item I<instance>
+
+The name of the database plugin instance. See the B<Instance> option of the
+database specification below for details.
+
  =item I<username>
  
  The username used to connect to the database.
@@ -3672,6 +3867,23 @@ This query collects the on-disk size of the database in bytes.
  
  =back
  
+In addition, the following detailed queries are available by default. Please
+note that each of those queries collects information B<by table>, thus,
+potentially producing B<a lot> of data. For details see the description of the
+non-by_table queries above.
+
+=over 4
+
+=item B<queries_by_table>
+
+=item B<query_plans_by_table>
+
+=item B<table_states_by_table>
+
+=item B<disk_io_by_table>
+
+=back
+
  The B<Writer> block defines a PostgreSQL writer backend. It accepts a single
  mandatory argument specifying the name of the writer. This will then be used
  in the B<Database> specification in order to activate the writer instance. The
@@ -3686,8 +3898,8 @@ This mandatory option specifies the SQL statement that will be executed for
  each submitted value. A single SQL statement is allowed only. Anything after
  the first semicolon will be ignored.
  
-Eight parameters will be passed to the statement and should be specified as
-tokens B<$1>, B<$2>, through B<$8> in the statement string. The following
+Nine parameters will be passed to the statement and should be specified as
+tokens B<$1>, B<$2>, through B<$9> in the statement string. The following
  values are made available through those parameters:
  
  =over 4
@@ -3725,6 +3937,13 @@ sources of the submitted value-list).
  
  =item B<$8>
  
+An array of types for the submitted values (i.E<nbsp>e., the type of the data
+sources of the submitted value-list; C<counter>, C<gauge>, ...). Note that if
+B<StoreRates> is enabled (which is the default, see below), all types will be
+C<gauge>.
+
+=item B<$9>
+
  An array of the submitted values. The dimensions of the value name and value
  arrays match.
  
@@ -3735,6 +3954,12 @@ PostgreSQL database for this purpose. Any procedural language supported by
  PostgreSQL will do (see chapter "Server Programming" in the PostgreSQL manual
  for details).
  
+=item B<StoreRates> B<false>|B<true>
+
+If set to B<true> (the default), convert counter values to rates. If set to
+B<false>, counter values are stored as is, i.E<nbsp>e. as an increasing integer
+number.
+
  =back
  
  The B<Database> block defines one PostgreSQL database for which to collect
@@ -3752,6 +3977,17 @@ for details.
  Specify the interval with which the database should be queried. The default is
  to use the global B<Interval> setting.
  
+=item B<CommitInterval> I<seconds>
+
+This option may be used for database connections which have "writers" assigned
+(see above). If specified, it causes a writer to put several updates into a
+single transaction. This transaction will last for the specified amount of
+time. By default, each update will be executed in a separate transaction. Each
+transaction generates a fair amount of overhead which can, thus, be reduced by
+activating this option. The drawback is that data covering the specified
+amount of time will be lost, for example, if a single statement within the
+transaction fails or if the database server crashes.
+
  =item B<Host> I<hostname>
  
  Specify the hostname or IP of the PostgreSQL server to connect to. If the
@@ -3782,6 +4018,13 @@ Specify the password to be used when connecting to the server.
  Specify whether to use an SSL connection when contacting the server. The
  following modes are supported:
  
+=item B<Instance> I<name>
+
+Specify the plugin instance name that should be used instead of the database
+name (which is the default, if this option has not been specified). This
+allows querying multiple databases of the same name on the same host (e.g.
+when running multiple database server versions in parallel).
+
  =over 4
  
  =item I<disable>
@@ -3817,11 +4060,36 @@ B<PostgreSQL Documentation> for details.
  
  =item B<Query> I<query>
  
-Specify a I<query> which should be executed for the database connection. This
-may be any of the predefined or user-defined queries. If no such option is
-given, it defaults to "backends", "transactions", "queries", "query_plans",
-"table_states", "disk_io" and "disk_usage". Else, the specified queries are
-used only.
+Specifies a I<query> which should be executed in the context of the database
+connection. This may be any of the predefined or user-defined queries. If no
+such option is given, it defaults to "backends", "transactions", "queries",
+"query_plans", "table_states", "disk_io" and "disk_usage" (unless a B<Writer>
+has been specified). Otherwise, only the specified queries are used.
+
+=item B<Writer> I<writer>
+
+Assigns the specified I<writer> backend to the database connection. This
+causes all collected data to be sent to the database using the settings
+defined in the writer configuration (see the section "FILTER CONFIGURATION"
+below for details on how to selectively send data to certain plugins).
+
+Each writer will register a flush callback which may be used when having long
+transactions enabled (see the B<CommitInterval> option above). When issuing
+the B<FLUSH> command (see L<collectd-unixsock(5)> for details) the current
+transaction will be committed right away. Two different kinds of flush
+callbacks are available with the C<postgresql> plugin:
+
+=over 4
+
+=item B<postgresql>
+
+Flush all writer backends.
+
+=item B<postgresql->I<database>
+
+Flush all writers of the specified I<database> only.
+
+=back
  
  =back
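+
+As a sketch of how the flush callbacks of the B<Writer> option might be used,
+the following commands could be sent over the socket of the C<unixsock>
+plugin. The database name C<qux> is taken from the synopsis above; see
+L<collectd-unixsock(5)> for the authoritative command syntax.
+
+ FLUSH plugin=postgresql
+ FLUSH plugin=postgresql-qux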
  
@@ -4142,6 +4410,10 @@ The B<Port> option is the TCP port on which the Redis instance accepts
  connections. Either a service name of a port number may be given. Please note
  that numerical port numbers must be given as a string, too.
  
+=item B<Password> I<Password>
+
+Use I<Password> to authenticate when connecting to I<Redis>.
+
  =item B<Timeout> I<Timeout in miliseconds>
  
  The B<Timeout> option set the socket timeout for node response. Since the Redis
@@ -4199,6 +4471,50 @@ Enables or disables the creation of RRD files. If the daemon is not running
  locally, or B<DataDir> is set to a relative path, this will not work as
  expected. Default is B<true>.
  
+=item B<StepSize> I<Seconds>
+
+B<Force> the stepsize of newly created RRD-files. Ideally (and by default)
+this setting is unset and the stepsize is set to the interval in which the data
+is collected. Do not use this option unless you absolutely have to for some
+reason. Setting this option may cause problems with the C<snmp plugin>, the
+C<exec plugin> or when the daemon is set up to receive data from other hosts.
+
+=item B<HeartBeat> I<Seconds>
+
+B<Force> the heartbeat of newly created RRD-files. This setting should be unset
+in which case the heartbeat is set to twice the B<StepSize> which should equal
+the interval in which data is collected. Do not set this option unless you have
+a very good reason to do so.
+
+=item B<RRARows> I<NumRows>
+
+The C<rrdtool plugin> calculates the number of PDPs per CDP based on the
+B<StepSize>, this setting and a timespan. This plugin creates RRD-files with
+three times five RRAs, i.e. five RRAs for each of the CFs B<MIN>, B<AVERAGE>, and
+B<MAX>. The five RRAs are optimized for graphs covering one hour, one day, one
+week, one month, and one year.
+
+So for each timespan, it calculates how many PDPs need to be consolidated into
+one CDP by calculating:
+  number of PDPs = timespan / (stepsize * rrarows)
+
+The bottom line is: set this no smaller than the width of your graphs in pixels. The
+default is 1200.
+
+=item B<RRATimespan> I<Seconds>
+
+Adds an RRA-timespan, given in seconds. Use this option multiple times to have
+more than one RRA. If this option is never used, the built-in default of (3600,
+86400, 604800, 2678400, 31622400) is used.
+
+For more information on how RRA-sizes are calculated see B<RRARows> above.
+
+=item B<XFF> I<Factor>
+
+Set the "XFiles Factor". The default is 0.1. If unsure, don't set this option.
+I<Factor> must be in the range C<[0.0-1.0)>, i.e. between zero (inclusive) and
+one (exclusive).
+
  =back
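+
+To illustrate the B<RRARows> formula above with assumed values (a stepsize of
+10 seconds and the default of 1200 rows), the one-day timespan of 86400
+seconds gives:
+
+ number of PDPs = 86400 / (10 * 1200) = 7.2
+
+How the plugin rounds this value is not covered by this example.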
  
  =head2 Plugin C<rrdtool>
@@ -4256,6 +4572,8 @@ For more information on how RRA-sizes are calculated see B<RRARows> above.
  =item B<XFF> I<Factor>
  
  Set the "XFiles Factor". The default is 0.1. If unsure, don't set this option.
+I<Factor> must be in the range C<[0.0-1.0)>, i.e. between zero (inclusive) and
+one (exclusive).
  
  =item B<CacheFlush> I<Seconds>
  
@@ -4371,6 +4689,11 @@ and available space of each device will be reported separately.
  This option is only available if the I<Swap plugin> can read C</proc/swaps>
  (under Linux) or use the L<swapctl(2)> mechanism (under I<Solaris>).
  
+=item B<ReportBytes> B<false>|B<true>
+
+When enabled, the I<swap I/O> is reported in bytes. When disabled, the default,
+I<swap I/O> is reported in pages. This option is available under Linux only.
+
  =back
  
  =head2 Plugin C<syslog>
@@ -5001,7 +5324,7 @@ number.
  If set to B<true>, the plugin instance and type instance will be in their own
  path component, for example C<host.cpu.0.cpu.idle>. If set to B<false> (the
  default), the plugin and plugin instance (and likewise the type and type
-instance) are put into once component, for example C<host.cpu-0.cpu-idle>.
+instance) are put into one component, for example C<host.cpu-0.cpu-idle>.
  
  =item B<AlwaysAppendDS> B<false>|B<true>