From: Florian Forster Date: Wed, 21 Sep 2016 06:48:56 +0000 (+0200) Subject: Merge branch 'pr/1649' X-Git-Tag: collectd-5.7.0~73 X-Git-Url: https://git.octo.it/?p=collectd.git;a=commitdiff_plain;h=e1bfa71aca1f37c2f293dc9adb44065c6e7a9ad9;hp=4eee46d61a42706a4c9ccc4d4af9c6df30708216 Merge branch 'pr/1649' --- diff --git a/README b/README index e8f0ca1e..9afbec60 100644 --- a/README +++ b/README @@ -96,6 +96,9 @@ Features DNS traffic: Query types, response codes, opcodes and traffic/octets transferred. + - dpdk + Collect DPDK interface statistics. + - drbd Collect individual drbd resource statistics. @@ -1017,6 +1020,141 @@ Crosscompiling * `endianflip' (12345678 -> 87654321) * `intswap' (12345678 -> 56781234) +Configuring with DPDK +--------------------- + +Note: DPDK 16.04 is the minimum version and currently supported version of DPDK +required for the dpdkstat plugin. This is to allow the plugin to take advantage +of functions added to detect if the DPDK primary process is alive. + +Note: For Ubuntu, GCC 4.9 is the minimum version required to build collectd +with DPDK. Ubuntu 14.04, for example, has GCC 4.8 by default and will require +an upgrade: + $ sudo add-apt-repository ppa:ubuntu-toolchain-r/test + $ sudo apt-get update + $ sudo apt-get install gcc-4.9 +Alternatively, if you know that the platform that you wish to run collectd +on supports the SSSE3 instruction set, you can run make as follows: + $ make -j CFLAGS+='-mssse3' + +Build DPDK for use with collectd: + To compile DPDK for use with collectd dpdkstat start by: + - Clone DPDK: + $ git clone git://dpdk.org/dpdk + - Checkout the system requirements at + http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html and make sure you have + the required tools and hugepage setup as specified there. + NOTE: It's recommended to use the 1GB hugepage setup for best performance, + please follow the instruction for "Reserving Hugepages for DPDK Use" + in the link above. 
+ However if you plan on configuring 2MB hugepages on the fly please ensure + to add appropriate commands to reserve hugepages in a system startup script + if collectd is booted at system startup time. These commands include: + mkdir -p /mnt/huge + mount -t hugetlbfs nodev /mnt/huge + echo 64 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages + - To configure the DPDK build for the combined shared library modify + config/common_base in your DPDK as follows + # + # Compile to share library + # + -CONFIG_RTE_BUILD_SHARED_LIB=n + +CONFIG_RTE_BUILD_SHARED_LIB=y + - Prepare the configuration for the appropriate target as specified at: + http://dpdk.org/doc/guides/linux_gsg/build_dpdk.html. + For example: + $ make config T=x86_64-native-linuxapp-gcc + - Build the target: + $ make + - Install DPDK to /usr + $ sudo make install prefix=/usr + NOTE 1: You must run make install as the configuration of collectd with + DPDK expects DPDK to be installed somewhere. + NOTE 2: If you don't specify a prefix then DPDK will be installed in /usr/local/ + NOTE 3: If you are not root then use sudo to make install DPDK to the appropriate + location. + - Check that the DPDK library has been installed in /usr/lib or /lib + $ ls /usr/lib | grep dpdk + - Bind the interfaces to use with dpdkstat to DPDK: + DPDK devices can be setup with either the VFIO (for DPDK 1.7+) or UIO modules. + Note: UIO requires inserting an out of tree driver igb_uio.ko that is available + in DPDK. + UIO Setup: + - Insert uio.ko: + $ sudo modprobe uio + - Insert igb_uio.ko: + $ sudo insmod $DPDK_BUILD/kmod/igb_uio.ko + - Bind network device to igb_uio: + $ sudo $DPDK_DIR/tools/dpdk_nic_bind.py --bind=igb_uio eth1 + VFIO Setup: + - VFIO needs to be supported in the kernel and the BIOS. More information can be found + @ http://dpdk.org/doc/guides/linux_gsg/build_dpdk.html. 
+ - Insert the `vfio-pci.ko' module: modprobe vfio-pci + - Set the correct permissions for the vfio device: + $ sudo /usr/bin/chmod a+x /dev/vfio + $ sudo /usr/bin/chmod 0666 /dev/vfio/* + - Bind the network device to vfio-pci: + $ sudo $DPDK_DIR/tools/dpdk_nic_bind.py --bind=vfio-pci eth1 + NOTE: Please ensure to add appropriate commands to bind the network + interfaces to DPDK in a system startup script if collectd is + booted at system startup time. + - Run ldconfig to update the shared library cache. + + Build static DPDK library for use with collectd: + - To configure DPDK to build the combined static library libdpdk.a + ensure that CONFIG_RTE_BUILD_SHARED_LIB is set to n in + config/common_base in your DPDK as follows: + # + # Compile to share library + # + CONFIG_RTE_BUILD_SHARED_LIB=n + - Prepare the configuration for the appropriate target as specified at: + http://dpdk.org/doc/guides/linux_gsg/build_dpdk.html. + For example: + $ make config T=x86_64-native-linuxapp-gcc + - Build the target using -fPIC: + $ make EXTRA_CFLAGS=-fPIC -j + - Install DPDK to /usr + $ sudo make install prefix=/usr + +Configure collectd with DPDK: +NOTE: The Address-Space Layout Randomization (ASLR) security feature in Linux should + be disabled, in order for the same hugepage memory mappings to be present in all + DPDK multi-process applications. Note that this has security implications. + To disable ASLR: + $ echo 0 > /proc/sys/kernel/randomize_va_space + To fully enable ASLR: + $ echo 2 > /proc/sys/kernel/randomize_va_space + See http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html + + - Generate the build script as specified below. (i.e. run `build.sh'). + - Configure collectd with the DPDK shared library: + ./configure --with-libdpdk=/usr + NOTE: To configure collectd with the DPDK static library: + ./configure --with-libdpdk=/usr CFLAGS=" -lpthread -Wl,--whole-archive + -Wl,-ldpdk -Wl,-lm -Wl,-lrt -Wl,-lpcap -Wl,-ldl -Wl,--no-whole-archive" + + Libraries: + ... 
+ libdpdk . . . . . . . . yes + + Modules: + ... + dpdkstat . . . . . . .yes + + + - Make sure that dpdk and dpdkstat are enabled in the configuration log: + + - Build collectd: + $ make -j && make -j install. + NOTE: As mentioned above, if you are building on Ubuntu 14.04 with GCC <= 4.8.X, + you need to use: + $ make -j CFLAGS+='-mssse3' && make -j install + +Usage of dpdkstat: + - The same PCI device configuration should be passed to the primary process + as the secondary process uses the same port indexes as the primary. + NOTE: A blacklist/whitelist of NICs isn't supported yet. Contact ------- diff --git a/configure.ac b/configure.ac index 929205bc..31c842af 100644 --- a/configure.ac +++ b/configure.ac @@ -2531,6 +2531,81 @@ then fi # }}} +# --with-libdpdk {{{ +AC_ARG_WITH(libdpdk, [AS_HELP_STRING([--with-libdpdk@<:@=PREFIX@:>@], [Path to the DPDK build directory.])], +[ + if test "x$withval" != "xno" && test "x$withval" != "xyes" + then + RTE_BUILD="$withval" + with_libdpdk="yes" + else + RTE_BUILD="/usr" + with_libdpdk="$withval" + fi + DPDK_INCLUDE="$RTE_BUILD/include" + DPDK_LIB_DIR="$RTE_BUILD/lib" + FOUND_DPDK=yes +], [with_libdpdk="no"]) + +if test "x$with_libdpdk" = "xyes" +then + LOCAL_DPDK_INSTALL="no" + AC_CHECK_HEADER([$DPDK_INCLUDE/rte_config.h], [LOCAL_DPDK_INSTALL=yes], + [AC_CHECK_HEADER([$DPDK_INCLUDE/dpdk/rte_config.h], + [], + [FOUND_DPDK=no], [])], []) + + if test "x$LOCAL_DPDK_INSTALL" = "xno" + then + DPDK_INCLUDE=$DPDK_INCLUDE/dpdk + fi + + if test "x$FOUND_DPDK" = "xno" + then + AC_MSG_ERROR([libdpdk error: rte_config.h not found]) + fi +fi + +if test "x$with_libdpdk" = "xyes" +then + SAVE_LDFLAGS="$LDFLAGS" + + if test "x$LOCAL_DPDK_INSTALL" != "xyes" + then + LDFLAGS="$LDFLAGS -L$DPDK_LIB_DIR" + fi + + AC_CHECK_LIB(dpdk, rte_eal_init, + [BUILD_WITH_DPDK_LIBS="-Wl,-ldpdk"], + [FOUND_DPDK=no]) + + LDFLAGS="$SAVE_LDFLAGS" + if test "x$FOUND_DPDK" = "xno" + then + AC_MSG_ERROR([libdpdk error: cannot link with dpdk in $DPDK_LIB_DIR]) + fi 
+fi + +# +# Note: An issue on Ubuntu 14.04 necessitates the use of -Wl,--no-as-needed: +# If you try to compile with the older linker, the dpdk symbols will be undefined. +# This workaround should be removed when no longer necessary. +# +if test "x$with_libdpdk" = "xyes" +then + BUILD_WITH_DPDK_CFLAGS="$BUILD_WITH_DPDK_CFLAGS -I$DPDK_INCLUDE" + if test "x$LOCAL_DPDK_INSTALL" != "xyes" + then + BUILD_WITH_DPDK_LDFLAGS="-Wl,--no-as-needed" + else + BUILD_WITH_DPDK_LDFLAGS="-L$DPDK_LIB_DIR -Wl,--no-as-needed" + fi + AC_SUBST(BUILD_WITH_DPDK_CFLAGS) + AC_SUBST(BUILD_WITH_DPDK_LDFLAGS) + AC_SUBST(BUILD_WITH_DPDK_LIBS) +fi +# }}} + # --with-java {{{ with_java_home="$JAVA_HOME" if test "x$with_java_home" = "x" @@ -5703,6 +5778,7 @@ plugin_curl_xml="no" plugin_df="no" plugin_disk="no" plugin_drbd="no" +plugin_dpdk="no" plugin_entropy="no" plugin_ethstat="no" plugin_fhcount="no" @@ -6153,6 +6229,7 @@ AC_PLUGIN([dbi], [$with_libdbi], [General database st AC_PLUGIN([df], [$plugin_df], [Filesystem usage statistics]) AC_PLUGIN([disk], [$plugin_disk], [Disk usage statistics]) AC_PLUGIN([dns], [$with_libpcap], [DNS traffic analysis]) +AC_PLUGIN([dpdkstat], [$with_libdpdk], [Stats & Status from DPDK]) AC_PLUGIN([drbd], [$plugin_drbd], [DRBD statistics]) AC_PLUGIN([email], [yes], [EMail statistics]) AC_PLUGIN([entropy], [$plugin_entropy], [Entropy statistics]) @@ -6493,6 +6570,7 @@ AC_MSG_RESULT([ libaquaero5 . . . . . $with_libaquaero5]) AC_MSG_RESULT([ libatasmart . . . . . $with_libatasmart]) AC_MSG_RESULT([ libcurl . . . . . . . $with_libcurl]) AC_MSG_RESULT([ libdbi . . . . . . . $with_libdbi]) +AC_MSG_RESULT([ libdpdk . . . . . . . $with_libdpdk]) AC_MSG_RESULT([ libesmtp . . . . . . $with_libesmtp]) AC_MSG_RESULT([ libganglia . . . . . $with_libganglia]) AC_MSG_RESULT([ libgcrypt . . . . . . $with_libgcrypt]) @@ -6584,6 +6662,7 @@ AC_MSG_RESULT([ dbi . . . . . . . . . $enable_dbi]) AC_MSG_RESULT([ df . . . . . . . . . $enable_df]) AC_MSG_RESULT([ disk . . . . . . . . 
$enable_disk]) AC_MSG_RESULT([ dns . . . . . . . . . $enable_dns]) +AC_MSG_RESULT([ dpdkstat . . . . . . .$enable_dpdkstat]) AC_MSG_RESULT([ drbd . . . . . . . . $enable_drbd]) AC_MSG_RESULT([ email . . . . . . . . $enable_email]) AC_MSG_RESULT([ entropy . . . . . . . $enable_entropy]) diff --git a/src/Makefile.am b/src/Makefile.am index 3477dc24..99a7c024 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -390,6 +390,14 @@ dns_la_LDFLAGS = $(PLUGIN_LDFLAGS) dns_la_LIBADD = -lpcap endif +if BUILD_PLUGIN_DPDKSTAT +pkglib_LTLIBRARIES += dpdkstat.la +dpdkstat_la_SOURCES = dpdkstat.c +dpdkstat_la_CFLAGS = $(AM_CFLAGS) $(BUILD_WITH_DPDK_CFLAGS) +dpdkstat_la_LDFLAGS = $(PLUGIN_LDFLAGS) $(BUILD_WITH_DPDK_LDFLAGS) +dpdkstat_la_LIBADD = $(BUILD_WITH_DPDK_LIBS) +endif + if BUILD_PLUGIN_DRBD pkglib_LTLIBRARIES += drbd.la drbd_la_SOURCES = drbd.c diff --git a/src/collectd.conf.in b/src/collectd.conf.in index 345af3d5..04950674 100644 --- a/src/collectd.conf.in +++ b/src/collectd.conf.in @@ -114,6 +114,7 @@ #@BUILD_PLUGIN_DF_TRUE@LoadPlugin df #@BUILD_PLUGIN_DISK_TRUE@LoadPlugin disk #@BUILD_PLUGIN_DNS_TRUE@LoadPlugin dns +#@BUILD_PLUGIN_DPDKSTAT_TRUE@LoadPlugin dpdkstat #@BUILD_PLUGIN_DRBD_TRUE@LoadPlugin drbd #@BUILD_PLUGIN_EMAIL_TRUE@LoadPlugin email #@BUILD_PLUGIN_ENTROPY_TRUE@LoadPlugin entropy @@ -520,6 +521,16 @@ # SelectNumericQueryTypes true # +# +# Interval 1 +# Coremask "0xf" +# ProcessType "secondary" +# FilePrefix "rte" +# EnabledPortMask 0xffff +# PortName "interface1" +# PortName "interface2" +# + # # SocketFile "@localstatedir@/run/@PACKAGE_NAME@-email" # SocketGroup "collectd" diff --git a/src/collectd.conf.pod b/src/collectd.conf.pod index 9d4b7918..00cd781a 100644 --- a/src/collectd.conf.pod +++ b/src/collectd.conf.pod @@ -2376,6 +2376,67 @@ Enabled by default, collects unknown (and thus presented as numeric only) query =back +=head2 Plugin C + +The I collects information about DPDK interfaces using the +extended NIC stats API in DPDK. 
+ +B + + + Coremask "0x4" + MemoryChannels "4" + ProcessType "secondary" + FilePrefix "rte" + EnabledPortMask 0xffff + PortName "interface1" + PortName "interface2" + + +B + +=over 4 + +=item B I + +A string containing a hexadecimal bit mask of the cores to run on. Note that +core numbering can change between platforms and should be determined beforehand. + +=item B I + +A string containing the number of memory channels per processor socket. + +=item B I + +A string containing the type of DPDK process instance. + +=item B I + +The prefix text used for hugepage filenames. The filename will be set to +/var/run/.<prefix>_config where prefix is what is passed in by the user. + +=item B I + +A string containing the amount of memory, in MB, to allocate from hugepages on +specific sockets. + +=item B I + +A hexadecimal bit mask of the DPDK ports which should be enabled. A mask +of 0x0 means that all ports will be disabled. A bitmask of all Fs means +that all ports will be enabled. This is an optional argument - default +is all ports enabled. + +=item B I + +A string containing an optional name for the enabled DPDK ports. Each PortName +option should contain only one port name; specify as many PortName options as +desired. Default naming convention will be used if PortName is blank. If there +are fewer PortName options than there are enabled ports, the default naming +convention will be used for the additional ports. + +=back + =head2 Plugin C =over 4 diff --git a/src/dpdkstat.c b/src/dpdkstat.c new file mode 100644 index 00000000..58425437 --- /dev/null +++ b/src/dpdkstat.c @@ -0,0 +1,783 @@ +/*- + * collectd - src/dpdkstat.c + * MIT License + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * Authors: + * Maryam Tahhan + * Harry van Haaren + */ + +#include "collectd.h" + +#include "common.h" /* auxiliary functions */ +#include "plugin.h" /* plugin_register_*, plugin_dispatch_values */ +#include "utils_time.h" + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DPDK_DEFAULT_RTE_CONFIG "/var/run/.rte_config" +#define DPDK_MAX_ARGC 8 +#define DPDKSTAT_MAX_BUFFER_SIZE (4096 * 4) +#define DPDK_SHM_NAME "dpdk_collectd_stats_shm" +#define ERR_BUF_SIZE 1024 +#define REINIT_SHM 1 +#define RESET 1 +#define NO_RESET 0 + +enum DPDK_HELPER_ACTION { + DPDK_HELPER_ACTION_COUNT_STATS, + DPDK_HELPER_ACTION_SEND_STATS, +}; + +enum DPDK_HELPER_STATUS { + DPDK_HELPER_NOT_INITIALIZED = 0, + DPDK_HELPER_WAITING_ON_PRIMARY, + DPDK_HELPER_INITIALIZING_EAL, + DPDK_HELPER_ALIVE_SENDING_STATS, + DPDK_HELPER_GRACEFUL_QUIT, +}; + +struct dpdk_config_s { + /* General DPDK params */ + char coremask[DATA_MAX_NAME_LEN]; + char memory_channels[DATA_MAX_NAME_LEN]; + char socket_memory[DATA_MAX_NAME_LEN]; + char process_type[DATA_MAX_NAME_LEN]; + char file_prefix[DATA_MAX_NAME_LEN]; + cdtime_t interval; + uint32_t eal_initialized; + uint32_t enabled_port_mask; + char port_name[RTE_MAX_ETHPORTS][DATA_MAX_NAME_LEN]; + uint32_t eal_argc; + /* Helper info */ + int collectd_reinit_shm; + pid_t helper_pid; + sem_t sema_helper_get_stats; + sem_t sema_stats_in_shm; + int helper_pipes[2]; + enum DPDK_HELPER_STATUS helper_status; + enum DPDK_HELPER_ACTION helper_action; + /* xstats info */ + uint32_t num_ports; + uint32_t num_xstats; + cdtime_t port_read_time[RTE_MAX_ETHPORTS]; + uint32_t num_stats_in_port[RTE_MAX_ETHPORTS]; + struct rte_eth_link link_status[RTE_MAX_ETHPORTS]; + struct rte_eth_xstats *xstats; + /* rte_eth_xstats from here on until the end of the SHM */ +}; +typedef struct dpdk_config_s 
dpdk_config_t; + +static int g_configured; +static dpdk_config_t *g_configuration; + +static void dpdk_config_init_default(void); +static int dpdk_config(oconfig_item_t *ci); +static int dpdk_helper_init_eal(void); +static int dpdk_helper_run(void); +static int dpdk_helper_spawn(enum DPDK_HELPER_ACTION action); +static int dpdk_init(void); +static int dpdk_read(user_data_t *ud); +static int dpdk_shm_cleanup(void); +static int dpdk_shm_init(size_t size); + +/* Write the default configuration to the g_configuration instances */ +static void dpdk_config_init_default(void) { + g_configuration->interval = plugin_get_interval(); + if (g_configuration->interval == cf_get_default_interval()) + WARNING("dpdkstat: No time interval was configured, default value %lu ms " + "is set", + CDTIME_T_TO_MS(g_configuration->interval)); + /* Default is all ports enabled */ + g_configuration->enabled_port_mask = ~0; + g_configuration->eal_argc = DPDK_MAX_ARGC; + g_configuration->eal_initialized = 0; + ssnprintf(g_configuration->coremask, DATA_MAX_NAME_LEN, "%s", "0xf"); + ssnprintf(g_configuration->memory_channels, DATA_MAX_NAME_LEN, "%s", "1"); + ssnprintf(g_configuration->process_type, DATA_MAX_NAME_LEN, "%s", + "secondary"); + ssnprintf(g_configuration->file_prefix, DATA_MAX_NAME_LEN, "%s", + DPDK_DEFAULT_RTE_CONFIG); + + for (int i = 0; i < RTE_MAX_ETHPORTS; i++) + g_configuration->port_name[i][0] = 0; +} + +static int dpdk_config(oconfig_item_t *ci) { + int port_counter = 0; + char errbuf[ERR_BUF_SIZE]; + /* Allocate g_configuration and + * initialize a POSIX SHared Memory (SHM) object. 
+ */ + int err = dpdk_shm_init(sizeof(dpdk_config_t)); + if (err) { + DEBUG("dpdkstat: error in shm_init, %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + return -1; + } + + /* Set defaults for config, overwritten by loop if config item exists */ + dpdk_config_init_default(); + + for (int i = 0; i < ci->children_num; i++) { + oconfig_item_t *child = ci->children + i; + + if (strcasecmp("Coremask", child->key) == 0) { + cf_util_get_string_buffer(child, g_configuration->coremask, + sizeof(g_configuration->coremask)); + DEBUG("dpdkstat:COREMASK %s ", g_configuration->coremask); + } else if (strcasecmp("MemoryChannels", child->key) == 0) { + cf_util_get_string_buffer(child, g_configuration->memory_channels, + sizeof(g_configuration->memory_channels)); + DEBUG("dpdkstat:Memory Channels %s ", g_configuration->memory_channels); + } else if (strcasecmp("SocketMemory", child->key) == 0) { + cf_util_get_string_buffer(child, g_configuration->socket_memory, + sizeof(g_configuration->socket_memory)); + DEBUG("dpdkstat: socket mem %s ", g_configuration->socket_memory); + } else if (strcasecmp("ProcessType", child->key) == 0) { + cf_util_get_string_buffer(child, g_configuration->process_type, + sizeof(g_configuration->process_type)); + DEBUG("dpdkstat: proc type %s ", g_configuration->process_type); + } else if ((strcasecmp("FilePrefix", child->key) == 0) && + (child->values[0].type == OCONFIG_TYPE_STRING)) { + ssnprintf(g_configuration->file_prefix, DATA_MAX_NAME_LEN, + "/var/run/.%s_config", child->values[0].value.string); + DEBUG("dpdkstat: file prefix %s ", g_configuration->file_prefix); + } else if ((strcasecmp("EnabledPortMask", child->key) == 0) && + (child->values[0].type == OCONFIG_TYPE_NUMBER)) { + g_configuration->enabled_port_mask = + (uint32_t)child->values[0].value.number; + DEBUG("dpdkstat: Enabled Port Mask %u", + g_configuration->enabled_port_mask); + } else if (strcasecmp("PortName", child->key) == 0) { + cf_util_get_string_buffer( + child, 
g_configuration->port_name[port_counter], + sizeof(g_configuration->port_name[port_counter])); + DEBUG("dpdkstat: Port %d Name: %s ", port_counter, + g_configuration->port_name[port_counter]); + port_counter++; + } else { + WARNING("dpdkstat: The config option \"%s\" is unknown.", child->key); + } + } /* End for (int i = 0; i < ci->children_num; i++)*/ + g_configured = 1; /* Bypass configuration in dpdk_shm_init(). */ + + return 0; +} + +/* + * Allocate g_configuration and initialize SHared Memory (SHM) + * for config and helper process + */ +static int dpdk_shm_init(size_t size) { + /* + * Check if SHM is already configured: when config items are provided, the + * config function initializes SHM. If there is no config, then init() will + * just return. + */ + if (g_configuration) + return 0; + + char errbuf[ERR_BUF_SIZE]; + + /* Create and open a new object, or open an existing object. */ + int fd = shm_open(DPDK_SHM_NAME, O_CREAT | O_TRUNC | O_RDWR, 0666); + if (fd < 0) { + WARNING("dpdkstat:Failed to open %s as SHM:%s", DPDK_SHM_NAME, + sstrerror(errno, errbuf, sizeof(errbuf))); + goto fail; + } + /* Set the size of the shared memory object. */ + int ret = ftruncate(fd, size); + if (ret != 0) { + WARNING("dpdkstat:Failed to resize SHM:%s", + sstrerror(errno, errbuf, sizeof(errbuf))); + goto fail_close; + } + /* Map the shared memory object into this process' virtual address space. */ + g_configuration = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (g_configuration == MAP_FAILED) { + WARNING("dpdkstat:Failed to mmap SHM:%s", + sstrerror(errno, errbuf, sizeof(errbuf))); + goto fail_close; + } + /* + * Close the file descriptor, the shared memory object still exists + * and can only be removed by calling shm_unlink(). + */ + close(fd); + + /* Initialize g_configuration. 
*/ + memset(g_configuration, 0, size); + + /* Initialize the semaphores for SHM use */ + int err = sem_init(&g_configuration->sema_helper_get_stats, 1, 0); + if (err) { + ERROR("dpdkstat semaphore init failed: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + goto fail_close; + } + err = sem_init(&g_configuration->sema_stats_in_shm, 1, 0); + if (err) { + ERROR("dpdkstat semaphore init failed: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + goto fail_close; + } + + g_configuration->xstats = NULL; + + return 0; + +fail_close: + close(fd); +fail: + /* Reset to zero, as it was set to MAP_FAILED aka: (void *)-1. Avoid + * an issue if collectd attempts to run this plugin again after a failure. + */ + g_configuration = 0; + return -1; +} + +static int dpdk_re_init_shm() { + dpdk_config_t temp_config; + memcpy(&temp_config, g_configuration, sizeof(dpdk_config_t)); + DEBUG("dpdkstat: %s: ports %" PRIu32 ", xstats %" PRIu32, __func__, + temp_config.num_ports, temp_config.num_xstats); + + size_t shm_xstats_size = + sizeof(dpdk_config_t) + + (sizeof(struct rte_eth_xstats) * g_configuration->num_xstats); + DEBUG("=== SHM new size for %" PRIu32 " xstats", g_configuration->num_xstats); + + int err = dpdk_shm_cleanup(); + if (err) { + ERROR("dpdkstat: Error in shm_cleanup in %s", __func__); + return err; + } + err = dpdk_shm_init(shm_xstats_size); + if (err) { + WARNING("dpdkstat: Error in shm_init in %s", __func__); + return err; + } + /* If the XML config() function has been run, don't re-initialize defaults */ + if (!g_configured) + dpdk_config_init_default(); + + memcpy(g_configuration, &temp_config, sizeof(dpdk_config_t)); + g_configuration->collectd_reinit_shm = 0; + g_configuration->xstats = (struct rte_eth_xstats *)(g_configuration + 1); + return 0; +} + +static int dpdk_init(void) { + int err = dpdk_shm_init(sizeof(dpdk_config_t)); + if (err) { + ERROR("dpdkstat: %s : error %d in shm_init()", __func__, err); + return err; + } + + /* If the XML config() function 
has been run, don't 
re-initialize defaults */ + if (!g_configured) { + dpdk_config_init_default(); + } + + return 0; +} + +static int dpdk_helper_stop(int reset) { + g_configuration->helper_status = DPDK_HELPER_GRACEFUL_QUIT; + if (reset) { + g_configuration->eal_initialized = 0; + g_configuration->num_ports = 0; + g_configuration->xstats = NULL; + g_configuration->num_xstats = 0; + for (int i = 0; i < RTE_MAX_ETHPORTS; i++) + g_configuration->num_stats_in_port[i] = 0; + } + close(g_configuration->helper_pipes[1]); + int err = kill(g_configuration->helper_pid, SIGKILL); + if (err) { + char errbuf[ERR_BUF_SIZE]; + WARNING("dpdkstat: error sending kill to helper: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + } + + return 0; +} + +static int dpdk_helper_spawn(enum DPDK_HELPER_ACTION action) { + char errbuf[ERR_BUF_SIZE]; + g_configuration->eal_initialized = 0; + g_configuration->helper_action = action; + /* + * Create a pipe for helper stdout back to collectd. This is necessary for + * logging EAL failures, as rte_eal_init() calls rte_panic(). 
+ */ + if (pipe(g_configuration->helper_pipes) != 0) { + DEBUG("dpdkstat: Could not create helper pipe: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + return -1; + } + + int pipe0_flags = fcntl(g_configuration->helper_pipes[0], F_GETFL, 0); + int pipe1_flags = fcntl(g_configuration->helper_pipes[1], F_GETFL, 0); + if (pipe0_flags == -1 || pipe1_flags == -1) { + WARNING("dpdkstat: Failed setting up pipe flags: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + } + int pipe0_err = fcntl(g_configuration->helper_pipes[0], F_SETFL, + pipe0_flags | O_NONBLOCK); + int pipe1_err = fcntl(g_configuration->helper_pipes[1], F_SETFL, + pipe1_flags | O_NONBLOCK); + if (pipe0_err == -1 || pipe1_err == -1) { + WARNING("dpdkstat: Failed setting up pipes: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + } + + pid_t pid = fork(); + if (pid > 0) { + close(g_configuration->helper_pipes[1]); + g_configuration->helper_pid = pid; + DEBUG("dpdkstat: helper pid %ld", (long)g_configuration->helper_pid); + /* Kick helper once it's alive to have it start processing */ + sem_post(&g_configuration->sema_helper_get_stats); + } else if (pid == 0) { + /* Replace stdout with a pipe to collectd. */ + close(g_configuration->helper_pipes[0]); + close(STDOUT_FILENO); + dup2(g_configuration->helper_pipes[1], STDOUT_FILENO); + dpdk_helper_run(); + exit(0); + } else { + ERROR("dpdkstat: Failed to fork helper process: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + return -1; + } + return 0; +} + +/* + * Initialize the DPDK EAL, if this returns, EAL is successfully initialized. + * On failure, the EAL prints an error message, and the helper process exits. 
+ */ +static int dpdk_helper_init_eal(void) { + g_configuration->helper_status = DPDK_HELPER_INITIALIZING_EAL; + char *argp[(g_configuration->eal_argc) + 1]; + int i = 0; + + argp[i++] = "collectd-dpdk"; + if (strcasecmp(g_configuration->coremask, "") != 0) { + argp[i++] = "-c"; + argp[i++] = g_configuration->coremask; + } + if (strcasecmp(g_configuration->memory_channels, "") != 0) { + argp[i++] = "-n"; + argp[i++] = g_configuration->memory_channels; + } + if (strcasecmp(g_configuration->socket_memory, "") != 0) { + argp[i++] = "--socket-mem"; + argp[i++] = g_configuration->socket_memory; + } + if (strcasecmp(g_configuration->file_prefix, "") != 0 && + strcasecmp(g_configuration->file_prefix, DPDK_DEFAULT_RTE_CONFIG) != 0) { + argp[i++] = "--file-prefix"; + argp[i++] = g_configuration->file_prefix; + } + if (strcasecmp(g_configuration->process_type, "") != 0) { + argp[i++] = "--proc-type"; + argp[i++] = g_configuration->process_type; + } + g_configuration->eal_argc = i; + + g_configuration->eal_initialized = 1; + int ret = rte_eal_init(g_configuration->eal_argc, argp); + if (ret < 0) { + g_configuration->eal_initialized = 0; + return ret; + } + return 0; +} + +static int dpdk_helper_run(void) { + char errbuf[ERR_BUF_SIZE]; + pid_t ppid = getppid(); + g_configuration->helper_status = DPDK_HELPER_WAITING_ON_PRIMARY; + + while (1) { + /* sem_timedwait() to avoid blocking forever */ + struct timespec ts; + cdtime_t now = cdtime(); + cdtime_t safety_period = MS_TO_CDTIME_T(1500); + CDTIME_T_TO_TIMESPEC(now + safety_period + g_configuration->interval * 2, + &ts); + int ret = sem_timedwait(&g_configuration->sema_helper_get_stats, &ts); + + if (ret == -1 && errno == ETIMEDOUT) { + ERROR("dpdkstat-helper: sem timedwait()" + " timeout, did collectd terminate?"); + dpdk_helper_stop(RESET); + } + /* Parent PID change means collectd died so quit the helper process. 
*/ + if (ppid != getppid()) { + WARNING("dpdkstat-helper: parent PID changed, quitting."); + dpdk_helper_stop(RESET); + } + + /* Checking for DPDK primary process. */ + if (!rte_eal_primary_proc_alive(g_configuration->file_prefix)) { + if (g_configuration->eal_initialized) { + WARNING("dpdkstat-helper: no primary alive but EAL initialized:" + " quitting."); + dpdk_helper_stop(RESET); + } + g_configuration->helper_status = DPDK_HELPER_WAITING_ON_PRIMARY; + /* Back to start of while() - waiting for primary process */ + continue; + } + + if (!g_configuration->eal_initialized) { + /* Initialize EAL. */ + int ret = dpdk_helper_init_eal(); + if (ret != 0) { + WARNING("ERROR INITIALIZING EAL"); + dpdk_helper_stop(RESET); + } + } + + g_configuration->helper_status = DPDK_HELPER_ALIVE_SENDING_STATS; + + uint8_t nb_ports = rte_eth_dev_count(); + if (nb_ports == 0) { + DEBUG("dpdkstat-helper: No DPDK ports available. " + "Check bound devices to DPDK driver."); + dpdk_helper_stop(RESET); + } + + if (nb_ports > RTE_MAX_ETHPORTS) + nb_ports = RTE_MAX_ETHPORTS; + + int len = 0, enabled_port_count = 0, num_xstats = 0; + for (uint8_t i = 0; i < nb_ports; i++) { + if (!(g_configuration->enabled_port_mask & (1 << i))) + continue; + + if (g_configuration->helper_action == DPDK_HELPER_ACTION_COUNT_STATS) { + len = rte_eth_xstats_get(i, NULL, 0); + if (len < 0) { + ERROR("dpdkstat-helper: Cannot get xstats count on port %" PRIu8, i); + break; + } + num_xstats += len; + g_configuration->num_stats_in_port[enabled_port_count] = len; + enabled_port_count++; + continue; + } else { + len = g_configuration->num_stats_in_port[enabled_port_count]; + g_configuration->port_read_time[enabled_port_count] = cdtime(); + ret = rte_eth_xstats_get( + i, g_configuration->xstats + num_xstats, + g_configuration->num_stats_in_port[enabled_port_count]); + if (ret < 0 || ret != len) { + DEBUG("dpdkstat-helper: Error reading xstats on port %" PRIu8 + " len = %d", + i, len); + break; + } + num_xstats += 
g_configuration->num_stats_in_port[enabled_port_count]; + enabled_port_count++; + } + } /* for (nb_ports) */ + + if (g_configuration->helper_action == DPDK_HELPER_ACTION_COUNT_STATS) { + g_configuration->num_ports = enabled_port_count; + g_configuration->num_xstats = num_xstats; + DEBUG("dpdkstat-helper ports: %" PRIu32 ", num stats: %" PRIu32, + g_configuration->num_ports, g_configuration->num_xstats); + /* Exit, allowing collectd to re-init SHM to the right size */ + g_configuration->collectd_reinit_shm = REINIT_SHM; + dpdk_helper_stop(NO_RESET); + } + /* Now kick collectd send thread to send the stats */ + int err = sem_post(&g_configuration->sema_stats_in_shm); + if (err) { + WARNING("dpdkstat: error posting semaphore to helper %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + dpdk_helper_stop(RESET); + } + } /* while(1) */ + + return 0; +} + +static void dpdk_submit_xstats(const char *dev_name, + const struct rte_eth_xstats *xstats, + uint32_t counters, cdtime_t port_read_time) { + for (uint32_t j = 0; j < counters; j++) { + value_list_t dpdkstat_vl = VALUE_LIST_INIT; + char *type_end; + + dpdkstat_vl.values = &(value_t){.derive = (derive_t)xstats[j].value}; + dpdkstat_vl.values_len = 1; /* Submit stats one at a time */ + dpdkstat_vl.time = port_read_time; + sstrncpy(dpdkstat_vl.host, hostname_g, sizeof(dpdkstat_vl.host)); + sstrncpy(dpdkstat_vl.plugin, "dpdkstat", sizeof(dpdkstat_vl.plugin)); + sstrncpy(dpdkstat_vl.plugin_instance, dev_name, + sizeof(dpdkstat_vl.plugin_instance)); + + type_end = strrchr(xstats[j].name, '_'); + + if ((type_end != NULL) && + (strncmp(xstats[j].name, "rx_", strlen("rx_")) == 0)) { + if (strncmp(type_end, "_errors", strlen("_errors")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_rx_errors", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_dropped", strlen("_dropped")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_rx_dropped", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_bytes", strlen("_bytes")) == 0) { + 
sstrncpy(dpdkstat_vl.type, "if_rx_octets", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_packets", strlen("_packets")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_rx_packets", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_placement", strlen("_placement")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_rx_errors", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_buff", strlen("_buff")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_rx_errors", sizeof(dpdkstat_vl.type)); + } else { + /* Does not fit obvious type: use a more generic one */ + sstrncpy(dpdkstat_vl.type, "derive", sizeof(dpdkstat_vl.type)); + } + + } else if ((type_end != NULL) && + (strncmp(xstats[j].name, "tx_", strlen("tx_"))) == 0) { + if (strncmp(type_end, "_errors", strlen("_errors")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_tx_errors", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_dropped", strlen("_dropped")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_tx_dropped", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_bytes", strlen("_bytes")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_tx_octets", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_packets", strlen("_packets")) == 0) { + sstrncpy(dpdkstat_vl.type, "if_tx_packets", sizeof(dpdkstat_vl.type)); + } else { + /* Does not fit obvious type: use a more generic one */ + sstrncpy(dpdkstat_vl.type, "derive", sizeof(dpdkstat_vl.type)); + } + } else if ((type_end != NULL) && + (strncmp(xstats[j].name, "flow_", strlen("flow_"))) == 0) { + + if (strncmp(type_end, "_filters", strlen("_filters")) == 0) { + sstrncpy(dpdkstat_vl.type, "operations", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_errors", strlen("_errors")) == 0) { + sstrncpy(dpdkstat_vl.type, "errors", sizeof(dpdkstat_vl.type)); + } else if (strncmp(type_end, "_filters", strlen("_filters")) == 0) { + sstrncpy(dpdkstat_vl.type, "filter_result", sizeof(dpdkstat_vl.type)); + } + } else if ((type_end != NULL) && + 
(strncmp(xstats[j].name, "mac_", strlen("mac_"))) == 0) { + if (strncmp(type_end, "_errors", strlen("_errors")) == 0) { + sstrncpy(dpdkstat_vl.type, "errors", sizeof(dpdkstat_vl.type)); + } + } else { + /* Does not fit obvious type, or strrchr error: + * use a more generic type */ + sstrncpy(dpdkstat_vl.type, "derive", sizeof(dpdkstat_vl.type)); + } + + sstrncpy(dpdkstat_vl.type_instance, xstats[j].name, + sizeof(dpdkstat_vl.type_instance)); + plugin_dispatch_values(&dpdkstat_vl); + } +} + +static int dpdk_read(user_data_t *ud) { + int ret = 0; + + /* + * Check if SHM flag is set to be re-initialized. AKA DPDK ports have been + * counted, so re-init SHM to be large enough to fit all the statistics. + */ + if (g_configuration->collectd_reinit_shm) { + DEBUG("dpdkstat: read() now reinit SHM then launching send-thread"); + dpdk_re_init_shm(); + } + + /* + * Check if DPDK proc is alive, and has already counted port / stats. This + * must be done in dpdk_read(), because the DPDK primary process may not be + * alive at dpdk_init() time. + */ + if (g_configuration->helper_status == DPDK_HELPER_NOT_INITIALIZED || + g_configuration->helper_status == DPDK_HELPER_GRACEFUL_QUIT) { + int action = DPDK_HELPER_ACTION_SEND_STATS; + if (g_configuration->num_xstats == 0) + action = DPDK_HELPER_ACTION_COUNT_STATS; + /* Spawn the helper thread to count stats or to read stats. 
*/ + int err = dpdk_helper_spawn(action); + if (err) { + char errbuf[ERR_BUF_SIZE]; + ERROR("dpdkstat: error spawning helper %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + return -1; + } + } + + pid_t ws = waitpid(g_configuration->helper_pid, NULL, WNOHANG); + /* + * Conditions under which to respawn helper: + * waitpid() fails, helper process died (or quit), so respawn + */ + _Bool respawn_helper = 0; + if (ws != 0) { + respawn_helper = 1; + } + + char buf[DPDKSTAT_MAX_BUFFER_SIZE]; + char out[DPDKSTAT_MAX_BUFFER_SIZE]; + + /* non blocking check on helper logging pipe */ + struct pollfd fds = { + .fd = g_configuration->helper_pipes[0], .events = POLLIN, + }; + int data_avail = poll(&fds, 1, 0); + if (data_avail < 0) { + char errbuf[ERR_BUF_SIZE]; + /* EINTR and EAGAIN are transient: only report other failures */ + if (errno != EINTR && errno != EAGAIN) + ERROR("dpdkstat: poll(2) failed: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + } + while (data_avail) { + int nbytes = read(g_configuration->helper_pipes[0], buf, sizeof(buf)); + if (nbytes <= 0) + break; + ssnprintf(out, nbytes, "%s", buf); + DEBUG("dpdkstat: helper-proc: %s", out); + } + + if (respawn_helper) { + if (g_configuration->helper_pid) + dpdk_helper_stop(RESET); + dpdk_helper_spawn(DPDK_HELPER_ACTION_COUNT_STATS); + } + + /* Kick helper process through SHM */ + sem_post(&g_configuration->sema_helper_get_stats); + + struct timespec ts; + cdtime_t now = cdtime(); + CDTIME_T_TO_TIMESPEC(now + g_configuration->interval, &ts); + ret = sem_timedwait(&g_configuration->sema_stats_in_shm, &ts); + if (ret == -1) { + if (errno == ETIMEDOUT) + DEBUG( + "dpdkstat: timeout in collectd thread: is a DPDK Primary running? 
"); + return 0; + } + + /* Dispatch the stats.*/ + uint32_t count = 0, port_num = 0; + + for (uint32_t i = 0; i < g_configuration->num_ports; i++) { + char dev_name[64]; + cdtime_t port_read_time = g_configuration->port_read_time[i]; + uint32_t counters_num = g_configuration->num_stats_in_port[i]; + size_t ports_max = CHAR_BIT * sizeof(g_configuration->enabled_port_mask); + for (size_t j = port_num; j < ports_max; j++) { + if ((g_configuration->enabled_port_mask & (1 << j)) != 0) + break; + port_num++; + } + + if (g_configuration->port_name[i][0] != 0) + ssnprintf(dev_name, sizeof(dev_name), "%s", + g_configuration->port_name[i]); + else + ssnprintf(dev_name, sizeof(dev_name), "port.%" PRIu32, port_num); + struct rte_eth_xstats *xstats = g_configuration->xstats + count; + + dpdk_submit_xstats(dev_name, xstats, counters_num, port_read_time); + count += counters_num; + port_num++; + } /* for each port */ + return 0; +} + +static int dpdk_shm_cleanup(void) { + int ret = munmap(g_configuration, sizeof(dpdk_config_t)); + g_configuration = 0; + if (ret) { + ERROR("dpdkstat: munmap returned %d", ret); + return ret; + } + ret = shm_unlink(DPDK_SHM_NAME); + if (ret) { + ERROR("dpdkstat: shm_unlink returned %d", ret); + return ret; + } + return 0; +} + +static int dpdk_shutdown(void) { + int ret = 0; + char errbuf[ERR_BUF_SIZE]; + close(g_configuration->helper_pipes[1]); + int err = kill(g_configuration->helper_pid, SIGKILL); + if (err) { + ERROR("dpdkstat: error sending sigkill to helper %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + ret = -1; + } + err = dpdk_shm_cleanup(); + if (err) { + ERROR("dpdkstat: error cleaning up SHM: %s", + sstrerror(errno, errbuf, sizeof(errbuf))); + ret = -1; + } + + return ret; +} + +void module_register(void) { + plugin_register_complex_config("dpdkstat", dpdk_config); + plugin_register_init("dpdkstat", dpdk_init); + plugin_register_complex_read(NULL, "dpdkstat", dpdk_read, 0, NULL); + plugin_register_shutdown("dpdkstat", 
dpdk_shutdown); +} diff --git a/src/types.db b/src/types.db index 4f8d31c7..e3da48a4 100644 --- a/src/types.db +++ b/src/types.db @@ -77,12 +77,14 @@ email_size value:GAUGE:0:U energy value:GAUGE:U:U energy_wh value:GAUGE:U:U entropy value:GAUGE:0:4294967295 +errors value:DERIVE:0:U evicted_keys value:DERIVE:0:U expired_keys value:DERIVE:0:U fanspeed value:GAUGE:0:U file_handles value:GAUGE:0:U file_size value:GAUGE:0:U files value:GAUGE:0:U +filter_result value:DERIVE:0:U flow value:GAUGE:0:U fork_rate value:DERIVE:0:U frequency value:GAUGE:0:U @@ -101,10 +103,14 @@ if_errors rx:DERIVE:0:U, tx:DERIVE:0:U if_multicast value:DERIVE:0:U if_octets rx:DERIVE:0:U, tx:DERIVE:0:U if_packets rx:DERIVE:0:U, tx:DERIVE:0:U +if_rx_dropped value:DERIVE:0:U if_rx_errors value:DERIVE:0:U if_rx_octets value:DERIVE:0:U +if_rx_packets value:DERIVE:0:U +if_tx_dropped value:DERIVE:0:U if_tx_errors value:DERIVE:0:U if_tx_octets value:DERIVE:0:U +if_tx_packets value:DERIVE:0:U invocations value:DERIVE:0:U io_octets rx:DERIVE:0:U, tx:DERIVE:0:U io_packets rx:DERIVE:0:U, tx:DERIVE:0:U