Configuring Host sFlow for Linux via /etc/hsflowd.conf

The hsflowd daemon must select an IP address to represent this agent, and must be configured for counter-polling and packet/transaction sampling. Beyond that, there are a number of options for monitoring hypervisors, VMs, containers, applications and network traffic in different ways...

Example: Host (minimal)

sflow {
  collector { ip = 10.100.12.13 }
}

Example: Host (with DNS-SD config)

sflow {
  dns-sd { domain = .sf.inmon.com }
}

Example: Host (with packet-sampling on all NICs)

sflow {
  collector { ip = 10.100.12.13 }
  pcap { speed = 1- }
  tcp {}
}

Example: KVM Hypervisor (with Linux bridge)

sflow {
  collector { ip = 10.100.12.13 }
  pcap { dev = virbr0 }
  kvm {}
}

Example: KVM Hypervisor (with Open vSwitch)

sflow {
  sampling = 500
  collector { ip = 10.100.12.13 }
  kvm {}
  ovs {}
}

Example: Docker Host (with multiple collectors)

sflow {
  agent.CIDR = 10.0.0.0/8
  polling = 10
  sampling = 1000
  collector { ip = 10.100.12.13 }
  collector { ip = 10.122.1.2 }
  collector { ip = 10.144.1.2 UDPPort=6344 }
  pcap { dev = docker0 }
  pcap { dev = docker_gwbridge }
  docker {}
}

Example: HPC Host (with custom metrics, web-server and GPU)

sflow {
  sampling.http = 50 # for apache mod_sflow
  collector { ip = 10.100.12.13 }
  json { UDPPort = 36343 } # for RTMETRIC
  nvml {} # Nvidia GPU
}

Agent IP Selection

Selection from the list of IP addresses belonging to the host is automatic. You can check to see which agent IP was selected using:

grep agentIP /etc/hsflowd.auto

The selection can be influenced by these settings:

AGENT CIDR
agent.cidr = 10.0.0.0/8

Prefer an IP address that falls into this range.

AGENT INTERFACE
agent = eth0

Select the IP address associated with this interface.
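
For example, a minimal sketch that combines the collector setting with agent-address selection (the addresses and the interface name are placeholders; use whichever of the two agent settings suits your environment):

sflow {
  agent.cidr = 10.0.0.0/8    # prefer an address in this range
  # agent = eth0             # ...or pin the agent address to a specific interface instead
  collector { ip = 10.100.12.13 }
}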

Sampling/Polling/Collectors

There are two choices for configuring the sampling-rates, polling-intervals and sFlow collectors: DNS-SD or Manual.

DNS-SD

dns-sd { domain=.mycompany.com }

This will use DNS queries to _sflow._udp.mycompany.com to learn the sFlow collectors, sampling rates and polling settings. Any changes you make on the DNS server will be picked up automatically by the hosts without requiring a restart. See details and examples here.
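
As a rough sketch of what the DNS server might carry for the example above (the collector hostnames and numbers are placeholders, and the exact record layout should be checked against the DNS-SD documentation), the mycompany.com zone would contain SRV records naming the collectors and a TXT record carrying the sampling and polling settings:

_sflow._udp   SRV   0 0 6343 collector1.mycompany.com.
_sflow._udp   SRV   0 0 6343 collector2.mycompany.com.
_sflow._udp   TXT   "txtvers=1" "sampling=400" "polling=20"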

Manual

Alternatively, with DNS-SD off, the settings look like this...

  • Polling
    polling=30
    Schedule counter-polling with an interval of 30 seconds.
  • Sampling
    sampling.<speed>=N

    Set the sampling rate to 1-in-N for interfaces with this speed (e.g. sampling.10G=10000). This overrides the default, which is calculated as N=speed/1000000. So by default a 40G interface will be sampled at 1-in-40000 and a 100M interface will be sampled at 1-in-100.

  • HTTP Sampling
    sampling.http=10

    Set the sampling rate to 1-in-10 for httpd. This setting is copied to /etc/hsflowd.auto, where it can be picked up by mod_sflow for Apache or by nginx-sflow-module.

  • Transaction Sampling
    sampling.app.myapplication=20

    Set the sampling rate to 1-in-20 for JSON-encoded sFlow-APPLICATION transactions received on the JSON port configured below. For details, see Scripting Languages.

  • Default Sampling
    sampling=400

    For interfaces with no speed or applications with no specific sampling setting, fall back on this default 1-in-N rate.

  • Collector
    collector { ip=10.1.2.3 udpport=6343 }

    Send sFlow to 10.1.2.3:6343. You can add as many of these collector{} sections as you need. The same sFlow feed will be replicated to each of them.
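
Putting the manual settings together, a complete sflow { } block using only the options described above might look like this (the collector addresses are placeholders):

sflow {
  polling = 30
  sampling = 400           # default fallback rate
  sampling.10G = 10000     # override for 10G interfaces
  sampling.http = 10       # copied to /etc/hsflowd.auto for mod_sflow / nginx-sflow-module
  collector { ip = 10.1.2.3 udpport = 6343 }
  collector { ip = 10.1.2.4 }
}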

Local Configuration

The local configuration options will apply whether you are using DNS-SD or manual config. They divide into optional sections which correspond to loadable modules in hsflowd. You may need to recompile hsflowd to include a module that you need.

  • JSON Input - Custom Metrics
    json { udpport=36343 }

    Listen for JSON-encoded input from localhost on UDP:36343. This allows locally-running applications to submit counters and transaction-samples in sFlow-APPLICATION format, or in RTMETRIC and RTFLOW format. This information is then packed efficiently and forwarded to the sFlow collectors in standard binary sFlow format (with sequence numbers etc. so that any packet loss in transit can be detected).
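
    As an illustration, an application could push a custom metric to this port with a single shell command. The JSON keys shown here (rtmetric, datasource, type, value) are written from memory of the RTMETRIC format and the counter name is made up, so check the hsflowd json-module documentation for the exact schema:

    echo '{"rtmetric":{"datasource":"myapp","requests":{"type":"counter32","value":777}}}' | \
      nc -u -w1 127.0.0.1 36343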

  • Open vSwitch
    ovs { }

    Share the sFlow sampling/polling/collector information with the local Open vSwitch virtual switch using ovs-vsctl(1). This results in efficient sampling and polling of all bridges and their vports, so if you can use this option then you will not need any of the pcap, nflog or ulog options below.

  • PCAP Packet Sampling
    pcap { dev=docker0 }

    Apply packet sampling to the "docker0" device, which can be a hardware NIC or a linux bridge. You can include more pcap {} sections to tap more devices. When the Linux kernel is 3.19 or later this is implemented using efficient, kernel-based BPF sampling so the overhead is low. If you are monitoring a passive tap you can add promisc=on.

  • NFLOG Packet Sampling
     nflog { group=5 probability=0.0025 }

    Listen for packet samples on iptables NFLOG channel 5, and assume they were random-sampled at 1-in-400 (probability 0.0025). To configure NFLOG sampling in iptables the commands look like this:

    MOD_STATISTIC="-m statistic --mode random --probability 0.0025"
    NFLOG_CONFIG="--nflog-group 5 --nflog-prefix SFLOW"
    sudo iptables -I INPUT -j NFLOG $MOD_STATISTIC $NFLOG_CONFIG
    sudo iptables -I OUTPUT -j NFLOG $MOD_STATISTIC $NFLOG_CONFIG
  • ULOG Packet Sampling
    ulog { group=2 probability=0.01 }

    Listen for packet samples on iptables ULOG channel 2, and assume they were random-sampled at 1-in-100 (probability 0.01). To configure ULOG sampling in iptables the commands look like this:

    MOD_STATISTIC="-m statistic --mode random --probability 0.01"
    ULOG_CONFIG="--ulog-nlgroup 2 --ulog-prefix SFLOW --ulog-qthreshold 1"
    sudo iptables -I INPUT -j ULOG $MOD_STATISTIC $ULOG_CONFIG
    sudo iptables -I OUTPUT -j ULOG $MOD_STATISTIC $ULOG_CONFIG
  • PSAMPLE Packet Sampling
    psample { group=1 }

    Listen for packet samples on PSAMPLE channel 1. To configure PSAMPLE sampling with tc the commands look like this:

    DEV=eth0
    RATE=1000
    GROUP=1
    
    tc qdisc add dev $DEV handle ffff: ingress
    tc filter add dev $DEV parent ffff: matchall \
    action sample rate $RATE group $GROUP

    On newer Linux distributions, however, the tc configuration can be performed automatically by the dent hsflowd module:

    dent { sw=on switchport=enp.* }

    The above sets up software sampling (in the tc driver) and identifies interfaces matching enp.* as switch ports so that their counters will be reported separately. For a hardware switch with ASIC sampling the config might look more like this:

    dent { sw=off switchport=^swp[0-9]+$ }

    If egress sampling is enabled then transit delay and buffer depth measurements for each sampled packet may be supplied by the ASIC too:

    psample { group=1 egress=on }
    dent { sw=off switchport=^swp[0-9]+$ }

    For details, see Transit delay and queueing.
    
  • TCP Performance Monitoring
    tcp { }

    Requires packet sampling (e.g. pcap{}, nflog{}, ...). Samples of TCP packets for connections held by this host are annotated with performance information extracted efficiently from the Linux kernel. Measurements include delay, loss and jitter.

  • Packet Drop Monitoring
    dropmon { limit=50 }

    Requires Linux kernel 5.0 or later (e.g. Ubuntu 20, Debian 11, Fedora 34, ...). Dropped packets reported on the DROPMON Netlink channel by the switch ASIC or the Linux kernel are exported via the standard sFlow v5 extension for reporting dropped packets. For details, see Using sFlow to monitor dropped packets.

  • XenServer
    xen { }

    When running on a Xen DDK domain0 node, this module connects to the Xen libraries and discovers and monitors the VMs. Use in combination with "ovs {}" to enable traffic monitoring on the Open vSwitch.

  • KVM Hypervisor
    kvm { }

    When running on a Red Hat KVM Hypervisor (e.g. OpenStack), this module connects to the libvirt libraries and discovers and monitors the VMs. Use in combination with "ovs {}" to enable traffic monitoring on Open vSwitch, or something like "pcap { dev=virbr0 }" to monitor traffic through the linux bridge.

  • Docker Containers
    docker { markTraffic=on }

    Connect to /var/run/docker.sock to discover and monitor the containers. Use in conjunction with something like "pcap { dev=docker0 } pcap { dev=docker_gwbridge }" to monitor traffic. The markTraffic=on setting will attempt to fill in the sFlow "entities" structure to map traffic to containers.

  • Containerd Containers
    containerd { markTraffic=on }

    Runs a Go program in a child process to discover and monitor the containers. Use in conjunction with something like "pcap { dev=docker0 } pcap { dev=docker_gwbridge }" to monitor traffic.

  • Nvidia NVML GPU
    nvml { }

    Connect to the libnvml library and include host GPU stats with the sFlow feed. This module is not compiled in by default so you must build from sources to include it.
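
    As a sketch, building hsflowd from the host-sflow sources with the NVML module enabled might look like the following; the FEATURES flag and feature name are assumptions based on the host-sflow build system, so check the project README for the exact invocation and the NVML library prerequisites:

    git clone https://github.com/sflow/host-sflow.git
    cd host-sflow
    make FEATURES="NVML"    # feature name assumed; verify against the README
    sudo make install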

  • Systemd Cgroups
    systemd { markTraffic=on }

    Connect to systemd via DBUS and report on all running services. Each service (cgroup) appears in sFlow the same way that Docker Containers or VMs appear. With markTraffic=on, packet samples (e.g. from pcap{} or nflog{}) are also mapped to their service.

  • DBUS Agent
    dbus { }

    Allow internal hsflowd telemetry counters to be queried via DBUS. See src/Linux/scripts/telemetry in the repo for examples.

Logging

By default hsflowd will log to syslog.

Debugging

Stop the service with:

sudo service hsflowd stop

and then run it manually with debugging increased, using:

sudo hsflowd -ddd

You can also force a running hsflowd to log a stack backtrace by sending it a SIGUSR1 signal:

killall -USR1 hsflowd