This repository has been archived by the owner on Nov 5, 2023. It is now read-only.

mk-fg/graphite-metrics


graphite-metrics: metric collectors for various stuff not (or poorly) handled by other monitoring daemons

Core of the project is a simple daemon (harvestd), which collects metric values and sends them to graphite carbon daemon (and/or other configured destinations) once per interval.
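As an illustration of what "sending to carbon" involves, here is a minimal sketch of carbon's plaintext protocol: one "path value timestamp" line per datapoint, sent over TCP (port 2003 by default). The function names and the (name, value, timestamp) tuple layout are assumptions made for this example, not harvestd's actual sink code.

```python
import socket


def carbon_lines(datapoints):
    """Render (name, value, timestamp) tuples in carbon's plaintext format."""
    # One datapoint per line: "<metric.path> <value> <unix_timestamp>\n"
    return "".join(
        "{} {} {}\n".format(name, value, int(ts)) for name, value, ts in datapoints
    )


def send_to_carbon(datapoints, host, port=2003, timeout=5.0):
    """Send one batch of datapoints over a short-lived TCP connection."""
    payload = carbon_lines(datapoints).encode("ascii")
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(payload)
    finally:
        sock.close()
```

A real sink would likely keep the connection open between intervals and reconnect on errors; this sketch opens a new connection per batch to stay short.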

Includes separate data collection components ("collectors") for processing of:

  • /proc/slabinfo for useful-to-watch values, not everything (configurable).
  • /proc/vmstat and /proc/meminfo in a consistent way.
  • /proc/stat for irq, softirq, forks.
  • /proc/buddyinfo and /proc/pagetypeinfo (memory fragmentation).
  • /proc/interrupts and /proc/softirqs.
  • Cron log, to produce start/finish events and duration for each job as separate metrics; job names are mapped to metric names with regexes.
  • Per-system-service accounting using systemd and its cgroups ("Default...Accounting=" options in system.conf have to be enabled for more recent versions).
  • sysstat data from sadc logs (use something like sadc -F -L -S DISK -S XDISK -S POWER 60 to have more stuff logged there) via the sadf binary and its JSON export (sadf -j, supported since sysstat-10.0.something, iirc).
  • iptables rule "hits" packet and byte counters, taken from ip{,6}tables-save, mapped via separate "table chain_name rule_no metric_name" file, which should be generated along with firewall rules (I use this script to do that).
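To make the iptables item above more concrete, here is a rough sketch of how rule counters could be extracted from "iptables-save -c" style output and matched against a "table chain_name rule_no metric_name" mapping. It is not the shipped collector. The function names, the 1-based rule numbering within each chain and the "#" comment handling in the mapping file are all assumptions made for this example.

```python
import re

# Matches counter-prefixed rule lines such as: [12:3456] -A INPUT -j ACCEPT
RULE_RE = re.compile(r"^\[(\d+):(\d+)\]\s+-A\s+(\S+)")


def parse_mapping(text):
    """Parse 'table chain_name rule_no metric_name' lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments (comment syntax is assumed)
        table, chain, rule_no, metric = line.split()
        mapping[(table, chain, int(rule_no))] = metric
    return mapping


def rule_counters(save_output, mapping):
    """Yield (metric_name, packets, bytes) for every mapped rule."""
    table = None
    rule_no = {}  # (table, chain) -> number of rules seen so far
    for line in save_output.splitlines():
        line = line.strip()
        if line.startswith("*"):
            table = line[1:]  # table header, e.g. "*filter"
            continue
        match = RULE_RE.match(line)
        if not match:
            continue  # comments, chain policy lines, COMMIT
        packets, byte_count, chain = match.groups()
        key = (table, chain)
        rule_no[key] = rule_no.get(key, 0) + 1  # assumed 1-based numbering
        metric = mapping.get((table, chain, rule_no[key]))
        if metric is not None:
            yield metric, int(packets), int(byte_count)
```

Because rules are identified by position, the mapping only stays valid if it is regenerated whenever the ruleset changes, which is why the mapping file should be generated along with the firewall rules.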

Additional metric collectors can be added via the setuptools/distribute graphite_metrics.collectors entry point and configured via the common configuration mechanism.

Same for the datapoint sinks (destinations - it doesn't have to be a single carbon host), datapoint processors (mangle/rename/filter datapoints) and the main loop, which can be replaced with the async (simple case - threads or gevent) or buffering loop.
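To give a feel for what a datapoint processor does (mangle/rename/filter), here is a standalone sketch. It deliberately does not use the package's base classes. The process function, its drop/rename parameters and the (name, value, timestamp) tuple layout are hypothetical, so check the shipped processors for the real interface.

```python
import re


def process(datapoints, drop=(), rename=()):
    """Filter and rename (name, value, ts) tuples.

    drop:   regex patterns; a datapoint whose name matches any is discarded.
    rename: (pattern, replacement) pairs applied in order via re.sub.
    """
    drop_res = [re.compile(p) for p in drop]
    rename_res = [(re.compile(p), repl) for p, repl in rename]
    for name, value, ts in datapoints:
        if any(r.search(name) for r in drop_res):
            continue  # filtered out before any renaming happens
        for regex, repl in rename_res:
            name = regex.sub(repl, name)
        yield name, value, ts
```

Written as a generator, such a step can be chained between collectors and sinks without buffering a whole interval's worth of datapoints.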

Currently supported backends (data destinations, sinks):

Look at the shipped collectors, processors, sinks and loops and their base classes (like graphite_metrics.sinks.Sink or loops.Basic) for API examples.

Installation

It's a regular package for Python 2.7 (not 3.X).

Using pip is the best way:

% pip install graphite-metrics

If you don't have it, use:

% easy_install pip
% pip install graphite-metrics

Alternatively (see also):

% curl https://raw.github.com/pypa/pip/master/contrib/get-pip.py | python
% pip install graphite-metrics

Or, if you absolutely must:

% easy_install graphite-metrics

But, you really shouldn't do that.

Current-git version can be installed like this:

% pip install 'git+https://github.com/mk-fg/graphite-metrics.git#egg=graphite-metrics'

Requirements

Basic requirements are (pip or easy_install should handle these for you):

Some shipped modules require additional packages to function (which can be installed automatically by specifying extras on install, example: pip install 'graphite-metrics[collectors.cgacct]'):

  • collectors

    • cgacct

    • cron_log

    • sysstat

      • xattr (unless --xattr-emulation is used)
      • (optional) simplejson - for better performance than stdlib json module
  • sinks

    • librato_metrics
      • requests
      • (optional) simplejson - for better performance than stdlib json module
      • (optional) gevent - to enable constant-time (more scalable) async submissions of large data chunks via concurrent API requests

Also see requirements.txt file or "install_requires" and "extras_require" in setup.py.

Running

First run should probably look like this:

% harvestd --debug -s dump -i10

That will use the default configuration with all the collectors enabled, dump data to stderr (only the "dump" data-sink is enabled), use a short (10s) interval between collected datapoints and log additional info about what's being done (--debug).

After that, see the default harvestd.yaml configuration file, which contains configuration for all loaded collectors and can/should be overridden using the -c option.

Note that you don't have to specify all the options in each override-config, just the ones you need to update.

For example, simple configuration file (say, /etc/harvestd.yaml) just to specify carbon host and log lines format (dropping timestamp, since it will be piped to syslog or systemd-journal anyway) might look like this:

sinks:
  carbon_socket:
    host: carbon.example.host

logging:
  formatters:
    basic:
      format: '%(levelname)s :: %(name)s: %(message)s'

And be started like this: harvestd -c /etc/harvestd.yaml
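The "only specify what you need to update" behaviour of override configs can be pictured as a recursive dictionary merge. The sketch below is an illustration under that assumption, not harvestd's actual config loader. The merge_config name and the default values in the usage note are made up.

```python
import copy


def merge_config(base, override):
    """Return a copy of base with override applied key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend instead of replacing wholesale.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With hypothetical defaults of {"sinks": {"carbon_socket": {"host": "localhost", "port": 2003}}}, applying the override file above would change only the host and leave the port untouched.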

See harvestd --help output for a full CLI reference.

Caveats, Stern Warnings and Apocalyptic Prophecies

While most stock collectors here pull metrics from /proc once per some interval, same as the other tools, be especially wary of the ones that process memory metrics, like /proc/slabinfo and cgroup value parsers.

So-called "files" in /proc are actually callbacks in the kernel code, and to get a consistent reading of the whole slabinfo table, at least some versions of the kernel have to lock some operations, causing unexpected lags and delays on the whole system under some workloads (e.g. memcache servers).

The cgroup data collector processes lots of files, potentially dozens, hundreds or even thousands per collection cycle, which may also cause similar issues.
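One rough way to gauge whether a particular /proc file is expensive on a given machine is to time a single full read of it before enabling the corresponding collector. The helper below is a hypothetical sketch, not something the package ships. It only measures the reader's own latency, so it is at best a proxy for the system-wide stalls described above.

```python
from timeit import default_timer


def timed_read(path):
    """Read a file fully and return (contents, seconds_taken)."""
    start = default_timer()
    with open(path, "rb") as src:
        data = src.read()
    return data, default_timer() - start
```

For example, timed_read("/proc/slabinfo") could be run (as root, since that file is typically not world-readable) on a busy host to compare against the planned collection interval.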

Special thanks to Marcus Barczak for pointing that out.

Rationale

Most other tools can (in theory) collect this data, and I've used collectd for most of these, but it:

  • Doesn't provide some of the most usef