Monday, January 1, 2024

Getting Prometheus Metrics from TrueNAS

TrueNAS (specifically TrueNAS SCALE, but CORE should be the same) doesn't offer native support for monitoring by Prometheus. Suggestions on the Internets seem to point at using the built-in Graphite exporter to send metrics to graphite_exporter, which Prometheus can then scrape. This DOES work, with some caveats:

  1. Many metrics incorrectly report 0.
  2. It takes a ton of work to parse the metric names and create labels in the style of Prometheus.

I spent several hours poking at point number 2, only to then run into number 1. My solution: just run node_exporter directly on the TrueNAS box. It's a Go application, shipped as a single self-contained binary, so it Just Works™, and it gets you probably everything that you would otherwise get from graphite_exporter. To do this, I created a startup script on the TrueNAS box that runs after boot. The script starts node_exporter inside of tmux, done deal. Feel free to point out why this is a bad idea.
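
For the Prometheus side of this, a minimal scrape job might look something like the sketch below. It assumes node_exporter is listening on its default port (9100); the hostname truenas.example.lan and the job name are placeholders I made up for illustration, so swap in your own.

```yaml
# prometheus.yml fragment (sketch): scrape node_exporter running on the TrueNAS host.
# "truenas.example.lan" and "truenas_node" are placeholder names, not real values.
scrape_configs:
  - job_name: 'truenas_node'
    static_configs:
      # 9100 is node_exporter's default listen port; adjust if you changed it.
      - targets: ['truenas.example.lan:9100']
```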

If you really want to go the graphite_exporter route, here is the beginning of my mapping config file to start parsing the data stream. If anyone has a better one, let me know and I'll happily link to it.

---
mappings:
  - match: 'dragonfish.truenas.disk_ops.*.*'
    name: 'dragonfish_truenas_disk_ops'
    labels:
      device: $1
      operation: $2
  - match: 'dragonfish.truenas.cputemp.temperatures.*'
    name: 'dragonfish_truenas_cputemp'
    labels:
      cpu: $1
  - match: 'dragonfish.truenas.cpu.cpufreq.*'
    name: 'dragonfish_truenas_cpufreq'
    labels:
      cpu: $1
  - match: 'dragonfish.truenas.cpu.core_throttling.*'
    name: 'dragonfish_truenas_cpu_core_throttling'
    labels:
      cpu: $1
  - match: 'dragonfish.truenas.zfspool_state.*.*'
    name: 'dragonfish_truenas_zpool_state'
    labels:
      pool: $1
      state: $2

  # regex matches are performed after regular matches
  - match: 'dragonfish\.truenas\.cpu\.cpu(\d+)_cpuidle\.(.*)'
    match_type: regex
    name: 'dragonfish_truenas_cpuidle'
    labels:
      cpu: $1
      idlestate: $2
  - match: 'dragonfish\.truenas\.cpu\.cpu(\d+)\.(\w+)'
    match_type: regex
    name: 'dragonfish_truenas_cpu_utilization'
    labels:
      cpu: $1
      type: $2
  - match: 'dragonfish\.truenas\.disk_avgsz\.([[:alnum:]]+)\.writes'
    #help: 'Average I/O write operation size.'
    match_type: regex
    name: 'dragonfish_truenas_disk_avgsz_writes'
    labels:
      device: $1
  - match: 'dragonfish\.truenas\.disk_avgsz\.([[:alnum:]]+)\.reads'
    #help: 'Average I/O read operation size.'
    match_type: regex
    name: 'dragonfish_truenas_disk_avgsz_reads'
    labels:
      device: $1
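
Once graphite_exporter is running with a mapping file like the one above, Prometheus still has to scrape it. Here is a rough sketch of that job, assuming graphite_exporter's default ports (9109 for incoming Graphite data, 9108 for the Prometheus metrics endpoint). The hostname exporter.example.lan and the job name are placeholders for wherever you run the exporter.

```yaml
# prometheus.yml fragment (sketch): scrape graphite_exporter, not TrueNAS itself.
# Assumes the exporter was started with --graphite.mapping-config pointing at the
# mapping file, and that TrueNAS sends its Graphite data to port 9109 on that host.
# "exporter.example.lan" and "truenas_graphite" are placeholder names.
scrape_configs:
  - job_name: 'truenas_graphite'
    static_configs:
      # 9108 is graphite_exporter's default metrics port; adjust if you changed it.
      - targets: ['exporter.example.lan:9108']
```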