Monday 3 September 2012

Debugging Munin loaning graphs locally


  1. Problems
    1. Munin is a pain to debug remotely -- on prod -- when doing custom "loaning" graphs
      1. Varnish gets in the way
      2. FastCGI doesn't work very well with the new Munin 2.0 dynamic graph generation; just comment it out in the Apache config (might not work, test it yourself)
    2. These issues combined lead to a complete nightmare of caching: you have to wait for graphs to be regenerated before you can see your changes
  2. Troubleshooting
    1. Try this link for permission checks
      1. http://munin-monitoring.org/wiki/CgiHowto
    2. IMPORTANT: you must turn this on in munin.conf manually, since Debian turns it off, and who knows who else does
      1. graph_strategy cgi
  3. Solution
    1. Grab /var/lib/munin from production server
    2. Install Munin 2.x or greater on your local box
    3. Comment out the Munin entries under /etc/cron* (wherever they live on your system), so your local box doesn't try to update any of the files under /var/lib/munin
    4. Move your local copy of /var/lib/munin aside
    5. Move the production version of /var/lib/munin into place on your local system
    6. Copy the permissions of your original /var/lib/munin to the new one
    7. Grab the prod server version of /etc/munin/munin.conf
    8. Move your local copy of /etc/munin/munin.conf aside; rename it to something you'll remember
    9. Move the prod server version of /etc/munin/munin.conf into place on your local box
    10. Use "munin-html" to regenerate the HTML pages as you make changes to your Munin definitions in /etc/munin/munin.conf
      1. Basically these commands; for details, see http://blog.loftninjas.org/2010/04/08/an-evening-with-munin-graph-aggregation/
        1. sudo su - munin -s /bin/bash
        2. /usr/share/munin/munin-html --debug
      2. might work / might not
    11. Hit munin locally through your web browser; if you're lucky, all the prod info/graphs appear normally
    12. Now you can update /etc/munin/munin.conf as you like, and graph changes and errors will show up instantly
    13. Tweak the URL to hit graphs whose names you know but that munin-html failed to find for you
  4. Long-term
    1. Refresh the data from prod every 48 to 72 hours so your graph data doesn't fall off the chart
      1. Since your local box is not updating data, all RRD data will be blank from the time you grab it from prod's /var/lib/munin
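
The "move aside, move into place, copy perms" steps in the Solution section (4 to 6 for /var/lib/munin, 8 and 9 for munin.conf) can be sketched as one small shell function. This is only a rough sketch, not a tested procedure: the function name swap_in, the ".local-orig" backup suffix, and the /tmp/prod-munin paths in the usage lines are all made up for illustration, and --reference needs GNU coreutils. It only copies owner and mode of the top-level path, so subdirectories may still need checking by hand.

```shell
#!/bin/sh
# Sketch only: swap a production copy of a Munin path into place locally.
# Assumptions (not from the steps above): the prod copy has already been
# fetched to a local path, GNU coreutils is installed (for --reference),
# and ".local-orig" is just an illustrative backup suffix.

swap_in() {
    prod_copy=$1                    # where the copy grabbed from prod lives
    target=$2                       # e.g. /var/lib/munin
    backup="${target}.local-orig"   # hypothetical name for the set-aside copy

    # Refuse to clobber an earlier backup of the local original.
    if [ -e "$backup" ]; then
        echo "backup $backup already exists, refusing to overwrite" >&2
        return 1
    fi

    mv "$target" "$backup" || return 1      # move the local copy aside
    mv "$prod_copy" "$target" || return 1   # move the prod copy into place

    # Copy owner/group and mode of the original onto the new top-level path.
    chown --reference="$backup" "$target" || return 1
    chmod --reference="$backup" "$target" || return 1
}

# Example usage (as root; the /tmp/prod-munin paths are hypothetical):
#   swap_in /tmp/prod-munin/lib        /var/lib/munin
#   swap_in /tmp/prod-munin/munin.conf /etc/munin/munin.conf
```

Keeping the original under a predictable name also makes it easy to put your local setup back once you are done debugging.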
