Mississippi Network Monitoring
chevy runs Cacti, which monitors all of the deployed metrixes and ciscos. The graphs are available at
Use username: guest, password: freewifirocks.
Set up
Metrix
SNMP
Nowadays, the image we use for metrixes, which is based on PyramidLinux and maintained by RussellSenior, includes snmpd natively, along with the following ten "exec"-style custom exports:
| What? | OID | snmpd.conf line |
| Local Coverage (ath0) association count | 1.3.6.1.4.1.2021.8.1.101.1 | exec assoc_count /usr/local/bin/assoc_count |
| Upstream Link Loss | 1.3.6.1.4.1.2021.8.1.101.2 | exec link-loss /usr/local/bin/get-value.sh backhaul loss |
| Upstream Link Ping Trials | 1.3.6.1.4.1.2021.8.1.101.3 | exec link-trials /usr/local/bin/get-value.sh backhaul ping-trials |
| Upstream Link Ping Successes | 1.3.6.1.4.1.2021.8.1.101.4 | exec link-success /usr/local/bin/get-value.sh backhaul ping-success |
| Upstream Link Latency Min | 1.3.6.1.4.1.2021.8.1.101.5 | exec link-latency-min /usr/local/bin/get-value.sh backhaul latency-min |
| Upstream Link Latency Ave | 1.3.6.1.4.1.2021.8.1.101.6 | exec link-latency-ave /usr/local/bin/get-value.sh backhaul latency-ave |
| Upstream Link Latency Max | 1.3.6.1.4.1.2021.8.1.101.7 | exec link-latency-max /usr/local/bin/get-value.sh backhaul latency-max |
| Upstream Link RSSI Min | 1.3.6.1.4.1.2021.8.1.101.8 | exec link-rssi-min /usr/local/bin/get-value.sh backhaul rssi-min |
| Upstream Link RSSI Ave | 1.3.6.1.4.1.2021.8.1.101.9 | exec link-rssi-ave /usr/local/bin/get-value.sh backhaul rssi-ave |
| Upstream Link RSSI Max | 1.3.6.1.4.1.2021.8.1.101.10 | exec link-rssi-max /usr/local/bin/get-value.sh backhaul rssi-max |
Before you point out that "extend" would be better than "exec" here: we are running Net-SNMP 5.1.2, which predates the "extend" directive. For more information on these exec scripts, see the Scripts section at the bottom of this page.
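The OIDs in the table follow the position of each exec line: the Nth exec line is exported as 1.3.6.1.4.1.2021.8.1.101.N. The sketch below lists the OID each line would get. It assumes the plain "exec NAME PROG ARGS" form used on this page and that indices follow the order of the lines in snmpd.conf; the function name exec_oids is made up for illustration.

```shell
# Sketch: print "NAME OID" for every plain exec line, in file order.
# Assumption: the index of each export equals its position among the
# exec lines, matching the pattern in the table above.
exec_oids() {
    awk '$1 == "exec" { n++; printf("%s 1.3.6.1.4.1.2021.8.1.101.%d\n", $2, n) }' "$@"
}
```

Run it against the snmpd.conf on a metrix (the path depends on the image) or pipe the exec lines in on stdin. If the output disagrees with the table, an exec line was probably added or reordered.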
Cacti
Then finish up with these steps:
- Add the device to Cacti with the "PTP MGP Metrix" template.
- Create graphs: use all of the templated ones; for interface stats, ath0...athN and eth0 are the most useful.
- Add the device to the main graph tree (under MGP/Rooftop Metrixes).
- Add the assoc_count_exec data source to the "Combined Associations" graph, following the existing entries as an example.
WGTs
TODO: Fill me in with correct information!
Scripts
assoc_count.sh
echo $(($(wc -l < /proc/net/madwifi/ath0/associated_sta)/3))
get-value.sh
LINK=${1:-backhaul}
VALUE=${2:-loss}
DIR=/tmp/linkstats
TARGET=${DIR}/${LINK}-${VALUE}

# recompute the stats when the cached value is missing or at least 60 seconds old
if [ ! -f ${TARGET} ] || [ $(expr $(date +%s) "-" $(date -r ${TARGET} +%s)) -ge 60 ]; then
    /usr/local/bin/compute-stats.sh ${LINK}
fi

cat ${TARGET}
monitor-link.sh
# grab information for link monitoring

DESTIP=10.11.104.2
DESTNAME=backhaul
IFACE=ath3
INTERVAL=500 # in centiseconds
OUTDIR=/tmp/linkstats

centiseconds () {
    awk '{ printf("%ld\n", $1 * 100) }' /proc/uptime
}

mkdir -p -m 777 ${OUTDIR}

start=$(centiseconds)
end=$start

while true; do
    end=$(expr ${end} "+" ${INTERVAL})
    latency=$(ping -c 1 -i 5 -w 4 -q ${DESTIP} | sed -n -r -e 's|^rtt min/avg/max/mdev = ([0-9.]*)/.*|\1|p')
    if [ "${latency}" != "-" ]; then
        rssi=$(awk '$1 ~ /rssi/ { print $2 }' /proc/net/madwifi/${IFACE}/associated_sta)
    else
        latency=""
        rssi=0
    fi
    now=$(centiseconds)
    echo $now $latency $rssi >> ${OUTDIR}/${DESTNAME}
    sleep $(expr $(expr ${end} "-" ${now}) "/" 100)
done
link-stats.sh
INPUT=/tmp/linkstats/backhaul

if [ ! -f ${INPUT} ]; then
    echo 0 0 0 0 0 0 0 0 0
    exit 0
fi

/bin/mv ${INPUT} ${INPUT}-computing

/bin/awk 'BEGIN { min_latency = 5.0 ; max_latency = 0.0; min_rssi = 100 ; max_rssi = 0 }
NF == 2 { n_trials++ ; next }
NF == 3 { latency = $2 ; rssi = $3 ; sum_latency += latency ; sum_rssi += rssi ; n_trials++ ; n_success++ }
latency < min_latency { min_latency = latency }
latency > max_latency { max_latency = latency }
rssi < min_rssi { min_rssi = rssi }
rssi > max_rssi { max_rssi = rssi }
#{ print "debug", n_trials, n_success, latency, min_latency, sum_latency, max_latency, rssi, min_rssi, sum_rssi, max_rssi }
END {
    if (n_trials == 0) {
        print 0,0,0,0,0,0,0,0,0
    } else {
        printf("%.3f %d %d", (n_trials - n_success)/n_trials,n_success,n_trials);
        if (n_success == 0) {
            # no successful pings: emit zeros instead of dividing by zero below
            print "",0,0,0,0,0,0
        } else {
            printf(" %.3f %.3f %.3f %d %.1f %d\n", min_latency, sum_latency / n_success, max_latency, min_rssi, sum_rssi / n_success, max_rssi)
        }
    }
}' ${INPUT}-computing

rm -f ${INPUT}-computing
exit 0
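As a small worked example of the loss figure: monitor-link.sh writes three fields (timestamp, latency, RSSI) for a successful ping and only two when the latency is missing, and link-stats.sh counts the two-field lines as failed trials. The sketch below reproduces just that calculation. The sample log values are made up, and the function names sample_log and loss_fraction are hypothetical.

```shell
# Made-up sample in the log format written by monitor-link.sh:
# three fields for a successful ping, two for a failed one.
sample_log() {
    cat <<'EOF'
1000 1.25 30
1500 2.75 28
2000 0
2500 2.00 32
EOF
}

# Loss fraction = (trials - successes) / trials, as in link-stats.sh.
loss_fraction() {
    awk 'NF == 2 { trials++ }
         NF == 3 { trials++; ok++ }
         END {
             if (trials == 0) print 0
             else printf("%.3f\n", (trials - ok) / trials)
         }'
}
```

Here `sample_log | loss_fraction` prints 0.250: one failed ping out of four trials.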
/etc/init.d/linkstats
#
# skeleton example file to build /etc/init.d/ scripts.
# This file should be used to construct scripts for /etc/init.d.
#
# Written by Miquel van Smoorenburg <miquels@cistron.nl>.
# Modified for Debian GNU/Linux
# by Ian Murdock <imurdock@gnu.ai.mit.edu>.
#
# Version: @(#)skeleton 1.9.1 08-Apr-2002 miquels@cistron.nl
#

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/local/bin/monitor-link.sh
NAME=monitor-link
DESC="link quality measurement"

test -x $DAEMON || exit 0

set -e

case "$1" in
  start)
    echo -n "Starting $DESC: $NAME"
    start-stop-daemon --start -b --quiet --pidfile /var/run/$NAME.pid \
        --exec $DAEMON
    echo "."
    ;;
  stop)
    echo -n "Stopping $DESC: $NAME "
    start-stop-daemon --stop --quiet --pidfile /var/run/$NAME.pid \
        --exec $DAEMON
    echo "."
    ;;
  restart|force-reload)
    #
    # If the "reload" option is implemented, move the "force-reload"
    # option to the "reload" entry above. If not, "force-reload" is
    # just the same as "restart".
    #
    echo -n "Restarting $DESC: $NAME"
    start-stop-daemon --stop --quiet --pidfile \
        /var/run/$NAME.pid --exec $DAEMON
    sleep 1
    start-stop-daemon --start -b --quiet --pidfile \
        /var/run/$NAME.pid --exec $DAEMON
    echo "."
    ;;
  *)
    N=/etc/init.d/$NAME
    # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac

exit 0
Diagnostics
You can check remotely that SNMP is working with:
snmpget -c public -v 1 <ip> <oid>
Use any IP and OID you like. The OIDs in the tables on this page are worth testing.
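To test all ten exec OIDs on one device in a single pass, something like the following sketch may help. It assumes the "public" community and SNMP v1, as in the command above; the function names metrix_oids and poll_metrix are made up for illustration.

```shell
# Hypothetical helper: print the ten exec OIDs listed on this page.
metrix_oids() {
    i=1
    while [ "$i" -le 10 ]; do
        echo "1.3.6.1.4.1.2021.8.1.101.$i"
        i=$((i + 1))
    done
}

# Hypothetical helper: query each OID on one host and report failures.
# Assumes community "public" and SNMP v1, as in the snmpget line above.
poll_metrix() {
    host=$1
    for oid in $(metrix_oids); do
        snmpget -c public -v 1 "$host" "$oid" || echo "FAILED: $oid" >&2
    done
}
```

Call it as `poll_metrix <ip>`. Any OID that does not answer is reported on stderr.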
TODO
- A (remote) backup strategy for mysql tables and rrds