This is an attempt to find a good way of monitoring and logging what is going on in the HA module. It’s a work in progress, so please feel free to contribute.
SmartCenter
The first script below is triggered by a custom alert and writes to log files in the /var/tmp/clusterxl_alert directory on the SmartCenter. Using the cron job, a daily email can be sent with a summary of the day’s alerts. This was posted to CPUG by yheffen – https://www.cpug.org/forums/clustering-security-gateway-ha-clusterxl/9992-ha-failover-log-files.html. Originally written for the Korn shell, it works equally well in bash.
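#!/bin/bash
# Append each alert line read from stdin to a running log and a daily log.
DIR="/var/tmp/clusterxl_alert"
DAILY_LOG="$DIR/alert_daily.log"
LOG="$DIR/alert.log"
# Create a log file with mode 644 if it does not exist yet
mklog () {
    if [ ! -f "$1" ]; then
        touch "$1"
        chmod 644 "$1"
    fi
}
# Make sure the directory and both log files exist before writing
mkdir -p "$DIR"
mklog "$LOG"
mklog "$DAILY_LOG"
while read -r ALERT; do
    echo "$ALERT" >> "$DAILY_LOG"
    echo "$ALERT" >> "$LOG"
done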
The path to the script is entered as one of the “UserDefined scripts” in the “Policy > Global Properties > Log and Alert > Alert Commands” window. Then, in the “ClusterXL” window of the cluster object’s properties, select this User Defined Alert in the “Tracking” section.
Cron job code:
0 5 * * * [ -f /var/tmp/clusterxl_alert/alert_daily.log ] && mailx -s "ClusterXL Alerts" me@example.com < /var/tmp/clusterxl_alert/alert_daily.log && rm /var/tmp/clusterxl_alert/alert_daily.log
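The cron line removes the daily log only after mailx has finished, so an alert that arrives while the mail is being sent could be deleted unseen. A minimal sketch of the same job as a function that moves the log aside first, so new alerts land in a fresh file. The function name send_daily_summary, the MAILER override and the .sending suffix are made up for illustration and are not part of the original post.

```shell
#!/bin/bash
# MAILER is a hypothetical override hook; it defaults to mailx as in the cron line.
MAILER="${MAILER:-mailx}"

# send_daily_summary LOGFILE RECIPIENT
send_daily_summary () {
    local log="$1" rcpt="$2" snapshot="$1.sending"
    [ -s "$log" ] || return 0            # missing or empty log: nothing to send
    mv "$log" "$snapshot" || return 1    # alerts arriving from now on go to a new file
    if "$MAILER" -s "ClusterXL Alerts" "$rcpt" < "$snapshot"; then
        rm -f "$snapshot"
    else
        # Mail failed: put the alerts back so the next run retries them
        cat "$snapshot" >> "$log" && rm -f "$snapshot"
        return 1
    fi
}
```

Cron would then call a small script that ends with something like send_daily_summary /var/tmp/clusterxl_alert/alert_daily.log me@example.com.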
Security Gateway
This next script, which is very quick and dirty, monitors the interfaces using “cphaprob -a if”. It polls every 2 seconds, writes the result to a file (ha_polled.txt) and compares it against a reference file (ha_ref.txt) which is created when the script first starts. If a difference is found, it is logged to the ha_alert.log file. There are better ways to do this but as I said, it’s quick and dirty 🙂
#!/bin/bash
# Poll "cphaprob -a if" and log any change against a reference snapshot.
# variables
DIR="/var/tmp"
REFERENCE="$DIR/ha_ref.txt"
POLLED="$DIR/ha_polled.txt"
LOG="$DIR/ha_alert.log"
# functions
# Write the output directly so line breaks are preserved for diff
mkref () {
    cphaprob -a if > "$REFERENCE"
}
mkpoll () {
    cphaprob -a if > "$POLLED"
}
# main process
# make reference file
mkref
echo "Entering polling loop, use ctrl-c or"
echo "\"kill \$(pgrep ${0##*/})\" from a different terminal to exit"
echo
# Poll every 2 seconds and compare until ctrl-c.
# If the status changes, log it and then make new reference data
while true; do
    mkpoll
    DIFF=$(diff "$REFERENCE" "$POLLED")
    if [ -n "$DIFF" ]; then
        echo "Change logged to $LOG"
        echo "" >> "$LOG"
        echo "$DIFF" >> "$LOG"
        mkref
    fi
    # Sleep on every pass, not only after a change
    sleep 2
done
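Running this as admin in expert mode with nohup and an ampersand keeps the process running in the background even if the terminal is disconnected (an ampersand on its own does not protect the job from the hangup signal sent when the session drops):
[Expert@gw]# nohup ./ha_monitor.sh &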
One issue here is that if an interface is down, “cphaprob -a if” shows the number of seconds it has been down for:
[Expert@gw]# cphaprob -a if
Required interfaces: 4
Required secured interfaces: 2
eth0 UP sync(secured), multicast
eth1 Inbound: DOWN (4.7 secs) Outbound: DOWN (5 secs) sync(secured), multicast
eth2 UP non sync(non secured), multicast
eth3 UP non sync(non secured), multicast
It will therefore see a discrepancy on every poll as the seconds number increases and will create a log entry every 2 seconds until the interface comes back up. Like I said, quick, dirty and a work-in-progress 🙂
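One possible way around this, sketched under the assumption that the counters always appear as “(N secs)” as in the output above, is to strip them before the comparison. The function name strip_down_timers is made up for illustration.

```shell
#!/bin/bash
# Remove "(N secs)" down-time counters from cphaprob-style text on stdin,
# so that two polls of an interface that is still down compare equal.
# Plain basic regex: the parentheses are literal and "(secured)" is left alone.
strip_down_timers () {
    sed 's/ *([0-9][0-9.]* secs)//g'
}

# Possible use inside mkref and mkpoll:
#   cphaprob -a if | strip_down_timers > "$POLLED"
```

With the counters removed, a change is logged once when the interface goes down and once when it comes back up, rather than on every poll.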
EDIT:
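New script now. It compares the output of “cphaprob stat” between polls, which sidesteps the seconds counter in the “cphaprob -a if” output, and records “cphaprob stat”, “cphaprob list” and “cphaprob -a if” in the log whenever the status changes: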
#!/bin/bash
# variables
HOSTNAME=$(hostname)
DIR="/var/tmp"
LOG="$DIR/${HOSTNAME}_hamon.log"
# functions
mkref () {
echo "Making new reference .." >> $LOG
REFERENCE="`cphaprob stat`"
echo "Done" >> $LOG
echo "" >> $LOG
}
mkpoll () {
POLLED="`cphaprob stat`"
}
getAndLogVals () {
CPHAPROBSTAT=`cphaprob stat`
CPHAPROBLIST=`cphaprob list | grep -v "Time since" | grep -v "Registration number" | grep -v "Timeout: none"`
CPHAPROBAIF=`cphaprob -a if`
echo "" >> $LOG
echo "cphaprob stat:" >> $LOG
echo "--------------" >> $LOG
echo "$CPHAPROBSTAT" >> $LOG
echo "" >> $LOG
echo "cphaprob list:" >> $LOG
echo "--------------" >> $LOG
echo "$CPHAPROBLIST" >> $LOG
echo "" >> $LOG
echo "cphaprob -a if:" >> $LOG
echo "---------------" >> $LOG
echo "$CPHAPROBAIF" >> $LOG
echo "" >> $LOG
}
# main []
if [ -f "$LOG" ]; then
echo "Removing old log file .."
rm "$LOG"
fi
echo "Starting logging at $(date)" >> "$LOG"
echo "" >> $LOG
# Record original vals to the log
getAndLogVals
# get reference vals
mkref
echo "Monitoring Failover status, use ctrl-c or \"kill \$(pgrep ${0##*/})\" from a different terminal to exit"
# Poll continuously and compare until ctrl-c. If status changes, log and get new reference data
while true; do
mkpoll
if [ "$POLLED" != "$REFERENCE" ]; then
DIFF="$REFERENCE / $POLLED"
echo "" >> $LOG
echo "=============================================================================" >> $LOG
echo "" >> $LOG
echo `date` >> $LOG
echo "" >> $LOG
echo "HA Status Change detected, logged to $LOG"
echo "$DIFF" >> $LOG
echo "" >> $LOG
getAndLogVals
mkref
fi
done
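The log entry written on a change records the old and new “cphaprob stat” output back to back, which can be hard to scan. A rough sketch of a helper that prints only the lines that differ. show_change is a hypothetical name, and the sketch assumes mktemp and diff are available on the gateway (diff is already used by the earlier script).

```shell
#!/bin/bash
# show_change OLD_TEXT NEW_TEXT
# Print only the lines that differ between two captured command outputs.
show_change () {
    local old new
    old=$(mktemp) || return 1
    new=$(mktemp) || { rm -f "$old"; return 1; }
    printf '%s\n' "$1" > "$old"
    printf '%s\n' "$2" > "$new"
    diff "$old" "$new"       # diff exits 1 when the inputs differ; that is expected here
    rm -f "$old" "$new"
}

# Possible use in the polling loop, in place of building DIFF by hand:
#   show_change "$REFERENCE" "$POLLED" >> "$LOG"
```

Lines starting with “<” come from the old reference and lines starting with “>” from the new poll.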