I have been using self-hosted Kolab Groupware every day for quite a while now.
Therefore the need arose to monitor process activity and system resources using the Monit utility.
### A couple of words about Monit
Monit is a simple and robust utility for monitoring and automatic maintenance, which is supported on Linux, BSD and OS X.
### Software installation
Debian Wheezy currently provides Monit 5.4.
To install it, execute the following command:
<pre>$ sudo apt-get install monit</pre>
The Monit daemon will be started at boot time. You can also use the standard _System V init scripts_ to manage the service.
### Initial configuration {#initial_configuration}
Configuration files are located in the `/etc/monit/` directory. Default settings are stored in the `/etc/monit/monitrc` file, which I strongly suggest reading.
Custom configuration will be stored in the `/etc/monit/conf.d/` directory.
I will override several important settings using `local.conf` file.<section>
#### Modified settings
* Set email address to `root@example.org`
* Slightly change default template
* Define mail server as `localhost`
* Set default interval to `120` seconds with initial delay of `180` seconds
* Enable local web server to take advantage of the additional functionality
_(currently commented out)_</section>
<pre>$ sudo cat /etc/monit/conf.d/local.conf</pre>
<pre># define e-mail recipient
set alert root@example.org

# define e-mail template
set mail-format {
  from: monit@$HOST
  subject: monit alert -- $EVENT $SERVICE
  message: $EVENT Service $SERVICE
    Date:        $DATE
    Action:      $ACTION
    Host:        $HOST
    Description: $DESCRIPTION
}

# define server
set mailserver localhost

# define interval and initial delay
set daemon 120 with start delay 180

# set web server for local management
set httpd port 2812 and
  use the address localhost
  allow localhost</pre>
<div class="alert alert-warning">
Please note that enabling the built-in web server the way I did above allows every local user to access it and perform <i>monit</i> operations. It should either be disabled or secured with a username and password combination.
</div>
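The username and password option mentioned in the warning can be sketched as follows. This is only a sketch: the `admin` user, the `changeme` password and the `CONF_FILE` variable are placeholders I made up, and the generated statement is meant to replace the `set httpd` statement in `local.conf`, not to sit next to it.

```shell
# Write a password-protected httpd statement to a file.
# CONF_FILE, "admin" and "changeme" are hypothetical; pick your own values.
conf_file="${CONF_FILE:-$(mktemp)}"

cat > "$conf_file" <<'EOF'
# set web server for local management, protected by a password
set httpd port 2812 and
  use the address localhost
  allow localhost
  allow admin:changeme
EOF

# the file now contains a password, so keep it readable by its owner only
chmod 600 "$conf_file"

echo "wrote $conf_file"
```

After copying the statement into place, `sudo monit -t` should confirm that the control file still parses.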
### Command-line operations {#command-line_operations}
#### Verify configuration syntax
To check configuration syntax execute the following command.
<pre>$ sudo monit -t
Control file syntax OK</pre>
#### Start, Stop, Restart actions
Start all services and enable monitoring for them.
<pre>$ sudo monit start all</pre>
Start all services in the `resources` group and enable monitoring for them.
<pre>$ sudo monit -g resources start</pre>
Start the `rootfs` service and enable monitoring for it.
<pre>$ sudo monit start rootfs</pre>
You can initiate the `stop` action in the same way, which stops the service and disables monitoring for it, or use the `restart` action to stop and then start the corresponding services.
#### Monitor and unmonitor actions
Monitor all services.
<pre>$ sudo monit monitor all</pre>
Monitor all services in the `resources` group.
<pre>$ sudo monit -g resources monitor</pre>
Monitor the `rootfs` service.
<pre>$ sudo monit monitor rootfs</pre>
Use the `unmonitor` action to disable monitoring for the corresponding services.
#### Status action
Print service status.
<pre>$ sudo monit status</pre>
<pre>The Monit daemon 5.6 uptime: 27d 0h 47m

System 'server'
  status                            Running
  monitoring status                 Monitored
  load average                      [0.26] [0.43] [0.48]
  cpu                               12.8%us 2.6%sy 0.0%wa
  memory usage                      2934772 kB [36.4%]
  swap usage                        2897376 kB [35.0%]
  data collected                    Mon, 29 Sep 2014 22:47:49

Filesystem 'rootfs'
  status                            Accessible
  monitoring status                 Monitored
  permission                        660
  uid                               0
  gid                               6
  filesystem flags                  0x1000
  block size                        4096 B
  blocks total                      17161862 [67038.5 MB]
  blocks free for non superuser     7327797 [28624.2 MB] [42.7%]
  blocks free total                 8205352 [32052.2 MB] [47.8%]
  inodes total                      4374528
  inodes free                       4151728 [94.9%]
  data collected                    Mon, 29 Sep 2014 22:47:49</pre>
#### Summary action
Print a short service summary.
<pre>$ sudo monit summary
The Monit daemon 5.6 uptime: 27d 0h 48m

System 'server'                     Running
Filesystem 'rootfs'                 Accessible</pre>
#### Reload action
Reload the configuration and reinitialize the Monit daemon.
<pre>$ sudo monit reload</pre>
#### Quit action
Terminate the Monit daemon.
<pre>$ sudo monit quit
monit daemon with pid [5248] killed</pre>
### Monitor filesystems {#monitor_filesystems}
<div class="alert alert-info">
The configuration syntax is very consistent and easy to grasp. I will start with a simple example and then proceed to slightly more complex ideas. Just remember to check one thing at a time.
</div>
I am using a VPS because of its easy backup/restore process, so I have only one filesystem, on the `/dev/root` device, which I will monitor as a service named `rootfs`.
The Monit daemon will generate an alert and send an email if space or inode usage on the `rootfs` filesystem (stored on the `/dev/root` device) exceeds 80 percent of the available capacity.
<pre>$ sudo cat /etc/monit/conf.d/filesystems.conf</pre>
<pre>check filesystem rootfs with path /dev/root
  group resources
  if space usage > 80% then alert
  if inode usage > 80% then alert</pre>
The above service is placed in `resources` group for easier management.
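To see by hand what the space usage test looks at, a rough shell equivalent can be built on `df`. This is a sketch under my own assumptions, not how Monit works internally: the function names `space_usage` and `exceeds` are made up, and the `awk` parsing assumes the POSIX output format of `df -P` with a device name that contains no spaces.

```shell
# print the space usage of a mount point as a bare percentage, e.g. 43
space_usage() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# succeed when a numeric usage value is strictly above the threshold
exceeds() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # not a plain number, treat as "no alert"
  esac
  [ "$1" -gt "$2" ]
}

usage=$(space_usage /)
if exceeds "$usage" 80; then
  echo "space usage on / is ${usage}%, above the 80% threshold"
else
  echo "space usage on / is ${usage}%"
fi
```

The same idea works for inodes by reading the output of `df -Pi` instead.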
### Monitor system resources {#monitor_system_resources}
The following configuration defines a service named `server`, as it describes resource usage for the whole mail server.
The Monit daemon checks memory usage and sends an alert email if it exceeds 80% of the available capacity for three consecutive cycles.
A recovery message is sent only after two consecutive successful cycles, which limits the number of messages. The same rules apply to the remaining system resources.
The system I am using has four processors, so an alert is generated when the five-minute load average exceeds five.
<pre>$ sudo cat /etc/monit/conf.d/resources.conf</pre>
<pre>check system server
  group resources
  if memory usage > 80% for 3 cycles then alert
    else if succeeded for 2 cycles then alert
  if swap usage > 50% for 3 cycles then alert
    else if succeeded for 2 cycles then alert
  if cpu(wait) > 30% for 3 cycles then alert
    else if succeeded for 2 cycles then alert
  if cpu(system) > 60% for 3 cycles then alert
    else if succeeded for 2 cycles then alert
  if cpu(user) > 60% for 3 cycles then alert
    else if succeeded for 2 cycles then alert
  if loadavg(5min) > 5 then alert
    else if succeeded for 2 cycles then alert</pre>
The above service is placed in `resources` group for easier management.
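Because cycles are counted in poll intervals, it helps to translate them into wall-clock time. The sketch below assumes the 120 second interval from `set daemon 120` and that the counted cycles are consecutive polls; the helper names `min_seconds` and `max_seconds` are my own.

```shell
# seconds between checks, taken from "set daemon 120"
interval=120

# shortest delay between the start of a condition and the Nth
# consecutive check that sees it (condition starts just before a check)
min_seconds() {
  echo $(( ($1 - 1) * interval ))
}

# longest delay between the start of a condition and the Nth
# consecutive check that sees it (condition starts just after a check)
max_seconds() {
  echo $(( $1 * interval ))
}

echo "alert after 3 cycles: $(min_seconds 3) to $(max_seconds 3) seconds"
echo "recovery after 2 cycles: $(min_seconds 2) to $(max_seconds 2) seconds"
```

With these settings an alert arrives roughly four to six minutes after a resource crosses its threshold, and a recovery message two to four minutes after it drops back.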
### Monitor system services {#monitor_system_services}
#### cron {#cron}
cron is a daemon that executes user-specified tasks at scheduled times.
The Monit daemon uses the pid file `/var/run/crond.pid` to monitor the `cron` service and restarts it if it stops for any reason.
A configuration change generates an alert message, while a permission or ownership issue generates an alert message and disables further monitoring.
GID `102` translates to the `crontab` group.
<pre>$ sudo cat /etc/monit/conf.d/cron.conf</pre>
<pre>check process cron with pidfile /var/run/crond.pid
  group system
  group scheduled-tasks
  start program = "/usr/sbin/service cron start"
  stop program = "/usr/sbin/service cron stop"
  if 3 restarts within 5 cycles then timeout
  depends on cron_bin
  depends on cron_rc
  depends on cron_rc.d
  depends on cron_rc.daily
  depends on cron_rc.hourly
  depends on cron_rc.monthly
  depends on cron_rc.weekly
  depends on cron_rc.spool

check file cron_bin with path /usr/sbin/cron
  group scheduled-tasks
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file cron_rc with path /etc/crontab
  group scheduled-tasks
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.d with path /etc/cron.d
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.daily with path /etc/cron.daily
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.hourly with path /etc/cron.hourly
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.monthly with path /etc/cron.monthly
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.weekly with path /etc/cron.weekly
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory cron_rc.spool with path /var/spool/cron/crontabs
  group scheduled-tasks
  if changed timestamp then alert
  if failed permission 1730 then unmonitor
  if failed uid root then unmonitor
  if failed gid 102 then unmonitor</pre>
The above service is placed in `system` and `scheduled-tasks` groups for easier management.
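Numeric IDs such as GID `102` differ between systems, so it is worth looking them up before copying the configuration. A small sketch using `getent` and GNU `stat` follows; the helper names `gid_of`, `uid_of` and `mode_uid_gid` are hypothetical.

```shell
# numeric GID of a group name, read from the group database
gid_of() {
  getent group "$1" | cut -d: -f3
}

# numeric UID of a user name, read from the passwd database
uid_of() {
  getent passwd "$1" | cut -d: -f3
}

# octal mode, numeric owner and numeric group of a path (GNU stat)
mode_uid_gid() {
  stat -c '%a %u %g' "$1"
}

# examples:
#   gid_of crontab                          (102 on the system described here)
#   mode_uid_gid /var/spool/cron/crontabs   (mode, uid and gid to put in the check)
echo "root has uid $(uid_of root) and gid $(gid_of root)"
```

The values printed by `mode_uid_gid` map directly onto the `permission`, `uid` and `gid` tests of a `check file` or `check directory` entry.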
#### rsyslogd {#rsyslogd}
rsyslogd is a message logging service.
<pre>$ sudo cat /etc/monit/conf.d/rsyslogd.conf</pre>
<pre>check process rsyslog with pidfile /var/run/rsyslogd.pid
  group system
  group logging
  start program = "/usr/sbin/service rsyslog start"
  stop program = "/usr/sbin/service rsyslog stop"
  if 3 restarts within 5 cycles then timeout
  depends on rsyslog_bin
  depends on rsyslog_rc
  depends on rsyslog_rc.d

check file rsyslog_bin with path /usr/sbin/rsyslogd
  group logging
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file rsyslog_rc with path /etc/rsyslog.conf
  group logging
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory rsyslog_rc.d with path /etc/rsyslog.d
  group logging
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `system` and `logging` groups for easier management.
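The `if failed checksum` tests used in these file checks boil down to "record a checksum, then compare against it later". The idea can be sketched in shell as below; MD5 via `md5sum` is only my choice for the example, and the helper names `checksum_of` and `checksum_changed` are made up.

```shell
# print the MD5 checksum of a file (first field of the md5sum output)
checksum_of() {
  md5sum "$1" | cut -d' ' -f1
}

# succeed when the file no longer matches a previously recorded checksum
checksum_changed() {
  [ "$(checksum_of "$1")" != "$2" ]
}

# record a checksum once, then compare against it on every later run
if [ -r /etc/rsyslog.conf ]; then
  recorded=$(checksum_of /etc/rsyslog.conf)
  if checksum_changed /etc/rsyslog.conf "$recorded"; then
    echo "rsyslog.conf changed"
  else
    echo "rsyslog.conf unchanged"
  fi
fi
```

In a real script the recorded value would be stored somewhere between runs; here it is taken and compared in the same run only to keep the sketch short.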
#### ntpd {#ntpd}
The Network Time Protocol daemon check is extended with port monitoring.
<pre>$ sudo cat /etc/monit/conf.d/ntpd.conf</pre>
<pre>check process ntp with pidfile /var/run/ntpd.pid
  group system
  group time
  start program = "/usr/sbin/service ntp start"
  stop program = "/usr/sbin/service ntp stop"
  if failed port 123 type udp then restart
  if 3 restarts within 5 cycles then timeout
  depends on ntp_bin
  depends on ntp_rc

check file ntp_bin with path /usr/sbin/ntpd
  group time
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file ntp_rc with path /etc/ntp.conf
  group time
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `system` and `time` groups for easier management.
#### OpenSSH {#openssh}
The OpenSSH service check is extended with `match` statements that test the content of the configuration file. I assume it is self-explanatory.
<pre>$ sudo cat /etc/monit/conf.d/openssh-server.conf</pre>
<pre>check process openssh with pidfile /var/run/sshd.pid
  group system
  group sshd
  start program = "/usr/sbin/service ssh start"
  stop program = "/usr/sbin/service ssh stop"
  if failed port 22 with proto ssh then restart
  if 3 restarts within 5 cycles then timeout
  depends on openssh_bin
  depends on openssh_sftp_bin
  depends on openssh_rsa_key
  depends on openssh_dsa_key
  depends on openssh_rc

check file openssh_bin with path /usr/sbin/sshd
  group sshd
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file openssh_sftp_bin with path /usr/lib/openssh/sftp-server
  group sshd
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file openssh_rsa_key with path /etc/ssh/ssh_host_rsa_key
  group sshd
  if failed checksum then unmonitor
  if failed permission 600 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file openssh_dsa_key with path /etc/ssh/ssh_host_dsa_key
  group sshd
  if failed checksum then unmonitor
  if failed permission 600 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file openssh_rc with path /etc/ssh/sshd_config
  group sshd
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor
  if not match "^PasswordAuthentication no" then alert
  if not match "^PubkeyAuthentication yes" then alert
  if not match "^PermitRootLogin no" then alert</pre>
The above service is placed in `system` and `sshd` groups for easier management.
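Before relying on the `match` tests, the same three patterns can be tried by hand with `grep`. This is a sketch of the intent of those tests, not of how Monit evaluates `match` internally; the function names `has_directive` and `check_sshd_config` are hypothetical.

```shell
# succeed when the file has a line starting with the given directive
has_directive() {
  grep -Eq "^$2" "$1"
}

# report each required directive that is missing from the file,
# return non-zero when at least one is missing
check_sshd_config() {
  missing=0
  for directive in 'PasswordAuthentication no' \
                   'PubkeyAuthentication yes' \
                   'PermitRootLogin no'; do
    if ! has_directive "$1" "$directive"; then
      echo "missing: $directive"
      missing=1
    fi
  done
  return "$missing"
}

if [ -r /etc/ssh/sshd_config ]; then
  check_sshd_config /etc/ssh/sshd_config || echo "sshd_config needs attention"
fi
```

The patterns are anchored at the start of the line, so a commented-out directive such as `#PermitRootLogin no` is reported as missing.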
### Monitor Kolab services {#monitor_kolab_services}
#### MySQL {#mysql}
MySQL is an open-source database server used by a wide range of Kolab services.
UID `106` translates to the `mysql` user. GID `110` translates to the `mysql` group.
This is the first place where I use the `unixsocket` statement.
<pre>$ sudo cat /etc/monit/conf.d/mysql.conf</pre>
<pre>check process mysql with pidfile /var/run/mysqld/mysqld.pid
  group kolab
  group database
  start program = "/usr/sbin/service mysql start"
  stop program = "/usr/sbin/service mysql stop"
  if failed port 3306 protocol mysql then restart
  if failed unixsocket /var/run/mysqld/mysqld.sock protocol mysql then restart
  if 3 restarts within 5 cycles then timeout
  depends on mysql_bin
  depends on mysql_rc
  depends on mysql_sys_maint
  depends on mysql_data

check file mysql_bin with path /usr/sbin/mysqld
  group database
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file mysql_rc with path /etc/mysql/my.cnf
  group database
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file mysql_sys_maint with path /etc/mysql/debian.cnf
  group database
  if failed checksum then unmonitor
  if failed permission 600 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory mysql_data with path /var/lib/mysql
  group database
  if failed permission 700 then unmonitor
  if failed uid 106 then unmonitor
  if failed gid 110 then unmonitor</pre>
The above service is placed in `kolab` and `database` groups for easier management.
#### Apache {#apache}
Apache is an open-source HTTP server used to serve user/admin web-interface.
Please notice that I am checking _HTTPS_ port.
<pre>$ sudo cat /etc/monit/conf.d/apache.conf</pre>
<pre>check process apache with pidfile /var/run/apache2.pid
  group kolab
  group web-server
  start program = "/usr/sbin/service apache2 start"
  stop program = "/usr/sbin/service apache2 stop"
  if failed port 443 then restart
  if 3 restarts within 5 cycles then timeout
  depends on apache2_bin
  depends on apache2_rc
  depends on apache2_rc_mods
  depends on apache2_rc_sites

check file apache2_bin with path /usr/sbin/apache2.prefork
  group web-server
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory apache2_rc with path /etc/apache2
  group web-server
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory apache2_rc_mods with path /etc/apache2/mods-enabled
  group web-server
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory apache2_rc_sites with path /etc/apache2/sites-enabled
  group web-server
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `web-server` groups for easier management.
#### Kolab daemon {#kolab_daemon}
This is the heart of the whole Kolab unified communication and collaboration system as it is responsible for data synchronization between different services.
UID of `413` translates to `kolab-n` user. GID of `412` translates to `kolab` group.
<pre>$ sudo cat /etc/monit/conf.d/kolab-server.conf</pre>
<pre>check process kolab-server with pidfile /var/run/kolabd/kolabd.pid
  group kolab
  group kolab-daemon
  start program = "/usr/sbin/service kolab-server start"
  stop program = "/usr/sbin/service kolab-server stop"
  if 3 restarts within 5 cycles then timeout
  depends on kolab-daemon_bin
  depends on kolab-daemon_rc

check file kolab-daemon_bin with path /usr/sbin/kolabd
  group kolab-daemon
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file kolab-daemon_rc with path /etc/kolab/kolab.conf
  group kolab-daemon
  if failed checksum then alert
  if failed permission 640 then unmonitor
  if failed uid 413 then unmonitor
  if failed gid 412 then unmonitor</pre>
The above service is placed in `kolab` and `kolab-daemon` groups for easier management.
#### Kolab saslauthd {#kolab_saslauthd}
Kolab saslauthd is the SASL authentication daemon for multi-domain Kolab deployments.
<pre>$ sudo cat /etc/monit/conf.d/kolab-saslauthd.conf</pre>
<pre>check process kolab-saslauthd with pidfile /var/run/kolab-saslauthd/kolab-saslauthd.pid
  group kolab
  group kolab-saslauthd
  start program = "/usr/sbin/service kolab-saslauthd start"
  stop program = "/usr/sbin/service kolab-saslauthd stop"
  if 3 restarts within 5 cycles then timeout
  depends on kolab-saslauthd_bin

check file kolab-saslauthd_bin with path /usr/sbin/kolab-saslauthd
  group kolab-saslauthd
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `kolab-saslauthd` groups for easier management.
<div class="alert alert-info">
It can be tempting to monitor <code>/var/run/saslauthd/mux</code> socket, but just leave it alone for now.
</div>
#### Wallace {#wallace}
Wallace is a content filtering daemon.
<pre>$ sudo cat /etc/monit/conf.d/wallace.conf</pre>
<pre>check process wallace with pidfile /var/run/wallaced/wallaced.pid
  group kolab
  group wallace
  start program = "/usr/sbin/service wallace start"
  stop program = "/usr/sbin/service wallace stop"
  #if failed port 10026 then restart
  if 3 restarts within 5 cycles then timeout
  depends on wallace_bin

check file wallace_bin with path /usr/sbin/wallaced
  group wallace
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `wallace` groups for easier management.
#### ClamAV {#clamav}
ClamAV is open-source, cross-platform antivirus software.
<pre>$ sudo cat /etc/monit/conf.d/clamav.conf</pre>
<pre>check process clamav with pidfile /var/run/clamav/clamd.pid
  group system
  group antivirus
  start program = "/usr/sbin/service clamav-daemon start"
  stop program = "/usr/sbin/service clamav-daemon stop"
  if 3 restarts within 5 cycles then timeout
  #if failed unixsocket /var/run/clamav/clamd.ctl type udp then alert
  depends on clamav_bin
  depends on clamav_rc

check file clamav_bin with path /usr/sbin/clamd
  group antivirus
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file clamav_rc with path /etc/clamav/clamd.conf
  group antivirus
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `system` and `antivirus` groups for easier management.
#### Freshclam {#freshclam}
Freshclam is a tool used to periodically update the ClamAV virus databases.
<pre>$ sudo cat /etc/monit/conf.d/freshclam.conf</pre>
<pre>check process freshclam with pidfile /var/run/clamav/freshclam.pid
  group system
  group antivirus-updater
  start program = "/usr/sbin/service clamav-freshclam start"
  stop program = "/usr/sbin/service clamav-freshclam stop"
  if 3 restarts within 5 cycles then timeout
  depends on freshclam_bin
  depends on freshclam_rc

check file freshclam_bin with path /usr/bin/freshclam
  group antivirus-updater
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file freshclam_rc with path /etc/clamav/freshclam.conf
  group antivirus-updater
  if failed permission 444 then unmonitor
  if failed uid 110 then unmonitor
  if failed gid 4 then unmonitor</pre>
The above service is placed in `system` and `antivirus-updater` groups for easier management.
#### amavisd-new {#amavisd-new}
Amavis is a high-performance interface between Postfix mail server and content filtering services: SpamAssassin as a spam classifier and ClamAV as an antivirus protection.
<pre>$ sudo cat /etc/monit/conf.d/amavisd-new.conf</pre>
<pre>check process amavisd-new with pidfile /var/run/amavis/amavisd.pid
  group kolab
  group content-filter
  start program = "/usr/sbin/service amavis start"
  stop program = "/usr/sbin/service amavis stop"
  if 3 restarts within 5 cycles then timeout
  #if failed port 10024 type tcp then restart
  #if failed unixsocket /var/lib/amavis/amavisd.sock type udp then alert
  depends on amavisd-new_bin
  depends on amavisd-new_rc

check file amavisd-new_bin with path /usr/sbin/amavisd-new
  group content-filter
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory amavisd-new_rc with path /etc/amavis/
  group content-filter
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `content-filter` groups for easier management.
#### The main Directory Server daemon {#389}
The main Directory Server daemon is a 389 LDAP Directory Server.
<pre>$ sudo cat /etc/monit/conf.d/dirsrv.conf</pre>
<pre>check process dirsrv with pidfile /var/run/dirsrv/slapd-xmail.pid
  group kolab
  group dirsrv
  start program = "/usr/sbin/service dirsrv start"
  stop program = "/usr/sbin/service dirsrv stop"
  if 3 restarts within 5 cycles then timeout
  if failed port 389 type tcp then restart
  depends on dirsrv_bin
  depends on dirsrv_rc

check file dirsrv_bin with path /usr/sbin/ns-slapd
  group dirsrv
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory dirsrv_rc with path /etc/dirsrv/
  group dirsrv
  if changed timestamp then alert</pre>
The above service is placed in `kolab` and `dirsrv` groups for easier management.
#### SpamAssassin {#spamassasin}
SpamAssassin is a content filter used for spam filtering.
<pre>$ sudo cat /etc/monit/conf.d/spamd.conf</pre>
<pre>check process spamd with pidfile /var/run/spamd.pid
  group system
  group spamd
  start program = "/usr/sbin/service spamassassin start"
  stop program = "/usr/sbin/service spamassassin stop"
  if 3 restarts within 5 cycles then timeout
  #if failed port 783 type tcp then restart
  depends on spamd_bin
  depends on spamd_rc

check file spamd_bin with path /usr/sbin/spamd
  group spamd
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory spamd_rc with path /etc/spamassassin/
  group spamd
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `system` and `spamd` groups for easier management.
#### Cyrus IMAP/POP3 daemons {#cyrus}
cyrus-imapd daemon is responsible for IMAP/POP3 communication.
<pre>$ sudo cat /etc/monit/conf.d/cyrus-imapd.conf</pre>
<pre>check process cyrus-imapd with pidfile /var/run/cyrus-master.pid
  group kolab
  group cyrus-imapd
  start program = "/usr/sbin/service cyrus-imapd start"
  stop program = "/usr/sbin/service cyrus-imapd stop"
  if 3 restarts within 5 cycles then timeout
  if failed port 143 type tcp then restart
  if failed port 4190 type tcp then restart
  if failed port 993 type tcp then restart
  depends on cyrus-imapd_bin
  depends on cyrus-imapd_rc

check file cyrus-imapd_bin with path /usr/lib/cyrus-imapd/cyrus-master
  group cyrus-imapd
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check file cyrus-imapd_rc with path /etc/cyrus.conf
  group cyrus-imapd
  if failed checksum then alert
  if failed permission 644 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `cyrus-imapd` groups for easier management.
#### Postfix {#postfix}
Postfix is an open-source mail transfer agent used to route and deliver electronic mail.
<pre>$ sudo cat /etc/monit/conf.d/postfix.conf</pre>
<pre>check process postfix with pidfile /var/spool/postfix/pid/master.pid
  group kolab
  group mta
  start program = "/usr/sbin/service postfix start"
  stop program = "/usr/sbin/service postfix stop"
  if 3 restarts within 5 cycles then timeout
  if failed port 25 type tcp then restart
  #if failed port 10025 type tcp then restart
  #if failed port 10027 type tcp then restart
  if failed port 587 type tcp then restart
  depends on postfix_bin
  depends on postfix_rc

check file postfix_bin with path /usr/lib/postfix/master
  group mta
  if failed checksum then unmonitor
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor

check directory postfix_rc with path /etc/postfix/
  group mta
  if changed timestamp then alert
  if failed permission 755 then unmonitor
  if failed uid root then unmonitor
  if failed gid root then unmonitor</pre>
The above service is placed in `kolab` and `mta` groups for easier management.
### Ending notes {#ending_notes}
This blog post is definitely too long, so I will just mention that a similar configuration can be used to monitor other integrated solutions like _ISPConfig_, or custom specialized setups.
In my opinion, Monit is a great utility that simplifies system and service monitoring. It also provides interesting proactive features, such as restarting a service or executing an arbitrary program when a selected test fails.
<u>Everything</u> is described in the manual page.
<pre>$ man monit</pre>