Thursday, April 12, 2007

Installation of Nagios 2.8

The main packages to download:
 Package                      Location for getting the file                                 Description
 nagios-2.8.tar.gz            http://www.nagios.org/download/                               Nagios main program
 nrpe-2.6.tar.gz              http://www.nagios.org/download/                               Nagios remote agent (NRPE) for monitoring other hosts
 nagios-plugins-1.4.6.tar.gz  http://www.nagios.org/download/                               Nagios Plugins
 imagepak-base.tar.gz         http://sourceforge.net/project/showfiles.php?group_id=26589   Basic image (logo) pack
 
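Server-side installation and configuration for monitoring the local host:
 First, create the required user and directories and set their ownership: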
  useradd nagios
  mkdir /usr/local/nagios
  mkdir /usr/local/nagios/libexec
  chown -R nagios:nagios /usr/local/nagios
Install gd-devel, libpng-devel and libjpeg-devel with up2date. They can also be installed from source, from RPM packages, or with yum.
up2date gd-devel
up2date libpng-devel
up2date libjpeg-devel
  Unpack the Nagios source package:
  cd /usr/local/src
  tar xzvf nagios-2.8.tar.gz
 
  Compile and install Nagios:
  cd nagios-2.8
 ./configure --prefix=/usr/local/nagios --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios --with-nagios-group=nagios
 
*** Configuration summary for nagios 2.5 07-13-2006 ***:
 
 General Options:
 -------------------------
        Nagios executable:  nagios
        Nagios user/group:  nagios,nagios
       Command user/group:  nagios,nagcmd
            Embedded Perl:  no
             Event Broker:  yes
        Install ${prefix}:  /usr/local/nagios
                Lock file:  ${prefix}/var/nagios.lock
           Init directory:  /etc/rc.d/init.d
                  Host OS:  linux-gnu
 
 Web Interface Options:
 ------------------------
                 HTML URL:  http://localhost/nagios/
                  CGI URL:  http://localhost/nagios/cgi-bin/
 Traceroute (used by WAP):  /bin/traceroute
 

Review the options above for accuracy.  If they look okay,
type 'make all' to compile the main program and CGIs.
 
make all
make install
make install-init
make install-commandmode
make install-config
That's it, the Nagios main program is installed. Simple, isn't it? Next, install the Nagios Plug-Ins:
 
  cd /usr/local/src
  tar xzvf nagios-plugins-1.4.6.tar.gz
  cd nagios-plugins-1.4.6
./configure --prefix=/usr/local/nagios --with-cgiurl=nagios/cgi-bin --with-mysql=/usr/local/mysql/bin/mysql_config   --enable-ssl --enable-command-args
config.status: creating po/Makefile
              --with-ping6-command: /bin/ping6 -n -U -w %d -c %d %s
               --with-ping-command: /bin/ping -n -U -w %d -c %d %s
                      --with-lwres: no
                       --with-ipv6: yes
                      --with-mysql: no
                    --with-openssl: yes
                     --with-gnutls: no
      --enable-emulate-getaddrinfo: no
                       --with-perl: /usr/bin/perl
                     --with-cgiurl: nagios/cgi-bin
                --with-nagios-user: nagios
               --with-nagios-group: nagios
               --with-trusted-path: /bin:/sbin:/usr/bin:/usr/sbin
 
make
make install
 
  Fixing an error when building and installing nagios-plugins 1.4.6:
 
With the new nagios-plugins 1.4.6, make install fails with:
Making install in po
make[1]: Entering directory `/opt/software/nagios/nagios-plugins-1.4.6/po'
/bin/sh @MKINSTALLDIRS@ /usr/local/nagios-plugins/share
/bin/sh: @MKINSTALLDIRS@: No such file or directory
make[1]: *** [install-data-yes] Error 127
make[1]: Leaving directory `/opt/software/nagios/nagios-plugins-1.4.6/po'
make: *** [install-recursive] Error 1
 
To fix this, edit po/Makefile and change:

MKINSTALLDIRS = @MKINSTALLDIRS@
mkinstalldirs = $(SHELL) $(MKINSTALLDIRS)
to:
MKINSTALLDIRS = $(top_builddir)/./mkinstalldirs
mkinstalldirs = $(SHELL) $(MKINSTALLDIRS)
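
The same edit can be scripted. The following is a minimal sketch, assuming a POSIX sed and that it is run from the top of the nagios-plugins source tree; the function name fix_po_makefile is my own and not part of nagios-plugins.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the unexpanded @MKINSTALLDIRS@ placeholder
# in a po/Makefile so that "make install" can find the mkinstalldirs script.
fix_po_makefile() {
    makefile=$1
    # Keep a backup, then replace only the MKINSTALLDIRS assignment line.
    cp "$makefile" "$makefile.bak" || return 1
    sed 's|^MKINSTALLDIRS = @MKINSTALLDIRS@|MKINSTALLDIRS = $(top_builddir)/./mkinstalldirs|' \
        "$makefile.bak" > "$makefile"
}

# Usage (from the nagios-plugins source directory):
#   fix_po_makefile po/Makefile && make install
```

The backup copy (po/Makefile.bak) makes it easy to revert if the build directory differs from the one shown here.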
 Install the basic Nagios image pack:
  cd /opt/software/nagios
  tar xzvf imagepak-base.tar.gz
  mv base /usr/local/nagios/share/images/logos/
 Now configure Apache for the Nagios web interface.
 
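First, add the line 'Include /usr/local/apache2/conf/extra/nagios-server.conf' to your Apache configuration file httpd.conf. (This file keeps all the Nagios-related settings in one place; alternatively, append its contents directly to the end of httpd.conf.)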
 
 Next, edit /usr/local/apache2/conf/extra/nagios-server.conf and insert:
     ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin/"
   <Directory "/usr/local/nagios/sbin/">
      Options ExecCGI
      AllowOverride None
      Order allow,deny
      Allow from all
      AuthName "Nagios Access"
      AuthType Basic
      AuthUserFile /usr/local/nagios/etc/htpasswd.users
      Require valid-user
   </Directory>
  
   Alias /nagios "/usr/local/nagios/share/"
   <Directory "/usr/local/nagios/share/">
      Options None
      AllowOverride None
      Order allow,deny
      Allow from all
      AuthName "Nagios Access"
      AuthType Basic
      AuthUserFile /usr/local/nagios/etc/htpasswd.users
      Require valid-user
   </Directory>
 
 Create the password file for user authentication and add the user nagiosadmin (note: this user is both the Apache login account and tied to permissions inside Nagios, as explained later in this document):
 htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
   New password: <enter password you want to use>
   Re-type new password: <re-enter password you want to use>
   Adding password for user nagiosadmin
 
 Restart Apache so that the new configuration takes effect.
 
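 Next comes the Nagios configuration itself. The Nagios configuration files are very flexible and offer a large number of parameters, so options can be tuned in fine detail to suit all kinds of monitoring needs. To use Nagios well you need a thorough understanding of the configuration files and a sensible plan for them. What follows is the configuration I actually use, for reference; once you are familiar with Nagios configuration you can tailor your own files and layout.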
 
cd /usr/local/nagios/etc
for f in *.cfg-sample; do mv "$f" "${f%-sample}"; done  # rename each .cfg-sample file to .cfg
 
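 The etc directory now contains these configuration files:
Config file      Description
cgi.cfg          Settings for the CGI scripts, such as user authorization
commands.cfg     Command definitions
localhost.cfg    A sample configuration for monitoring the local host
nagios.cfg       The main Nagios configuration file
resource.cfg     Resource file defining user macros (such as $USER1$, the plugin path) used by the check commands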
 
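 To make future maintenance and management easier, I organize the configuration files as follows:
 
Configuration file layout:
etc/
|-- cgi.cfg
|-- commands.cfg
|-- nagios.cfg
|-- resource.cfg
(the above are the main Nagios system configuration files)

etc/servers/
|-- contacts.cfg       default definitions for contacts and contact groups
|-- hostgroups.cfg     default definitions for host groups
|-- hosts.cfg          default definitions (templates) for hosts
|-- services.cfg       default definitions (templates) for services
|-- servicegroups.cfg  default definitions for service groups
|-- timeperiod.cfg     default definitions for time periods
(the above are the monitoring-related configuration files; they are all split out of the original localhost.cfg, which makes them easier to understand and manage)

etc/servers/mgr/
|-- yourhostname.cfg
(create one directory under etc/servers/ per monitored domain to keep the domains apart; each monitored host gets its own configuration file containing its host and service definitions)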
 

◆ Settings in cgi.cfg:
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
 
These settings give nagiosadmin the highest privileges in Nagios, including the right to view the status of all hosts and services.
 
 ◆ Settings in nagios.cfg:
#cfg_file=/usr/local/nagios/etc/localhost.cfg
#cfg_file=/usr/local/nagios/etc/contactgroups.cfg
#cfg_file=/usr/local/nagios/etc/contacts.cfg                                                                    
#cfg_file=/usr/local/nagios/etc/dependencies.cfg                                                                
#cfg_file=/usr/local/nagios/etc/escalations.cfg                                                                 
#cfg_file=/usr/local/nagios/etc/hostgroups.cfg                                                                  
#cfg_file=/usr/local/nagios/etc/hosts.cfg                                                                       
#cfg_file=/usr/local/nagios/etc/services.cfg                                                                    
#cfg_file=/usr/local/nagios/etc/timeperiods.cfg
Comment out the lines above.
cfg_dir=/usr/local/nagios/etc/servers
Enable this parameter; it tells Nagios to load every .cfg file under /usr/local/nagios/etc/servers.
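
The two changes to nagios.cfg can also be made with a small script. This is only a sketch under assumptions: it expects the sample file to contain an active cfg_file line for localhost.cfg and, possibly, a commented-out cfg_dir line for the servers directory; the function name use_servers_dir is hypothetical. Other cfg_file lines (for example commands.cfg) are deliberately left alone.

```shell
#!/bin/sh
# Hypothetical helper: switch a nagios.cfg from the single localhost.cfg
# object file to loading every .cfg file under etc/servers.
use_servers_dir() {
    cfg=$1
    cp "$cfg" "$cfg.bak" || return 1
    # Comment out localhost.cfg and uncomment an existing cfg_dir line, if any.
    sed -e 's|^cfg_file=\(.*/localhost\.cfg\)$|#cfg_file=\1|' \
        -e 's|^#cfg_dir=\(.*/servers\)$|cfg_dir=\1|' \
        "$cfg.bak" > "$cfg"
    # Append the directive if there was no matching line to uncomment.
    grep -q '^cfg_dir=.*/servers$' "$cfg" ||
        echo 'cfg_dir=/usr/local/nagios/etc/servers' >> "$cfg"
}

# Usage:
#   use_servers_dir /usr/local/nagios/etc/nagios.cfg
```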
 
◆ Overview of the files in etc/servers (items marked in red are the ones that need to be changed or deserve attention):
timeperiod.cfg
 
###############################################################################
###############################################################################
#
# TIME PERIODS
#
###############################################################################
###############################################################################
 
# This defines a timeperiod where all times are valid for checks,
# notifications, etc.  The classic "24x7" support nightmare. :-)
 
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }
# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Working Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }
# 'nonworkhours' timeperiod definition
define timeperiod{
        timeperiod_name nonworkhours
        alias           Non-Work Hours
        sunday          00:00-24:00
        monday          00:00-09:00,17:00-24:00
        tuesday         00:00-09:00,17:00-24:00
        wednesday       00:00-09:00,17:00-24:00
        thursday        00:00-09:00,17:00-24:00
        friday          00:00-09:00,17:00-24:00
        saturday        00:00-24:00
        }
# 'none' timeperiod definition                                                                                  
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
                                            
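These define the time periods used for monitoring: 24x7 (round the clock), none (no time at all), workhours (working hours) and nonworkhours (outside working hours). They can be referenced in host and service definitions.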
 
contacts.cfg
 
###############################################################################
###############################################################################
#
# CONTACTS
#
###############################################################################
###############################################################################
 
# In this simple config file, a single contact will receive all alerts.
# This assumes that you have an account (or email alias) called
# "nagios-admin" on the local host.
 
define contact{
    contact_name                        nagios
    alias                               nagios
    service_notification_period         24x7
    host_notification_period            24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands       notify-by-email
    host_notification_commands          host-notify-by-email
    email                              zhuheng1229@gmail.com
    }
 
###############################################################################
###############################################################################
#      
# CONTACT GROUPS
#
###############################################################################
###############################################################################
 
# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
 
define contactgroup{
        contactgroup_name   nagios
        alias           nagios
        members         nagios
        }
 
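This file defines the contacts (administrators) and contact groups, together with how each contact is reached (mail or SMS). Note that for a contact to view its own hosts and services in the web interface, its contact_name must match an account in htpasswd.users.
You can define several contact groups here to divide the administrators into groups; referencing those groups in host and service definitions lets each group monitor only its own servers.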
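
Since web logins are matched against contact names, it can help to list contacts that have no login. The following is a rough sketch; the function name contacts_without_login is hypothetical, and it assumes the usual htpasswd file format of one user:hash entry per line.

```shell
#!/bin/sh
# Hypothetical helper: print every contact_name from a contacts file that has
# no matching user entry ("user:hash") in an htpasswd file.
contacts_without_login() {
    contacts_cfg=$1
    htpasswd_file=$2
    awk '$1 == "contact_name" { print $2 }' "$contacts_cfg" |
    while read -r name; do
        # Exact match on the user field, so dots in names are not treated as regex.
        awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' \
            "$htpasswd_file" || echo "$name"
    done
}

# Usage:
#   contacts_without_login /usr/local/nagios/etc/servers/contacts.cfg \
#       /usr/local/nagios/etc/htpasswd.users
```

An empty result means every contact has a web login; any name printed can be added with htpasswd (without the -c option, which would recreate the file).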
 
hosts.cfg
 
###############################################################################
###############################################################################
#
# HOSTS
#
###############################################################################
###############################################################################
 
# Generic host definition template - This is NOT a real host, just a template!
 
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
 

# Linux host definition template - This is NOT a real host, just a template!
 
define host{
        name                            linux-server    ; The name of this host template                       
        use                             generic-host    ; This template inherits other values from the generic-host template 
        check_period                    24x7            ; By default, Linux hosts are checked round the clock  
        max_check_attempts              10              ; Check each Linux host 10 times (max)                 
        check_command                   check-host-alive ; Default command to check Linux hosts                
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day 
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notification every 2 hours                    
        notification_options            d,u,r           ; Only send notifications for specific host states     
        contact_groups                  nagios          ; Notifications get sent to the admins by default      
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
 
These templates define the default attributes shared by all hosts; each machine's host definition refers to them.
 
hostgroups.cfg
 
###############################################################################
###############################################################################
#
# HOST GROUPS
#
###############################################################################
###############################################################################
 
# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
hostgroup_name basic-clients
alias basic clients
members localhost
}
 
define hostgroup{
hostgroup_name your-routers
alias routers
members localhost
}
 
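This file defines the groups that hosts belong to, which makes the status views easier to read. hostgroup_name sets the group name, alias is a descriptive alias, and members lists the member hosts, using the host_name values defined in each host's configuration file.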
 
services.cfg
 
###############################################################################
###############################################################################
#
# SERVICES
#
###############################################################################
###############################################################################
 
# Generic service definition template - This is NOT a real service, just a template!
 
define service{
        name                            generic-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service 'freshness'
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        failure_prediction_enabled      1                       ; Failure prediction is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
 
This template defines the default attributes shared by all services; each service definition refers to it.
 
◆ Overview of 100.cfg in etc/servers/mgr (items marked in red are the ones that need to be changed or deserve attention):
 
define host {
use                            generic-host      ; refers to the template name defined in hosts.cfg
host_name                      bj.verican.us     ; name of the monitored server
address                        192.168.30.222    ; IP address of the monitored server
check_command                  check-host-alive
max_check_attempts             10
notification_interval          480
notification_period            24x7
notification_options           d,u,r
contact_groups                 nagios
}
 
define service{
        use   generic-service      ; refers to the template name defined in services.cfg
        host_name   bj.verican.us  ; refers to the host_name defined in the host block above
        service_description PING      
        is_volatile 0
        check_period 24x7       ; refers to a timeperiod_name defined in timeperiod.cfg
        max_check_attempts 1
        normal_check_interval 1
        retry_check_interval 1
        contact_groups nagios      ; refers to a contactgroup_name defined in contacts.cfg
        notification_options w,u,c,r
        notification_interval 240
        notification_period 24x7
        check_command check_ping!100.0,20%!500.0,60%   ; uses a check command defined in commands.cfg
}
 
define service{
        use   generic-service
        host_name bj.verican.us
        service_description APACHE
        is_volatile 0
        check_period 24x7
        max_check_attempts 1
        normal_check_interval 1
        retry_check_interval 1
        contact_groups nagios
        notification_options w,u,c,r
        notification_interval 240
        notification_period 24x7
        check_command check_http
}
 
This is the final per-host configuration file; it contains the definition of each monitored host and the services to be monitored on it.
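
With one file per monitored host, the definitions become repetitive, so they can be generated. This is a sketch only: the function name gen_host_cfg, the host name web01.example.com and the address 192.0.2.10 are placeholders of mine, and the values simply mirror the host and PING service shown above.

```shell
#!/bin/sh
# Hypothetical helper: print a host definition plus a PING service for one
# monitored machine, reusing the generic-host/generic-service templates and
# the "nagios" contact group defined earlier.
gen_host_cfg() {
    host=$1
    addr=$2
    cat <<EOF
define host{
        use                     generic-host
        host_name               $host
        address                 $addr
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   480
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          nagios
        }

define service{
        use                     generic-service
        host_name               $host
        service_description     PING
        check_period            24x7
        max_check_attempts      1
        normal_check_interval   1
        retry_check_interval    1
        contact_groups          nagios
        notification_options    w,u,c,r
        notification_interval   240
        notification_period     24x7
        check_command           check_ping!100.0,20%!500.0,60%
        }
EOF
}

# Usage:
#   gen_host_cfg web01.example.com 192.0.2.10 > /usr/local/nagios/etc/servers/mgr/web01.cfg
```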
 
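The basic Nagios configuration is now complete.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Run this command to check all the configuration files. If everything is correct, the output looks like this: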
Nagios 2.8
Copyright (c) 1999-2006 Ethan Galstad (http://www.nagios.org)
Last Modified: 07-13-2006
License: GPL
 
Reading configuration data...
 
Running pre-flight check on configuration data...
 
Checking services...
        Checked 5 services.
Checking hosts...
        Checked 1 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 22 commands.
Checking time periods...
        Checked 1 time periods.
Checking extended host info definitions...
        Checked 0 extended host info definitions.
Checking extended service info definitions...
        Checked 0 extended service info definitions.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
 
Total Warnings: 0
Total Errors:   0
 
Things look okay - No serious problems were detected during the pre-flight check
 
Now the Nagios monitoring service can be started:
service nagios start
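
The check-then-start sequence above can be wrapped so that Nagios is only started when the pre-flight check passes. A minimal sketch, relying on nagios -v returning a non-zero exit status when it finds errors; the function name preflight_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run the pre-flight check and report whether it passed.
# Relies on "nagios -v" exiting non-zero when the check finds errors.
preflight_ok() {
    nagios_bin=$1
    cfg=$2
    if "$nagios_bin" -v "$cfg" > /dev/null 2>&1; then
        echo "configuration OK"
    else
        echo "configuration has errors; run $nagios_bin -v $cfg for details" >&2
        return 1
    fi
}

# Usage:
#   preflight_ok /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg \
#       && service nagios start
```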
 
Use ps -ef to check that the service process exists:
nagios   28807     1  0 Feb07 ?        00:00:13 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 
Now open the Nagios web interface at http://servername/nagios and log in with the nagiosadmin user name and password to see the monitoring status.

Tuesday, April 3, 2007

Integrate Apache 2.2.3 and Tomcat 5.5

Modify /usr/local/apache2/conf/httpd.conf and add the following directives to the end of the file:
Include conf/extra/httpd-vhosts.conf
ProxyPass / ajp://127.0.0.1:8009/
ProxyPassReverse / ajp://127.0.0.1:8009/
 
Modify /usr/local/apache2/conf/extra/httpd-vhosts.conf and add the following block to the end of the file:
<VirtualHost *:80>
    ServerAdmin zh1229@gmail.com
    DocumentRoot /opt/tomcat5/webapps/ROOT/
    ServerName 127.0.0.1
    ErrorLog logs/tomcat-error_log
    CustomLog logs/tomcat-access_log common
    ProxyPass / ajp://localhost:8009/
    ProxyPassReverse / ajp://localhost:8009/
    ServerName localhost
    ServerAlias 127.0.0.1
</VirtualHost>
 
Modify /opt/tomcat5/conf/server.xml
      <Host name="localhost" appBase="webapps"
       unpackWARs="true" autoDeploy="true"
       xmlValidation="false" xmlNamespaceAware="false">
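Make sure DocumentRoot and appBase refer to the same directory tree: here appBase "webapps" resolves to /opt/tomcat5/webapps, and DocumentRoot is the ROOT application inside it.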
 
Now restart Apache and Tomcat and check the result.
If the following three modules are not built in, you can add them to Apache's modules with the apxs tool, then make sure the corresponding LoadModule lines in httpd.conf are not commented out:
#LoadModule proxy_module modules/mod_proxy.so
#LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
#LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
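
Uncommenting the three lines can be scripted. This is a sketch under assumptions: the function name enable_proxy_modules is hypothetical, and it only edits the text of httpd.conf; the module files themselves must already exist under modules/.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the three mod_proxy LoadModule lines in an
# httpd.conf. Other commented-out LoadModule lines are left untouched.
enable_proxy_modules() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    sed -e 's|^#\(LoadModule proxy_module \)|\1|' \
        -e 's|^#\(LoadModule proxy_ajp_module \)|\1|' \
        -e 's|^#\(LoadModule proxy_balancer_module \)|\1|' \
        "$conf.bak" > "$conf"
}

# Usage:
#   enable_proxy_modules /usr/local/apache2/conf/httpd.conf
#   /usr/local/apache2/bin/apachectl configtest
```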
 
Yours sincerely,
Henry
2007-04-03