How to use the check_openmanage plugin in Nagios

This article explains how to install and use the check_openmanage plugin so that Nagios can monitor the hardware of Dell servers.


check_openmanage is now packaged in EPEL, so once epel-release is installed the plugin can be installed with yum.

The prerequisite is that Dell OMSA (OpenManage Server Administrator) is already installed on the monitored host.

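# yum -y install nagios-plugins-openmanage.x86_64

The plugin is installed at:

/usr/lib64/nagios/plugins/check_openmanage

To make it available to a Nagios installation under /usr/local/nagios, copy it into libexec:

# cp /usr/lib64/nagios/plugins/check_openmanage /usr/local/nagios/libexec/

Installing EPEL and OMSA is not covered here; instructions are easy to find online.

Whether the plugin goes on the monitored host or on the Nagios server depends on the environment:

If SNMP is available, installing the plugin on the Nagios server is enough.

If only NRPE can be used, install the plugin on each monitored host.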
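The choice between the two transports can be sketched as a small shell helper. This is only an illustration: the function name build_check_cmd is hypothetical, the paths assume a Nagios installation under /usr/local/nagios, and check_dell_hardware is the NRPE command name defined later in this article.

```shell
# Hypothetical helper: print the command the Nagios server would run
# for a host, depending on which transport is available.
build_check_cmd() {
    mode=$1
    host=$2
    case "$mode" in
        snmp)
            # The plugin runs on the Nagios server and queries the host over SNMP.
            echo "/usr/local/nagios/libexec/check_openmanage -H $host"
            ;;
        nrpe)
            # The plugin runs on the monitored host; the server only calls NRPE.
            echo "/usr/local/nagios/libexec/check_nrpe -H $host -c check_dell_hardware"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

build_check_cmd snmp 1.2.3.4
build_check_cmd nrpe 1.2.3.4
```

The helper only prints command lines; it does not contact any host.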

Components that can be checked:

Storage components checked:

·     Controllers

·     Physical drives

·     Logical drives

·     Cache batteries

·     Connectors (channels)

·     Enclosures

·     Enclosure fans

·     Enclosure power supplies

·     Enclosure temperature probes

·     Enclosure management modules (EMMs)

Chassis components checked:

·     Processors

·     Memory modules

·     Cooling fans

·     Temperature probes

·     Power supplies

·     Batteries

·     Voltage probes

·     Power usage

·     Chassis intrusion

·     Removable flash media (SD cards)

Other:

·     ESM Log health

·     ESM Log content (default disabled)

·     Alert Log content (default disabled, not SNMP)

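Nagios can check host status over SNMP or through NRPE. With NRPE, the corresponding command must first be defined on the monitored host, as for any other NRPE service check.

Nagios command.cfg configuration for SNMP: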

# Openmanage check via SNMP

define command {

  command_name   check_openmanage

  command_line   /path/to/check_openmanage -H $HOSTADDRESS$

}

Add the OMSA service check to the Nagios server configuration:

# Dell OMSA status

define service {

  use            generic-service

  hostgroup_name       dell-servers

  service_description    Dell OMSA

  check_command       check_openmanage

}

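In a side-by-side comparison, retrieving the data over SNMP was faster than running the check locally on the host. When checking through NRPE, therefore, pass -t 30 (a 30-second timeout).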
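To repeat the comparison on your own hosts, a rough timing helper like the one below may be enough. It is a sketch with whole-second resolution; the name elapsed_seconds is hypothetical and the commented plugin paths are assumptions about where the plugin is installed.

```shell
# Hypothetical helper: run a command, discard its output, and print
# how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example invocations (paths and address are assumptions):
#   elapsed_seconds /usr/local/nagios/libexec/check_openmanage              # local check
#   elapsed_seconds /usr/local/nagios/libexec/check_openmanage -H 1.2.3.4   # SNMP check
```

If the local run regularly takes longer than the timeout configured for the check, that is the case in which a larger timeout is needed.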

The plugin's built-in help text:

$ check_openmanage -h

Usage: check_openmanage [OPTION]...

GENERAL OPTIONS: (common options, usable in both SNMP and local mode)

 -f, --config     Specify configuration file

 -p, --perfdata    Output performance data [default=no]

 -t, --timeout     Plugin timeout in seconds [default=30]

 -c, --critical    Custom temperature critical limits

 -w, --warning     Custom temperature warning limits

 -F, --fahrenheit   Use Fahrenheit as temperature unit

 -d, --debug      Debug output, reports everything

 -h, --help      Display this help text

 -V, --version     Display version info

SNMP OPTIONS: (SNMP mode)

 -H, --hostname    Hostname or IP (required for SNMP) (check_openmanage -H 1.2.3.4 )

 -C, --community    SNMP community string [default=public]

 -P, --protocol    SNMP protocol version [default=2]

 --port        SNMP port number [default=161]

 -6, --ipv6      Use IPv6 instead of IPv4 [default=no]

 --tcp         Use TCP instead of UDP [default=no]

OUTPUT OPTIONS:

 -i, --info      Prefix any alerts with the service tag

 -e, --extinfo     Append system info to alerts

 -s, --state      Prefix alerts with alert state

 -S, --short-state   Prefix alerts with alert state abbreviated

 -o, --okinfo     Verbosity when check result is OK

 -B, --show-blacklist Show blacklistings in OK output

 -I, --htmlinfo    HTML output with clickable links

CHECK CONTROL AND BLACKLISTING:

 -a, --all       Check everything, even log content

 -b, --blacklist    Blacklist missing and/or failed components

 --only        Only check a certain component or alert type

 --check        Fine-tune which components are checked

 --no-storage     Don't check storage

For more information and advanced options, see the manual page or URL:

http://folk.uio.no/trondham/software/check_openmanage.html

Output when run over SNMP:

[root@op omsa]# check_openmanage -H localhost

Controller 0 [PERC 6/i Integrated]: Firmware '6.1.1-0047' is out of date

# Output prefixed with the alert state

[root@op omsa]# check_openmanage -H localhost  -s

WARNING: Controller 0 [PERC 6/i Integrated]: Firmware '6.1.1-0047' is out of date

# This command uses the blacklist to suppress the controller firmware out-of-date warning.

[root@localhost etc]# /usr/lib/nagios/plugins/check_openmanage -H 1.2.3.4 -s -b ctrl_fw=0

OK - System: 'PowerEdge R710', SN: 'XXXXXX', 16 GB ram (8 dimms), 1 logical drives, 6 physical drives

# Check only the power supplies

[root@localhost etc]# /usr/lib/nagios/plugins/check_openmanage -H 1.2.3.4 -s --only power

POWER OK - 2 power supplies checked

Keywords accepted by --only:

Keyword      Effect
critical     Only output critical alerts. It is possible to use the --check option together with this option to adjust checks.
warning      Only output warning alerts. It is possible to use the --check option together with this option to adjust checks.
chassis      Only check chassis components, i.e. everything but storage and log content.
storage      Only check storage components
memory       Only check memory modules
fans         Only check fans
power        Only check power supplies
temp         Only check temperatures
cpu          Only check processors
voltage      Only check voltage probes
batteries    Only check batteries
amperage     Only check power usage
intrusion    Only check chassis intrusion
sdcard       Only check removable flash media
servicetag   Only check for sane service tag
esmhealth    Only check ESM log health
esmlog       Only check ESM log content
alertlog     Only check alertlog content
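One way to use these keywords is to split a host into several narrower checks, one per keyword. The sketch below only prints the command lines; the function name only_commands is hypothetical, and the plugin path and address are the ones used in this article's examples.

```shell
# Hypothetical helper: print one check_openmanage command line per
# --only keyword, so each can become its own Nagios service.
only_commands() {
    plugin=$1
    host=$2
    shift 2
    for keyword in "$@"; do
        printf '%s -H %s -s --only %s\n' "$plugin" "$host" "$keyword"
    done
}

only_commands /usr/lib/nagios/plugins/check_openmanage 1.2.3.4 power storage temp
```

Separate services make it easier to see at a glance which subsystem raised an alert, at the cost of more SNMP or NRPE round trips per host.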

# Check storage only, without the firmware warning

[root@localhost etc]# /usr/lib/nagios/plugins/check_openmanage -H 1.2.3.4 -s --only storage -b ctrl_fw=0

STORAGE OK - 6 physical drives, 1 logical drives

# To see in the output which items have been blacklisted, append -B to the command

[root@localhost etc]# /usr/lib/nagios/plugins/check_openmanage -H 1.2.3.4 -s -b ctrl_fw=0 -B

OK - System: 'PowerEdge R710', SN: 'XXXXXX', 16 GB ram (8 dimms), 1 logical drives, 6 physical drives

----- BLACKLISTED: ctrl_fw=0
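When several components are blacklisted, the entries can be combined into a single -b argument. The sketch below assumes the slash-separated form that appears in the Ubuntu example later in this article (ctrl_fw=0/pdisk=0:0:0:0,0:0:0:1); the function name join_blacklist is hypothetical.

```shell
# Hypothetical helper: join blacklist entries with "/" so they can be
# passed as one -b argument.
join_blacklist() {
    result=
    for entry in "$@"; do
        # Prepend the separator only when something is already collected.
        result=${result:+$result/}$entry
    done
    printf '%s\n' "$result"
}

join_blacklist ctrl_fw=0 pdisk=0:0:0:0,0:0:0:1
```

The result can then be used as check_openmanage -b "$(join_blacklist ...)" -B. Repeating -b once per entry, as in other examples in this article, is an alternative.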

Component keywords accepted by the blacklist option (-b):

Component       Comment
ctrl            Controller
ctrl_fw         Suppress the "special" warning message about old controller firmware. Use this if you can't or won't upgrade the firmware.
ctrl_driver     Suppress the "special" warning message about old controller driver. Particularly useful on systems where you can't upgrade the driver.
ctrl_stdr       Suppress the "special" warning message about old Windows storport driver.
pdisk           Physical disk.
pdisk_cert      Ignore warnings for non-certified physical drives
pdisk_foreign   Ignore warnings for foreign physical drives, e.g. pdisk_foreign=1:0:5
vdisk           Logical drive (virtual disk)
bat             Controller cache battery
bat_charge      Ignore warnings related to the controller cache battery charging cycle, which happens approximately every 40 days on Dell servers. Note that using this blacklist keyword makes check_openmanage ignore non-critical cache battery errors.
conn            Connector (channel)
encl            Enclosure
encl_fan        Enclosure fan
encl_ps         Enclosure power supply
encl_temp       Enclosure temperature probe
encl_emm        Enclosure management module (EMM)
dimm            Memory module
fan             Fan (Cooling device)
ps              Power supply
temp            Temperature sensor
cpu             Processor (CPU)
volt            Voltage probe
bp              System battery
amp             Amperage probe (power consumption monitoring)
intr            Intrusion sensor
sd              Removable flash media (SD card)

# Customizing the output message

Option --postmsg

$ check_openmanage --postmsg 'NOTE: Service tag: %s - Dell support: 555-1234-5678'

Power Supply 0 [AC]: Presence Detected, Failure Detected, AC Lost

Controller 0 [PERC 6/i Integrated]: Driver '00.00.03.15-RH1' is out of date

NOTE: Service tag: JV8KH0J - Dell support: 555-1234-5678

Format codes:

Code   Replaced with
%m     System model
%s     Service tag
%b     BIOS version
%d     BIOS release date
%o     Operating system name
%r     Operating system release
%p     Number of physical drives
%l     Number of logical drives
%n     Line break
%%     A literal %
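To make the substitution rules concrete, here is a small shell emulation of three of the codes (%m, %s and %%). It is not the plugin's own code, only an illustration of how a template is expanded; the function name expand_postmsg and the sample values are assumptions.

```shell
# Illustrative emulation only: expand %m, %s and %% in a template.
# Usage: expand_postmsg TEMPLATE MODEL SERVICE_TAG
expand_postmsg() {
    template=$1
    model=$2
    tag=$3
    out=
    while [ -n "$template" ]; do
        case "$template" in
            %m*) out="$out$model"; template=${template#??} ;;
            %s*) out="$out$tag";   template=${template#??} ;;
            %%*) out="$out%";      template=${template#??} ;;
            *)
                # Copy one ordinary character and move on.
                out="$out${template%"${template#?}"}"
                template=${template#?}
                ;;
        esac
    done
    printf '%s\n' "$out"
}

expand_postmsg 'NOTE: Service tag: %s (%m)' 'PowerEdge R710' 'JV8KH0J'
```

Scanning left to right, rather than running one global replacement per code, keeps a sequence such as %%s from being mistaken for the service tag code.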

Use -d or --debug to show every item that is checked:

[root@localhost etc]# /usr/lib/nagios/plugins/check_openmanage -H 1.2.3.4 -d

 System:    PowerEdge R710      OMSA version:   7.2.0

 ServiceTag:  XXXXXX          Plugin version:  3.7.9

 BIOS/date:  1.0.4 03/09/2009     Checking mode:  SNMPv2c UDP/IPv4

-----------------------------------------------------------------------------

 Storage Components

=============================================================================

 STATE  |   ID   |  MESSAGE TEXT

---------+----------+--------------------------------------------------------

WARNING |     0 | Controller 0 [PERC 6/i Integrated]: Firmware '6.1.1-0047' is out of date

   OK |     0 | Controller 0 [PERC 6/i Integrated] is Degraded

   OK |  0:0:0:0 | Physical Disk 0:0:0 [SAS-HDD 146GB] on ctrl 0 is Online

   OK |  0:0:0:1 | Physical Disk 0:0:1 [SAS-HDD 146GB] on ctrl 0 is Online

   OK |  0:0:0:2 | Physical Disk 0:0:2 [SAS-HDD 146GB] on ctrl 0 is Online

   OK |  0:0:0:3 | Physical Disk 0:0:3 [SAS-HDD 146GB] on ctrl 0 is Online

   OK |  0:1:0:4 | Physical Disk 1:0:4 [SAS-HDD 146GB] on ctrl 0 is Online

   OK |  0:1:0:5 | Physical Disk 1:0:5 [SAS-HDD 146GB] on ctrl 0 is Ready (Dedicated HS)

   OK |    0:0 | Logical Drive '/dev/sda' [RAID-5, 544.50 GB] is Ready

   OK |    0:0 | Cache Battery 0 in controller 0 is Ready

   OK |    0:0 | Connector 0 [SAS] on controller 0 is Ready

   OK |    0:1 | Connector 1 [SAS] on controller 0 is Ready

   OK |   0:0:0 | Enclosure 0:0:0 [Backplane] on controller 0 is Ready

   OK |   0:1:0 | Enclosure 0:1:0 [Backplane] on controller 0 is Ready

-----------------------------------------------------------------------------

 Chassis Components

=============================================================================

 STATE  |  ID  |  MESSAGE TEXT

---------+------+------------------------------------------------------------

   OK |   0 | Memory module 0 [DIMM_A2, 2048 MB] is Ok

   OK |   1 | Memory module 1 [DIMM_A3, 2048 MB] is Ok

   OK |   2 | Memory module 2 [DIMM_A5, 2048 MB] is Ok

   OK |   3 | Memory module 3 [DIMM_A6, 2048 MB] is Ok

   OK |   4 | Memory module 4 [DIMM_B2, 2048 MB] is Ok

   OK |   5 | Memory module 5 [DIMM_B3, 2048 MB] is Ok

   OK |   6 | Memory module 6 [DIMM_B5, 2048 MB] is Ok

   OK |   7 | Memory module 7 [DIMM_B6, 2048 MB] is Ok

   OK |   0 | Chassis fan 0 [System Board FAN 1 RPM] reading: 3960 RPM

   OK |   1 | Chassis fan 1 [System Board FAN 2 RPM] reading: 3960 RPM

   OK |   2 | Chassis fan 2 [System Board FAN 3 RPM] reading: 3960 RPM

   OK |   3 | Chassis fan 3 [System Board FAN 4 RPM] reading: 3960 RPM

   OK |   4 | Chassis fan 4 [System Board FAN 5 RPM] reading: 3840 RPM

   OK |   0 | Power Supply 0 [AC]: Presence detected

   OK |   1 | Power Supply 1 [AC]: Presence detected

   OK |   0 | Temperature Probe 0 [System Board Ambient Temp] reads 27 C (min=8/3, max=42/47)

   OK |   0 | Processor 0 [Intel Xeon E5506 2.13GHz] is Present

   OK |   1 | Processor 1 [Intel Xeon E5506 2.13GHz] is Present

   OK |   0 | Voltage sensor 0 [CPU1 VCORE] is Good

   OK |   1 | Voltage sensor 1 [CPU2 VCORE] is Good

   OK |   2 | Voltage sensor 2 [CPU2 0.75 VTT CPU2 PG] is Good

   OK |   3 | Voltage sensor 3 [CPU1 0.75 VTT CPU1 PG] is Good

   OK |   4 | Voltage sensor 4 [System Board 1.5V PG] is Good

   OK |   5 | Voltage sensor 5 [System Board 1.8V PG] is Good

   OK |   6 | Voltage sensor 6 [System Board 3.3V PG] is Good

   OK |   7 | Voltage sensor 7 [System Board 5V PG] is Good

   OK |   8 | Voltage sensor 8 [CPU2 MEM PG] is Good

   OK |   9 | Voltage sensor 9 [CPU1 MEM PG] is Good

   OK |  10 | Voltage sensor 10 [CPU2 VTT ] is Good

   OK |  11 | Voltage sensor 11 [CPU1 VTT ] is Good

   OK |  12 | Voltage sensor 12 [System Board 0.9V PG] is Good

   OK |  13 | Voltage sensor 13 [CPU2 1.8 PLL  PG] is Good

   OK |  14 | Voltage sensor 14 [CPU1 1.8 PLL PG] is Good

   OK |  15 | Voltage sensor 15 [System Board 8.0 V PG] is Good

   OK |  16 | Voltage sensor 16 [System Board 1.1 V PG] is Good

   OK |  17 | Voltage sensor 17 [System Board 1.0 LOM PG] is Good

   OK |  18 | Voltage sensor 18 [System Board 1.0 AUX PG] is Good

   OK |  19 | Voltage sensor 19 [System Board 1.05 V PG] is Good

   OK |   0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected

   OK |   0 | Chassis intrusion 0 detection: Ok (Not Breached)

-----------------------------------------------------------------------------

 Other messages

=============================================================================

 STATE  |  MESSAGE TEXT

---------+-------------------------------------------------------------------

   OK | ESM log health is Ok (less than 80% full)

   OK | Chassis Service Tag is sane

# Use a configuration file to customize which items are checked, via the -f option

check_openmanage -f /etc/check_openmanage.conf

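Installing and using check_openmanage

Client side

1. Download the OpenManage software:

cd /opt/

wget http://support.dell.com (placeholder for the actual download address; on mon02-001 a copy, OM_6.1.0_ManNode_A00.tar, is under /opt/DELL/dell, and that is the file to download)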

tar zxvf omsa...*.tgz

sh ./setup.sh

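The installer asks three questions:

 enter y to accept the license agreement,

 enter 6 to select all components,

 enter i to install the selection.

The installer then asks for the installation path. The default (/opt/dell/srvadmin/) works, although defining your own location, such as /usr/local/openmanage, is recommended.

Errors I ran into during installation (for reference only):

  1. libstdc++.so.5 not found

Install the matching version of the compat-libstdc package.

  2. libcurl.so.3 not found

Install curl.

In this setup the client and the server are on the same machine.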

wget http://folk.uio.no/trondham/software/files/check_openmanage-3.6.5.tar.gz (a copy of this package is also on mon02-001 under /opt/DELL)

tar zxvf check_openmanage-3.6.5.tar.gz

Copy the check_openmanage Perl script from the extracted tarball into /usr/local/nagios/libexec.

Client side: define the command in nrpe.cfg

vi /usr/local/nagios/etc/nrpe.cfg

Add this line:

command[check_dell_hardware]=/usr/local/nagios/libexec/check_openmanage -e --only critical

Save the file.

Run /usr/local/nagios/libexec/check_openmanage -e --only critical and check what it returns. If everything comes back OK, the client side is done.
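Besides the text it prints, a Nagios plugin reports its result through its exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A small helper such as the following, whose name nagios_state_name is hypothetical, turns that status into a readable word when testing by hand.

```shell
# Hypothetical helper: translate a Nagios plugin exit status into its
# state name.
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Typical manual test on the client (path as used in this article):
#   /usr/local/nagios/libexec/check_openmanage -e --only critical
#   nagios_state_name $?
```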

Next, configure the server side.

Define the service on the server:

define service {

 use             saa-service

 host_name          localhost

 service_description     check_hardware

 check_command        check_nrpe!check_dell_hardware

}

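Replace localhost with the host name of the machine being monitored.

To verify that monitoring works, run on the server:

/usr/local/nagios/libexec/check_nrpe -H hostIP -c check_dell_hardware

If this fails, check that NRPE itself is working.

Installing on the server for SNMP checks

Server side

Installation: 1. Install the Perl SNMP packages: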

 perl-Crypt-DES-2.05-3.2.el5.rf.i386.rpm

 perl-Digest-HMAC-1.01-2.2.el5.rf.noarch.rpm

 perl-Digest-SHA1-2.12-2.el5.rf.i386.rpm

 perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm

 perl-Socket6-0.23-1.el5.rf.i386.rpm

 Install the other packages first, in the order listed, and install perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm last.
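After the packages are installed, it is worth confirming that Perl can actually load the Net::SNMP module, which check_openmanage needs for SNMP mode. A minimal sketch, in which the function name have_perl_module is hypothetical:

```shell
# Hypothetical helper: succeed if the named Perl module can be loaded.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module Net::SNMP; then
    echo "Net::SNMP is available"
else
    echo "Net::SNMP is missing; install the perl-Net-SNMP package"
fi
```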

Download the check_openmanage plugin (http://folk.uio.no/trondham/software/check_openmanage.html#download), choosing the package that matches your system.

wget http://folk.uio.no/trondham/software/files/check_openmanage-3.6.5.tar.gz

wget http://folk.uio.no/trondham/software/files/nagios-plugins-check-openmanage-3.6.5-1.el5.x86_64.rpm

That covers the basic installation; parts of it were copied directly from other write-ups. There is nothing tricky about it, so a read-through should be enough.

What follows is usage: a selection of the options.

check_openmanage -s prefixes each alert with its full state.

check_openmanage -S prefixes each alert with the abbreviated state (for example, CRITICAL is shortened to C).

check_openmanage -i prefixes each alert with the service tag.

Example: [JV8KH0J] Controller 0 [PERC 6/i Integrated]: Driver '00.00.03.15-RH1' is out of date

check_openmanage -e appends system information to the alerts: a separator line followed by the machine model and service tag.

Example:

Power Supply 0 [AC]: Presence Detected, Failure Detected, AC Lost

Controller 0 [PERC 6/i Integrated]: Driver '00.00.03.15-RH1' is out of date

 ------ SYSTEM: PowerEdge 1950, SN: JV8KH0J

check_openmanage --postmsg 'NOTE: Service tag: %s - Dell support: 800-8888-8888'

The --postmsg option appends a custom message to the output:

Power Supply 0 [AC]: Presence Detected, Failure Detected, AC Lost

Controller 0 [PERC 6/i Integrated]: Driver '00.00.03.15-RH1' is out of date

NOTE: Service tag: JV8KH0J - Dell support: 800-8888-8888

Here %s is one of the plugin's built-in format codes. The full list:

%m   System model
%s   Service tag
%b   BIOS version
%d   BIOS release date
%o   Operating system name
%r   Operating system release
%p   Number of physical drives
%l   Number of logical drives
%n   Line break
%%   A literal %

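These output options can be combined, for example: check_openmanage -i -s

check_openmanage -o: by default the OK output is a single line. The -o option controls this; check_openmanage -o 3, for example, prints additional lines with lower-level hardware details.

check_openmanage -H localhost -b ctrl_driver=all -b pdisk=1:0:0:1 -B

check_openmanage supports a blacklist, that is, a list of items that do not matter and should not be monitored. Use -b to add items to it. Once the blacklist grows it becomes hard to tell what has been excluded; appending -B (--show-blacklist) lists the blacklisted items in the output.

check_openmanage -d prints debug information for the run. (It is meant for manual troubleshooting; do not use it in Nagios service checks.)

Custom temperature thresholds

omreport is the reporting tool installed with OpenManage.

omreport chassis temps shows the machine's temperatures.

check_openmanage -H myhost --only temp -d shows the temperature probes and their current limits in debug mode; the alert thresholds can then be customized.

check_openmanage -w 0=30 -c 0=40 changes the limits for probe 0: warning above 30, critical above 40.

check_openmanage -w 0=30/15 -c 0=40/10 sets both an upper and a lower limit, in the form probe_id=max/min: probe 0 raises a warning above 30 or below 15, and a critical alert above 40 or below 10. The numbers are temperatures, not time intervals; adjust them as needed.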
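How a reading is judged against such limits can be sketched as follows, assuming the probe_id=max/min format described in the plugin's manual. The function name classify_temp is hypothetical, and strict comparisons are an assumption; whether the plugin treats a reading exactly equal to a limit as an alert is not stated here.

```shell
# Illustrative sketch: classify an integer temperature reading against
# warning and critical limits given as max/min pairs.
# Usage: classify_temp READING WARN_MAX WARN_MIN CRIT_MAX CRIT_MIN
classify_temp() {
    reading=$1
    warn_max=$2
    warn_min=$3
    crit_max=$4
    crit_min=$5
    if [ "$reading" -gt "$crit_max" ] || [ "$reading" -lt "$crit_min" ]; then
        echo CRITICAL
    elif [ "$reading" -gt "$warn_max" ] || [ "$reading" -lt "$warn_min" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Limits corresponding to -w 0=30/15 -c 0=40/10 and a reading of 27 C:
classify_temp 27 30 15 40 10
```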

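Adding to the blacklist

When some unimportant alerts are not wanted, use -b to suppress them.

For example:

check_openmanage -s -b ctrl_driver=0,1 skips the driver warning for controllers 0 and 1. If no controller's driver needs to be monitored, use ctrl_driver=all.

The component keywords (abbreviations) are those in the blacklist table earlier in this article.

==- Use --check to switch individual checks: 0 disables, 1 enables

check_openmanage --check storage=0,esmlog=1 disables the storage check and enables the ESM log content check.

The settings can also be kept in a file that is then passed to --check, which saves retyping them each time:

vi /tmp/check_openmanage.check

storage=0,esmlog=1

check_openmanage --check /tmp/check_openmanage.check

==- Use --only to monitor a single item

check_openmanage --only storage checks storage only and nothing else.

The --only keywords are those in the keyword table earlier in this article.

== To check everything, use check_openmanage -a.

Finally, the plugin can be combined with PNP4Nagios to graph the data.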

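Note:

I ran into a fairly serious problem during installation:

Do not repeatedly uninstall and reinstall OpenManage on a server. Each installation leaves behind additional processes, in groups of three, for example: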

root   30672  0.0  0.0  21688  1056 ?     S   Jun08  0:00  \_ hald-runner

68    30680  0.0  0.0  12320  848 ?     S   Jun08  0:00    \_ hald-addon-acpi: listening on acpid socket /var/run/acpid.socket

68    30693  0.0  0.0  12320  844 ?     S   Jun08  0:00    \_ hald-addon-keyboard: listening on /dev/input/event0

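Uninstalling the software does not kill these processes; they have to be killed by hand. As they accumulate they consume a lot of CPU, and the load fluctuates every ten minutes. On our company's application servers this caused the load to swing from 1 to 20 every ten minutes and then drop straight back. So take care not to install and uninstall repeatedly in a production environment.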
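A quick way to see whether such leftovers have piled up is to count the hald-addon processes. This is a sketch; the function name count_hald_addons is hypothetical and it reads a process listing on standard input.

```shell
# Hypothetical helper: count lines mentioning hald-addon in a process
# listing read from standard input. The bracket in the pattern keeps
# the grep process itself from being counted.
count_hald_addons() {
    grep -c '[h]ald-addon'
}

# Typical use on a live system:
#   ps -eo args | count_hald_addons
```

A count that grows after every reinstall is the symptom described above.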

Otherwise there were no problems; the software works well and can be combined with Nagios and Zabbix for hardware monitoring.

Dell OpenManage and Nagios on Ubuntu 10.04


Written by Michael

Thursday, 28 July 2011 19:51

The Dell OpenManage tools are quite good for monitoring Dell servers. Although OpenManage slows down boot time (which shouldn't happen often with servers anyway), it provides some great ways to monitor your server.

Install Dell OMSA

Add the repository to a new file /etc/apt/sources.list.d/linux.dell.com.sources.list with the following content:

deb http://linux.dell.com/repo/community/deb/latest /

apt-get update

gpg --keyserver pgpkeys.mit.edu --recv-key E74433E25E3D7775

gpg -a --export E74433E25E3D7775 | apt-key add -

apt-get install srvadmin-all

Install the nagios check_openmanage plugin

Download the latest check_openmanage package from http://folk.uio.no/trondham/software/check_openmanage.html#download

wget http://folk.uio.no/trondham/software/files/check-openmanage_3.6.8-1_all.deb

Install the check_openmanage package:

dpkg -i check-openmanage_3.6.8-1_all.deb

Check the output of:

/usr/lib/nagios/plugins/check_openmanage

If you get the following output:

Storage Error! No controllers found

Problem running 'omreport chassis memory': Error: Memory object not found

Problem running 'omreport chassis fans': Error! No fan probes found on this system.

Problem running 'omreport chassis temps': Error! No temperature probes found on this system.

Problem running 'omreport chassis volts': Error! No voltage probes found on this system.

Do:

/etc/init.d/dataeng restart
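The decision to restart dataeng can be scripted by looking for the error strings shown above in the plugin output. This is a sketch; the function name needs_dataeng_restart is hypothetical and the patterns are taken from the sample output in this article.

```shell
# Hypothetical helper: succeed if the given plugin output contains the
# errors that, in this article, were fixed by restarting dataeng.
needs_dataeng_restart() {
    printf '%s\n' "$1" | grep -q -e 'No controllers found' -e "Problem running 'omreport"
}

# Typical use on the monitored host (path as used in this article):
#   output=$(/usr/lib/nagios/plugins/check_openmanage)
#   if needs_dataeng_restart "$output"; then /etc/init.d/dataeng restart; fi
```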

Rerun it, and blacklist warnings such as 'Not certified drives' and out-of-date controller firmware (or resolve them by swapping to certified disks and upgrading the RAID controller firmware):

/usr/lib/nagios/plugins/check_openmanage -b ctrl_fw=0/pdisk=0:0:0:0,0:0:0:1

If you run it, it should show something like:

/usr/lib/nagios/plugins/check_openmanage -b ctrl_fw=0/pdisk=0:0:0:0,0:0:0:1

OK - System: 'PowerEdge R310', SN: 'somenumber', 4 GB ram (2 dimms), 1 logical drives, 2 physical drives

Only uncertified hard drives should be blacklisted; certified disks do not have to be.

Make sure that dataeng starts at boot

update-rc.d dataeng defaults

Edit:

/etc/nagios/nrpe_local.cfg

And add the command without warnings to it, like:

command[check_openmanage]=/usr/lib/nagios/plugins/check_openmanage -b ctrl_fw=0/pdisk=0:0:0:0,0:0:0:1

Restart the service:

/etc/init.d/nagios-nrpe-server restart

Add the host to the nagios configuration on your Nagios server.

Optionally, you can start the openmanage built in webserver with

omconfig system webserver action=start

The web server listens on port 1311 over HTTPS by default. You can log in with the root account or another local account on the Linux system.
