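The company has several branch offices whose server rooms are below grade and have no dedicated on-duty staff. To monitor the server-room environment in real time under these conditions, we use IPMI. Since a Zabbix monitoring system was already deployed, this article uses Zabbix's built-in IPMI support to monitor server temperature, fan speed, and similar readings.
1. Environment
Monitored server model: Dell PowerEdge R510
Planned IPMI address: 10.103.1.100
2. Zabbix monitoring platform
Zabbix version: 3.2.1; it was built without --with-openipmi
The Zabbix server's network interface can reach 10.103.1.100
3. Prerequisite reading
See http://www.linuxidc.com/Linux/2017-05/143523.htm
4. Configure IPMI
4.1. Configure the IPMI address
Follow "Managing Dell PowerEdge Servers Using IPMItool" from the prerequisite reading to configure the IPMI address at server boot and to enable IPMI Over LAN.
Alternatively, enable IPMI through Dell's iDRAC; see the references at the end of the article.
4.2. Read the sensor information
Log in to the Zabbix server and use ipmitool to read the Dell server's sensor information remotely: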
# ipmitool -I lan -H 10.103.1.100 -U root -P calvin -L user sensor list
# ipmitool -I lan -H 10.103.1.100 -U root -P calvin -L user sensor get "FAN MOD 1B RPM"
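The commands above can be wrapped so that a single reading is pulled out of the sensor list. The following is only a minimal sketch: it assumes the usual pipe-separated layout of `ipmitool sensor list` (name | value | unit | status | thresholds), and the function names `sensor_reading` and `remote_sensor_reading` are hypothetical helpers, not part of ipmitool. The wrapper uses ipmitool's -E option, which reads the password from the IPMI_PASSWORD environment variable, so the password does not appear on the command line.

```shell
#!/bin/bash
# Sketch: extract one sensor's reading from `ipmitool ... sensor list` output.
# Assumed layout per line: name | value | unit | status | thresholds...

# Read sensor-list text on stdin and print the value column of the first
# row whose name column equals $1 exactly (padding is trimmed first).
sensor_reading() {
    awk -F'|' -v want="$1" '
        {
            name = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == want) { print value; exit }
        }'
}

# Hypothetical wrapper: query the BMC over LAN. The password is taken from
# the IPMI_PASSWORD environment variable (ipmitool -E) instead of -P.
remote_sensor_reading() {
    local host=$1 user=$2 name=$3
    ipmitool -I lan -H "$host" -U "$user" -E -L user sensor list |
        sensor_reading "$name"
}
```

With IPMI_PASSWORD exported, a call such as `remote_sensor_reading 10.103.1.100 root "FAN MOD 1B RPM"` would print just the value column for that fan. Matching on the exact name avoids picking up "FAN MOD 1A RPM" by accident.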
4.3. Install the IPMI packages
# yum -y install OpenIPMI OpenIPMI-devel ipmitool freeipmi
4.4. Configure Zabbix
Note: IPMI support requires building the Zabbix server/proxy with the --with-openipmi option.
Configure the IPMI pollers on the server side, in zabbix_server.conf or zabbix_proxy.conf:
sed -i '/# StartIPMIPollers=0/aStartIPMIPollers=5' zabbix_server.conf
service zabbix-server restart
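The sed command above appends a new line every time it is run, so repeating it leaves duplicate StartIPMIPollers entries. As a sketch of one idempotent alternative (the function name `set_ipmi_pollers` is hypothetical, and the config file is assumed to end with a newline), the setting can be rewritten if it already exists and appended only otherwise:

```shell
#!/bin/bash
# Sketch: set StartIPMIPollers idempotently in a Zabbix config file.
# Usage: set_ipmi_pollers /path/to/zabbix_server.conf 5
set_ipmi_pollers() {
    local conf=$1 count=$2 tmp
    tmp=$(mktemp) || return 1
    if grep -q '^StartIPMIPollers=' "$conf"; then
        # An active setting already exists: rewrite its value.
        sed "s/^StartIPMIPollers=.*/StartIPMIPollers=$count/" "$conf" > "$tmp"
    else
        # No active setting yet: copy the file and append one.
        cat "$conf" > "$tmp"
        printf 'StartIPMIPollers=%s\n' "$count" >> "$tmp"
    fi
    # Overwrite the contents instead of moving the temp file, so the
    # original file keeps its owner and mode.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

The commented default line (# StartIPMIPollers=0) is left untouched, and the server still has to be restarted afterwards for the change to take effect.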
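4.5. Import the monitoring templates
IPMI templates for two Dell models are provided:
template-ipmi-dell-poweredge-r510
template-ipmi-dell-poweredge-2950
Add the host to be monitored and link it to the matching template. On the host's IPMI tab, set Authentication algorithm to Default, Privilege level to User, Username to sensor, and Password to sensor_pass, then save.
In practice this method performed very poorly: almost no data was collected.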
5. Custom IPMI monitoring with Zabbix External checks
The original plan was to use the Nagios IPMI plugin check_ipmi_sensor, from the file check_ipmi_sensor_v3-v3.9.tar.gz.
For detailed usage, see http://www.thomas-krenn.com/en/wiki/IPMI_Sensor_Monitoring_Plugin
5.1. Install the perl-IPC-Run module
yum -y install perl-IPC-Run perl-Getopt-Long
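The plugin and the FreeIPMI commands in the following sections read their credentials from a file passed as ipmi.cfg, whose contents the article does not show. The sketch below writes such a file under the assumption that it follows FreeIPMI's freeipmi.conf syntax (one "key value" pair per line, with the keys username, password, and privilege-level). The function name `write_ipmi_cfg` is hypothetical, and the credentials are whatever was configured on the BMC.

```shell
#!/bin/bash
# Sketch: create a FreeIPMI-style credentials file readable by its owner only.
# Usage: write_ipmi_cfg /path/to/ipmi.cfg USERNAME PASSWORD
write_ipmi_cfg() {
    local path=$1 user=$2 pass=$3
    (
        umask 077   # a newly created file gets mode 0600
        printf 'username %s\npassword %s\nprivilege-level user\n' \
            "$user" "$pass" > "$path"
    )
    # Tighten the mode as well in case the file already existed.
    chmod 600 "$path"
}
```

Because Zabbix runs external checks as its own service account, the file would also need to be owned by that account (commonly zabbix) for the script in section 5.3 to read it.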
5.2. Try check_ipmi_sensor
Running the plugin, however, produced errors:
# ./check_ipmi_sensor -f ipmi.cfg -H 10.103.1.100 -vvv
------------- debug output for sel (-vvv is set): ------------ | |
/usr/sbin/ipmi-sel was executed with the following parameters: | |
/usr/sbin/ipmi-sel -h 10.103.1.100 --config-file ipmi.cfg --driver-type=LAN_2_0 --output-event-state --interpret-oem-data --entity-sensor-names | |
output of FreeIPMI: | |
ID | Date | Time | Name | Type | State | Event | |
1 | Apr-08-2011 | 06:42:13 | System Board SEL | Event Logging Disabled | Nominal | Log Area Reset/Cleared | |
2 | Jan-01-1970 | 08:00:31 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
3 | Jan-01-1970 | 08:00:36 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
4 | Aug-15-2011 | 23:09:53 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
5 | Aug-16-2011 | 11:38:25 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
6 | Aug-16-2011 | 11:38:25 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
7 | Aug-16-2011 | 11:38:55 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
8 | Jun-10-2012 | 22:41:13 | System Board Ambient Temp | Temperature | Warning | Upper Non-critical - going high ; Sensor Reading = 45.00 C ; Threshold = 45.00 C | |
9 | Jun-11-2012 | 02:53:53 | System Board Ambient Temp | Temperature | Nominal | Upper Non-critical - going high ; Sensor Reading = 43.00 C ; Threshold = 45.00 C | |
10 | Nov-05-2012 | 21:56:42 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
11 | Nov-14-2012 | 21:53:58 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
12 | Nov-14-2012 | 21:53:58 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
13 | Nov-14-2012 | 21:54:19 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
14 | Nov-15-2012 | 16:12:03 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
15 | Nov-17-2012 | 17:14:34 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
16 | Nov-17-2012 | 17:14:34 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
17 | Nov-17-2012 | 17:15:40 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
18 | Nov-19-2012 | 20:47:57 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
19 | Nov-19-2012 | 20:50:04 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
20 | Jan-01-1970 | 08:00:33 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
21 | Jan-01-1970 | 08:00:38 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
22 | Jun-27-2014 | 17:27:38 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
23 | Jun-27-2014 | 17:27:53 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
24 | Jan-01-1970 | 08:00:31 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
25 | Jan-01-1970 | 08:00:36 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
26 | Oct-31-2016 | 05:48:35 | System Board Ambient Temp | Temperature | Warning | Lower Non-critical - going low ; Sensor Reading = 8.00 C ; Threshold = 8.00 C | |
27 | Oct-31-2016 | 09:00:38 | System Board Ambient Temp | Temperature | Nominal | Lower Non-critical - going low ; Sensor Reading = 10.00 C ; Threshold = 8.00 C | |
------------- debug output for sensors (-vvv is set): ------------ | |
script was executed with the following parameters: | |
./check_ipmi_sensor -f ipmi.cfg -H 10.103.1.100 -vvv | |
check_ipmi_sensor version: | |
3.9 | |
FreeIPMI version: | |
ipmi-sensors - 1.2.9 | |
FreeIPMI was executed with the following parameters: | |
/usr/sbin/ipmi-sensors -h 10.103.1.100 --config-file ipmi.cfg --quiet-cache --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors --driver-type=LAN_2_0 --output-sensor-thresholds | |
FreeIPMI return code: 0 | |
output of FreeIPMI: | |
Record ID | Sensor Name | Sensor Group | Monitoring Status | Sensor Units | Sensor Reading | |
5 | Ambient Temp | Temperature | Nominal | C | 28.000000 | |
7 | CMOS Battery | Battery | Nominal | N/A | 'OK' | |
8 | VCORE PG | Voltage | Nominal | N/A | 'State Deasserted' | |
9 | VCORE PG | Voltage | Nominal | N/A | 'State Deasserted' | |
10 | 0.75 VTT PG | Voltage | Nominal | N/A | 'State Deasserted' | |
11 | 0.75 VTT PG | Voltage | Nominal | N/A | 'State Deasserted' | |
12 | CPU VTT PG | Voltage | Nominal | N/A | 'State Deasserted' | |
13 | 1.5V PG | Voltage | Nominal | N/A | 'State Deasserted' | |
14 | 1.8V PG | Voltage | Nominal | N/A | 'State Deasserted' | |
15 | 5V PG | Voltage | Nominal | N/A | 'State Deasserted' | |
16 | MEM CPU2 FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
17 | 5V Riser PG | Voltage | Nominal | N/A | 'State Deasserted' | |
18 | MEM CPU1 FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
19 | VTT CPU2 FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
20 | VTT CPU1 FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
21 | 0.9V PG | Voltage | Nominal | N/A | 'State Deasserted' | |
22 | CPU2 1.8 PLL PG | Voltage | Nominal | N/A | 'State Deasserted' | |
23 | CPU1 1.8 PLL PG | Voltage | Nominal | N/A | 'State Deasserted' | |
24 | 1.1 FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
25 | 1.0 LOM FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
26 | 1.0 AUX FAIL | Voltage | Nominal | N/A | 'State Deasserted' | |
27 | Heatsink Pres | Entity Presence | Nominal | N/A | 'Entity Present' | |
28 | iDRAC6 Ent Pres | Entity Presence | Critical | N/A | 'Entity Absent' | |
29 | USB Cable Pres | Entity Presence | Nominal | N/A | 'Entity Present' | |
31 | Riser Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
32 | FAN MOD 1A RPM | Fan | Nominal | RPM | 3480.000000 | |
34 | FAN MOD 2A RPM | Fan | Nominal | RPM | 3480.000000 | |
36 | FAN MOD 3A RPM | Fan | Nominal | RPM | 3480.000000 | |
39 | FAN MOD 4A RPM | Fan | Nominal | RPM | 3480.000000 | |
40 | Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
41 | Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
42 | Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
43 | Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
44 | Presence | Entity Presence | Nominal | N/A | 'Entity Present' | |
45 | Status | Processor | Nominal | N/A | 'Processor Presence detected' | |
46 | Status | Processor | Nominal | N/A | 'Processor Presence detected' | |
47 | Status | Power Supply | Nominal | N/A | 'Presence detected' | |
48 | Current | Current | Nominal | A | 0.400000 | |
49 | Current | Current | Nominal | A | 0.400000 | |
50 | Voltage | Voltage | Nominal | V | 218.000000 | |
51 | Voltage | Voltage | Nominal | V | 218.000000 | |
52 | Status | Power Supply | Nominal | N/A | 'Presence detected' | |
53 | Status | Cable/Interconnect | Nominal | N/A | 'Cable/Interconnect is connected' | |
54 | OS Watchdog | Watchdog 2 | Nominal | N/A | 'OK' | |
56 | Intrusion | Physical Security | Nominal | N/A | 'OK' | |
57 | PS Redundancy | Power Supply | Nominal | N/A | 'Fully Redundant' | |
58 | Fan Redundancy | Fan | Nominal | N/A | 'Fully Redundant' | |
60 | System Level | Current | Nominal | W | 168.000000 | |
61 | Power Optimized | OEM Reserved | Nominal | N/A | 'Good' | |
62 | Drive | Drive Slot | Nominal | N/A | 'Drive Presence' | |
65 | Cable SAS A | Cable/Interconnect | Nominal | N/A | 'Cable/Interconnect is connected' | |
66 | Cable SAS B | Cable/Interconnect | Nominal | N/A | 'Cable/Interconnect is connected' | |
67 | DKM Status | OEM Reserved | N/A | N/A | 'OEM Event = 0000h' | |
119 | FAN MOD 5A RPM | Fan | Nominal | RPM | 3480.000000 | |
--------------------- end of debug output --------------------- | |
IPMI Status: Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 737. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 738. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 749. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in concatenation (.) or string at ./check_ipmi_sensor line 750. | |
Use of uninitialized value in string ne at ./check_ipmi_sensor line 759. | |
Critical [iDRAC6 Ent Pres = Critical ('Entity Absent'), System Board Intrusion = Critical (Physical Security), System Board Intrusion = Critical (Physical Security), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), System Board Ambient Temp = Warning (Temperature), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), Disk Drive Bay 1 Drive 2 = Critical (Drive Slot), System Board Intrusion = Critical (Physical Security), System Board Intrusion = Critical (Physical Security), System Board Intrusion = Critical (Physical Security), System Board Intrusion = Critical (Physical Security), System Board Ambient Temp = Warning (Temperature)] | 'Ambient Temp'=28.000000;:;: 'FAN MOD 1A RPM'=3480.000000;:;: 'FAN MOD 2A RPM'=3480.000000;:;: 'FAN MOD 3A RPM'=3480.000000;:;: 'FAN MOD 4A RPM'=3480.000000;:;: 'Current'=0.400000;:;: 'Current'=0.400000;:;: 'Voltage'=218.000000;:;: 'Voltage'=218.000000;:;: 'System Level'=168.000000;:;: 'FAN MOD 5A RPM'=3480.000000;:;: | |
Ambient Temp = 28.000000 (Status: Nominal) | |
CMOS Battery = 'OK' (Status: Nominal) | |
VCORE PG = 'State Deasserted' (Status: Nominal) | |
VCORE PG = 'State Deasserted' (Status: Nominal) | |
0.75 VTT PG = 'State Deasserted' (Status: Nominal) | |
0.75 VTT PG = 'State Deasserted' (Status: Nominal) | |
CPU VTT PG = 'State Deasserted' (Status: Nominal) | |
1.5V PG = 'State Deasserted' (Status: Nominal) | |
1.8V PG = 'State Deasserted' (Status: Nominal) | |
5V PG = 'State Deasserted' (Status: Nominal) | |
MEM CPU2 FAIL = 'State Deasserted' (Status: Nominal) | |
5V Riser PG = 'State Deasserted' (Status: Nominal) | |
MEM CPU1 FAIL = 'State Deasserted' (Status: Nominal) | |
VTT CPU2 FAIL = 'State Deasserted' (Status: Nominal) | |
VTT CPU1 FAIL = 'State Deasserted' (Status: Nominal) | |
0.9V PG = 'State Deasserted' (Status: Nominal) | |
CPU2 1.8 PLL PG = 'State Deasserted' (Status: Nominal) | |
CPU1 1.8 PLL PG = 'State Deasserted' (Status: Nominal) | |
1.1 FAIL = 'State Deasserted' (Status: Nominal) | |
1.0 LOM FAIL = 'State Deasserted' (Status: Nominal) | |
1.0 AUX FAIL = 'State Deasserted' (Status: Nominal) | |
Heatsink Pres = 'Entity Present' (Status: Nominal) | |
iDRAC6 Ent Pres = 'Entity Absent' (Status: Critical) | |
USB Cable Pres = 'Entity Present' (Status: Nominal) | |
Riser Presence = 'Entity Present' (Status: Nominal) | |
FAN MOD 1A RPM = 3480.000000 (Status: Nominal) | |
FAN MOD 2A RPM = 3480.000000 (Status: Nominal) | |
FAN MOD 3A RPM = 3480.000000 (Status: Nominal) | |
FAN MOD 4A RPM = 3480.000000 (Status: Nominal) | |
Presence = 'Entity Present' (Status: Nominal) | |
Presence = 'Entity Present' (Status: Nominal) | |
Presence = 'Entity Present' (Status: Nominal) | |
Presence = 'Entity Present' (Status: Nominal) | |
Presence = 'Entity Present' (Status: Nominal) | |
Status = 'Processor Presence detected' (Status: Nominal) | |
Status = 'Processor Presence detected' (Status: Nominal) | |
Status = 'Presence detected' (Status: Nominal) | |
Current = 0.400000 (Status: Nominal) | |
Current = 0.400000 (Status: Nominal) | |
Voltage = 218.000000 (Status: Nominal) | |
Voltage = 218.000000 (Status: Nominal) | |
Status = 'Presence detected' (Status: Nominal) | |
Status = 'Cable/Interconnect is connected' (Status: Nominal) | |
OS Watchdog = 'OK' (Status: Nominal) | |
Intrusion = 'OK' (Status: Nominal) | |
PS Redundancy = 'Fully Redundant' (Status: Nominal) | |
Fan Redundancy = 'Fully Redundant' (Status: Nominal) | |
System Level = 168.000000 (Status: Nominal) | |
Power Optimized = 'Good' (Status: Nominal) | |
Drive = 'Drive Presence' (Status: Nominal) | |
Cable SAS A = 'Cable/Interconnect is connected' (Status: Nominal) | |
Cable SAS B = 'Cable/Interconnect is connected' (Status: Nominal) | |
FAN MOD 5A RPM = 3480.000000 (Status: Nominal)
Going by the debug output, though, the plugin itself simply calls the following command, so it can be run directly:
/usr/sbin/ipmi-sel -h 10.103.1.100 --config-file ipmi.cfg --driver-type=LAN_2_0 --output-event-state --interpret-oem-data --entity-sensor-names
The result:
# /usr/sbin/ipmi-sel -h 10.103.1.100 --config-file ipmi.cfg --driver-type=LAN_2_0 --output-event-state --interpret-oem-data --entity-sensor-names
ID | Date | Time | Name | Type | State | Event | |
1 | Apr-08-2011 | 06:42:13 | System Board SEL | Event Logging Disabled | Nominal | Log Area Reset/Cleared | |
2 | Jan-01-1970 | 08:00:31 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
3 | Jan-01-1970 | 08:00:36 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
4 | Aug-15-2011 | 23:09:53 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
5 | Aug-16-2011 | 11:38:25 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
6 | Aug-16-2011 | 11:38:25 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
7 | Aug-16-2011 | 11:38:55 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
8 | Jun-10-2012 | 22:41:13 | System Board Ambient Temp | Temperature | Warning | Upper Non-critical - going high ; Sensor Reading = 45.00 C ; Threshold = 45.00 C | |
9 | Jun-11-2012 | 02:53:53 | System Board Ambient Temp | Temperature | Nominal | Upper Non-critical - going high ; Sensor Reading = 43.00 C ; Threshold = 45.00 C | |
10 | Nov-05-2012 | 21:56:42 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
11 | Nov-14-2012 | 21:53:58 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
12 | Nov-14-2012 | 21:53:58 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
13 | Nov-14-2012 | 21:54:19 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
14 | Nov-15-2012 | 16:12:03 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
15 | Nov-17-2012 | 17:14:34 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
16 | Nov-17-2012 | 17:14:34 | Disk Drive Bay 1 Drive 2 | Drive Slot | Critical | Drive Fault ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
17 | Nov-17-2012 | 17:15:40 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
18 | Nov-19-2012 | 20:47:57 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
19 | Nov-19-2012 | 20:50:04 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
20 | Jan-01-1970 | 08:00:33 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
21 | Jan-01-1970 | 08:00:38 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
22 | Jun-27-2014 | 17:27:38 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
23 | Jun-27-2014 | 17:27:53 | Disk Drive Bay 1 Drive 2 | Drive Slot | Nominal | Drive Presence ; OEM Event Data2 code = 01h ; OEM Event Data3 code = 02h | |
24 | Jan-01-1970 | 08:00:31 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
25 | Jan-01-1970 | 08:00:36 | System Board Intrusion | Physical Security | Critical | General Chassis Intrusion ; Intrusion while system Off | |
26 | Oct-31-2016 | 05:48:35 | System Board Ambient Temp | Temperature | Warning | Lower Non-critical - going low ; Sensor Reading = 8.00 C ; Threshold = 8.00 C | |
27 | Oct-31-2016 | 09:00:38 | System Board Ambient Temp | Temperature | Nominal | Lower Non-critical - going low ; Sensor Reading = 10.00 C ; Threshold = 8.00 C |
5.3. Write the Zabbix external check (External checks) script
# pwd
/usr/local/zabbix/share/zabbix/externalscripts
# cat check_ipmi
The script contents:
#!/bin/bash
# Checks IPMI sensor readings for Zabbix
# Created on 2016-11-18
# @author: Chinge_Yang
args="$*"
echo "$(date +%F-%T) $args" >> /tmp/check_ipmi.debug
check_ipmi_dir=/usr/local/zabbix/shell/check_ipmi_sensor
check_ipmi_bin=$check_ipmi_dir/check_ipmi_sensor
ipmi_sensors=/usr/sbin/ipmi-sensors
ipmi_cfg=$check_ipmi_dir/ipmi.cfg
#$check_ipmi_bin -f $ipmi_cfg -v $args
#${ipmi_sel} $args --config-file $ipmi_cfg --driver-type=LAN_2_0 --output-event-state --interpret-oem-data --entity-sensor-names
options="--quiet-cache --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors --driver-type=LAN_2_0 --output-sensor-thresholds"
function usage() {
    echo "Usage: $(basename "$0") options (-h HOST|-n NAME)"
}
function check() {
    # Print the last field of every sensor line that matches $name
    result=$($ipmi_sensors -h "$host" --config-file "$ipmi_cfg" $options | grep "$name" | awk -F"| " '{print $NF}')
    printf "%.4f\n" $result
}
# Both -h HOST and -n NAME are required, so expect at least 4 arguments
if [ $# -lt 4 ]
then
    usage
    exit 55
fi
# Usage: scriptname -options
# Note: the dash (-) is required
# A colon after an option letter means the option requires a value
while getopts ":h:n:" Option; do
    case $Option in
        h)
            host=$OPTARG
            ;;
        n)
            name=$OPTARG
            ;;
        *)
            usage
            ;; # default case
    esac
done
shift $(($OPTIND - 1))
# (Translator's note: shift takes an argument, the number of positions to shift.)
# Decrement the argument pointer so that it points to the next argument.
# $1 now refers to the first non-option argument on the command line,
#+ if there is one.
check
exit 0
Make it executable:
chmod a+x check_ipmi
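The script selects lines with grep "$name", so a name such as "Current" or "Voltage" matches several sensors at once. One way to tighten this, sketched here under the assumption that the output starts with a header row like the one in the debug output of section 5.2 (with "Sensor Name" and "Sensor Reading" columns), is to locate the columns from the header and compare the name exactly. The function name `reading_by_name` is hypothetical.

```shell
#!/bin/bash
# Sketch: read pipe-separated sensor output on stdin (header row first) and
# print the "Sensor Reading" column of the first row whose "Sensor Name"
# equals $1 exactly. Returns non-zero if no such row exists.
reading_by_name() {
    awk -F'|' -v want="$1" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NR == 1 {
            # Find the column positions from the header titles.
            for (i = 1; i <= NF; i++) {
                title = trim($i)
                if (title == "Sensor Name")    name_col = i
                if (title == "Sensor Reading") read_col = i
            }
            next
        }
        name_col && read_col && trim($name_col) == want {
            print trim($read_col)
            found = 1
            exit
        }
        END { exit (found ? 0 : 1) }'
}
```

Sensors that share a name (the two "Current" rows, for example) still cannot be told apart this way; only the first one is returned, so such sensors would need to be selected by record ID instead.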
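5.4. Create a custom template
The details are omitted here; the template is simply a modified version of the ones above. One screenshot shows its contents:
Two screenshots show the result:
In the end, even with the custom script, data collection remained difficult: the ipmi commands run by the script all timed out.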
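Zabbix aborts an external check that runs longer than the server's Timeout setting (at most 30 seconds), so a slow IPMI query is lost on every poll. One possible mitigation, which was not tried in this article, is to decouple the slow query from the check: run the query under a hard time limit, keep the last good output in a cache file, and serve readings from that file. The sketch below assumes GNU coreutils (timeout, stat -c); the function name `cached_run` and the cache path are hypothetical.

```shell
#!/bin/bash
# Sketch: run a slow command at most once per MAX_AGE seconds, with a hard
# time limit, and serve the cached output in between.
# Usage: cached_run CACHE_FILE MAX_AGE_SECONDS LIMIT_SECONDS COMMAND [ARGS...]
cached_run() {
    local cache=$1 max_age=$2 limit=$3
    shift 3
    local now mtime tmp
    now=$(date +%s)
    if [ -s "$cache" ]; then
        mtime=$(stat -c %Y "$cache")   # modification time (GNU stat)
        if [ $((now - mtime)) -lt "$max_age" ]; then
            cat "$cache"               # cache is fresh enough
            return 0
        fi
    fi
    # Write to a temp file first so a failed or timed-out run never
    # replaces the last good cache.
    tmp=$(mktemp "${cache}.XXXXXX") || return 1
    if timeout "$limit" "$@" > "$tmp"; then
        mv "$tmp" "$cache"
        cat "$cache"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For instance, a cron job could call `cached_run /tmp/ipmi-10.103.1.100.cache 60 120 /usr/sbin/ipmi-sensors -h 10.103.1.100 --config-file ipmi.cfg` every minute, and the external check script would then only parse the cache file, which returns immediately.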
Permanent link to this article: http://www.linuxidc.com/Linux/2017-05/143529.htm
