I already monitor my pools from Nagios with commands such as zpool list and zpool status -xv,
but I wondered whether there was anything more I could do, so I am trying out smartmontools.
- smartmontools
http://sourceforge.net/apps/trac/smartmontools/wiki
As a first step, I packaged it so that smartctl is usable, committed the package to jposug,
and installed it on a ThinkPad x230 running Oracle Solaris 11.1.
Let's look at the HDD information.
% sudo /usr/sbin/smartctl -a -d scsi /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
User Capacity: 320,072,933,376 bytes [320 GB]
Logical block size: 512 bytes
Serial number: TA8B113VD1MTXN
Device type: disk
Local Time is: Mon Feb 11 00:28:49 2013 JST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 28 C
Manufactured in week 00 of year 0000
Specified cycle count over device lifetime: 100
Accumulated start-stop cycles: 0
Error Counter logging not supported
No self-tests have been logged
SMART Health Status and Current Drive Temperature are both being retrieved.
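Since the goal is Nagios monitoring, those two fields could be turned into a check result. The following is only a minimal sketch: the function name check_scsi_smart and the temperature thresholds are assumptions made for illustration, and it parses the text of the "-d scsi" output rather than calling smartctl itself. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Patterns for the two lines seen in the "-d scsi" output.
_HEALTH = re.compile(r"^SMART Health Status:[ \t]*(.+)$", re.MULTILINE)
_TEMP = re.compile(r"^Current Drive Temperature:[ \t]*(\d+)[ \t]*C\b", re.MULTILINE)


def check_scsi_smart(output, warn_temp=45, crit_temp=55):
    """Return (exit_code, message) for the text of `smartctl -a -d scsi`.

    warn_temp and crit_temp are hypothetical thresholds in degrees Celsius,
    chosen for illustration only; they are not smartmontools defaults.
    """
    health = _HEALTH.search(output)
    if health is None:
        return UNKNOWN, "SMART UNKNOWN - health status line not found"

    status = health.group(1).strip()
    if status != "OK":
        return CRITICAL, f"SMART CRITICAL - health status: {status}"

    temp = _TEMP.search(output)
    if temp is None:
        return OK, "SMART OK - health OK, temperature not reported"

    celsius = int(temp.group(1))
    detail = f"health OK, temperature {celsius} C"
    if celsius >= crit_temp:
        return CRITICAL, f"SMART CRITICAL - {detail}"
    if celsius >= warn_temp:
        return WARNING, f"SMART WARNING - {detail}"
    return OK, f"SMART OK - {detail}"
```

A wrapper script would print the message and exit with the returned code; running smartctl itself needs the same privileges as the sudo invocation above.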
With "-d sat" in place of "-d scsi", the output looks like this.
% sudo /usr/sbin/smartctl -a -d sat /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HTS545032A7E380
Serial Number: TA8B113VD1MTXN
LU WWN Device Id: 5 000cca 6f5ced637
Firmware Version: GGBZBC80
User Capacity: 320,072,933,376 bytes [320 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Mon Feb 11 00:30:02 2013 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Read SMART Data failed: scsi error aborted command
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.
Read SMART Log Directory failed: scsi error aborted command
Read SMART Error Log failed: scsi error aborted command
Read SMART Self-test Log failed: scsi error aborted command
Selective Self-tests/Logging not supported
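On this machine the health status comes back as OK with "-d scsi" but as UNKNOWN! with "-d sat", so a monitoring script might try several device types and keep the first conclusive answer. The sketch below assumes that approach; parse_health, run_smartctl and probe_health are hypothetical helper names, and smartctl's "-H" (health only) option is used to keep the output short. The PASSED value is what ATA devices normally report when the health read succeeds, which did not happen here.

```python
import re
import subprocess

# Health lines as printed for SCSI devices and for ATA/SAT devices.
_HEALTH_PATTERNS = (
    re.compile(r"^SMART Health Status:[ \t]*(.+)$", re.MULTILINE),
    re.compile(
        r"^SMART overall-health self-assessment test result:[ \t]*(.+)$",
        re.MULTILINE,
    ),
)


def parse_health(output):
    """Return the reported health string, or None if absent or UNKNOWN."""
    for pattern in _HEALTH_PATTERNS:
        match = pattern.search(output)
        if match:
            value = match.group(1).strip()
            if not value.startswith("UNKNOWN"):
                return value
    return None


def run_smartctl(device, device_type):
    """Run `smartctl -H -d <type> <device>` and return its stdout.

    Assumes smartctl lives at /usr/sbin/smartctl as in this post and that
    the caller already has sufficient privileges.
    """
    result = subprocess.run(
        ["/usr/sbin/smartctl", "-H", "-d", device_type, device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for many conditions
    )
    return result.stdout


def probe_health(device, device_types=("sat", "scsi"), runner=run_smartctl):
    """Try each -d type in order; return (type, health) or (None, None)."""
    for device_type in device_types:
        health = parse_health(runner(device, device_type))
        if health is not None:
            return device_type, health
    return None, None
```

Passing the runner as a parameter keeps the parsing testable without a real disk; in a check it would be called as probe_health("/dev/rdsk/c8t0d0s0").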
Self-tests do not work.
% sudo /usr/sbin/smartctl -t short -C -d sat /dev/rdsk/c8t0d0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
Read SMART Data failed: scsi error aborted command
% sudo /usr/sbin/smartctl -t short -C -d scsi /dev/rdsk/c8t0d0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
Short foreground self test failed [unsupported scsi opcode]
I also do not really understand how "-d sat" and "-d scsi" differ.
"-H" succeeds with "-d scsi" but fails with "-d sat".
% sudo /usr/sbin/smartctl -H -d scsi /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
SMART Health Status: OK
tonaka@x230% sudo /usr/sbin/smartctl -H -d sat /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
Read SMART Data failed: scsi error aborted command
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.
"-x" produces more output with "-d sat" than with "-d scsi".
% sudo /usr/sbin/smartctl -x -d scsi /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
User Capacity: 320,072,933,376 bytes [320 GB]
Logical block size: 512 bytes
Serial number: TA8B113VD1MTXN
Device type: disk
Local Time is: Mon Feb 11 00:44:08 2013 JST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 29 C
Manufactured in week 00 of year 0000
Specified cycle count over device lifetime: 100
Accumulated start-stop cycles: 0
Error Counter logging not supported
No self-tests have been logged
Device does not support Background scan results logging
scsiPrintSasPhy Log Sense Failed [unsupported field in scsi command]
% sudo /usr/sbin/smartctl -x -d sat /dev/rdsk/c8t0d0s0
smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HTS545032A7E380
Serial Number: TA8B113VD1MTXN
LU WWN Device Id: 5 000cca 6f5ced637
Firmware Version: GGBZBC80
User Capacity: 320,072,933,376 bytes [320 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Mon Feb 11 00:46:33 2013 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
Read SMART Data failed: scsi error aborted command
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.
Read SMART Log Directory failed: scsi error aborted command
General Purpose Log Directory Version 1
GP Log at address 0x00 has 1 sectors [Log Directory]
GP Log at address 0x03 has 1 sectors [Ext. Comprehensive SMART error log]
GP Log at address 0x07 has 1 sectors [Extended self-test log]
GP Log at address 0x10 has 1 sectors [NCQ Command Error log]
GP Log at address 0x11 has 1 sectors [SATA Phy Event Counters]
GP Log at address 0x80 has 16 sectors [Host vendor specific log]
GP Log at address 0x81 has 16 sectors [Host vendor specific log]
GP Log at address 0x82 has 16 sectors [Host vendor specific log]
GP Log at address 0x83 has 16 sectors [Host vendor specific log]
GP Log at address 0x84 has 16 sectors [Host vendor specific log]
GP Log at address 0x85 has 16 sectors [Host vendor specific log]
GP Log at address 0x86 has 16 sectors [Host vendor specific log]
GP Log at address 0x87 has 16 sectors [Host vendor specific log]
GP Log at address 0x88 has 16 sectors [Host vendor specific log]
GP Log at address 0x89 has 16 sectors [Host vendor specific log]
GP Log at address 0x8a has 16 sectors [Host vendor specific log]
GP Log at address 0x8b has 16 sectors [Host vendor specific log]
GP Log at address 0x8c has 16 sectors [Host vendor specific log]
GP Log at address 0x8d has 16 sectors [Host vendor specific log]
GP Log at address 0x8e has 16 sectors [Host vendor specific log]
GP Log at address 0x8f has 16 sectors [Host vendor specific log]
GP Log at address 0x90 has 16 sectors [Host vendor specific log]
GP Log at address 0x91 has 16 sectors [Host vendor specific log]
GP Log at address 0x92 has 16 sectors [Host vendor specific log]
GP Log at address 0x93 has 16 sectors [Host vendor specific log]
GP Log at address 0x94 has 16 sectors [Host vendor specific log]
GP Log at address 0x95 has 16 sectors [Host vendor specific log]
GP Log at address 0x96 has 16 sectors [Host vendor specific log]
GP Log at address 0x97 has 16 sectors [Host vendor specific log]
GP Log at address 0x98 has 16 sectors [Host vendor specific log]
GP Log at address 0x99 has 16 sectors [Host vendor specific log]
GP Log at address 0x9a has 16 sectors [Host vendor specific log]
GP Log at address 0x9b has 16 sectors [Host vendor specific log]
GP Log at address 0x9c has 16 sectors [Host vendor specific log]
GP Log at address 0x9d has 16 sectors [Host vendor specific log]
GP Log at address 0x9e has 16 sectors [Host vendor specific log]
GP Log at address 0x9f has 16 sectors [Host vendor specific log]
GP Log at address 0xe0 has 1 sectors [SCT Command/Status]
GP Log at address 0xe1 has 1 sectors [SCT Data Transfer]
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Vendor (0x50)     Completed without error  00%        0                -
Selective Self-tests/Logging not supported
Read SCT Status failed: scsi error aborted command
Read SCT Temperature History failed
Read SCT Status failed: scsi error aborted command
SCT (Get) Error Recovery Control command failed
Device Statistics (GP Log 0x04) not supported
SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x0001  2     0      Command failed due to ICRC error
0x0002  2     0      R_ERR response for data FIS
0x0003  2     0      R_ERR response for device-to-host data FIS
0x0004  2     0      R_ERR response for host-to-device data FIS
0x0005  2     0      R_ERR response for non-data FIS
0x0006  2     0      R_ERR response for device-to-host non-data FIS
0x0007  2     0      R_ERR response for host-to-device non-data FIS
0x0009  2     2      Transition from drive PhyRdy to drive PhyNRdy
0x000a  2     1      Device-to-host register FISes sent due to a COMRESET
0x000b  2     0      CRC errors within host-to-device FIS
0x000d  2     0      Non-CRC errors within host-to-device FIS
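The SATA Phy Event Counters table is one part of the "-d sat" output that does come through intact, so it could be a candidate for monitoring as well. The sketch below is only an illustration under that assumption: parse_phy_counters and nonzero_counters are hypothetical names, and the parser expects one counter per line in the ID / Size / Value / Description layout that smartctl prints.

```python
import re

# One counter row: hex ID, size in bytes, decimal value, free-text description.
_COUNTER_LINE = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+)$")


def parse_phy_counters(output):
    """Return {counter_id: (value, description)} from `smartctl -x` text.

    Only rows after the "SATA Phy Event Counters" heading are considered,
    so other lines containing hex numbers are ignored.
    """
    counters = {}
    in_section = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("SATA Phy Event Counters"):
            in_section = True
            continue
        if not in_section:
            continue
        match = _COUNTER_LINE.match(line)
        if match:
            counter_id, _size, value, description = match.groups()
            counters[int(counter_id, 16)] = (int(value), description)
    return counters


def nonzero_counters(counters):
    """Return only the counters whose value is not zero."""
    return {cid: entry for cid, entry in counters.items() if entry[0] != 0}
```

Applied to the output above, only counters 0x0009 and 0x000a are non-zero. A check would more usefully compare values between runs than alert on any non-zero count.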