Special Firmware Installation Procedures Are Required to Prevent Loss of Volume Access on StorEdge T3/T3+ Arrays

Category: Availability
Release Phase: Resolved
Product: Sun StorageTek T3 Array, Sun StorageTek T3+ Array
Bug Id: 4697868
Date of Resolved Release: 01-APR-2003

Impact
Installation of firmware versions 1.18.01 for T3 arrays or 2.01.03 for T3+ arrays may result in marginal disk drives in the arrays being marked as "disabled". This could lead to loss of disk volume access if special firmware pre-installation procedures are not followed.
Contributing Factors
This issue can occur in the following releases:
- Sun StorEdge T3 Arrays upgraded to firmware 1.18.01 (patch 109115-12) or later
- Sun StorEdge T3+ Arrays upgraded to firmware 2.01.03 (patch 112276-06) or later
Note: This issue occurs due to the inclusion of improved disk error handling routines in firmware versions 1.18.01 (T3) and 2.01.03 (T3+) and later. Because of this change in error handling, marginal disk drives which have already logged certain types of errors may be disabled (failed) by the new firmware. Loss of volume access can occur when two or more disks are disabled within a T3/T3+ array at the same time. Any of the following types of disk errors may cause the new firmware to fail a drive:
Jun 05 06:16:14 ISR1[2]: W: Sense Key = 0x4, Asc = 0x15, Ascq = 0x1
Jun 05 06:16:14 ISR1[2]: W: Sense Data Description = Mechanical Positioning Error
Jul 31 16:19:22 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)
Jul 31 16:19:22 ISR1[1]: N: Sense Key = 0x1, Asc = 0x5d, Ascq = 0x0
Jul 31 16:19:22 ISR1[1]: N: Sense Data Description = Failure Prediction Threshold Exceeded
Symptoms
When this issue occurs, an affected T3/T3+ disk volume becomes unavailable to software applications running on the attached host. The Solaris OS may display "Offline" error messages for the volume or "SCSI transport error, giving up". The T3/T3+ "syslog" file shows two or more disk drives within the volume as "disabled".
Workaround
Please see the Resolution section below for how to avoid the described issue.
Resolution
This issue is addressed by closely following the patch installation procedures provided in the README files for patches 109115-12 (or later) and 112276-06 (or later). The required steps are:
1) Inspect the "syslog" file from the T3/T3+ array for any of the errors shown above by running the following command:
test_host% egrep -i '0x5D|Threshold|0x15|0x4|Mechanical|Positioning|Exceeded|Disk Error' syslog
Jun 05 06:16:14 ISR1[2]: W: u2d5 SCSI Disk Error Occurred (path = 0x0)
Jun 05 06:16:14 ISR1[2]: W: Sense Key = 0x4, Asc = 0x15, Ascq = 0x1
Jun 05 06:16:14 ISR1[2]: W: Sense Data Description = Mechanical Positioning Error
Jul 31 16:19:22 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)
Jul 31 16:19:22 ISR1[1]: N: Sense Key = 0x1, Asc = 0x5d, Ascq = 0x0
Jul 31 16:19:22 ISR1[1]: N: Sense Data Description = Failure Prediction Threshold Exceeded
2) Back up the data from volumes with disk drives reporting any one of the above errors.
3) Replace any disk drives reporting the above errors.
4) Verify the volume is in optimal working order.
5) Install the new firmware.
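
The egrep command in step 1 prints every matching line, so the affected drives still have to be picked out by eye. As a rough aid for steps 2 and 3, the following sketch lists only the drive identifiers (such as u2d5) that logged one of the two sense code combinations shown above. It is not part of the patch README. The function name flag_marginal_drives is hypothetical, and the sketch assumes that each "SCSI Disk Error Occurred" line is immediately followed by its own "Sense Key" line, as in the sample output; interleaved messages from different ISR1 tasks could break that assumption.

```shell
#!/bin/sh
# Sketch only: list drives in a saved copy of the T3/T3+ syslog that logged
# Sense Key 0x4 / Asc 0x15 or Sense Key 0x1 / Asc 0x5d.
# Assumption: the "SCSI Disk Error Occurred" line naming the drive directly
# precedes the matching "Sense Key" line, as in the sample output above.

flag_marginal_drives() {
    # $1: path to a copy of the array syslog file
    awk '
        /SCSI Disk Error Occurred/ {
            # Remember the drive named on this line (a field such as u2d5).
            drive = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /^u[0-9]+d[0-9]+$/) drive = $i
        }
        /Sense Key = / {
            # Record the drive only for the two code pairs from this alert.
            if (drive != "" &&
                ($0 ~ /Sense Key = 0x4, Asc = 0x15/ ||
                 $0 ~ /Sense Key = 0x1, Asc = 0x5[dD]/))
                seen[drive] = 1
            # Forget the drive so a later Sense Key line is not misattributed.
            drive = ""
        }
        END { for (d in seen) print d }
    ' "$1" | sort
}
```

Running flag_marginal_drives syslog against a file containing the six sample lines from step 1 would print u1d3 and u2d5, one per line. A drive that does not appear in the list is not thereby proven healthy; the full egrep output remains the reference.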