Data Inconsistencies May Occur When Persistent SCSI Parity Errors are Generated Between the Host and the SE33x0 Array



Category : Data Loss
Release Phase : Resolved
Product : Sun StorageTek 3310 SCSI Array, Sun StorageTek 3320 SCSI Array
Bug Id : 6363490, 6378796
Date of Workaround Release : 12-Jan-2006
Date of Resolved Release : 13-Mar-2008


1. Impact

When the connection between the host and the SE33x0 array has degraded to the point that WRITE requests cannot be completed, persistent SCSI parity errors may be generated between the host and the array, and data inconsistencies may occur.

2. Contributing Factors

This issue can occur on the following platforms:

  • Sun StorEdge 3310 SCSI array without firmware 4.15F (as delivered in patch 113722-15)
  • Sun StorEdge 3320 SCSI array without firmware 4.15G (as delivered in patch 113730-01)

SCSI parity errors can cause invalid data to be written into the array's cache. With firmware versions prior to 4.15, this data is eventually flushed to the disk media, permanently storing the invalid data on the volume. Firmware version 4.15 discards the corrupted data rather than writing it to disk media, which reduces the probability of corrupting the volume. However, in the rare case where the failed write command overlaps a prior write command's data that still resides in cache, that earlier data is also discarded.
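The difference between the two firmware behaviors can be illustrated with a toy model. This is only a sketch of the behavior described above, not the array's actual firmware logic; the class `WriteCacheModel`, its fields, and the `"CORRUPT"` marker are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WriteCacheModel:
    """Toy model of an array write cache keyed by logical block address (LBA)."""
    discard_on_parity_error: bool              # True models the 4.15 behavior
    cache: dict = field(default_factory=dict)  # LBA -> data not yet flushed
    disk: dict = field(default_factory=dict)   # LBA -> data on disk media

    def write(self, start_lba, blocks, parity_error=False):
        """Accept a WRITE; return True if the command completed cleanly."""
        lbas = range(start_lba, start_lba + len(blocks))
        if parity_error and self.discard_on_parity_error:
            # Drop the corrupted data, and with it any earlier cached data
            # for the same LBAs (the rare overlapping-write case).
            for lba in lbas:
                self.cache.pop(lba, None)
            return False
        for lba, block in zip(lbas, blocks):
            # With older firmware the corrupted block stays in cache.
            self.cache[lba] = "CORRUPT" if parity_error else block
        return not parity_error

    def flush(self):
        """Flush all cached blocks to disk media."""
        self.disk.update(self.cache)
        self.cache.clear()


old = WriteCacheModel(discard_on_parity_error=False)
old.write(100, ["a", "b"])
old.write(101, ["x"], parity_error=True)
old.flush()
print(old.disk)   # {100: 'a', 101: 'CORRUPT'}

new = WriteCacheModel(discard_on_parity_error=True)
new.write(100, ["a", "b"])
new.write(101, ["x"], parity_error=True)
new.flush()
print(new.disk)   # {100: 'a'}
```

In the second run nothing corrupt reaches the disk, but the earlier, valid write to LBA 101 is lost along with the failed one, which is the overlapping-write exposure that remains with firmware 4.15.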

Single Path Configurations

Configurations in which a host has only one path to one or more logical units on the array are exposed to this problem. With no redundant path between the host and the SE33x0 array, a failed WRITE request cannot be retried over a second path.

When using firmware version 4.15 in this configuration, if any write commands fail due to parity errors, write data in cache may be lost if the application or file system issued writes to overlapping LBAs.

When using older firmware in this configuration, the LBAs of any WRITE request that cannot be completed as a result of a PARITY ERROR returned by the SE33x0 array should be considered to contain invalid data.

Multi Path/High Availability Configurations

The exposure for a properly configured High Availability configuration, using a host multi-pathing driver and multiple separate connections between the host(s) and the SE33x0 array, is very small. In this configuration, the multi-pathing driver in the host will use the second, non-compromised path to the array controller to retry the WRITE request. A successful retry will write the intended data to the correct LBAs, with the following exceptions:

1. If the SE33x0 array or the host experiences a power failure between the failed WRITE request and the successful completion of the retry down the second path, the data for the failed WRITE request should be considered invalid.

2. If the Host OS experiences a crash or a multi-path driver error between the failed WRITE request and the successful completion of the retry down the second path, the data for the failed WRITE request should be considered invalid.
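The retry behavior described above can be sketched as follows. This is a simplified illustration, not the logic of any particular multi-pathing driver; `ParityError`, `write_with_failover`, and the two path functions are hypothetical names, and the sketch does not model the power-failure or crash exceptions listed above.

```python
class ParityError(Exception):
    """Raised by a hypothetical path when the array reports a parity error."""


def write_with_failover(paths, lba, data):
    """Try the WRITE on each path in order; return the index of the path used."""
    for index, send in enumerate(paths):
        try:
            send(lba, data)
            return index
        except ParityError:
            continue  # this path is compromised; try the next one
    # No path succeeded: the data for these LBAs must be treated as invalid.
    raise IOError(f"WRITE to LBA {lba} failed on all {len(paths)} paths")


def degraded_path(lba, data):
    """Hypothetical path with a persistent parity error."""
    raise ParityError("persistent parity error")


stored = {}


def healthy_path(lba, data):
    """Hypothetical non-compromised path; records what the array received."""
    stored[lba] = data


print(write_with_failover([degraded_path, healthy_path], 2048, b"payload"))  # 1
```

With only `degraded_path` in the list, the same call raises `IOError`, which corresponds to the single path exposure described earlier.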

3. Symptoms

Should the described issue occur, persistent SCSI parity errors will be generated between the host and the SE33x0 array. The SE33x0 array will return a SCSI status of "Parity Error" to the host SCSI Host Bus Adapter (HBA). Typically, the host SCSI HBA will retry the WRITE request some number of times (most drivers attempt between 2 and 6 retries) before returning the WRITE request to the application with a FAILURE status.
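The bounded retry behavior can be modeled in a few lines. This is a sketch under stated assumptions, not driver code: `submit_write`, its `max_retries` parameter, and the `"GOOD"` and `"FAILURE"` status strings are hypothetical, and the default of 3 retries is simply a value inside the 2 to 6 range mentioned above.

```python
def submit_write(send, max_retries=3):
    """Issue a WRITE, retrying on parity error; return (status, attempts)."""
    attempts = 0
    while attempts <= max_retries:   # one initial attempt plus max_retries
        attempts += 1
        if send():                   # True means the array accepted the WRITE
            return "GOOD", attempts
    return "FAILURE", attempts


# A persistently degraded link fails every attempt.
print(submit_write(lambda: False, max_retries=3))   # ('FAILURE', 4)

# A transient error that clears on the second attempt.
results = iter([False, True])
print(submit_write(lambda: next(results)))          # ('GOOD', 2)
```

A transient error is absorbed by the retries; only a persistent parity condition exhausts them and surfaces to the application as a failed WRITE.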

4. Workaround

There is no workaround for this issue. Please see the resolution section below.

5. Resolution

The issue described in BugID 6363490 is addressed on the following platforms:
  • Sun StorEdge 3310 SCSI array with firmware revision 4.15F (as delivered in patch 113722-15 or later)
  • Sun StorEdge 3320 SCSI array with firmware revision 4.15G (as delivered in patch 113730-01 or later)
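An inventory check against these minimum revisions might look like the sketch below. It assumes that firmware revisions have the form major.minor plus an optional letter, and that they order numerically and then alphabetically by suffix; that ordering is an assumption made for illustration, not something this Sun Alert states. The names `FIXED_FIRMWARE`, `parse_revision`, and `has_fix` are hypothetical.

```python
import re

# Minimum fixed firmware per model, taken from the list above.
FIXED_FIRMWARE = {"3310": "4.15F", "3320": "4.15G"}


def parse_revision(rev):
    """Split a revision such as '4.15F' into (4, 15, 'F')."""
    match = re.fullmatch(r"(\d+)\.(\d+)([A-Z]?)", rev.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware revision: {rev!r}")
    major, minor, suffix = match.groups()
    return int(major), int(minor), suffix


def has_fix(model, rev):
    """Return True if rev is at or above the fixed revision for model.

    Assumes revisions compare numerically, then by letter suffix.
    """
    return parse_revision(rev) >= parse_revision(FIXED_FIRMWARE[model])


print(has_fix("3310", "4.13C"))  # False
print(has_fix("3310", "4.15F"))  # True
print(has_fix("3320", "4.15F"))  # False
```

How the running revision is obtained from the array is outside this sketch; the function only compares a revision string that has already been collected.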

Note: Ensure that SCSI connections are reliable and properly configured to minimize the probability of parity errors, and use multiple SCSI connections with failover drivers.

Because the nature of the changes would require a major redesign, the issue described in BugID 6378796 was closed as "will not fix."

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.



Modification History

14-Jun-2006: Updated Contributing Factors and Resolution Sections
13-Mar-2008: Updated Resolution section - RESOLVED





Attachments
This solution has no attachment

 
 

Article Details
Article ID : 200437
Article Type : Sun Alert
Last reviewed : 2008-03-13
Audience : PUBLIC