Sun Fire 12K/15K Domain May "Dstop" When Another Domain Sharing an Expander is Performing a "setkeyswitch on/off" or a DR Operation



Category: Availability
Release Phase: Resolved
Product: Sun Fire 12K Server, Sun Fire 15K Server
Bug Id: 4943895
Date of Resolved Release: 13-JAN-2004


Impact

A Sun Fire 12K/15K domain may experience a Domain Stop (Dstop) when "hpost" is interrupted on another domain with which it shares an expander. "hpost" is run by either "setkeyswitch on" or a Dynamic Reconfiguration (DR) operation. It is possible that this issue may also occur during domain shutdown. Domains that do not share an expander with another domain are not susceptible to this issue.

Note: "hpost" can be interrupted either by manual intervention (e.g. pressing "ctrl-c" while the command is executing) or by setting the keyswitch of a domain to off/standby while the domain is being recovered by "dsmd".


Contributing Factors

This issue can occur in the following releases:

SPARC Platform

  • Sun Fire 12K/15K with System Management Software (SMS) 1.1
  • Sun Fire 12K/15K with System Management Software (SMS) 1.2
  • Sun Fire 12K/15K with System Management Software (SMS) 1.3 without patch 114640-09

To determine whether a system has a shared expander, run the following command:

    % showboards
    Location  Pwr  Type of Board   Board Status  Test Status  Domain
    --------  ---  -------------   ------------  -----------  ------
    SB12      On   CPU             Available     Unknown      A
    IO12      On   HPCI            Available     Unknown      B

In the above example, SB12 is assigned to Domain A and IO12 is assigned to Domain B. Both boards sit on expander 12, so that expander is shared between the two domains. This indicates a split expander configuration.
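
On a platform with many boards, the check described above can be automated. The following is a minimal sketch, assuming the "showboards" output has been captured as text and that each board line follows the column layout of the sample above (location first, domain last). The function name and the rule that the number in "SB12"/"IO12" identifies the expander are illustrative assumptions, not part of SMS.

```python
import re
from collections import defaultdict

# Matches a board location such as "SB12" or "IO12". The numeric part is
# assumed to be the expander number, as in the sample output above.
LOCATION = re.compile(r"^(SB|IO)(\d+)$")


def find_split_expanders(showboards_text):
    """Map expander number -> sorted domains, for expanders used by 2+ domains."""
    domains_by_expander = defaultdict(set)
    for line in showboards_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        match = LOCATION.match(fields[0])
        if not match:
            continue  # header, separator, or non-board line
        # Assumption: the last column is the domain the board belongs to.
        domains_by_expander[int(match.group(2))].add(fields[-1])
    return {
        expander: sorted(domains)
        for expander, domains in domains_by_expander.items()
        if len(domains) > 1
    }


sample = """\
Location  Pwr  Type of Board   Board Status  Test Status  Domain
--------  ---  -------------   ------------  -----------  ------
SB12      On   CPU             Available     Unknown      A
IO12      On   HPCI            Available     Unknown      B
"""
print(find_split_expanders(sample))  # {12: ['A', 'B']}
```

An empty result would suggest that no expander is split between domains, subject to the parsing assumptions stated above.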


Symptoms

Should the described issue occur, "dstop wfail" strings similar to the following may be found in the HW state dump directory "/var/opt/SUNWSMS/adm/X/dump":

    SDI EX06/S0: All SDI is DStopped and RStopped, requested by
    DARB.
    SDI EX07/S0  Master_Stop_Status0[31:0] = 0000010F
            MStop0[3:0]: All SDI logic is DStopped + Recordstopped.
            MStop0[8]: L1 Slot0 Ecc error line detected
    SDI EX07/S0  Slot[1:0][3:0] Ecc Error Count = 0 0
    SDI EX07/S0  Dstop0[31:0] = 000C8008
            Dstop0[18]: D    DARB texp requests Slot1 Dstop (M)
            Dstop0[19]: D 1E SDI internal core requested Dstop
    SDI EX07/S0  Core_Error0[31:0]  = 02008200  Mask = 0051FFFF
            CoreErr0[25]: D 1E Command pool timeout, non-split exp (M)
                valid_{slot_wr[1:0],read}_TO = 1 (rev 4+)
                {cmd_pool_loc[5:0],cmd4io,retired,half_used} = 020
    FAIL EXB EX7:  Dstop/Rstop detected by SDI EX7/S0.

Note: In the dump directory path above, X stands for the identifier of the affected domain.
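
To look for these strings across several dump files at once, a small script can help. The following is a minimal sketch, assuming the dump files are readable as text and have been copied to a host where Python is available. The function name and the choice of marker strings (taken from the sample output above) are illustrative assumptions; a real dump may word these lines differently.

```python
from pathlib import Path

# Marker strings taken from the sample dump above (an assumption: other
# dumps of the same failure are expected to contain similar text).
MARKERS = ("Command pool timeout", "Dstop/Rstop detected by SDI")


def scan_dumps(dump_dir, markers=MARKERS):
    """Return {file name: [markers found]} for files in dump_dir with a hit."""
    hits = {}
    for path in sorted(Path(dump_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        # Replace undecodable bytes rather than failing on odd content.
        text = path.read_text(errors="replace")
        found = [marker for marker in markers if marker in text]
        if found:
            hits[path.name] = found
    return hits
```

For example, scan_dumps("/var/opt/SUNWSMS/adm/A/dump") would scan the dump directory of a hypothetical domain A. A match is only a hint that this issue may have occurred and does not replace a full analysis of the dump.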


Workaround

There is no workaround. Please see the "Resolution" section below.


Resolution

This issue is addressed in the following release:

SPARC Platform

  • Sun Fire 12K/15K with System Management Software (SMS) 1.3 with patch 114640-09 or later

Note: SMS 1.1 and 1.2 require an upgrade to a later release.
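
Whether a given SMS 1.3 installation already carries the fix comes down to comparing the installed revision of patch 114640 against revision 09. The following is a minimal sketch, assuming a text listing of installed patch IDs is available (for example, captured from a patch listing command on the System Controller). The function names are illustrative, and the input strings in the example are hypothetical.

```python
import re


def highest_revision(text, base="114640"):
    """Return the highest revision of patch `base` found in text, or None."""
    pattern = rf"\b{re.escape(base)}-(\d+)\b"
    revisions = [int(rev) for rev in re.findall(pattern, text)]
    return max(revisions) if revisions else None


def has_fix(text, base="114640", min_revision=9):
    """True if text lists patch `base` at `min_revision` or later."""
    revision = highest_revision(text, base)
    return revision is not None and revision >= min_revision


# Hypothetical inputs, for illustration only.
print(has_fix("Patch: 114640-07"))  # False
print(has_fix("Patch: 114640-09"))  # True
```

This check applies only to SMS 1.3. SMS 1.1 and 1.2 must be upgraded regardless of the result.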




Article Details
Article ID : 200352
Article Type : Sun Alert
Last reviewed : 2004-01-06
Audience : PUBLIC