- Sun Alert ID: 101673 (formerly 57765)
- Synopsis: System May Hang or Panic Accompanied by "lpost" Messages
- Category: Availability
- Product: Sun Fire 3800 Server, Sun Fire 4800 Server, Sun Fire 4810 Server, Sun Fire 6800 Server, Sun Fire E6900 Server, Sun Fire E2900 Server, Sun Fire V1280 Server, Sun Fire E4900 Server
- BugIDs: 4978865, 5054736
- Avoidance: Patch
- State: Resolved
- Date Released: 25-Apr-2005, 01-Aug-2005
- Date Closed: 01-Aug-2005
- Date Modified: 01-Aug-2005, 05-Dec-2005, 22-Mar-2006, 23-Aug-2006
1. Impact
False indications of hardware failure may be misdiagnosed as genuine faults, and the resulting remedial action may lead to unnecessary hardware replacement. Application availability may also be lost due to a system panic or hang, which may require a "setkeyswitch" cycle to recover.
2. Contributing Factors
This issue can occur on the following platforms:
- Sun Fire 2900, 3800, 4800, 4810, 4900, 6800, 6900 and V1280 servers with System Controller (ScApp) firmware versions 5.18.x and earlier without ScApp firmware patch 114526-01, and domains running Solaris 9 without Kernel update patch 117171-14
Notes:
- Solaris 7, Solaris 8 and Solaris 10 are not affected by this issue. The Solaris x86 platform is not affected by this issue.
- In some cases, use of prtdiag(1M) has been shown to trigger false indications of system hardware failure.
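The firmware half of the condition above can be sketched as a version comparison. This is a minimal illustration, not a supported tool: it assumes the version string shown in an lpost banner line (for example "lpost 5.17.2") matches the installed ScApp firmware version, and the function names are hypothetical. A domain is only exposed if it also runs Solaris 9 without Kernel update patch 117171-14.

```python
import re

# Assumption: the ScApp firmware version can be read from an lpost banner
# line such as "@(#) lpost 5.17.2 2004/08/13 11:53".
LPOST_BANNER = re.compile(r"lpost\s+(\d+)\.(\d+)\.(\d+)")

# First ScApp firmware version listed in this alert as addressing the issue.
FIXED_SCAPP = (5, 19, 0)


def lpost_version(line):
    """Return (major, minor, micro) parsed from an lpost banner, or None."""
    match = LPOST_BANNER.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def firmware_is_affected(version):
    """True when the version predates 5.19.0, that is 5.18.x and earlier."""
    return version < FIXED_SCAPP


if __name__ == "__main__":
    banner = "{/N0/SB4/P1} @(#) lpost 5.17.2 2004/08/13 11:53"
    version = lpost_version(banner)
    print(version, firmware_is_affected(version))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "5.9.0" sorting after "5.18.0".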
3. Symptoms
If the described issue occurs, the system may present false indications of system hardware failure. In most cases, showlogs, showerrorbuffer and the domain messages contain little or no information to indicate an error. The "WARNING: Asynchronous Event" message on the console, coupled with a system hang or panic, is the only indicator. "(TO) Time-out from system bus" and/or "(PRIV) Privileged code access error(s)" messages may also be displayed.
Asynchronous event "lpost" messages or panic messages similar to the following examples (from the platform loghost output) may appear during routine shutdown or reboot:
{/N0/SB1/P2} WARNING: Asynchronous Event.
{/N0/SB1/P2} Component under test: /N0/SB1/P2 CPU
{/N0/SB1/P2} Unexpected event occurred
{/N0/SB1/P2} Ino = 00000000.00000000
{/N0/SB1/P2} tl tt tstate tpc tnpc
{/N0/SB1/P2} 01 60 00000044.80000604 000007ff.f000bd3c 000007ff.f000bd40
{/N0/SB1/P2} AFSR = 00000000.00000000
{/N0/SB1/P2} AFAR = 00000028.04001800
{/N0/SB1/P2} IMMU SFSR = 00000000.00000000
{/N0/SB1/P2} DMMU SFSR = 00000000.00000000
{/N0/SB1/P2} DMMU SFAR = 00000300.14821480
{/N0/SB1/P2} PState = 00000000.00000814
{/N0/SB1/P2} Dispatch Control =00000000.00000000
{/N0/SB1/P2} Data Cache Unit Control =00000000.00000000
{/N0/SB1/P2} Safari Config. = 0aaa0028.200c0006
{/N0/SB1/P2} EState = 00000000.0000000b
{/N0/SB1/P2} tl tt tstate tpc tnpc
{/N0/SB1/P2} 02 32 00000099.80081402 000007ff.f0006cc0 000007ff.f0006cc4
{/N0/SB1/P2} 01 60 00000044.80000604 000007ff.f000bd3c 000007ff.f000bd40
{/N0/SB1/P2} (TO) Time-out from system bus
{/N0/SB1/P2} (PRIV) Privileged code access error(s)
This second example displays another variation of an "Asynchronous Event" message from lpost:
{/N0/SB4/P1} @(#) lpost 5.17.2 2004/08/13 11:53
{/N0/SB4/P1} Copyright 2001-2004 Sun Microsystems, Inc. All rights reserved.
{/N0/SB4/P1} Use is subject to license terms.
{/N0/SB4/P1} test case reset reason = 00000000.0404ff07
{/N0/SB4/P1} test case ecache_size=00000000.00800000, tag_size=00000000.00004000
{/N0/SB4/P1} test case Ecache Mode: 0:3:3
{/N0/SB4/P1} test case E$ control register = 00000000.00094400
{/N0/SB4/P1} @(#) lpost 5.17.2 2004/08/13 11:53
{/N0/SB4/P1} Copyright 2001-2004 Sun Microsystems, Inc. All rights reserved.
{/N0/SB4/P1} Use is subject to license terms.
{/N0/SB4/P1} test case reset reason = 00000004.04ff0707
{/N0/SB4/P1} test case ecache_size=00000000.00800000, tag_size=00000000.00004000
{/N0/SB4/P1} test case Ecache Mode: 0:3:3
{/N0/SB4/P1} test case E$ control register = 00000000.00094400
{/N0/SB4/P1} test case IoSram Add : 0000041c.00900000
{/N0/SB4/P1} WARNING: Asynchronous Event.
{/N0/SB4/P1} Component under test: /N0/SB4/P1 CPU
{/N0/SB4/P1} Task 00000000.00037144 does not exist
This third example displays an "ERROR" message from lpost. (The ERROR message format was replaced with the WARNING message format by changes made for bug 4988128, in firmware revisions 5.15.5, 5.16.1, 5.17.1, 5.18.0 and higher.)
{/N0/SB1/P2} Use is subject to license terms.
{/N0/SB1/P2} test case reset reason = 00000001.04ff0707
{/N0/SB1/P2} test case ecache_size=00000000.00800000, tag_size=00000000.00004000
{/N0/SB1/P2} test case E$ control register = 00000000.07c55400
{/N0/SB1/P2} test case IoSram Add : 00000420.00900000
{/N0/SB1/P2} ERROR: TEST=Dummy,SUBTEST=Slave Test ID=0.0
{/N0/SB1/P2} Component under test: /N0/SB1/P2 CPU
{/N0/SB1/P2} Task 00000000.000374a8 does not exist
{/N0/SB1/P2} @(#) lpost 5.15.3 2003/09/30 23:01
A second scenario is a system panic, also accompanied by one of the above types of error messages in the console logs. The system recovers via a panic reboot. Panic messages vary; some examples are:
Example 1:
panic: failed to stop cpu5
panic[cpu6]/thread=30005c537c0: bad kernel MMU trap at TL 2
%tl %tpc %tnpc %tstate %tt
1 000000000101819c 00000000010181a0 9900001601 068
%ccr: 99 %asi: 00 %cwp: 1 %pstate: 16<PEF,PRIV,IE>
2 0000000001008c44 0000000001008c48 4400041401 034
%ccr: 44 %asi: 00 %cwp: 1 %pstate: 414<MG,PEF,PRIV>
Example 2:
panic: failed to stop cpu6
panic[cpu5]/thread=2a100c97d40: bad kernel MMU miss at TL 2
%tl %tpc %tnpc %tstate %tt
1 000000000104cf68 000000000104cf6c 4400001603 060
%ccr: 44 %asi: 00 %cwp: 3 %pstate: 16<PEF,PRIV,IE>
2 00000000010cf884 00000000010cf888 9900081404 068
Notes:
- The Asynchronous Event warning messages may or may not include the "test case reset reason =" line. A test case reset reason code ending in "7" indicates a "red_state" condition.
- If other errors are observed in the system logs, these should be investigated as well.
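The indicators described in this section can be picked out of saved platform loghost output with a short script. The following is a rough sketch that assumes each line carries a "{/N0/SBx/Py}" component prefix as in the examples above; the function name and the result layout are hypothetical. It flags the "WARNING: Asynchronous Event" message and any "test case reset reason" code ending in "7" (the "red_state" condition).

```python
import re

# Assumption: every lpost line starts with a "{component}" prefix, as in
# the loghost examples in this alert.
PREFIX = re.compile(r"^\{(/[^}]+)\}\s*(.*)$")
RESET_REASON = re.compile(
    r"test case reset reason = ([0-9a-fA-F]{8}\.[0-9a-fA-F]{8})"
)


def scan_lpost_log(lines):
    """Map each component to the indicators seen in its lpost output."""
    findings = {}
    for raw in lines:
        match = PREFIX.match(raw.strip())
        if match is None:
            continue  # not an lpost line with a component prefix
        component, text = match.groups()
        entry = findings.setdefault(
            component, {"async_event": False, "red_state": False}
        )
        if "WARNING: Asynchronous Event" in text:
            entry["async_event"] = True
        reason = RESET_REASON.search(text)
        # A reset reason code ending in "7" indicates red_state.
        if reason and reason.group(1).endswith("7"):
            entry["red_state"] = True
    return findings


if __name__ == "__main__":
    sample = [
        "{/N0/SB4/P1} test case reset reason = 00000000.0404ff07",
        "{/N0/SB4/P1} WARNING: Asynchronous Event.",
        "{/N0/SB4/P1} Component under test: /N0/SB4/P1 CPU",
    ]
    print(scan_lpost_log(sample))
```

A match only shows that the log resembles the examples in this alert; any other errors in the system logs still need to be investigated separately.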
4. Relief/Workaround
There is no workaround for this issue. Please see the Resolution section.
5. Resolution
This issue is addressed on the following platforms:
- Sun Fire 2900, 3800, 4800, 4810, 4900, 6800, 6900 and V1280 servers with System Controller (ScApp) firmware version 5.19.0 (for Solaris 9) as delivered in ScApp firmware patch 114526-01 or later and Kernel update patch 117171-14 or later
Note: Kernel update patch version 117171-14 or higher is necessary to resolve BugID 4978865. Both patches must be installed to fully resolve this issue.
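Because both patches are required, a check of installed patch revisions against the two minimums can be sketched as follows. This is an illustration under stated assumptions: the list of installed patch IDs is supplied by the caller (for example, collected by hand from the system's patch listing), the function names are hypothetical, and the sketch does not account for a patch that has been superseded under a different base number.

```python
# Minimum revisions named in the Resolution section of this alert.
REQUIRED = {"114526": 1, "117171": 14}


def parse_patch_id(patch_id):
    """Split an ID such as "117171-14" into ("117171", 14)."""
    base, _, rev = patch_id.strip().partition("-")
    if not (base.isdigit() and rev.isdigit()):
        raise ValueError("not a patch ID: %r" % patch_id)
    return base, int(rev)


def missing_patches(installed_ids):
    """Return the required patch IDs not met by the installed revisions."""
    best = {}
    for patch_id in installed_ids:
        base, rev = parse_patch_id(patch_id)
        # Keep the highest installed revision of each base patch number.
        best[base] = max(rev, best.get(base, -1))
    return sorted(
        "%s-%02d" % (base, rev)
        for base, rev in REQUIRED.items()
        if best.get(base, -1) < rev
    )


if __name__ == "__main__":
    # Firmware patch present, kernel patch too old.
    print(missing_patches(["114526-01", "117171-12"]))
```

An empty result means both minimum revisions are met; any returned ID names a patch that still needs to be installed or upgraded.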
Change History
- Updated Contributing Factors and Resolution sections
- Updated Contributing Factors section
- Updated Contributing Factors and Resolution sections
- Updated Contributing Factors and Resolution sections
This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.
Copyright 2000-2006 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.

