Sun Fire F12K/F15K/E20K/E25K Platforms With CP2140 Based System Controllers May Take Several Hours for I2 Network Initialization

Category: Availability
Release Phase: Resolved
Product: Sun Fire 12K Server, Sun Fire E20K Server, Sun Fire 15K Server, Sun Fire E25K Server
Bug Id: 5049856, 5065599
Date of Resolved Release: 07-OCT-2004
Impact
Sun Fire F12K/F15K/E20K/E25K platforms with CP2140 based system controllers (SCs) may take several hours for I2 network initialization. This may occur any time an SC is reset, including at cold start, when the SC is rebooted, or when SMS failover occurs. While the I2 network is not yet initialized, a failure of SSCPOST on the Spare SC will prevent the Spare SC from being reset by the Main SC. This may prevent SC failover from entering an enabled state.
Although the potential for platform failure is small, Sun strongly encourages the installation of the patches listed in the Resolution section below in order to restore availability to the I2 network.
Note: The I2 network path is used for SC to SC communications such as data synchronization. A redundant path is available for SC to SC communications through the platform backplane. Traffic automatically flows through this path when the I2 path is unavailable.
Contributing Factors
This issue can occur in the following releases:
SPARC Platform
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.3 (for Solaris 8) and without patches 115287-04 and 114627-04
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.3 (for Solaris 9) and without patches 115287-04 and 114628-03
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.4.1 (for Solaris 8) and without patches 118061-01 and 118048-01
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.4.1 (for Solaris 9) and without patches 118061-01 and 118049-01
Note: This issue only affects Sun Fire F12K/F15K systems equipped with CP2140 based SCs.
The following command can be run on the SC to determine the controller type:
sc0:sms-svc:5> uname -i
SUNW,UltraSPARCengine_CP-40
The example above shows the output expected from a CP2140 based system controller.
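The check above can be scripted. The following is a minimal sketch, not a supported tool: the function name is_cp2140 is hypothetical, and the match string is taken only from the example output shown above.

```shell
#!/bin/sh
# is_cp2140 PLATFORM: succeed if PLATFORM is the string that "uname -i"
# prints in the example above for a CP2140 based SC.
# (Hypothetical helper; the name is not part of SMS or Solaris.)
is_cp2140() {
    case "$1" in
        "SUNW,UltraSPARCengine_CP-40") return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the SC:
#   if is_cp2140 "`uname -i`"; then echo "CP2140 based SC"; fi
```

Passing the platform string as an argument, rather than calling uname inside the function, keeps the check easy to try against saved output from another SC.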
Symptoms
Should the described issue occur, messages similar to the following will appear in the SC's "/var/adm/messages" file:
May 16 06:02:26 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri2 : No response
from Ethernet network : Link down -- cable problem?
May 16 06:02:47 sc0-garfield last message repeated 1 time
May 16 06:03:15 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri1 : No response
from Ethernet network : Link down -- cable problem?
May 16 06:03:35 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri2 : 100 Mbps full duplex link up
May 16 06:03:50 sc0-garfield in.routed[102]: [ID 300549 daemon.warning] interface scman1 to 10.3.253.194
restored
May 16 06:03:57 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri1 : No response
from Ethernet network : Link down -- cable problem?
May 16 06:04:03 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri1 : 100 Mbps full duplex link up
May 16 06:04:21 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri1 : No response
from Ethernet network : Link down -- cable problem?
May 16 06:04:48 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri2 : No response
from Ethernet network : Link down -- cable problem?
May 16 06:04:55 sc0-garfield in.routed[102]: [ID 588247 daemon.warning] interface scman1 to 10.3.253.194
broken: in=0 ierr=0 out=0 oerr=5
May 16 06:05:04 sc0-garfield eri: [ID 517527 kern.info] SUNW,eri1 : 100 Mbps full duplex link up
May 16 06:05:20 sc0-garfield in.routed[102]: [ID 300549 daemon.warning]
interface scman1 to 10.3.253.194 restored
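To gauge how often the links are flapping, the link-down messages can be counted per interface. The sketch below is illustrative only: the function name count_link_down is hypothetical, and it assumes each syslog entry is stored on a single line in the file (the messages above are wrapped for display). Lines of the form "last message repeated N time" are not counted.

```shell
#!/bin/sh
# count_link_down: read syslog text on stdin and print "<interface> <count>"
# for each eri interface that logged a "No response" link-down message.
# (Hypothetical helper; the name is not part of SMS or Solaris.)
count_link_down() {
    awk '/eri: .*SUNW,eri[0-9]+ : No response/ {
        # Find the field that names the interface, e.g. SUNW,eri1
        for (i = 1; i <= NF; i++)
            if ($i ~ /^SUNW,eri[0-9]+$/) n[$i]++
    }
    END { for (k in n) print k, n[k] }' | sort
}

# Usage on the SC:
#   count_link_down < /var/adm/messages
```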
Output from the "showfailover -v" command may also show the I2 network status as "FAILED":
sc0-garfield:sms-svc:5> showfailover -v
SC Failover Status: ACTIVE
Status of Shared Memory:
HASRAM (CSB at CS0): .......................................Good
HASRAM (CSB at CS1): .......................................Good
Status of sc0-garfield:
Role: .......................................MAIN
SMS Daemons: .......................................Good
System Clock: .......................................Good
Private I2 Network: .....................................FAILED
Private HASRAM Network: .......................................Good
<output truncated>
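The I2 status line can be extracted from this output for use in a monitoring script. The following is a sketch under stated assumptions: the function name i2_status is hypothetical, and the line layout ("Private I2 Network:" followed by dots and a status word) is assumed to match the example above. Only the first matching line is reported, which in the example belongs to the first SC listed.

```shell
#!/bin/sh
# i2_status: read "showfailover -v" output on stdin and print the status
# word from the first "Private I2 Network:" line (for example Good or FAILED).
# (Hypothetical helper; the name is not part of SMS.)
i2_status() {
    # Strip everything up to and including the label and its dot leaders.
    sed -n 's/.*Private I2 Network:[ .]*//p' | head -n 1
}

# Usage on the SC:
#   showfailover -v | i2_status
```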
Workaround
Any workaround previously applied to address this issue must be removed before patches are installed.
Resolution
This issue is addressed in the following releases:
SPARC Platform
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.3 (for Solaris 8) and with patches 115287-04 and 114627-04 or later
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.3 (for Solaris 9) and with patches 115287-04 and 114628-03 or later
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.4.1 (for Solaris 8) and with patches 118061-01 and 118048-01 or later
- Sun Fire F12K/F15K/E20K/E25K with SMS 1.4.1 (for Solaris 9) and with patches 118061-01 and 118049-01 or later
Note: Both patches in the applicable combination must be installed to fix the issue.
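Whether a required patch is present at a sufficient revision can be checked from the installed patch list. The sketch below is illustrative and rests on assumptions: the function name has_patch is hypothetical, the input is assumed to follow the "Patch: NNNNNN-RR ..." line format printed by "showrev -p", and the awk in use must accept the -v option (on Solaris this generally means nawk rather than the default awk).

```shell
#!/bin/sh
# has_patch BASE MINREV: read "showrev -p" style output on stdin and succeed
# if patch BASE is listed at revision MINREV or later.
# (Hypothetical helper; the name is not part of SMS or Solaris.)
has_patch() {
    awk -v base="$1" -v min="$2" '
        $1 == "Patch:" {
            # $2 looks like 115287-04: split into base ID and revision
            split($2, p, "-")
            if (p[1] == base && p[2] + 0 >= min + 0) found = 1
        }
        END { exit !found }'
}

# Usage on an SC running SMS 1.3 on Solaris 8:
#   showrev -p | has_patch 115287 04 && showrev -p | has_patch 114627 04 \
#       && echo "both patches present"
```

Each patch of the pair is checked separately because the fix requires both to be installed.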