Sun Fire V440 and Netra 440 Systems Using a Specific Networking Configuration May Unexpectedly Reset



Category: Availability
Release Phase: Resolved
Product: Sun Fire V440 Server, Netra 440 Server
Bug Id: 5039862
Date of Resolved Release: 29-SEP-2005


Impact

Under certain conditions using a specific network configuration, the Sun Fire V440 or Netra 440 system may experience an unexpected reset and reboot.


Contributing Factors

This issue can occur in the following releases:

SPARC Platform

  • Sun Fire V440
  • Netra 440

This issue occurs only when system bus signal activity coincides with specific PCI bus signal activity on the first onboard Ethernet interface. Under Solaris this is typically the logical device "ce0"; physically it is the Ethernet RJ45 connector NET 0.


Symptoms

If the described issue occurs, the system resets and the following error message appears on the console:

     Fatal Error Reset
     SC Alert: Host System has Reset

The system then reboots. No core files are generated, and the reset output will not be logged to the "/var/adm/messages" file.

If you suspect that the system is experiencing this issue, change the OBP variables as follows to provide more verbose output in the event of another occurrence.

Note: The OBP settings below are recommended only to verify whether the system is experiencing this issue and should not be used long term. Before changing them, make a note of the original values; once the failure is verified, set the parameters back to those values. The settings below provide more verbose output:

     diag-switch?    true
     post-trigger    none
     obdiag-trigger  none

When the parameters above are set, the error message will include some additional information indicating the reset reason as "PBM FATAL", with a PCI IO-Bridge register output similar to:

  	Fatal Error Reset
  	SC Alert: Host System has Reset

  	@(#)OBP 4.10.10 2003/08/29 06:25 Sun Fire V440
  	Clearing TLBs
  	Loading Configuration
  	Membase: 0000.0033.0000.0000
  	MemSize: 0000.0000.4000.0000
  	Init CPU arrays Done
  	Init E$ tags Done
  	Setup TLB Done
  	MMUs ON
  	Scrubbing Tomatillo tags... 0 1
  	Block Scrubbing Done
  	Find dropin, Copying Done, Size 0000.0000.0000.5ca0
  	PC = 0000.07ff.f000.4c88
  	PC = 0000.0000.0000.4d28
  	Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.6700
  	ttya initialized
  	System Reset: (PBM FATAL)
  	JBUS-PCI bridge
  	JBUS-PCI bridge
  	slave Error Register: 8000000000001000
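
Because the reset output is not written to "/var/adm/messages", the signature above can only be found in console output captured by some other means. The following is a minimal sketch, assuming the console text has already been saved to a file or string; the function name and the returned dictionary layout are illustrative and are not part of any Sun tool. It looks for the "System Reset: (PBM FATAL)" line and the "slave Error Register" value shown in the sample output.

```python
import re

# Patterns based on the sample console output shown in this document.
REASON_RE = re.compile(r"System Reset:\s*\((?P<reason>[^)]+)\)")
REGISTER_RE = re.compile(r"slave Error Register:\s*(?P<value>[0-9a-fA-F]+)")


def find_pbm_fatal_resets(console_text):
    """Return one dict per 'PBM FATAL' reset found in captured console text.

    Hypothetical helper: each dict holds the reset reason and the first
    'slave Error Register' value printed after it (None if not found).
    """
    events = []
    current = None
    for line in console_text.splitlines():
        reason = REASON_RE.search(line)
        if reason:
            # Every "System Reset:" line starts a new event; only
            # PBM FATAL resets are recorded.
            current = None
            if reason.group("reason").strip() == "PBM FATAL":
                current = {"reason": "PBM FATAL", "slave_error_register": None}
                events.append(current)
            continue
        register = REGISTER_RE.search(line)
        if register and current is not None:
            if current["slave_error_register"] is None:
                current["slave_error_register"] = register.group("value")
    return events
```

A captured log could then be checked with something like `find_pbm_fatal_resets(open("console.log").read())`, where "console.log" is a placeholder for wherever the console capture is stored. An empty result means only that the signature was not found in that capture, not that the system is unaffected.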


Workaround

To work around the described issue, use the steps provided below:

1a) If the application only requires a single network port, use only the second onboard Ethernet interface, net1 (ce1).

OR

1b) If the application requires multiple network ports, install a PCI Ethernet card in any available PCI slot. Placing the card in a 33MHz slot (Slots 0, 1, and 3) may lower performance relative to using a 66MHz slot (Slots 2, 4, and 5). Slot 5 is preferred.

2) To ensure that the onboard net0 port (ce0) is not inadvertently accessed in a manner that could trigger this issue (e.g. by SunVTS), it is highly recommended that the ce0 interface be completely disabled. Because of Solaris instance numbering, it is also recommended that this be done after the initial Solaris installation, to ensure that net1 is assigned the ce1 instance instead of ce0.

To completely disable onboard net0 (ce0) on the system, install an NVRAM script by using the following commands at the OBP "ok" prompt:

   ok nvedit
      0: probe-all install-console banner
      1: " /pci@1c,600000/network@2" $delete-device drop
      2:
      ^C

Type "Ctrl-C" to exit nvedit as shown above. Then continue with:

   ok nvstore
   ok setenv use-nvramrc? true
   use-nvramrc? =        true
   ok reset-all

After the system resets, net0 (ce0) should not be visible to OBP (i.e. no path to net0 [/pci@1c,600000/network@2] should appear when you run "show-devs" from OBP), and the net0 (ce0) device should not be seen by Solaris (e.g. in the output of the prtconf or prtpicl commands).
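
The check described above can be sketched as a small script, assuming the "show-devs" output has been copied from the console into a text string. The function name is hypothetical; only the device path comes from this document. The sketch reports whether the net0 path, or any node beneath it, still appears in the listing.

```python
# Device path of onboard net0 as given in this document.
NET0_PATH = "/pci@1c,600000/network@2"


def net0_still_present(show_devs_text):
    """Return True if the listing still contains the net0 device path.

    Hypothetical helper: expects text with device paths separated by
    whitespace or newlines, such as captured "show-devs" output.
    """
    for line in show_devs_text.splitlines():
        for token in line.split():
            # Match the node itself, a child node, or a path with
            # device arguments, but not a longer unit address such
            # as network@20.
            if token == NET0_PATH:
                return True
            if token.startswith((NET0_PATH + "/", NET0_PATH + ":")):
                return True
    return False
```

If the function returns True for output captured after "reset-all", the NVRAM script probably did not take effect; in that case it is worth confirming that "use-nvramrc?" is set to true and that the script was saved with "nvstore".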

Note: Additional information is available through normal support channels.


Resolution

Hardware remediation options are available. Please contact your local Sun Services representative and reference this document.




Modification History


Date: 14-JAN-2005
  • Updated Contributing Factors and Resolution sections

Date: 10-MAR-2005
  • Updated Impact and Relief/Workaround sections

Date: 29-SEP-2005
  • State: Resolved



Attachments
This solution has no attachment

 
 
Article Details
Article ID : 201170
Article Type : Sun Alert
Last reviewed : 2005-09-29
Audience : PUBLIC