Sun Fire 12K or Sun Fire 15K System Controller (SC) May Encounter a Failover Management Daemon Failure and Cause a Domain to Dstop
Category: Availability, Data Loss
Release Phase: Resolved
Product: Sun Fire 12K Server, Sun Fire 15K Server
Bug Id: 4795711, 4760870
Date of Resolved Release: 28-MAR-2003
Impact
In the event of a System Controller (SC) failover on a Sun Fire 12K or 15K, fomd(1M) might fail during operations. The fomd(1M) daemon may either die or hang, potentially causing a "Dstop" on the domain.
Contributing Factors
This issue can occur in the following releases:
- Sun Fire 12K/15K with System Management Software (SMS) 1.1
- Sun Fire 12K/15K with System Management Software (SMS) 1.2 without patch 112481-10
Note: SMS 1.3 is not affected.
Symptoms
Under heavy CPU load on the SC, SMS platform messages similar to the following may appear:
Feb 12 11:10:02 2002 HostName ssd[404]: [1312 1766410829297214 ERR
StartupManager.cc 2850] software component failed: name=fomd
The following is an indication that fomd(1M) has failed:
% ps -ef | grep fomd
root 3914 1 0 Oct 21 ? 126:32 fomd
root 1982 3914 0 Oct 30 ? 0:00 fomd
In the above example, the child process (pid 1982) shows no run time (0:00).
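The check described above can be sketched as a small shell function. This is only a heuristic built from the example output: the function name check_fomd is hypothetical, and it assumes the command name is the last field of each `ps -ef` line and TIME is the field before it. A child that has only just started may also show 0:00, so a match is a hint, not proof of failure. On Solaris, nawk may be needed in place of awk.

```shell
#!/bin/sh
# Heuristic sketch (not part of SMS): read `ps -ef` output on stdin and
# report any fomd process whose parent is also fomd and whose accumulated
# CPU time is 0:00, as in the example above.
# Assumption: the command is the last field and TIME is the one before it.
check_fomd() {
    awk '
        $NF == "fomd" || $NF ~ /\/fomd$/ {
            pid[$2] = 1          # remember every fomd PID
            ppid[$2] = $3        # its parent PID
            time[$2] = $(NF-1)   # its accumulated CPU time
        }
        END {
            for (p in pid)
                if ((ppid[p] in pid) && time[p] == "0:00") {
                    print "fomd child pid " p " shows no run time (0:00)"
                    found = 1
                }
            # Exit status 1 when a suspect child was found, 0 otherwise.
            exit(found ? 1 : 0)
        }
    '
}

# Usage on the SC: ps -ef | check_fomd
```

The function reads from standard input so that it can also be run against saved `ps -ef` output.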
Workaround
There is no workaround. Please see the "Resolution" section below.
Resolution
This issue is addressed in the following releases:
- Sun Fire 12K/15K with System Management Software (SMS) 1.2 with patch 112481-10 or later
Note: SMS 1.1 requires an upgrade to a later release.
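One way to confirm the patch level on an SC running SMS 1.2 is to inspect the output of `showrev -p`. The sketch below assumes the patch was installed with patchadd and therefore appears there as a line beginning "Patch: 112481-NN"; the function name patch_ok is hypothetical. On Solaris, nawk may be needed in place of awk.

```shell
#!/bin/sh
# Sketch: read `showrev -p` output on stdin and succeed (exit status 0)
# only if patch 112481 is listed at revision 10 or later.
# Assumption: each patch line starts with "Patch: <id>-<rev>".
patch_ok() {
    awk '
        $1 == "Patch:" {
            split($2, a, "-")    # a[1] = patch id, a[2] = revision
            if (a[1] == "112481" && a[2] + 0 >= 10)
                ok = 1
        }
        END { exit(ok ? 0 : 1) }
    '
}

# Usage on the SC:
#   showrev -p | patch_ok && echo "patch 112481-10 or later is installed"
```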