Under Certain Conditions, Solaris 10 Patch 118822-11 Through 118822-23 May Cause a System Panic on Some Servers

Category: Availability
Release Phase: Resolved
Product: Sun Fire 12K Server, Sun Fire E20K Server, Sun Fire 15K Server, Solaris 10 Operating System, Sun Fire E25K Server
Bug Id: 6342112
Date of Workaround Release: 08-NOV-2005
Date of Resolved Release: 23-DEC-2005
Impact
Under certain conditions, Sun Fire 12K, 15K, E20K and E25K systems running Solaris 10 with patch 118822-11 through 118822-23 installed, and with VxVM volumes in use, may enter a panic/reboot loop. If VxVM is not in use, the system may still panic (though not in a panic/reboot loop) when certain ufs(7FS) code paths are exercised.
Contributing Factors
This issue can occur in the following release:
SPARC Platform
- Solaris 10 with patch 118822-11 through 118822-23
on the following platforms:
- Sun Fire 12K, 15K, E20K and E25K Servers
Notes:
- Solaris 8 and 9 are not affected by this issue. Solaris on any system other than a Sun Fire 12K, 15K, E20K or E25K is not affected by this issue.
- This issue is known to affect systems with Veritas Volume Manager (VxVM) 4.1 installed, and can affect VxVM volumes.
- This issue is also known to occur with certain code paths in the ufs(7FS) layer.
- On Sun Fire 12K/15K/E20K/E25K systems, this issue occurs only when the number of CPUs on the system is greater than 32.
This issue can occur when either UFS (in certain code paths) or VxVM sends a buffer containing a page with a certain <vnode, offset> identity to the upper layers, locks the page, remaps it into the kernel, and then attempts to remap the same page again.
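The contributing factors above can be combined into a simple exposure check. The following is a minimal Python sketch, not a Sun-supplied tool. The function name `is_exposed`, its parameters, and the platform labels are hypothetical, and it assumes the system runs Solaris 10 and that the caller has already gathered the patch revision and CPU count by other means.

```python
from typing import Optional

# Values taken from this alert; the names themselves are illustrative.
AFFECTED_PLATFORMS = {"12K", "15K", "E20K", "E25K"}  # Sun Fire models listed above
FIRST_BAD_REV, LAST_BAD_REV = 11, 23                 # patch 118822-11 through 118822-23
CPU_THRESHOLD = 32                                   # issue requires more than 32 CPUs


def is_exposed(platform: str, patch_rev: Optional[int], cpu_count: int) -> bool:
    """Return True if every contributing factor in the alert is met.

    platform  -- Sun Fire model label such as "E25K" (label format is an assumption)
    patch_rev -- installed revision of patch 118822, or None if not installed
    cpu_count -- number of CPUs on the system
    """
    if platform not in AFFECTED_PLATFORMS:
        return False
    if patch_rev is None:
        return False
    if not FIRST_BAD_REV <= patch_rev <= LAST_BAD_REV:
        return False
    return cpu_count > CPU_THRESHOLD
```

For example, `is_exposed("E25K", 15, 64)` evaluates to `True`, while the same system with exactly 32 CPUs, or with revision 24 installed, evaluates to `False`. The sketch does not distinguish between the VxVM and UFS cases, since the alert states that either can trigger the panic.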
Symptoms
If VxVM volumes are mounted toward the end of the boot cycle, the system may panic and experience a panic/reboot loop sequence.
Note: In non-VxVM cases, certain ufs(7FS) code paths can cause this issue, but the panic will not occur during boot.
This issue may generate a panic message similar to the following:
panic[cpu387]/thread=30020e04ca0:
BAD TRAP: type=31 rp=2a101ff0040 addr=49 mmu_fsr=0
occurred in module "unix" due to a NULL pointer dereference
and a stack trace similar to the following (this is one example only):
hat_add_callback+0x34c(3009b6f0ea0, 30041f3cc48, 0, 1, 187e228, 700a4489f80)
pci_dma_type+0x128(187e000, 2a106172fb8, 300b6ec70c0, 1, 0, 2a106172fd8)
pci_dma_bindhdl+0x54(3000, 3001555d460, 300b6ec70c0, 2a106172fb8, 2a1061730c8, 2a1061730c4)
ddi_dma_buf_bind_handle+0x118(300b6ec70c0, 12855b0, 2080101, 1, 0, 2a1061730c8)
ssfcp_prepare_pkt+0x234(30015b83800, 300b7b37228, 30015b79b38, 0, 0, 30015906c98)
ssfcp_scsi_start+0x1b8(30015b83800, 300b7b37228, 0, 1e886, 30015906cb0, 0)
ssd_start_cmds+0x4b8(30015ca3ac0, 0, 30015c8cce0, 704fcda0, 11f3ce0, 11f3d60)
ssd_core_iostart+0x230(11f3d60, 30015ca3ac0, 30015c8cce0, 11f3ca0, 11f3e20, 11f3c00)
ssd_xbuf_iostart+0x134(300237ad680, 0, 3002350bc40, 0, 300237ad6b0, 300b6ebfb00)
lufs_write_strategy+0x11c(0, 300b6ebfb00, 0, 30034b6e680, 1911c00, 2cfa000)
ufs_fbiwrite+0x138(30099219a28, 300382aa2e0, b3e8, 300b6ebfb00, 2a75c8ce000, 167d0)
bmap_write+0x37c(8000, 2000, 1f7c, d, 1, 300382aa2e0)
wrip+0x440(1f60, 2a106173a98, ffffffffff, 1c, 300382aa2e0, 300382aa3c0)
ufs_write+0x4d0(30038e68880, 2a106173a98, 40, 30036680208, 0, 1)
write+0x268(5, 8058, 3009822c970, 1c, 2143, 1)
syscall_trap32+0xcc(5, ffbfece0, 1c, 0, 0, 0)
Workaround
There is no workaround for this issue. Please see the Resolution section below.
Resolution
This issue is addressed in the following release:
SPARC Platform
- Solaris 10 with patch 118822-24 or later
for the following platforms:
- Sun Fire 12K, 15K, E20K and E25K Servers
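Whether a system already carries the fix can be judged from the installed revision of patch 118822. The following is a minimal Python sketch, assuming the installed-patch listing has been captured as text beforehand (for instance from `showrev -p`). The function names are hypothetical, and the sketch assumes that the highest revision found in the listing is the one in effect.

```python
import re
from typing import Optional

PATCH_BASE = "118822"
FIXED_REV = 24  # per this alert: resolved in 118822-24 or later


def highest_revision(listing: str, base: str = PATCH_BASE) -> Optional[int]:
    """Return the highest revision of patch `base` found in the text, or None."""
    revs = [int(m) for m in re.findall(rf"\b{base}-(\d+)\b", listing)]
    return max(revs) if revs else None


def has_fix(listing: str) -> bool:
    """True if the listing shows patch 118822 at revision 24 or later."""
    rev = highest_revision(listing)
    return rev is not None and rev >= FIXED_REV
```

Note that `has_fix` returns `False` both when an affected revision is installed and when patch 118822 is absent altogether; call `highest_revision` directly to tell the two cases apart.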
Modification History
23-Nov-2005:
- Updated Contributing Factors and Relief/Workaround sections
23-Dec-2005:
- Updated Contributing Factors and Resolution sections; re-released as Resolved
18-Jan-2006:
- Updated Impact, Contributing Factors, and Relief/Workaround sections
Attachments
This solution has no attachments.