Solaris 10 Kernel Patches 141444-09 and 141445-09 May Cause Interface Failure in IP Multipathing (IPMP)

Category: Availability
Release Phase: Workaround
Bug Id: 6888928
Product: Solaris 10 Operating System
Date of Workaround Release: 03-Nov-2009

Solaris 10 Kernel Patches 141444-09 and 141445-09 cause interface failure in IP Multipathing:
1. Impact
The IP Multipathing (IPMP) facility allows systems with multiple
network interfaces in the same subnet to fail over when one of the
interfaces fails, and to fail back when that interface recovers.
Solaris 10 Kernel Patches 141444-09 (SPARC) and 141445-09 (x86) can
cause interface failure in IPMP when it is configured for probe-based
failure detection. This issue does not occur with an IPMP link-based
failure detection configuration.
2. Contributing Factors
This issue can occur in the following releases:
SPARC Platform: Solaris 10 with patch 141444-09
x86 Platform: Solaris 10 with patch 141445-09
Note 1: Solaris 8, Solaris 9, and OpenSolaris are not impacted by this
issue.
Note 2: This issue only occurs on systems with IPMP configured for
probe-based failure detection. It does not occur with an IPMP
link-based failure detection configuration.
To determine if IPMP is configured for probe-based failure detection,
verify that all of the following are true:
1. The "in.mpathd" daemon is running as shown by the following command:
# ps -aef |grep "in.mpathd"
root 211 1 0 11:04:51 ? 0:00 /usr/lib/inet/in.mpathd -a
2. The output from "ifconfig -a" must show "groupname" for the
interfaces in the IPMP group as shown below:
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
groupname fred
ether 0:3:ba:d8:d1:ef
e1000g1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
inet 192.178.100.2 netmask ffffff00 broadcast 192.178.100.255
e1000g2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 6
inet 192.178.100.5 netmask ffffff00 broadcast 192.178.100.255
groupname fred
ether 0:4:23:c8:33:86
e1000g2:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 6
inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255
In the example above, e1000g1 and e1000g2 are part of the IPMP group
called "fred".
3. The output of "ifconfig -a" must show "DEPRECATED" and
"NOFAILOVER" for test addresses, as this indicates that IPMP
probe-based failure detection is configured. This is shown in the
output above for the logical interfaces e1000g1:1 and e1000g2:1.
In the above example, 192.178.100.2 (address on e1000g1:1) and
192.178.100.6 (address on e1000g2:1) are test addresses.
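The three checks above can be sketched as a small script. This is a minimal Python sketch, not a supported tool: the function names (mpathd_running, parse_ifconfig, probe_based_groups) are hypothetical, it assumes the IPv4 "ifconfig -a" layout shown above, and it assumes that a logical interface such as e1000g1:1 belongs to the group of its physical interface.

```python
import re

# Stanza header, e.g. "e1000g1:1: flags=9040843<UP,...,NOFAILOVER> mtu 1500 index 5"
HEADER = re.compile(r"^(\S+): flags=[0-9a-fA-F]+<([^>]*)>")


def mpathd_running(ps_output):
    """Check 1: look for the daemon path, which ignores the 'grep' process itself."""
    return any("/usr/lib/inet/in.mpathd" in line for line in ps_output.splitlines())


def parse_ifconfig(text):
    """Return {interface: {"flags": set, "group": str or None, "inet": str or None}}."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = {"flags": set(header.group(2).split(",")), "group": None, "inet": None}
            interfaces[header.group(1)] = current
            continue
        if current is None:
            continue
        words = line.split()
        # Continuation lines hold keyword/value pairs ("inet <addr>", "groupname <name>").
        for key, value in zip(words, words[1:]):
            if key == "groupname":
                current["group"] = value
            elif key == "inet":
                current["inet"] = value
    return interfaces


def probe_based_groups(text):
    """Checks 2 and 3: map each IPMP group to {test address: physical interface}."""
    interfaces = parse_ifconfig(text)
    groups = {info["group"]: {} for info in interfaces.values() if info["group"]}
    for name, info in interfaces.items():
        physical = name.split(":")[0]
        group = interfaces.get(physical, {}).get("group")
        # A test address carries both DEPRECATED and NOFAILOVER.
        if group and {"DEPRECATED", "NOFAILOVER"} <= info["flags"]:
            groups[group][info["inet"]] = physical
    return groups


SAMPLE = """\
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
        groupname fred
        ether 0:3:ba:d8:d1:ef
e1000g1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
        inet 192.178.100.2 netmask ffffff00 broadcast 192.178.100.255
"""

if __name__ == "__main__":
    print(probe_based_groups(SAMPLE))
```

A group whose entry is non-empty has at least one test address and is therefore using probe-based failure detection. A group that maps to an empty dictionary has no test addresses in the parsed output.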
3. Symptoms
If the described issue occurs, an interface in the IPMP group fails
even though no network problems have been experienced.
The following messages are printed on the console (or
/var/adm/messages):
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: NIC failure detected on e1000g2 of group fred
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: Successfully failed over from NIC e1000g2 to NIC e1000g1
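To look for these messages in a saved copy of /var/adm/messages, a short parser along the following lines may help. It is only a sketch: the function name mpathd_events is hypothetical, and the patterns assume the exact message wording shown above.

```python
import re

# Patterns follow the two in.mpathd messages quoted in this alert.
NIC_FAILURE = re.compile(
    r"in\.mpathd\[\d+\]: NIC failure detected on (\S+) of group (\S+)")
FAILOVER = re.compile(
    r"in\.mpathd\[\d+\]: Successfully failed over from NIC (\S+) to NIC (\S+)")


def mpathd_events(log_text):
    """Return a list of ("failure", nic, group) and ("failover", from_nic, to_nic)."""
    events = []
    for line in log_text.splitlines():
        match = NIC_FAILURE.search(line)
        if match:
            events.append(("failure", match.group(1), match.group(2)))
            continue
        match = FAILOVER.search(line)
        if match:
            events.append(("failover", match.group(1), match.group(2)))
    return events


LOG = """\
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: NIC failure detected on e1000g2 of group fred
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: Successfully failed over from NIC e1000g2 to NIC e1000g1
"""

if __name__ == "__main__":
    for event in mpathd_events(LOG):
        print(event)
```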
The interface is also marked FAILED in the output of "ifconfig -a", as
shown below:
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
groupname fred
ether 0:3:ba:d8:d1:ef
e1000g1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
inet 192.178.100.2 netmask ffffff00 broadcast 192.178.100.255
e1000g1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 192.178.100.5 netmask ffffff00 broadcast 192.178.100.255
e1000g2: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 6
inet 0.0.0.0 netmask 0
groupname fred
ether 0:4:23:c8:33:86
e1000g2:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 6
inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255
In the example above, e1000g2 is marked as FAILED.
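Spotting the FAILED flag can also be scripted. The following is a minimal sketch under the assumption that the "ifconfig -a" output has the layout shown above; the function name failed_interfaces is hypothetical.

```python
import re

# Matches stanza headers and captures the interface name and its flag list.
HEADER = re.compile(r"^\s*(\S+): flags=[0-9a-fA-F]+<([^>]*)>", re.MULTILINE)


def failed_interfaces(ifconfig_text):
    """Return physical interfaces that have any stanza flagged FAILED."""
    failed = []
    for name, flags in HEADER.findall(ifconfig_text):
        physical = name.split(":")[0]  # e1000g2:1 -> e1000g2
        if "FAILED" in flags.split(",") and physical not in failed:
            failed.append(physical)
    return failed


OUTPUT = """\
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
        groupname fred
e1000g2: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 6
        inet 0.0.0.0 netmask 0
        groupname fred
e1000g2:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 6
        inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255
"""

if __name__ == "__main__":
    print(failed_interfaces(OUTPUT))
```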
The interface failed because the ICMP probe replies for 192.178.100.6
(the test address on e1000g2) are received on e1000g1 instead of
e1000g2. Snooping on e1000g1 shows that the ICMP echo requests from
192.178.100.6 are sent on e1000g1, and that the ICMP echo replies to
192.178.100.6 are also received on e1000g1:
# snoop -d e1000g1 icmp
Using device e1000g1 (promiscuous mode)
192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 Sequence number: 1674)
192.178.100.15 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1674)
192.178.100.2 -> 192.178.100.15 ICMP Echo request (ID: 54021 Sequence number: 1680)
192.178.100.15 -> 192.178.100.2 ICMP Echo reply (ID: 54021 Sequence number: 1680)
192.178.100.6 -> 192.178.100.10 ICMP Echo request (ID: 54022 Sequence number: 1675)
192.178.100.10 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1675)
Snooping on e1000g2 (shown below) indicates that the FAILED e1000g2
interface is not sending or receiving any probe packets. However, the
e1000g2 interface should send probe packets even if the interface has
failed. This is another symptom of this issue.
# snoop -d e1000g2 icmp
Using device e1000g2 (promiscuous mode)
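The snoop comparison above can be automated by checking each captured probe against the interface that owns its test address. The sketch below is illustrative only: misdirected_probes is a hypothetical name, the mapping of test addresses to interfaces must be supplied by the caller, and the pattern assumes the snoop summary format shown above.

```python
import re

# e.g. "192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 ...)"
ICMP_LINE = re.compile(r"^\s*(\S+) -> (\S+)\s+ICMP Echo (request|reply)")


def misdirected_probes(snoop_text, snooped_if, test_addresses):
    """List probes seen on snooped_if whose test address belongs elsewhere.

    test_addresses maps each test address to the interface that owns it.
    """
    hits = []
    for line in snoop_text.splitlines():
        match = ICMP_LINE.match(line)
        if not match:
            continue  # skips the "Using device ..." banner and other traffic
        src, dst, kind = match.groups()
        # The test address is the source of a request and the target of a reply.
        probe_addr = src if kind == "request" else dst
        owner = test_addresses.get(probe_addr)
        if owner is not None and owner != snooped_if:
            hits.append((kind, probe_addr, owner))
    return hits


CAPTURE = """\
Using device e1000g1 (promiscuous mode)
192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 Sequence number: 1674)
192.178.100.15 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1674)
192.178.100.2 -> 192.178.100.15 ICMP Echo request (ID: 54021 Sequence number: 1680)
192.178.100.15 -> 192.178.100.2 ICMP Echo reply (ID: 54021 Sequence number: 1680)
"""
TEST_ADDRESSES = {"192.178.100.2": "e1000g1", "192.178.100.6": "e1000g2"}

if __name__ == "__main__":
    for hit in misdirected_probes(CAPTURE, "e1000g1", TEST_ADDRESSES):
        print(hit)
```

Any entry in the result means that a probe for another interface's test address appeared on the snooped interface, which matches the symptom described above. An empty capture on the FAILED interface, as in the e1000g2 output, yields an empty list and has to be noted separately.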
4. Workaround
Binary relief is available through normal support channels.
Note: Removing the offending patches to avoid this issue is not
advisable, as these patches contain security fixes.
5. Resolution
A final resolution is pending completion.
This Sun Alert notification is being provided to you on an "AS IS"
basis. This Sun Alert notification may contain information provided by
third parties. The issues described in this Sun Alert notification may
or may not impact your system(s). Sun makes no representations,
warranties, or guarantees as to the information contained herein. ANY
AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU
ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT
OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This
Sun Alert notification contains Sun proprietary and confidential
information. It is being provided to you pursuant to the provisions of
your agreement to purchase services from Sun, or, if you do not have
such an agreement, the Sun.com Terms of Use. This Sun Alert
notification may only be used for the purposes contemplated by these
agreements.
Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle,
Santa Clara, CA 95054 U.S.A. All rights reserved.