Solaris 10 Kernel Patches 141444-09 and 141445-09 May Cause Interface Failure in IP Multipathing (IPMP)



Category: Availability
Release Phase: Workaround
Bug Id: 6888928
Product: Solaris 10 Operating System
Date of Workaround Release: 03-Nov-2009

Solaris 10 Kernel Patches 141444-09 and 141445-09 may cause interface failure in IP Multipathing (IPMP).


1. Impact

The IP Multipathing (IPMP) facility allows a system with multiple network interfaces on the same subnet to fail over when one of the interfaces fails, and to fail back when that interface recovers.

Solaris 10 Kernel Patches 141444-09 (SPARC) and 141445-09 (x86) cause interface failure in IPMP when it is configured for probe-based failure detection. This issue does not occur with an IPMP link-based failure detection configuration.

2. Contributing Factors

This issue can occur in the following releases:

SPARC Platform:

  • Solaris 10 with patch 141444-09

x86 Platform:

  • Solaris 10 with patch 141445-09

Note 1: Solaris 8 and 9 and OpenSolaris are not impacted by this issue.

Note 2: This issue only occurs on systems with IPMP configured for probe-based failure detection. This issue does not occur with an IPMP link-based failure detection configuration.

To determine whether IPMP is configured for probe-based failure detection, check that all of the following are true:

1. The "in.mpathd" daemon is running, as shown by the following command:

    # ps -aef | grep "in.mpathd"
    root   211   1   0 11:04:51 ?   0:00 /usr/lib/inet/in.mpathd -a

2. The output from "ifconfig -a" must show "groupname" for the interfaces in the IPMP group, as shown below:

    # ifconfig -a
    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
            inet 127.0.0.1 netmask ff000000
    e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
            inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
            groupname fred
            ether 0:3:ba:d8:d1:ef
    e1000g1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
            inet 192.178.100.2 netmask ffffff00 broadcast 192.178.100.255
    e1000g2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 6
            inet 192.178.100.5 netmask ffffff00 broadcast 192.178.100.255
            groupname fred
            ether 0:4:23:c8:33:86
    e1000g2:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 6
            inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255

In the example above, e1000g1 and e1000g2 are part of an IPMP group called "fred".

3. The output of "ifconfig -a" must show "DEPRECATED" and "NOFAILOVER" for the test addresses, as this indicates that an IPMP probe-based failure detection configuration is being used. This is shown in the output above for both e1000g1 and e1000g2.

In the above example, 192.178.100.2 (address on e1000g1:1) and 192.178.100.6 (address on e1000g2:1) are test addresses.

3. Symptoms

If the described issue occurs, an interface in the IPMP group fails even though no network problem has occurred.

The following messages are printed on the console (or logged in /var/adm/messages):

    Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: NIC failure detected on e1000g2 of group fred
    Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: Successfully failed over from NIC e1000g2 to NIC e1000g1

The interface is also marked FAILED in the output of "ifconfig -a", as shown below:

    # ifconfig -a
    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
            inet 127.0.0.1 netmask ff000000
    e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
            inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
            groupname fred
            ether 0:3:ba:d8:d1:ef
    e1000g1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
            inet 192.178.100.2 netmask ffffff00 broadcast 192.178.100.255
    e1000g1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
            inet 192.178.100.5 netmask ffffff00 broadcast 192.178.100.255
    e1000g2: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 6
            inet 0.0.0.0 netmask 0
            groupname fred
            ether 0:4:23:c8:33:86
    e1000g2:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 6
            inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255

In the example above, e1000g2 is marked as FAILED.
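
Both symptoms above can be checked from saved output with a short Python sketch. The function names are hypothetical, and the regular expressions assume exactly the "in.mpathd" message wording and the "ifconfig -a" layout shown in this document; other message formats are not handled.

```python
import re

# Message formats as they appear in the example log lines above.
FAILURE = re.compile(
    r"in\.mpathd\[\d+\]: NIC failure detected on (\S+) of group (\S+)")
FAILOVER = re.compile(
    r"in\.mpathd\[\d+\]: Successfully failed over from NIC (\S+) to NIC (\S+)")
# Interface header; group 1 is the physical name without any ":N" suffix.
HEADER = re.compile(r"^\s*([^\s:]+)(?::\d+)?: flags=[0-9a-fA-F]+<([^>]*)>")


def failures_from_log(lines):
    """Return (nic, group) pairs that in.mpathd reported as failed."""
    return [m.groups() for m in map(FAILURE.search, lines) if m]


def failovers_from_log(lines):
    """Return (from_nic, to_nic) pairs for completed failovers."""
    return [m.groups() for m in map(FAILOVER.search, lines) if m]


def failed_from_ifconfig(text):
    """Return physical interfaces that carry the FAILED flag, in order."""
    failed = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m and "FAILED" in m.group(2).split(",") and m.group(1) not in failed:
            failed.append(m.group(1))
    return failed


LOG = """\
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: NIC failure detected on e1000g2 of group fred
Oct 22 11:09:29 v4v-t2000a-sca11 in.mpathd[211]: Successfully failed over from NIC e1000g2 to NIC e1000g1
"""

IFCONFIG = """\
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 192.178.100.1 netmask ffffff00 broadcast 192.178.100.255
        groupname fred
e1000g2: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 6
        inet 0.0.0.0 netmask 0
        groupname fred
e1000g2:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 6
        inet 192.178.100.6 netmask ffffff00 broadcast 192.178.100.255
"""
```

With the sample data, failures_from_log(LOG.splitlines()) reports e1000g2 in group fred, and failed_from_ifconfig(IFCONFIG) lists e1000g2 once even though both e1000g2 and e1000g2:1 carry the flag. Neither check can tell this bug apart from a real network fault, so the result should be read together with the snoop output.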

The interface failed because the ICMP probe replies for 192.178.100.6 are received on e1000g1 instead of e1000g2.

Snooping on e1000g1 reveals that the ICMP echo requests from 192.178.100.6 are sent on e1000g1, and that the ICMP echo replies for 192.178.100.6 are also received on e1000g1:

    # snoop -d e1000g1 icmp
    Using device e1000g1 (promiscuous mode)
    192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 Sequence number: 1674)
    192.178.100.15 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1674)
    192.178.100.2 -> 192.178.100.15 ICMP Echo request (ID: 54021 Sequence number: 1680)
    192.178.100.15 -> 192.178.100.2 ICMP Echo reply (ID: 54021 Sequence number: 1680)
    192.178.100.6 -> 192.178.100.10 ICMP Echo request (ID: 54022 Sequence number: 1675)
    192.178.100.10 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1675)

Snooping on e1000g2 (shown below) indicates that the FAILED e1000g2 interface is not sending or receiving any probe packets. However, the e1000g2 interface should continue to send probe packets even after it has failed. This is another symptom of this issue.

    # snoop -d e1000g2 icmp
    Using device e1000g2 (promiscuous mode)
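
The comparison made by eye in the snoop output above can be sketched in Python as follows. The function name misdirected_probes and the test_addresses mapping are hypothetical; the mapping from test address to owning interface is taken from the "ifconfig -a" example in this document, and the line format is assumed to match the "snoop -d <device> icmp" output shown above.

```python
import re

# One snoop summary line, e.g.
# "192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 ...)"
PACKET = re.compile(r"^\s*(\S+) -> (\S+)\s+ICMP Echo (request|reply)")


def misdirected_probes(capture_device, snoop_lines, test_addresses):
    """Return probe packets seen on capture_device whose test address
    belongs to a different interface.

    test_addresses maps each test IP to the interface that owns it.
    """
    hits = []
    for line in snoop_lines:
        m = PACKET.match(line)
        if not m:
            continue  # skips the "Using device ..." banner and other lines
        src, dst, kind = m.groups()
        # The local test address is the source of a request
        # and the destination of a reply.
        local = src if kind == "request" else dst
        owner = test_addresses.get(local)
        if owner is not None and owner != capture_device:
            hits.append((kind, local, owner))
    return hits


# Test addresses and owners, taken from the ifconfig example in this document.
TEST_ADDRESSES = {"192.178.100.2": "e1000g1", "192.178.100.6": "e1000g2"}

CAPTURE = """\
Using device e1000g1 (promiscuous mode)
192.178.100.6 -> 192.178.100.15 ICMP Echo request (ID: 54022 Sequence number: 1674)
192.178.100.15 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1674)
192.178.100.2 -> 192.178.100.15 ICMP Echo request (ID: 54021 Sequence number: 1680)
192.178.100.15 -> 192.178.100.2 ICMP Echo reply (ID: 54021 Sequence number: 1680)
192.178.100.6 -> 192.178.100.10 ICMP Echo request (ID: 54022 Sequence number: 1675)
192.178.100.10 -> 192.178.100.6 ICMP Echo reply (ID: 54022 Sequence number: 1675)
"""

HITS = misdirected_probes("e1000g1", CAPTURE.splitlines(), TEST_ADDRESSES)
```

For the capture on e1000g1, HITS contains the four packets that involve 192.178.100.6, the test address owned by e1000g2, while the probes for 192.178.100.2 are not flagged because that address belongs to e1000g1. A non-empty result for one interface, together with an empty capture on the FAILED interface, matches the symptoms described in this section.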

4. Workaround

Binary relief is available through normal support channels.

Note: Removing the offending patches to avoid this issue is not advisable as these patches contain security fixes.

5. Resolution

A final resolution is pending completion.

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.





Article Details
Article ID: 271519
Article Type: Sun Alert
Last reviewed: 2009-11-03
Audience: PUBLIC