In the line of duty, we were force-fed a Sun X4150 server running Solaris 10. We don’t have any Solaris expertise in-house ( we’re all Linux dudes ;) ), so it was quite the challenge to fix a network issue we were having. The server had 4 NICs, which should be paired up to create 2 bonded interfaces, each with an active and a passive NIC ( bonding mode 1 in Linux ).

Now, in Linux, this would be a piece of cake. In Solaris, it was a bit more challenging. :)

First of all, a third party installed the server. The NICs came divided over 2 interfaces, aggr1 and aggr2. At first I assumed that bonding was simply called “aggregation” under Solaris, just as it’s called “teaming” under Windows. Since we were experiencing quite a bit of packet loss on those links, we investigated some more. We wanted redundant switches, so LACP ( Link Aggregation Control Protocol ) wasn’t an option, since it requires all links of an aggregation to end up on a single switch.

Our guess was that the Solaris box wasn’t configured in active-passive mode, but we lacked the Solaris knowledge to find out exactly what to fix. Booting the Sun server with a CentOS live CD and configuring bonding the “Linux Way” worked flawlessly, and the packet loss disappeared.

Turning our attention back to the Solaris machine ( with the help of an external support guy ), we configured the Solaris box with IPMP ( IP Network Multipathing ). The inner workings are a bit different from good old mode 1 bonding, but the result was the same.

We had 4 interfaces, which needed to be grouped in pairs of 2. So e1000g0 and e1000g1 would combine under bond0 and e1000g2 and e1000g3 would be grouped under bond1.

We created 4 files:

/etc/hostname.e1000g0 contained:

group bond0 up

node1

/etc/hostname.e1000g1 contained:

group bond0 up

This added the 2 devices to the bond0 group, and configured the first ( primary ) device with the IP address of node1 ( which is defined in our /etc/hosts file ).

/etc/hostname.e1000g2 contained:

group bond1 up

node2

/etc/hostname.e1000g3 contained:

group bond1 up

These last two files do the same for the other 2 interfaces, but assign another IP address ( the address of node2, defined in our /etc/hosts file as well ) and another group name.

This makes sure that when we reboot the server, the machine comes up in a happily bonded state, without packet loss! Hooray!
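For the record, here is a minimal sketch of how the four files could be generated from a script instead of typed by hand. It simply reproduces the file contents listed above. DEST and write_hostname_file are names made up for this sketch, and the files land in a staging directory rather than /etc, so nothing on a live system is touched until you have reviewed and copied them.

```shell
#!/bin/sh
# Sketch only: writes the four hostname files into a staging directory.
# DEST is a hypothetical location; copy the files to /etc after review.
DEST="${DEST:-./ipmp-staging}"
mkdir -p "$DEST"

# write_hostname_file IFACE GROUP [HOSTNAME]
# HOSTNAME is only given for the primary interface of a group and must
# resolve through /etc/hosts ( node1 and node2 in our case ).
write_hostname_file() {
    iface=$1
    group=$2
    host=${3:-}
    {
        echo "group $group up"
        if [ -n "$host" ]; then
            echo "$host"
        fi
    } > "$DEST/hostname.$iface"
}

write_hostname_file e1000g0 bond0 node1
write_hostname_file e1000g1 bond0
write_hostname_file e1000g2 bond1 node2
write_hostname_file e1000g3 bond1
```

Running it leaves hostname.e1000g0 through hostname.e1000g3 in the staging directory, ready to be compared against the files already in /etc.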

An “ifconfig -a” showed us the following:

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000


e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.0.1.10 netmask ffffff00 broadcast 10.0.1.255
groupname bond0
ether 0:1b:23:a3:42:30


e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname bond0
ether 0:1b:23:a3:42:31


e1000g2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 10.0.2.10 netmask ffffff00 broadcast 10.0.2.255
groupname bond1
ether 0:1b:23:a3:42:32


e1000g3: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname bond1
ether 0:1b:23:a3:42:33

When we pull the cable out of the active NIC of a group, the system moves the IP address to the secondary NIC in that group, which then stays the active member until the link of the primary NIC is back up.
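To keep an eye on which NIC is carrying the address, something like the following could help. It is a rough sketch, not something we ran in production: ipmp_summary is a name invented here, and it only understands the “ifconfig -a” layout shown above ( an interface line, then inet, then groupname ). We did not capture the output after a failover in this post, so treat it as a starting point.

```shell
#!/bin/sh
# ipmp_summary: read "ifconfig -a" output on stdin and print one line per
# grouped interface, in the form "<group> <interface> <address>".
ipmp_summary() {
    awk '
        # Interface header, e.g. "e1000g0: flags=1000843<...> mtu 1500 index 2"
        /^[a-z0-9:]+: flags=/ {
            iface = substr($1, 1, length($1) - 1)   # drop the trailing colon
            addr = ""
        }
        # Remember the IPv4 address of the current interface
        $1 == "inet"      { addr = $2 }
        # The groupname line follows inet, so everything is known by now
        $1 == "groupname" { print $2, iface, addr }
    '
}
```

Used as “ifconfig -a | ipmp_summary”, the listing above should come out as four lines: bond0 e1000g0 10.0.1.10, bond0 e1000g1 0.0.0.0, bond1 e1000g2 10.0.2.10 and bond1 e1000g3 0.0.0.0. Lines with 0.0.0.0 are the members that are not currently carrying an address.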