2 Linux Ethernet Bonding Driver mini-howto
4 Initial release : Thomas Davis <tadavis at lbl.gov>
5 Corrections, HA extensions : 2000/10/03-15 :
6 - Willy Tarreau <willy at meta-x.org>
7 - Constantine Gavrilov <const-g at xpert.com>
8 - Chad N. Tindel <ctindel at ieee dot org>
9 - Janice Girouard <girouard at us dot ibm dot com>
10 - Jay Vosburgh <fubar at us dot ibm dot com>
14 The bonding driver originally came from Donald Becker's beowulf patches for
15 kernel 2.0. It has changed quite a bit since, and the original tools from
16 extreme-linux and beowulf sites will not work with this version of the driver.
18 For new versions of the driver, patches for older kernels and the updated
19 userspace tools, please follow the links at the end of this file.
28 Configuring Multiple Bonds
30 Verifying Bond Configuration
31 Frequently Asked Questions
33 Promiscuous Sniffing notes
42 1) Build kernel with the bonding driver
43 ---------------------------------------
44 For the latest version of the bonding driver, use kernel 2.4.12 or above
45 (otherwise you will need to apply a patch).
Configure the kernel with `make menuconfig/xconfig/config', and select "Bonding
driver support" in the "Network device support" section. It is recommended
to configure the driver as a module, since that is currently the only way to
pass parameters to the driver and configure more than one bonding device.
52 Build and install the new kernel and modules.
54 2) Get and install the userspace tools
55 --------------------------------------
This version of the bonding driver requires an updated ifenslave program. The
original one from extreme-linux and beowulf will not work. Kernels 2.4.12
and above include the updated version of ifenslave.c in the
Documentation/networking directory. For older kernels, please follow the links
at the end of this file.
IMPORTANT!!! If you are running Red Hat 7.1 or greater, you need
to be careful because /usr/include/linux is no longer a symbolic link
to /usr/src/linux/include/linux. If you build ifenslave while this is
true, ifenslave will appear to succeed but your bond won't work. The purpose
of the -I option on the ifenslave compile line is to make sure it uses
/usr/src/linux/include/linux/if_bonding.h instead of the version from
69 To install ifenslave.c, do:
70 # gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
71 # cp ifenslave /sbin/ifenslave
77 You will need to add at least the following line to /etc/modules.conf
78 so the bonding driver will automatically load when the bond0 interface is
79 configured. Refer to the modules.conf manual page for specific modules.conf
80 syntax details. The Module Parameters section of this document describes each
81 bonding driver parameter.
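As a minimal sketch, the modules.conf entries usually take the following form.
The alias line maps the interface name bond0 to the bonding module. The options
line is optional here, and miimon=100 is only the starting value suggested
elsewhere in this document, not a requirement.

```text
alias bond0 bonding
options bond0 miimon=100
```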
85 Use standard distribution techniques to define the bond0 network interface. For
86 example, on modern Red Hat distributions, create an ifcfg-bond0 file in
87 the /etc/sysconfig/network-scripts directory that resembles the following:
93 BROADCAST=192.168.1.255
98 (use appropriate values for your network above)
100 All interfaces that are part of a bond should have SLAVE and MASTER
101 definitions. For example, in the case of Red Hat, if you wish to make eth0 and
102 eth1 a part of the bonding interface bond0, their config files (ifcfg-eth0 and
103 ifcfg-eth1) should resemble the following:
112 Use DEVICE=eth1 in the ifcfg-eth1 config file. If you configure a second
113 bonding interface (bond1), use MASTER=bond1 in the config file to make the
114 network interface be a slave of bond1.
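A slave configuration file of this kind might look like the sketch below. It
assumes the usual Red Hat ifcfg variables (USERCTL, ONBOOT, BOOTPROTO); check
the initscripts documentation of your release before relying on them.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (illustrative values only)
DEVICE=eth0
USERCTL=no
ONBOOT=yes
# MASTER names the bonding interface this device is enslaved to.
MASTER=bond0
SLAVE=yes
# The slave carries no address of its own; the bond holds the IP settings.
BOOTPROTO=none
```

For ifcfg-eth1 only the DEVICE line changes.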
116 Restart the networking subsystem or just bring up the bonding device if your
117 administration tools allow it. Otherwise, reboot. On Red Hat distros you can
118 issue `ifup bond0' or `/etc/rc.d/init.d/network restart'.
120 If the administration tools of your distribution do not support
121 master/slave notation in configuring network interfaces, you will need to
122 manually configure the bonding device with the following commands:
124 # /sbin/ifconfig bond0 192.168.1.1 netmask 255.255.255.0 \
125 broadcast 192.168.1.255 up
127 # /sbin/ifenslave bond0 eth0
128 # /sbin/ifenslave bond0 eth1
130 (use appropriate values for your network above)
132 You can then create a script containing these commands and place it in the
133 appropriate rc directory.
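One possible shape for such a script is sketched below. The addresses and
slave names are the example values used above. DRY_RUN is a hypothetical
switch introduced for this sketch (it is not an ifconfig or ifenslave
feature); when set to 1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch of an rc script that configures bond0 by hand.
# All values below are examples; adjust them for your network.
BOND=bond0
ADDR=192.168.1.1
MASK=255.255.255.0
BCAST=192.168.1.255
SLAVES="eth0 eth1"

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# bond_up: bring up the bond first, then enslave each device in order.
bond_up() {
    run /sbin/ifconfig "$BOND" "$ADDR" netmask "$MASK" broadcast "$BCAST" up || return 1
    for slave in $SLAVES; do
        run /sbin/ifenslave "$BOND" "$slave" || return 1
    done
}
```

Call bond_up at the end of the script. Running it once with DRY_RUN=1 is a
cheap way to check the command lines before they touch the interfaces.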
If you specifically need all network drivers loaded before the bonding driver,
adding the following line to modules.conf will cause the network drivers for
eth0 and eth1 to be loaded before the bonding driver.
139 probeall bond0 eth0 eth1 bonding
141 Be careful not to reference bond0 itself at the end of the line, or modprobe
142 will die in an endless recursive loop.
If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the interface
index (ipAdEntIfIndex) being associated with the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below:
the IP address 192.168.1.1 has an interface index of 2, which indexes to eth0
in the ifDescr table (ifDescr.2).
154 interfaces.ifTable.ifEntry.ifDescr.1 = lo
155 interfaces.ifTable.ifEntry.ifDescr.2 = eth0
156 interfaces.ifTable.ifEntry.ifDescr.3 = eth1
157 interfaces.ifTable.ifEntry.ifDescr.4 = eth2
158 interfaces.ifTable.ifEntry.ifDescr.5 = eth3
159 interfaces.ifTable.ifEntry.ifDescr.6 = bond0
160 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
161 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
162 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
163 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
165 This problem is avoided by loading the bonding driver before any network
166 drivers participating in a bond. Below is an example of loading the bonding
167 driver first, the IP address 192.168.1.1 is correctly associated with
170 interfaces.ifTable.ifEntry.ifDescr.1 = lo
171 interfaces.ifTable.ifEntry.ifDescr.2 = bond0
172 interfaces.ifTable.ifEntry.ifDescr.3 = eth0
173 interfaces.ifTable.ifEntry.ifDescr.4 = eth1
174 interfaces.ifTable.ifEntry.ifDescr.5 = eth2
175 interfaces.ifTable.ifEntry.ifDescr.6 = eth3
176 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
177 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
178 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
179 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
181 While some distributions may not report the interface name in ifDescr,
182 the association between the IP address and IfIndex remains and SNMP
183 functions such as Interface_Scan_Next will report that association.
189 Optional parameters for the bonding driver can be supplied as command line
190 arguments to the insmod command. Typically, these parameters are specified in
191 the file /etc/modules.conf (see the manual page for modules.conf). The
192 available bonding driver parameters are listed below. If a parameter is not
193 specified the default value is used. When initially configuring a bond, it
194 is recommended "tail -f /var/log/messages" be run in a separate window to
195 watch for bonding driver error messages.
197 It is critical that either the miimon or arp_interval and arp_ip_target
198 parameters be specified, otherwise serious network degradation will occur
199 during link failures.
203 Specifies the ARP monitoring frequency in milli-seconds.
204 If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the
205 switch should be configured in a mode that evenly distributes packets
206 across all links - such as round-robin. If the switch is configured to
207 distribute the packets in an XOR fashion, all replies from the ARP
208 targets will be received on the same link which could cause the other
209 team members to fail. ARP monitoring should not be used in conjunction
210 with miimon. A value of 0 disables ARP monitoring. The default value
Specifies the IP addresses to use when arp_interval is > 0. These
are the targets of the ARP requests sent to determine the health of
the link to the targets. Specify these values in ddd.ddd.ddd.ddd
format. Multiple IP addresses must be separated by commas. At least
one IP address must be given for ARP monitoring to work. The
maximum number of targets that can be specified is 16.
224 Specifies the delay time in milli-seconds to disable a link after a
225 link failure has been detected. This should be a multiple of miimon
226 value, otherwise the value will be rounded. The default value is 0.
Option specifying the rate at which we'll ask our link partner to
transmit LACPDU packets in 802.3ad mode. Possible values are:
234 Request partner to transmit LACPDUs every 30 seconds (default)
237 Request partner to transmit LACPDUs every 1 second
241 Specifies the number of bonding devices to create for this
242 instance of the bonding driver. E.g., if max_bonds is 3, and
243 the bonding driver is not already loaded, then bond0, bond1
244 and bond2 will be created. The default value is 1.
248 Specifies the frequency in milli-seconds that MII link monitoring
249 will occur. A value of zero disables MII link monitoring. A value
250 of 100 is a good starting point. See High Availability section for
251 additional information. The default value is 0.
255 Specifies one of the bonding policies. The default is
256 round-robin (balance-rr). Possible values are (you can use
257 either the text or numeric option):
261 Round-robin policy: Transmit in a sequential order
262 from the first available slave through the last. This
263 mode provides load balancing and fault tolerance.
267 Active-backup policy: Only one slave in the bond is
268 active. A different slave becomes active if, and only
269 if, the active slave fails. The bond's MAC address is
270 externally visible on only one port (network adapter)
271 to avoid confusing the switch. This mode provides
XOR policy: Transmit based on [(source MAC address
XOR'd with destination MAC address) modulo slave
count]. This selects the same slave for each
destination MAC address. This mode provides load
balancing and fault tolerance.
284 Broadcast policy: transmits everything on all slave
285 interfaces. This mode provides fault tolerance.
289 IEEE 802.3ad Dynamic link aggregation. Creates aggregation
290 groups that share the same speed and duplex settings.
291 Transmits and receives on all slaves in the active
296 1. Ethtool support in the base drivers for retrieving the
297 speed and duplex of each slave.
299 2. A switch that supports IEEE 802.3ad Dynamic link
304 Adaptive transmit load balancing: channel bonding that does
305 not require any special switch support. The outgoing
306 traffic is distributed according to the current load
307 (computed relative to the speed) on each slave. Incoming
308 traffic is received by the current slave. If the receiving
309 slave fails, another slave takes over the MAC address of
310 the failed receiving slave.
314 Ethtool support in the base drivers for retrieving the
Adaptive load balancing: includes balance-tlb + receive
load balancing (rlb) for IPv4 traffic and does not require
any special switch support. The receive load balancing is
achieved by ARP negotiation. The bonding driver intercepts
the ARP Replies sent by the server on their way out and
overwrites the src hw address with the unique hw address of
one of the slaves in the bond such that different clients
use different hw addresses for the server.
Receive traffic from connections created by the server is
also balanced. When the server sends an ARP Request the
bonding driver copies and saves the client's IP information
from the ARP. When the ARP Reply arrives from the client,
its hw address is retrieved and the bonding driver
initiates an ARP reply to this client assigning it to one
of the slaves in the bond. A problematic outcome of using
ARP negotiation for balancing is that each time that an ARP
request is broadcast it uses the hw address of the
bond. Hence, clients learn the hw address of the bond and
the balancing of receive traffic collapses to the current
slave. This is handled by sending updates (ARP Replies) to
all the clients with their assigned hw address such that
the traffic is redistributed. Receive traffic is also
redistributed when a new slave is added to the bond and
when an inactive slave is re-activated. The receive load is
distributed sequentially (round robin) among the group of
highest speed slaves in the bond.
When a link is reconnected or a new slave joins the bond
the receive traffic is redistributed among all active
slaves in the bond by initiating ARP Replies with the
selected MAC address to each of the clients. The updelay
modprobe parameter must be set to a value equal to or greater
than the switch's forwarding delay so that the ARP Replies
sent to the clients will not be blocked by the switch.
357 1. Ethtool support in the base drivers for retrieving the
2. Base driver support for setting the hw address of a
device even while it is open. This is required so that there
will always be one slave in the team using the bond hw
address (the curr_active_slave) while having a unique hw
address for each slave in the bond. If the curr_active_slave
fails, its hw address is swapped with the new curr_active_slave
A string (eth0, eth2, etc) to equate to a primary device. If this
value is entered, and the device is on-line, it will be used first
as the output media. Only when this device is off-line will
alternate devices be used. Otherwise, once a failover is detected
and a new default output is chosen, it will remain the output media
until it too fails. This is useful when one slave is preferred
over another, e.g. when one slave is 1000Mbps and another is
100Mbps. If the 1000Mbps slave fails and is later restored, it may
be preferable for the faster slave to gracefully become the active
slave again, without deliberately failing the 100Mbps slave.
Specifying a primary is only valid in active-backup mode.
384 Specifies the delay time in milli-seconds to enable a link after a
385 link up status has been detected. This should be a multiple of miimon
386 value, otherwise the value will be rounded. The default value is 0.
390 Specifies whether or not miimon should use MII or ETHTOOL
391 ioctls vs. netif_carrier_ok() to determine the link status.
392 The MII or ETHTOOL ioctls are less efficient and utilize a
393 deprecated calling sequence within the kernel. The
394 netif_carrier_ok() relies on the device driver to maintain its
395 state with netif_carrier_on/off; at this writing, most, but
396 not all, device drivers support this facility.
398 If bonding insists that the link is up when it should not be,
399 it may be that your network device driver does not support
400 netif_carrier_on/off. This is because the default state for
401 netif_carrier is "carrier on." In this case, disabling
402 use_carrier will cause bonding to revert to the MII / ETHTOOL
403 ioctl method to determine the link state.
405 A value of 1 enables the use of netif_carrier_ok(), a value of
406 0 will use the deprecated MII / ETHTOOL ioctls. The default
410 Configuring Multiple Bonds
411 ==========================
413 If several bonding interfaces are required, either specify the max_bonds
414 parameter (described above), or load the driver multiple times. Using
415 the max_bonds parameter is less complicated, but has the limitation that
416 all bonding instances created will have the same options. Loading the
417 driver multiple times allows each instance of the driver to have differing
For example, to configure two bonding interfaces, one with MII link
monitoring performed every 100 milliseconds, and one with ARP link
monitoring performed every 200 milliseconds, the /etc/modules.conf should
resemble the following:
428 options bond0 miimon=100
429 options bond1 -o bonding1 arp_interval=200 arp_ip_target=10.0.0.1
431 Configuring Multiple ARP Targets
432 ================================
434 While ARP monitoring can be done with just one target, it can be useful
435 in a High Availability setup to have several targets to monitor. In the
436 case of just one target, the target itself may go down or have a problem
437 making it unresponsive to ARP requests. Having an additional target (or
438 several) increases the reliability of the ARP monitoring.
Multiple ARP targets must be separated by commas as follows:
442 # example options for ARP monitoring with three targets
444 options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9
446 For just a single target the options would resemble:
448 # example options for ARP monitoring with one target
450 options bond0 arp_interval=60 arp_ip_target=192.168.0.100
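Before putting a target list into modules.conf, it can be checked against the
rules stated in this document: comma-separated dotted-quad addresses, at least
one and at most 16. The helper below is a sketch only; check_arp_targets is a
name made up for this example and is not part of the bonding tools.

```shell
# check_arp_targets LIST
# Succeeds if LIST holds 1 to 16 comma-separated IPv4 addresses in
# ddd.ddd.ddd.ddd form, and fails otherwise.
check_arp_targets() {
    printf '%s\n' "$1" | awk -F, '
        NF < 1 || NF > 16 { exit 1 }
        {
            for (i = 1; i <= NF; i++) {
                # Each target must split into exactly four numeric octets.
                n = split($i, octet, ".")
                if (n != 4) exit 1
                for (j = 1; j <= 4; j++)
                    if (octet[j] !~ /^[0-9]+$/ || octet[j] > 255) exit 1
            }
        }'
}
```

For example, `check_arp_targets 192.168.0.1,192.168.0.3,192.168.0.9 && echo ok`
prints "ok", while a list separated by spaces is rejected.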
452 Potential Problems When Using ARP Monitor
453 =========================================
457 The ARP monitor relies on the network device driver to maintain two
458 statistics: the last receive time (dev->last_rx), and the last
459 transmit time (dev->trans_start). If the network device driver does
460 not update one or both of these, then the typical result will be that,
461 upon startup, all links in the bond will immediately be declared down,
462 and remain that way. A network monitoring tool (tcpdump, e.g.) will
463 show ARP requests and replies being sent and received on the bonding
466 The possible resolutions for this are to (a) fix the device driver, or
467 (b) discontinue the ARP monitor (using miimon as an alternative, for
470 2. Adventures in Routing
When bonding is set up with the ARP monitor, it is important that the
slave devices not have routes that supersede routes of the master (or,
generally, not have routes at all). For example, suppose the bonding
device bond0 has two slaves, eth0 and eth1, and the routing table is
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth0
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth1
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 bond0
127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo
485 In this case, the ARP monitor (and ARP itself) may become confused,
486 because ARP requests will be sent on one interface (bond0), but the
487 corresponding reply will arrive on a different interface (eth0). This
488 reply looks to ARP as an unsolicited ARP reply (because ARP matches
489 replies on an interface basis), and is discarded. This will likely
490 still update the receive/transmit times in the driver, but will lose
The resolution here is simply to ensure that slaves do not have routes
of their own, and if for some reason they must, those routes do not
supersede routes of their master. This should generally be the case,
but unusual configurations or errant manual or automatic static route
additions may cause trouble.
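A quick way to spot offending routes is to filter the routing table for
entries whose interface is a slave. The sketch below assumes the classic
`route -n` column layout, with the interface name in the last column;
slave_routes is a hypothetical helper name.

```shell
# slave_routes SLAVE...
# Reads `route -n` output on stdin and prints "iface: destination/genmask"
# for every route bound to one of the named slave devices.
slave_routes() {
    awk -v slaves="$*" '
        BEGIN {
            n = split(slaves, s, " ")
            for (i = 1; i <= n; i++) is_slave[s[i]] = 1
        }
        # Route lines have eight columns; the last one is the interface.
        NF >= 8 && ($NF in is_slave) { print $NF ": " $1 "/" $3 }'
}
```

Usage: `/sbin/route -n | slave_routes eth0 eth1`. No output means that no
route is bound directly to a slave.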
502 While the switch does not need to be configured when the active-backup,
503 balance-tlb or balance-alb policies (mode=1,5,6) are used, it does need to
504 be configured for the round-robin, XOR, broadcast, or 802.3ad policies
508 Verifying Bond Configuration
509 ============================
511 1) Bonding information files
512 ----------------------------
513 The bonding driver information files reside in the /proc/net/bonding directory.
Sample contents of /proc/net/bonding/bond0 after the driver is loaded with
parameters of mode=0 and miimon=1000 are shown below.
518 Bonding Mode: load balancing (round-robin)
519 Currently Active Slave: eth0
521 MII Polling Interval (ms): 1000
525 Slave Interface: eth1
527 Link Failure Count: 1
529 Slave Interface: eth0
531 Link Failure Count: 1
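The per-slave counters can be pulled out of this file with a few lines of awk.
This is a sketch that assumes the "Slave Interface:" and "Link Failure Count:"
labels shown in the sample; other driver versions may label the fields
differently. slave_failures is a hypothetical helper name.

```shell
# slave_failures [FILE]
# Prints "<slave> <link failure count>" for each slave listed in a bonding
# information file. Reads stdin when no FILE is given.
slave_failures() {
    awk -F': ' '
        $1 == "Slave Interface"    { slave = $2 }
        $1 == "Link Failure Count" { print slave, $2 }
    ' "$@"
}
```

Usage: `slave_failures /proc/net/bonding/bond0`.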
533 2) Network verification
534 -----------------------
The network configuration can be verified using the ifconfig command. In
the example below, the bond0 interface is the master (MASTER) while eth0 and
eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address
(HWaddr) as bond0 for all modes except TLB and ALB, which require a unique MAC
address for each slave.
[root]# /sbin/ifconfig
bond0     Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:0

eth0      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x1080

eth1      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:9 Base address:0x1400
566 Frequently Asked Questions
567 ==========================
571 Yes. The old 2.0.xx channel bonding patch was not SMP safe.
572 The new driver was designed to be SMP safe from the start.
574 2. What type of cards will work with it?
Any Ethernet type cards (you can even mix cards - an Intel
EtherExpress PRO/100 and a 3Com 3c905b, for example).
You can even bond together Gigabit Ethernet cards!
580 3. How many bonding devices can I have?
584 4. How many slaves can a bonding device have?
586 Limited by the number of network interfaces Linux supports and/or the
587 number of network cards you can place in your system.
589 5. What happens when a slave link dies?
591 If your ethernet cards support MII or ETHTOOL link status monitoring
592 and the MII monitoring has been enabled in the driver (see description
593 of module parameters), there will be no adverse consequences. This
594 release of the bonding driver knows how to get the MII information and
595 enables or disables its slaves according to their link status.
596 See section on High Availability for additional information.
598 For ethernet cards not supporting MII status, the arp_interval and
599 arp_ip_target parameters must be specified for bonding to work
600 correctly. If packets have not been sent or received during the
601 specified arp_interval duration, an ARP request is sent to the
602 targets to generate send and receive traffic. If after this
603 interval, either the successful send and/or receive count has not
604 incremented, the next slave in the sequence will become the active
If neither miimon nor arp_interval is configured, the bonding
driver will not handle this situation very well. The driver will
continue to send packets but some packets will be lost. Retransmits
will cause serious degradation of performance (in the case when one
of two slave links fails, 50% of packets will be lost, which is a serious
problem for both TCP and UDP).
614 6. Can bonding be used for High Availability?
616 Yes, if you use MII monitoring and ALL your cards support MII link
617 status reporting. See section on High Availability for more
620 7. Which switches/systems does it work with?
622 In round-robin and XOR mode, it works with systems that support
625 * Many Cisco switches and routers (look for EtherChannel support).
626 * SunTrunking software.
627 * Alteon AceDirector switches / WebOS (use Trunks).
628 * BayStack Switches (trunks must be explicitly configured). Stackable
629 models (450) can define trunks between ports on different physical
631 * Linux bonding, of course !
In 802.3ad mode, it works with systems that support IEEE 802.3ad
Dynamic Link Aggregation:
636 * Extreme networks Summit 7i (look for link-aggregation).
637 * Many Cisco switches and routers (look for LACP support; this may
638 require an upgrade to your IOS software; LACP support was added
639 by Cisco in late 2002).
640 * Foundry Big Iron 4000
642 In active-backup, balance-tlb and balance-alb modes, it should work
643 with any Layer-II switch.
646 8. Where does a bonding device get its MAC address from?
If not explicitly configured with ifconfig, the MAC address of the
bonding device is taken from its first slave device. This MAC address
is then passed to all following slaves and remains persistent (even if
the first slave is removed) until the bonding device is brought
down or reconfigured.
654 If you wish to change the MAC address, you can set it with ifconfig:
656 # ifconfig bond0 hw ether 00:11:22:33:44:55
The MAC address can also be changed by bringing down/up the device
and then changing its slaves (or their order):
661 # ifconfig bond0 down ; modprobe -r bonding
662 # ifconfig bond0 .... up
663 # ifenslave bond0 eth...
665 This method will automatically take the address from the next slave
668 To restore your slaves' MAC addresses, you need to detach them
669 from the bond (`ifenslave -d bond0 eth0'). The bonding driver will then
670 restore the MAC addresses that the slaves had before they were enslaved.
9. Which transmit policies can be used?
Round-robin: based on the order of enslaving, the output device
is selected based on the next available slave, regardless of
the source and/or destination of the packet.
678 Active-backup policy that ensures that one and only one device will
679 transmit at any given moment. Active-backup policy is useful for
680 implementing high availability solutions using two hubs (see
681 section on High Availability).
683 XOR, based on (src hw addr XOR dst hw addr) % slave count. This
684 policy selects the same slave for each destination hw address.
686 Broadcast policy transmits everything on all slave interfaces.
688 802.3ad, based on XOR but distributes traffic among all interfaces
689 in the active aggregator.
Transmit load balancing (balance-tlb) balances the traffic
according to the current load on each slave. The balancing is
client-based and the least loaded slave is selected for each new
client. The load of each slave is calculated relative to its speed,
which enables load balancing in mixed speed teams.
697 Adaptive load balancing (balance-alb) uses the Transmit load
698 balancing for the transmit load. The receive load is balanced only
699 among the group of highest speed active slaves in the bond. The
700 load is distributed with round-robin i.e. next available slave in
701 the high speed group of active slaves.
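The XOR rule above can be illustrated with shell arithmetic. The sketch below
applies the formula exactly as stated, over the full 48-bit addresses; an
actual driver may hash on fewer bits of the address, so treat the result as an
illustration of the rule rather than a prediction for a given kernel.
xor_slave is a hypothetical helper name.

```shell
# xor_slave SRC_MAC DST_MAC SLAVE_COUNT
# Prints the slave index chosen by (src hw addr XOR dst hw addr) % slave count.
# MAC addresses are given in aa:bb:cc:dd:ee:ff form.
xor_slave() {
    src=$(printf '%s' "$1" | tr -d ':')
    dst=$(printf '%s' "$2" | tr -d ':')
    count=$3
    # The colon-free strings are read as hexadecimal integers.
    echo $(( (0x$src ^ 0x$dst) % count ))
}
```

Because the source address is fixed for a given bond, the result depends only
on the destination address, which is why every frame to one destination
leaves through the same slave.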
706 To implement high availability using the bonding driver, the driver needs to be
707 compiled as a module, because currently it is the only way to pass parameters
708 to the driver. This may change in the future.
710 High availability is achieved by using MII or ETHTOOL status reporting. You
711 need to verify that all your interfaces support MII or ETHTOOL link status
712 reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and
713 yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting
714 is available for interface eth0, type "ethtool eth0" and the "Link detected:"
715 line should contain the correct link status. If your system has an interface
716 that does not support MII or ETHTOOL status reporting, a failure of its link
717 will not be detected! A message indicating MII and ETHTOOL is not supported by
718 a network driver is logged when the bonding driver is loaded with a non-zero
The bonding driver can regularly check all of its slaves' links using the
ETHTOOL IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers.
The check interval is specified by the module argument "miimon" (MII
monitoring). It takes an integer that represents the checking time in
milliseconds. It should not come too close to (1000/HZ) (10 milliseconds on
i386) because it may then reduce the system interactivity. A value of 100
seems to be a good starting point. It means that a dead link will be detected
at most 100 milliseconds after it goes down.
732 # modprobe bonding miimon=100
734 Or, put the following lines in /etc/modules.conf:
737 options bond0 miimon=100
739 There are currently two policies for high availability. They are dependent on
a) hosts are connected to a single host or switch that supports trunking
744 b) hosts are connected to several different switches or a single switch that
745 does not support trunking
748 1) High Availability on a single switch or host - load balancing
749 ----------------------------------------------------------------
This is the easiest configuration to set up and to understand. Simply
configure the remote equipment (host or switch) to aggregate traffic over
several ports (Trunk, EtherChannel, etc.) and configure the bonding interfaces.
If the module has been loaded with the proper MII option, it will work
automatically. You can then try to remove and restore different links
and see in your logs what the driver detects. When testing, you may
encounter problems on some buggy switches that disable the trunk for a
long time if all ports in a trunk go down. This is a fault of the switch,
not of Linux (reboot the switch to confirm).
760 Example 1 : host to host at twice the speed
762 +----------+ +----------+
764 | Host A +--------------------------+ Host B |
765 | +--------------------------+ |
767 +----------+ +----------+
770 # modprobe bonding miimon=100
771 # ifconfig bond0 addr
772 # ifenslave bond0 eth0 eth1
774 Example 2 : host to switch at twice the speed
776 +----------+ +----------+
778 | Host A +--------------------------+ switch |
779 | +--------------------------+ |
781 +----------+ +----------+
On host A :                         On the switch :
   # modprobe bonding miimon=100       # set up a trunk on port1
   # ifconfig bond0 addr               #   and port2
   # ifenslave bond0 eth0 eth1
789 2) High Availability on two or more switches (or a single switch without
791 ---------------------------------------------------------------------------
792 This mode is more problematic because it relies on the fact that there
793 are multiple ports and the host's MAC address should be visible on one
794 port only to avoid confusing the switches.
796 If you need to know which interface is the active one, and which ones are
797 backup, use ifconfig. All backup interfaces have the NOARP flag set.
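The NOARP check can be scripted. The sketch below assumes the classic ifconfig
layout shown in this document, where the interface name starts in the first
column and the flags appear on an indented line; newer ifconfig versions print
a different layout. backup_slaves is a hypothetical helper name.

```shell
# backup_slaves [FILE]
# Reads ifconfig output (stdin when no FILE is given) and prints the name
# of every interface whose flags include NOARP.
backup_slaves() {
    awk '
        /^[^ \t]/ { iface = $1 }     # an unindented line starts a new interface
        /NOARP/   { print iface }    # flag line of a backup interface
    ' "$@"
}
```

Usage: `/sbin/ifconfig | backup_slaves`.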
799 To use this mode, pass "mode=1" to the module at load time :
801 # modprobe bonding miimon=100 mode=active-backup
805 # modprobe bonding miimon=100 mode=1
807 Or, put in your /etc/modules.conf :
810 options bond0 miimon=100 mode=active-backup
Example 1: Using multiple hosts and multiple switches to build a "no single
point of failure" solution.
818 +-----+----+ +-----+----+
819 | |port7 ISL port7| |
820 | switch A +--------------------------+ switch B |
821 | +--------------------------+ |
823 +----++----+ +-----++---+
824 port2||port1 port1||port2
826 |+-------------+ host1 +---------------+|
827 | eth0 +-------+ eth1 |
830 +--------------+ host2 +----------------+
In this configuration, there is an ISL - Inter Switch Link (could be a trunk),
several servers (host1, host2 ...) each attached to both switches, and one or
more ports to the outside world (port3...). One and only one slave on each host
is active at a time, while all links are still monitored (the system can
detect a failure of active and backup links).
839 Each time a host changes its active interface, it sticks to the new one until
840 it goes down. In this example, the hosts are negligibly affected by the
841 expiration time of the switches' forwarding tables.
If host1 and host2 have the same functionality and are used in load balancing
by another external mechanism, it is best to connect host1's active interface
to one switch and host2's to the other. Such a system will survive a failure
of a single host, cable, or switch. The worst thing that may happen in the
case of a switch failure is that half of the hosts will be temporarily
unreachable until the other switch expires its tables.
Example 2: Using multiple Ethernet cards connected to a switch to configure
NIC failover (the switch is not required to support trunking).
    +----------+                          +----------+
    | Host A   +--------------------------+  switch  |
    |          +--------------------------+          |
    +----------+                          +----------+

  On host A :                                 On the switch :
     # modprobe bonding miimon=100 mode=1        # (optional) minimize the time
     # ifconfig bond0 addr                       # for table expiration
     # ifenslave bond0 eth0 eth1
866 Each time the host changes its active interface, it sticks to the new one until
867 it goes down. In this example, the host is strongly affected by the expiration
868 time of the switch forwarding table.
871 3) Adapting to your switches' timing
872 ------------------------------------
873 If your switches take a long time to go into backup mode, it may be
874 desirable not to activate a backup interface immediately after a link goes
875 down. It is possible to delay the moment at which a link will be
876 completely disabled by passing the module parameter "downdelay" (in
877 milliseconds, must be a multiple of miimon).
879 When a switch reboots, it is possible that its ports report "link up" status
880 before they become usable. This could fool a bond device by causing it to
881 use some ports that are not ready yet. It is possible to delay the moment at
882 which an active link will be reused by passing the module parameter "updelay"
883 (in milliseconds, must be a multiple of miimon).
885 A similar situation can occur when a host re-negotiates a lost link with the
886 switch (a case of cable replacement).
A special case is when a bonding interface has lost all slave links. Then the
driver will immediately reuse the first link that goes up, even if the updelay
parameter was specified. (If there are slave interfaces in the "updelay" state,
the interface that first went into that state will be immediately reused.) This
reduces downtime if the value of updelay has been overestimated.
896 # modprobe bonding miimon=100 mode=1 downdelay=2000 updelay=5000
897 # modprobe bonding miimon=100 mode=balance-rr downdelay=0 updelay=5000
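
Since downdelay and updelay must be multiples of miimon, a wrapper script
might round a desired delay to a valid value before loading the module. The
following is a minimal sketch; round_up_to_miimon is a hypothetical name, and
rounding up (rather than down) is a choice made here, not something the driver
requires.

```shell
# Hypothetical helper: round DELAY (ms) up to the nearest multiple of
# MIIMON (ms). Usage: round_up_to_miimon DELAY MIIMON
round_up_to_miimon() {
    delay=$1
    miimon=$2
    if [ "$miimon" -le 0 ]; then
        echo "miimon must be a positive number of milliseconds" >&2
        return 1
    fi
    # Integer arithmetic: add (miimon - 1) before dividing to round upwards.
    echo $(( (delay + miimon - 1) / miimon * miimon ))
}
```

With miimon=100, a desired delay of 2050 ms becomes 2100 ms, so a script could
run "modprobe bonding miimon=100 mode=1 downdelay=$(round_up_to_miimon 2050 100)".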
900 Promiscuous Sniffing notes
901 ==========================
If you wish to bond channels together for a network sniffing application
(for example, to run tcpdump, ethereal, or an IDS like snort with its input
aggregated from multiple interfaces using the bonding driver), then you need
to handle the promiscuous interface setting by hand. Specifically, when you
"ifconfig bond0 up" you must add the promisc flag there; it will be
propagated down to the slave interfaces at ifenslave time. A full example
might look like:
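   # append (>>) so that existing entries in modules.conf are preserved
   grep bond0 /etc/modules.conf || echo alias bond0 bonding >>/etc/modules.conf
   ifconfig bond0 promisc up
   for slave in eth1 eth2 ...;do
       ifconfig $slave up
       ifenslave bond0 $slave
   done
   snort ... -i bond0 ...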
Ifenslave also tries to propagate addresses from interface to interface, as
its design for HA and channel capacity aggregation requires. It nevertheless
works fine for unnumbered interfaces; just ignore the warnings it emits.

8021q VLAN support
==================

It is possible to configure VLAN devices over a bond interface using the 8021q
driver. However, only packets coming from the 8021q driver and passing through
bonding will be tagged by default. Self-generated packets, like bonding's
learning packets or ARP packets generated by either ALB mode or the ARP
monitor mechanism, are tagged internally by bonding itself. As a result,
bonding has to "learn" what VLAN IDs are configured on top of it, and it uses
those IDs to tag self-generated packets.
For simplicity, and to support the use of adapters that can do VLAN hardware
acceleration offloading, the bonding interface declares itself as fully
hardware offloading capable. It gets the add_vid/kill_vid notifications to
gather the necessary information, and it propagates those actions to the
slaves.
941 In case of mixed adapter types, hardware accelerated tagged packets that should
942 go through an adapter that is not offloading capable are "un-accelerated" by the
943 bonding driver so the VLAN tag sits in the regular location.
945 VLAN interfaces *must* be added on top of a bonding interface only after
946 enslaving at least one slave. This is because until the first slave is added the
947 bonding interface has a HW address of 00:00:00:00:00:00, which will be copied by
948 the VLAN interface when it is created.
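
A setup script can guard against creating VLANs too early by checking the
bond's HW address first. The sketch below is illustrative only: both function
names are hypothetical, the caller is assumed to supply the bond's current HW
address (for example as read from ifconfig output), the use of the vconfig
utility to create the VLAN is an assumption, and the function merely prints
the command rather than running it.

```shell
# Return success if MAC is the all-zero address that a bond carries before
# any slave has been enslaved.
is_zero_mac() {
    case $1 in
        00:00:00:00:00:00) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical guard: print the command that would create VLAN VID on BOND,
# or fail if the bond still has the all-zero HW address.
# Usage: vlan_add_command BOND BOND_MAC VID
vlan_add_command() {
    if is_zero_mac "$2"; then
        echo "$1 has no HW address yet; enslave at least one slave first" >&2
        return 1
    fi
    echo "vconfig add $1 $3"
}
```

Printing the command keeps the sketch free of side effects; a real script
would run the command once the check passes. The address and VLAN ID passed
in are whatever the caller's environment provides.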
Note that a problem would occur if all slaves are released from a bond that
still has VLAN interfaces on top of it. When new slaves are added later, the
bonding interface would get a HW address from the first slave, which might not
match that of the VLAN interfaces. It is recommended either to remove all
VLANs and then re-add them, or to manually set the bonding interface's HW
address so it matches the VLANs'. (Note: changing a VLAN interface's HW address
would set the underlying device -- i.e. the bonding interface -- to promiscuous
mode, which might not be what you want).

Limitations
===========

The main limitations are :
  - only the link status is monitored. If the switch on the other side is
    partially down (e.g. doesn't forward anymore, but the link is OK), the link
    won't be disabled. Another way to check for a dead link could be to count
    incoming frames on a heavily loaded host. This is not applicable to small
    servers, but may be useful when the front switches send multicast
    information on their links (e.g. VRRP), or even health-check the servers.
    Use the arp_interval/arp_ip_target parameters to count incoming/outgoing
    frames.

Resources and Links
===================

977 Current development on this driver is posted to:
978 - http://www.sourceforge.net/projects/bonding/
980 Donald Becker's Ethernet Drivers and diag programs may be found at :
981 - http://www.scyld.com/network/
983 You will also find a lot of information regarding Ethernet, NWay, MII, etc. at
986 Patches for 2.2 kernels are at Willy Tarreau's site :
987 - http://wtarreau.free.fr/pub/bonding/
988 - http://www-miaif.lip6.fr/~tarreau/pub/bonding/
For the latest information about Linux kernel development, please consult
the Linux Kernel Mailing List Archives at :
 - http://www.ussg.iu.edu/hypermail/linux/kernel/