Lines Matching refs:the

23 The behavior of the bonded interfaces depends upon the mode; generally
29 the original tools from extreme-linux and beowulf sites will not work
30 with this version of the driver.
32 For new versions of the driver, updated userspace tools, and
33 who to ask for help, please follow the links at the end of this file.
106 Most popular distro kernels ship with the bonding driver
110 the following steps:
112 1.1 Configure and build the kernel with bonding
115 The current version of the bonding driver is available in the
116 drivers/net/bonding subdirectory of the most recent kernel source
118 own" will want to use the most recent kernel from kernel.org.
121 "make config"), then select "Bonding driver support" in the "Network
122 device support" section. It is recommended that you configure the
123 driver as a module since it is currently the only way to pass parameters
124 to the driver or configure more than one bonding device.
126 Build and install the new kernel and modules.
132 or sysfs, the old ifenslave control utility is obsolete.
137 Options for the bonding driver are supplied as parameters to the
140 Module options may be given as command line arguments to the
141 insmod or modprobe command, but are usually specified in either the
143 configuration file (some of which are detailed in the next section).
145 Details on bonding support for sysfs are provided in the
149 parameter is not specified the default value is used. When initially
153 It is critical that either the miimon or arp_interval and
158 Options with textual values will accept either the text name
159 or, for backwards compatibility, the option value. E.g.,
160 "mode=802.3ad" and "mode=4" set the same mode.
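The equivalence of text names and numeric option values can be sketched as a small lookup. This is a minimal illustration, not part of the driver: the helper name normalize_mode and the table name BONDING_MODES are hypothetical, and the name-to-number pairs are assumed to follow the mode option's documented numbering.

```python
# Illustrative only: maps bonding mode names to their numeric option values.
# BONDING_MODES and normalize_mode are hypothetical names for this sketch.
BONDING_MODES = {
    "balance-rr": 0,
    "active-backup": 1,
    "balance-xor": 2,
    "broadcast": 3,
    "802.3ad": 4,
    "balance-tlb": 5,
    "balance-alb": 6,
}


def normalize_mode(value: str) -> int:
    """Return the numeric mode for either a text name or a numeric string."""
    value = value.strip()
    if value in BONDING_MODES:
        return BONDING_MODES[value]
    number = int(value)  # raises ValueError for unknown text names
    if number not in BONDING_MODES.values():
        raise ValueError(f"unknown bonding mode: {value}")
    return number


# "mode=802.3ad" and "mode=4" resolve to the same mode.
print(normalize_mode("802.3ad") == normalize_mode("4"))
```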
166 Specifies the new active slave for modes that support it
168 are the name of any currently enslaved interface, or an empty
169 string. If a name is given, the slave and its link must be up in order
170 to be selected as the new active slave. If an empty string is
171 specified, the current active slave is cleared, and a new active
174 Note that this is only available through the sysfs interface. No module
177 The normal value of this option is the name of the currently
178 active slave, or the empty string if there is no active slave or
179 the current mode does not use an active slave.
183 Specifies the 802.3ad aggregation selection logic to use. The
191 Reselection of the active aggregator occurs only when all
192 slaves of the active aggregator are down or the active
195 This is the default value.
202 - A slave is added to or removed from the bond
212 The active aggregator is chosen by the largest number of
213 ports (slaves). Reselection occurs as described under the
217 802.3ad aggregations when partial failure of the active aggregator
218 occurs. This keeps the aggregator with the highest availability
237 Specifies the ARP link monitoring frequency in milliseconds.
239 The ARP monitor works by periodically checking the slave
241 traffic recently (the precise criteria depend upon the
242 bonding mode, and the state of the slave). Regular traffic is
243 generated via ARP probes issued for the addresses specified by
244 the arp_ip_target option.
246 This behavior can be modified by the arp_validate option,
250 (modes 0 and 2), the switch should be configured in a mode
251 that evenly distributes packets across all links. If the
252 switch is configured to distribute the packets in an XOR
253 fashion, all replies from the ARP targets will be received on
254 the same link which could cause the other team members to
261 Specifies the IP addresses to use as ARP monitoring peers when
262 arp_interval is > 0. These are the targets of the ARP request
263 sent to determine the health of the link to the targets.
285 Validation is performed only for the active slave.
303 only for the active slave.
312 Enabling validation causes the ARP monitor to examine the incoming
314 is receiving the appropriate ARP traffic.
316 For an active slave, the validation checks ARP replies to confirm
318 do not typically receive these replies, the validation performed
319 for backup slaves is on the broadcast ARP request sent out via the
321 configurations may result in situations wherein the backup slaves
322 do not receive the ARP requests; in such a situation, validation
327 the active slave failure, it doesn't really guarantee that the
328 backup slave will work if it's selected as the next active slave.
332 beyond a common switch. Should the link between the switch and
333 target fail (but not the switch itself), the probe traffic
334 generated by the multiple bonding instances will fool the standard
335 ARP monitor into considering the links as still up. Use of
336 validation can resolve this, as the ARP monitor will only consider
342 Enabling filtering causes the ARP monitor to only use incoming ARP
347 Filtering operates by only considering the reception of ARP
353 levels of third party broadcast traffic would fool the standard
354 ARP monitor into considering the links as still up. Use of
362 Specifies the quantity of arp_ip_targets that must be reachable
363 in order for the ARP monitor to consider a slave as being up.
371 consider the slave up only when any of the arp_ip_targets
376 consider the slave up only when all of the arp_ip_targets
381 Specifies the time, in milliseconds, to wait before disabling
383 is only valid for the miimon link monitor. The downdelay
384 value should be a multiple of the miimon value; if not, it
385 will be rounded down to the nearest multiple. The default
391 the same MAC address at enslavement (the traditional
392 behavior), or, when enabled, perform special handling of the
393 bond's MAC address in accordance with the selected policy.
401 the same MAC address at enslavement time. This is the
406 The "active" fail_over_mac policy indicates that the
407 MAC address of the bond should always be the MAC
408 address of the currently active slave. The MAC
409 address of the slaves is not changed; instead, the MAC
410 address of the bond changes during a failover.
415 interferes with the ARP monitor).
418 the network must be updated via gratuitous ARP,
421 traffic, if the switch snoops incoming traffic to
422 update its tables) for the traditional method. If the
426 When this policy is used in conjunction with the mii
429 susceptible to loss of the gratuitous ARP, and an
434 The "follow" fail_over_mac policy causes the MAC
435 address of the bond to be selected normally (normally
436 the MAC address of the first slave added to the bond).
437 However, the second and subsequent slaves are not set
439 slave is programmed with the bond's MAC address at
440 failover time (and the formerly active slave receives
441 the newly active slave's MAC address).
445 when multiple ports are programmed with the same MAC
449 The default policy is none, unless the first slave cannot
450 change its MAC address, in which case the active policy is
454 present in the bond.
461 Option specifying the rate at which we'll ask our link partner
475 Specifies the number of bonding devices to create for this
476 instance of the bonding driver. E.g., if max_bonds is 3, and
477 the bonding driver is not already loaded, then bond0, bond1
483 Specifies the MII link monitoring frequency in milliseconds.
484 This determines how often the link state of each slave is
487 The use_carrier option, below, affects how the link state is
488 determined. See the High Availability section for additional
493 Specifies the minimum number of links that must be active before
494 asserting carrier. It is similar to the Cisco EtherChannel min-links
495 feature. This allows setting the minimum number of member ports that
496 must be up (link-up state) before marking the bond device as up
503 802.3ad mode) whenever there is an active aggregator, regardless of the
506 setting this option to 0 or to 1 has the exact same effect.
510 Specifies one of the bonding policies. The default is
516 order from the first available slave through the
522 Active-backup policy: Only one slave in the bond is
524 if, the active slave fails. The bond's MAC address is
526 to avoid confusing the switch.
530 or more gratuitous ARPs on the newly active slave.
531 One gratuitous ARP is issued for the bonding master
533 it, provided that the interface has at least one IP
535 interfaces are tagged with the appropriate VLAN id.
538 option, documented below, affects the behavior of this
543 XOR policy: Transmit based on the selected transmit
547 policies may be selected via the xmit_hash_policy option,
560 aggregation groups that share the same speed and
561 duplex settings. Utilizes all slaves in the active
562 aggregator according to the 802.3ad specification.
565 to the transmit hash policy, which may be changed from
566 the default simple XOR policy via the xmit_hash_policy
569 regards to the packet mis-ordering requirements of
570 section 43.2.4 of the 802.3ad standard. Differing
576 1. Ethtool support in the base drivers for retrieving
577 the speed and duplex of each slave.
590 In tlb_dynamic_lb=1 mode, the outgoing traffic is
591 distributed according to the current load (computed
592 relative to the speed) on each slave.
594 In tlb_dynamic_lb=0 mode, the load balancing based on
595 current load is disabled and the load is distributed
596 only using the hash distribution.
598 Incoming traffic is received by the current slave.
599 If the receiving slave fails, another slave takes over
600 the MAC address of the failed receiving slave.
604 Ethtool support in the base drivers for retrieving the
613 The bonding driver intercepts the ARP Replies sent by
614 the local system on their way out and overwrites the
615 source hardware address with the unique hardware
616 address of one of the slaves in the bond such that
618 the server.
620 Receive traffic from connections created by the server
621 is also balanced. When the local system sends an ARP
622 Request the bonding driver copies and saves the peer's
623 IP information from the ARP packet. When the ARP
624 Reply arrives from the peer, its hardware address is
625 retrieved and the bonding driver initiates an ARP
626 reply to this peer assigning it to one of the slaves
627 in the bond. A problematic outcome of using ARP
629 ARP request is broadcast it uses the hardware address
630 of the bond. Hence, peers learn the hardware address
631 of the bond and the balancing of receive traffic
632 collapses to the current slave. This is handled by
633 sending updates (ARP Replies) to all the peers with
635 the traffic is redistributed. Receive traffic is also
636 redistributed when a new slave is added to the bond
639 among the group of highest speed slaves in the bond.
641 When a link is reconnected or a new slave joins the
642 bond the receive traffic is redistributed among all
643 active slaves in the bond by initiating ARP Replies
644 with the selected MAC address to each of the
646 be set to a value equal or greater than the switch's
647 forwarding delay so that the ARP Replies sent to the
648 peers will not be blocked by the switch.
652 1. Ethtool support in the base drivers for retrieving
653 the speed of each slave.
655 2. Base driver support for setting the hardware
657 required so that there will always be one slave in the
658 team using the bond hardware address (the
660 address for each slave in the bond. If the
662 swapped with the new curr_active_slave that was
668 Specify the number of peer notifications (gratuitous ARPs and
670 failover event. As soon as the link is up on the new slave
671 (possibly immediately) a peer notification is sent on the
674 is active) if the number is greater than 1.
676 The valid range is 0 - 255; the default value is 1. These options
677 affect only the active-backup mode. These options were added for
681 are generated by the ipv4 and ipv6 code and the numbers of
686 Specify the number of packets to transmit through a slave before
687 moving to the next one. When set to 0 then a slave is chosen at
690 The valid range is 0 - 65535; the default value is 1. This option
695 A string (eth0, eth2, etc) specifying which slave is the
696 primary device. The specified device will always be the
697 active slave while it is available. Only when the primary is
707 Specifies the reselection policy for the primary slave. This
708 affects how the primary slave is chosen to become the active slave
709 when failure of the active slave or recovery of the primary slave
711 the primary slave and other slaves. Possible values are:
715 The primary slave becomes the active slave whenever it
720 The primary slave becomes the active slave when it comes
721 back up, if the speed and duplex of the primary slave is
722 better than the speed and duplex of the current active
727 The primary slave becomes the active slave only if the
728 current active slave fails and the primary slave is up.
732 If no slaves are active, the first slave to recover is
733 made the active slave.
735 When initially enslaved, the primary slave is always made
736 the active slave.
738 Changing the primary_reselect policy via sysfs will cause an
739 immediate selection of the best active slave according to the new
740 policy. This may or may not result in a change of the active
741 slave, depending upon the circumstances.
751 slaves based on the load in that interval. This gives nice lb
754 load balancing provided solely by the hash distribution.
755 xmit-hash-policy can be used to select the appropriate hashing for
756 the setup.
758 The sysfs entry can be used to change the setting per bond device
759 and the initial value is derived from the module parameter. The
760 sysfs entry is allowed to be changed only if the bond device is
769 Specifies the time, in milliseconds, to wait before enabling a
771 only valid for the miimon link monitor. The updelay value
772 should be a multiple of the miimon value; if not, it will be
773 rounded down to the nearest multiple. The default value is 0.
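The rounding rule stated for updelay (and likewise for downdelay) can be worked through with a short calculation. This is a sketch under the assumption that "rounded down to the nearest multiple" means integer floor to a multiple of miimon; the function name effective_delay is hypothetical and does not correspond to any driver symbol.

```python
def effective_delay(delay_ms: int, miimon_ms: int) -> int:
    """Round an updelay/downdelay value down to a multiple of miimon.

    Hypothetical helper for illustration; values are in milliseconds.
    """
    if miimon_ms <= 0:
        raise ValueError("the delay options only apply when miimon is > 0")
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    # Floor division drops the remainder, giving the nearest lower multiple.
    return (delay_ms // miimon_ms) * miimon_ms


# With miimon=100, a requested updelay of 250 behaves like 200.
print(effective_delay(250, 100))
```

Choosing a delay that is already a multiple of miimon avoids any surprise from the rounding.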
778 ioctls vs. netif_carrier_ok() to determine the link
780 utilize a deprecated calling sequence within the kernel. The
781 netif_carrier_ok() relies on the device driver to maintain its
785 If bonding insists that the link is up when it should not be,
789 it will appear as if the link is always up. In this case,
790 setting use_carrier to 0 will cause bonding to revert to the
791 MII / ETHTOOL ioctl method to determine the link state.
793 A value of 1 enables the use of netif_carrier_ok(), a value of
794 0 will use the deprecated MII / ETHTOOL ioctls. The default
799 Selects the transmit hash policy to use for slave selection in
805 field to generate the hash. The formula is
811 network peer on the same slave.
818 protocol information to generate the hash.
821 generate the hash. The formula is
829 If the protocol is IPv6 then the source and destination
833 network peer on the same slave. For non-IP traffic,
834 the formula is the same as for the layer2 transmit
847 when available, to generate the hash. This allows for
854 hash = source port, destination port (as in the header)
860 If the protocol is IPv6 then the source and destination
864 IPv6 protocol traffic, the source and destination port
865 information is omitted. For non-IP traffic, the
866 formula is the same as for the layer2 transmit hash
881 This policy uses the same formula as layer2+3 but it
882 relies on skb_flow_dissect to obtain the header fields
883 which might result in the use of inner headers if an
885 improve the performance for tunnel users because the
886 packets will be distributed according to the encapsulated
891 This policy uses the same formula as layer3+4 but it
892 relies on skb_flow_dissect to obtain the header fields
893 which might result in the use of inner headers if an
895 improve the performance for tunnel users because the
896 packets will be distributed according to the encapsulated
901 does not exist, and the layer2 policy is the only policy. The
906 Specifies the number of IGMP membership reports to be issued after
908 the failover, subsequent packets are sent in each 200ms interval.
910 The valid range is 0 - 255; the default value is 1. A value of 0
911 prevents the IGMP membership report from being issued in response
912 to the failover event.
916 switch the IGMP traffic from one slave to another. Therefore a fresh
917 IGMP report must be issued to cause the switch to forward the incoming
918 IGMP traffic over the newly selected slave.
924 Specifies the number of seconds between instances where the bonding
927 The valid range is 1 - 0x7fffffff; the default value is 1. This option
934 initialization scripts, or manually using either iproute2 or the
935 sysfs interface. Distros generally use one of three packages for the
940 We will first describe the options for configuring bonding for
943 bonding without support from the network initialization scripts (i.e.,
954 Else, issue the command:
959 "initscripts" or "sysconfig," followed by some numbers. This is the
963 issue the command:
977 bonding, however, at this writing, the YaST system configuration
981 First, if they have not already been configured, configure the
982 slave devices. On SLES 9, this is most easily done by running the
985 this is to configure the devices for DHCP (this is only to get the
987 name of the configuration file for each device will be of the form:
991 Where the "xx" portion will be replaced with the digits from
992 the device's permanent MAC address.
994 Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
995 created, it is necessary to edit the configuration files for the slave
996 devices (the MAC addresses correspond to those of the slave devices).
997 Before editing, the file will contain multiple lines, and will look
1006 Change the BOOTPROTO and STARTMODE lines to the following:
1011 Do not alter the UNIQUE or _nm_name lines. Remove any other
1014 Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified,
1015 it's time to create the configuration file for the bonding device
1016 itself. This file is named ifcfg-bondX, where X is the number of the
1018 ifcfg-bond0, the second is ifcfg-bond1, and so on. The sysconfig
1022 The contents of the ifcfg-bondX file are as follows:
1036 Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
1037 values with the appropriate values for your network.
1039 The STARTMODE specifies when the device is brought online.
1055 The line BONDING_MASTER='yes' indicates that the device is a
1058 The contents of BONDING_MODULE_OPTS are supplied to the
1059 instance of the bonding module for this device. Specify the options
1060 for the bonding mode, link monitoring, and so on here. Do not include
1061 the max_bonds bonding parameter; this will confuse the configuration
1067 specifier for the network device. The interface name is easier to
1068 find, but the ethN names are subject to change at boot time if, e.g.,
1069 a device early in the sequence has failed. The device specifiers
1070 (bus-pci-0000:06:08.1 in the example above) specify the physical
1071 network device, and will not change unless the device's bus location
1074 configurations will choose one or the other for all slave devices.
1077 networking must be restarted for the configuration changes to take
1078 effect. This can be accomplished via the following:
1082 Note that the network control script (/sbin/ifdown) will
1083 remove the bonding module as part of the network shutdown processing,
1084 so it is not necessary to remove the module by hand if, e.g., the
1089 devices). It is necessary to edit the configuration file by hand to
1090 change the bonding configuration.
1092 Additional general options and details of the ifcfg file
1097 Note that the template does not document the various BONDING_
1098 settings described above, but does describe many of the other options.
1105 writing, this does not function for bonding devices; the scripts
1106 attempt to obtain the device address from DHCP prior to adding any of
1107 the slave devices. Without active slaves, the DHCP requests are not
1108 sent to the network.
1116 (as described above). Do not specify the "max_bonds" parameter to any
1121 Because the sysconfig scripts supply the bonding module
1122 options in the ifcfg-bondX file, it is not necessary to add them to
1123 the system /etc/modules.d/*.conf configuration files.
1130 version 3 or later, Fedora, etc. On these systems, the network
1132 control bonding devices. Note that older versions of the initscripts
1136 These distros will not automatically load the network adapter
1137 driver unless the ethX device is configured with an IP address.
1140 a bondX link. Network script files are located in the directory:
1145 with the adapter's physical adapter number. For example, the script
1147 Place the following text in the file:
1157 must correspond with the name of the file, i.e., ifcfg-eth1 must have
1158 a device line of DEVICE=eth1. The setting of the MASTER= line will
1159 also depend on the final bonding interface name chosen for your bond.
1161 one for each device, i.e., the first bonding instance is bond0, the
1166 the number of the bond. For bond0 the file is named "ifcfg-bond0",
1168 place the following text:
1179 Be sure to change the networking specific lines (IPADDR,
1184 and, indeed, preferable, to specify the bonding options in the ifcfg-bond0
1185 file, e.g. a line of the format:
1189 will configure the bond with the specified options. The options
1190 specified in BONDING_OPTS are identical to the bonding module parameters
1191 except for the arp_ip_target field when using versions of initscripts older
1194 should be preceded by a '+' to indicate it should be added to the list of
1199 is the proper syntax to specify multiple targets. When specifying
1204 your distro) to load the bonding module with your desired options when the
1206 will load the bonding module, and select its options:
1211 Replace the sample parameters with the appropriate set of
1215 will restart the networking subsystem and your bond link should now be
1221 Recent versions of initscripts (the versions supplied with Fedora
1227 above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
1228 and add a line consisting of "TYPE=Bonding". Note that the TYPE value
1236 specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the
1237 number of the bond. This support requires sysfs support in the kernel,
1240 those instances, see the "Configuring Multiple Bonds Manually" section,
1247 scripts (the sysconfig or initscripts package) do not have specific
1251 The general method for these systems is to place the bonding
1253 appropriate for the installed distro), then add modprobe and/or
1254 `ip link` commands to the system's global init script. The name of
1255 the global init script differs; for sysconfig, it is
1260 reboots, edit the appropriate file (/etc/init.d/boot.local or
1261 /etc/rc.d/rc.local), and add the following:
1269 Replace the example bonding module parameters and bond0
1270 network configuration (IP address, netmask, etc) with the appropriate
1273 Unfortunately, this method will not provide support for the
1274 ifup and ifdown scripts on the bond devices. To reload the bonding
1275 configuration, it is necessary to run the initialization script, e.g.,
1284 which only initializes the bonding configuration, then call that
1286 enabled without re-running the entire global init script.
1288 To shut down the bonding devices, it is necessary to first
1289 mark the bonding device itself as being down, then remove the
1291 the following:
1308 If you require multiple bonding devices, but all with the same
1309 options, you may wish to use the "max_bonds" module parameter,
1313 preferable to use bonding parameters exported by sysfs, documented in the
1316 For versions of bonding without sysfs support, the only means to
1318 the bonding driver multiple times. Note that current versions of the
1320 your distro uses these scripts, no special action is needed. See the
1324 To load multiple instances of the module, it is necessary to
1325 specify a different name for each instance (the module loading system
1326 requires that every loaded module, even multiple instances of the same
1336 will load the bonding module two times. The first instance is
1337 named "bond0" and creates the bond0 device in balance-rr mode with an
1338 miimon of 100. The second instance is named "bond1" and creates the
1342 the above does not work, and the second bonding instance never sees
1343 its options. In that case, the second options line can be substituted
1353 to rename modules at load time (the "-o bond1" part). Attempts to pass
1364 via the sysfs interface. This interface allows dynamic configuration
1365 of all bonds in the system without unloading the module. It also
1369 Use of the sysfs interface allows you to use multiple bonds
1370 with different configurations without having to reload the module.
1372 bonding is compiled into the kernel.
1374 You must have the sysfs filesystem mounted to configure
1376 are using the standard mount point for sysfs, e.g. /sys. If your
1377 sysfs filesystem is mounted elsewhere, you will need to adjust the
1397 Interfaces may be enslaved to a bond using the file
1399 are the same as for the bonding_masters file.
1408 When an interface is enslaved to a bond, symlinks between the
1409 two are created in the sysfs filesystem. In this case, you would get
1414 interface is enslaved by looking for the master symlink. Thus:
1417 the name of the bond interface.
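Looking up the bond for an interface via its master symlink can be sketched as follows. This is a minimal example, assuming the symlink lives at class/net/<interface>/master under the sysfs mount point; the function name bond_master_of and the sysfs_root parameter are hypothetical and exist only so the sketch can be pointed at a non-standard mount point.

```python
import os
from typing import Optional


def bond_master_of(iface: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the bond an interface is enslaved to, or None if it has none.

    Hypothetical helper: follows the 'master' symlink of the interface.
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "master")
    if not os.path.islink(link):
        return None  # no master symlink, so the interface is not enslaved
    # The last path component of the link target is the bond interface name.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    print(bond_master_of("eth0"))
```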
1421 Each bond may be configured individually by manipulating the
1424 The names of these files correspond directly with the command-
1425 line parameters described elsewhere in this file, and, with the
1426 exception of arp_ip_target, they accept the same values. To see the
1427 current setting, simply cat the appropriate file.
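Reading a current setting amounts to reading one small file per option. The sketch below assumes the per-bond option files sit under class/net/<bond>/bonding/ relative to the sysfs mount point; the function name read_bond_option and the sysfs_root parameter are hypothetical, the latter covering the case where sysfs is mounted somewhere other than /sys.

```python
from pathlib import Path


def read_bond_option(bond: str, option: str, sysfs_root: str = "/sys") -> str:
    """Return the current value of one bonding option as text.

    Hypothetical helper, equivalent to running cat on the option's file.
    """
    path = Path(sysfs_root) / "class" / "net" / bond / "bonding" / option
    return path.read_text().strip()


if __name__ == "__main__":
    # Requires an existing bond0 device on the running system.
    print(read_bond_option("bond0", "miimon"))
```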
1430 guidelines for each parameter, see the appropriate section in this
1438 NOTE: The bond interface must be down before the mode can be
1454 To configure the interval between learning packet transmits:
1456 NOTE: the lp_interval is the number of seconds between instances where
1457 the bonding driver sends learning packets to each slave's peer switch. The
1462 We begin with the same example that is shown in section 3.3,
1466 and eth1), and have it persist across reboots, edit the appropriate
1467 file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the
1479 active-backup mode, using ARP monitoring, add the following lines to
1499 the box. The ifenslave-2.6 package should be installed to provide bonding
1503 Note that the ifenslave-2.6 package will load the bonding module and use
1504 the ifenslave command when appropriate.
1509 In /etc/network/interfaces, the following stanza will configure bond0, in
1519 If the above configuration doesn't work, you might have a system using
1522 produce the same result on those systems.
1541 more advanced examples tailored to your particular distros, see the files in
1547 When using the bonding driver, the physical port which transmits a frame is
1548 typically selected by the bonding driver, and is not relevant to the user or
1549 system administrator. The output port is simply selected using the policies of
1550 the selected bonding mode. On occasion however, it is helpful to direct certain
1554 connects via a public network, it may be desirable to bias the bond to send said
1557 using the traffic control utilities inherent in Linux.
1559 By default the bonding driver is multiqueue aware and 16 queues are created
1560 when the driver initializes (see Documentation/networking/multiqueue.txt
1561 for details). If more or fewer queues are desired the module parameter
1563 available as the allocation is done at module init time.
1565 The output of the file /proc/net/bonding/bondX has changed so the output Queue
1588 The queue_id for a slave can be set using the command:
1593 like the one above until proper priorities are set for all interfaces. On
1597 These queue IDs can be used in conjunction with the tc utility to configure
1599 slave devices. For instance, say we wanted, in the above configuration, to
1600 force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output
1608 These commands tell the kernel to attach a multiqueue queue discipline to the
1611 This value is then passed into the driver, causing the normal output path
1614 Note that qid values begin at 1. Qid 0 is reserved to indicate to the driver
1616 leaving the qid for a slave to 0 is the multiqueue awareness in the bonding
1618 slave devices as well as bond devices and the bonding driver will simply act as
1619 a pass-through for selecting output queues on the slave device rather than
1631 Each bonding device has a read-only file residing in the
1633 about the bonding configuration, options and state of each slave.
1635 For example, the contents of /proc/net/bonding/bond0 after the
1655 The precise format and contents will change depending upon the
1656 bonding configuration, state, and version of the bonding driver.
1661 The network configuration can be inspected using the ifconfig
1662 command. Bonding devices will have the MASTER flag set; bonding slave
1663 devices will have the SLAVE flag set. The ifconfig output does not
1666 In the example below, the bond0 interface is the master
1668 bond0 have the same MAC address (HWaddr) as bond0 for all modes except
1696 For this section, "switch" refers to whatever system the
1697 bonded devices are directly connected to (i.e., where the other end of
1698 the cable plugs into). This may be an actual dedicated switch device,
1703 require any specific configuration of the switch.
1705 The 802.3ad mode requires that the switch have the appropriate
1708 Cisco 3550 series switch requires that the appropriate ports first be
1714 require that the switch have the appropriate ports grouped together.
1716 called an "etherchannel" (as in the Cisco example, above), a "trunk
1718 will also have its own configuration options for the switch's transmit
1719 policy to the bond. Typical choices include XOR of either the MAC or
1720 IP addresses. The transmit policy of the two peers does not need to
1721 match. For these three modes, the bonding mode really selects a
1730 using the 8021q driver. However, only packets coming from the 8021q
1733 packets generated by either ALB mode or the ARP monitor mechanism, are
1735 "learn" the VLAN IDs configured above it, and use those IDs to tag
1738 For reasons of simplicity, and to support the use of adapters
1739 that can do VLAN hardware acceleration offloading, the bonding
1741 the add_vid/kill_vid notifications to gather the necessary
1742 information, and it propagates those actions to the slaves. In case
1745 "un-accelerated" by the bonding driver so the VLAN tag sits in the
1750 hardware address of 00:00:00:00:00:00 until the first slave is added.
1751 If the VLAN interface is created prior to the first enslavement, it
1752 would pick up the all-zeroes hardware address. Once the first slave
1753 is attached to the bond, the bond device itself will pick up the
1754 slave's hardware address, which is then available for the VLAN device.
1758 top of it. When a new slave is added, the bonding interface will
1759 obtain its hardware address from the first slave, which might not
1760 match the hardware address of the VLAN interfaces (which was
1763 There are two methods to ensure that the VLAN device operates
1764 with the correct hardware address if all slaves are removed from a
1769 2. Set the bonding interface's hardware address so that it
1770 matches the hardware address of the VLAN interfaces.
1772 Note that changing a VLAN interface's HW address would set the
1773 underlying device -- i.e. the bonding interface -- to promiscuous
1781 monitoring a slave device's link state: the ARP monitor and the MII
1784 At the present time, due to implementation restrictions in the
1792 queries to one or more designated peer systems on the network, and
1793 uses the response as an indication that the link is operating. This
1795 or more peers on the local network.
1797 The ARP monitor relies on the device driver itself to verify
1798 that traffic is flowing. In particular, the driver must keep up to
1799 date the last receive time, dev->last_rx, and transmit start time,
1800 dev->trans_start. If these are not updated by the driver, then the
1803 shows the ARP requests and replies on the network, then it may be that
1811 monitor. In the case of just one target, the target itself may go
1813 an additional target (or several) increases the reliability of the ARP
1822 For just a single target the options would resemble:
1832 The MII monitor monitors only the carrier state of the local
1834 depending upon the device driver to maintain its carrier state, by
1835 querying the device's MII registers, or by making an ethtool query to
1836 the device.
1838 If the use_carrier module parameter is 1 (the default value),
1839 then the MII monitor will rely on the driver for carrier state
1840 information (via the netif_carrier subsystem). As explained in the
1841 use_carrier parameter information, above, if the MII monitor fails to
1842 detect carrier loss on the device (e.g., when the cable is physically
1843 disconnected), it may be that the driver does not support
1846 If use_carrier is 0, then the MII monitor will first query the
1847 device's MII registers (via ioctl) and check the link state. If that
1848 request fails (not just that it returns carrier down), then the MII
1850 the same information. If both methods fail (i.e., the driver either
1851 does not support or had some error in processing both the MII register
1852 and ethtool requests), then the MII monitor will assume the link is
1861 When bonding is configured, it is important that the slave
1862 devices not have routes that supersede routes of the master (or,
1863 generally, not have routes at all). For example, suppose the bonding
1864 device bond0 has two slaves, eth0 and eth1, and the routing table is
1874 This routing configuration will likely still update the
1875 receive/transmit times in the driver (needed by the ARP monitor), but
1876 may bypass the bonding driver (because outgoing traffic to, in this
1880 configuration, because ARP requests (generated by the ARP monitor)
1881 will be sent on one interface (bond0), but the corresponding reply
1885 by the state of the routing table.
1889 not supersede routes of their master. This should generally be the
1898 that the same physical device always has the same "ethX" name), it may
1902 For example, given a modules.conf containing the following:
1911 If neither eth0 nor eth1 are slaves to bond0, then when the
1912 bond0 interface comes up, the devices may end up reordered. This
1915 when the e1000 driver loads, it will receive eth0 and eth1 for its
1916 devices, but the bonding configuration tries to enslave eth2 and eth3
1917 (which may later be assigned to the tg3 devices).
1919 Adding the following:
1924 bonding is loaded. This command is fully documented in the
1928 In this case, the following can be added to config files in
1933 This will load tg3 and e1000 modules before loading the bonding one.
1934 Full documentation on this can be found in the modprobe.d and modprobe
1940 By default, bonding enables the use_carrier option, which
1941 instructs bonding to trust the driver to maintain carrier state.
1943 As discussed in the options section, above, some drivers do
1944 not support the netif_carrier_on/_off link state tracking system.
1949 not maintain it in real time, e.g., only polling the link state at
1953 use_carrier=0 to see if that improves the failure detection time. If
1954 it does, then it may be that the driver checks the carrier state at a
1955 fixed interval, but does not cache the MII register values (so the
1956 use_carrier=0 method of querying the registers directly works). If
1957 use_carrier=0 does not improve the failover, then the driver may cache
1958 the registers, or the problem may be elsewhere.
1960 Also, remember that miimon only checks for the device's
1961 carrier state. It has no way to determine the state of devices on or
1968 If running SNMP agents, the bonding driver should be loaded
1970 is due to the interface index (ipAdEntIfIndex) being associated to
1971 the first interface found with a given IP address. That is, there is
1973 eth1 are slaves of bond0 and the driver for eth0 is loaded before the
1974 bonding driver, the interface for the IP address will be associated
1975 with the eth0 interface. This configuration is shown below; the IP
1977 in the ifDescr table (ifDescr.2).
1990 This problem is avoided by loading the bonding driver before
1992 loading the bonding driver first, the IP address 192.168.1.1 is
2006 While some distributions may not report the interface name in
2007 ifDescr, the association between the IP address and IfIndex remains
2015 common to enable promiscuous mode on the device, so that all traffic
2016 is seen (instead of seeing only traffic destined for the local host).
2017 The bonding driver handles promiscuous mode changes to the bonding
2018 master device (e.g., bond0), and propagates the setting to the slave
2021 For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
2022 the promiscuous mode setting is propagated to all slaves.
2024 For the active-backup, balance-tlb and balance-alb modes, the
2025 promiscuous mode setting is propagated only to the active slave.
2027 For balance-tlb mode, the active slave is the slave currently
2030 For balance-alb mode, the active slave is the slave used as a
2032 sending to peers that are unassigned or if the load is unbalanced.
2034 For the active-backup, balance-tlb and balance-alb modes, when
2035 the active slave changes (e.g., due to a link failure), the
2036 promiscuous setting will be propagated to the new active slave.
2043 links or switches between the host and the rest of the world. The
2044 goal is to provide the maximum availability of network connectivity
2045 (i.e., the network always works), even though other configurations
2055 access to fail over to. Additionally, the bonding load balance modes
2057 the load will be rebalanced across the remaining devices.
2065 With multiple switches, the configuration of bonding and the
2069 Below is a sample network, configured to maximize the
2070 availability of the network:
2084 In this configuration, there is a link between the two
2086 the outside world ("port3" on each switch). There is no technical
2092 In a topology such as the example above, the active-backup and
2093 broadcast modes are the only useful bonding modes when optimizing for
2094 availability; the other modes require all links to terminate on the
2097 active-backup: This is generally the preferred mode, particularly if
2098 the switches have an ISL and play together well. If the
2101 then the primary option can be used to ensure that the
2105 only for very specific needs. For example, if the two
2106 switches are not connected (no ISL), and the networks beyond
2109 independent networks, then the broadcast mode may be suitable.
2115 switch. If the switch can reliably fail ports in response to other
2116 failures, then either the MII or ARP monitors should work. For
2117 example, in the above example, if the "port3" link fails at the remote
2118 end, the MII monitor has no direct means to detect this. The ARP
2119 monitor could be configured with a target at the remote end of port3,
2122 In general, however, in a multiple switch topology, the ARP
2124 end connectivity failures (which may be caused by the failure of any
2126 the ARP monitor should be configured with multiple targets (at least
2127 one for each switch in the network). This will ensure that,
2128 regardless of which switch is active, the ARP monitor has a suitable
2132 generally referred to as "trunk failover." This is a feature of the
2133 switch that causes the link state of a particular switch port to be set
2134 down (or up) when the state of another switch port goes down (or up).
2136 to the logically "interior" ports that bonding is able to monitor via
2138 switch, but this can be a viable alternative to the ARP monitor when using
2147 In a single switch configuration, the best method to maximize
2148 throughput depends upon the application and network environment. The
2152 For this discussion, we will break down the topologies into
2153 two categories. Depending upon the destination of most traffic, we
2156 In a gatewayed configuration, the "switch" is acting primarily
2157 as a router, and the majority of traffic passes through this router to
2158 other networks. An example would be the following:
2169 acting as a gateway. For our discussion, the important point is that
2170 the majority of traffic from Host A will pass through the router to
2175 and received via one other peer on the local network, the router.
2177 Note that the case of two systems connected directly via
2178 multiple physical links is, for purposes of configuring bonding, the
2180 traffic is destined for the "gateway" itself, not some other network
2181 beyond the gateway.
2183 In a local configuration, the "switch" is acting primarily as
2184 a switch, and the majority of traffic passes through this switch to
2185 reach other stations on the same network. An example would be the
2196 Again, the switch may be a dedicated switch device, or another
2197 host acting as a gateway. For our discussion, the important point is
2198 that the majority of traffic from Host A is destined for other hosts
2199 on the same local network (Hosts B and C in the above example).
2202 the bonded device will be to the same MAC level peer on the network
2203 (the gateway itself, i.e., the router), regardless of its final
2205 from the final destinations, thus, each destination (Host B, Host C)
2209 configuration is important because many of the load balancing modes
2210 available use the MAC addresses of the local network source and
2218 This configuration is the easiest to set up and to understand,
2222 balance-rr: This mode is the only mode that will permit a single
2224 interfaces. It is therefore the only mode that will allow a
2226 worth of throughput. This comes at a cost, however: the
2232 altering the net.ipv4.tcp_reordering sysctl parameter. The
2236 Note that the fraction of packets that will be delivered out of
2238 of reordering depends upon a variety of factors, including the
2239 networking interfaces, the switch, and the topology of the
2248 through the switch to a balance-rr bond will not utilize greater
2255 to the bond.
2257 This mode requires the switch to have the appropriate ports
2261 the active-backup mode, as the inactive backup devices are all
2262 connected to the same peer as the primary. In this case, a
2263 load balancing mode (with link monitoring) will provide the
2265 available bandwidth. On the plus side, active-backup mode
2266 does not require any configuration of the switch, so it may
2267 have value if the hardware available does not support any of
2268 the load balance modes.
2271 for specific peers will always be sent over the same
2272 interface. Since the destination is determined by the MAC
2275 the same local network. This mode is likely to be suboptimal
2279 As with balance-rr, the switch ports need to be configured for
2288 protocol includes automatic configuration of the aggregates,
2289 so minimal manual configuration of the switch is needed
2294 packets. The 802.3ad mode does have some drawbacks: the
2295 standard mandates that all devices in the aggregate operate at
2296 the same speed and duplex. Also, as with all bonding load
2301 Additionally, the Linux bonding 802.3ad implementation
2304 outgoing traffic will generally use the same device. Incoming
2306 dependent upon the balancing policy of the peer's 802.3ad
2308 distributed across the devices in the bond.
2310 Finally, the 802.3ad mode mandates the use of the MII monitor;
2311 therefore, the ARP monitor is not available in this mode.
2314 Since the balancing is done according to MAC address, in a
2321 XOR to the same value) will not all "bunch up" on a single
2325 special switch configuration is required. On the down side,
2327 interface, this mode requires certain ethtool support in the
2328 network device driver of the slave interfaces, and the ARP
2332 It has all of the features (and restrictions) of balance-tlb,
2334 peers (as described in the Bonding Module Options section,
2337 The only additional down side to this mode is that the network
2338 device driver must support changing the hardware address while
2339 the device is open.
2346 support the use of the ARP monitor, and are thus restricted to using
2347 the MII monitor (which does not provide as high a level of end to end
2348 assurance as the ARP monitor).
2373 In this configuration, the switches are isolated from one
2381 If access beyond the network is required, an individual host
2388 In actual practice, the bonding mode typically employed in
2390 network configuration, the usual caveats about out of order packet
2391 delivery are mitigated by the use of network adapters that do not do
2392 any kind of packet coalescing (via the use of NAPI, or because the
2394 packets has arrived). When employed in this fashion, the balance-rr
2401 Again, in actual practice, the MII monitor is most often used
2404 advantages over the MII monitor are mitigated by the volume of probes
2405 needed as the number of systems involved grows (remember that each
2406 host in the network is configured with bonding).
2414 Some switches exhibit undesirable behavior with regard to the
2415 timing of link up and down reporting by the switch.
2418 the link is up (carrier available), but not pass traffic over the
2423 value to the updelay bonding module option to delay the use of the
2426 Second, some switches may "bounce" the link state one or more
2428 the switch is initializing. Again, an appropriate updelay value may
2431 Note that when a bonding interface has no active links, the
2432 driver will immediately reuse the first link that goes up, even if the
2433 updelay parameter has been specified (the updelay is ignored in this
2434 case). If there are slave interfaces waiting for the updelay timeout
2435 to expire, the interface that first went into that state will be
2436 immediately reused. This reduces down time of the network if the
2439 ignoring the updelay.
2441 In addition to the concerns about switch timings, if your
2444 Failover may be delayed via the downdelay bonding module option.
2449 NOTE: Starting with version 3.0.2, the bonding driver has logic to
2454 traffic when the bonding device is first used, or after it has been
2456 a "ping" to some other host on the network, and noticing that the
2460 all connected to one switch, the output may appear as follows:
2473 This is not due to an error in the bonding driver, rather, it
2475 tables. Initially, the switch does not associate the MAC address in
2476 the packet with a particular switch port, and so it may send the
2478 the interfaces attached to the bond may occupy multiple ports on a
2479 single switch, when the switch (temporarily) floods the traffic to all
2480 ports, the bond device receives multiple copies of the same packet
2485 behavior, it can be induced by clearing the MAC forwarding table (on
2486 most Cisco switches, the privileged command "clear mac address-table
2499 This applies to the JS20 and similar systems.
2501 On the JS20 blades, the bonding driver supports only
2503 largely due to the network topology inside the BladeCenter, detailed
2510 integrated on the planar (that's "motherboard" in IBM-speak). In the
2511 BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to
2536 modules 1 and 2. In this configuration, the eth0 and eth1 ports of a
2537 JS20 will be connected to different internal switches (in the
2541 passthrough module) connects the I/O module directly to an external
2542 switch. By using PMs in I/O module #1 and #2, the eth0 and eth1
2543 interfaces of a JS20 can be redirected to the outside world and
2546 Depending upon the mix of ESMs and PMs, the network will
2550 much like the example in "High Availability in a Multiple Switch
2556 The balance-rr mode requires the use of passthrough modules
2557 for devices in the bond, all connected to a common external switch.
2558 That switch must be configured for "etherchannel" or "trunking" on the
2564 must be able to reach all destinations for traffic sent over the
2565 bonding device (i.e., the network must converge at some point outside
2566 the BladeCenter).
2573 When an Ethernet Switch Module is in place, only the ARP
2575 nothing unusual, but examination of the BladeCenter cabinet would
2576 suggest that the "external" network ports are the ethernet ports for
2577 the system, when in fact there is a switch between these "external"
2578 ports and the devices on the JS20 system itself. The MII monitor is
2579 only able to detect link failures between the ESM and the JS20 system.
2581 When a passthrough module is in place, the MII monitor does
2582 detect failures to the "external" port, which is then directly
2583 connected to the JS20 system.
2588 The Serial Over LAN (SoL) link is established over the primary
2591 network traffic, as the SoL system is beyond the control of the
2594 It may be desirable to disable spanning tree on the switch
2595 (either the internal Ethernet Switch Module, or an external switch) to
2605 The new driver was designed to be SMP safe from the start.
2611 devices need not be of the same speed.
2622 This is limited only by the number of network interfaces Linux
2623 supports and/or the number of network cards you can place in your
2628 If link monitoring is enabled, then the failing device will be
2630 other modes will ignore the failed link. The link will continue to be
2631 monitored, and should it recover, it will rejoin the bond (in whatever
2632 manner is appropriate for the mode). See the sections on High
2633 Availability and the documentation for each mode for additional
2636 Link monitoring can be enabled via either the miimon or
2637 arp_interval parameters (described in the module parameters section,
2638 above). In general, miimon monitors the carrier state as sensed by
2639 the underlying network device, and the arp monitor (arp_interval)
2640 monitors connectivity to another host on the local network.
2642 If no link monitoring is configured, the bonding driver will
2646 depends upon the bonding mode and network configuration.
2650 Yes. See the section on High Availability for details.
2654 The full answer to this depends upon the desired mode.
2656 In the basic balance modes (balance-rr and balance-xor), it
2663 support specific features (described in the appropriate section under
2675 the fail_over_mac option is enabled, the bonding device's MAC address is
2676 the MAC address of the active slave.
2679 ifconfig or ip link), the MAC address of the bonding device is taken from
2681 slaves and remains persistent (even if the first slave is removed) until
2682 the bonding device is brought down or reconfigured.
2684 If you wish to change the MAC address, you can set it with
2691 The MAC address can also be changed by bringing down/up the
2698 This method will automatically take the address from the next
2702 from the bond (`ifenslave -d bond0 eth0'). The bonding driver will
2703 then restore the MAC addresses that the slaves had before they were
2709 The latest version of the bonding driver can be found in the latest
2710 version of the Linux kernel, found on http://kernel.org
2712 The latest version of this document can be found in the latest kernel
2715 Discussions regarding the usage of the bonding driver take place on the
2717 problems, post them to the list. The list address is:
2726 Discussions regarding the development of the bonding driver take place
2727 on the main Linux network mailing list, hosted at vger.kernel.org. The list