.. SPDX-License-Identifier: GPL-2.0

=================
Ethernet Bridging
=================

Introduction
============

The IEEE 802.1Q-2022 (Bridges and Bridged Networks) standard defines the
operation of bridges in computer networks. A bridge, in the context of this
standard, is a device that connects two or more network segments and operates
at the data link layer (Layer 2) of the OSI (Open Systems Interconnection)
model. The purpose of a bridge is to filter and forward frames between
different segments based on the destination MAC (Media Access Control) address.

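The filter-and-forward behaviour described above can be illustrated with a
toy model. This is only a sketch, not the kernel implementation: the class
name ``LearningBridge``, the port names and the MAC addresses are all
hypothetical, and real bridges also age out learned entries.

```python
class LearningBridge:
    """Toy Layer 2 bridge: learns source MACs, forwards by destination MAC."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the set of ports the frame is sent out of."""
        self.fdb[src_mac] = in_port          # learn where the sender lives
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            return self.ports - {in_port}    # unknown destination: flood
        if out_port == in_port:
            return set()                     # same segment: filter the frame
        return {out_port}                    # known destination: forward


bridge = LearningBridge(["eth0", "eth1", "eth2"])
flooded = bridge.receive("eth0", "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")
forwarded = bridge.receive("eth1", "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")
print(sorted(flooded), sorted(forwarded))
```

The first frame is flooded because its destination has not been seen yet;
the reply is sent only to the port where the first sender was learned.
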
Bridge kAPI
===========

Here are some core structures of the bridge code. Note that the kAPI is
*unstable* and can be changed at any time.

.. kernel-doc:: net/bridge/br_private.h
   :identifiers: net_bridge_vlan

Bridge uAPI
===========

The modern Linux bridge uAPI is accessed via the Netlink interface. The files
listed below define the bridge and bridge port netlink attributes.

Bridge netlink attributes
-------------------------

.. kernel-doc:: include/uapi/linux/if_link.h
   :doc: Bridge enum definition

Bridge port netlink attributes
------------------------------

.. kernel-doc:: include/uapi/linux/if_link.h
   :doc: Bridge port enum definition

Bridge sysfs
------------

The sysfs interface is deprecated and should not be extended when new
options are added.

STP
===

The STP (Spanning Tree Protocol) implementation in the Linux bridge driver
is a critical feature that helps prevent loops and broadcast storms in
Ethernet networks by identifying and disabling redundant links. In a Linux
bridge context, STP is crucial for network stability and availability.

STP is a Layer 2 protocol that operates at the Data Link Layer of the OSI
model. It was originally developed as IEEE 802.1D and has since evolved into
multiple versions, including Rapid Spanning Tree Protocol (RSTP) and
`Multiple Spanning Tree Protocol (MSTP)
<https://lore.kernel.org/netdev/20220316150857.2442916-1-tobias@waldekranz.com/>`_.

IEEE 802.1D-2004 removed the original Spanning Tree Protocol and
incorporated the Rapid Spanning Tree Protocol (RSTP) instead. By 2014, all the
functionality defined by IEEE 802.1D had been incorporated into either
IEEE 802.1Q (Bridges and Bridged Networks) or IEEE 802.1AC (MAC Service
Definition). 802.1D was officially withdrawn in 2022.

Bridge Ports and STP States
---------------------------

In the context of STP, bridge ports can be in one of the following states:

* Blocking: The port is disabled for data traffic and only listens for
  BPDUs (Bridge Protocol Data Units) from other devices to determine the
  network topology.
* Listening: The port begins to participate in the STP process and listens
  for BPDUs.
* Learning: The port continues to listen for BPDUs and begins to learn MAC
  addresses from incoming frames but does not forward data frames.
* Forwarding: The port is fully operational and forwards both BPDUs and
  data frames.
* Disabled: The port is administratively disabled and does not participate
  in the STP process. Data frame forwarding is also disabled.

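As a rough illustration, the states above can be written as an enumeration.
The numeric values follow the ``BR_STATE_*`` constants in
``include/uapi/linux/if_bridge.h``; the two helper functions are hypothetical
names used only for this sketch, and treating a Forwarding port as one that
also learns addresses is an assumption drawn from "fully operational".

```python
from enum import IntEnum


class StpState(IntEnum):
    # Values mirror the BR_STATE_* constants in include/uapi/linux/if_bridge.h.
    DISABLED = 0
    LISTENING = 1
    LEARNING = 2
    FORWARDING = 3
    BLOCKING = 4


def learns_addresses(state):
    """MAC learning happens in the Learning and Forwarding states."""
    return state in (StpState.LEARNING, StpState.FORWARDING)


def forwards_data(state):
    """Only a Forwarding port passes data frames."""
    return state == StpState.FORWARDING


for state in StpState:
    print(f"{state.name:<10} learn={learns_addresses(state)} "
          f"forward={forwards_data(state)}")
```
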
Root Bridge and Convergence
---------------------------

In the context of networking and Ethernet bridging in Linux, the root bridge
is a designated switch in a bridged network that serves as a reference point
for the spanning tree algorithm to create a loop-free topology.

Here is how STP works and how the root bridge is chosen:

1. Bridge Priority: Each bridge running a spanning tree protocol has a
   configurable Bridge Priority value. The lower the value, the higher the
   priority. By default, the Bridge Priority is set to a standard value
   (e.g., 32768).
2. Bridge ID: The Bridge ID is composed of two components: Bridge Priority
   and the MAC address of the bridge. It uniquely identifies each bridge
   in the network. The Bridge ID is used to compare the priorities of
   different bridges.
3. Bridge Election: When the network starts, all bridges initially assume
   that they are the root bridge. They start advertising Bridge Protocol
   Data Units (BPDUs) to their neighbors, containing their Bridge ID and
   other information.
4. BPDU Comparison: Bridges exchange BPDUs to determine the root bridge.
   Each bridge examines the received BPDUs, including the Bridge Priority
   and Bridge ID, to determine if it should adjust its own priorities.
   The bridge with the lowest Bridge ID will become the root bridge.
5. Root Bridge Announcement: Once the root bridge is determined, it sends
   BPDUs with information about the root bridge to all other bridges in the
   network. This information is used by other bridges to calculate the
   shortest path to the root bridge and, in doing so, create a loop-free
   topology.
6. Forwarding Ports: After the root bridge is selected and the spanning tree
   topology is established, each bridge determines which of its ports should
   be in the forwarding state (used for data traffic) and which should be in
   the blocking state (used to prevent loops). The root bridge's ports are
   all in the forwarding state, while other bridges have some ports in the
   blocking state to avoid loops.
7. Root Ports: After the root bridge is selected and the spanning tree
   topology is established, each non-root bridge processes incoming
   BPDUs and determines which of its ports provides the shortest path to the
   root bridge based on the information in the received BPDUs. This port is
   designated as the root port and is placed in the Forwarding state,
   allowing it to actively forward network traffic.
8. Designated ports: A designated port is the port through which the non-root
   bridge will forward traffic towards the designated segment. Designated ports
   are placed in the Forwarding state. All other ports on the non-root
   bridge that are not designated for specific segments are placed in the
   Blocking state to prevent network loops.

STP ensures network convergence by calculating the shortest path and disabling
redundant links. When network topology changes occur (e.g., a link failure),
STP recalculates the network topology to restore connectivity while avoiding
loops.

Proper configuration of STP parameters, such as the bridge priority, can
influence network performance, path selection and which bridge becomes the
Root Bridge.

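The election rule in steps 1 to 4 and the root port choice in step 7 can be
sketched as plain comparisons. This is a simplified model, not the kernel
code: the ``Bridge`` class, the bridge names, the MAC addresses and the path
costs are hypothetical, and breaking path cost ties by port name is an
assumption made only to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    name: str
    priority: int
    mac: str

    @property
    def bridge_id(self):
        # Priority is compared first; the MAC address breaks ties.
        return (self.priority, bytes.fromhex(self.mac.replace(":", "")))


def elect_root(bridges):
    """The bridge with the lowest Bridge ID becomes the root bridge."""
    return min(bridges, key=lambda b: b.bridge_id)


def pick_root_port(path_costs):
    """path_costs maps a port name to its total path cost to the root."""
    return min(path_costs, key=lambda port: (path_costs[port], port))


bridges = [
    Bridge("br-a", 32768, "00:00:5e:00:53:0a"),
    Bridge("br-b", 32768, "00:00:5e:00:53:02"),
    Bridge("br-c", 4096, "00:00:5e:00:53:ff"),
]
root = elect_root(bridges)
root_port = pick_root_port({"eth0": 19, "eth1": 4})
print(root.name, root_port)
```

Lowering the priority value of ``br-c`` makes it win the election even though
its MAC address is numerically the highest, which is how an administrator can
steer which bridge becomes the root.
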
User space STP helper
---------------------

The user space STP helper *bridge-stp* is a program that controls whether
user mode spanning tree is used. The kernel calls
``/sbin/bridge-stp <bridge> <start|stop>`` when STP is enabled or disabled on
a bridge (via ``brctl stp <bridge> <on|off>`` or ``ip link set <bridge> type
bridge stp_state <0|1>``). The kernel enables user_stp mode if that command
returns 0, or enables kernel_stp mode if that command returns any other value.

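The decision described above can be mimicked in user space as follows. This
is only an illustration of the exit status rule; the function names are
hypothetical, and treating a helper that cannot be executed as a failure is
an assumption of this sketch.

```python
import subprocess


def mode_from_exit_status(status):
    """Exit status 0 selects user space STP, anything else kernel STP."""
    return "user_stp" if status == 0 else "kernel_stp"


def start_stp(bridge, helper="/sbin/bridge-stp"):
    """Run '<helper> <bridge> start' and report the resulting STP mode."""
    try:
        status = subprocess.run([helper, bridge, "start"]).returncode
    except OSError:
        # Assumption: a missing or non-executable helper counts as failure.
        status = 1
    return mode_from_exit_status(status)


print(mode_from_exit_status(0))
print(mode_from_exit_status(1))
```
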
VLAN
====

A LAN (Local Area Network) is a network that covers a small geographic area,
typically within a single building or a campus. LANs are used to connect
computers, servers, printers, and other networked devices within a localized
area. LANs can be wired (using Ethernet cables) or wireless (using Wi-Fi).

A VLAN (Virtual Local Area Network) is a logical segmentation of a physical
network into multiple isolated broadcast domains. VLANs are used to divide
a single physical LAN into multiple virtual LANs, allowing different groups of
devices to communicate as if they were on separate physical networks.

Typically there are two VLAN implementations, IEEE 802.1Q and IEEE 802.1ad
(also known as QinQ). IEEE 802.1Q is a standard for VLAN tagging in Ethernet
networks. It allows network administrators to create logical VLANs on a
physical network and tag Ethernet frames with VLAN information; such frames
are called *VLAN-tagged frames*. IEEE 802.1ad, commonly known as QinQ or
Double VLAN, is an extension of the IEEE 802.1Q standard. QinQ allows for the
stacking of multiple VLAN tags within a single Ethernet frame. The Linux
bridge supports both the IEEE 802.1Q and `802.1ad
<https://lore.kernel.org/netdev/1402401565-15423-1-git-send-email-makita.toshiaki@lab.ntt.co.jp/>`_
protocols for VLAN tagging.

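To make the tagging and stacking concrete, the sketch below walks the VLAN
tags of a raw Ethernet frame. It assumes the usual TPID values (0x8100 for an
802.1Q tag, 0x88A8 for an 802.1ad service tag) and the standard TCI layout of
a 3-bit priority, a 1-bit drop-eligible indicator and a 12-bit VLAN ID. The
function name and the sample frames are hypothetical.

```python
import struct

TPID_8021Q = 0x8100   # customer VLAN tag (IEEE 802.1Q)
TPID_8021AD = 0x88A8  # service VLAN tag (IEEE 802.1ad, outer tag in QinQ)


def parse_vlan_tags(frame):
    """Return a list of (tpid, pcp, dei, vid), outermost tag first."""
    tags = []
    offset = 12  # skip the destination and source MAC addresses
    while len(frame) >= offset + 4:
        tpid, tci = struct.unpack_from("!HH", frame, offset)
        if tpid not in (TPID_8021Q, TPID_8021AD):
            break  # reached the real EtherType, no more tags
        tags.append((tpid, tci >> 13, (tci >> 12) & 1, tci & 0x0FFF))
        offset += 4
    return tags


macs = bytes(12)  # zeroed MAC addresses, good enough for the example
single = macs + struct.pack("!HH", TPID_8021Q, (3 << 13) | 100) + b"\x08\x00"
qinq = (macs + struct.pack("!HHHH", TPID_8021AD, 200, TPID_8021Q, 100)
        + b"\x08\x00")
print(parse_vlan_tags(single))
print(parse_vlan_tags(qinq))
```
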
`VLAN filtering <https://lore.kernel.org/netdev/1360792820-14116-1-git-send-email-vyasevic@redhat.com/>`_
on a bridge is disabled by default. After enabling VLAN filtering on a bridge,
it will start forwarding frames to appropriate destinations based on their
destination MAC address and VLAN tag (both must match).

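The "both must match" rule can be pictured as a forwarding table keyed by the
pair of MAC address and VLAN ID. The following is a toy model under that
assumption; the ``lookup`` function, the table contents and the use of VLAN
ID 0 for entries learned without filtering are illustrative choices, not a
description of the kernel data structures.

```python
def lookup(fdb, vlan_filtering, dst_mac, vid):
    """Return the egress port, or None if no entry matches."""
    # With filtering enabled the VLAN ID is part of the lookup key;
    # without it, only the MAC address matters (modelled here as VID 0).
    key = (dst_mac, vid) if vlan_filtering else (dst_mac, 0)
    return fdb.get(key)


fdb = {
    ("aa:aa:aa:aa:aa:01", 10): "eth1",
    ("aa:aa:aa:aa:aa:01", 0): "eth2",
}
print(lookup(fdb, True, "aa:aa:aa:aa:aa:01", 10))
print(lookup(fdb, True, "aa:aa:aa:aa:aa:01", 20))
print(lookup(fdb, False, "aa:aa:aa:aa:aa:01", 20))
```
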
Multicast
=========

The Linux bridge driver has multicast support allowing it to process Internet
Group Management Protocol (IGMP) or Multicast Listener Discovery (MLD)
messages, and to efficiently forward multicast data packets. The bridge
driver supports IGMPv2/IGMPv3 and MLDv1/MLDv2.

Multicast snooping
------------------

Multicast snooping is a networking technology that allows network switches
to intelligently manage multicast traffic within a local area network (LAN).

The switch maintains a multicast group table, which records the association
between multicast group addresses and the ports where hosts have joined these
groups. The group table is dynamically updated based on the IGMP/MLD messages
received. With the multicast group information gathered through snooping, the
switch optimizes the forwarding of multicast traffic. Instead of blindly
broadcasting the multicast traffic to all ports, it sends the multicast
traffic based on the destination MAC address only to ports which have
subscribed to the respective destination multicast group.

Linux bridge devices have multicast snooping enabled by default when they are
created. Each bridge maintains a multicast forwarding database (MDB) which
keeps track of port and group relationships.

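A minimal sketch of such a group table is shown below. It is a simplification
for illustration: the class and method names are hypothetical, flooding
traffic for unknown groups is an assumption of this model, and multicast
router ports and timers are left out entirely.

```python
from collections import defaultdict


class SnoopingTable:
    """Toy multicast group table: group address -> subscribed ports."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.groups = defaultdict(set)

    def join(self, group, port):
        # Called when an IGMP/MLD report for the group is seen on the port.
        self.groups[group].add(port)

    def leave(self, group, port):
        self.groups[group].discard(port)
        if not self.groups[group]:
            del self.groups[group]  # drop groups with no members left

    def egress_ports(self, group, in_port):
        members = self.groups.get(group)
        if members is None:
            return self.ports - {in_port}  # unknown group: flood
        return members - {in_port}


table = SnoopingTable(["p1", "p2", "p3"])
before = table.egress_ports("239.1.1.1", "p1")
table.join("239.1.1.1", "p2")
after = table.egress_ports("239.1.1.1", "p1")
print(sorted(before), sorted(after))
```
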
IGMPv3/MLDv2 EHT support
------------------------

The Linux bridge supports IGMPv3/MLDv2 EHT (Explicit Host Tracking), which
was added by `474ddb37fa3a ("net: bridge: multicast: add EHT allow/block handling")
<https://lore.kernel.org/netdev/20210120145203.1109140-1-razor@blackwall.org/>`_.

Explicit host tracking enables the device to keep track of each
individual host that is joined to a particular group or channel. The main
benefit of explicit host tracking in IGMP is to allow minimal leave
latencies when a host leaves a multicast group or channel.

The length of time between a host wanting to leave and a device stopping
traffic forwarding is called the IGMP leave latency. A device configured
with IGMPv3 or MLDv2 and explicit tracking can immediately stop forwarding
traffic if the last host to request to receive traffic from the device
indicates that it no longer wants to receive traffic. The leave latency
is thus bounded only by the packet transmission latencies in the multiaccess
network and the processing time in the device.

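The reason per-host tracking shortens the leave latency can be shown with a
small model. This sketch is illustrative only: the ``HostTracker`` class and
its methods are hypothetical names, and source lists and timers are ignored.

```python
class HostTracker:
    """Toy explicit host tracking: (port, group) -> set of member hosts."""

    def __init__(self):
        self.hosts = {}

    def join(self, port, group, host):
        self.hosts.setdefault((port, group), set()).add(host)

    def leave(self, port, group, host):
        """Return True if forwarding to the port can stop immediately."""
        members = self.hosts.get((port, group), set())
        members.discard(host)
        if members:
            return False  # other hosts on this port still want the group
        self.hosts.pop((port, group), None)
        return True       # last host left, no query round is needed


tracker = HostTracker()
tracker.join("p1", "239.1.1.1", "192.0.2.10")
tracker.join("p1", "239.1.1.1", "192.0.2.11")
first = tracker.leave("p1", "239.1.1.1", "192.0.2.10")
last = tracker.leave("p1", "239.1.1.1", "192.0.2.11")
print(first, last)
```

Because the tracker knows exactly which hosts remain, it does not have to
send a query and wait for answers before deciding that the port has no more
listeners.
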
Other multicast features
------------------------

The Linux bridge also supports `per-VLAN multicast snooping
<https://lore.kernel.org/netdev/20210719170637.435541-1-razor@blackwall.org/>`_,
which is disabled by default but can be enabled, and `Multicast Router Discovery
<https://lore.kernel.org/netdev/20190121062628.2710-1-linus.luessing@c0d3.blue/>`_,
which helps identify the location of multicast routers.

Switchdev
=========

Linux Bridge Switchdev is a feature in the Linux kernel that extends the
capabilities of the traditional Linux bridge to work more efficiently with
hardware switches that support switchdev. With Linux Bridge Switchdev, certain
networking functions like forwarding, filtering, and learning of Ethernet
frames can be offloaded to a hardware switch. This offloading reduces the
burden on the Linux kernel and CPU, leading to improved network performance
and lower latency.

To use Linux Bridge Switchdev, you need hardware switches that support the
switchdev interface. This means that the switch hardware needs to have the
necessary drivers and functionality to work in conjunction with the Linux
kernel.

Please see the :ref:`switchdev` document for more details.

Netfilter
=========

The bridge netfilter module is a legacy feature that allows bridged packets
to be filtered with iptables and ip6tables. Its use is discouraged. Users
should consider using nftables for packet filtering.

The older ebtables tool is more feature-limited compared to nftables, but
just like nftables it does not need this module to function.

The br_netfilter module intercepts packets entering the bridge, performs
minimal sanity tests on IPv4 and IPv6 packets and then pretends that
these packets are being routed, not bridged. br_netfilter then calls
the IP and IPv6 netfilter hooks from the bridge layer, i.e. ip(6)tables
rulesets will also see these packets.

br_netfilter is also the reason for the iptables *physdev* match:
This match is the only way to reliably tell routed and bridged packets
apart in an iptables ruleset.

Note that ebtables and nftables will work fine without the br_netfilter module.
iptables/ip6tables/arptables do not work for bridged traffic because they
plug into the routing stack. nftables rules in ip/ip6/inet/arp families won't
see traffic that is forwarded by a bridge either, but that's very much how it
should be.

Historically the feature set of ebtables was very limited (it still is), so
this module was added to pretend packets are routed and invoke the IPv4/IPv6
netfilter hooks from the bridge so users had access to the more feature-rich
iptables matching capabilities (including conntrack). nftables doesn't have
this limitation; pretty much all features work regardless of the protocol
family.

So, br_netfilter is only needed if users, for some reason, need to use
ip(6)tables to filter packets forwarded by the bridge, or NAT bridged
traffic. For pure link layer filtering, this module isn't needed.

Other Features
==============

The Linux bridge also supports `IEEE 802.11 Proxy ARP
<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=958501163ddd6ea22a98f94fa0e7ce6d4734e5c4>`_,
`Media Redundancy Protocol (MRP)
<https://lore.kernel.org/netdev/20200426132208.3232-1-horatiu.vultur@microchip.com/>`_,
`Media Redundancy Protocol (MRP) LC mode
<https://lore.kernel.org/r/20201124082525.273820-1-horatiu.vultur@microchip.com>`_,
`IEEE 802.1X port authentication
<https://lore.kernel.org/netdev/20220218155148.2329797-1-schultz.hans+netdev@gmail.com/>`_,
and `MAC Authentication Bypass (MAB)
<https://lore.kernel.org/netdev/20221101193922.2125323-2-idosch@nvidia.com/>`_.

FAQ
===

What does a bridge do?
----------------------

A bridge transparently forwards traffic between multiple network interfaces.
In plain English this means that a bridge connects two or more physical
Ethernet networks, to form one larger (logical) Ethernet network.

Is it L3 protocol independent?
------------------------------

Yes. The bridge sees all frames, but it *uses* only L2 headers/information.
As such, the bridging functionality is protocol independent, and there should
be no trouble forwarding IPX, NetBEUI, IP, IPv6, etc.

Contact Info
============

The code is currently maintained by Roopa Prabhu <roopa@nvidia.com> and
Nikolay Aleksandrov <razor@blackwall.org>. Bridge bugs and enhancements
are discussed on the linux-netdev mailing list netdev@vger.kernel.org and
bridge@lists.linux.dev.

The list is open to anyone interested: http://vger.kernel.org/vger-lists.html#netdev

External Links
==============

The old documentation for Linux bridging is available at:
https://wiki.linuxfoundation.org/networking/bridge