.. SPDX-License-Identifier: GPL-2.0

=====================
Segmentation Offloads
=====================


Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL
 * SCTP acceleration with GSO - GSO_BY_FRAGS


TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.
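
The arithmetic implied by gso_size can be illustrated with a minimal
userspace sketch. This is not kernel code: the helper names
tso_segment_count() and tso_segment_len() are hypothetical, and the sketch
assumes that only the TCP payload length matters, with headers replicated
separately for each frame.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of frames produced when payload_len bytes
 * of TCP payload are split into pieces of at most gso_size bytes.
 */
size_t tso_segment_count(size_t payload_len, size_t gso_size)
{
	assert(gso_size != 0);	/* gso_size must be non-zero for TSO */
	return (payload_len + gso_size - 1) / gso_size;
}

/*
 * Hypothetical helper: payload bytes carried by segment idx (0-based).
 * Every segment carries gso_size bytes except possibly the last one;
 * an index past the end yields 0.
 */
size_t tso_segment_len(size_t payload_len, size_t gso_size, size_t idx)
{
	size_t offset;

	assert(gso_size != 0);
	offset = idx * gso_size;
	if (offset >= payload_len)
		return 0;
	if (payload_len - offset < gso_size)
		return payload_len - offset;
	return gso_size;
}
```

For example, 4000 bytes of payload with a gso_size of 1448 yields three
frames in this model, the last one carrying the 1104-byte remainder.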

TCP segmentation is dependent on support for the use of partial checksum
offload. For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.

For IPv4 segmentation we support one of two types in terms of the IP ID.
The default behavior is to increment the IP ID with every segment. If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID.

For encapsulated packets, SKB_GSO_TCP_FIXEDID refers only to the outer header.
SKB_GSO_TCP_FIXEDID_INNER can be used to specify the same for the inner header.
Any combination of these two GSO types is allowed.
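
The two IP ID behaviors, and their independent application to outer and
inner headers, can be sketched as follows. This is a userspace model under
stated assumptions, not the kernel implementation: struct ip_ids and
tunnel_segment_ip_ids() are hypothetical names, and the two boolean
parameters stand in for SKB_GSO_TCP_FIXEDID and SKB_GSO_TCP_FIXEDID_INNER
being set in gso_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of IPv4 IDs for an encapsulated packet. */
struct ip_ids {
	uint16_t outer;
	uint16_t inner;
};

/*
 * Compute the IPv4 IDs of segment idx (0-based) from the IDs of the
 * first segment. A "fixed" header keeps the same ID on every segment;
 * otherwise the ID increments per segment and wraps modulo 2^16.
 */
void tunnel_segment_ip_ids(const struct ip_ids *first, unsigned int idx,
			   bool fixed_outer, bool fixed_inner,
			   struct ip_ids *out)
{
	assert(first != NULL && out != NULL);

	out->outer = fixed_outer ? first->outer
				 : (uint16_t)(first->outer + idx);
	out->inner = fixed_inner ? first->inner
				 : (uint16_t)(first->inner + idx);
}
```

Since the two flags are handled independently, all four combinations are
expressible, which matches the statement that any combination of the two
GSO types is allowed.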

If a device has NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when
performing TSO and we will either increment the IP ID for all frames, or leave
it at a static value based on driver preference. For encapsulated packets,
NETIF_F_TSO_MANGLEID is relevant for both outer and inner headers, unless the
DF bit is not set on the outer header, in which case the device driver must
guarantee that the IP ID field is incremented in the outer header with every
segment.


UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as TSO. However, the IPv4 ID for
fragments should not increment, as a single IPv4 datagram is fragmented.
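
The difference from TSO can be made concrete with a small userspace sketch
of IPv4 fragmentation. It is an illustration only: struct ipv4_frag and
udp_fragment() are hypothetical names, and the sketch assumes a 20-byte
IPv4 header without options. It relies on the standard IPv4 rules that
fragment offsets are expressed in 8-byte units and that the MF flag is set
on every fragment except the last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one IPv4 fragment of a UDP datagram. */
struct ipv4_frag {
	uint16_t id;		/* identical for every fragment */
	uint16_t frag_off8;	/* offset into the datagram, 8-byte units */
	uint16_t len;		/* bytes of the datagram carried */
	bool more_fragments;	/* MF flag: set on all but the last */
};

/*
 * Describe how datagram_len bytes (UDP header plus payload) would be
 * fragmented for the given MTU. Returns the number of descriptors
 * written, which is at most max_out.
 */
size_t udp_fragment(uint16_t id, size_t datagram_len, size_t mtu,
		    struct ipv4_frag *out, size_t max_out)
{
	const size_t ip_hdr_len = 20;	/* assumption: no IPv4 options */
	size_t per_frag, off = 0, n = 0;

	assert(out != NULL);
	assert(mtu >= ip_hdr_len + 8 && mtu <= 65535);
	assert(datagram_len <= 65535 - ip_hdr_len);

	/* Non-final fragments must carry a multiple of 8 bytes. */
	per_frag = (mtu - ip_hdr_len) & ~(size_t)7;

	while (off < datagram_len && n < max_out) {
		size_t left = datagram_len - off;
		size_t len = left < per_frag ? left : per_frag;

		out[n].id = id;	/* never incremented between fragments */
		out[n].frag_off8 = (uint16_t)(off / 8);
		out[n].len = (uint16_t)len;
		out[n].more_fragments = (off + len < datagram_len);
		off += len;
		n++;
	}
	return n;
}
```

With a 3008-byte datagram and an MTU of 1500, this model produces three
fragments of 1480, 1480 and 48 bytes, all sharing one IPv4 ID.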

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.


IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel. In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
cases where there is more than one set of headers. For example, in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported. The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers. Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel::

             Outer                  Inner
  MAC        skb_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header

UDP/GRE Tunnel::

             Outer                  Inner
  MAC        skb_mac_header         skb_inner_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header   skb_inner_transport_header
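
To illustrate how the six offsets of the UDP tunnel case relate to one
another, the following userspace sketch computes them for one assumed frame
layout. It is not how the kernel populates an skb: struct pkt_model and
model_set_udp_tunnel_offsets() are hypothetical, and the layout assumes
14-byte Ethernet headers, 20-byte IPv4 headers without options, and an
8-byte outer UDP header followed by a tunnel header of caller-supplied
length.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-in for the header offsets tracked in an skb. The field
 * names mirror the accessors listed in the UDP/GRE table.
 */
struct pkt_model {
	size_t mac_off;			/* skb_mac_header */
	size_t network_off;		/* skb_network_header */
	size_t transport_off;		/* skb_transport_header */
	size_t inner_mac_off;		/* skb_inner_mac_header */
	size_t inner_network_off;	/* skb_inner_network_header */
	size_t inner_transport_off;	/* skb_inner_transport_header */
};

/* Fill in the offsets for the assumed UDP tunnel layout. */
void model_set_udp_tunnel_offsets(struct pkt_model *p, size_t tunnel_hdr_len)
{
	assert(p != NULL);

	p->mac_off = 0;
	p->network_off = p->mac_off + 14;	/* outer Ethernet */
	p->transport_off = p->network_off + 20;	/* outer IPv4 */
	/* skip the outer UDP header and the tunnel header itself */
	p->inner_mac_off = p->transport_off + 8 + tunnel_hdr_len;
	p->inner_network_off = p->inner_mac_off + 14;	/* inner Ethernet */
	p->inner_transport_off = p->inner_network_off + 20; /* inner IPv4 */
}
```

With an 8-byte tunnel header, the inner MAC header starts at byte 50 and
the inner transport header at byte 84 in this model.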

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types indicate that
the outer header also requests a non-zero checksum.

Finally there is SKB_GSO_TUNNEL_REMCSUM, which indicates that a given tunnel
header has requested a remote checksum offload. In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.


Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above. In GSO, the data of a given skbuff is split across multiple skbuffs
that have been resized to match the MSS provided via
skb_shinfo()->gso_size.

Before enabling any hardware segmentation offload a corresponding software
offload is required in GSO. Otherwise a frame could be re-routed between
devices and end up impossible to transmit.


Generic Receive Offload
=======================

Generic receive offload is the complement to GSO. Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO.
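
The round-trip property described above can be sketched with a toy
userspace model that ignores headers entirely and treats a frame as a flat
payload. The names struct seg, model_gso() and model_gro() are
hypothetical, and real GRO additionally has to match flows and headers
before merging, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SEG_LEN 64	/* arbitrary capacity chosen for this model */

/* One segment holding at most MAX_SEG_LEN payload bytes. */
struct seg {
	size_t len;
	unsigned char data[MAX_SEG_LEN];
};

/*
 * "GSO": split len bytes of src into segments of at most mss bytes.
 * Returns the number of segments, or 0 if the input does not fit.
 */
size_t model_gso(const unsigned char *src, size_t len, size_t mss,
		 struct seg *out, size_t max_out)
{
	size_t n = 0, off = 0;

	assert(src != NULL && out != NULL);
	if (mss == 0 || mss > MAX_SEG_LEN)
		return 0;

	while (off < len) {
		size_t chunk = len - off < mss ? len - off : mss;

		if (n == max_out)
			return 0;
		memcpy(out[n].data, src + off, chunk);
		out[n].len = chunk;
		off += chunk;
		n++;
	}
	return n;
}

/*
 * "GRO": coalesce n in-order segments into dst. Returns the merged
 * length, or 0 if dst (capacity cap) is too small.
 */
size_t model_gro(const struct seg *segs, size_t n,
		 unsigned char *dst, size_t cap)
{
	size_t off = 0;

	assert(segs != NULL && dst != NULL);
	for (size_t i = 0; i < n; i++) {
		if (segs[i].len > cap - off)
			return 0;
		memcpy(dst + off, segs[i].data, segs[i].len);
		off += segs[i].len;
	}
	return off;
}
```

Segmenting a buffer with model_gso() and feeding the result to model_gro()
reproduces the original bytes, which is the idealized relationship between
the two offloads.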


Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO. It
takes advantage of certain traits of TCP and tunnels so that, instead of
rewriting all of the packet headers for each segment, only the inner-most
transport header and possibly the outer-most network header need to be
updated. This allows devices that do not support tunnel offloads or tunnel
offloads with checksum to still make use of segmentation.

With the partial offload, all headers excluding the inner transport header
are updated such that they contain the values they would need if the header
were simply duplicated. The one exception to this is the outer IPv4 ID
field. It is up to the device drivers to guarantee that the IPv4 ID field
is incremented in the case that a given header does not have the DF bit
set.
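
The outer IPv4 ID rule can be sketched in userspace as follows. This is a
model of the requirement, not driver code: partial_gso_outer_ids() is a
hypothetical name, the frag_off argument is assumed to be in host byte
order, and in the DF-set case the sketch simply replicates the ID, which
is only one of the behaviors a driver may choose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_IP_DF 0x4000	/* "don't fragment" flag in frag_off */

/* The outer IPv4 ID must increment when the DF bit is not set. */
bool outer_id_must_increment(uint16_t frag_off_host)
{
	return !(frag_off_host & MODEL_IP_DF);
}

/*
 * Fill ids[0..n-1] with the outer IPv4 ID of each segment. Without DF
 * the ID increments per segment (wrapping modulo 2^16); with DF this
 * model keeps the duplicated header value unchanged.
 */
void partial_gso_outer_ids(uint16_t base_id, uint16_t frag_off_host,
			   uint16_t *ids, size_t n)
{
	assert(ids != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (outer_id_must_increment(frag_off_host))
			ids[i] = (uint16_t)(base_id + i);
		else
			ids[i] = base_id;
	}
}
```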


SCTP acceleration with GSO
===========================

SCTP - despite the lack of hardware support - can still take advantage of
GSO to pass one large packet through the network stack, rather than
multiple small packets.

This requires a different approach to other offloads, as SCTP packets
cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
IP segments, padding respected. So unlike regular GSO, SCTP can't just
generate a big skb, set gso_size to the fragmentation point and deliver it
to IP layer.

Instead, the SCTP protocol layer builds an skb with the segments correctly
padded and stored as chained skbs, and skb_segment() splits based on those.
To signal this, gso_size is set to the special value GSO_BY_FRAGS.

Therefore, any code in the core networking stack must be aware of the
possibility that gso_size will be GSO_BY_FRAGS and handle that case
appropriately.

There are some helpers to make this easier:

- skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
  an skb is an SCTP GSO skb.

- For size checks, the skb_gso_validate_*_len family of helpers correctly
  considers GSO_BY_FRAGS.

- For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
  will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.

This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.
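
The reason the sentinel needs special handling can be seen in a short
userspace sketch: treating GSO_BY_FRAGS as an ordinary size and doing
arithmetic on it would silently corrupt the marker. The code below is a
model only. struct shinfo_model and model_increase_gso_size() are
hypothetical stand-ins, the sentinel value 0xFFFF is an assumption made for
this sketch, and a boolean return replaces the WARN that the kernel helpers
are described as issuing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_GSO_BY_FRAGS 0xFFFF	/* sentinel value assumed here */

/* Toy stand-in for the gso_size field reached via skb_shinfo(). */
struct shinfo_model {
	uint16_t gso_size;
};

/* True if segmentation follows the chained skbs, not gso_size. */
bool model_is_by_frags(const struct shinfo_model *sh)
{
	assert(sh != NULL);
	return sh->gso_size == MODEL_GSO_BY_FRAGS;
}

/*
 * Grow gso_size by increment. Refuses to touch the sentinel and
 * returns false in that case, leaving the field unchanged.
 */
bool model_increase_gso_size(struct shinfo_model *sh, uint16_t increment)
{
	assert(sh != NULL);
	if (model_is_by_frags(sh))
		return false;
	sh->gso_size = (uint16_t)(sh->gso_size + increment);
	return true;
}
```

Code that adjusts gso_size, for example after adding or removing headers,
would check the sentinel first in the same way.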