.. SPDX-License-Identifier: GPL-2.0

========================================================
TCP Authentication Option Linux implementation (RFC5925)
========================================================

TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying
segments between trusted peers. It adds a new TCP header option with
a Message Authentication Code (MAC). MACs are produced from the content
of a TCP segment using a hashing function with a password known to both peers.
The intent of TCP-AO is to deprecate TCP-MD5 providing better security,
key rotation and support for a variety of hashing algorithms.

1. Introduction
===============

.. table:: Short and Limited Comparison of TCP-AO and TCP-MD5

    +----------------------+------------------------+-----------------------+
    |                      | TCP-MD5                | TCP-AO                |
    +======================+========================+=======================+
    |Supported hashing     |MD5                     |Must support HMAC-SHA1 |
    |algorithms            |(cryptographically weak)|(chosen-prefix attacks)|
    |                      |                        |and CMAC-AES-128 (only |
    |                      |                        |side-channel attacks). |
    |                      |                        |May support any hashing|
    |                      |                        |algorithm.             |
    +----------------------+------------------------+-----------------------+
    |Length of MACs (bytes)|16                      |Typically 12-16.       |
    |                      |                        |Other variants that fit|
    |                      |                        |TCP header permitted.  |
    +----------------------+------------------------+-----------------------+
    |Number of keys per    |1                       |Many                   |
    |TCP connection        |                        |                       |
    +----------------------+------------------------+-----------------------+
    |Possibility to change |Non-practical (both     |Supported by protocol  |
    |an active key         |peers have to change    |                       |
    |                      |them during MSL)        |                       |
    +----------------------+------------------------+-----------------------+
    |Protection against    |No                      |Yes: ignoring them     |
    |ICMP 'hard errors'    |                        |by default on          |
    |                      |                        |established connections|
    +----------------------+------------------------+-----------------------+
    |Protection against    |No                      |Yes: pseudo-header     |
    |traffic-crossing      |                        |includes TCP ports.    |
    |attack                |                        |                       |
    +----------------------+------------------------+-----------------------+
    |Protection against    |No                      |Sequence Number        |
    |replayed TCP segments |                        |Extension (SNE) and    |
    |                      |                        |Initial Sequence       |
    |                      |                        |Numbers (ISNs)         |
    +----------------------+------------------------+-----------------------+
    |Supports              |Yes                     |No. ISNs+SNE are needed|
    |Connectionless Resets |                        |to correctly sign RST. |
    +----------------------+------------------------+-----------------------+
    |Standards             |RFC 2385                |RFC 5925, RFC 5926     |
    +----------------------+------------------------+-----------------------+

1.1 Frequently Asked Questions (FAQ) with references to RFC 5925
----------------------------------------------------------------

Q: Can either SendID or RecvID be non-unique for the same 4-tuple
(srcaddr, srcport, dstaddr, dstport)?

A: No [3.1]::

   >> The IDs of MKTs MUST NOT overlap where their TCP connection
   identifiers overlap.
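
The rule quoted above can be illustrated with a minimal sketch. It assumes
a simplified MKT whose TCP connection identifier is just an IPv4 peer prefix
(ports are left out). ``struct mkt_id`` and the helper names are hypothetical
and are not part of the kernel or its uAPI:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical MKT description: an IPv4 peer prefix plus both KeyIDs. */
struct mkt_id {
	uint32_t peer;		/* peer address, host byte order */
	uint8_t prefix;		/* prefix length, 0..32 */
	uint8_t send_id;	/* SendID */
	uint8_t recv_id;	/* RecvID */
};

/* Network mask for a prefix length; a zero prefix matches everything. */
static uint32_t prefix_mask(uint8_t prefix)
{
	return prefix ? ~UINT32_C(0) << (32 - prefix) : 0;
}

/* Two prefixes overlap when the shorter one covers the longer one. */
static int peers_overlap(const struct mkt_id *a, const struct mkt_id *b)
{
	uint8_t shorter = a->prefix < b->prefix ? a->prefix : b->prefix;
	uint32_t mask = prefix_mask(shorter);

	return (a->peer & mask) == (b->peer & mask);
}

/* Returns 1 if the two MKTs may not coexist, 0 if they may. */
static int mkt_ids_conflict(const struct mkt_id *a, const struct mkt_id *b)
{
	assert(a->prefix <= 32 && b->prefix <= 32);

	if (!peers_overlap(a, b))
		return 0;	/* disjoint peers: IDs may repeat */
	return a->send_id == b->send_id || a->recv_id == b->recv_id;
}
```

Under these assumptions, a key for 10.0.0.0/8 with SendID 1 conflicts with
a key for 10.1.2.3/32 that also uses SendID 1, while the same IDs on
192.168.0.1/32 do not conflict.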

Q: Can Master Key Tuple (MKT) for an active connection be removed?

A: No, unless it's copied to Transport Control Block (TCB) [3.1]::

   It is presumed that an MKT affecting a particular connection cannot
   be destroyed during an active connection -- or, equivalently, that
   its parameters are copied to an area local to the connection (i.e.,
   instantiated) and so changes would affect only new connections.

Q: If an old MKT needs to be deleted, how should it be done in order
to not remove it for an active connection? (As it can be still in use
at any moment later)

A: Not specified by RFC 5925. It seems to be a task for key management
to ensure that no one uses such an MKT before trying to remove it.

Q: Can an old MKT exist forever and be used by another peer?

A: It can. It's a key management task to decide when to remove an old key [6.1]::

   Deciding when to start using a key is a performance issue. Deciding
   when to remove an MKT is a security issue. Invalid MKTs are expected
   to be removed. TCP-AO provides no mechanism to coordinate their removal,
   as we consider this a key management operation.

also [6.1]::

   The only way to avoid reuse of previously used MKTs is to remove the MKT
   when it is no longer considered permitted.

Linux TCP-AO will try its best to prevent you from removing a key that's
being used, considering it a key management failure. But keeping
an outdated key may become a security issue, and a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID. For that reason a forced key removal mechanism is provided:
userspace has to supply the KeyID to use instead of the one being removed,
and the kernel will atomically delete the old key, even if the peer is
still requesting it. There are no guarantees for force-delete: the peer
may not yet have the new key, in which case the TCP connection may just break.
Alternatively, one may choose to shut down the socket.

Q: What happens when a packet is received on a new connection with no known
MKT's RecvID?

A: RFC 5925 specifies that by default it is accepted with a warning logged, but
the behaviour can be configured by the user [7.5.1.a]::

   If the segment is a SYN, then this is the first segment of a new
   connection. Find the matching MKT for this segment, using the segment's
   socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection
   identifier and the MKT's RecvID.

      i. If there is no matching MKT, remove TCP-AO from the segment.
         Proceed with further TCP handling of the segment.
         NOTE: this presumes that connections that do not match any MKT
         should be silently accepted, as noted in Section 7.3.

[7.3]::

   >> A TCP-AO implementation MUST allow for configuration of the behavior
   of segments with TCP-AO but that do not match an MKT. The initial default
   of this configuration SHOULD be to silently accept such connections.
   If this is not the desired case, an MKT can be included to match such
   connections, or the connection can indicate that TCP-AO is required.
   Alternately, the configuration can be changed to discard segments with
   the AO option not matching an MKT.

[10.2.b]::

   Connections not matching any MKT do not require TCP-AO. Further, incoming
   segments with TCP-AO are not discarded solely because they include
   the option, provided they do not match any MKT.

Note that the Linux TCP-AO implementation differs in this respect: currently,
TCP-AO segments signed with an unknown key are discarded with warnings logged.

Q: Does the RFC imply centralized kernel key management in any way?
(i.e. that a key on all connections MUST be rotated at the same time?)

A: Not specified. MKTs can be managed in userspace; the only part relevant to
key changes is [7.3]::

   >> All TCP segments MUST be checked against the set of MKTs for matching
   TCP connection identifiers.

Q: What happens when RNextKeyID requested by a peer is unknown? Should
the connection be reset?

A: It should not; no action needs to be performed [7.5.2.e]::

   ii. If they differ, determine whether the RNextKeyID MKT is ready.

       1. If the MKT corresponding to the segment’s socket pair and RNextKeyID
       is not available, no action is required (RNextKeyID of a received
       segment needs to match the MKT’s SendID).

Q: How is current_key set, and when does it change? Is it a user-triggered
change, or is it triggered by a request from the remote peer? Is it set by the
user explicitly, or by a matching rule?

A: current_key is set by RNextKeyID [6.1]::

   Rnext_key is changed only by manual user intervention or MKT management
   protocol operation. It is not manipulated by TCP-AO. Current_key is updated
   by TCP-AO when processing received TCP segments as discussed in the segment
   processing description in Section 7.5. Note that the algorithm allows
   the current_key to change to a new MKT, then change back to a previously
   used MKT (known as "backing up"). This can occur during an MKT change when
   segments are received out of order, and is considered a feature of TCP-AO,
   because reordering does not result in drops.

[7.5.2.e.ii]::

   2. If the matching MKT corresponding to the segment’s socket pair and
   RNextKeyID is available:

      a. Set current_key to the RNextKeyID MKT.
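
The two RNextKeyID answers above (an unknown RNextKeyID requires no action,
a known one becomes current_key) can be sketched as follows. This is only
an illustration of the quoted RFC steps, not the kernel code: ``struct mkt``,
``struct ao_conn`` and both functions are hypothetical names, and the segment
is assumed to have passed verification already:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified MKT: only the two KeyIDs matter here. */
struct mkt {
	uint8_t send_id;	/* KeyID placed in outgoing segments */
	uint8_t recv_id;	/* KeyID expected in incoming segments */
};

/* Hypothetical per-connection state. */
struct ao_conn {
	struct mkt *keys;		/* MKTs matching the socket pair */
	size_t nkeys;
	struct mkt *current_key;	/* MKT that signs outgoing segments */
};

/* Find the MKT whose SendID equals the RNextKeyID requested by the peer. */
static struct mkt *mkt_by_send_id(struct ao_conn *c, uint8_t id)
{
	for (size_t i = 0; i < c->nkeys; i++)
		if (c->keys[i].send_id == id)
			return &c->keys[i];
	return NULL;
}

/* Returns 1 if current_key was switched, 0 if nothing changed. */
static int ao_handle_rnext(struct ao_conn *c, uint8_t rnext_key_id)
{
	struct mkt *wanted;

	assert(c != NULL);

	if (c->current_key && c->current_key->send_id == rnext_key_id)
		return 0;	/* the peer asks for the key already in use */

	wanted = mkt_by_send_id(c, rnext_key_id);
	if (!wanted)
		return 0;	/* unknown RNextKeyID: no action, no reset */

	c->current_key = wanted;	/* may also switch back to an older MKT */
	return 1;
}
```

Because the switch is driven purely by the received RNextKeyID, a reordered
segment can move current_key back to a previously used MKT, which is
the "backing up" behaviour described in [6.1].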

Q: If both peers have multiple MKTs matching the connection's socket pair
(with different KeyIDs), how should the sender/receiver pick the KeyID to use?

A: Some mechanism should pick the "desired" MKT [3.3]::

   Multiple MKTs may match a single outgoing segment, e.g., when MKTs
   are being changed. Those MKTs cannot have conflicting IDs (as noted
   elsewhere), and some mechanism must determine which MKT to use for each
   given outgoing segment.

   >> An outgoing TCP segment MUST match at most one desired MKT, indicated
   by the segment’s socket pair. The segment MAY match multiple MKTs, provided
   that exactly one MKT is indicated as desired. Other information in
   the segment MAY be used to determine the desired MKT when multiple MKTs
   match; such information MUST NOT include values in any TCP option fields.

Q: Can a TCP-MD5 connection migrate to TCP-AO (and vice versa)?

A: No [1]::

   TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5
   does not support any changes to a connection’s security algorithm
   once established.

Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed
connection?

A: [7.5.2] doesn't offer the same choice as SYN packet handling in [7.5.1.i],
which would allow accepting unsigned segments (and would be insecure).
While switching to a non-TCP-AO connection is not prohibited directly, that
seems to be what the RFC intends. Also, there's a requirement for TCP-AO
connections to always have one current_key [3.3]::

   TCP-AO requires that every protected TCP segment match exactly one MKT.

[3.3]::

   >> An incoming TCP segment including TCP-AO MUST match exactly one MKT,
   indicated solely by the segment’s socket pair and its TCP-AO KeyID.

[4.4]::

   One or more MKTs. These are the MKTs that match this connection’s
   socket pair.

Q: Can a non-TCP-AO connection become a TCP-AO-enabled one?

A: No: for an already established non-TCP-AO connection it would be impossible
to switch to using TCP-AO, as the traffic key generation requires the initial
sequence numbers. In other words, starting to use TCP-AO would require
re-establishing the TCP connection.

2. In-kernel MKTs database vs database in userspace
===================================================

Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way
to TCP-MD5. It means that a userspace application that wants to use TCP-AO
should perform ``setsockopt()`` on a TCP socket when it wants to add,
remove or rotate MKTs. This approach moves the key management responsibility
to userspace, along with decisions on corner cases such as what to do if
the peer doesn't respect RNextKeyID. More of the code, especially the part
responsible for policy decisions, therefore lives in userspace. Besides, it's
flexible and scales well (with less locking needed than in the case of
an in-kernel database). One should also keep in mind that the main intended
users are BGP processes, not arbitrary applications, which means that,
compared to IPsec tunnels, no transparency is really needed, and modern BGP
daemons already have ``setsockopt()s`` for TCP-MD5 support.

.. table:: Considered pros and cons of the approaches

    +----------------------+------------------------+-----------------------+
    |                      | ``setsockopt()``       | in-kernel DB          |
    +======================+========================+=======================+
    | Extendability        | ``setsockopt()``       | Netlink messages are  |
    |                      | commands should be     | simple and extendable |
    |                      | extendable syscalls    |                       |
    +----------------------+------------------------+-----------------------+
    | Required userspace   | BGP or any application | could be transparent  |
    | changes              | that wants TCP-AO needs| as tunnels, providing |
    |                      | to perform             | something like        |
    |                      | ``setsockopt()s``      | ``ip tcpao add key``  |
    |                      | and do key management  | (delete/show/rotate)  |
    +----------------------+------------------------+-----------------------+
    |MKTs removal or adding| harder for userspace   | harder for kernel     |
    +----------------------+------------------------+-----------------------+
    | Dump-ability         | ``getsockopt()``       | Netlink .dump()       |
    |                      |                        | callback              |
    +----------------------+------------------------+-----------------------+
    | Limits on kernel     | equal                                          |
    | resources/memory     |                                                |
    +----------------------+------------------------+-----------------------+
    | Scalability          | contention on          | contention on         |
    |                      | ``TCP_LISTEN`` sockets | the whole database    |
    +----------------------+------------------------+-----------------------+
    | Monitoring & warnings| ``TCP_DIAG``           | same Netlink socket   |
    +----------------------+------------------------+-----------------------+
    | Matching of MKTs     | half-problem: only     | hard                  |
    |                      | listen sockets         |                       |
    +----------------------+------------------------+-----------------------+

3. uAPI
=======

Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let
userspace manage TCP-AO on a per-socket basis. To add or delete MKTs, the
``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used.
It is not allowed to add a key on an established non-TCP-AO connection,
nor to remove the last key from a TCP-AO connection.
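
A minimal sketch of adding one MKT for an IPv4 peer is shown below. It
assumes the ``struct tcp_ao_add`` layout from ``<linux/tcp.h>`` in headers
that already carry TCP-AO support, and falls back to ``ENOPROTOOPT`` when
they do not. The helper name ``ao_add_key_v4()`` and the chosen algorithm,
MAC length and prefix are illustrative assumptions, not recommendations:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>	/* struct tcp_ao_add, TCP_AO_ADD_KEY (recent headers) */

/*
 * Add one MKT for the IPv4 peer to socket sk.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int ao_add_key_v4(int sk, struct in_addr peer,
			 const void *key, uint8_t keylen,
			 uint8_t sndid, uint8_t rcvid)
{
	assert(key != NULL);
#ifdef TCP_AO_ADD_KEY
	struct sockaddr_in sin;
	struct tcp_ao_add ao;

	if (keylen > TCP_AO_MAXKEYLEN) {
		errno = EINVAL;
		return -1;
	}

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr = peer;		/* port stays 0: no port matching */

	memset(&ao, 0, sizeof(ao));	/* reserved fields are left zeroed */
	memcpy(&ao.addr, &sin, sizeof(sin));
	ao.prefix = 32;			/* match exactly this peer */
	strcpy(ao.alg_name, "hmac(sha1)");
	ao.maclen = 12;
	ao.sndid = sndid;
	ao.rcvid = rcvid;
	ao.keylen = keylen;
	memcpy(ao.key, key, keylen);

	return setsockopt(sk, IPPROTO_TCP, TCP_AO_ADD_KEY, &ao, sizeof(ao));
#else
	(void)sk; (void)peer; (void)keylen; (void)sndid; (void)rcvid;
	errno = ENOPROTOOPT;		/* headers predate TCP-AO */
	return -1;
#endif
}
```

A caller would typically invoke this on a listening or not-yet-connected
socket, once per peer and KeyID pair, before ``listen()`` or ``connect()``.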

The ``setsockopt(TCP_AO_DEL_KEY)`` command may specify ``tcp_ao_del::current_key``
+ ``tcp_ao_del::set_current`` and/or ``tcp_ao_del::rnext``
+ ``tcp_ao_del::set_rnext``, which makes such a delete "forced": it
provides userspace a way to delete a key that's being used and atomically set
another one instead. This is not intended for normal use and should be used
only when the peer ignores RNextKeyID and keeps requesting/using an old key.
It provides a way to force-delete a key that's not trusted, but it may break
the TCP-AO connection.

The usual/normal key rotation can be performed with ``setsockopt(TCP_AO_INFO)``.
It also provides a uAPI to change per-socket TCP-AO settings, such as
ignoring ICMPs, as well as to clear per-socket TCP-AO packet counters.
The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those
per-socket TCP-AO settings.
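
As a hedged sketch of the normal rotation path, the helper below asks the peer
to switch keys by updating RNext_key through ``TCP_AO_INFO``. It assumes the
``struct tcp_ao_info_opt`` layout from ``<linux/tcp.h>`` in TCP-AO-aware
headers. ``ao_request_rnext()`` is a hypothetical name, and the
read-modify-write pattern is a defensive choice made here so that the other
settings carried in the same structure are written back unchanged:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>	/* struct tcp_ao_info_opt, TCP_AO_INFO */

/*
 * Request that the peer starts signing with the key whose RecvID is
 * rnext_rcvid. Returns 0 on success, -1 with errno set on failure.
 */
static int ao_request_rnext(int sk, uint8_t rnext_rcvid)
{
#ifdef TCP_AO_INFO
	struct tcp_ao_info_opt info;
	socklen_t len = sizeof(info);

	memset(&info, 0, sizeof(info));

	/* Read the current per-socket settings first. */
	if (getsockopt(sk, IPPROTO_TCP, TCP_AO_INFO, &info, &len))
		return -1;

	info.set_current = 0;	/* do not touch Current_key */
	info.set_counters = 0;	/* do not overwrite the packet counters */
	info.set_rnext = 1;	/* only RNext_key is updated */
	info.rnext = rnext_rcvid;

	return setsockopt(sk, IPPROTO_TCP, TCP_AO_INFO, &info, sizeof(info));
#else
	(void)sk; (void)rnext_rcvid;
	errno = ENOPROTOOPT;	/* headers predate TCP-AO */
	return -1;
#endif
}
```

The new key has to be present on the socket before this call. Once the peer
honours the request and starts signing with the new key, the old key can be
removed with a regular (non-forced) ``setsockopt(TCP_AO_DEL_KEY)``.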

Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it
to list all MKTs on a TCP socket, or use a filter to get keys for a specific
peer and/or sndid/rcvid, VRF L3 interface, or to get current_key/rnext_key.

To repair TCP-AO connections, ``setsockopt(TCP_AO_REPAIR)`` is available,
provided that the user has previously checkpointed/dumped the socket with
``getsockopt(TCP_AO_REPAIR)``.

A tip for scaled TCP_LISTEN sockets that may have thousands of TCP-AO
keys: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous
delete with ``setsockopt(TCP_AO_DEL_KEY)``.

Linux TCP-AO also provides a bunch of segment counters that can be helpful
with troubleshooting/debugging issues. Every MKT has good/bad counters
that reflect how many packets passed/failed verification.

Each TCP-AO socket has the following counters:

- for good segments (properly signed)
- for bad segments (failed TCP-AO verification)
- for segments with unknown keys
- for segments where an AO signature was expected, but wasn't found
- for the number of ignored ICMPs

TCP-AO per-socket counters are also duplicated with per-netns counters,
exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
``TCPAORequired`` and ``TCPAODroppedIcmps``.

For monitoring purposes, there are the following TCP-AO trace events:
``tcp_hash_bad_header``, ``tcp_hash_ao_required``, ``tcp_ao_handshake_failure``,
``tcp_ao_wrong_maclen``, ``tcp_ao_key_not_found``,
``tcp_ao_rnext_request``, ``tcp_ao_synack_no_key``, ``tcp_ao_snd_sne_update``,
``tcp_ao_rcv_sne_update``. Each of them can be enabled separately, and
one can filter them by net-namespace, 4-tuple, family, L3 index, and TCP header
flags. If a segment has a TCP-AO header, the filters may also include
keyid, rnext, and maclen. SNE updates include the rolled-over numbers.

RFC 5925 very permissively specifies how TCP port matching can be done for
MKTs::

   TCP connection identifier. A TCP socket pair, i.e., a local IP
   address, a remote IP address, a TCP local port, and a TCP remote port.
   Values can be partially specified using ranges (e.g., 2-30), masks
   (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication.

Currently, the Linux TCP-AO implementation doesn't provide any TCP port
matching. Port ranges are probably the most flexible option for the uAPI,
but so far they are not implemented.

4. ``setsockopt()`` vs ``accept()`` race
========================================

In contrast with an established TCP-MD5 connection, which has just one key,
TCP-AO connections may have many keys, which means that connections accepted
on a listen socket may have any number of keys as well. Copying all those
keys on the first properly signed SYN would make the request socket bigger,
which is undesirable. Currently, the implementation doesn't copy keys
to request sockets, but rather looks them up on the "parent" listener socket.

The result is that when userspace removes TCP-AO keys, it may break
not-yet-established connections on request sockets, while the keys are not
removed from sockets that were already established but not yet
``accept()``'ed and are still hanging in the accept queue.

The reverse is valid as well: if userspace adds a new key for a peer on
a listener socket, the established sockets in the accept queue won't
have the new keys.

At this moment, the resolution for the two races:
``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()``
and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()`` is delegated to userspace.
This means that userspace is expected to check the MKTs on the socket
that was returned by ``accept()`` to verify that any key rotation that
happened on the listen socket is reflected on the newly established connection.

This is a "do-nothing" approach on the kernel side, similar to TCP-MD5, and
may be changed later by introducing new flags to ``tcp_ao_add``
and ``tcp_ao_del``.

Note that this race is rare, as it requires TCP-AO key rotation to happen
during the 3-way handshake for the new TCP connection.

5. Interaction with TCP-MD5
===========================

A TCP connection cannot migrate between TCP-AO and TCP-MD5 options.
Established sockets that have either AO or MD5 keys are restricted from
adding keys of the other option.

For listening sockets the picture is different: a BGP server may want to
accept both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types
of keys may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed
to add different types of keys for the same peer.

6. SNE Linux implementation
===========================

RFC 5925 [6.2] describes the algorithm of how to extend TCP sequence numbers
with SNE. In short: TCP has to track the previous sequence numbers and set
sne_flag when the current SEQ number rolls over. The flag is cleared when
both current and previous SEQ numbers cross 0x7fff, which is 32Kb.

While sne_flag is set, the algorithm compares SEQ for each packet with
0x7fff and if it's higher than 32Kb, it assumes that the packet should be
verified with SNE before the increment. As a result, there's
this [0; 32Kb] window, when packets with (SNE - 1) can be accepted.

The Linux implementation simplifies this a bit: the network stack already
tracks the first SEQ byte that ACK is wanted for (snd_una) and the next SEQ
byte that is wanted (rcv_nxt), and that's enough information for a rough
estimation of where in the 4GB SEQ number space both sender and receiver are.
When they roll over to zero, the corresponding SNE gets incremented.

tcp_ao_compute_sne() is called for each TCP-AO segment. It compares SEQ numbers
from the segment with snd_una or rcv_nxt and fits the result into a 2GB window
around them, detecting SEQ numbers rolling over. That simplifies the code a lot
and only requires SNE numbers to be stored on every TCP-AO socket.
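
The 2GB-window comparison described above can be sketched in plain C as
follows. This is an illustration of the technique, not a copy of the kernel
function: ``seq_before()``, ``compute_sne()`` and ``sne_examples()`` are
names chosen here, and ``next_seq`` stands for snd_una or rcv_nxt with
``next_sne`` being the SNE stored next to it:

```c
#include <assert.h>
#include <stdint.h>

/* True if seq1 precedes seq2 in 32-bit modular sequence space. */
static int seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Pick the SNE for a segment with sequence number seq, given the
 * tracked position next_seq and the SNE that belongs to it.
 */
static uint32_t compute_sne(uint32_t next_sne, uint32_t next_seq, uint32_t seq)
{
	uint32_t sne = next_sne;

	if (seq_before(seq, next_seq)) {
		/* Older segment that is numerically larger: pre-rollover. */
		if (seq > next_seq)
			sne--;
	} else {
		/* Newer segment that is numerically smaller: rolled over. */
		if (seq < next_seq)
			sne++;
	}
	return sne;
}

/* A few worked cases around the 32-bit rollover. */
void sne_examples(void)
{
	/* No rollover in either direction: SNE is unchanged. */
	assert(compute_sne(5, 0x1000, 0x2000) == 5);
	assert(compute_sne(5, 0x2000, 0x1000) == 5);
	/* Late segment from before the rollover uses SNE - 1. */
	assert(compute_sne(5, 0x10, 0xfffffff0) == 4);
	/* Segment that is already past the rollover uses SNE + 1. */
	assert(compute_sne(5, 0xfffffff0, 0x10) == 6);
}
```

No flag and no previous SEQ value are needed in this scheme: the signed
32-bit difference decides whether the segment lies before or after the
tracked position, and the plain unsigned comparison detects that a rollover
sits between the two.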

The 2GB window at first glance seems much more permissive compared to
RFC 5925. But it is only used to pick the correct SNE before/after
a rollover. It allows more TCP segment replays, yet all regular
TCP checks in tcp_sequence() are still applied to the verified segment.
So, it trades a slightly more permissive acceptance of replayed/retransmitted
segments for the simplicity of the algorithm and what seems to be better
behaviour for large TCP windows.

7. Links
========

RFC 5925 The TCP Authentication Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf

RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO)
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf

Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)"
   https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03

RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf

:Author: Dmitry Safonov <dima@arista.com>