- .. SPDX-License-Identifier: GPL-2.0
- ========================================================
- TCP Authentication Option Linux implementation (RFC5925)
- ========================================================
- TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying
- segments between trusted peers. It adds a new TCP header option with
- a Message Authentication Code (MAC). MACs are produced from the content
- of a TCP segment using a hashing function with a password known to both peers.
- The intent of TCP-AO is to deprecate TCP-MD5, providing better security,
- key rotation and support for a variety of hashing algorithms.
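To make the idea of a truncated MAC concrete, here is a minimal Python sketch. It is a simplification under stated assumptions: real TCP-AO derives per-connection traffic keys from the master key and the ISNs, and signs a pseudo-header plus the SNE together with the segment. The function names are hypothetical and only illustrate an HMAC-SHA1 truncated to 96 bits.

```python
import hashlib
import hmac

MAC_LEN = 12  # bytes; 96-bit truncation, as in HMAC-SHA-1-96


def sign_segment(traffic_key: bytes, segment: bytes) -> bytes:
    """Return a truncated HMAC-SHA1 over the given bytes (illustration only)."""
    return hmac.new(traffic_key, segment, hashlib.sha1).digest()[:MAC_LEN]


def verify_segment(traffic_key: bytes, segment: bytes, mac: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(sign_segment(traffic_key, segment), mac)
```

A receiver that shares the key accepts the segment only if ``verify_segment()`` returns ``True``; any change to the key or to the signed bytes makes verification fail.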
- 1. Introduction
- ===============
- .. table:: Short and Limited Comparison of TCP-AO and TCP-MD5
- +----------------------+------------------------+-----------------------+
- | | TCP-MD5 | TCP-AO |
- +======================+========================+=======================+
- |Supported hashing |MD5 |Must support HMAC-SHA1 |
- |algorithms |(cryptographically weak)|(chosen-prefix attacks)|
- | | |and CMAC-AES-128 (only |
- | | |side-channel attacks). |
- | | |May support any hashing|
- | | |algorithm. |
- +----------------------+------------------------+-----------------------+
- |Length of MACs (bytes)|16 |Typically 12-16. |
- | | |Other variants that fit|
- | | |TCP header permitted. |
- +----------------------+------------------------+-----------------------+
- |Number of keys per |1 |Many |
- |TCP connection | | |
- +----------------------+------------------------+-----------------------+
- |Possibility to change |Non-practical (both |Supported by protocol |
- |an active key |peers have to change | |
- | |them during MSL) | |
- +----------------------+------------------------+-----------------------+
- |Protection against |No |Yes: ignoring them |
- |ICMP 'hard errors' | |by default on |
- | | |established connections|
- +----------------------+------------------------+-----------------------+
- |Protection against |No |Yes: pseudo-header |
- |traffic-crossing | |includes TCP ports. |
- |attack | | |
- +----------------------+------------------------+-----------------------+
- |Protection against |No |Sequence Number |
- |replayed TCP segments | |Extension (SNE) and |
- | | |Initial Sequence |
- | | |Numbers (ISNs) |
- +----------------------+------------------------+-----------------------+
- |Supports |Yes |No. ISNs+SNE are needed|
- |Connectionless Resets | |to correctly sign RST. |
- +----------------------+------------------------+-----------------------+
- |Standards |RFC 2385 |RFC 5925, RFC 5926 |
- +----------------------+------------------------+-----------------------+
- 1.1 Frequently Asked Questions (FAQ) with references to RFC 5925
- ----------------------------------------------------------------
- Q: Can either SendID or RecvID be non-unique for the same 4-tuple
- (srcaddr, srcport, dstaddr, dstport)?
- A: No [3.1]::
- >> The IDs of MKTs MUST NOT overlap where their TCP connection
- identifiers overlap.
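The non-overlap rule quoted above can be sketched as a userspace pre-check before adding a key. This is a hedged illustration, not the kernel's matching code: the ``MKT`` class and both helpers are hypothetical, and peers are assumed to be described by IP prefixes only.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Iterable


@dataclass(frozen=True)
class MKT:
    peer: str      # peer prefix, e.g. "10.0.0.0/24" (assumed representation)
    send_id: int   # SendID placed into outgoing segments
    recv_id: int   # RecvID matched against incoming segments


def ids_conflict(a: MKT, b: MKT) -> bool:
    """True if both MKTs may match one connection and share a SendID or RecvID."""
    if not ip_network(a.peer).overlaps(ip_network(b.peer)):
        return False
    return a.send_id == b.send_id or a.recv_id == b.recv_id


def can_add(existing: Iterable[MKT], new: MKT) -> bool:
    """A new MKT is acceptable only if it conflicts with no existing one."""
    return not any(ids_conflict(old, new) for old in existing)
```

With this check, two keys may reuse the same IDs as long as their peer prefixes never cover the same address.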
- Q: Can Master Key Tuple (MKT) for an active connection be removed?
- A: No, unless it's copied to Transport Control Block (TCB) [3.1]::
- It is presumed that an MKT affecting a particular connection cannot
- be destroyed during an active connection -- or, equivalently, that
- its parameters are copied to an area local to the connection (i.e.,
- instantiated) and so changes would affect only new connections.
- Q: If an old MKT needs to be deleted, how should it be done without
- removing it from an active connection (where it may still be used
- at any later moment)?
- A: Not specified by RFC 5925; it seems to be the job of key management
- to ensure that no one uses such an MKT before trying to remove it.
- Q: Can an old MKT exist forever and be used by another peer?
- A: It can, it's a key management task to decide when to remove an old key [6.1]::
- Deciding when to start using a key is a performance issue. Deciding
- when to remove an MKT is a security issue. Invalid MKTs are expected
- to be removed. TCP-AO provides no mechanism to coordinate their removal,
- as we consider this a key management operation.
- also [6.1]::
- The only way to avoid reuse of previously used MKTs is to remove the MKT
- when it is no longer considered permitted.
- Linux TCP-AO will try its best to prevent you from removing a key that's
- being used, considering it a key management failure. But since keeping
- an outdated key may become a security issue and as a peer may
- unintentionally prevent the removal of an old key by always setting
- it as RNextKeyID - a forced key removal mechanism is provided, where
- userspace has to supply KeyID to use instead of the one that's being removed
- and the kernel will atomically delete the old key, even if the peer is
- still requesting it. There are no guarantees for force-delete: the peer
- may not yet have the new key, in which case the TCP connection may just break.
- Alternatively, one may choose to shut down the socket.
- Q: What happens when a packet is received on a new connection with no known
- MKT's RecvID?
- A: RFC 5925 specifies that by default it is accepted with a warning logged, but
- the behaviour can be configured by the user [7.5.1.a]::
- If the segment is a SYN, then this is the first segment of a new
- connection. Find the matching MKT for this segment, using the segment's
- socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection
- identifier and the MKT's RecvID.
- i. If there is no matching MKT, remove TCP-AO from the segment.
- Proceed with further TCP handling of the segment.
- NOTE: this presumes that connections that do not match any MKT
- should be silently accepted, as noted in Section 7.3.
- [7.3]::
- >> A TCP-AO implementation MUST allow for configuration of the behavior
- of segments with TCP-AO but that do not match an MKT. The initial default
- of this configuration SHOULD be to silently accept such connections.
- If this is not the desired case, an MKT can be included to match such
- connections, or the connection can indicate that TCP-AO is required.
- Alternately, the configuration can be changed to discard segments with
- the AO option not matching an MKT.
- [10.2.b]::
- Connections not matching any MKT do not require TCP-AO. Further, incoming
- segments with TCP-AO are not discarded solely because they include
- the option, provided they do not match any MKT.
- Note that the Linux TCP-AO implementation differs in this aspect. Currently,
- TCP-AO segments signed with unknown keys are discarded with warnings logged.
- Q: Does the RFC imply centralized kernel key management in any way?
- (i.e. that a key on all connections MUST be rotated at the same time?)
- A: Not specified. MKTs can be managed in userspace, the only relevant part to
- key changes is [7.3]::
- >> All TCP segments MUST be checked against the set of MKTs for matching
- TCP connection identifiers.
- Q: What happens when RNextKeyID requested by a peer is unknown? Should
- the connection be reset?
- A: It should not, no action needs to be performed [7.5.2.e]::
- ii. If they differ, determine whether the RNextKeyID MKT is ready.
- 1. If the MKT corresponding to the segment’s socket pair and RNextKeyID
- is not available, no action is required (RNextKeyID of a received
- segment needs to match the MKT’s SendID).
- Q: How is current_key set, and when does it change? Is it a user-triggered
- change, or is it triggered by a request from the remote peer? Is it set by the
- user explicitly, or by a matching rule?
- A: current_key is set by RNextKeyID [6.1]::
- Rnext_key is changed only by manual user intervention or MKT management
- protocol operation. It is not manipulated by TCP-AO. Current_key is updated
- by TCP-AO when processing received TCP segments as discussed in the segment
- processing description in Section 7.5. Note that the algorithm allows
- the current_key to change to a new MKT, then change back to a previously
- used MKT (known as "backing up"). This can occur during an MKT change when
- segments are received out of order, and is considered a feature of TCP-AO,
- because reordering does not result in drops.
- [7.5.2.e.ii]::
- 2. If the matching MKT corresponding to the segment’s socket pair and
- RNextKeyID is available:
- a. Set current_key to the RNextKeyID MKT.
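The RNextKeyID handling quoted from [7.5.2.e] in the last two answers can be summarised in a few lines of Python. This is a sketch of the decision only, under the assumption that MKTs are looked up by their SendID; the function and parameter names are hypothetical.

```python
from typing import Dict


def update_current_key(mkts_by_send_id: Dict[int, str],
                       current_send_id: int,
                       rnext_key_id: int) -> int:
    """Return the SendID of the MKT to use as current_key after a valid segment."""
    if rnext_key_id == current_send_id:
        return current_send_id   # peer already asks for the key in use
    if rnext_key_id not in mkts_by_send_id:
        return current_send_id   # unknown RNextKeyID: no action, no reset
    return rnext_key_id          # switch to the MKT the peer requested
```

Note that the function never fails: an unknown RNextKeyID simply leaves current_key unchanged.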
- Q: If both peers have multiple MKTs matching the connection's socket pair
- (with different KeyIDs), how should the sender/receiver pick KeyID to use?
- A: Some mechanism should pick the "desired" MKT [3.3]::
- Multiple MKTs may match a single outgoing segment, e.g., when MKTs
- are being changed. Those MKTs cannot have conflicting IDs (as noted
- elsewhere), and some mechanism must determine which MKT to use for each
- given outgoing segment.
- >> An outgoing TCP segment MUST match at most one desired MKT, indicated
- by the segment’s socket pair. The segment MAY match multiple MKTs, provided
- that exactly one MKT is indicated as desired. Other information in
- the segment MAY be used to determine the desired MKT when multiple MKTs
- match; such information MUST NOT include values in any TCP option fields.
- Q: Can a TCP-MD5 connection migrate to TCP-AO (and vice-versa)?
- A: No [1]::
- TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5
- does not support any changes to a connection’s security algorithm
- once established.
- Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed
- connection?
- A: [7.5.2] doesn't offer the same choice as SYN packet handling in [7.5.1.i],
- which would allow accepting segments without a signature (and would be insecure).
- While switching to a non-TCP-AO connection is not prohibited directly, prohibiting
- it seems to be what the RFC means. Also, there's a requirement for TCP-AO
- connections to always have one current_key [3.3]::
- TCP-AO requires that every protected TCP segment match exactly one MKT.
- [3.3]::
- >> An incoming TCP segment including TCP-AO MUST match exactly one MKT,
- indicated solely by the segment’s socket pair and its TCP-AO KeyID.
- [4.4]::
- One or more MKTs. These are the MKTs that match this connection’s
- socket pair.
- Q: Can a non-TCP-AO connection become a TCP-AO-enabled one?
- A: No: for an already established non-TCP-AO connection it would be impossible
- to switch to using TCP-AO, as the traffic key generation requires the initial
- sequence numbers. In other words, starting to use TCP-AO would require
- re-establishing the TCP connection.
- 2. In-kernel MKTs database vs database in userspace
- ===================================================
- Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way
- to TCP-MD5. It means that a userspace application that wants to use TCP-AO
- should perform ``setsockopt()`` on a TCP socket when it wants to add,
- remove or rotate MKTs. This approach moves the key management responsibility
- to userspace, as well as decisions on corner cases, e.g. what to do if
- the peer doesn't respect RNextKeyID. More code, especially the code
- responsible for policy decisions, thus lives in userspace. Besides, the
- approach is flexible and scales well (with less locking needed than in the case
- of an in-kernel database). One should also keep in mind that the main intended
- users are BGP processes, not arbitrary applications, which means that, compared
- to IPsec tunnels, no transparency is really needed, and modern BGP daemons
- already have ``setsockopt()s`` for TCP-MD5 support.
- .. table:: Considered pros and cons of the approaches
- +----------------------+------------------------+-----------------------+
- | | ``setsockopt()`` | in-kernel DB |
- +======================+========================+=======================+
- | Extendability | ``setsockopt()`` | Netlink messages are |
- | | commands should be | simple and extendable |
- | | extendable syscalls | |
- +----------------------+------------------------+-----------------------+
- | Required userspace | BGP or any application | could be transparent |
- | changes | that wants TCP-AO needs| as tunnels, providing |
- | | to perform | something like |
- | | ``setsockopt()s`` | ``ip tcpao add key`` |
- | | and do key management | (delete/show/rotate) |
- +----------------------+------------------------+-----------------------+
- |MKTs removal or adding| harder for userspace | harder for kernel |
- +----------------------+------------------------+-----------------------+
- | Dump-ability | ``getsockopt()`` | Netlink .dump() |
- | | | callback |
- +----------------------+------------------------+-----------------------+
- | Limits on kernel | equal |
- | resources/memory | |
- +----------------------+------------------------+-----------------------+
- | Scalability | contention on | contention on |
- | | ``TCP_LISTEN`` sockets | the whole database |
- +----------------------+------------------------+-----------------------+
- | Monitoring & warnings| ``TCP_DIAG`` | same Netlink socket |
- +----------------------+------------------------+-----------------------+
- | Matching of MKTs | half-problem: only | hard |
- | | listen sockets | |
- +----------------------+------------------------+-----------------------+
- 3. uAPI
- =======
- Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let
- userspace manage TCP-AO on a per-socket basis. In order to add/delete MKTs
- ``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used.
- It is not allowed to add a key on an established non-TCP-AO connection
- or to remove the last key from a TCP-AO connection.
- ``setsockopt(TCP_AO_DEL_KEY)`` command may specify ``tcp_ao_del::current_key``
- + ``tcp_ao_del::set_current`` and/or ``tcp_ao_del::rnext``
- + ``tcp_ao_del::set_rnext`` which makes such delete "forced": it
- provides userspace a way to delete a key that's being used and atomically set
- another one instead. This is not intended for normal use and should be used
- only when the peer ignores RNextKeyID and keeps requesting/using an old key.
- It provides a way to force-delete a key that's not trusted but may break
- the TCP-AO connection.
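The difference between a normal and a "forced" delete can be modelled in userspace as follows. This is a toy model of the rule described above, not the kernel implementation and not the ``tcp_ao_del`` layout: the class, its fields and the return convention are all assumptions made for illustration.

```python
class AoSocketModel:
    """Toy model of one socket's TCP-AO keys; not the kernel data structure."""

    def __init__(self, key_ids, current, rnext):
        self.keys = set(key_ids)
        self.current = current   # KeyID used to sign outgoing segments
        self.rnext = rnext       # KeyID requested from the peer

    def delete(self, key_id, new_current=None, new_rnext=None):
        """Delete key_id; return False (and change nothing) if refused."""
        current = self.current if new_current is None else new_current
        rnext = self.rnext if new_rnext is None else new_rnext
        if key_id not in self.keys:
            return False
        if current not in self.keys or rnext not in self.keys:
            return False         # replacement keys must already exist
        if key_id in (current, rnext):
            return False         # still in use and no replacement supplied
        self.keys.remove(key_id)
        self.current, self.rnext = current, rnext
        return True
```

In this model a key that is in use can only be removed when replacements for both roles are supplied in the same call, which mirrors the "delete and atomically set another one" behaviour; it also never lets the last key be removed.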
- The usual/normal key-rotation can be performed with ``setsockopt(TCP_AO_INFO)``.
- It also provides a uAPI to change per-socket TCP-AO settings, such as
- ignoring ICMPs, as well as clear per-socket TCP-AO packet counters.
- The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those
- per-socket TCP-AO settings.
- Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it
- to list all MKTs on a TCP socket or use a filter to get keys for a specific
- peer and/or sndid/rcvid, VRF L3 interface or get current_key/rnext_key.
- To repair TCP-AO connections ``setsockopt(TCP_AO_REPAIR)`` is available,
- provided that the user previously has checkpointed/dumped the socket with
- ``getsockopt(TCP_AO_REPAIR)``.
- A tip for scaled TCP_LISTEN sockets that may have thousands of TCP-AO
- keys: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous
- delete with ``setsockopt(TCP_AO_DEL_KEY)``.
- Linux TCP-AO also provides a bunch of segment counters that can be helpful
- with troubleshooting/debugging issues. Every MKT has good/bad counters
- that reflect how many packets passed/failed verification.
- Each TCP-AO socket has the following counters:
- - for good segments (properly signed)
- - for bad segments (failed TCP-AO verification)
- - for segments with unknown keys
- - for segments where an AO signature was expected, but wasn't found
- - for the number of ignored ICMPs
- TCP-AO per-socket counters are also duplicated with per-netns counters,
- exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
- ``TCPAORequired`` and ``TCPAODroppedIcmps``.
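For quick troubleshooting, the per-netns counters can be extracted with a short script. The sketch below assumes the counters appear in the ``TcpExt`` rows of ``/proc/net/netstat`` (pairs of header and value lines sharing a prefix); the helper name is hypothetical, and the parser works on text so it can be tried without a TCP-AO capable kernel.

```python
AO_COUNTERS = ("TCPAOGood", "TCPAOBad", "TCPAOKeyNotFound",
               "TCPAORequired", "TCPAODroppedIcmps")


def parse_ao_counters(netstat_text: str) -> dict:
    """Extract TCP-AO counters from /proc/net/netstat-style text."""
    lines = netstat_text.splitlines()
    counters = {}
    # The file consists of line pairs: "Prefix: names..." then "Prefix: values..."
    for header, values in zip(lines[0::2], lines[1::2]):
        h_prefix, _, names = header.partition(":")
        v_prefix, _, numbers = values.partition(":")
        if h_prefix != v_prefix:
            continue             # malformed pair, skip it
        for name, number in zip(names.split(), numbers.split()):
            if name in AO_COUNTERS:
                counters[name] = int(number)
    return counters
```

On a live system it could be called as ``parse_ao_counters(open("/proc/net/netstat").read())``; an empty result would suggest a kernel without TCP-AO counters.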
- For monitoring purposes, there are the following TCP-AO trace events:
- ``tcp_hash_bad_header``, ``tcp_hash_ao_required``, ``tcp_ao_handshake_failure``,
- ``tcp_ao_wrong_maclen``, ``tcp_ao_key_not_found``,
- ``tcp_ao_rnext_request``, ``tcp_ao_synack_no_key``, ``tcp_ao_snd_sne_update``,
- ``tcp_ao_rcv_sne_update``. It's possible to separately enable any of them and
- one can filter them by net-namespace, 4-tuple, family, L3 index, and TCP header
- flags. If a segment has a TCP-AO header, the filters may also include
- keyid, rnext, and maclen. SNE updates include the rolled-over numbers.
- RFC 5925 very permissively specifies how TCP port matching can be done for
- MKTs::
- TCP connection identifier. A TCP socket pair, i.e., a local IP
- address, a remote IP address, a TCP local port, and a TCP remote port.
- Values can be partially specified using ranges (e.g., 2-30), masks
- (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication.
- Currently, the Linux TCP-AO implementation doesn't provide any TCP port matching.
- Port ranges are probably the most flexible option for the uAPI, but so far
- they are not implemented.
- 4. ``setsockopt()`` vs ``accept()`` race
- ========================================
- In contrast with an established TCP-MD5 connection, which has just one key,
- TCP-AO connections may have many keys, which means that accepted connections
- on a listen socket may have any number of keys as well. Copying all those
- keys on the first properly signed SYN would make the request socket bigger,
- which is undesirable. Currently, the implementation doesn't copy keys
- to request sockets, but rather looks them up on the "parent" listener socket.
- The result is that when userspace removes TCP-AO keys, this may break
- not-yet-established connections on request sockets, while the keys are not
- removed from sockets that were already established but not yet ``accept()``'ed
- and are still hanging in the accept queue.
- The reverse is valid as well: if userspace adds a new key for a peer on
- a listener socket, the established sockets in the accept queue won't
- have the new keys.
- At this moment, the resolution of the two races,
- ``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()``
- and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()``, is delegated to userspace.
- This means that it's expected that userspace would check the MKTs on the socket
- that was returned by ``accept()`` to verify that any key rotation that
- happened on the listen socket is reflected on the newly established connection.
- This is a similar "do-nothing" approach to TCP-MD5 from the kernel side and
- may be changed later by introducing new flags to ``tcp_ao_add``
- and ``tcp_ao_del``.
- Note that this race is rare, as it requires TCP-AO key rotation to happen
- during the 3-way handshake for the new TCP connection.
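The userspace check described in this section boils down to comparing two key sets. A minimal sketch, assuming the application has already listed the keys for the peer on both sockets (for example with ``getsockopt(TCP_AO_GET_KEYS)``) and represents each key as a hypothetical ``(sndid, rcvid)`` tuple:

```python
def diff_keys(listener_keys, accepted_keys):
    """Compare keys configured on the listener with keys on an accepted socket.

    Returns (missing, stale): keys to add to the accepted socket and keys
    that were removed from the listener but still exist on the accepted socket.
    """
    listener = set(listener_keys)
    accepted = set(accepted_keys)
    missing = listener - accepted   # added on the listener during the handshake
    stale = accepted - listener     # deleted on the listener during the handshake
    return missing, stale
```

If both returned sets are empty, no rotation raced with ``accept()`` and nothing needs to be fixed up on the new socket.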
- 5. Interaction with TCP-MD5
- ===========================
- A TCP connection cannot migrate between TCP-AO and TCP-MD5 options.
- Established sockets that have either AO or MD5 keys are restricted from
- adding keys of the other option.
- For listening sockets the picture is different: a BGP server may want to accept
- both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types of keys
- may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed to add
- different types of keys for the same peer.
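The rules of this section can be written down as a small predicate. This is only a model of the text above, with hypothetical names; peers are compared as exact strings here, whereas real key matching also involves address prefixes.

```python
def may_add_key(state, kind, peer, existing):
    """Decide whether a key of `kind` ("ao" or "md5") may be added for `peer`.

    `state` is the socket state name and `existing` is a list of
    (kind, peer) tuples already configured on the socket.
    """
    other_option = [(k, p) for k, p in existing if k != kind]
    if state not in ("CLOSED", "LISTEN"):
        # Established sockets cannot mix TCP-AO and TCP-MD5 keys at all.
        return not other_option
    # Closed/listening sockets may hold both types, but not for the same peer.
    return all(p != peer for _, p in other_option)
```

For instance, a listener with an MD5 key for one neighbour can still take an AO key for a different neighbour, but not for the same one.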
- 6. SNE Linux implementation
- ===========================
- RFC 5925 [6.2] describes how to extend TCP sequence numbers
- with SNE. In short: TCP has to track the previous sequence numbers and set
- sne_flag when the current SEQ number rolls over. The flag is cleared when
- both current and previous SEQ numbers cross 0x7fff, which is 32Kb.
- While sne_flag is set, the algorithm compares SEQ for each packet with
- 0x7fff and if it's higher than 32Kb, it assumes that the packet should be
- verified with the SNE value from before the increment. As a result, there's
- this [0; 32Kb] window, when packets with (SNE - 1) can be accepted.
- The Linux implementation simplifies this a bit: the network stack already tracks
- the first SEQ byte that ACK is wanted for (snd_una) and the next SEQ byte that
- is wanted (rcv_nxt), and that's enough information for a rough estimation
- of where in the 4GB SEQ number space both sender and receiver are.
- When they roll over to zero, the corresponding SNE gets incremented.
- tcp_ao_compute_sne() is called for each TCP-AO segment. It compares SEQ numbers
- from the segment with snd_una or rcv_nxt and fits the result into a 2GB window around them,
- detecting SEQ numbers rolling over. That simplifies the code a lot and only
- requires SNE numbers to be stored on every TCP-AO socket.
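The 2GB-window comparison described above can be sketched in Python. This is an illustration modelled on the description, not a copy of the kernel source; ``before()`` mimics 32-bit serial-number comparison, and the parameter names are assumptions.

```python
U32 = 0xFFFFFFFF


def before(seq1: int, seq2: int) -> bool:
    """True if seq1 precedes seq2 within a 2GB window (32-bit wraparound)."""
    return ((seq1 - seq2) & U32) >= 0x80000000


def compute_sne(next_sne: int, next_seq: int, seq: int) -> int:
    """Pick the SNE for a segment with SEQ `seq`.

    `next_seq` is the tracked position (snd_una or rcv_nxt) and `next_sne`
    is the SNE that belongs to it.
    """
    sne = next_sne
    if before(seq, next_seq):
        if seq > next_seq:
            sne -= 1   # older segment from before the rollover
    else:
        if seq < next_seq:
            sne += 1   # newer segment that already rolled over
    return sne & U32
```

A segment whose SEQ is numerically larger than the tracked position, yet "before" it in the window, must predate the rollover, so it is verified with the previous SNE; the mirrored case selects the next SNE.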
- The 2GB window at first glance seems much more permissive compared to
- RFC 5925. But it is only used to pick the correct SNE before/after
- a rollover. It allows more TCP segment replays, yet all regular
- TCP checks in tcp_sequence() are still applied to the verified segment.
- So, it trades a slightly more permissive acceptance of replayed/retransmitted
- segments for the simplicity of the algorithm and what seems to be better
- behaviour for large TCP windows.
- 7. Links
- ========
- RFC 5925 The TCP Authentication Option
- https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf
- RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO)
- https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf
- Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)"
- https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03
- RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option
- https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf
- :Author: Dmitry Safonov <dima@arista.com>