.. SPDX-License-Identifier: GPL-2.0

==============================================
Management Component Transport Protocol (MCTP)
==============================================

net/mctp/ contains protocol support for MCTP, as defined by DMTF standard
DSP0236. Physical interface drivers ("bindings" in the specification) are
provided in drivers/net/mctp/.

The core code provides a socket-based interface to send and receive MCTP
messages, through an AF_MCTP, SOCK_DGRAM socket.

Structure: interfaces & networks
================================

The kernel models the local MCTP topology through two items: interfaces and
networks.

An interface (or "link") is an instance of an MCTP physical transport binding
(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
device. This is represented as a ``struct net_device``.

A network defines a unique address space for MCTP endpoints by endpoint-ID
(described by DSP0236, section 3.2.31). A network has a user-visible identifier
to allow references from userspace. Route definitions are specific to one
network.

Each interface is associated with exactly one network. A network may be
associated with one or more interfaces.

If multiple networks are present, each may contain endpoint IDs (EIDs) that are
also present on other networks. An EID is therefore only unique within its own
network.

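
Because EIDs may repeat across networks, userspace code that tracks remote
endpoints generally needs to key them on the (network, EID) pair rather than
on the EID alone. The following is a minimal sketch of that idea; the
``sketch_endpoint`` type and its helper are hypothetical names used only for
illustration and are not part of the kernel API.

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical type: identifies an endpoint by network and EID together. */
    struct sketch_endpoint {
        unsigned int net;   /* user-visible network identifier */
        uint8_t eid;        /* endpoint ID, unique only within 'net' */
    };

    /* Two references name the same endpoint only if both fields agree. */
    static bool sketch_endpoint_equal(struct sketch_endpoint a,
                                      struct sketch_endpoint b)
    {
        return a.net == b.net && a.eid == b.eid;
    }

With this comparison, EID 8 on network 1 and EID 8 on network 2 are treated as
different endpoints.
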
Sockets API
===========

Protocol definitions
--------------------

MCTP uses ``AF_MCTP`` / ``PF_MCTP`` for the address and protocol families.
Since MCTP is message-based, only ``SOCK_DGRAM`` sockets are supported.

.. code-block:: C

    int sd = socket(AF_MCTP, SOCK_DGRAM, 0);

The only (current) value for the ``protocol`` argument is 0.

As with all socket address families, source and destination addresses are
specified with a ``sockaddr`` type, with a single-byte endpoint address:

.. code-block:: C

    typedef __u8 mctp_eid_t;

    struct mctp_addr {
        mctp_eid_t s_addr;
    };

    struct sockaddr_mctp {
        __kernel_sa_family_t smctp_family;
        unsigned int         smctp_network;
        struct mctp_addr     smctp_addr;
        __u8                 smctp_type;
        __u8                 smctp_tag;
    };

    #define MCTP_NET_ANY  0x0
    #define MCTP_ADDR_ANY 0xff

Syscall behaviour
-----------------

The following sections describe the MCTP-specific behaviours of the standard
socket system calls. These behaviours have been chosen to map closely to the
existing sockets APIs.

``bind()`` : set local socket address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sockets that receive incoming request packets will bind to a local address,
using the ``bind()`` syscall.

.. code-block:: C

    /* zero-initialise so that any unused fields are cleared */
    struct sockaddr_mctp addr = { 0 };

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY;
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_PLDM;
    addr.smctp_tag = MCTP_TAG_OWNER;

    int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));

This establishes the local address of the socket. Incoming MCTP messages that
match the network, address, and message type will be received by this socket.
The reference to 'incoming' is important here; a bound socket will only receive
messages with the TO bit set, to indicate an incoming request message, rather
than a response.

The ``smctp_tag`` value will configure the tags accepted from the remote side of
this socket. Given the above, the only valid value is ``MCTP_TAG_OWNER``, which
will result in remotely "owned" tags being routed to this socket. Since
``MCTP_TAG_OWNER`` is set, the 3 least-significant bits of ``smctp_tag`` are not
used; callers must set them to zero.

A ``smctp_network`` value of ``MCTP_NET_ANY`` will configure the socket to
receive incoming packets from any locally-connected network. A specific network
value will cause the socket to only receive incoming messages from that network.

The ``smctp_addr`` field specifies a local address to bind to. A value of
``MCTP_ADDR_ANY`` configures the socket to receive messages addressed to any
local destination EID.

The ``smctp_type`` field specifies which message types to receive. Only the
lower 7 bits of the type are matched on incoming messages (i.e., the
most-significant IC bit is not part of the match). This results in the socket
receiving packets with and without a message integrity check footer.

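
The matching rules described above can be summarised as a small predicate.
This is only a userspace model of the documented behaviour, not the kernel's
implementation; the ``sketch_`` names are hypothetical, and the constants
simply mirror ``MCTP_NET_ANY`` and ``MCTP_ADDR_ANY`` as shown earlier.

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    #define SKETCH_NET_ANY   0x0    /* mirrors MCTP_NET_ANY */
    #define SKETCH_ADDR_ANY  0xff   /* mirrors MCTP_ADDR_ANY */
    #define SKETCH_TYPE_MASK 0x7f   /* lower 7 bits; bit 7 is the IC bit */

    /* Hypothetical model of a bound socket's local address. */
    struct sketch_bind {
        unsigned int net;
        uint8_t addr;
        uint8_t type;
    };

    /* Would a message with these properties be delivered to this bind? */
    static bool sketch_bind_matches(const struct sketch_bind *b,
                                    unsigned int msg_net, uint8_t dest_eid,
                                    uint8_t type_byte, bool tag_owner)
    {
        if (!tag_owner)     /* bound sockets only see requests (TO set) */
            return false;
        if (b->net != SKETCH_NET_ANY && b->net != msg_net)
            return false;
        if (b->addr != SKETCH_ADDR_ANY && b->addr != dest_eid)
            return false;
        /* the IC bit is excluded from the type comparison */
        return (b->type & SKETCH_TYPE_MASK) == (type_byte & SKETCH_TYPE_MASK);
    }

For example, a socket bound to type 0x01 matches incoming type bytes 0x01 and
0x81 alike, since they differ only in the IC bit.
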
``sendto()``, ``sendmsg()``, ``send()`` : transmit an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message is transmitted using one of the ``sendto()``, ``sendmsg()`` or
``send()`` syscalls. Using ``sendto()`` as the primary example:

.. code-block:: C

    /* zero-initialise so that any unused fields are cleared */
    struct sockaddr_mctp addr = { 0 };
    char buf[14];
    ssize_t len;

    /* set message destination */
    addr.smctp_family = AF_MCTP;
    addr.smctp_network = 0;
    addr.smctp_addr.s_addr = 8;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addr.smctp_type = MCTP_TYPE_ECHO;

    /* arbitrary message to send, with message-type header */
    buf[0] = MCTP_TYPE_ECHO;
    memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

    len = sendto(sd, buf, sizeof(buf), 0,
                 (struct sockaddr *)&addr, sizeof(addr));

The network and address fields of ``addr`` define the remote address to send to.
If ``smctp_tag`` has the ``MCTP_TAG_OWNER`` bit set, the kernel will ignore any
tag value bits (``MCTP_TAG_MASK``) and generate a tag value suitable for the
destination EID. If ``MCTP_TAG_OWNER`` is not set, the message will be sent with
the tag value as specified. If a tag value cannot be allocated, the system call
will report an errno of ``EAGAIN``.

The application must provide the message type byte as the first byte of the
message buffer passed to ``sendto()``. If a message integrity check is to be
included in the transmitted message, it must also be provided in the message
buffer, and the most-significant bit of the message type byte must be 1.

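
As an illustration of the buffer layout described above, the following sketch
assembles a message as type byte, then payload, then an optional integrity
check. The helper name and its arguments are hypothetical; computing the
integrity check value itself is protocol-specific and is left to the caller.

.. code-block:: C

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define SKETCH_MSG_IC 0x80  /* most-significant bit of the type byte */

    /*
     * Hypothetical helper: lay out [type][payload][integrity check] in buf.
     * Returns the number of bytes written, or 0 if buf is too small.
     */
    static size_t sketch_build_msg(uint8_t *buf, size_t buflen, uint8_t type,
                                   const void *payload, size_t payload_len,
                                   const void *mic, size_t mic_len)
    {
        size_t total = 1 + payload_len + mic_len;

        if (total > buflen)
            return 0;

        buf[0] = type & 0x7f;
        if (mic_len > 0)
            buf[0] |= SKETCH_MSG_IC;  /* announce the trailing check */

        if (payload_len > 0)
            memcpy(buf + 1, payload, payload_len);
        if (mic_len > 0)
            memcpy(buf + 1 + payload_len, mic, mic_len);

        return total;
    }

The returned length is what would be passed as the ``len`` argument to
``sendto()``.
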
The ``sendmsg()`` system call allows a more compact argument interface, and the
message buffer to be specified as a scatter-gather list. At present no ancillary
message types (used for the ``msg_control`` data passed to ``sendmsg()``) are
defined.

Transmitting a message on an unconnected socket with ``MCTP_TAG_OWNER``
specified will cause an allocation of a tag, if no valid tag is already
allocated for that destination. The (destination-eid,tag) tuple acts as an
implicit local socket address, to allow the socket to receive responses to this
outgoing message. If any previous allocation has been performed (for a
different remote EID), that allocation is lost.

Sockets will only receive responses to requests they have sent (with TO=1) and
may only respond (with TO=0) to requests they have received.

``recvfrom()``, ``recvmsg()``, ``recv()`` : receive an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message can be received by an application using one of the
``recvfrom()``, ``recvmsg()``, or ``recv()`` system calls. Using ``recvfrom()``
as the primary example:

.. code-block:: C

    struct sockaddr_mctp addr;
    socklen_t addrlen;
    char buf[14];
    ssize_t len;

    addrlen = sizeof(addr);

    len = recvfrom(sd, buf, sizeof(buf), 0,
                   (struct sockaddr *)&addr, &addrlen);

    /* We can expect addr to describe an MCTP address */
    assert(addrlen >= sizeof(addr));
    assert(addr.smctp_family == AF_MCTP);

    printf("received %zd bytes from remote EID %d\n", len,
           addr.smctp_addr.s_addr);

The address argument to ``recvfrom`` and ``recvmsg`` is populated with the
remote address of the incoming message, including tag value (this will be needed
in order to reply to the message).

The first byte of the message buffer will contain the message type byte. If an
integrity check follows the message, it will be included in the received buffer.

The ``recv()`` system call behaves in a similar way, but does not provide a
remote address to the application. Therefore, it is only useful if the
remote address is already known, or the message does not require a reply.

Like the send calls, sockets will only receive responses to requests they have
sent (TO=1) and may only respond (TO=0) to requests they have received.

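
To make the reply rule concrete: a response reuses the tag value reported by
``recvfrom()``, but is sent with TO=0. A minimal sketch of deriving the reply
tag follows. The helper name is hypothetical, and the bit value is an
assumption believed to match ``MCTP_TAG_OWNER`` in ``<linux/mctp.h>``; prefer
the header's definition in real code.

.. code-block:: C

    #include <stdint.h>

    #define SKETCH_TAG_OWNER 0x08  /* assumed value of MCTP_TAG_OWNER */

    /* Keep the received tag value, but clear the owner (TO) bit. */
    static uint8_t sketch_reply_tag(uint8_t received_tag)
    {
        return (uint8_t)(received_tag & ~SKETCH_TAG_OWNER);
    }

The result would be stored in ``smctp_tag`` of the address passed to
``sendto()`` when responding, leaving the remaining address fields as returned
by ``recvfrom()``.
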
``ioctl(SIOCMCTPALLOCTAG)`` and ``ioctl(SIOCMCTPDROPTAG)``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These ioctls give applications more control over MCTP message tags, by
allocating (and dropping) tag values explicitly, rather than the kernel
automatically allocating a per-message tag at ``sendmsg()`` time.

In general, you will only need to use these ioctls if your MCTP protocol does
not fit the usual request/response model. For example, if you need to persist
tags across multiple requests, or a request may generate more than one response.
In these cases, the ioctls allow you to decouple the tag allocation (and
release) from individual message send and receive operations.

Both ioctls are passed a pointer to a ``struct mctp_ioc_tag_ctl``:

.. code-block:: C

    struct mctp_ioc_tag_ctl {
        mctp_eid_t peer_addr;
        __u8       tag;
        __u16      flags;
    };

``SIOCMCTPALLOCTAG`` allocates a tag for a specific peer, which an application
can use in future ``sendmsg()`` calls. The application populates the
``peer_addr`` member with the remote EID. Other fields must be zero.

On return, the ``tag`` member will be populated with the allocated tag value.
The allocated tag will have the following tag bits set:

- ``MCTP_TAG_OWNER``: it only makes sense to allocate tags if you're the tag
  owner

- ``MCTP_TAG_PREALLOC``: to indicate to ``sendmsg()`` that this is a
  preallocated tag.

- ... and the actual tag value, within the least-significant three bits
  (``MCTP_TAG_MASK``). Note that zero is a valid tag value.

The tag value should be used as-is for the ``smctp_tag`` member of ``struct
sockaddr_mctp``.

``SIOCMCTPDROPTAG`` releases a tag that has been previously allocated by a
``SIOCMCTPALLOCTAG`` ioctl. The ``peer_addr`` must be the same as used for the
allocation, and the ``tag`` value must match exactly the tag returned from the
allocation (including the ``MCTP_TAG_OWNER`` and ``MCTP_TAG_PREALLOC`` bits).
The ``flags`` field must be zero.

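
The layout of an allocated tag can be illustrated by splitting it into its
component bits. This is a sketch for inspection or debugging only; the
``sketch_`` names are hypothetical, and the bit values are assumptions
believed to match ``MCTP_TAG_MASK``, ``MCTP_TAG_OWNER`` and
``MCTP_TAG_PREALLOC`` in ``<linux/mctp.h>``.

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    #define SKETCH_TAG_MASK     0x07  /* assumed value of MCTP_TAG_MASK */
    #define SKETCH_TAG_OWNER    0x08  /* assumed value of MCTP_TAG_OWNER */
    #define SKETCH_TAG_PREALLOC 0x10  /* assumed value of MCTP_TAG_PREALLOC */

    /* Hypothetical decoded view of an smctp_tag byte. */
    struct sketch_tag_info {
        bool owner;
        bool prealloc;
        uint8_t value;  /* 0..7; note that zero is a valid tag value */
    };

    static struct sketch_tag_info sketch_tag_decode(uint8_t tag)
    {
        struct sketch_tag_info info = {
            .owner = (tag & SKETCH_TAG_OWNER) != 0,
            .prealloc = (tag & SKETCH_TAG_PREALLOC) != 0,
            .value = tag & SKETCH_TAG_MASK,
        };
        return info;
    }

    /* A tag returned by SIOCMCTPALLOCTAG is expected to carry both flags. */
    static bool sketch_tag_looks_preallocated(uint8_t tag)
    {
        struct sketch_tag_info info = sketch_tag_decode(tag);

        return info.owner && info.prealloc;
    }

Applications should still pass the tag byte through unmodified, both to
``sendmsg()`` and to ``SIOCMCTPDROPTAG``; decoding is shown here only to
clarify which bits carry which meaning.
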
Kernel internals
================

There are a few possible packet flows in the MCTP stack:

1. local TX to remote endpoint, message <= MTU::

       sendmsg()
        -> mctp_local_output()
           : route lookup
           -> rt->output() (== mctp_route_output)
              -> dev_queue_xmit()

2. local TX to remote endpoint, message > MTU::

       sendmsg()
        -> mctp_local_output()
           -> mctp_do_fragment_route()
              : creates packet-sized skbs. For each new skb:
              -> rt->output() (== mctp_route_output)
                 -> dev_queue_xmit()

3. remote TX to local endpoint, single-packet message::

       mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           -> sock_queue_rcv_skb()

4. remote TX to local endpoint, multiple-packet message::

       mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           : stores skb in struct sk_key->reasm_head

       mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           : finds existing reassembly in sk_key->reasm_head
           : appends new fragment
           -> sock_queue_rcv_skb()

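
The choice between flows 1 and 2 depends on whether the message fits in one
packet. As a rough sketch of that arithmetic, the helper below estimates how
many packets a message needs. It assumes a 4-byte MCTP transport header on
each packet and an MTU that counts that header; both are assumptions for
illustration, and the helper is hypothetical rather than a kernel function.

.. code-block:: C

    #include <stddef.h>

    #define SKETCH_MCTP_HDR_LEN 4  /* assumed per-packet header size */

    /*
     * Estimate the number of packets needed for a message of msg_len bytes
     * (including the message type byte). Returns 0 if the MTU leaves no
     * room for payload.
     */
    static size_t sketch_fragment_count(size_t msg_len, size_t mtu)
    {
        size_t payload;

        if (mtu <= SKETCH_MCTP_HDR_LEN)
            return 0;

        payload = mtu - SKETCH_MCTP_HDR_LEN;

        /* ceiling division: a partial final fragment still needs a packet */
        return (msg_len + payload - 1) / payload;
    }

Under these assumptions, a 64-byte message on a link with an MTU of 68 fits in
a single packet (flow 1), while a 65-byte message needs two (flow 2).
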
Key refcounts
-------------

* keys are referenced by:

  - an skb: during route output, stored in ``skb->cb``.

  - netns and sock lists.

* keys can be associated with a device, in which case they hold a
  reference to the dev (set through ``key->dev``, counted through
  ``dev->key_count``). Multiple keys can reference the device.