.. SPDX-License-Identifier: GPL-2.0
.. c:namespace:: V4L

.. _stateless_decoder:

**************************************************
Memory-to-memory Stateless Video Decoder Interface
**************************************************

A stateless decoder is a decoder that works without retaining any kind of state
between processed frames. This means that each frame is decoded independently
of any previous and future frames, and that the client is responsible for
maintaining the decoding state and providing it to the decoder with each
decoding request. This is in contrast to the stateful video decoder interface,
where the hardware and driver maintain the decoding state and all the client
has to do is to provide the raw encoded stream and dequeue decoded frames in
display order.

This section describes how user-space ("the client") is expected to communicate
with stateless decoders in order to successfully decode an encoded stream.
Compared to stateful codecs, the decoder/client sequence is simpler, but the
cost of this simplicity is extra complexity in the client, which is responsible
for maintaining a consistent decoding state.

Stateless decoders make use of the :ref:`media-request-api`. A stateless
decoder must expose the ``V4L2_BUF_CAP_SUPPORTS_REQUESTS`` capability on its
``OUTPUT`` queue when :c:func:`VIDIOC_REQBUFS` or :c:func:`VIDIOC_CREATE_BUFS`
is invoked.

Depending on the encoded formats supported by the decoder, a single decoded
frame may be the result of several decode requests (for instance, H.264 streams
with multiple slices per frame). Decoders that support such formats must also
expose the ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` capability on their
``OUTPUT`` queue.

Querying capabilities
=====================

1. To enumerate the set of coded formats supported by the decoder, the client
   calls :c:func:`VIDIOC_ENUM_FMT` on the ``OUTPUT`` queue.

   * The driver must always return the full set of supported ``OUTPUT`` formats,
     irrespective of the format currently set on the ``CAPTURE`` queue.

   * Simultaneously, the driver must restrict the set of values returned by
     codec-specific capability controls (such as H.264 profiles) to the set
     actually supported by the hardware.

2. To enumerate the set of supported raw formats, the client calls
   :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue.

   * The driver must return only the formats supported for the format currently
     active on the ``OUTPUT`` queue.

   * Depending on the currently set ``OUTPUT`` format, the set of supported raw
     formats may depend on the value of some codec-dependent controls.
     The client is responsible for making sure that these controls are set
     before querying the ``CAPTURE`` queue. Failure to do so will result in the
     default values for these controls being used, and a returned set of formats
     that may not be usable for the media the client is trying to decode.

3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
   resolutions for a given format, passing the desired pixel format in
   :c:type:`v4l2_frmsizeenum`'s ``pixel_format``.

4. Supported profiles and levels for the current ``OUTPUT`` format, if
   applicable, may be queried using their respective controls via
   :c:func:`VIDIOC_QUERYCTRL`.

Initialization
==============

1. Set the coded format on the ``OUTPUT`` queue via :c:func:`VIDIOC_S_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``pixelformat``
         a coded pixel format.

     ``width``, ``height``
         coded width and height parsed from the stream.

     other fields
         follow standard semantics.

   .. note::

      Changing the ``OUTPUT`` format may change the currently set ``CAPTURE``
      format. The driver will derive a new ``CAPTURE`` format from the
      ``OUTPUT`` format being set, including resolution, colorimetry
      parameters, etc. If the client needs a specific ``CAPTURE`` format,
      it must adjust it afterwards.

2. Call :c:func:`VIDIOC_S_EXT_CTRLS` to set all the controls (parsed headers,
   etc.) required by the ``OUTPUT`` format to enumerate the ``CAPTURE`` formats.

3. Call :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue to get the format for
   the destination buffers parsed/decoded from the bytestream.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

   * **Returned fields:**

     ``width``, ``height``
         frame buffer resolution for the decoded frames.

     ``pixelformat``
         pixel format for decoded frames.

     ``num_planes`` (for ``_MPLANE`` ``type`` only)
         number of planes for ``pixelformat``.

     ``sizeimage``, ``bytesperline``
         as per standard semantics; matching frame buffer format.

   .. note::

      The value of ``pixelformat`` may be any pixel format supported for the
      ``OUTPUT`` format, based on the hardware capabilities. It is suggested
      that the driver choose the preferred/optimal format for the current
      configuration. For example, a YUV format may be preferred over an RGB
      format, if an additional conversion step would be required for RGB.

4. *[optional]* Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on
   the ``CAPTURE`` queue. The client may use this ioctl to discover which
   alternative raw formats are supported for the current ``OUTPUT`` format and
   select one of them via :c:func:`VIDIOC_S_FMT`.

   .. note::

      The driver will return only formats supported for the currently selected
      ``OUTPUT`` format and currently set controls, even if more formats may be
      supported by the decoder in general.

      For example, a decoder may support YUV and RGB formats for
      resolutions 1920x1088 and lower, but only YUV for higher resolutions (due
      to hardware limitations). After setting a resolution of 1920x1088 or lower
      as the ``OUTPUT`` format, :c:func:`VIDIOC_ENUM_FMT` may return a set of
      YUV and RGB pixel formats, but after setting a resolution higher than
      1920x1088, the driver will not return RGB pixel formats, since they are
      unsupported for this resolution.

5. *[optional]* Choose a different ``CAPTURE`` format than suggested via
   :c:func:`VIDIOC_S_FMT` on the ``CAPTURE`` queue. It is possible for the
   client to choose a different format than selected/suggested by the driver in
   :c:func:`VIDIOC_G_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``pixelformat``
         a raw pixel format.

     ``width``, ``height``
         frame buffer resolution of the decoded stream; typically unchanged from
         what was returned with :c:func:`VIDIOC_G_FMT`, but it may be different
         if the hardware supports composition and/or scaling.

   After performing this step, the client must perform step 3 again in order
   to obtain up-to-date information about the buffer size and layout.

6. Allocate source (bytestream) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``OUTPUT`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``memory``
         follows standard semantics.

   * **Returned fields:**

     ``count``
         actual number of buffers allocated.

   * If required, the driver will adjust ``count`` to be equal to or greater
     than the minimum number of ``OUTPUT`` buffers required for the given
     format and the requested count. The client must check this value after
     the ioctl returns to get the actual number of buffers allocated.

7. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero. The client
         is responsible for deducing the minimum number of buffers required
         for the stream to be properly decoded (taking e.g. reference frames
         into account) and passing an equal or greater number.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``memory``
         follows standard semantics. ``V4L2_MEMORY_USERPTR`` is not supported
         for ``CAPTURE`` buffers.

   * **Returned fields:**

     ``count``
         adjusted to the allocated number of buffers, in case the codec requires
         more buffers than requested.

   * The driver must adjust ``count`` so that it is at least the minimum number
     of ``CAPTURE`` buffers required for the current format and stream
     configuration. The client must check this value after the ioctl
     returns to get the number of buffers allocated.

8. Allocate requests (likely one per ``OUTPUT`` buffer) via
   :c:func:`MEDIA_IOC_REQUEST_ALLOC` on the media device.

9. Start streaming on both ``OUTPUT`` and ``CAPTURE`` queues via
   :c:func:`VIDIOC_STREAMON`.

Decoding
========

For each frame, the client is responsible for submitting at least one request to
which the following is attached:

* The amount of encoded data expected by the codec for its current
  configuration, as a buffer submitted to the ``OUTPUT`` queue. Typically, this
  corresponds to one frame worth of encoded data, but some formats may allow (or
  require) different amounts per unit.
* All the metadata needed to decode the submitted encoded data, in the form of
  controls relevant to the format being decoded.

The amount of data and contents of the source ``OUTPUT`` buffer, as well as the
controls that must be set on the request, depend on the active coded pixel
format and might be affected by codec-specific extended controls, as stated in
documentation of each format.

If there is a possibility that the decoded frame will require one or more
decode requests after the current one in order to be produced, then the client
must set the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag on the ``OUTPUT``
buffer. This will result in the (potentially partially) decoded ``CAPTURE``
buffer not being made available for dequeueing, and reused for the next decode
request if the timestamp of the next ``OUTPUT`` buffer has not changed.

A typical frame would thus be decoded using the following sequence:

1. Queue an ``OUTPUT`` buffer containing one unit of encoded bytestream data for
   the decoding request, using :c:func:`VIDIOC_QBUF`.

   * **Required fields:**

     ``index``
         index of the buffer being queued.

     ``type``
         type of the buffer.

     ``bytesused``
         number of bytes taken by the encoded data frame in the buffer.

     ``flags``
         the ``V4L2_BUF_FLAG_REQUEST_FD`` flag must be set. Additionally, if
         the client is not sure that the current decode request is the last
         one needed to produce a fully decoded frame, then
         ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` must also be set.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     ``timestamp``
         must be set to a unique value per frame. This value will be propagated
         into the decoded frame's buffer and can also be used to reference this
         frame from another frame. If using multiple decode requests per
         frame, then the timestamps of all the ``OUTPUT`` buffers for a given
         frame must be identical. If the timestamp changes, then the currently
         held ``CAPTURE`` buffer will be made available for dequeuing and the
         current request will work on a new ``CAPTURE`` buffer.

2. Set the codec-specific controls for the decoding request, using
   :c:func:`VIDIOC_S_EXT_CTRLS`.

   * **Required fields:**

     ``which``
         must be ``V4L2_CTRL_WHICH_REQUEST_VAL``.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     other fields
         other fields are set as usual when setting controls. The ``controls``
         array must contain all the codec-specific controls required to decode
         a frame.

   .. note::

      It is possible to specify the controls in different invocations of
      :c:func:`VIDIOC_S_EXT_CTRLS`, or to overwrite a previously set control, as
      long as ``request_fd`` and ``which`` are properly set. The controls state
      at the moment of request submission is the one that will be considered.

   .. note::

      The order in which steps 1 and 2 take place is interchangeable.

3. Submit the request by invoking :c:func:`MEDIA_REQUEST_IOC_QUEUE` on the
   request FD.

   If the request is submitted without an ``OUTPUT`` buffer, or if some of the
   required controls are missing from the request, then
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` will return ``-ENOENT``. If more than one
   ``OUTPUT`` buffer is queued, then it will return ``-EINVAL``.
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` returning non-zero means that no
   ``CAPTURE`` buffer will be produced for this request.

``CAPTURE`` buffers must not be part of the request, and are queued
independently. They are returned in decode order (i.e. the same order as coded
frames were submitted to the ``OUTPUT`` queue).

Runtime decoding errors are signaled by the dequeued ``CAPTURE`` buffers
carrying the ``V4L2_BUF_FLAG_ERROR`` flag. If a decoded reference frame has an
error, then all following decoded frames that refer to it also have the
``V4L2_BUF_FLAG_ERROR`` flag set, although the decoder will still try to
produce (likely corrupted) frames.

Buffer management while decoding
================================

Contrary to stateful decoders, a stateless decoder does not perform any kind of
buffer management: it only guarantees that dequeued ``CAPTURE`` buffers can be
used by the client for as long as they are not queued again. "Used" here
encompasses using the buffer for compositing or display.

A dequeued capture buffer can also be used as the reference frame of another
buffer.

A frame is specified as reference by converting its timestamp into nanoseconds,
and storing it into the relevant member of a codec-dependent control structure.
The :c:func:`v4l2_timeval_to_ns` function must be used to perform that
conversion. The timestamp of a frame can be used to reference it as soon as all
its units of encoded data are successfully submitted to the ``OUTPUT`` queue.

A decoded buffer containing a reference frame must not be reused as a decoding
target until all the frames referencing it have been decoded. The safest way to
achieve this is to refrain from queueing a reference buffer until all the
decoded frames referencing it have been dequeued. However, if the driver can
guarantee that buffers queued to the ``CAPTURE`` queue are processed in queued
order, then user-space can take advantage of this guarantee and queue a
reference buffer when the following conditions are met:

1. All the requests for frames affected by the reference frame have been
   queued, and

2. A sufficient number of ``CAPTURE`` buffers to cover all the decoded
   referencing frames have been queued.

When queuing a decoding request, the driver will increase the reference count of
all the resources associated with reference frames. This means that the client
can e.g. close the DMABUF file descriptors of reference frame buffers if it
will not need them afterwards.

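The requeueing rule above can be expressed as a small predicate. The
bookkeeping structure below is purely illustrative: its fields are assumptions
about what a client might track per reference frame, not something the API
provides.

```c
#include <stdbool.h>

/* Hypothetical per-reference-frame bookkeeping kept by the client. */
struct ref_tracking {
    unsigned int unsubmitted_dependents; /* referencing frames not yet queued as requests */
    unsigned int undequeued_dependents;  /* referencing frames queued but not yet dequeued */
    unsigned int queued_capture_bufs;    /* CAPTURE buffers currently queued */
};

/* Returns true if the CAPTURE buffer holding the reference frame may be
 * queued again as a decoding target. */
static bool can_requeue_reference(const struct ref_tracking *t,
                                  bool driver_processes_in_order)
{
    /* Condition 1: every request that uses the reference has been queued. */
    if (t->unsubmitted_dependents > 0)
        return false;

    /* Safest case: every referencing frame has already been dequeued. */
    if (t->undequeued_dependents == 0)
        return true;

    /* Early requeueing relies on the driver's in-order guarantee. */
    if (!driver_processes_in_order)
        return false;

    /* Condition 2: enough CAPTURE buffers are queued ahead of this one to
     * receive all the referencing frames still in flight. */
    return t->queued_capture_bufs >= t->undequeued_dependents;
}
```
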
Seeking
=======

In order to seek, the client just needs to submit requests using input buffers
corresponding to the new stream position. It must however be aware that the
resolution may have changed and follow the dynamic resolution change sequence in
that case. Also, depending on the codec used, picture parameters (e.g. SPS/PPS
for H.264) may have changed and the client is responsible for making sure that a
valid state is sent to the decoder.

The client is then free to ignore any returned ``CAPTURE`` buffer that comes
from the pre-seek position.

Pausing
=======

In order to pause, the client can just cease queuing buffers onto the ``OUTPUT``
queue. Without source bytestream data, there is no data to process and the codec
will remain idle.

Dynamic resolution change
=========================

If the client detects a resolution change in the stream, it will need to perform
the initialization sequence again with the new resolution:

1. If the last submitted request resulted in a ``CAPTURE`` buffer being
   held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
   last frame is not available on the ``CAPTURE`` queue. In this case, a
   ``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
   dequeue the held ``CAPTURE`` buffer.

2. Wait until all submitted requests have completed and dequeue the
   corresponding output buffers.

3. Call :c:func:`VIDIOC_STREAMOFF` on both the ``OUTPUT`` and ``CAPTURE``
   queues.

4. Free all ``CAPTURE`` buffers by calling :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue with a buffer count of zero.

5. Perform the initialization sequence again (minus the allocation of
   ``OUTPUT`` buffers), with the new resolution set on the ``OUTPUT`` queue.
   Note that due to resolution constraints, a different format may need to be
   picked on the ``CAPTURE`` queue.

Drain
=====

If the last submitted request resulted in a ``CAPTURE`` buffer being
held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
last frame is not available on the ``CAPTURE`` queue. In this case, a
``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
dequeue the held ``CAPTURE`` buffer.

After that, in order to drain the stream on a stateless decoder, the client
just needs to wait until all the submitted requests are completed.
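
One way to wait for a submitted request is to poll its file descriptor. This
section does not mandate a particular mechanism; the sketch below relies on the
:ref:`media-request-api` behaviour of signalling completion as a ``POLLPRI``
event on the request FD. The helper name and its return convention are
assumptions made for this example.

```c
#include <poll.h>

/* Hypothetical helper: waits up to timeout_ms for one request to complete.
 * Returns 1 if the request completed, 0 on timeout, -1 on error. */
static int wait_for_request(int request_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = request_fd, .events = POLLPRI };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    return (pfd.revents & POLLPRI) ? 1 : -1;
}
```

A client draining the stream would call such a helper once for every request
that is still in flight.
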