==========================
Bulk Register Access (BRA)
==========================

Conventions
-----------

Capitalized words used in this documentation are intentional and refer
to concepts of the SoundWire 1.x specification.

Introduction
------------

The SoundWire 1.x specification provides a mechanism to speed up
command/control transfers by reclaiming parts of the audio
bandwidth. The Bulk Register Access (BRA) protocol is a standard
solution based on the Bulk Payload Transport (BPT) definitions.

The regular control channel uses Column 0 and can only send/retrieve
one byte per frame with write/read commands. With a typical 48 kHz
frame rate, only 48 kB/s can be transferred.

The optional Bulk Register Access capability can transmit up to 12
Mbits/s and reduce transfer times by several orders of magnitude, but
has multiple design constraints:

  (1) Each frame can only support a read or a write transfer, with a
      10-byte overhead per frame (header and footer response).

  (2) The read/writes SHALL be from/to contiguous register addresses
      in the same frame. A fragmented register space decreases the
      efficiency of the protocol by requiring multiple BRA transfers
      scheduled in different frames.

  (3) The targeted Peripheral device SHALL support the optional Data
      Port 0, and likewise the Manager SHALL expose audio-like Ports
      to insert BRA packets in the audio payload using the concepts of
      Sample Interval, HSTART, HSTOP, etc.

  (4) The BRA transport efficiency depends on the available
      bandwidth. If there are no on-going audio transfers, the entire
      frame minus Column 0 can be reclaimed for BRA. The frame shape
      also impacts efficiency: since Column 0 cannot be used for
      BPT/BRA, the frame should rely on a large number of columns and
      minimize the number of rows. The bus clock should be as high as
      possible.

  (5) The number of bits transferred per frame SHALL be a multiple of
      8 bits. Padding bits SHALL be inserted if necessary at the end
      of the data.

  (6) The regular read/write commands can be issued in parallel with
      BRA transfers. This is convenient to e.g. deal with alerts, jack
      detection or change the volume during firmware download, but
      accessing the same address with two independent protocols must
      be avoided, since the resulting behavior is undefined.

  (7) Some implementations may not be capable of handling the
      bandwidth of the BRA protocol, e.g. in the case of a slow I2C
      bus behind the SoundWire IP. In this case, the transfers may
      need to be spaced in time or flow-controlled.

  (8) Each BRA packet SHALL be marked as 'Active' when valid data is
      to be transmitted. This allows software to allocate a BRA
      stream but not transmit/discard data while processing the
      results or preparing the next batch of data, or while allowing
      the Peripheral to deal with the previous transfer. In addition,
      a BRA transfer can be started early on without data being ready.

  (9) Up to 470 bytes may be transmitted per frame.

  (10) The address is represented with 32 bits and does not rely on
       the paging registers used for the regular command/control
       protocol in Column 0.

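The arithmetic behind constraints (1), (5) and (9) can be sketched with
a few helpers. This is only an illustration: the function names are
hypothetical, and the caller chooses how many data bytes fit in a
frame, since that depends on the frame shape and on concurrent audio
streams.

```c
#include <assert.h>
#include <stdint.h>

/* Padding bits needed to round a per-frame bit count up to a multiple of 8. */
static unsigned int bra_padding_bits(unsigned int bits_per_frame)
{
	return (8 - (bits_per_frame % 8)) % 8;
}

/*
 * Number of frames needed to move total_bytes when each frame carries
 * at most data_bytes_per_frame bytes of register data.
 */
static uint64_t bra_frames_needed(uint64_t total_bytes,
				  unsigned int data_bytes_per_frame)
{
	assert(data_bytes_per_frame > 0);
	return (total_bytes + data_bytes_per_frame - 1) / data_bytes_per_frame;
}

/* Transfer time in microseconds, rounded up, for a frame rate in Hz. */
static uint64_t frames_to_us(uint64_t frames, unsigned int frame_rate_hz)
{
	assert(frame_rate_hz > 0);
	return (frames * 1000000ULL + frame_rate_hz - 1) / frame_rate_hz;
}
```

As a rough comparison at a 48 kHz frame rate, a 64 KiB download needs
65536 frames (about 1.37 s) at one byte per frame. If each frame
carried 460 data bytes, it would need 143 frames (about 3 ms). The
460-byte figure is an assumption for illustration (470 bytes minus the
10-byte overhead) and is not taken from the specification.
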
Error checking
--------------

Firmware download is one of the key usages of the Bulk Register Access
protocol. To make sure the binary data integrity is not compromised by
transmission or programming errors, each BRA packet provides:

  (1) A CRC on the 7-byte header. This CRC helps the Peripheral Device
      check if it is addressed and set the start address and number of
      bytes. The Peripheral Device provides a response in Byte 7.

  (2) A CRC on the data block (header excluded). This CRC is
      transmitted as the last-but-one byte in the packet, prior to the
      footer response.

The header response can be one of:

  (a) Ack

  (b) Nak

  (c) Not Ready

The footer response can be one of:

  (a) Ack

  (b) Nak (CRC failure)

  (c) Good (operation completed)

  (d) Bad (operation failed)

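The following sketch shows one way software could classify a frame
from the two responses listed above. The enum names and values are
hypothetical and do not reflect the on-wire encoding. It assumes that
a footer Ack counts as success and that a Not Ready header counts as
failure, which may not hold for every implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the numeric values are NOT the on-wire encoding. */
enum bra_header_rsp { BRA_HDR_ACK, BRA_HDR_NAK, BRA_HDR_NOT_READY };
enum bra_footer_rsp { BRA_FTR_ACK, BRA_FTR_NAK, BRA_FTR_GOOD, BRA_FTR_BAD };

/*
 * Assumption: a frame is successful only if the header was acknowledged
 * and the footer reports Ack or Good. Nak, Bad and Not Ready all fail.
 */
static bool bra_frame_ok(enum bra_header_rsp hdr, enum bra_footer_rsp ftr)
{
	if (hdr != BRA_HDR_ACK)
		return false;
	return ftr == BRA_FTR_ACK || ftr == BRA_FTR_GOOD;
}
```
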
Example frame
-------------

The example below is not to scale and makes simplifying assumptions
for clarity. The different chunks in the BRA packets are not required
to start on a new SoundWire Row, and the scale of data may vary.

::

    +---+--------------------------------------------+
    +   |                                            |
    +   |             BRA HEADER                     |
    +   |                                            |
    +   +--------------------------------------------+
    + C |             HEADER CRC                     |
    + O +--------------------------------------------+
    + M |             HEADER RESPONSE                |
    + M +--------------------------------------------+
    + A |                                            |
    + N |                                            |
    + D |                 DATA                       |
    +   |                                            |
    +   |                                            |
    +   |                                            |
    +   +--------------------------------------------+
    +   |                 DATA CRC                   |
    +   +--------------------------------------------+
    +   |             FOOTER RESPONSE                |
    +---+--------------------------------------------+

Assuming the frame uses N columns, the configuration shown above can
be programmed by setting the DP0 registers as:

    - HSTART = 1
    - HSTOP = N - 1
    - Sample Interval = N
    - WordLength = N - 1

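The DP0 settings listed above follow directly from the number of
columns. A minimal sketch, using a hypothetical structure and helper
name, and storing the logical values from the list rather than any
register encoding:

```c
#include <assert.h>

/* Hypothetical container for the DP0 settings listed above. */
struct bra_dp0_params {
	unsigned int hstart;
	unsigned int hstop;
	unsigned int sample_interval;
	unsigned int word_length;
};

/* Fill the DP0 settings for a frame with 'columns' columns (N in the text). */
static struct bra_dp0_params bra_dp0_params_for(unsigned int columns)
{
	struct bra_dp0_params p;

	/* Column 0 is reserved, so at least one more column is needed. */
	assert(columns >= 2);
	p.hstart = 1;
	p.hstop = columns - 1;
	p.sample_interval = columns;
	p.word_length = columns - 1;
	return p;
}
```

For a 16-column frame this yields HSTART = 1, HSTOP = 15, Sample
Interval = 16 and WordLength = 15.
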
Addressing restrictions
-----------------------

The Device Number specified in the Header follows the SoundWire
definitions, and broadcast and group addressing are permitted. For now
the Linux implementation only allows for a single BPT transfer to a
single device at a time. This might be revisited at a later point as
an optimization to send the same firmware to multiple devices, but
this would only be beneficial for single-link solutions.

In the case of multiple Peripheral devices attached to different
Managers, broadcast and group addressing are not supported by the
SoundWire specification. Each device must be handled with separate BRA
streams, possibly in parallel, since the links are independent.

Unsupported features
--------------------

The Bulk Register Access specification provides a number of
capabilities that are not supported in known implementations, such as:

  (1) Transfers initiated by a Peripheral Device. The BRA Initiator is
      always the Manager Device.

  (2) Flow-control capabilities and retransmission based on the
      'NotReady' header response require extra buffering in the
      SoundWire IP and are not implemented.

Bi-directional handling
-----------------------

The BRA protocol can handle writes as well as reads, and in each
packet the header and footer response are provided by the Peripheral
Target device. On the Peripheral device, the BRA protocol is handled
by a single DP0 data port, and at the low-level the bus ownership
will change for header/footer response as well as the data transmitted
during a read.

On the host side, most implementations rely on a Port-like concept,
with two FIFOs consuming/generating data transfers in parallel
(Host->Peripheral and Peripheral->Host). The amount of data
consumed/produced by these FIFOs is not symmetrical; as a result,
hardware typically inserts markers to help software and hardware
interpret raw data.

Each packet will typically have:

  (1) a 'Start of Packet' indicator.

  (2) an 'End of Packet' indicator.

  (3) a packet identifier to correlate the data requested and
      transmitted, and the error status for each frame.

Hardware implementations can check errors at the frame level, and
retry a transfer in case of errors. However, as for the flow-control
case, this requires extra buffering and intelligence in the
hardware. The Linux support assumes that the entire transfer is
cancelled if a single error is detected in one of the responses.

Abstraction required
~~~~~~~~~~~~~~~~~~~~

There are no standard registers or mandatory implementation at the
Manager level, so the low-level BPT/BRA details must be hidden in
Manager-specific code. For example the Cadence IP format above is not
known to the codec drivers.

Likewise, codec drivers should not have to know the frame size. The
computation of CRC and handling of responses is handled in helpers and
Manager-specific code.

The host BRA driver may also have restrictions on pages allocated for
DMA, or other host-DSP communication protocols. The codec driver
should not be aware of any of these restrictions, since it might be
reused in combination with different implementations of Manager IPs.

Concurrency between BRA and regular read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The existing 'nread/nwrite' API already relies on a notion of start
address and number of bytes, so it would be possible to extend this
API with a 'hint' requesting BPT/BRA be used.

However, BRA transfers could be quite long, and the use of a single
mutex for regular read/write and BRA is a show-stopper. Independent
operation of the control/command and BRA transfers is a fundamental
requirement, e.g. to change the volume level with the existing regmap
interface while downloading firmware. The integration must however
ensure that there is no concurrent access to the same address with
the command/control protocol and the BRA protocol.

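One way to reason about the "same address" rule is an overlap test
between the register range covered by a BRA transfer and the range
touched by a regular read/write. The helper below is a hypothetical
illustration and not part of the SoundWire bus API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [a_start, a_start + a_len) and [b_start, b_start + b_len)
 * share at least one register address. 64-bit arithmetic avoids overflow
 * for ranges ending at the top of the 32-bit address space.
 */
static bool reg_ranges_overlap(uint32_t a_start, uint32_t a_len,
			       uint32_t b_start, uint32_t b_len)
{
	uint64_t a_end = (uint64_t)a_start + a_len;
	uint64_t b_end = (uint64_t)b_start + b_len;

	if (a_len == 0 || b_len == 0)
		return false;
	return a_start < b_end && b_start < a_end;
}
```

A caller could use such a check to reject or defer a regular access
that falls inside the range of an in-flight BRA transfer.
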
In addition, the 'sdw_msg' structure hard-codes support for 16-bit
addresses and paging registers which are irrelevant for BPT/BRA
support based on native 32-bit addresses. A separate API with
'sdw_bpt_msg' makes more sense.

One possible strategy to speed up all initialization tasks would be to
start a BRA transfer for firmware download, then deal with all the
"regular" read/writes in parallel with the command channel, and last
to wait for the BRA transfers to complete. This would allow for a
degree of overlap instead of a purely sequential solution. As such,
the BRA API must support async transfers and expose a separate wait
function.

Peripheral/bus interface
------------------------

The bus interface for BPT/BRA is made of two functions:

    - sdw_bpt_send_async(bpt_message)

      This function sends the data using the Manager
      implementation-defined capabilities (typically DMA or IPC
      protocol).

      Queueing is currently not supported: the caller
      needs to wait for completion of the requested transfer.

    - sdw_bpt_wait()

      This function waits for the entire message provided by the
      codec driver in the 'send_async' stage. Intermediate status for
      smaller chunks will not be provided back to the codec driver,
      only a return code will be provided.

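The send/wait pattern described above can be modelled in a few lines
of user-space C. Everything prefixed with ``mock_`` is hypothetical:
the structures, signatures and error codes are assumptions for
illustration and do not reproduce the kernel prototypes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: 32-bit start address, length and data buffer. */
struct mock_bpt_msg {
	uint32_t addr;
	size_t len;
	const uint8_t *buf;
};

/* Hypothetical per-link state: at most one transfer in flight. */
struct mock_bpt_link {
	bool busy;
	int status;	/* result of the in-flight transfer */
};

/* Start a transfer; there is no queueing, so a second request is rejected. */
static int mock_bpt_send_async(struct mock_bpt_link *link,
			       const struct mock_bpt_msg *msg)
{
	assert(link != NULL);
	if (!msg || !msg->buf || msg->len == 0)
		return -EINVAL;
	if (link->busy)
		return -EBUSY;
	link->busy = true;
	link->status = 0;	/* a real Manager would start DMA or IPC here */
	return 0;
}

/* Wait for the whole message; only a single return code is reported. */
static int mock_bpt_wait(struct mock_bpt_link *link)
{
	assert(link != NULL);
	if (!link->busy)
		return -EINVAL;
	link->busy = false;	/* a real Manager would block until completion */
	return link->status;
}

/* Overlap pattern: start the download, do other work, then wait. */
static int mock_download(struct mock_bpt_link *link,
			 const struct mock_bpt_msg *fw)
{
	int ret = mock_bpt_send_async(link, fw);

	if (ret < 0)
		return ret;
	/* regular command/control accesses to other addresses could go here */
	return mock_bpt_wait(link);
}
```

The model returns a busy error for a second send issued before the
wait, which is one possible way to express the lack of queueing.
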
Regmap use
~~~~~~~~~~

Existing codec drivers rely on regmap to download firmware to
Peripherals. regmap exposes an async interface similar to the
send/wait API suggested above, so at a high-level it would seem
natural to combine BRA and regmap. The regmap layer could check if BRA
is available or not, and use a regular read-write command channel in
the latter case.

The regmap integration will be handled in a second step.

BRA stream model
----------------

For regular audio transfers, the machine driver exposes a dailink
connecting CPU DAI(s) and Codec DAI(s).

This model is not required for BRA support:

   (1) The SoundWire DAIs are mainly wrappers for SoundWire Data
       Ports, with possibly some analog or audio conversion
       capabilities bolted behind the Data Port. In the context of
       BRA, the DP0 is the destination. DP0 registers are standard and
       can be programmed blindly without knowing what Peripheral is
       connected to each link. In addition, if there are multiple
       Peripherals on a link and some of them do not support DP0, the
       write commands to program DP0 registers will generate harmless
       COMMAND_IGNORED responses that will be wired-ORed with
       responses from Peripherals which support DP0. In other words,
       the DP0 programming can be done with broadcast commands, and
       the information on the Target device can be added only in the
       BRA Header.

   (2) At the CPU level, the DAI concept is not useful for BRA; the
       machine driver will not create a dailink relying on DP0. The
       only concept that is needed is the notion of port.

   (3) The stream concept relies on a set of master_rt and slave_rt
       concepts. All of these entities represent ports and not DAIs.

   (4) With the assumption that a single BRA stream is used per link,
       that stream can connect master ports as well as all peripheral
       DP0 ports.

   (5) BRA transfers only make sense in the context of one
       Manager/Link, so the BRA stream handling does not rely on the
       concept of multi-link aggregation allowed by regular DAI links.

Audio DMA support
-----------------

Some DMAs, such as HDaudio, require an audio format field to be
set. This format is in turn used to define acceptable bursts. BPT/BRA
support is not fully compatible with these definitions in that the
format and bandwidth may vary between read and write commands.

In addition, on Intel HDaudio platforms the DMAs need to be
programmed with a PCM format matching the bandwidth of the BPT/BRA
transfer. The format is based on 192kHz 32-bit samples, and the number
of channels varies to adjust the bandwidth. The notion of channel is
completely notional since the data is not typical audio
PCM. Programming such channels helps reserve enough bandwidth and adjust
FIFO sizes to avoid xruns.

Alignment requirements are currently not enforced at the core level
but at the platform-level, e.g. for Intel the data sizes must be
equal to or larger than 16 bytes.

  257. Alignment requirements are currently not enforced at the core level
  258. but at the platform-level, e.g. for Intel the data sizes must be
  259. equal to or larger than 16 bytes.