================================================
Generic bitfield packing and unpacking functions
================================================

Problem statement
-----------------

When working with hardware, one has to choose between several approaches of
interfacing with it.

One can memory-map a pointer to a carefully crafted struct over the hardware
device's memory region, and access its fields as struct members (potentially
declared as bitfields). But writing code this way would make it less portable,
due to potential endianness mismatches between the CPU and the hardware device.
Additionally, one has to pay close attention when translating register
definitions from the hardware documentation into bit field indices for the
structs. Also, some hardware (typically networking equipment) tends to group
its register fields in ways that violate any reasonable word boundaries
(sometimes even 64-bit ones). This creates the inconvenience of having to
define "high" and "low" portions of register fields within the struct.

A more robust alternative to struct field definitions would be to extract the
required fields by shifting the appropriate number of bits. But this would
still not protect from endianness mismatches, unless all memory accesses
were performed byte by byte. Also, the code can easily get cluttered, and the
high-level idea might get lost among the many bit shifts required.

Many drivers take the bit-shifting approach and then attempt to reduce the
clutter with tailored macros, but more often than not these macros take
shortcuts that still prevent the code from being truly portable.

The solution
------------

This API deals with 2 basic operations:

- Packing a CPU-usable number into a memory buffer (with hardware
  constraints/quirks)
- Unpacking a memory buffer (which has hardware constraints/quirks)
  into a CPU-usable number.

The API offers an abstraction over said hardware constraints and quirks,
over CPU endianness, and therefore over possible mismatches between
the two.

The basic unit of these API functions is the u64. From the CPU's
perspective, bit 63 always means bit offset 7 of byte 7, albeit only
logically. The question is: where do we lay this bit out in memory?

The following examples cover the memory layout of a packed u64 field.
The byte offsets in the packed buffer are always implicitly 0, 1, ... 7.
What the examples show is where the logical bytes and bits sit.

1. Normally (no quirks), we would do it like this:

::

  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
   7                       6                       5                       4
  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
   3                       2                       1                       0

That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the
LSByte (0) of the u64 sits at memory offset 7.
This corresponds to what most folks would regard as "big endian", where
bit i corresponds to the number 2^i. This is also referred to in the code
comments as "logical" notation.

2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

::

  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
   7                       6                       5                       4
  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
   3                       2                       1                       0

That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.

3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:

::

  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
   4                       5                       6                       7
   7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
   0                       1                       2                       3

Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every
byte from each 4-byte word is placed at its mirrored position compared to
the boundary of that word.

4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it
   like this:

::

  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
   4                       5                       6                       7
   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
   0                       1                       2                       3

5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:

::

  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
   3                       2                       1                       0
  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
   7                       6                       5                       4

In this case the 8-byte memory region is interpreted as follows: the first
4 bytes correspond to the least significant 4-byte word, the next 4 bytes to
the more significant 4-byte word.

6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like
   this:

::

  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
   3                       2                       1                       0
  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
   7                       6                       5                       4

7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like
   this:

::

   7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
   0                       1                       2                       3
  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
   4                       5                       6                       7

8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT
   are set, it looks like this:

::

   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
   0                       1                       2                       3
  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
   4                       5                       6                       7

We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.

Note on buffer lengths not multiple of 4
----------------------------------------

To deal with memory layout quirks where groups of 4 bytes are laid out "little
endian" relative to each other, but "big endian" within the group itself, the
concept of groups of 4 bytes is intrinsic to the packing API (not to be
confused with the memory access, which is performed byte by byte, though).

With buffer lengths that are not a multiple of 4, this means one group will be
incomplete. Depending on the quirks, this may lead to discontinuities in the
bit fields accessible through the buffer. The packing API assumes
discontinuities were not the intention of the memory layout, so it avoids them
by logically shortening the most significant group of 4 octets to the number
of octets actually available.

An example with a 31-byte buffer is given below. Physical buffer offsets are
implicit, and increase from left to right within a group, and from top to
bottom within a column.

No quirks:

::

  30 29 28    | Group 7 (most significant)
  27 26 25 24 | Group 6
  23 22 21 20 | Group 5
  19 18 17 16 | Group 4
  15 14 13 12 | Group 3
  11 10  9  8 | Group 2
   7  6  5  4 | Group 1
   3  2  1  0 | Group 0 (least significant)

QUIRK_LSW32_IS_FIRST:

::

   3  2  1  0 | Group 0 (least significant)
   7  6  5  4 | Group 1
  11 10  9  8 | Group 2
  15 14 13 12 | Group 3
  19 18 17 16 | Group 4
  23 22 21 20 | Group 5
  27 26 25 24 | Group 6
  30 29 28    | Group 7 (most significant)

QUIRK_LITTLE_ENDIAN:

::

  30 28 29    | Group 7 (most significant)
  24 25 26 27 | Group 6
  20 21 22 23 | Group 5
  16 17 18 19 | Group 4
  12 13 14 15 | Group 3
   8  9 10 11 | Group 2
   4  5  6  7 | Group 1
   0  1  2  3 | Group 0 (least significant)

QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST:

::

   0  1  2  3 | Group 0 (least significant)
   4  5  6  7 | Group 1
   8  9 10 11 | Group 2
  12 13 14 15 | Group 3
  16 17 18 19 | Group 4
  20 21 22 23 | Group 5
  24 25 26 27 | Group 6
  28 29 30    | Group 7 (most significant)

Intended use
------------

Drivers that opt to use this API first need to identify which combination of
the above 3 quirks (8 possible combinations in total) matches what the
hardware documentation describes.

There are 3 supported usage patterns, detailed below.

packing()
^^^^^^^^^

This API function is deprecated.

The packing() function returns an int-encoded error code, which protects the
programmer against incorrect API use. The errors are not expected to occur
during runtime, therefore it is reasonable to wrap packing() into a custom
function which returns void and swallows those errors. Optionally it can
dump the stack or print the error description.

.. code-block:: c

  void my_packing(void *buf, u64 *val, int startbit, int endbit,
                  size_t len, enum packing_op op)
  {
          int err;

          /* Adjust quirks accordingly */
          err = packing(buf, val, startbit, endbit, len, op, QUIRK_LSW32_IS_FIRST);
          if (likely(!err))
                  return;

          if (err == -EINVAL) {
                  pr_err("Start bit (%d) expected to be larger than end (%d)\n",
                         startbit, endbit);
          } else if (err == -ERANGE) {
                  if ((startbit - endbit + 1) > 64)
                          pr_err("Field %d-%d too large for 64 bits!\n",
                                 startbit, endbit);
                  else
                          pr_err("Cannot store %llx inside bits %d-%d (would truncate)\n",
                                 *val, startbit, endbit);
          }
          dump_stack();
  }

pack() and unpack()
^^^^^^^^^^^^^^^^^^^

These are const-correct variants of packing(), and eliminate the "enum
packing_op op" argument.

Calling pack(...) is equivalent, and preferred, to calling packing(..., PACK).

Calling unpack(...) is equivalent, and preferred, to calling packing(..., UNPACK).

pack_fields() and unpack_fields()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The library exposes optimized functions for the scenario where there are many
fields represented in a buffer, and it encourages consumer drivers to avoid
repetitive calls to pack() and unpack() for each field, but instead use
pack_fields() and unpack_fields(), which reduces the code footprint.

These APIs use field definitions in arrays of ``struct packed_field_u8`` or
``struct packed_field_u16``, allowing consumer drivers to minimize the size
of these arrays according to their custom requirements.

The pack_fields() and unpack_fields() API functions are actually macros which
automatically select the appropriate function at compile time, based on the
type of the fields array passed in.

An additional benefit over pack() and unpack() is that sanity checks on the
field definitions are handled at compile time with ``BUILD_BUG_ON`` rather
than only when the offending code is executed. These functions return void and
wrapping them to handle unexpected errors is not necessary.

It is recommended, but not required, that you wrap your packed buffer into a
structured type with a fixed size. This generally makes it easier for the
compiler to enforce that the correct size buffer is used.

Here is an example of how to use the fields APIs:

.. code-block:: c

  /* Ordering inside the unpacked structure is flexible and can be different
   * from the packed buffer. Here, it is optimized to reduce padding.
   */
  struct data {
          u64 field3;
          u32 field4;
          u16 field1;
          u8 field2;
  };

  #define SIZE 13

  typedef struct __packed { u8 buf[SIZE]; } packed_buf_t;

  /* Fields are listed from the most significant bit down and do not overlap. */
  static const struct packed_field_u8 fields[] = {
          PACKED_FIELD(100, 90, struct data, field1),
          PACKED_FIELD(89, 87, struct data, field2),
          PACKED_FIELD(86, 30, struct data, field3),
          PACKED_FIELD(29, 0, struct data, field4),
  };

  void unpack_your_data(const packed_buf_t *buf, struct data *unpacked)
  {
          BUILD_BUG_ON(sizeof(*buf) != SIZE);

          unpack_fields(buf, sizeof(*buf), unpacked, fields,
                        QUIRK_LITTLE_ENDIAN);
  }

  void pack_your_data(const struct data *unpacked, packed_buf_t *buf)
  {
          BUILD_BUG_ON(sizeof(*buf) != SIZE);

          pack_fields(buf, sizeof(*buf), unpacked, fields,
                      QUIRK_LITTLE_ENDIAN);
  }