============
dm-integrity
============

The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.

A general problem with storing integrity tags with every sector is that
writing the sector and the integrity tag must be atomic - i.e. in case of
a crash, either both the sector and the integrity tag are written or
neither of them is.

To guarantee write atomicity, the dm-integrity target uses a journal: it
writes sector data and integrity tags into the journal, commits the
journal, and then copies the data and integrity tags to their respective
locations.

The dm-integrity target can be used with the dm-crypt target - in this
situation the dm-crypt target creates the integrity data and passes them
to the dm-integrity target via bio_integrity_payload attached to the bio.
In this mode, the dm-crypt and dm-integrity targets provide authenticated
disk encryption - if the attacker modifies the encrypted device, an I/O
error is returned instead of random data.

The dm-integrity target can also be used as a standalone target. In this
mode it calculates and verifies the integrity tag internally, so it can
be used to detect silent data corruption on the disk or in the I/O path.

There's an alternate mode of operation where dm-integrity uses a bitmap
instead of a journal. If a bit in the bitmap is 1, the corresponding
region's data and integrity tags are not synchronized - if the machine
crashes, the tags for the unsynchronized regions will be recalculated.
The bitmap mode is faster than the journal mode, because the data don't
have to be written twice, but it is also less reliable, because if data
corruption happens when the machine crashes, it may not be detected.

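
The bitmap mode can be pictured with a small model. This is only an
illustrative sketch, not the kernel's implementation: the class name
``DirtyBitmap``, its methods and the in-memory set are hypothetical, and the
region size is passed in as a plain ``sectors_per_bit`` parameter.

```python
class DirtyBitmap:
    """Toy model of bitmap mode: one bit covers a fixed-size region."""

    def __init__(self, sectors_per_bit):
        self.sectors_per_bit = sectors_per_bit
        self.dirty = set()  # indices of bits that are currently 1

    def mark_dirty(self, sector, count):
        """Set the bit of every region touched by a write."""
        first = sector // self.sectors_per_bit
        last = (sector + count - 1) // self.sectors_per_bit
        self.dirty.update(range(first, last + 1))

    def mark_clean(self):
        """Clear all bits once data and tags are known to be in sync."""
        self.dirty.clear()

    def regions_to_recalculate(self):
        """After a crash, tags in these (start, length) ranges are recomputed."""
        return [(bit * self.sectors_per_bit, self.sectors_per_bit)
                for bit in sorted(self.dirty)]


bitmap = DirtyBitmap(sectors_per_bit=32768)
bitmap.mark_dirty(sector=32760, count=16)   # this write straddles two regions
print(bitmap.regions_to_recalculate())      # [(0, 32768), (32768, 32768)]
```

A write that crosses a region boundary dirties both regions, so a larger
region size means fewer bitmap updates but more recalculation after a crash.
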
When loading the target for the first time, the kernel driver will format
the device, but only if the superblock contains zeroes. If the superblock
is neither valid nor zeroed, the dm-integrity target can't be loaded.

Accesses to the on-disk metadata area containing checksums (aka tags) are
buffered using dm-bufio. When an access to any given metadata area
occurs, each unique metadata area gets its own buffer(s). The buffer size
is capped at the size of the metadata area, but may be smaller, thereby
requiring multiple buffers to represent the full metadata area. A smaller
buffer size will produce a smaller resulting read/write operation to the
metadata area for small reads/writes. The metadata is still read even in
a full write to the data covered by a single buffer.

To use the target for the first time:

1. overwrite the superblock with zeroes
2. load the dm-integrity target with a size of one sector; the kernel
   driver will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
5. load the dm-integrity target with the target size
   "provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
   with the size "provided_data_sectors"

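
Step 4 above could be sketched as follows. The byte offsets and field widths
used here are an assumption: this document lists the superblock fields in
order but does not state their sizes, so check them against the kernel source
before relying on them. The function name and the fake image are hypothetical.

```python
import struct

SECTOR_SIZE = 512

# Assumed layout of the leading superblock fields (little-endian):
# magic[8], version (u8), log2(interleave sectors) (u8),
# integrity tag size (u16), journal sections (u32),
# provided data sectors (u64).
SB_HEAD = struct.Struct("<8sBBHIQ")


def read_provided_data_sectors(image, reserved_sectors):
    """Return provided_data_sectors from a raw device image (bytes)."""
    # The superblock follows the reserved sectors.
    offset = reserved_sectors * SECTOR_SIZE
    fields = SB_HEAD.unpack_from(image, offset)
    return fields[5]


# Build a fake image to exercise the parser; all values are made up.
fake = bytes(8 * SECTOR_SIZE) + SB_HEAD.pack(b"integrt\0", 1, 15, 4, 2, 123456)
print(read_provided_data_sectors(fake, reserved_sectors=8))  # 123456
```
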
Target arguments:

1. the underlying block device

2. the number of reserved sectors at the beginning of the device - the
   dm-integrity target won't read or write these sectors

3. the size of the integrity tag (if "-" is used, the size is taken from
   the internal-hash algorithm)

4. mode:

    D - direct writes (without journal)
        in this mode, journaling is not used and data sectors and
        integrity tags are written separately. In case of a crash, it is
        possible that the data and the integrity tag don't match.
    J - journaled writes
        data and integrity tags are written to the journal and atomicity
        is guaranteed. In case of a crash, either both data and tag or
        none of them are written. The journaled mode reduces write
        throughput by half because the data have to be written twice.
    B - bitmap mode - data and metadata are written without any
        synchronization, the driver maintains a bitmap of dirty
        regions where data and metadata don't match. This mode can
        only be used with internal hash.
    R - recovery mode - in this mode, the journal is not replayed,
        checksums are not checked and writes to the device are not
        allowed. This mode is useful for data recovery if the
        device cannot be activated in any of the other standard
        modes.
    I - inline mode - in this mode, dm-integrity will store integrity
        data directly in the underlying device sectors.
        The underlying device must have an integrity profile that
        allows storing user integrity data and provides enough
        space for the selected integrity tag.

5. the number of additional arguments

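
As a rough illustration, the five target arguments can be assembled into a
device-mapper table line. This sketch assumes the usual table layout of start
sector, length and target name followed by the target arguments; the helper
name ``integrity_table`` and the device path ``/dev/sdX`` are hypothetical.

```python
def integrity_table(length, device, reserved, tag_size, mode, extra_args=()):
    """Build one device-mapper table line for the integrity target.

    length     -- size of the mapped device in 512-byte sectors
    tag_size   -- integer tag size, or "-" to take it from internal_hash
    mode       -- one of the mode letters listed above
    extra_args -- additional arguments, e.g. ["internal_hash:crc32c"]
    """
    if mode not in ("D", "J", "B", "R", "I"):
        raise ValueError(f"unknown mode: {mode!r}")
    fields = [0, length, "integrity", device, reserved, tag_size, mode,
              len(extra_args), *extra_args]
    return " ".join(str(f) for f in fields)


# First load: a one-sector target so that the kernel formats the device.
print(integrity_table(1, "/dev/sdX", 0, "-", "J", ["internal_hash:crc32c"]))
# 0 1 integrity /dev/sdX 0 - J 1 internal_hash:crc32c
```

The fifth argument is derived from the length of the list, so the count and
the additional arguments that follow it cannot drift apart.
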
Additional arguments:

journal_sectors:number
    The size of the journal. This argument is used only when formatting
    the device. If the device is already formatted, the value from the
    superblock is used.

interleave_sectors:number (default 32768)
    The number of interleaved sectors. This value is rounded down to
    a power of two. If the device is already formatted, the value from
    the superblock is used.

meta_device:device
    Don't interleave the data and metadata on the device. Use a
    separate device for metadata.

buffer_sectors:number (default 128)
    The number of sectors in one metadata buffer. The value is rounded
    down to a power of two.

journal_watermark:number (default 50)
    The journal watermark as a percentage. When the size of the journal
    exceeds this watermark, the thread that flushes the journal will
    be started.

commit_time:number (default 10000)
    Commit time in milliseconds. When this time passes, the journal is
    written. The journal is also written immediately if a FLUSH
    request is received.

internal_hash:algorithm(:key)	(the key is optional)
    Use internal hash or crc.
    When this argument is used, the dm-integrity target won't accept
    integrity tags from the upper target, but it will automatically
    generate and verify the integrity tags.

    You can use a crc algorithm (such as crc32), in which case the
    integrity target will protect the data against accidental corruption.
    You can also use a hmac algorithm (for example
    "hmac(sha256):0123456789abcdef"), in which case it will provide
    cryptographic authentication of the data without encryption.

    When this argument is not used, the integrity tags are accepted
    from an upper layer target, such as dm-crypt. The upper layer
    target should check the validity of the integrity tags.

recalculate
    Recalculate the integrity tags automatically. It is only valid
    when using internal hash.

journal_crypt:algorithm(:key)	(the key is optional)
    Encrypt the journal using the given algorithm to make sure that the
    attacker can't read the journal. You can use a block cipher here
    (such as "cbc(aes)") or a stream cipher (for example "chacha20"
    or "ctr(aes)").

    The journal contains a history of the last writes to the block
    device, so an attacker reading the journal could see the last sector
    numbers that were written. From the sector numbers, the attacker can
    infer the size of files that were written. To protect against this
    situation, you can encrypt the journal.

journal_mac:algorithm(:key)	(the key is optional)
    Protect sector numbers in the journal from accidental or malicious
    modification. To protect against accidental modification, use a
    crc algorithm; to protect against malicious modification, use a
    hmac algorithm with a key.

    This option is not needed when using internal-hash because in this
    mode, the integrity of journal entries is checked when replaying
    the journal. Thus, a modified sector number would be detected at
    this stage.

block_size:number (default 512)
    The size of a data block in bytes. The larger the block size the
    less overhead there is for per-block integrity metadata.
    Supported values are 512, 1024, 2048 and 4096 bytes.

sectors_per_bit:number
    In the bitmap mode, this parameter specifies the number of
    512-byte sectors that correspond to one bitmap bit.

bitmap_flush_interval:number
    The bitmap flush interval in milliseconds. The metadata buffers
    are synchronized when this interval expires.

allow_discards
    Allow block discard requests (a.k.a. TRIM) for the integrity device.
    Discards are only allowed to devices using internal hash.

fix_padding
    Use a smaller padding of the tag area that is more
    space-efficient. If this option is not present, large padding is
    used - that is for compatibility with older kernels.

fix_hmac
    Improve security of internal_hash and journal_mac:

    - the section number is mixed into the mac, so that an attacker can't
      copy sectors from one journal section to another journal section
    - the superblock is protected by journal_mac
    - a 16-byte salt stored in the superblock is mixed into the mac, so
      that the attacker can't detect that two disks have the same hmac
      key and also to prevent the attacker from moving sectors from one
      disk to another

legacy_recalculate
    Allow recalculating of volumes with HMAC keys. This is disabled by
    default for security reasons - an attacker could modify the volume,
    set recalc_sector to zero, and the kernel would not detect the
    modification.

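
The difference between a crc and a hmac internal_hash, as described in the
arguments above, can be illustrated in userspace. This is a sketch under
simplifying assumptions: it hashes only the sector contents (this document
does not describe the kernel's exact tag input), it uses plain CRC-32 from
``zlib`` rather than crc32c, and all function names are hypothetical.

```python
import hashlib
import hmac
import zlib


def crc_tag(sector_data):
    """4-byte CRC-32 tag: detects accidental corruption only."""
    return zlib.crc32(sector_data).to_bytes(4, "little")


def hmac_tag(sector_data, key):
    """32-byte HMAC-SHA256 tag: cannot be recomputed without the key."""
    return hmac.new(key, sector_data, hashlib.sha256).digest()


def verify(sector_data, stored_tag, make_tag):
    """Recompute the tag and compare it with the stored one."""
    return hmac.compare_digest(make_tag(sector_data), stored_tag)


key = bytes.fromhex("0123456789abcdef")
data = bytes(512)                 # one all-zero sector
corrupted = b"\x01" + data[1:]    # same sector with one byte changed

tag = hmac_tag(data, key)
print(verify(data, tag, lambda d: hmac_tag(d, key)))       # True
print(verify(corrupted, tag, lambda d: hmac_tag(d, key)))  # False
```

Anyone can recompute a crc tag after modifying a sector, which is why a crc
protects only against accidental corruption, while a keyed hmac tag cannot be
regenerated by an attacker who does not know the key.
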
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
table and swap the tables with suspend and resume). The other arguments
should not be changed when reloading the target because the layout of disk
data depends on them and the reloaded target would be non-functional.

For example, on a device using the default interleave_sectors of 32768, a
block_size of 512, and an internal_hash of crc32c with a tag size of 4
bytes, it will take 128 KiB of tags to track a full data area, requiring
256 sectors of metadata per data area. With the default buffer_sectors of
128, that means there will be 2 buffers per metadata area, or 2 buffers
per 16 MiB of data.

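
The arithmetic in the example above can be reproduced with a short script.
It is a simplified sketch that ignores any padding of the tag area; the
function names are hypothetical, and the rounding helper mirrors the "rounded
down to a power of two" rule stated for interleave_sectors and buffer_sectors.

```python
SECTOR_SIZE = 512


def round_down_pow2(n):
    """Round a positive integer down to a power of two."""
    return 1 << (n.bit_length() - 1)


def metadata_per_run(interleave_sectors=32768, block_size=512,
                     tag_size=4, buffer_sectors=128):
    """Return (tag_bytes, metadata_sectors, buffers) for one data area."""
    interleave_sectors = round_down_pow2(interleave_sectors)
    buffer_sectors = round_down_pow2(buffer_sectors)
    blocks = interleave_sectors * SECTOR_SIZE // block_size
    tag_bytes = blocks * tag_size
    metadata_sectors = -(-tag_bytes // SECTOR_SIZE)     # ceiling division
    buffers = -(-metadata_sectors // buffer_sectors)    # ceiling division
    return tag_bytes, metadata_sectors, buffers


print(metadata_per_run())                  # (131072, 256, 2)
print(metadata_per_run(block_size=4096))   # (16384, 32, 1)
```

The second call shows the effect of a larger block_size: with 4096-byte
blocks the same 16 MiB data area needs only 16 KiB of tags.
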
Status line:

1. the number of integrity mismatches
2. provided data sectors - that is the number of sectors that the user
   could use
3. the current recalculating position (or '-' if we didn't recalculate)

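
A minimal parser for the three status fields listed above might look like
this. It assumes the caller has already stripped the generic start, length
and target-name prefix from the status output; the type and function names
are hypothetical.

```python
from typing import NamedTuple, Optional


class IntegrityStatus(NamedTuple):
    mismatches: int
    provided_data_sectors: int
    recalc_position: Optional[int]  # None when the field is "-"


def parse_status(fields):
    """Parse the three target-specific status fields described above."""
    mismatches, provided, recalc = fields.split()[:3]
    return IntegrityStatus(
        int(mismatches),
        int(provided),
        None if recalc == "-" else int(recalc),
    )


status = parse_status("0 1998848 -")   # made-up sample values
print(status.mismatches, status.provided_data_sectors, status.recalc_position)
```
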
The layout of the formatted block device:

* reserved sectors
    (they are not used by this target, they can be used for
    storing LUKS metadata or for other purposes), the size of the reserved
    area is specified in the target arguments

* superblock (4KiB)
    * magic string - identifies that the device was formatted
    * version
    * log2(interleave sectors)
    * integrity tag size
    * the number of journal sections
    * provided data sectors - the number of sectors that this target
      provides (i.e. the size of the device minus the size of all
      metadata and padding). The user of this target should not send
      bios that access data beyond the "provided data sectors" limit.
    * flags
        SB_FLAG_HAVE_JOURNAL_MAC
            - a flag is set if journal_mac is used
        SB_FLAG_RECALCULATING
            - recalculating is in progress
        SB_FLAG_DIRTY_BITMAP
            - journal area contains the bitmap of dirty
              blocks
    * log2(sectors per block)
    * a position where recalculating finished

* journal
    The journal is divided into sections, each section contains:

    * metadata area (4KiB), it contains journal entries

      - every journal entry contains:

        * logical sector (specifies where the data and tag should
          be written)
        * last 8 bytes of data
        * integrity tag (the size is specified in the superblock)

      - every metadata sector ends with

        * mac (8 bytes); the macs in the 8 metadata sectors together
          form a 64-byte value. It is used to store the hmac of the
          sector numbers in the journal section, to protect against the
          possibility that the attacker tampers with sector
          numbers in the journal.
        * commit id

    * data area (the size is variable; it depends on how many journal
      entries fit into the metadata area)

      - every sector in the data area contains:

        * data (504 bytes of data, the last 8 bytes are stored in
          the journal entry)
        * commit id

    To test if the whole journal section was written correctly, every
    512-byte sector of the journal ends with an 8-byte commit id. If the
    commit id matches on all sectors in a journal section, then it is
    assumed that the section was written correctly. If the commit id
    doesn't match, the section was written partially and it should not
    be replayed.

* one or more runs of interleaved tags and data.
    Each run contains:

    * tag area - it contains integrity tags. There is one tag for each
      sector in the data area. The size of this area is always 4KiB or
      greater.
    * data area - it contains data sectors. The number of data sectors
      in one run must be a power of two. log2 of this value is stored
      in the superblock.
  248. in the superblock.