.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later

=============
Scrub Control
=============

Copyright (c) 2024-2025 HiSilicon Limited.

:Author:   Shiju Jose <shiju.jose@huawei.com>
:License:  The GNU Free Documentation License, Version 1.2 without
           Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
           (dual licensed under the GPL v2)

- Written for: 6.15

Introduction
------------

Increasing DRAM size and cost have made memory subsystem reliability an
important concern. DRAM modules are used where potentially corrupted data
could cause expensive or fatal issues. Memory errors are among the top
hardware failures that cause server and workload crashes.

Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine
reads data from each memory media location, corrects it if necessary and
writes the corrected data back to the same memory media location.

DIMMs can be scrubbed at a configurable rate to detect uncorrected memory
errors and attempt recovery from detected errors, providing the following
benefits:

1. Proactively scrubbing DIMMs reduces the chance of a correctable error
   becoming uncorrectable.

2. Unallocated memory pages in which uncorrected errors are detected are
   isolated and prevented from being allocated to an application or the OS.

3. This reduces the likelihood of software or hardware products encountering
   memory errors.

4. The additional data on failures in memory may be used to build up
   statistics that are later used to decide whether to use memory repair
   technologies such as Post Package Repair or Sparing.

There are two types of memory scrubbing:

1. Background (patrol) scrubbing while the DRAM is otherwise idle.

2. On-demand scrubbing for a specific address range or region of memory.

Several types of interfaces to hardware memory scrubbers have been
identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI
RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub).

The control mechanisms vary across different memory scrubbers. To enable
standardized userspace tooling, there is a need to present these controls
through a standardized ABI.

A generic memory EDAC scrub control allows users to manage underlying
scrubbers in the system through a standardized sysfs control interface. It
abstracts the management of various scrubbing functionalities into a unified
set of functions.

Use cases of common scrub control feature
-----------------------------------------

1. Several types of interfaces for hardware memory scrubbers have been
   identified, including the CXL memory device patrol scrub, CXL DDR5 ECS,
   ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range Scrub),
   and software-based memory scrubbers.

   Of the identified interfaces to hardware memory scrubbers, some support
   control over patrol (background) scrubbing (e.g., ACPI RAS2, CXL) and/or
   on-demand scrubbing (e.g., ACPI RAS2, ACPI ARS). However, the scrub control
   interfaces vary between memory scrubbers, highlighting the need for
   a standardized, generic sysfs scrub control interface that is accessible to
   userspace for administration and use by scripts/tools.

2. User-space scrub controls allow users to disable scrubbing if necessary,
   for example, to disable background patrol scrubbing or adjust the scrub
   rate for performance-aware operations where background activities need to
   be minimized or disabled.

3. User-space tools enable on-demand scrubbing for specific address ranges,
   provided that the scrubber supports this functionality.

4. User-space tools can also control memory DIMM scrubbing at a configurable
   scrub rate via sysfs scrub controls. This approach offers several benefits:

   4.1. Detects uncorrectable memory errors early, before user access to
        affected memory, helping facilitate recovery.

   4.2. Reduces the likelihood of correctable errors developing into
        uncorrectable errors.

5. Policy control for hotplugged memory is necessary because there may not
   be a system-wide BIOS or similar control to manage scrub settings for a CXL
   device added after boot. Determining these settings is a policy decision,
   balancing reliability against performance, so userspace should control it.
   Therefore, a unified interface is recommended for handling this function in
   a way that aligns with other similar interfaces, rather than creating a
   separate one.

Scrubbing features
------------------

CXL Memory Scrubbing features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

CXL spec r3.1 [1]_ section 8.2.9.9.11.1 describes the memory device patrol
scrub control feature. The device patrol scrub proactively locates and
corrects errors on a regular cycle. The patrol scrub control allows
userspace to request changes to the CXL patrol scrubber's configuration.

The patrol scrub control allows the requester to specify the number of
hours in which the patrol scrub cycle must be completed, provided that
the requested scrub rate is within the range that the device supports.
In the CXL driver, the number of seconds per scrub cycle, which the user
requests via sysfs, is rescaled to hours per scrub cycle.

In addition, the patrol scrub control allows the host to disable the feature
in case it interferes with performance-aware operations which require
background operations to be turned off.

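
The rescaling of seconds to hours described above can be sketched as follows.
This is an illustration only, not the CXL driver's code: the function name,
the truncating division and the error handling are assumptions, and
``min_hours``/``max_hours`` stand in for whatever range the device reports as
supported.

```python
SECONDS_PER_HOUR = 3600


def cycle_seconds_to_hours(cycle_sec: int, min_hours: int, max_hours: int) -> int:
    """Rescale a requested scrub cycle from seconds to whole hours.

    Assumption: fractional hours are truncated. min_hours and max_hours
    are hypothetical stand-ins for the device's supported range.
    """
    hours = cycle_sec // SECONDS_PER_HOUR
    if not min_hours <= hours <= max_hours:
        raise ValueError(
            f"{cycle_sec} s ({hours} h) is outside the supported range "
            f"{min_hours}..{max_hours} h"
        )
    return hours
```

Under these assumptions a request for 43200 seconds maps to a 12 hour scrub
cycle, while a request shorter than one hour is rejected when the device's
minimum is one hour.
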
Error Check Scrub (ECS)
~~~~~~~~~~~~~~~~~~~~~~~

CXL spec r3.1 [1]_ section 8.2.9.9.11.2 describes Error Check Scrub (ECS),
a feature defined in the JEDEC DDR5 SDRAM Specification (JESD79-5) that
allows DRAM to internally read, correct single-bit errors, and write back
corrected data bits to the DRAM array while providing transparency to error
counts.

The DDR5 device contains a number of memory media Field Replaceable Units
(FRUs). The DDR5 ECS feature, and thus the ECS control driver, supports
configuring the ECS parameters per FRU.

ACPI RAS2 Hardware-based Memory Scrubbing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ACPI spec 6.5 [2]_ section 5.2.21 ACPI RAS2 describes an ACPI RAS2 table
which provides interfaces for platform RAS features and supports independent
RAS controls and capabilities for a given RAS feature for multiple instances
of the same component in a given system.

Memory RAS features apply to RAS capabilities, controls and operations that
are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS features
have a Feature Type of 0x00 (Memory).

The platform can use the hardware-based memory scrubbing feature to expose
controls and capabilities associated with hardware-based memory scrub
engines. As per the specification, the RAS2 memory scrubbing feature
supports:

1. Independent memory scrubbing controls for each NUMA domain, identified
   using its proximity domain.

2. Provision for background (patrol) scrubbing of the entire memory system,
   as well as on-demand scrubbing for a specific region of memory.

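
Through the generic sysfs scrub control, an on-demand request for a specific
region of memory might be issued as sketched below. This is a hedged
illustration: the attribute names ``size`` and ``addr``, and the assumption
that writing ``addr`` starts the scrub (so ``size`` is written first), should
be checked against `Documentation/ABI/testing/sysfs-edac-scrub`. The helper
name and its parameters are hypothetical.

```python
from pathlib import Path


def request_on_demand_scrub(scrub_dir: str, base: int, length: int) -> None:
    """Ask a scrubber instance to scrub the range [base, base + length).

    Assumptions: the instance directory exposes 'size' and 'addr'
    attributes, and writing 'addr' is what starts the scrub, so
    'size' is written first.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    instance = Path(scrub_dir)
    (instance / "size").write_text(f"{length:#x}\n")
    (instance / "addr").write_text(f"{base:#x}\n")
```

The directory is passed in rather than hard-coded so that the helper can be
pointed at any scrubber instance that supports on-demand scrubbing.
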
ACPI Address Range Scrubbing (ARS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ACPI spec 6.5 [2]_ section 9.19.7.2 describes Address Range Scrubbing (ARS).
ARS allows the platform to communicate memory errors to system software.
This capability allows system software to prevent accesses to addresses with
uncorrectable errors in memory. ARS functions manage all NVDIMMs present in
the system. Only one scrub can be in progress system wide at any given time.

The following functions are supported as per the specification:

1. Query ARS Capabilities for a given address range, which indicates whether
   the platform supports the ACPI NVDIMM Root Device Unconsumed Error
   Notification.

2. Start ARS triggers an Address Range Scrub for the given memory range.
   Address scrubbing can be done for volatile or persistent memory, or both.

3. Query ARS Status command allows software to get the status of ARS,
   including the progress of ARS and the ARS error record.

4. Clear Uncorrectable Error.

5. Translate SPA.

6. ARS Error Inject, etc.

The kernel already provides a control interface for ARS; ARS is currently
not supported in EDAC.

.. [1] https://computeexpresslink.org/cxl-specification/

.. [2] https://uefi.org/specs/ACPI/6.5/

Comparison of various scrubbing features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+--------------+-----------+-----------+-----------+-----------+
|              |   ACPI    | CXL patrol|  CXL ECS  |  ARS      |
|  Name        |   RAS2    | scrub     |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| On-demand    | Supported | No        | No        | Supported |
| scrubbing    |           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Background   | Supported | Supported | Supported | No        |
| scrubbing    |           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Mode of      | Scrub ctrl| per device| per memory| Unknown   |
| scrubbing    | per NUMA  |           | media     |           |
|              | domain.   |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Query scrub  | Supported | Supported | Supported | Supported |
| capabilities |           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Setting      | Supported | No        | No        | Supported |
| address range|           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Setting      | Supported | Supported | No        | No        |
| scrub rate   |           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Unit for     | Not       | in hours  | No        | No        |
| scrub rate   | Defined   |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
|              | Supported |           |           |           |
| Scrub        | on-demand | No        | No        | Supported |
| status/      | scrubbing |           |           |           |
| Completion   | only      |           |           |           |
+--------------+-----------+-----------+-----------+-----------+
| UC error     |           |CXL general|CXL general| ACPI UCE  |
| reporting    | Exception |media/DRAM |media/DRAM | notify and|
|              |           |event/media|event/media| query     |
|              |           |scan?      |scan?      | ARS status|
+--------------+-----------+-----------+-----------+-----------+
|              |           |           |           |           |
| Support for  | Supported | Supported | Supported | No        |
| EDAC control |           |           |           |           |
|              |           |           |           |           |
+--------------+-----------+-----------+-----------+-----------+

The File System
---------------

The control attributes of a registered scrubber instance can be
accessed under:

/sys/bus/edac/devices/<dev-name>/scrubX/

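
As an illustration of the layout above, the following sketch enumerates the
``scrubX`` directories found under each EDAC device. The function name and the
``devices_root`` parameter are hypothetical and exist only so that the helper
can be exercised against a test directory tree; only the path pattern itself
comes from this document.

```python
import glob
import os


def find_scrub_instances(devices_root: str = "/sys/bus/edac/devices") -> dict[str, list[str]]:
    """Map each EDAC device name to the scrubX instances found under it."""
    found: dict[str, list[str]] = {}
    pattern = os.path.join(devices_root, "*", "scrub[0-9]*")
    for path in sorted(glob.glob(pattern)):
        if os.path.isdir(path):
            dev_name = os.path.basename(os.path.dirname(path))
            found.setdefault(dev_name, []).append(os.path.basename(path))
    return found
```

On a system without registered scrubbers the function simply returns an empty
dictionary.
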
sysfs
-----

Sysfs files are documented in

`Documentation/ABI/testing/sysfs-edac-scrub`

`Documentation/ABI/testing/sysfs-edac-ecs`

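
A userspace tool might change the scrub rate of an instance along the lines
sketched below. The attribute names ``min_cycle_duration``,
``max_cycle_duration`` and ``current_cycle_duration`` (values in seconds) are
assumptions to be verified against the ABI files listed above, and the helper
names are hypothetical.

```python
from pathlib import Path


def read_int(path: Path) -> int:
    """Read a decimal integer from a sysfs-style attribute file."""
    return int(path.read_text().strip())


def set_cycle_duration(scrub_dir: str, seconds: int) -> int:
    """Set the scrub cycle duration after checking the advertised range.

    Assumption: the instance exposes min/max/current_cycle_duration
    attributes holding values in seconds.
    """
    instance = Path(scrub_dir)
    lowest = read_int(instance / "min_cycle_duration")
    highest = read_int(instance / "max_cycle_duration")
    if not lowest <= seconds <= highest:
        raise ValueError(f"{seconds} s is outside {lowest}..{highest} s")
    (instance / "current_cycle_duration").write_text(f"{seconds}\n")
    # Read back, since a driver may rescale or round the requested value.
    return read_int(instance / "current_cycle_duration")
```

Checking the range in userspace is optional, since the driver is expected to
reject unsupported values, but it gives scripts a clearer error message.
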
Examples
--------

The usage takes the form shown in these examples:

1. CXL memory Patrol Scrub

The following use cases have been identified for increasing the scrub rate.

- Scrubbing is needed at device granularity because a device is showing
  unexpectedly high errors.

- Scrubbing may apply to memory that isn't online at all yet. Likely this
  is a system wide default setting on boot.

- Scrubbing at a higher rate because the monitor software has determined that
  more reliability is necessary for a particular data set. This is called
  Differentiated Reliability.

1.1. Device based scrubbing

CXL memory is exposed to the memory management subsystem and ultimately
userspace via CXL devices. Device-based scrubbing is used for the first use
case described in "Section 1 CXL Memory Patrol Scrub".

When combining control via the device interfaces and region interfaces,
see "Section 1.2 Region based scrubbing".

Sysfs files for scrubbing are documented in
`Documentation/ABI/testing/sysfs-edac-scrub`

1.2. Region based scrubbing

CXL memory is exposed to the memory management subsystem and ultimately
userspace via CXL regions. CXL Regions represent mapped memory capacity in
system physical address space. These can incorporate one or more parts of
multiple CXL memory devices with traffic interleaved across them. The user may
want to control the scrub rate via this more abstract region instead of having
to figure out the constituent devices and program them separately. The scrub
rate for each device covers the whole device. Thus if multiple regions use
parts of that device then requests for scrubbing of other regions may result
in a higher scrub rate than requested for this specific region.

Region-based scrubbing is used for the third use case described in
"Section 1 CXL Memory Patrol Scrub".

Userspace must follow the set of rules below when setting the scrub rates for
any mixture of requirements.

1. Take each region in turn, from lowest desired scrub rate to highest, and
   set its scrub rate. Later regions may override the scrub rate on individual
   devices (and hence potentially whole regions).

2. Take each device for which enhanced scrubbing is required (higher rate) and
   set those scrub rates. This will override the scrub rates of individual
   devices, setting them to the maximum rate required for any of the regions
   they help back, unless a specific rate is already defined.

Sysfs files for scrubbing are documented in
`Documentation/ABI/testing/sysfs-edac-scrub`

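
The ordering rules for region-based scrubbing can be sketched as a small
planning function. This is one reading of the rules, not kernel or tooling
code: the function name, the data layout and the abstract "rate" numbers
(larger means faster scrubbing, whereas the sysfs interface expresses rate as
a cycle duration) are all assumptions made for illustration.

```python
def plan_scrub_rates(region_rates, region_devices, device_rates=None):
    """Return (ordered write steps, resulting rate per device).

    region_rates:   {region name: desired rate}
    region_devices: {region name: [device names backing that region]}
    device_rates:   {device name: enhanced rate}, applied last
    """
    device_rates = device_rates or {}
    steps = []
    effective = {}

    # Rule 1: regions from lowest desired rate to highest, so that a
    # later (higher) request overrides earlier ones on shared devices.
    ordered = sorted(region_rates, key=lambda name: (region_rates[name], name))
    for region in ordered:
        rate = region_rates[region]
        steps.append(("region", region, rate))
        for dev in region_devices[region]:
            effective[dev] = rate

    # Rule 2: devices that need enhanced scrubbing are set afterwards.
    for dev in sorted(device_rates):
        steps.append(("device", dev, device_rates[dev]))
        effective[dev] = device_rates[dev]

    return steps, effective
```

For example, with one region at rate 2 backed by devices A and B, another at
rate 5 backed by B and C, and device A individually raised to 8, the plan
writes the two regions in ascending order and then device A, leaving A at 8
and both B and C at 5.
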
2. CXL memory Error Check Scrub (ECS)

The Error Check Scrub (ECS) feature enables a memory device to perform error
checking and correction (ECC) and count single-bit errors. The associated
memory controller sets the ECS mode with a trigger sent to the memory
device. CXL ECS control allows the host, and thus userspace, to change the
error count mode and the threshold number of errors per segment used for
reporting errors (the count then indicates how many segments have at least
that number of errors), and to reset the ECS counter. Thus the responsibility
for initiating Error Check Scrub on a memory device may lie with the memory
controller or platform when unexpectedly high error rates are detected.

Sysfs files for scrubbing are documented in
`Documentation/ABI/testing/sysfs-edac-ecs`

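
A tool that inspects the current ECS settings might look like the sketch
below. The ``ecs_fruX`` directory naming and the attribute names
``log_entry_type``, ``mode`` and ``threshold`` are assumptions that should be
checked against `Documentation/ABI/testing/sysfs-edac-ecs`; attributes that
are absent are skipped rather than treated as errors. The function name is
hypothetical.

```python
from pathlib import Path

# Assumed attribute names; verify against the ABI document.
ASSUMED_ECS_ATTRS = ("log_entry_type", "mode", "threshold")


def read_ecs_settings(dev_dir: str) -> dict[str, dict[str, str]]:
    """Collect the readable ECS attributes of every ecs_fruX directory."""
    settings: dict[str, dict[str, str]] = {}
    for fru in sorted(Path(dev_dir).glob("ecs_fru[0-9]*")):
        values = {}
        for name in ASSUMED_ECS_ATTRS:
            attr = fru / name
            if attr.is_file():
                values[name] = attr.read_text().strip()
        settings[fru.name] = values
    return settings
```

The values are returned as strings because their encoding is defined by the
ABI document rather than by this sketch.
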