.. SPDX-License-Identifier: GPL-2.0

===============================================================
Intel(R) Dynamic Platform and Thermal Framework Sysfs Interface
===============================================================

:Copyright: © 2022 Intel Corporation

:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

Introduction
------------

Intel(R) Dynamic Platform and Thermal Framework (DPTF) is a platform
level hardware/software solution for power and thermal management.

As a container for multiple power/thermal technologies, DPTF provides
a coordinated approach for different policies to affect the hardware
state of a system.

Since it is a platform level framework, it has several components.
Some parts of the technology are implemented in the firmware and use
ACPI and PCI devices to expose various features for monitoring and
control. Linux has a set of kernel drivers exposing the hardware
interface to user space. This allows user space thermal solutions like
"Linux Thermal Daemon" to read platform specific thermal and power
tables to deliver adequate performance while keeping the system under
thermal limits.

DPTF ACPI Drivers interface
----------------------------

:file:`/sys/bus/platform/devices/<N>/uuids`, where <N>
=INT3400|INTC1040|INTC1041|INTC10A0

``available_uuids`` (RO)
	A set of UUID strings representing the available policies,
	which should be reported to the firmware when user space
	can support those policies.

	UUID strings:

	"42A441D6-AE6A-462b-A84B-4A8CE79027D3" : Passive 1

	"3A95C389-E4B8-4629-A526-C52C88626BAE" : Active

	"97C68AE7-15FA-499c-B8C9-5DA81D606E0A" : Critical

	"63BE270F-1C11-48FD-A6F7-3AF253FF3E2D" : Adaptive performance

	"5349962F-71E6-431D-9AE8-0A635B710AEE" : Emergency call

	"9E04115A-AE87-4D1C-9500-0F3E340BFE75" : Passive 2

	"F5A35014-C209-46A4-993A-EB56DE7530A1" : Power Boss

	"6ED722A7-9240-48A5-B479-31EEF723D7CF" : Virtual Sensor

	"16CAF1B7-DD38-40ED-B1C1-1B8A1913D531" : Cooling mode

	"BE84BABF-C4D4-403D-B495-3128FD44dAC1" : HDC

``current_uuid`` (RW)
	User space can write strings from available UUIDs, one at a
	time.

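
As an illustration of how a user space daemon might use these two
attributes, the following minimal sketch maps the content of
``available_uuids`` to the policy names listed above and writes one
UUID to ``current_uuid``. The helper names are hypothetical, and the
sketch assumes that ``available_uuids`` holds whitespace-separated UUID
strings; check the attribute content on the target platform.

```python
from pathlib import Path

# UUID-to-policy table copied from the list above. Keys are upper-cased
# so that lookups do not depend on the case used by the firmware.
DPTF_POLICIES = {
    "42A441D6-AE6A-462B-A84B-4A8CE79027D3": "Passive 1",
    "3A95C389-E4B8-4629-A526-C52C88626BAE": "Active",
    "97C68AE7-15FA-499C-B8C9-5DA81D606E0A": "Critical",
    "63BE270F-1C11-48FD-A6F7-3AF253FF3E2D": "Adaptive performance",
    "5349962F-71E6-431D-9AE8-0A635B710AEE": "Emergency call",
    "9E04115A-AE87-4D1C-9500-0F3E340BFE75": "Passive 2",
    "F5A35014-C209-46A4-993A-EB56DE7530A1": "Power Boss",
    "6ED722A7-9240-48A5-B479-31EEF723D7CF": "Virtual Sensor",
    "16CAF1B7-DD38-40ED-B1C1-1B8A1913D531": "Cooling mode",
    "BE84BABF-C4D4-403D-B495-3128FD44DAC1": "HDC",
}


def parse_available_uuids(text):
    """Map each UUID in the attribute content to its policy name."""
    policies = {}
    for token in text.split():  # assumption: whitespace-separated UUIDs
        uuid = token.upper()
        policies[uuid] = DPTF_POLICIES.get(uuid, "Unknown")
    return policies


def select_policy(device_dir, uuid):
    """Write one UUID to current_uuid after checking it is advertised."""
    uuids_dir = Path(device_dir) / "uuids"
    available = parse_available_uuids(
        (uuids_dir / "available_uuids").read_text())
    if uuid.upper() not in available:
        raise ValueError(f"{uuid} is not listed in available_uuids")
    (uuids_dir / "current_uuid").write_text(uuid)
```

A caller would pass the platform device directory, for example
``/sys/bus/platform/devices/INT3400:00`` (the instance suffix is an
assumption), and repeat ``select_policy()`` once per supported policy.
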
:file:`/sys/bus/platform/devices/<N>/`, where <N>
=INT3400|INTC1040|INTC1041|INTC10A0

``imok`` (WO)
	The user space daemon writes 1 to respond to a firmware event
	requesting a keep alive notification. User space receives a
	THERMAL_EVENT_KEEP_ALIVE kobject uevent notification when
	firmware calls for user space to respond with the imok ACPI
	method.

``odvp*`` (RO)
	Firmware thermal status variable values. Thermal tables
	call for different processing based on these variable
	values.

``data_vault`` (RO)
	Binary thermal table. Refer to
	https://github.com/intel/thermal_daemon for decoding
	the thermal table.

``production_mode`` (RO)
	When different from zero, the manufacturer has locked the
	thermal configuration against further changes.

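
The attributes above could be consumed along the following lines. This
is a minimal sketch with hypothetical helper names; it assumes that each
``odvp*`` attribute and ``production_mode`` read back as a single
decimal integer.

```python
from pathlib import Path


def read_odvp(device_dir):
    """Return {attribute name: value} for every odvp* attribute."""
    values = {}
    for path in sorted(Path(device_dir).glob("odvp*")):
        # Assumption: each attribute holds one decimal integer.
        values[path.name] = int(path.read_text())
    return values


def is_locked(device_dir):
    """True when production_mode reports a non-zero value."""
    return int((Path(device_dir) / "production_mode").read_text()) != 0


def send_keep_alive(device_dir):
    """Answer a THERMAL_EVENT_KEEP_ALIVE uevent by writing 1 to imok."""
    (Path(device_dir) / "imok").write_text("1")
```

A daemon would typically call ``send_keep_alive()`` from its uevent
handler and re-read the ``odvp*`` values before selecting which part of
the thermal table to apply.
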
ACPI Thermal Relationship table interface
------------------------------------------

:file:`/dev/acpi_thermal_rel`

	This device provides IOCTL interface to read standard ACPI
	thermal relationship tables via ACPI methods _TRT and _ART.

	These IOCTLs are defined in
	drivers/thermal/intel/int340x_thermal/acpi_thermal_rel.h

	IOCTLs:

	ACPI_THERMAL_GET_TRT_LEN: Get length of TRT table

	ACPI_THERMAL_GET_ART_LEN: Get length of ART table

	ACPI_THERMAL_GET_TRT_COUNT: Number of records in TRT table

	ACPI_THERMAL_GET_ART_COUNT: Number of records in ART table

	ACPI_THERMAL_GET_TRT: Read binary TRT table, length to read is
	provided via argument to ioctl().

	ACPI_THERMAL_GET_ART: Read binary ART table, length to read is
	provided via argument to ioctl().

DPTF ACPI Sensor drivers
-------------------------

DPTF Sensor drivers are presented as standard thermal sysfs thermal_zone.

DPTF ACPI Cooling drivers
--------------------------

DPTF cooling drivers are presented as standard thermal sysfs cooling_device.

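
Because these drivers use the standard thermal sysfs layout, they can be
enumerated like any other thermal zone or cooling device. The sketch
below is illustrative only; the function names are hypothetical, and it
relies on the generic ``type``, ``temp``, ``cur_state`` and
``max_state`` attributes of the thermal sysfs class.

```python
from pathlib import Path


def _read(path):
    """Read a sysfs attribute, returning None if the read fails."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_thermal_zones(base="/sys/class/thermal"):
    """Return (zone name, type, temperature in degrees Celsius) tuples."""
    zones = []
    for zone in sorted(Path(base).glob("thermal_zone*")):
        temp = _read(zone / "temp")  # reported in millidegrees Celsius
        celsius = int(temp) / 1000.0 if temp is not None else None
        zones.append((zone.name, _read(zone / "type"), celsius))
    return zones


def list_cooling_devices(base="/sys/class/thermal"):
    """Return (device name, type, current state, maximum state) tuples."""
    devices = []
    for dev in sorted(Path(base).glob("cooling_device*")):
        devices.append((dev.name, _read(dev / "type"),
                        _read(dev / "cur_state"), _read(dev / "max_state")))
    return devices
```
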
DPTF Processor thermal PCI Driver interface
--------------------------------------------

:file:`/sys/bus/pci/devices/0000\:00\:04.0/power_limits/`

Refer to Documentation/power/powercap/powercap.rst for powercap
ABI.

``power_limit_0_max_uw`` (RO)
	Maximum powercap sysfs constraint_0_power_limit_uw for Intel RAPL

``power_limit_0_step_uw`` (RO)
	Power limit increment/decrement step for Intel RAPL constraint 0 power limit

``power_limit_0_min_uw`` (RO)
	Minimum powercap sysfs constraint_0_power_limit_uw for Intel RAPL

``power_limit_0_tmin_us`` (RO)
	Minimum powercap sysfs constraint_0_time_window_us for Intel RAPL

``power_limit_0_tmax_us`` (RO)
	Maximum powercap sysfs constraint_0_time_window_us for Intel RAPL

``power_limit_1_max_uw`` (RO)
	Maximum powercap sysfs constraint_1_power_limit_uw for Intel RAPL

``power_limit_1_step_uw`` (RO)
	Power limit increment/decrement step for Intel RAPL constraint 1 power limit

``power_limit_1_min_uw`` (RO)
	Minimum powercap sysfs constraint_1_power_limit_uw for Intel RAPL

``power_limit_1_tmin_us`` (RO)
	Minimum powercap sysfs constraint_1_time_window_us for Intel RAPL

``power_limit_1_tmax_us`` (RO)
	Maximum powercap sysfs constraint_1_time_window_us for Intel RAPL

``power_floor_status`` (RO)
	When set to 1, the power floor of the system in the current
	configuration has been reached. It needs to be reconfigured to allow
	power to be reduced any further.

``power_floor_enable`` (RW)
	When set to 1, enable reading and notification of the power floor
	status. Notifications are triggered for the power_floor_status
	attribute value changes.

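
One possible use of the ``power_limit_*`` attributes is to validate a
requested RAPL power limit before writing it through the powercap
interface. The sketch below clamps a request to the advertised range
and aligns it to the step size. The helper names are hypothetical, and
counting steps upward from the minimum is an assumption, not something
this document specifies.

```python
from pathlib import Path


def read_power_limit_range(power_limits_dir, index=0):
    """Read the min, max and step attributes for one RAPL constraint."""
    base = Path(power_limits_dir)
    return {
        name: int((base / f"power_limit_{index}_{name}").read_text())
        for name in ("min_uw", "max_uw", "step_uw")
    }


def snap_power_limit(requested_uw, min_uw, max_uw, step_uw):
    """Clamp a request to [min, max] and round it down to a step boundary."""
    clamped = max(min_uw, min(requested_uw, max_uw))
    if step_uw > 0:
        # Assumption: valid values are min_uw + n * step_uw.
        clamped = min_uw + ((clamped - min_uw) // step_uw) * step_uw
    return clamped
```

For example, with a range of 5 W to 28 W and a 0.5 W step, a request of
17.3 W is adjusted to 17 W, and a request of 40 W is limited to 28 W.
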
:file:`/sys/bus/pci/devices/0000\:00\:04.0/`

``tcc_offset_degree_celsius`` (RW)
	TCC offset from the critical temperature where hardware will throttle
	CPU.

:file:`/sys/bus/pci/devices/0000\:00\:04.0/workload_request`

``workload_available_types`` (RO)
	Available workload types. User space can specify one of the workload
	types it is currently executing via workload_type. For example: idle,
	bursty, sustained etc.

``workload_type`` (RW)
	User space can specify any one of the available workload types using
	this interface.

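
A minimal sketch of selecting a workload type follows. The function name
is hypothetical, and the sketch assumes that
``workload_available_types`` lists the type names separated by
whitespace.

```python
from pathlib import Path


def set_workload_type(workload_dir, workload):
    """Write a workload type after checking that it is advertised."""
    base = Path(workload_dir)
    # Assumption: type names are separated by spaces or newlines.
    available = (base / "workload_available_types").read_text().split()
    if workload not in available:
        raise ValueError(f"{workload!r} not in {available}")
    (base / "workload_type").write_text(workload)
```
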
:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_0_control`

:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_1_control`

:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_2_control`

All these controls need admin privilege to update.

``enable`` (RW)
	1 for enable, 0 for disable. Shows the current enable status of the
	platform temperature control feature. User space can enable/disable
	hardware controls.

``temperature_target`` (RW)
	Set a new temperature target in millidegrees Celsius for hardware to
	use for the temperature control.

``thermal_tolerance`` (RW)
	This attribute ranges from 0 to 7, where 0 represents
	the most aggressive control to avoid any temperature overshoots, and
	7 represents a more graceful approach, favoring performance even at
	the expense of temperature overshoots.
	Note: This level may not scale linearly. For example, a value of 3 does
	not necessarily imply a 50% improvement in performance compared to a
	value of 0.

Given that this is platform temperature control, it is expected that a
single user-level manager owns and manages the controls. If multiple
user-level software applications attempt to write different targets, it
can lead to unexpected behavior.

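
The following sketch shows how a single manager might program one
platform temperature control instance. The function name is
hypothetical, and writing the target and tolerance before ``enable`` is
an ordering chosen for the example, not a documented requirement.

```python
from pathlib import Path


def configure_ptc(control_dir, target_celsius, tolerance):
    """Program one ptc_<n>_control directory and enable the control."""
    if not 0 <= tolerance <= 7:
        raise ValueError("thermal_tolerance must be in the range 0..7")
    base = Path(control_dir)
    # temperature_target is expressed in millidegrees Celsius.
    target_millideg = round(target_celsius * 1000)
    (base / "temperature_target").write_text(str(target_millideg))
    (base / "thermal_tolerance").write_text(str(tolerance))
    (base / "enable").write_text("1")
    return target_millideg
```

Calling ``configure_ptc(".../ptc_0_control", 45.5, 3)`` writes 45500 to
``temperature_target``. The process needs admin privilege, as noted
above.
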
DPTF Processor thermal RFIM interface
--------------------------------------------

RFIM interface allows adjustment of FIVR (Fully Integrated Voltage Regulator),
DDR (Double Data Rate) and DLVR (Digital Linear Voltage Regulator)
frequencies to avoid RF interference with Wi-Fi and 5G.

Switching voltage regulators (VR) generate radiated EMI or RFI at the
fundamental frequency and its harmonics. Some harmonics may interfere
with very sensitive wireless receivers such as Wi-Fi and cellular that
are integrated into host systems like notebook PCs. One mitigation
method is to request a shift of the SoC integrated VR (IVR) switching
frequency by a small percentage, which moves the switching noise
harmonics away from radio channels. OEMs or ODMs can use the driver to
control SoC IVR operation within the range where it does not impact IVR
performance.

Some products use DLVR instead of FIVR as the switching voltage regulator.
In this case the attributes of DLVR must be adjusted instead of FIVR.

While shifting the frequencies, additional clock noise can be introduced,
which is compensated by adjusting the spread spectrum percentage. This
helps to reduce the clock noise to meet regulatory compliance. This
spreading % increases the bandwidth of signal transmission and hence
reduces the effects of interference, noise and signal fading.

DRAM devices of the DDR IO interface and their power plane can generate
EMI at the data rates. Similar to the IVR control mechanism, Intel offers
a mechanism by which DDR data rates can be changed if several conditions
are met: there is strong RF interference because of DDR; CPU power
management has no other restriction in changing DDR data rates;
PC ODMs enable this feature (real time DDR RFI Mitigation referred to as
DDR-RFIM) for Wi-Fi from BIOS.

FIVR attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/fivr/`

``vco_ref_code_lo`` (RW)
	The VCO reference code is an 11-bit field and controls the FIVR
	switching frequency. This is the 3-bit LSB field.

``vco_ref_code_hi`` (RW)
	The VCO reference code is an 11-bit field and controls the FIVR
	switching frequency. This is the 8-bit MSB field.

``spread_spectrum_pct`` (RW)
	Set the FIVR spread spectrum clocking percentage

``spread_spectrum_clk_enable`` (RW)
	Enable/disable the FIVR spread spectrum clocking feature

``rfi_vco_ref_code`` (RO)
	This field is a read-only status register which reflects the
	current FIVR switching frequency

``fivr_fffc_rev`` (RW)
	This field indicates the revision of the FIVR HW.

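
The split of the 11-bit VCO reference code into ``vco_ref_code_hi`` (8
MSBs) and ``vco_ref_code_lo`` (3 LSBs) can be illustrated with the
following sketch. The function names are hypothetical.

```python
def join_vco_ref_code(hi, lo):
    """Combine the 8-bit MSB field and 3-bit LSB field into 11 bits."""
    if not 0 <= hi <= 0xFF:
        raise ValueError("vco_ref_code_hi is an 8-bit field")
    if not 0 <= lo <= 0x7:
        raise ValueError("vco_ref_code_lo is a 3-bit field")
    return (hi << 3) | lo


def split_vco_ref_code(code):
    """Split an 11-bit VCO reference code into (hi, lo) fields."""
    if not 0 <= code <= 0x7FF:
        raise ValueError("the VCO reference code is an 11-bit field")
    return code >> 3, code & 0x7
```

For example, the code 0x2AB splits into ``hi`` = 0x55 and ``lo`` = 3,
and joining those two fields gives 0x2AB again.
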
DVFS attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/dvfs/`

``rfi_restriction_run_busy`` (RW)
	Write 1 to request the restriction of a specific DDR data rate.
	The value resets itself to 0 after the operation.

``rfi_restriction_err_code`` (RW)
	0: Request is accepted, 1: Feature disabled,
	2: The request restricts more points than allowed

``rfi_restriction_data_rate_Delta`` (RW)
	Restricted DDR data rate for RFI protection: Lower Limit

``rfi_restriction_data_rate_Base`` (RW)
	Restricted DDR data rate for RFI protection: Upper Limit

``ddr_data_rate_point_0`` (RO)
	DDR data rate selection 1st point

``ddr_data_rate_point_1`` (RO)
	DDR data rate selection 2nd point

``ddr_data_rate_point_2`` (RO)
	DDR data rate selection 3rd point

``ddr_data_rate_point_3`` (RO)
	DDR data rate selection 4th point

``rfi_disable`` (RW)
	Disable DDR rate change feature

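
To show how the result of a restriction request might be interpreted,
the sketch below translates ``rfi_restriction_err_code`` into the
messages listed above. The names are hypothetical, and the sketch
assumes that the attribute reads back as a decimal integer.

```python
from pathlib import Path

# Meanings taken from the rfi_restriction_err_code description above.
RFI_RESTRICTION_ERRORS = {
    0: "Request is accepted",
    1: "Feature disabled",
    2: "The request restricts more points than allowed",
}


def describe_err_code(raw):
    """Translate the attribute content into a readable message."""
    code = int(raw)  # assumption: decimal integer, optional newline
    return RFI_RESTRICTION_ERRORS.get(code, f"Unknown error code {code}")


def restriction_status(dvfs_dir):
    """Read and describe rfi_restriction_err_code in a dvfs directory."""
    raw = (Path(dvfs_dir) / "rfi_restriction_err_code").read_text()
    return describe_err_code(raw)
```
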
DLVR attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/dlvr/`

``dlvr_hardware_rev`` (RO)
	DLVR hardware revision.

``dlvr_freq_mhz`` (RO)
	Current DLVR PLL frequency in MHz.

``dlvr_freq_select`` (RW)
	Sets DLVR PLL clock frequency. Once set, and enabled via
	dlvr_rfim_enable, the dlvr_freq_mhz will show the current
	DLVR PLL frequency.

``dlvr_pll_busy`` (RO)
	PLL can't accept frequency change when set.

``dlvr_rfim_enable`` (RW)
	0: Disable RF frequency hopping, 1: Enable RF frequency hopping.

``dlvr_spread_spectrum_pct`` (RW)
	Sets DLVR spread spectrum percent value.

``dlvr_control_mode`` (RW)
	Specifies how frequencies are spread using spread spectrum.
	0: Down spread,
	1: Spread in the Center.

``dlvr_control_lock`` (RW)
	1: future writes are ignored.

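
The interaction between ``dlvr_freq_select``, ``dlvr_pll_busy``,
``dlvr_rfim_enable`` and ``dlvr_freq_mhz`` could be exercised roughly as
follows. This is only a sketch: the function name is hypothetical, the
attributes are assumed to read back as decimal integers, and the value
written to ``dlvr_freq_select`` is passed through unchanged because its
encoding is not described here.

```python
from pathlib import Path


def request_dlvr_frequency(dlvr_dir, freq_select):
    """Request a DLVR PLL frequency and return the reported MHz value."""
    base = Path(dlvr_dir)

    def read_int(name):
        return int((base / name).read_text())

    if read_int("dlvr_control_lock") == 1:
        raise RuntimeError("dlvr_control_lock is set, writes are ignored")
    if read_int("dlvr_pll_busy") != 0:
        raise RuntimeError("PLL is busy and cannot accept a new frequency")

    (base / "dlvr_freq_select").write_text(str(freq_select))
    (base / "dlvr_rfim_enable").write_text("1")
    # After the selection is set and enabled, dlvr_freq_mhz shows the
    # current DLVR PLL frequency.
    return read_int("dlvr_freq_mhz")
```
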
DPTF Power supply and Battery Interface
----------------------------------------

Refer to Documentation/ABI/testing/sysfs-platform-dptf

DPTF Fan Control
----------------------------------------

Refer to Documentation/admin-guide/acpi/fan_performance_states.rst

Workload Type Hints
----------------------------------------

The firmware in Meteor Lake processor generation is capable of identifying
workload type and passing hints regarding it to the OS. A special sysfs
interface is provided to allow user space to obtain workload type hints from
the firmware and control the rate at which they are provided.

User space can poll attribute "workload_type_index" for the current hint or
can receive a notification whenever the value of this attribute is updated.

:file:`/sys/bus/pci/devices/0000\:00\:04.0/workload_hint/`

Segment 0, bus 0, device 4, function 0 is reserved for the processor thermal
device on all Intel client processors. So, the above path doesn't change
based on the processor generation.

``workload_hint_enable`` (RW)
	Enable firmware to send workload type hints to user space.

``workload_slow_hint_enable`` (RW)
	Enable firmware to send slow workload type hints to user space.

``notification_delay_ms`` (RW)
	Minimum delay in milliseconds before firmware will notify OS. This is
	for the rate control of notifications. This delay is between changing
	the workload type prediction in the firmware and notifying the OS about
	the change. The default delay is 1024 ms. The delay of 0 is invalid.
	The delay is rounded up to the nearest power of 2 to simplify firmware
	programming of the delay value. The read of notification_delay_ms
	attribute shows the effective value used.

``workload_type_index`` (RO)
	Predicted workload type index. User space can get notification of
	change via existing sysfs attribute change notification mechanism.

	The supported index values and their meaning for the Meteor Lake
	processor generation are as follows:

	0 - Idle: System performs no tasks, power and idle residency are
	consistently low for long periods of time.

	1 - Battery Life: Power is relatively low, but the processor may
	still be actively performing a task, such as video playback for
	a long period of time.

	2 - Sustained: Power level that is relatively high for a long period
	of time, with very few to no periods of idleness, which will
	eventually exhaust RAPL Power Limit 1 and 2.

	3 - Bursty: Consumes a relatively constant average amount of power, but
	periods of relative idleness are interrupted by bursts of
	activity. The bursts are relatively short and the periods of
	relative idleness between them typically prevent RAPL Power
	Limit 1 from being exhausted.

	4 - Unknown: Can't classify.

	On processors starting from Panther Lake additional hints are provided.
	The hardware analyzes workload residencies over an extended period to
	determine whether the workload classification tends toward idle/battery
	life states or sustained/performance states. Based on this long-term
	analysis, it classifies:

	Power Classification: If the workload exhibits more idle or battery life
	residencies, it is classified as "power".

	Performance Classification: If the workload exhibits more sustained or
	performance residencies, it is classified as "performance".

	This approach enables applications to ignore short-term workload
	fluctuations and instead respond to longer-term power vs. performance
	trends.

	Residency thresholds for this classification are CPU generation-specific.

	Classification is reported via bit 4 of the workload_type_index:

	Bit 4 = 1: Power classification

	Bit 4 = 0: Performance classification

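
The two helpers below sketch how user space might interpret the
``workload_hint`` attributes: one predicts the effective
``notification_delay_ms`` after the round-up to a power of 2, and the
other decodes ``workload_type_index`` including bit 4. The names are
hypothetical, and treating bits 0 to 3 as the workload type is an
assumption made for the example.

```python
WORKLOAD_TYPES = {
    0: "Idle",
    1: "Battery Life",
    2: "Sustained",
    3: "Bursty",
    4: "Unknown",
}

CLASSIFICATION_BIT = 1 << 4  # bit 4, reported from Panther Lake onward


def effective_delay_ms(requested_ms):
    """Round a requested delay up to the nearest power of 2."""
    if requested_ms <= 0:
        raise ValueError("a delay of 0 is invalid")
    return 1 << (requested_ms - 1).bit_length()


def decode_workload_type_index(raw, has_classification=False):
    """Return (workload type, classification or None) for an index value.

    Set has_classification only on processors that report bit 4;
    otherwise a cleared bit would be misread as "performance".
    """
    # Assumption: the workload type occupies the bits below bit 4.
    workload = WORKLOAD_TYPES.get(raw & (CLASSIFICATION_BIT - 1), "Reserved")
    if not has_classification:
        return workload, None
    if raw & CLASSIFICATION_BIT:
        return workload, "power"
    return workload, "performance"
```

For example, a request of 1000 ms yields an effective delay of 1024 ms,
and an index value of 0x13 on a processor that reports bit 4 decodes to
a "Bursty" workload with the "power" classification.
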