=====
zswap
=====

Overview
========

Zswap is a lightweight compressed cache for swap pages. It takes pages that are
in the process of being swapped out and attempts to compress them into a
dynamically allocated RAM-based memory pool. Zswap trades CPU cycles
for potentially reduced swap I/O. This trade-off can also result in a
significant performance improvement if reads from the compressed cache are
faster than reads from a swap device.

Some potential benefits:

* Desktop/laptop users with limited RAM capacities can mitigate the
  performance impact of swapping.
* Overcommitted guests that share a common I/O resource can
  dramatically reduce their swap I/O pressure, avoiding heavy-handed I/O
  throttling by the hypervisor. This allows more work to get done with less
  impact to the guest workload and guests sharing the I/O subsystem.
* Users with SSDs as swap devices can extend the life of the device by
  drastically reducing life-shortening writes.

Zswap evicts pages from compressed cache on an LRU basis to the backing swap
device when the compressed pool reaches its size limit. This requirement had
been identified in prior community discussions.

Whether zswap is enabled at boot time depends on whether
the ``CONFIG_ZSWAP_DEFAULT_ON`` Kconfig option is enabled.
This setting can then be overridden with the ``zswap.enabled=`` kernel
command line option, for example ``zswap.enabled=0``.
Zswap can also be enabled and disabled at runtime using the sysfs interface.
An example command to enable zswap at runtime, assuming sysfs is mounted
at ``/sys``, is::

        echo 1 > /sys/module/zswap/parameters/enabled

When zswap is disabled at runtime it stops storing pages that are
being swapped out. However, it does *not* immediately write out or fault
back into memory the pages already stored in the compressed pool. Those
pages remain in the compressed pool until they are either invalidated or
faulted back into memory. To force all pages out of the compressed pool,
run swapoff on the swap device(s): this faults all swapped-out pages back
into memory, including those in the compressed pool.

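
The runtime switch and the drain procedure described above can be scripted.
The following is a minimal sketch, not a definitive tool: the function names
and the ``ZSWAP_PARAMS`` override variable are hypothetical helpers introduced
here, while the ``enabled`` parameter path is the one shown above. Draining
relies on ``swapoff -a`` followed by ``swapon -a``, which requires root and
enough free RAM to hold every swapped-out page.

```shell
#!/bin/sh
# Hypothetical override so the helpers can be pointed at a test directory.
ZSWAP_PARAMS="${ZSWAP_PARAMS:-/sys/module/zswap/parameters}"

# Print the current value of the "enabled" parameter.
zswap_enabled() {
    cat "$ZSWAP_PARAMS/enabled"
}

# Stop zswap from accepting new pages; pages already stored stay in the pool.
zswap_disable() {
    echo 0 > "$ZSWAP_PARAMS/enabled"
}

# Fault every swapped-out page, including those held by zswap, back into
# memory, then re-enable the swap devices listed in /etc/fstab.
zswap_drain() {
    swapoff -a && swapon -a
}
```

A typical sequence would be ``zswap_disable`` followed by ``zswap_drain``,
run as root.
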
Design
======

Zswap receives pages for compression from the swap subsystem and is able to
evict pages from its own compressed pool on an LRU basis and write them back to
the backing swap device in the case that the compressed pool is full.

Zswap makes use of zsmalloc for managing the compressed memory pool. Each
allocation in zsmalloc is not directly accessible by address. Rather, a handle is
returned by the allocation routine and that handle must be mapped before being
accessed. The compressed memory pool grows on demand and shrinks as compressed
pages are freed. The pool is not preallocated.

When a swap page is passed from swapout to zswap, zswap maintains a mapping of
the swap entry, a combination of the swap type and swap offset, to the zsmalloc
handle that references that compressed swap page. This mapping is achieved
with an xarray per swap type. The swap offset is the search key for the xarray
nodes.

During a page fault on a PTE that is a swap entry, the swapin code calls the
zswap load function to decompress the page into the page allocated by the page
fault handler.

Once there are no PTEs referencing a swap page stored in zswap (i.e. the count
in the swap_map goes to 0) the swap code calls the zswap invalidate function
to free the compressed entry.

Zswap seeks to be simple in its policies. Sysfs attributes allow for one
user-controlled policy:

* max_pool_percent - The maximum percentage of memory that the compressed
  pool can occupy.

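
The policy above can be adjusted from a script. The sketch below assumes the
attribute lives alongside the other zswap module parameters under
``/sys/module/zswap/parameters``. The helper name, the ``ZSWAP_PARAMS``
override variable and the 1 to 100 range check are choices made for this
example, not rules taken from the kernel.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at a test directory.
ZSWAP_PARAMS="${ZSWAP_PARAMS:-/sys/module/zswap/parameters}"

# Set max_pool_percent after a basic sanity check on the argument.
# The 1..100 range is a choice made by this sketch, not a kernel rule.
zswap_set_max_pool_percent() {
    case "$1" in
        ''|*[!0-9]*) echo "not a non-negative integer: $1" >&2; return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 100 ]; then
        echo "out of range (1-100): $1" >&2
        return 1
    fi
    echo "$1" > "$ZSWAP_PARAMS/max_pool_percent"
}
```

For example, ``zswap_set_max_pool_percent 25`` run as root would let the
compressed pool occupy up to a quarter of memory.
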
The default compressor is selected by the ``CONFIG_ZSWAP_COMPRESSOR_DEFAULT``
Kconfig option, but it can be overridden at boot time by setting the
``compressor`` attribute, e.g. ``zswap.compressor=lzo``.
It can also be changed at runtime using the sysfs "compressor"
attribute, e.g.::

        echo lzo > /sys/module/zswap/parameters/compressor

When the compressor parameter is changed at runtime, any existing compressed
pages are not modified; they are left in their own pool. When a request is
made for a page in an old pool, it is uncompressed using its original
compressor. Once all pages are removed from an old pool, the pool and its
compressor are freed.

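
Before switching compressors at runtime it can help to check that the
requested algorithm is available. The sketch below makes two assumptions that
are not stated in this document: that the compressor is provided through the
kernel crypto API, and that built-in or already loaded algorithms are listed
in ``/proc/crypto`` as ``name : <algorithm>`` lines. An algorithm built as a
module that is not yet loaded would be reported as missing by this check. The
helper names and the ``ZSWAP_PARAMS`` and ``CRYPTO_LIST`` variables are
hypothetical.

```shell
#!/bin/sh
# Hypothetical overrides so the helpers can be pointed at test files.
ZSWAP_PARAMS="${ZSWAP_PARAMS:-/sys/module/zswap/parameters}"
CRYPTO_LIST="${CRYPTO_LIST:-/proc/crypto}"

# Succeed if an algorithm with the given name appears in the crypto list.
crypto_has() {
    awk -v want="$1" '
        $1 == "name" && $3 == want { found = 1 }
        END { exit !found }
    ' "$CRYPTO_LIST"
}

# Switch the zswap compressor only if the algorithm is listed.
zswap_set_compressor() {
    if ! crypto_has "$1"; then
        echo "compressor '$1' not listed in $CRYPTO_LIST" >&2
        return 1
    fi
    echo "$1" > "$ZSWAP_PARAMS/compressor"
}
```
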
Some of the pages in zswap are same-value filled pages (i.e. the contents of
the page are a single repeated value or pattern). These include zero-filled
pages, and they are handled differently. During a store operation, a page is
checked for being same-value filled before it is compressed. If it is, the
compressed length of the page is set to zero and the same-filled value or
pattern is stored instead.

When zswap is full and swap pressure is high, shrinking the pool only flips
pages in and out of the zswap pool, which costs performance without any real
benefit. To prevent this, a parameter implements a form of hysteresis: once
the limit has been hit, zswap refuses to take new pages until the pool has
sufficient space again. To set the threshold at which zswap starts accepting
pages again after it became full, use the sysfs ``accept_threshold_percent``
attribute, e.g.::

        echo 80 > /sys/module/zswap/parameters/accept_threshold_percent

Setting this parameter to 100 will disable the hysteresis.

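
To make the hysteresis concrete, the sketch below works out the two sizes
involved. It assumes that the pool limit is ``max_pool_percent`` of total RAM
and that the acceptance threshold is ``accept_threshold_percent`` of that
limit. The function name is hypothetical and the arithmetic is integer-only,
so results are rounded down.

```shell
#!/bin/sh
# Print "<limit_kb> <accept_kb>" for the given total RAM in kB,
# max_pool_percent and accept_threshold_percent.
# Assumption: the acceptance threshold is a percentage of the pool limit.
zswap_thresholds_kb() {
    total_kb=$1
    max_pct=$2
    accept_pct=$3
    limit_kb=$(( total_kb * max_pct / 100 ))
    accept_kb=$(( limit_kb * accept_pct / 100 ))
    echo "$limit_kb $accept_kb"
}
```

With 1000000 kB of RAM, ``max_pool_percent`` of 20 and
``accept_threshold_percent`` of 80, ``zswap_thresholds_kb 1000000 20 80``
prints ``200000 160000``: under the stated assumptions the pool is full at
200000 kB and starts accepting pages again once it has shrunk to about
160000 kB.
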
Some users cannot tolerate the swapping that comes with zswap store failures
and zswap writebacks. Swapping can be disabled entirely (without disabling
zswap itself) on a per-cgroup basis as follows::

        echo 0 > /sys/fs/cgroup/<cgroup-name>/memory.zswap.writeback

Note that if the store failures are recurring (for example, if the pages are
incompressible), users can observe reclaim inefficiency after disabling
writeback, because the same pages might be rejected again and again.

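
A script that applies this setting can first check that the control file
exists, since it is only present where the kernel and the cgroup expose it.
The sketch below assumes cgroup v2 is mounted at ``/sys/fs/cgroup``, as in the
command above. The helper name and the ``CGROUP_ROOT`` override variable are
hypothetical.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at a test directory.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Disable zswap writeback for one cgroup, given its path relative to the
# cgroup mount point. Fails if the control file is absent.
zswap_disable_writeback() {
    f="$CGROUP_ROOT/$1/memory.zswap.writeback"
    if [ ! -e "$f" ]; then
        echo "no memory.zswap.writeback in $CGROUP_ROOT/$1" >&2
        return 1
    fi
    echo 0 > "$f"
}
```
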
When there is a sizable amount of cold memory residing in the zswap pool, it
can be advantageous to proactively write these cold pages to swap and reclaim
the memory for other use cases. By default, the zswap shrinker is disabled.
Users can enable it as follows::

        echo Y > /sys/module/zswap/parameters/shrinker_enabled

This can be enabled at boot time if ``CONFIG_ZSWAP_SHRINKER_DEFAULT_ON`` is
selected.

A debugfs interface is provided for various statistics about pool size, number
of pages stored, same-value filled pages and various counters for the reasons
pages are rejected.
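
The statistics can be collected in one pass. The sketch below assumes debugfs
is mounted at ``/sys/kernel/debug`` and that zswap exposes its counters as
one file per statistic in a ``zswap`` directory there. Reading them normally
requires root. The helper name and the ``ZSWAP_DEBUGFS`` override variable are
hypothetical, and no particular file names are assumed.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at a test directory.
ZSWAP_DEBUGFS="${ZSWAP_DEBUGFS:-/sys/kernel/debug/zswap}"

# Print every statistic as "name value", one per line.
zswap_stats() {
    for f in "$ZSWAP_DEBUGFS"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```
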