.. SPDX-License-Identifier: GPL-2.0

:Author: Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>

==========
Swap Table
==========

The swap table implements the swap cache as a per-cluster array of swap cache
values.

Swap Entry
----------

A swap entry contains the information required to serve an anonymous page
fault.

A swap entry is encoded as two parts: the swap type and the swap offset.

The swap type indicates which swap device to use.
The swap offset is the offset within the swap file from which the page data
is read.
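
The two-part encoding can be pictured as packing both fields into one word.
The following is a minimal userspace sketch, not the kernel's actual layout:
the ``demo_`` names, the 5-bit type field, and the 64-bit word are all
assumptions made for illustration, and the real bit layout depends on the
architecture.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    /* Assumed layout: the top DEMO_TYPE_BITS bits hold the swap type and
     * the remaining low bits hold the swap offset. */
    #define DEMO_TYPE_BITS   5
    #define DEMO_TYPE_SHIFT  (64 - DEMO_TYPE_BITS)
    #define DEMO_OFFSET_MASK ((UINT64_C(1) << DEMO_TYPE_SHIFT) - 1)

    typedef struct { uint64_t val; } demo_swp_entry_t;

    /* Pack a swap type and a swap offset into a single word. */
    static inline demo_swp_entry_t demo_swp_entry(unsigned int type,
                                                  uint64_t offset)
    {
        demo_swp_entry_t entry;

        assert(type < (1u << DEMO_TYPE_BITS));
        assert(offset <= DEMO_OFFSET_MASK);
        entry.val = ((uint64_t)type << DEMO_TYPE_SHIFT) | offset;
        return entry;
    }

    /* Which swap device the entry refers to. */
    static inline unsigned int demo_swp_type(demo_swp_entry_t entry)
    {
        return (unsigned int)(entry.val >> DEMO_TYPE_SHIFT);
    }

    /* Where on that device the page data is read from. */
    static inline uint64_t demo_swp_offset(demo_swp_entry_t entry)
    {
        return entry.val & DEMO_OFFSET_MASK;
    }

Packing both parts into one word lets a swap entry be stored and compared as
a single value, which is what allows it to serve as a lookup key.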

Swap Cache
----------

The swap cache is a map used to look up folios with a swap entry as the key.
The resulting value has one of three possible types, depending on which stage
the swap entry is in.

1. NULL: This swap entry is not used.

2. folio: A folio has been allocated and bound to this swap entry. This is
   the transient state during swap out or swap in. The folio data can be in
   the folio, in the swap file, or in both.

3. shadow: The shadow contains the working set information of the swapped
   out folio. This is the normal state for a swapped out page.
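
One way to keep three kinds of value in a single pointer-sized slot is to tag
the low bit. The sketch below assumes such a scheme: zero means NULL, an
aligned pointer is a folio, and a word with bit 0 set carries shadow data.
The tagging rule and every ``demo_`` name are illustrative assumptions, not
the kernel's definitions.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    struct demo_folio { int id; };  /* stand-in for the kernel's folio */

    enum demo_slot_kind {
        DEMO_SLOT_NULL,     /* swap entry is not used */
        DEMO_SLOT_FOLIO,    /* folio bound to the entry (transient) */
        DEMO_SLOT_SHADOW,   /* working set info of a swapped out folio */
    };

    /* Store a folio pointer; alignment keeps bit 0 clear. */
    static inline uintptr_t demo_slot_from_folio(struct demo_folio *folio)
    {
        assert(((uintptr_t)folio & 1) == 0);
        return (uintptr_t)folio;
    }

    /* Store working set data shifted up, with bit 0 set as the tag. */
    static inline uintptr_t demo_slot_from_shadow(uintptr_t workingset)
    {
        return (workingset << 1) | 1;
    }

    /* Decide which of the three states a slot value represents. */
    static inline enum demo_slot_kind demo_slot_classify(uintptr_t slot)
    {
        if (slot == 0)
            return DEMO_SLOT_NULL;
        return (slot & 1) ? DEMO_SLOT_SHADOW : DEMO_SLOT_FOLIO;
    }

With this kind of tagging, a caller can tell the three states apart from the
slot value alone, without any extra per-entry metadata.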

Swap Table Internals
--------------------

The previous swap cache was implemented with an XArray. The XArray is a tree
structure, so each lookup goes through multiple nodes. Can we do better?

Notice that most of the time when we look up the swap cache, we are in either
a swap in or a swap out path. We should already have the swap cluster
that contains the swap entry.

If the cluster has a per-cluster array that stores the swap cache values,
swap cache lookup within the cluster becomes a very simple array lookup.

We give such a per-cluster swap cache value array a name: the swap table.

A swap table is an array of pointers. Each pointer is the same size as a
PTE. The size of a swap table for one swap cluster typically matches a PTE
page table, which is one page on modern 64-bit systems.

With the swap table, swap cache lookup has better locality and is simpler
and faster.
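
The array lookup described above can be sketched as follows. This is a
simplified userspace model: it assumes 4 KiB pages, so that a table of 8-byte
slots holds 512 entries on a 64-bit system, and it assumes the slot index is
the swap offset modulo the cluster size. The ``demo_`` names are hypothetical.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define DEMO_PAGE_SIZE     4096
    /* One page of pointer-sized slots: 512 on a 64-bit system. */
    #define DEMO_CLUSTER_SLOTS (DEMO_PAGE_SIZE / sizeof(uintptr_t))

    struct demo_cluster {
        uintptr_t table[DEMO_CLUSTER_SLOTS];    /* the swap table */
    };

    static_assert(sizeof(struct demo_cluster) == DEMO_PAGE_SIZE,
                  "the swap table should fill exactly one page");

    /* Index of a swap offset inside its cluster. */
    static inline size_t demo_cluster_index(uint64_t offset)
    {
        return (size_t)(offset % DEMO_CLUSTER_SLOTS);
    }

    /* Once the cluster is known, lookup is a single indexed load. */
    static inline uintptr_t demo_table_get(const struct demo_cluster *ci,
                                           uint64_t offset)
    {
        return ci->table[demo_cluster_index(offset)];
    }

    /* Store a swap cache value in the slot for this offset. */
    static inline void demo_table_set(struct demo_cluster *ci,
                                      uint64_t offset, uintptr_t value)
    {
        ci->table[demo_cluster_index(offset)] = value;
    }

Compared with a tree walk, the lookup touches one array slot, and neighbouring
swap offsets land in neighbouring slots of the same page.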

Locking
-------

Swap table modification requires taking the cluster lock. If a folio
is being added to or removed from the swap table, the folio must be
locked before the cluster lock is taken. After the addition or removal is
done, the folio shall be unlocked.

Swap table lookup is protected by RCU and atomic reads. If the lookup
returns a folio, the user must lock the folio before use.
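
The lock ordering above can be illustrated with a small userspace model.
It is only a sketch under stated assumptions: toy test-and-set locks built
on C11 atomics stand in for the folio lock and the cluster lock, RCU is not
modelled at all, and every ``demo_`` name is hypothetical.

.. code-block:: c

    #include <assert.h>
    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdint.h>

    #define DEMO_SLOTS 512

    /* Toy test-and-set lock; the kernel uses its own lock types. */
    struct demo_lock { atomic_flag held; };

    static inline void demo_lock_acquire(struct demo_lock *l)
    {
        while (atomic_flag_test_and_set_explicit(&l->held,
                                                 memory_order_acquire))
            ;   /* spin until the holder releases the lock */
    }

    static inline void demo_lock_release(struct demo_lock *l)
    {
        atomic_flag_clear_explicit(&l->held, memory_order_release);
    }

    struct demo_folio { struct demo_lock lock; };

    struct demo_cluster {
        struct demo_lock lock;
        _Atomic uintptr_t table[DEMO_SLOTS];
    };

    /* Writer: folio lock first, then the cluster lock nested inside it. */
    static inline void demo_swap_cache_add(struct demo_cluster *ci,
                                           size_t idx,
                                           struct demo_folio *folio)
    {
        assert(idx < DEMO_SLOTS);
        demo_lock_acquire(&folio->lock);
        demo_lock_acquire(&ci->lock);
        atomic_store_explicit(&ci->table[idx], (uintptr_t)folio,
                              memory_order_release);
        demo_lock_release(&ci->lock);
        demo_lock_release(&folio->lock);
    }

    /* Reader: one atomic load and no cluster lock. */
    static inline struct demo_folio *demo_lookup(struct demo_cluster *ci,
                                                 size_t idx)
    {
        assert(idx < DEMO_SLOTS);
        return (struct demo_folio *)atomic_load_explicit(
            &ci->table[idx], memory_order_acquire);
    }

A caller that gets a folio back from ``demo_lookup()`` would still need to
take ``folio->lock`` before using it, mirroring the rule stated above.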