  1. @node Tunables
  2. @c @node Tunables, , Internal Probes, Top
  3. @c %MENU% Tunable switches to alter libc internal behavior
  4. @chapter Tunables
  5. @cindex tunables
  6. @dfn{Tunables} are a feature in @theglibc{} that allows application authors and
  7. distribution maintainers to alter the runtime library behavior to match
  8. their workload. These are implemented as a set of switches that may be
  9. modified in different ways. The current default method to do this is via
  10. the @env{GLIBC_TUNABLES} environment variable by setting it to a string
  11. of colon-separated @var{name}=@var{value} pairs. For example, the following
  12. settings enable @code{malloc} checking and set the @code{malloc}
  13. trim threshold to 128
  14. bytes:
  15. @example
  16. GLIBC_TUNABLES=glibc.malloc.trim_threshold=128:glibc.malloc.check=3
  17. export GLIBC_TUNABLES
  18. @end example
  19. Tunables are not part of the @glibcadj{} stable ABI, and they are
  20. subject to change or removal across releases. Additionally, the method to
  21. modify tunable values may change between releases and across distributions.
  22. It is possible to implement multiple `frontends' for the tunables allowing
  23. distributions to choose their preferred method at build time.
  24. Finally, the set of tunables available may vary between distributions as
  25. the tunables feature allows distributions to add their own tunables under
  26. their own namespace.
  27. Pass @option{--list-tunables} to the dynamic loader to print all
  28. tunables with minimum and maximum values:
  29. @example
  30. $ /lib64/ld-linux-x86-64.so.2 --list-tunables
  31. glibc.rtld.nns: 0x4 (min: 0x1, max: 0x10)
  32. glibc.malloc.trim_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  33. glibc.malloc.perturb: 0 (min: 0, max: 255)
  34. glibc.cpu.x86_shared_cache_size: 0x100000 (min: 0x0, max: 0xffffffffffffffff)
  35. glibc.pthread.rseq: 1 (min: 0, max: 1)
  36. glibc.cpu.prefer_map_32bit_exec: 0 (min: 0, max: 1)
  37. glibc.mem.tagging: 0 (min: 0, max: 255)
  38. glibc.malloc.hugetlb: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  39. glibc.cpu.x86_rep_movsb_threshold: 0x2000 (min: 0x100, max: 0xffffffffffffffff)
  40. glibc.malloc.mxfast: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  41. glibc.rtld.dynamic_sort: 2 (min: 1, max: 2)
  42. glibc.malloc.top_pad: 0x20000 (min: 0x0, max: 0xffffffffffffffff)
  43. glibc.cpu.x86_rep_stosb_threshold: 0x800 (min: 0x1, max: 0xffffffffffffffff)
  44. glibc.cpu.x86_non_temporal_threshold: 0xc0000 (min: 0x4040, max: 0xfffffffffffffff)
  45. glibc.cpu.x86_memset_non_temporal_threshold: 0xc0000 (min: 0x4040, max: 0xfffffffffffffff)
  46. glibc.cpu.x86_shstk:
  47. glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff)
  48. glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647)
  49. glibc.cpu.plt_rewrite: 0 (min: 0, max: 2)
  50. glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  51. glibc.cpu.x86_ibt:
  52. glibc.cpu.hwcaps:
  53. glibc.malloc.arena_max: 0x0 (min: 0x1, max: 0xffffffffffffffff)
  54. glibc.malloc.mmap_threshold: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  55. glibc.cpu.x86_data_cache_size: 0x8000 (min: 0x0, max: 0xffffffffffffffff)
  56. glibc.malloc.tcache_count: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  57. glibc.malloc.arena_test: 0x0 (min: 0x1, max: 0xffffffffffffffff)
  58. glibc.pthread.mutex_spin_count: 100 (min: 0, max: 32767)
  59. glibc.rtld.optional_static_tls: 0x200 (min: 0x0, max: 0xffffffffffffffff)
  60. glibc.malloc.tcache_max: 0x0 (min: 0x0, max: 0xffffffffffffffff)
  61. glibc.malloc.check: 0 (min: 0, max: 3)
  62. @end example
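
To inspect a single namespace, the listing can be filtered with standard
tools. The following is a minimal sketch; the loader path is the one used in
the example above and differs between architectures and distributions:

@example
$ /lib64/ld-linux-x86-64.so.2 --list-tunables | grep '^glibc\.malloc\.'
@end example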
  63. @menu
  64. * Tunable names:: The structure of a tunable name
  65. * Memory Allocation Tunables:: Tunables in the memory allocation subsystem
  66. * Dynamic Linking Tunables:: Tunables in the dynamic linking subsystem
  67. * POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
  68. * Hardware Capability Tunables:: Tunables that modify the hardware
  69. capabilities seen by @theglibc{}
  70. * Memory Related Tunables:: Tunables that control the use of memory by
  71. @theglibc{}.
  72. * gmon Tunables:: Tunables that control the gmon profiler, used in
  73. conjunction with gprof
  74. @end menu
  75. @node Tunable names
  76. @section Tunable names
  77. @cindex Tunable names
  78. @cindex Tunable namespaces
  79. A tunable name is split into three components: a top namespace, a tunable
  80. namespace and the tunable name. The top namespace for tunables implemented in
  81. @theglibc{} is @code{glibc}. Distributions that choose to add custom tunables
  82. in their maintained versions of @theglibc{} may choose to do so under their own
  83. top namespace.
  84. The tunable namespace is a logical grouping of tunables in a single
  85. module. This currently holds no special significance, although that may
  86. change in the future.
  87. The tunable name is the actual name of the tunable. It is possible that
  88. different tunable namespaces may have tunables within them that have the
  89. same name, likewise for top namespaces. Hence, we only support
  90. identification of tunables by their full name, i.e. with the top
  91. namespace, tunable namespace and tunable name, separated by periods.
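
As an illustration, a full tunable name can be split at the periods into its
three components. This is only a sketch using standard shell tools; the
variable names are chosen for this example and are not part of @theglibc{}:

@example
name=glibc.malloc.trim_threshold
top=$(echo "$name" | cut -d. -f1)        # top namespace
namespace=$(echo "$name" | cut -d. -f2)  # tunable namespace
tunable=$(echo "$name" | cut -d. -f3)    # tunable name
echo "$top / $namespace / $tunable"
@end example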
  92. @node Memory Allocation Tunables
  93. @section Memory Allocation Tunables
  94. @cindex memory allocation tunables
  95. @cindex malloc tunables
  96. @cindex tunables, malloc
  97. @deftp {Tunable namespace} glibc.malloc
  98. Memory allocation behavior can be modified by setting any of the
  99. following tunables in the @code{malloc} namespace:
  100. @end deftp
  101. @deftp Tunable glibc.malloc.check
  102. This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
  103. identical in features. This tunable has no effect by default and needs the
  104. debug library @file{libc_malloc_debug} to be preloaded using the
  105. @code{LD_PRELOAD} environment variable.
  106. Setting this tunable to a non-zero value less than 4 enables a special (less
  107. efficient) memory allocator for the @code{malloc} family of functions that is
  108. designed to be tolerant against simple errors such as double calls of
  109. free with the same argument, or overruns of a single byte (off-by-one
  110. bugs). Not all such errors can be protected against, however, and memory
  111. leaks can result. Any detected heap corruption results in immediate
  112. termination of the process.
  113. Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
  114. diverges from normal program behavior by writing to @code{stderr}, which could
  115. be exploited in SUID and SGID binaries. Therefore, @code{glibc.malloc.check}
  116. is disabled by default for SUID and SGID binaries.
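
A possible invocation is sketched below. It assumes that the debug library is
installed under the soname @file{libc_malloc_debug.so.0}, which may differ on
your system, and @file{./myapp} is a placeholder for the program under test:

@example
LD_PRELOAD=libc_malloc_debug.so.0 GLIBC_TUNABLES=glibc.malloc.check=3 ./myapp
@end example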
  117. @end deftp
  118. @deftp Tunable glibc.malloc.top_pad
  119. This tunable supersedes the @env{MALLOC_TOP_PAD_} environment variable and is
  120. identical in features.
  121. This tunable determines the amount of extra memory in bytes to obtain from the
  122. system when any of the arenas need to be extended. It also specifies the
  123. number of bytes to retain when shrinking any of the arenas. This provides the
  124. necessary hysteresis in heap size such that excessive amounts of system calls
  125. can be avoided.
  126. The default value of this tunable is @samp{131072} (128 KB).
  127. @end deftp
  128. @deftp Tunable glibc.malloc.perturb
  129. This tunable supersedes the @env{MALLOC_PERTURB_} environment variable and is
  130. identical in features.
  131. If set to a non-zero value, memory blocks are initialized with values depending
  132. on some low order bits of this tunable when they are allocated (except when
  133. allocated by @code{calloc}) and freed. This can be used to debug the use of
  134. uninitialized or freed heap memory. Note that this option does not guarantee
  135. that the freed block will have any specific values. It only guarantees that the
  136. content the block had before it was freed will be overwritten.
  137. The default value of this tunable is @samp{0}.
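
For example, the following sketch runs a program with perturbation enabled.
The value @samp{165} is an arbitrary non-zero choice within the accepted
range, and @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.malloc.perturb=165 ./myapp
@end example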
  138. @end deftp
  139. @deftp Tunable glibc.malloc.mmap_threshold
  140. This tunable supersedes the @env{MALLOC_MMAP_THRESHOLD_} environment variable
  141. and is identical in features.
  142. When this tunable is set, all chunks larger than this value in bytes are
  143. allocated outside the normal heap, using the @code{mmap} system call. This way
  144. it is guaranteed that the memory for these chunks can be returned to the system
  145. on @code{free}. Note that requests smaller than this threshold might still be
  146. allocated via @code{mmap}.
  147. If this tunable is not set, the default value is set to @samp{131072} bytes and
  148. the threshold is adjusted dynamically to suit the allocation patterns of the
  149. program. If the tunable is set, the dynamic adjustment is disabled and the
  150. value is set as static.
  151. @end deftp
  152. @deftp Tunable glibc.malloc.trim_threshold
  153. This tunable supersedes the @env{MALLOC_TRIM_THRESHOLD_} environment variable
  154. and is identical in features.
  155. The value of this tunable is the minimum size (in bytes) of the top-most,
  156. releasable chunk in an arena that will trigger a system call in order to return
  157. memory to the system from that arena.
  158. If this tunable is not set, the default value is set as 128 KB and the
  159. threshold is adjusted dynamically to suit the allocation patterns of the
  160. program. If the tunable is set, the dynamic adjustment is disabled and the
  161. value is set as static.
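
The sketch below fixes both the @code{mmap} threshold and the trim threshold,
which also disables their dynamic adjustment as described above. The values
(1 MiB and 4 MiB) are illustrative only, and @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.malloc.mmap_threshold=1048576:glibc.malloc.trim_threshold=4194304
export GLIBC_TUNABLES
./myapp
@end example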
  162. @end deftp
  163. @deftp Tunable glibc.malloc.mmap_max
  164. This tunable supersedes the @env{MALLOC_MMAP_MAX_} environment variable and is
  165. identical in features.
  166. The value of this tunable is the maximum number of chunks to allocate with
  167. @code{mmap}. Setting this to zero disables all use of @code{mmap}.
  168. The default value of this tunable is @samp{65536}.
  169. @end deftp
  170. @deftp Tunable glibc.malloc.arena_test
  171. This tunable supersedes the @env{MALLOC_ARENA_TEST} environment variable and is
  172. identical in features.
  173. The @code{glibc.malloc.arena_test} tunable specifies the number of arenas that
  174. can be created before the test on the limit to the number of arenas is
  175. conducted. The value is ignored if @code{glibc.malloc.arena_max} is set.
  176. The default value of this tunable is 2 for 32-bit systems and 8 for 64-bit
  177. systems.
  178. @end deftp
  179. @deftp Tunable glibc.malloc.arena_max
  180. This tunable supersedes the @env{MALLOC_ARENA_MAX} environment variable and is
  181. identical in features.
  182. This tunable sets the number of arenas to use in a process regardless of the
  183. number of cores in the system.
  184. The default value of this tunable is @code{0}, meaning that the limit on the
  185. number of arenas is determined by the number of CPU cores online. For 32-bit
  186. systems the limit is twice the number of cores online and on 64-bit systems, it
  187. is 8 times the number of cores online.
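
For instance, a memory-constrained deployment might cap the number of arenas
regardless of the core count. The limit of @samp{2} is an arbitrary
illustration and @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.malloc.arena_max=2 ./myapp
@end example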
  188. @end deftp
  189. @deftp Tunable glibc.malloc.tcache_max
  190. The maximum size of a request (in bytes) which may be met via the
  191. per-thread cache. The default (and maximum) value is 1032 bytes on
  192. 64-bit systems and 516 bytes on 32-bit systems.
  193. @end deftp
  194. @deftp Tunable glibc.malloc.tcache_count
  195. The maximum number of chunks of each size to cache. The default is 7.
  196. The upper limit is 65535. If set to zero, the per-thread cache is effectively
  197. disabled.
  198. The maximum overhead of the per-thread cache is thus approximately equal
  199. to the number of bins times the chunk count in each bin times the size
  200. of each chunk. With defaults, the maximum overhead of the
  201. per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
  202. on 32-bit systems.
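
The 64-bit figure can be reproduced with a short calculation. The sketch
below assumes 64 bins whose request sizes run from 24 to 1032 bytes in
16-byte steps; the bin count, smallest size and step are assumptions about
the allocator internals that are not stated in this manual:

@example
bins=64     # assumed number of per-thread cache bins
count=7     # default glibc.malloc.tcache_count
size=24     # assumed smallest cached request size on 64-bit systems
total=0
i=0
while [ "$i" -lt "$bins" ]; do
  total=$((total + count * size))
  size=$((size + 16))   # assumed spacing between bin sizes
  i=$((i + 1))
done
echo "$total"   # prints 236544, that is, roughly 236 KB
@end example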
  203. @end deftp
  204. @deftp Tunable glibc.malloc.tcache_unsorted_limit
  205. When the user requests memory and the request cannot be met via the
  206. per-thread cache, the arenas are used to meet the request. At this
  207. time, additional chunks will be moved from existing arena lists to
  208. pre-fill the corresponding cache. While copies from the fastbins,
  209. smallbins, and regular bins are bounded and predictable due to the bin
  210. sizes, copies from the unsorted bin are not bounded, and incur
  211. additional time penalties as they need to be sorted as they're
  212. scanned. To make scanning the unsorted list more predictable and
  213. bounded, the user may set this tunable to limit the number of chunks
  214. that are scanned from the unsorted list while searching for chunks to
  215. pre-fill the per-thread cache with. The default, or when set to zero,
  216. is no limit.
  217. @end deftp
  218. @deftp Tunable glibc.malloc.mxfast
  219. One of the optimizations @code{malloc} uses is to maintain a series of ``fast
  220. bins'' that hold chunks up to a specific size. The default and
  221. maximum size which may be held this way is 80 bytes on 32-bit systems
  222. or 160 bytes on 64-bit systems. Applications which value size over
  223. speed may choose to reduce the size of requests which are serviced
  224. from fast bins with this tunable. Note that the value specified
  225. includes @code{malloc}'s internal overhead, which is normally the size of one
  226. pointer, so add 4 on 32-bit systems or 8 on 64-bit systems to the size
  227. passed to @code{malloc} for the largest bin size to enable.
  228. @end deftp
  229. @deftp Tunable glibc.malloc.hugetlb
  230. This tunable controls the usage of Huge Pages on @code{malloc} calls. The
  231. default value is @code{0}, which disables any additional support on
  232. @code{malloc}.
  233. Setting its value to @code{1} enables the use of @code{madvise} with
  234. @code{MADV_HUGEPAGE} after memory allocation with @code{mmap}. It is enabled
  235. only if the system supports Transparent Huge Pages (currently only on Linux).
  236. Setting its value to @code{2} enables the direct use of Huge Pages through
  237. @code{mmap} with the @code{MAP_HUGETLB} flag. The huge page size
  238. to use will be the default one provided by the system. A value larger than
  239. @code{2} specifies the huge page size, which is matched against the sizes
  240. supported by the system. If the provided value is invalid, @code{MAP_HUGETLB} will not
  241. be used.
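
The three modes can be selected as sketched below. The explicit size of
2097152 bytes (2 MiB) is only an assumption about what the system supports,
and @file{./myapp} is a placeholder:

@example
# Use madvise with MADV_HUGEPAGE (Transparent Huge Pages).
GLIBC_TUNABLES=glibc.malloc.hugetlb=1 ./myapp
# Use MAP_HUGETLB with the default huge page size of the system.
GLIBC_TUNABLES=glibc.malloc.hugetlb=2 ./myapp
# Use MAP_HUGETLB with an explicit huge page size in bytes.
GLIBC_TUNABLES=glibc.malloc.hugetlb=2097152 ./myapp
@end example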
  242. @end deftp
  243. @node Dynamic Linking Tunables
  244. @section Dynamic Linking Tunables
  245. @cindex dynamic linking tunables
  246. @cindex rtld tunables
  247. @deftp {Tunable namespace} glibc.rtld
  248. Dynamic linker behavior can be modified by setting the
  249. following tunables in the @code{rtld} namespace:
  250. @end deftp
  251. @deftp Tunable glibc.rtld.nns
  252. Sets the number of supported dynamic link namespaces (see @code{dlmopen}).
  253. Currently this limit can be set between 1 and 16 inclusive; the default is 4.
  254. Each link namespace consumes some memory in all threads, and thus raising the
  255. limit will increase the amount of memory each thread uses. Raising the limit
  256. is useful when your application uses more than 4 dynamic link namespaces as
  257. created by @code{dlmopen} with an lmid argument of @code{LM_ID_NEWLM}.
  258. Dynamic linker audit modules are loaded in their own dynamic link namespaces,
  259. but they are not accounted for in @code{glibc.rtld.nns}. They implicitly
  260. increase the per-thread memory usage as necessary, so this tunable does
  261. not need to be changed to allow many audit modules e.g. via @env{LD_AUDIT}.
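
For example, an application that creates up to eight link namespaces with
@code{dlmopen} could be started as sketched below. The value @samp{8} is
illustrative and @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.rtld.nns=8 ./myapp
@end example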
  262. @end deftp
  263. @deftp Tunable glibc.rtld.optional_static_tls
  264. Sets the amount of surplus static TLS in bytes to allocate at program
  265. startup. Every thread created allocates this amount of specified surplus
  266. static TLS. This is a minimum value and additional space may be allocated
  267. for internal purposes including alignment. Optional static TLS is used for
  268. optimizing dynamic TLS access for platforms that support such optimizations
  269. e.g. TLS descriptors or optimized TLS access for POWER (@code{DT_PPC64_OPT}
  270. and @code{DT_PPC_OPT}). In order to make the best use of such optimizations
  271. the value should be as many bytes as would be required to hold all TLS
  272. variables in all dynamic loaded shared libraries. The value cannot be known
  273. by the dynamic loader because it doesn't know the expected set of shared
  274. libraries which will be loaded. The existing static TLS space cannot be
  275. changed once allocated at process startup. The default allocation of
  276. optional static TLS is 512 bytes and is allocated in every thread.
  277. @end deftp
  278. @deftp Tunable glibc.rtld.dynamic_sort
  279. Sets the algorithm to use for DSO sorting. Valid values are @samp{1} and
  280. @samp{2}. For a value of @samp{1}, an older O(n^3) algorithm is used, which has
  281. been tested over a long time, but may have performance issues when dependencies between
  282. shared objects contain cycles due to circular dependencies. When set to the
  283. value of @samp{2}, a different algorithm is used, which implements a
  284. topological sort through depth-first search, and does not exhibit the
  285. performance issues of @samp{1}.
  286. The default value of this tunable is @samp{2}.
  287. @end deftp
  288. @deftp Tunable glibc.rtld.enable_secure
  289. Used to run a program as if it were a setuid process. The only valid value
  290. is @samp{1} as this tunable can only be used to set and not unset
  291. @code{enable_secure}. Setting this tunable to @samp{1} also disables all other
  292. tunables. This tunable is intended to facilitate more extensive verification
  293. tests for @code{AT_SECURE} programs and not meant to be a security feature.
  294. The default value of this tunable is @samp{0}.
  295. @end deftp
  296. @deftp Tunable glibc.rtld.execstack
  297. @Theglibc{} uses either the default architecture ABI flags (which might
  298. contain the executable bit) or the value of @code{PT_GNU_STACK} (if present)
  299. to decide whether to mark the stack non-executable. If the program or
  300. any shared library dependency requires an executable stack, the loader
  301. changes the main stack permission if the kernel starts with a non-executable stack.
  302. The @code{glibc.rtld.execstack} tunable can be used to control whether an executable
  303. stack is allowed from the main program. Setting the value to @code{0} disables
  304. the ABI auto-negotiation (meaning no executable stacks even if the ABI or ELF
  305. header requires it), @code{1} enables auto-negotiation (although the program
  306. might not need an executable stack), while @code{2} forces an executable
  307. stack at process start. This is provided for compatibility reasons, when
  308. the program dynamically loads modules with @code{dlopen} which require
  309. an executable stack.
  310. When executable stacks are not allowed, and if the main program requires it,
  311. the loader will fail with an error message.
  312. Some systems do not have separate page protection flags at the hardware
  313. level for read access and execute access (sometimes called read-implies-exec).
  314. This mode can also be enabled on certain systems where the hardware supports
  315. separate protection flags. The @glibcadj{} tunable configuration is independent
  316. of hardware capabilities and kernel configuration.
  317. @strong{NB:} Trying to load a dynamic shared library with @code{dlopen} or
  318. @code{dlmopen} that requires an executable stack will always fail if the
  319. main program does not require an executable stack at loading time. This
  320. can be worked around by setting the tunable to @code{2}, where the stack is
  321. always executable.
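
The two non-default settings described above can be selected as follows.
This is a sketch only; @file{./myapp} is a placeholder:

@example
# Never allow an executable stack, even if the ABI or ELF header asks for one.
GLIBC_TUNABLES=glibc.rtld.execstack=0 ./myapp
# Force an executable stack at process start, for modules loaded with dlopen.
GLIBC_TUNABLES=glibc.rtld.execstack=2 ./myapp
@end example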
  322. @end deftp
  323. @node POSIX Thread Tunables
  324. @section POSIX Thread Tunables
  325. @cindex pthread mutex tunables
  326. @cindex thread mutex tunables
  327. @cindex mutex tunables
  328. @cindex tunables thread mutex
  329. @deftp {Tunable namespace} glibc.pthread
  330. The behavior of POSIX threads can be tuned to gain performance improvements
  331. according to specific hardware capabilities and workload characteristics by
  332. setting the following tunables in the @code{pthread} namespace:
  333. @end deftp
  334. @deftp Tunable glibc.pthread.mutex_spin_count
  335. The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
  336. a thread should spin on the lock before calling into the kernel to block.
  337. Adaptive spin is used for mutexes initialized with the
  338. @code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension. It affects both
  339. @code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
  340. The thread spins until either the maximum spin count is reached or the lock
  341. is acquired.
  342. The default value of this tunable is @samp{100}.
  343. @end deftp
  344. @deftp Tunable glibc.pthread.stack_cache_size
  345. This tunable configures the maximum size of the stack cache. Once the
  346. stack cache exceeds this size, unused thread stacks are returned to
  347. the kernel, to bring the cache size below this limit.
  348. The value is measured in bytes. The default is @samp{41943040}
  349. (forty mebibytes).
  350. @end deftp
  351. @deftp Tunable glibc.pthread.rseq
  352. The @code{glibc.pthread.rseq} tunable can be set to @samp{0}, to disable
  353. restartable sequences support in @theglibc{}. This enables applications
  354. to perform direct restartable sequence registration with the kernel.
  355. The default is @samp{1}, which means that @theglibc{} performs
  356. registration on behalf of the application.
  357. Restartable sequences are a Linux-specific extension.
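
An application that registers restartable sequences itself could be started
as sketched below, where @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.pthread.rseq=0 ./myapp
@end example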
  358. @end deftp
  359. @deftp Tunable glibc.pthread.stack_hugetlb
  360. This tunable controls whether to use Huge Pages in the stacks created by
  361. @code{pthread_create}. This tunable only affects the stacks created by
  362. @theglibc{}; it has no effect on stacks assigned with
  363. @code{pthread_attr_setstack}.
  364. The default is @samp{1} where the system default value is used. Setting
  365. its value to @code{0} enables the use of @code{madvise} with
  366. @code{MADV_NOHUGEPAGE} after stack creation with @code{mmap}.
  367. This is a memory utilization optimization, since internal glibc setup of either
  368. the thread descriptor or the guard page might force the kernel to move the
  369. thread stack originally backed by Huge Pages to default pages.
  370. @end deftp
  371. @node Hardware Capability Tunables
  372. @section Hardware Capability Tunables
  373. @cindex hardware capability tunables
  374. @cindex hwcap tunables
  375. @cindex tunables, hwcap
  376. @cindex hwcaps tunables
  377. @cindex tunables, hwcaps
  378. @cindex data_cache_size tunables
  379. @cindex tunables, data_cache_size
  380. @cindex shared_cache_size tunables
  381. @cindex tunables, shared_cache_size
  382. @cindex non_temporal_threshold tunables
  383. @cindex memset_non_temporal_threshold tunables
  384. @cindex tunables, non_temporal_threshold, memset_non_temporal_threshold
  385. @deftp {Tunable namespace} glibc.cpu
  386. Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
  387. by setting the following tunables in the @code{cpu} namespace:
  388. @end deftp
  389. @deftp Tunable glibc.cpu.hwcaps
  390. The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
  391. enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
  392. and @code{zzz} where the feature name is case-sensitive and has to match
  393. the ones in @code{sysdeps/x86/include/cpu-features.h}.
  394. On s390x, the supported HWCAP and STFLE features can be found in
  395. @code{sysdeps/s390/cpu-features.c}. In addition the user can also set
  396. a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.
  397. On powerpc, the supported HWCAP and HWCAP2 features can be found in
  398. @code{sysdeps/powerpc/dl-procinfo.c}.
  399. On loongarch, the supported HWCAP features can be found in
  400. @code{sysdeps/loongarch/cpu-tunables.c}.
  401. This tunable is specific to i386, x86-64, s390x, powerpc and loongarch.
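
As a sketch, the following disables two features on x86-64. The feature
names @code{AVX2} and @code{ERMS} are assumed to be accepted by the installed
version of @theglibc{}; check them against the file named above before
relying on them. @file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX2,-ERMS ./myapp
@end example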
  402. @end deftp
  403. @deftp Tunable glibc.cpu.cached_memopt
  404. The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
  405. enable optimizations recommended for cacheable memory. If set to
  406. @code{1}, @theglibc{} assumes that the process memory image consists
  407. of cacheable (non-device) memory only. The default, @code{0},
  408. indicates that the process may use device memory.
  409. This tunable is specific to powerpc, powerpc64 and powerpc64le.
  410. @end deftp
  411. @deftp Tunable glibc.cpu.name
  412. The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
  413. assume that the CPU is @code{xxx} where xxx may have one of these values:
  414. @code{generic}, @code{thunderxt88}, @code{thunderx2t99},
  415. @code{thunderx2t99p1}, @code{ares}, @code{emag}, @code{kunpeng},
  416. @code{a64fx}.
  417. This tunable is specific to aarch64.
  418. @end deftp
  419. @deftp Tunable glibc.cpu.x86_data_cache_size
  420. The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
  421. data cache size in bytes for use in memory and string routines.
  422. This tunable is specific to i386 and x86-64.
  423. @end deftp
  424. @deftp Tunable glibc.cpu.x86_shared_cache_size
  425. The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
  426. set shared cache size in bytes for use in memory and string routines.
  427. @end deftp
  428. @deftp Tunable glibc.cpu.x86_non_temporal_threshold
  429. The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
  430. to set the threshold in bytes for non-temporal stores. Non-temporal stores
  431. give a hint to the hardware to move data directly to memory without
  432. displacing other data from the cache. This tunable is used by some
  433. platforms to determine when to use non-temporal stores in operations
  434. like @code{memmove} and @code{memcpy}.
  435. This tunable is specific to i386 and x86-64.
  436. @end deftp
  437. @deftp Tunable glibc.cpu.x86_memset_non_temporal_threshold
  438. The @code{glibc.cpu.x86_memset_non_temporal_threshold} tunable allows
  439. the user to set the threshold in bytes for non-temporal stores in
  440. @code{memset}. Non-temporal stores give a hint to the hardware to move data
  441. directly to memory without displacing other data from the cache. This
  442. tunable is used by some platforms to determine when to use non-temporal
  443. stores in @code{memset}.
  444. This tunable is specific to i386 and x86-64.
  445. @end deftp
  446. @deftp Tunable glibc.cpu.x86_rep_movsb_threshold
  447. The @code{glibc.cpu.x86_rep_movsb_threshold} tunable allows the user to
  448. set threshold in bytes to start using "rep movsb". The value must be
  449. greater than zero, and currently defaults to 2048 bytes.
  450. This tunable is specific to i386 and x86-64.
  451. @end deftp
  452. @deftp Tunable glibc.cpu.x86_rep_stosb_threshold
  453. The @code{glibc.cpu.x86_rep_stosb_threshold} tunable allows the user to
  454. set threshold in bytes to start using "rep stosb". The value must be
  455. greater than zero, and currently defaults to 2048 bytes.
  456. This tunable is specific to i386 and x86-64.
  457. @end deftp
  458. @deftp Tunable glibc.cpu.x86_ibt
  459. The @code{glibc.cpu.x86_ibt} tunable allows the user to control how
  460. indirect branch tracking (IBT) should be enabled. Accepted values are
  461. @code{on}, @code{off}, and @code{permissive}. @code{on} always turns
  462. on IBT regardless of whether IBT is enabled in the executable and its
  463. dependent shared libraries. @code{off} always turns off IBT regardless
  464. of whether IBT is enabled in the executable and its dependent shared
  465. libraries. @code{permissive} is the same as the default which disables
  466. IBT on non-CET executables and shared libraries.
  467. This tunable is specific to i386 and x86-64.
  468. @end deftp
  469. @deftp Tunable glibc.cpu.x86_shstk
  470. The @code{glibc.cpu.x86_shstk} tunable allows the user to control how
  471. the shadow stack (SHSTK) should be enabled. Accepted values are
  472. @code{on}, @code{off}, and @code{permissive}. @code{on} always turns on
  473. SHSTK regardless of whether SHSTK is enabled in the executable and its
  474. dependent shared libraries. @code{off} always turns off SHSTK regardless
  475. of whether SHSTK is enabled in the executable and its dependent shared
  476. libraries. @code{permissive} changes how dlopen works on non-CET shared
  477. libraries. By default, when SHSTK is enabled, dlopening a non-CET shared
  478. library returns an error. With @code{permissive}, it turns off SHSTK
  479. instead.
  480. This tunable is specific to i386 and x86-64.
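
For example, to let @code{dlopen} of a non-CET shared library turn off SHSTK
instead of failing, the tunable could be set as sketched below, where
@file{./myapp} is a placeholder:

@example
GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive ./myapp
@end example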
  481. @end deftp
  482. @deftp Tunable glibc.cpu.prefer_map_32bit_exec
  483. When this tunable is set to @code{1}, shared libraries of non-setuid
  484. programs will be loaded below 2GB with @code{MAP_32BIT}.
  485. Note that the @env{LD_PREFER_MAP_32BIT_EXEC} environment variable is an alias of
  486. this tunable.
  487. This tunable is specific to 64-bit x86-64.
  488. @end deftp
@deftp Tunable glibc.cpu.plt_rewrite
When this tunable is set to @code{1}, the dynamic linker will rewrite
the PLT section with 32-bit direct jumps. When it is set to @code{2},
the dynamic linker will rewrite the PLT section with 32-bit direct
jumps and, on APX processors, with 64-bit absolute jumps.

This tunable is specific to x86-64 and is effective only when lazy
binding is disabled.
@end deftp
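Because the rewrite only takes effect without lazy binding, a run
typically has to disable lazy binding as well. The sketch below does so
at run time with @env{LD_BIND_NOW} for a hypothetical program
@file{./app}; linking the program with @code{-z now} is another way to
get the same effect.

@example
# Bind all symbols at startup, then ask for 32-bit direct jumps in the PLT.
LD_BIND_NOW=1 GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 ./app
@end example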
@deftp Tunable glibc.cpu.aarch64_bti
This tunable controls Branch Target Identification (BTI) handling for the
process. This handling is implemented by protecting the memory mappings
with @code{PROT_BTI} for modules that are marked with the appropriate ELF
property @code{GNU_PROPERTY_AARCH64_FEATURE_1_BTI} (see Program Loading in
@url{https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst}).

Accepted values are:

0 = permissive: BTI protection is enabled only for modules that have BTI
marking (default).

1 = enforced: if a module that does not have BTI marking is loaded, it is
an error (either a process abort or a @code{dlopen} error if this binary
is loaded via @code{dlopen}).
@end deftp
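As a rough sketch, the commands below first inspect the ELF property
notes of a hypothetical binary @file{./app} and then run it in enforced
mode. @command{readelf} from binutils is assumed to be installed, and
the exact wording of its output can vary between versions.

@example
# Show the GNU property notes; a BTI-marked binary lists BTI among
# its AArch64 features.
readelf -n ./app | grep -i 'AArch64 feature'

# Treat any module without BTI marking as an error.
GLIBC_TUNABLES=glibc.cpu.aarch64_bti=1 ./app
@end example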
@deftp Tunable glibc.cpu.aarch64_gcs
This tunable controls Guarded Control Stack (GCS) for the process.

Accepted values are:

0 = disabled: do not enable GCS.

1 = enforced: check markings and fail if any binary is not marked.

2 = optional: check markings but keep GCS off if any binary is unmarked.

3 = override: enable GCS, markings are ignored.

If an unmarked binary is loaded via @code{dlopen} when GCS is enabled and
markings are not ignored (@code{aarch64_gcs == 1} or @code{2}), then
the process will be aborted.

Default is @code{0}, so GCS is disabled by default.

This tunable is specific to AArch64. On systems that do not support
Guarded Control Stack this tunable has no effect.

Before enabling GCS for the process, the value of this tunable is checked
and, depending on it, the following outcomes are possible.

@code{aarch64_gcs == 0}: GCS will not be enabled and GCS markings will not be
checked for any binaries.

@code{aarch64_gcs == 1}: GCS markings will be checked for all binaries loaded
at startup and, only if all binaries are GCS-marked, GCS will be enabled. If
any of the binaries are not GCS-marked, the process will abort. A subsequent
call to @code{dlopen} for an unmarked binary will also result in an abort.

@code{aarch64_gcs == 2}: GCS markings will be checked for all binaries loaded
at startup and, if any of these binaries are not GCS-marked, GCS will not be
enabled and there will be no more checks for GCS marking. If all binaries
loaded at startup are GCS-marked, then GCS will be enabled, in which case a
call to @code{dlopen} for an unmarked binary will also result in an abort.

@code{aarch64_gcs == 3}: GCS will be enabled and GCS markings will not be
checked for any binaries.
@end deftp
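To make the modes concrete, the following sketch runs a hypothetical
program @file{./app} under two of the settings. The comments only
restate the behavior described above.

@example
# Optional: GCS is enabled only if every binary loaded at startup is marked.
GLIBC_TUNABLES=glibc.cpu.aarch64_gcs=2 ./app

# Enforced: the process aborts if any binary loaded at startup is unmarked.
GLIBC_TUNABLES=glibc.cpu.aarch64_gcs=1 ./app
@end example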
@node Memory Related Tunables
@section Memory Related Tunables
@cindex memory related tunables

@deftp {Tunable namespace} glibc.mem
This tunable namespace supports operations that affect the way @theglibc{}
and the process manage memory.
@end deftp
@deftp Tunable glibc.mem.tagging
If the hardware supports memory tagging, this tunable can be used to
control the way @theglibc{} uses this feature. At present this is only
supported on AArch64 systems with the MTE extension; it is ignored for
all other systems.

This tunable takes a value between 0 and 255 and acts as a bitmask
that enables various capabilities.

Bit 0 (the least significant bit) causes the @code{malloc}
subsystem to allocate
tagged memory, with each allocation being assigned a random tag.

Bit 1 enables precise faulting mode for tag violations on systems that
support deferred tag violation reporting. This may cause programs
to run more slowly.

Bit 2 enables either precise or deferred faulting mode for tag violations,
whichever is preferred by the system.

Other bits are currently reserved.

@Theglibc{} startup code will automatically enable memory tagging
support in the kernel if this tunable has any non-zero value.

The default value is @samp{0}, which disables all memory tagging.
@end deftp
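Since the value is a bitmask, the bits are added together: bit 0
contributes 1, bit 1 contributes 2, and bit 2 contributes 4. The
sketch below shows two combinations for a hypothetical program
@file{./app} on an MTE-capable AArch64 system.

@example
# 1 + 2 = 3: tagged malloc allocations with precise faulting.
GLIBC_TUNABLES=glibc.mem.tagging=3 ./app

# 1 + 4 = 5: tagged malloc allocations with the faulting mode the
# system prefers.
GLIBC_TUNABLES=glibc.mem.tagging=5 ./app
@end example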
@deftp Tunable glibc.mem.decorate_maps
If the kernel supports naming anonymous virtual memory areas (since
Linux version 5.17, although not always enabled by some kernel
configurations), this tunable can be used to control whether
@theglibc{} decorates the underlying memory obtained from the operating
system with a string describing its usage (for instance, on the thread
stack created by @code{pthread_create} or memory allocated by
@code{malloc}).

The process mappings can be obtained by reading
@file{/proc/<pid>/maps} (with @code{pid} being either the
@dfn{process ID} or @code{self} for the process's own mappings).

This tunable takes a value of 0 or 1, where 1 enables the feature.
The default value is @samp{0}, which disables the decoration.
@end deftp
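One way to see the effect, sketched here for a hypothetical
long-running program @file{./app}, is to start it in the background
with the tunable set and then search its mappings for named anonymous
areas. Whether any lines match depends on the kernel configuration and
on what the program allocates.

@example
GLIBC_TUNABLES=glibc.mem.decorate_maps=1 ./app &
# $! expands to the process ID of the background job just started.
grep anon /proc/$!/maps
@end example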
@node gmon Tunables
@section gmon Tunables
@cindex gmon tunables

@deftp {Tunable namespace} glibc.gmon
This tunable namespace affects the behaviour of the gmon profiler.
gmon is a component of @theglibc{} which is normally used in
conjunction with gprof.

When GCC compiles a program with the @code{-pg} option, it instruments
the program with calls to the @code{mcount} function, to record the
program's call graph. At program startup, a memory buffer is allocated
to store this call graph; the size of the buffer is calculated using a
heuristic based on code size. If, during execution, the buffer is found
to be too small, profiling will be aborted and no @file{gmon.out} file
will be produced. In that case, you will see the following message
printed to standard error:

@example
mcount: call graph buffer size limit exceeded, gmon.out will not be generated
@end example

Most of the symbols discussed in this section are defined in the header
@file{sys/gmon.h}. However, some symbols (for example @code{mcount})
are not defined in any header file, since they are only intended to be
called from code generated by the compiler.
@end deftp
@deftp Tunable glibc.gmon.minarcs
The heuristic for sizing the call graph buffer is known to be
insufficient for small programs; hence, the calculated value is clamped
to be at least a minimum size. The default minimum (in units of
call graph entries, @code{struct tostruct}) is given by the macro
@code{MINARCS}. If you have a program with an unusually complex
call graph, for which the heuristic fails to allocate enough space,
you can use this tunable to increase the minimum to a larger value.
@end deftp
@deftp Tunable glibc.gmon.maxarcs
To prevent excessive memory consumption when profiling very large
programs, the call graph buffer is allowed to have a maximum of
@code{MAXARCS} entries. For some very large programs, the default
value of @code{MAXARCS} defined in @file{sys/gmon.h} is too small; in
that case, you can use this tunable to increase it.

Note that the value of the @code{maxarcs} tunable must be greater than
or equal to that of the @code{minarcs} tunable; if this constraint is
violated, a warning will be printed to standard error at program
startup, and the @code{minarcs} value will be used as the maximum as
well.

Setting either tunable too high may result in a call graph buffer
whose size exceeds the available memory; in that case, an out of memory
error will be printed at program startup, the profiler will be
disabled, and no @file{gmon.out} file will be generated.
@end deftp
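As a sketch of how the two limits might be raised together, the
commands below build a hypothetical program @file{app.c} with
profiling, run it with a larger call graph buffer, and then invoke
gprof. The numbers are purely illustrative; the only requirement taken
from the text above is that @code{maxarcs} is not smaller than
@code{minarcs}.

@example
gcc -pg -o app app.c
GLIBC_TUNABLES=glibc.gmon.minarcs=100000:glibc.gmon.maxarcs=50000000 ./app
gprof ./app gmon.out
@end example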