| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162 |
- .. SPDX-License-Identifier: GPL-2.0
- =====================================
- Using Propeller with the Linux kernel
- =====================================
- This enables Propeller build support for the kernel when using Clang
- compiler. Propeller is a profile-guided optimization (PGO) method used
- to optimize binary executables. Like AutoFDO, it utilizes hardware
- sampling to gather information about the frequency of execution of
- different code paths within a binary. Unlike AutoFDO, this information
- is then used right before linking phase to optimize (among others)
- block layout within and across functions.
- A few important notes about adopting Propeller optimization:
- #. Although it can be used as a standalone optimization step, it is
- strongly recommended to apply Propeller on top of AutoFDO,
- AutoFDO+ThinLTO or Instrument FDO. The rest of this document
- assumes this paradigm.
- #. Propeller uses another round of profiling on top of
- AutoFDO/AutoFDO+ThinLTO/iFDO. The whole build process involves
- "build-afdo - train-afdo - build-propeller - train-propeller -
- build-optimized".
- #. Propeller requires LLVM 19 release or later for Clang/Clang++
- and the linker(ld.lld).
- #. In addition to LLVM toolchain, Propeller requires a profiling
- conversion tool: https://github.com/google/autofdo with a release
- after v0.30.1: https://github.com/google/autofdo/releases/tag/v0.30.1.
- The Propeller optimization process involves the following steps:
- #. Initial building: Build the AutoFDO or AutoFDO+ThinLTO binary as
- you would normally do, but with a set of compile-time / link-time
- flags, so that a special metadata section is created within the
- kernel binary. The special section is only intend to be used by the
- profiling tool, it is not part of the runtime image, nor does it
- change kernel run time text sections.
- #. Profiling: The above kernel is then run with a representative
- workload to gather execution frequency data. This data is collected
- using hardware sampling, via perf. Propeller is most effective on
- platforms supporting advanced PMU features like LBR on Intel
- machines. This step is the same as profiling the kernel for AutoFDO
- (the exact perf parameters can be different).
- #. Propeller profile generation: Perf output file is converted to a
- pair of Propeller profiles via an offline tool.
- #. Optimized build: Build the AutoFDO or AutoFDO+ThinLTO optimized
- binary as you would normally do, but with a compile-time /
- link-time flag to pick up the Propeller compile time and link time
- profiles. This build step uses 3 profiles - the AutoFDO profile,
- the Propeller compile-time profile and the Propeller link-time
- profile.
- #. Deployment: The optimized kernel binary is deployed and used
- in production environments, providing improved performance
- and reduced latency.
- Preparation
- ===========
- Configure the kernel with::
- CONFIG_AUTOFDO_CLANG=y
- CONFIG_PROPELLER_CLANG=y
- Customization
- =============
- The default CONFIG_PROPELLER_CLANG setting covers kernel space objects
- for Propeller builds. One can, however, enable or disable Propeller build
- for individual files and directories by adding a line similar to the
- following to the respective kernel Makefile:
- - For enabling a single file (e.g. foo.o)::
- PROPELLER_PROFILE_foo.o := y
- - For enabling all files in one directory::
- PROPELLER_PROFILE := y
- - For disabling one file::
- PROPELLER_PROFILE_foo.o := n
- - For disabling all files in one directory::
- PROPELLER__PROFILE := n
- Workflow
- ========
- Here is an example workflow for building an AutoFDO+Propeller kernel:
- 1) Assuming an AutoFDO profile is already collected following
- instructions in the AutoFDO document, build the kernel on the host
- machine, with AutoFDO and Propeller build configs ::
- CONFIG_AUTOFDO_CLANG=y
- CONFIG_PROPELLER_CLANG=y
- and ::
- $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo-profile-name>
- 2) Install the kernel on the test machine.
- 3) Run the load tests. The '-c' option in perf specifies the sample
- event period. We suggest using a suitable prime number, like 500009,
- for this purpose.
- - For Intel platforms::
- $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
- - For AMD platforms::
- $ perf record --pfm-event RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
- Note you can repeat the above steps to collect multiple <perf_file>s.
- 4) (Optional) Download the raw perf file(s) to the host machine.
- 5) Use the create_llvm_prof tool (https://github.com/google/autofdo) to
- generate Propeller profile. ::
- $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file>
- --format=propeller --propeller_output_module_name
- --out=<propeller_profile_prefix>_cc_profile.txt
- --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
- "<propeller_profile_prefix>" can be something like "/home/user/dir/any_string".
- This command generates a pair of Propeller profiles:
- "<propeller_profile_prefix>_cc_profile.txt" and
- "<propeller_profile_prefix>_ld_profile.txt".
- If there are more than 1 perf_file collected in the previous step,
- you can create a temp list file "<perf_file_list>" with each line
- containing one perf file name and run::
- $ create_llvm_prof --binary=<vmlinux> --profile=@<perf_file_list>
- --format=propeller --propeller_output_module_name
- --out=<propeller_profile_prefix>_cc_profile.txt
- --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
- 6) Rebuild the kernel using the AutoFDO and Propeller
- profiles. ::
- CONFIG_AUTOFDO_CLANG=y
- CONFIG_PROPELLER_CLANG=y
- and ::
- $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>
|