|
- .. SPDX-License-Identifier: GPL-2.0
- ========================================
- Debugging advice for driver development
- ========================================
- This document serves as a general starting point and lookup for debugging
- device drivers.
- While this guide focuses on debugging that requires re-compiling the
- module/kernel, the :doc:`userspace debugging guide
- </process/debugging/userspace_debugging_guide>` walks you through dynamic
- debug, ftrace and other tools that are useful for debugging issues and
- behavior.
- For general debugging advice, see the :doc:`general advice document
- </process/debugging/index>`.
- .. contents::
- :depth: 3
- The following sections show you the available tools.
- printk() & friends
- ------------------
- These are derivatives of printf() that differ in where their output ends up
- and in whether they can be turned on or off dynamically.
- Simple printk()
- ~~~~~~~~~~~~~~~
- The classic approach. It can be used to great effect for quick and dirty
- development of new modules or to extract arbitrary data needed for
- troubleshooting.
- Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)
- **Pros**:
- - No need to learn anything, simple to use
- - Easy to tailor to your needs: you control the formatting of the data (see
- :doc:`/core-api/printk-formats`) and its visibility in the log
- - Can cause delays in the execution of the code (beneficial to confirm whether
- timing is a factor)
- **Cons**:
- - Requires rebuilding the kernel/module
- - Can cause delays in the execution of the code (which can cause issues to be
- not reproducible)
- For the full documentation see :doc:`/core-api/printk-basics`.
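- As a minimal sketch of the quick and dirty use described above, the module
- below prints one message when it is loaded and one when it is removed. The
- names ``my_driver_init`` and ``my_driver_exit`` and the logged value are
- hypothetical, and the ``pr_fmt()`` prefix is optional.

```c
/* Must be defined before the includes so that the pr_*() helpers use it. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>

static int __init my_driver_init(void)
{
	int ret = 0;	/* hypothetical status value worth logging */

	/* pr_info() is printk() with the KERN_INFO level and pr_fmt() applied. */
	pr_info("init reached, ret=%d\n", ret);
	return ret;
}

static void __exit my_driver_exit(void)
{
	pr_info("exit reached\n");
}

module_init(my_driver_init);
module_exit(my_driver_exit);
MODULE_LICENSE("GPL");
```

- The messages end up in the kernel log, which can be read with ``dmesg``, for
- example.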
- Trace_printk
- ~~~~~~~~~~~~
- Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``
- It is a tiny bit less comfortable to use than printk(), because you will have
- to read the messages from the trace file (See: :ref:`read_ftrace_log`)
- instead of from the kernel log, but it is very useful when printk() adds
- unwanted delays into the code execution, causing issues to be flaky or hidden.
- If the processing of this still causes timing issues then you can try
- trace_puts().
- For the full documentation see trace_printk().
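- A minimal sketch of how both calls might be used in a timing-sensitive path.
- The interrupt handler ``my_irq_handler`` and ``struct my_device`` are
- hypothetical and only provide something to log.

```c
#include <linux/ftrace.h>
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical per-device state, only here to have something to log. */
struct my_device {
	u32 status;
};

static irqreturn_t my_irq_handler(int irq, void *data)
{
	struct my_device *mdev = data;

	/* Written to the ftrace ring buffer instead of the kernel log. */
	trace_printk("irq=%d status=0x%08x\n", irq, mdev->status);

	/* Cheaper still: a constant string, no format processing. */
	trace_puts("irq handled\n");

	return IRQ_HANDLED;
}
```

- The output is then read from the trace file (typically
- ``/sys/kernel/tracing/trace``) rather than from the kernel log.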
- dev_dbg
- ~~~~~~~
- A print statement that contains additional information about the device used
- within the context and that can be targeted by
- :ref:`process/debugging/userspace_debugging_guide:dynamic debug`.
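- A minimal sketch of a dev_dbg() call in a probe function, assuming a
- hypothetical platform driver named ``my_driver``. The message is prefixed with
- the driver and device name, and it is only emitted when enabled (via dynamic
- debug, or by building with ``DEBUG`` defined).

```c
#include <linux/device.h>
#include <linux/module.h>
#include <linux/platform_device.h>

static int my_driver_probe(struct platform_device *pdev)
{
	/* Silent by default; carries the device name when enabled. */
	dev_dbg(&pdev->dev, "probing device, %u resources\n",
		pdev->num_resources);
	return 0;
}

static struct platform_driver my_driver = {
	.probe = my_driver_probe,
	.driver = {
		.name = "my_driver",	/* hypothetical driver name */
	},
};
module_platform_driver(my_driver);
MODULE_LICENSE("GPL");
```

- Assuming ``CONFIG_DYNAMIC_DEBUG`` is set and the module is built as
- ``my_driver``, writing ``module my_driver +p`` to
- ``/sys/kernel/debug/dynamic_debug/control`` enables the message at runtime.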
- **When is it appropriate to leave a debug print in the code?**
- Permanent debug statements have to be useful for a developer to troubleshoot
- driver misbehavior. Judging that is a bit more of an art than a science, but
- some guidelines are in the :ref:`Coding style guidelines
- <process/coding-style:13) printing kernel messages>`. In almost all cases the
- debug statements shouldn't be upstreamed, as a working driver is supposed to be
- silent.
- Custom printk
- ~~~~~~~~~~~~~
- Example::
-     #define core_dbg(fmt, arg...) do { \
-         if (core_debug) \
-             printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
-     } while (0)
- **When should you do this?**
- It is better to just use a pr_debug(), which can later be turned on/off with
- dynamic debug. Additionally, a lot of drivers activate these prints via a
- variable like ``core_debug`` set by a module parameter. However, module
- parameters `are not recommended anymore
- <https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.
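- As a sketch of the recommended alternative, the ``core_dbg()`` macro above
- could be replaced by a plain pr_debug() plus a ``pr_fmt()`` prefix. The caller
- ``core_handle_event`` is hypothetical and only shows the call site.

```c
/* Replaces the hand-written "core: " prefix of core_dbg(). */
#define pr_fmt(fmt) "core: " fmt

#include <linux/printk.h>

/* Hypothetical caller, only here to show the call site. */
static void core_handle_event(int event)
{
	/* No core_debug variable: dynamic debug decides whether this prints. */
	pr_debug("handling event %d\n", event);
}
```

- This removes both the custom macro and the module parameter, as the print is
- switched on and off through dynamic debug instead.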
- Ftrace
- ------
- Creating a custom Ftrace tracepoint
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- A tracepoint adds a hook into your code that will be called and logged when the
- tracepoint is enabled. This can be used, for example, to trace hitting a
- conditional branch or to dump the internal state at specific points of the code
- flow during a debugging session.
- Here is a basic description of :ref:`how to implement new tracepoints
- <trace/tracepoints:usage>`.
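- A minimal sketch of a tracepoint that dumps some internal state. It assumes
- that the header lives at ``include/trace/events/my_driver.h``; the event name
- ``my_driver_state`` and its two fields are hypothetical. Headers placed
- elsewhere additionally need ``TRACE_INCLUDE_PATH`` and ``TRACE_INCLUDE_FILE``.

```c
/* include/trace/events/my_driver.h (assumed location) */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM my_driver

#if !defined(_TRACE_MY_DRIVER_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_MY_DRIVER_H

#include <linux/tracepoint.h>

TRACE_EVENT(my_driver_state,
	/* Prototype and argument names of trace_my_driver_state(). */
	TP_PROTO(int state, u32 reg),
	TP_ARGS(state, reg),

	/* Layout of the record stored in the ring buffer. */
	TP_STRUCT__entry(
		__field(int, state)
		__field(u32, reg)
	),

	/* Copy the arguments into the record. */
	TP_fast_assign(
		__entry->state = state;
		__entry->reg = reg;
	),

	/* How the record is formatted when the trace is read. */
	TP_printk("state=%d reg=0x%08x", __entry->state, __entry->reg)
);

#endif /* _TRACE_MY_DRIVER_H */

/* This part must be outside of the include guard. */
#include <trace/define_trace.h>

/*
 * In exactly one .c file of the driver:
 *
 *	#define CREATE_TRACE_POINTS
 *	#include <trace/events/my_driver.h>
 *
 * and at the point of interest in the code flow:
 *
 *	trace_my_driver_state(state, reg);
 */
```

- The call costs very little while the event is disabled, so such a tracepoint
- can stay in the code flow between debugging sessions.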
- For the full event tracing documentation see :doc:`/trace/events`.
- For the full Ftrace documentation see :doc:`/trace/ftrace`.
- DebugFS
- -------
- Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``
- DebugFS differs from the other approaches of debugging, as it doesn't write
- messages to the kernel log nor add traces to the code. Instead it allows the
- developer to handle a set of files.
- These files can expose the values of variables or register/memory dumps for
- reading, or they can be made writable so that you can modify values/settings
- in the driver.
- Possible use-cases among others:
- - Store register values
- - Keep track of variables
- - Store errors
- - Store settings
- - Toggle a setting like debug on/off
- - Error injection
- This is especially useful when a data dump would be too large to digest as
- part of the general kernel log (for example when dumping raw bitstream data),
- or when you do not need all the values all the time but still want to be able
- to inspect them.
- The general idea is:
- - Create a directory during probe (``struct dentry *parent =
- debugfs_create_dir("my_driver", NULL);``)
- - Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)
- - In this example the file is found in
- ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
- user/group/all)
- - any read of the file will return the current contents of the variable
- ``my_variable``
- - Clean up the directory when removing the device
- (``debugfs_remove(parent);``)
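- The steps above can be sketched as a small module. For brevity this sketch
- uses the module init/exit functions instead of a probe/remove pair; the names
- ``my_driver``, ``my_value`` and ``my_variable`` follow the list above, while
- ``my_debugfs_dir`` is a hypothetical helper variable.

```c
#include <linux/debugfs.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>

static struct dentry *my_debugfs_dir;
static u32 my_variable;

static int __init my_driver_init(void)
{
	/* Creates /sys/kernel/debug/my_driver (NULL parent = debugfs root). */
	my_debugfs_dir = debugfs_create_dir("my_driver", NULL);

	/* Read-only file (octal mode 0444) backed directly by my_variable. */
	debugfs_create_u32("my_value", 0444, my_debugfs_dir, &my_variable);

	return 0;
}

static void __exit my_driver_exit(void)
{
	/* Removes the directory together with the files inside it. */
	debugfs_remove(my_debugfs_dir);
}

module_init(my_driver_init);
module_exit(my_driver_exit);
MODULE_LICENSE("GPL");
```

- The return value of debugfs_create_dir() is deliberately not checked here:
- the debugfs helpers accept an error pointer as parent, so a driver keeps
- working when debugfs is unavailable.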
- For the full documentation see :doc:`/filesystems/debugfs`.
- KASAN, UBSAN, lockdep and other error checkers
- ----------------------------------------------
- KASAN (Kernel Address Sanitizer)
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Prerequisite: ``CONFIG_KASAN``
- KASAN is a dynamic memory error detector that helps to find use-after-free and
- out-of-bounds bugs. It uses compile-time instrumentation to check every memory
- access.
- For the full documentation see :doc:`/dev-tools/kasan`.
- UBSAN (Undefined Behavior Sanitizer)
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Prerequisite: ``CONFIG_UBSAN``
- UBSAN relies on compiler instrumentation and runtime checks to detect undefined
- behavior. It is designed to find a variety of issues, including signed integer
- overflow, array index out of bounds, and more.
- For the full documentation see :doc:`/dev-tools/ubsan`.
- lockdep (Lock Dependency Validator)
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Prerequisite: ``CONFIG_PROVE_LOCKING`` (``CONFIG_DEBUG_LOCKDEP`` additionally
- makes the lockdep engine check its own consistency)
- lockdep is a runtime lock dependency validator that detects potential deadlocks
- and other locking-related issues in the kernel.
- It tracks lock acquisitions and releases, building a dependency graph that is
- analyzed for potential deadlocks.
- lockdep is especially useful for validating the correctness of lock ordering in
- the kernel.
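- Besides the automatic checks, a driver can state its locking rules explicitly.
- The sketch below uses lockdep_assert_held() to document and verify that a
- helper is only called with the lock taken; ``struct my_device`` and
- ``my_device_set_state`` are hypothetical.

```c
#include <linux/lockdep.h>
#include <linux/mutex.h>
#include <linux/types.h>

/* Hypothetical device state protected by a mutex. */
struct my_device {
	struct mutex lock;
	u32 state;
};

/* Must be called with mdev->lock held. */
static void my_device_set_state(struct my_device *mdev, u32 state)
{
	/* With lockdep enabled this warns if the caller forgot the lock. */
	lockdep_assert_held(&mdev->lock);

	mdev->state = state;
}
```

- Without lockdep enabled the assertion compiles away, so it costs nothing in
- production builds.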
- PSI (Pressure stall information tracking)
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Prerequisite: ``CONFIG_PSI``
- PSI is a measurement tool to identify excessive overcommits on hardware
- resources, which can cause performance disruptions or even OOM kills.
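- PSI is read from userspace through the files ``/proc/pressure/cpu``,
- ``/proc/pressure/memory`` and ``/proc/pressure/io``. As a sketch, the
- userspace helper below parses one line of such a file; ``struct psi_line`` and
- ``parse_psi_line`` are hypothetical names, not part of any kernel interface.

```c
#include <stdio.h>

/* One line of /proc/pressure/<resource>, e.g. "some avg10=0.00 ... total=0". */
struct psi_line {
	char kind[16];			/* "some" or "full" */
	double avg10, avg60, avg300;	/* stall percentage over 10s/60s/300s */
	unsigned long long total_us;	/* accumulated stall time in microseconds */
};

/* Parse one PSI line; returns 0 on success, -1 if the line is malformed. */
int parse_psi_line(const char *line, struct psi_line *out)
{
	int n = sscanf(line, "%15s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		       out->kind, &out->avg10, &out->avg60, &out->avg300,
		       &out->total_us);

	return n == 5 ? 0 : -1;
}
```

- A monitoring tool could read each line of ``/proc/pressure/memory`` with
- fgets(), pass it to this helper, and compare ``avg10`` against a threshold
- while a driver test is running.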
- device coredump
- ---------------
- Prerequisite: ``CONFIG_DEV_COREDUMP`` & ``#include <linux/devcoredump.h>``
- Provides the infrastructure for a driver to provide arbitrary data to userland.
- It is most often used in conjunction with udev or a similar userland
- application that listens for kernel uevents, which indicate that the dump is
- ready. Udev has rules to copy that file somewhere for long-term storage and
- analysis, because by default the dump data is automatically cleaned up after
- 5 minutes. That data is analyzed with driver-specific tools or GDB.
- A device coredump can be created with a vmalloc area, with read/free
- methods, or as a scatter/gather list.
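- A minimal sketch of the vmalloc variant, using dev_coredumpv(). The function
- ``my_driver_dump_state`` and its ``state``/``len`` arguments are hypothetical;
- a real driver would collect its firmware or register state here.

```c
#include <linux/devcoredump.h>
#include <linux/device.h>
#include <linux/gfp.h>
#include <linux/string.h>
#include <linux/vmalloc.h>

static void my_driver_dump_state(struct device *dev, const void *state,
				 size_t len)
{
	/* dev_coredumpv() expects a vmalloc()ed buffer. */
	void *dump = vmalloc(len);

	if (!dump)
		return;

	memcpy(dump, state, len);

	/*
	 * Ownership of the buffer passes to the devcoredump core, which
	 * frees it once the dump has been read or has timed out.
	 */
	dev_coredumpv(dev, dump, len, GFP_KERNEL);
}
```

- The dump then shows up under ``/sys/class/devcoredump/``, from where udev or a
- similar tool can copy it.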
- You can find an example implementation at:
- `drivers/media/platform/qcom/venus/core.c
- <https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__,
- in the Bluetooth HCI layer, in several wireless drivers, and in several
- DRM drivers.
- devcoredump interfaces
- ~~~~~~~~~~~~~~~~~~~~~~
- .. kernel-doc:: include/linux/devcoredump.h
- .. kernel-doc:: drivers/base/devcoredump.c
- **Copyright** ©2024 : Collabora
|