- xdrgen - Linux Kernel XDR code generator
- Introduction
- ------------
- SunRPC programs are typically specified using a language defined by
- RFC 4506. In fact, all IETF-published NFS specifications provide a
- description of the specified protocol using this language.
- Since the 1990s, user space consumers of SunRPC have had access to
- a tool that could read such XDR specifications and then generate C
- code that implements the RPC portions of that protocol. This tool is
- called rpcgen.
- This RPC-level code handles input directly from the network, and
- thus a high degree of memory safety and sanity checking is needed
- to help ensure proper levels of security. Bugs in this code can
- have a significant impact on security and performance.
- However, such code is repetitive and tedious to write by hand.
- The C code generated by rpcgen makes extensive use of the facilities
- of the user space TI-RPC library and libc. Furthermore, the dialect
- of the generated code is very traditional K&R C.
- The Linux kernel's implementations of SunRPC-based protocols hand-roll
- their XDR code. There are two main reasons for this:
- 1. libtirpc (and its predecessors) operates only in user space. The
- kernel's RPC implementation and its API are significantly
- different from libtirpc's.
- 2. rpcgen-generated code is believed to be less efficient than code
- that is hand-written.
- These days, gcc and its kin are capable of optimizing code better
- than human authors. There are only a few instances where writing
- XDR code by hand will make a measurable performance difference.
- In addition, the current hand-written code in the Linux kernel is
- difficult to audit, and it is hard to prove that it implements
- exactly what is in the protocol specification.
- In order to accrue the benefits of machine-generated XDR code in the
- kernel, a tool is needed that will output C code that works against
- the kernel's SunRPC implementation rather than libtirpc.
- Enter xdrgen.
- Dependencies
- ------------
- These dependencies are typically packaged by Linux distributions:
- - python3
- - python3-lark
- - python3-jinja2
- This dependency is available via PyPI:
- - pip install 'lark[interegular]'
- XDR Specifications
- ------------------
- When adding a new protocol implementation to the kernel, the XDR
- specification can be derived by feeding a .txt copy of the RFC to
- the script located in tools/net/sunrpc/extract.sh.
- $ extract.sh < rfc0001.txt > new2.x
- Operation
- ---------
- Once a .x file is available, use xdrgen to generate source and
- header files containing an implementation of XDR encoding and
- decoding functions for the specified protocol.
- $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h
- $ ./xdrgen declarations new2.x > new2xdr_gen.h
- and
- $ ./xdrgen source new2.x > new2xdr_gen.c
- The files are ready to use for a server-side protocol implementation,
- or may be used as a guide for implementing these routines by hand.
- By default, the only comments added to this code are kdoc comments
- that appear directly in front of the public per-procedure APIs. For
- deeper introspection, specifying the "--annotate" flag will insert
- additional comments in the generated code to help readers match the
- generated code to specific parts of the XDR specification.
- Because the generated code is targeted for the Linux kernel, it
- is tagged with a GPLv2-only license.
- The xdrgen tool can also provide lexical and syntax checking of
- an XDR specification:
- $ ./xdrgen lint xdr/new.x
- How It Works
- ------------
- xdrgen does not use machine learning to generate source code. The
- translation is entirely deterministic.
- RFC 4506 Section 6 contains a BNF grammar of the XDR specification
- language. The grammar has been adapted for use by the Python Lark
- module.
- The xdr.ebnf file in this directory contains the grammar used to
- parse XDR specifications. xdrgen configures Lark using the grammar
- in xdr.ebnf. Lark parses the target XDR specification using this
- grammar, creating a parse tree.
- xdrgen then transforms the parse tree into an abstract syntax tree.
- This tree is passed to a series of code generators.
- The generators are implemented as Python classes residing in the
- generators/ directory. Each generator emits code created from Jinja2
- templates stored in the templates/ directory.
- Code for each XDR item is generated in the same order in which the
- items appear in the specification, to ensure the generated code
- compiles. This conforms with the behavior of rpcgen.
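The parse, transform, and generate stages described above can be illustrated with a small sketch. This is not xdrgen's code: it assumes a toy grammar that handles only XDR enum definitions, and it substitutes the standard library's re and string.Template for Lark and Jinja2 so that it runs without extra packages. The names SPEC, parse_enums, and generate_definitions are hypothetical.

```python
import re
from string import Template

# Toy XDR input: two enum definitions, in specification order.
SPEC = """
enum color { RED = 0, GREEN = 1 };
enum shape { CIRCLE = 5 };
"""

# Stand-in for the Lark grammar: matches "enum NAME { ... };" only.
ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{([^}]*)\}\s*;")

# Stand-in for a Jinja2 template that emits a C enum definition.
ENUM_TEMPLATE = Template("enum $name {\n$members\n};\n")


def parse_enums(text):
    """Return a list of (name, [(member, value), ...]) in source order."""
    ast = []
    for match in ENUM_RE.finditer(text):
        name, body = match.group(1), match.group(2)
        members = []
        for item in body.split(","):
            ident, value = (part.strip() for part in item.split("="))
            members.append((ident, int(value)))
        ast.append((name, members))
    return ast


def generate_definitions(ast):
    """Render one C enum per AST node, preserving the order of the AST."""
    chunks = []
    for name, members in ast:
        lines = ",\n".join(f"\t{ident} = {value}" for ident, value in members)
        chunks.append(ENUM_TEMPLATE.substitute(name=name, members=lines))
    return "\n".join(chunks)
```

Calling generate_definitions(parse_enums(SPEC)) returns C text in which "enum color" precedes "enum shape", matching the order of the specification, and repeated calls return identical text.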
- xdrgen assumes that the generated source code is further compiled by
- a compiler that can optimize in a number of ways, including:
- - Unused functions are discarded (i.e., not added to the executable)
- - Aggressive function inlining removes unnecessary stack frames
- - Single-arm switch statements are replaced by a single conditional
- branch
- And so on.
- Pragmas
- -------
- Pragma directives specify exceptions to the normal generation of
- encoding and decoding functions. The directives described below
- are "big_endian", "exclude", "header", and "public".
- Pragma big_endian
- ------ ----------
- pragma big_endian <enum> ;
- For variables that might contain only a small number of values, it
- is more efficient to avoid the byte-swap when encoding or decoding
- on little-endian machines. Such is often the case with error status
- codes. For example:
- pragma big_endian nfsstat3;
- In this case, when generating an XDR struct or union containing a
- field of type "nfsstat3", xdrgen will make the type of that field
- "__be32" instead of "enum nfsstat3". XDR unions then switch on the
- non-byte-swapped value of that field.
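The idea behind big_endian can be sketched outside the kernel. The example below is an illustration only, assuming a 4-byte status field as it appears on the wire and the NFS3_OK and NFS3ERR_NOENT values from RFC 1813. The helper names are hypothetical, and Python's struct module stands in for the kernel's byte-order helpers. One path converts the field to host order before comparing. The other converts the constant once and compares the wire bytes unchanged.

```python
import struct

# Two nfsstat3 values from RFC 1813, used here for illustration.
NFS3_OK = 0
NFS3ERR_NOENT = 2


def is_noent_decoded(wire: bytes) -> bool:
    """Without the pragma: convert the field to host order, then compare."""
    (status,) = struct.unpack(">I", wire)
    return status == NFS3ERR_NOENT


# With the pragma: the constant is converted to wire order once, up front.
NFS3ERR_NOENT_BE = struct.pack(">I", NFS3ERR_NOENT)


def is_noent_raw(wire: bytes) -> bool:
    """Compare the big-endian wire bytes directly, with no per-field swap."""
    return wire == NFS3ERR_NOENT_BE
```

Both helpers agree for every 32-bit value. The second one moves the conversion from each decoded message to a single constant.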
- Pragma exclude
- ------ -------
- pragma exclude <RPC procedure> ;
- In some cases, a procedure encoder or decoder function might need
- special processing that cannot be automatically generated. The
- automatically-generated functions might conflict or interfere with
- the hand-rolled function. To avoid editing the generated source code
- by hand, a pragma can specify that the procedure's encoder and
- decoder functions are not included in the generated header and
- source.
- For example:
- pragma exclude NFSPROC3_READDIRPLUS;
- This excludes the decoder function for the READDIRPLUS argument and
- the encoder function for the READDIRPLUS result.
- Note that because data item encoder and decoder functions are
- defined "static __maybe_unused", subsequent compilation
- automatically excludes data item encoder and decoder functions that
- are used only by excluded procedures.
- Pragma header
- ------ ------
- pragma header <string> ;
- Provide a name to use for the header file. For example:
- pragma header nlm4;
- Adds
- #include "nlm4xdr_gen.h"
- to the generated source file.
- Pragma public
- ------ ------
- pragma public <XDR data item> ;
- Normally XDR encoder and decoder functions are "static". If an
- implementer wants to call these functions from other source code,
- they can add a public pragma to the input .x file. The named
- functions then get a prototype in the generated header, and their
- definitions are not declared static.
- For example:
- pragma public nfsstat3;
- Adds these prototypes in the generated header:
- bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr);
- bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value);
- And, in the generated source code, both of these functions appear
- without the "static __maybe_unused" modifiers.
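The effect of the public pragma on linkage can be modelled in a few lines. This sketch is hypothetical and is not taken from xdrgen's generators: it handles only enum types, it finds pragmas with a regular expression rather than the Lark grammar, and the exact placement of the "static __maybe_unused" modifiers in real generated code may differ. The names public_items and decoder_signature are illustrative.

```python
import re

# Matches lines of the form "pragma public <name>;" (assumed syntax subset).
PRAGMA_PUBLIC_RE = re.compile(r"^\s*pragma\s+public\s+(\w+)\s*;", re.MULTILINE)


def public_items(spec_text):
    """Collect the data item names marked public in a specification."""
    return set(PRAGMA_PUBLIC_RE.findall(spec_text))


def decoder_signature(enum_name, public):
    """Return the decoder signature, made static unless the item is public."""
    prototype = (
        f"bool xdrgen_decode_{enum_name}"
        f"(struct xdr_stream *xdr, enum {enum_name} *ptr)"
    )
    if enum_name in public:
        return prototype
    return "static __maybe_unused " + prototype
```

With a specification containing "pragma public nfsstat3;", decoder_signature("nfsstat3", ...) yields the non-static form, while an unlisted enum such as ftype3 keeps the static form.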
- Future Work
- -----------
- Finish implementing XDR pointer and list types.
- Generate client-side procedure functions
- Expand the README into a user guide similar to rpcgen(1)
- Add more pragma directives:
- * @pages -- use xdr_read/write_pages() for the specified opaque
- field
- * @skip -- do not decode, but rather skip, the specified argument
- field
- Enable something like a #include to dynamically insert the content
- of other specification files
- Build a unit test suite for verifying translation of XDR language
- into compilable code
- Add a command-line option to insert trace_printk call sites in the
- generated source code, for improved (temporary) observability
- Generate kernel Rust code as well as C code