Diffstat (limited to 'docs/PTHInternals.rst')
| -rw-r--r-- | docs/PTHInternals.rst | 163 |
1 files changed, 0 insertions, 163 deletions
diff --git a/docs/PTHInternals.rst b/docs/PTHInternals.rst
deleted file mode 100644
index 7401cf9b4d4b..000000000000
--- a/docs/PTHInternals.rst
+++ /dev/null
@@ -1,163 +0,0 @@
-==========================
-Pretokenized Headers (PTH)
-==========================
-
-This document first describes the low-level interface for using PTH and
-then briefly elaborates on its design and implementation. If you are
-interested in the end-user view, please see the :ref:`User's Manual
-<usersmanual-precompiled-headers>`.
-
-Using Pretokenized Headers with ``clang`` (Low-level Interface)
-===============================================================
-
-The Clang compiler frontend, ``clang -cc1``, supports three command line
-options for generating and using PTH files.
-
-To generate PTH files using ``clang -cc1``, use the option ``-emit-pth``:
-
-.. code-block:: console
-
-  $ clang -cc1 test.h -emit-pth -o test.h.pth
-
-This option is transparently used by ``clang`` when generating PTH
-files. Similarly, PTH files can be used as prefix headers using the
-``-include-pth`` option:
-
-.. code-block:: console
-
-  $ clang -cc1 -include-pth test.h.pth test.c -o test.s
-
-Alternatively, Clang's PTH files can be used as a raw "token-cache" (or
-"content" cache) of the source included by the original header file.
-This means that the contents of the PTH file are searched as substitutes
-for *any* source files that are used by ``clang -cc1`` to process a
-source file. This is done by specifying the ``-token-cache`` option:
-
-.. code-block:: console
-
-  $ cat test.h
-  #include <stdio.h>
-  $ clang -cc1 -emit-pth test.h -o test.h.pth
-  $ cat test.c
-  #include "test.h"
-  $ clang -cc1 test.c -o test -token-cache test.h.pth
-
-In this example the contents of ``stdio.h`` (and the files it includes)
-will be retrieved from ``test.h.pth``, as the PTH file is being used in
-this case as a raw cache of the contents of ``test.h``. This is a
-low-level interface used to both implement the high-level PTH interface
-as well as to provide alternative means to use PTH-style caching.
-
-PTH Design and Implementation
-=============================
-
-Unlike GCC's precompiled headers, which cache the full ASTs and
-preprocessor state of a header file, Clang's pretokenized header files
-mainly cache the raw lexer *tokens* that are needed to segment the
-stream of characters in a source file into keywords, identifiers, and
-operators. Consequently, PTH serves to mainly directly speed up the
-lexing and preprocessing of a source file, while parsing and
-type-checking must be completely redone every time a PTH file is used.
-
-Basic Design Tradeoffs
-----------------------
-
-In the long term there are plans to provide an alternate PCH
-implementation for Clang that also caches the work for parsing and type
-checking the contents of header files. The current implementation of PCH
-in Clang as pretokenized header files was motivated by the following
-factors:
-
-**Language independence**
-  PTH files work with any language that
-  Clang's lexer can handle, including C, Objective-C, and (in the early
-  stages) C++. This means development on language features at the
-  parsing level or above (which is basically almost all interesting
-  pieces) does not require PTH to be modified.
-
-**Simple design**
-  Relatively speaking, PTH has a simple design and
-  implementation, making it easy to test. Further, because the
-  machinery for PTH resides at the lower-levels of the Clang library
-  stack it is fairly straightforward to profile and optimize.
-
-Further, compared to GCC's PCH implementation (which is the dominate
-precompiled header file implementation that Clang can be directly
-compared against) the PTH design in Clang yields several attractive
-features:
-
-**Architecture independence**
-  In contrast to GCC's PCH files (and
-  those of several other compilers), Clang's PTH files are architecture
-  independent, requiring only a single PTH file when building a
-  program for multiple architectures.
-
-  For example, on Mac OS X one may wish to compile a "universal binary"
-  that runs on PowerPC, 32-bit Intel (i386), and 64-bit Intel
-  architectures. In contrast, GCC requires a PCH file for each
-  architecture, as the definitions of types in the AST are
-  architecture-specific. Since a Clang PTH file essentially represents
-  a lexical cache of header files, a single PTH file can be safely used
-  when compiling for multiple architectures. This can also reduce
-  compile times because only a single PTH file needs to be generated
-  during a build instead of several.
-
-**Reduced memory pressure**
-  Similar to GCC, Clang reads PTH files
-  via the use of memory mapping (i.e., ``mmap``). Clang, however,
-  memory maps PTH files as read-only, meaning that multiple invocations
-  of ``clang -cc1`` can share the same pages in memory from a
-  memory-mapped PTH file. In comparison, GCC also memory maps its PCH
-  files but also modifies those pages in memory, incurring the
-  copy-on-write costs. The read-only nature of PTH can greatly reduce
-  memory pressure for builds involving multiple cores, thus improving
-  overall scalability.
-
-**Fast generation**
-  PTH files can be generated in a small fraction
-  of the time needed to generate GCC's PCH files. Since PTH/PCH
-  generation is a serial operation that typically blocks progress
-  during a build, faster generation time leads to improved processor
-  utilization with parallel builds on multicore machines.
-
-Despite these strengths, PTH's simple design suffers some algorithmic
-handicaps compared to other PCH strategies such as those used by GCC.
-While PTH can greatly speed up the processing time of a header file, the
-amount of work required to process a header file is still roughly linear
-in the size of the header file. In contrast, the amount of work done by
-GCC to process a precompiled header is (theoretically) constant (the
-ASTs for the header are literally memory mapped into the compiler). This
-means that only the pieces of the header file that are referenced by the
-source file including the header are the only ones the compiler needs to
-process during actual compilation. While GCC's particular implementation
-of PCH mitigates some of these algorithmic strengths via the use of
-copy-on-write pages, the approach itself can fundamentally dominate at
-an algorithmic level, especially when one considers header files of
-arbitrary size.
-
-There is also a PCH implementation for Clang based on the lazy
-deserialization of ASTs. This approach theoretically has the same
-constant-time algorithmic advantages just mentioned but also retains some
-of the strengths of PTH such as reduced memory pressure (ideal for
-multi-core builds).
-
-Internal PTH Optimizations
---------------------------
-
-While the main optimization employed by PTH is to reduce lexing time of
-header files by caching pre-lexed tokens, PTH also employs several other
-optimizations to speed up the processing of header files:
-
-- ``stat`` caching: PTH files cache information obtained via calls to
-  ``stat`` that ``clang -cc1`` uses to resolve which files are included
-  by ``#include`` directives. This greatly reduces the overhead
-  involved in context-switching to the kernel to resolve included
-  files.
-
-- Fast skipping of ``#ifdef`` ... ``#endif`` chains: PTH files
-  record the basic structure of nested preprocessor blocks. When the
-  condition of the preprocessor block is false, all of its tokens are
-  immediately skipped instead of requiring them to be handled by
-  Clang's preprocessor.
-
-
