author    | Dimitry Andric <dim@FreeBSD.org> | 2016-07-23 20:41:05 +0000
committer | Dimitry Andric <dim@FreeBSD.org> | 2016-07-23 20:41:05 +0000
commit    | 01095a5d43bbfde13731688ddcf6048ebb8b7721 (patch)
tree      | 4def12e759965de927d963ac65840d663ef9d1ea /docs
parent    | f0f4822ed4b66e3579e92a89f368f8fb860e218e (diff)
Diffstat (limited to 'docs')
77 files changed, 5327 insertions, 2803 deletions
diff --git a/docs/AMDGPUUsage.rst b/docs/AMDGPUUsage.rst index 97d6662a2edb2..34a9b6011d40c 100644 --- a/docs/AMDGPUUsage.rst +++ b/docs/AMDGPUUsage.rst @@ -9,6 +9,29 @@ The AMDGPU back-end provides ISA code generation for AMD GPUs, starting with the R600 family up until the current Volcanic Islands (GCN Gen 3). +Conventions +=========== + +Address Spaces +-------------- + +The AMDGPU back-end uses the following address space mapping: + + ============= ============================================ + Address Space Memory Space + ============= ============================================ + 0 Private + 1 Global + 2 Constant + 3 Local + 4 Generic (Flat) + 5 Region + ============= ============================================ + +The terminology in the table, aside from the region memory space, is from the +OpenCL standard. + + Assembler ========= @@ -65,14 +88,14 @@ wait for. .. code-block:: nasm - // Wait for all counters to be 0 + ; Wait for all counters to be 0 s_waitcnt 0 - // Equivalent to s_waitcnt 0. Counter names can also be delimited by - // '&' or ','. + ; Equivalent to s_waitcnt 0. Counter names can also be delimited by + ; '&' or ','. s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) - // Wait for vmcnt counter to be 1. 
s_waitcnt vmcnt(1) VOP1, VOP2, VOP3, VOPC Instructions @@ -153,7 +176,10 @@ Here is an example of a minimal amd_kernel_code_t specification: .hsa_code_object_version 1,0 .hsa_code_object_isa - .text + .hsatext + .globl hello_world + .p2align 8 + .amdgpu_hsa_kernel hello_world hello_world: @@ -173,5 +199,7 @@ Here is an example of a minimal amd_kernel_code_t specification: s_waitcnt lgkmcnt(0) v_mov_b32 v1, s0 v_mov_b32 v2, s1 - flat_store_dword v0, v[1:2] + flat_store_dword v[1:2], v0 s_endpgm + .Lfunc_end0: + .size hello_world, .Lfunc_end0-hello_world diff --git a/docs/AdvancedBuilds.rst b/docs/AdvancedBuilds.rst new file mode 100644 index 0000000000000..dc808a0ab83f4 --- /dev/null +++ b/docs/AdvancedBuilds.rst @@ -0,0 +1,174 @@ +============================= +Advanced Build Configurations +============================= + +.. contents:: + :local: + +Introduction +============ + +`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake +does not build the project, it generates the files needed by your build tool +(GNU make, Visual Studio, etc.) for building LLVM. + +If **you are a new contributor**, please start with the :doc:`GettingStarted` or +:doc:`CMake` pages. This page is intended for users doing more complex builds. + +Many of the examples below are written assuming specific CMake Generators. +Unless otherwise explicitly called out these commands should work with any CMake +generator. + +Bootstrap Builds +================ + +The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a +high level a multi-stage build is a chain of builds that pass data from one +stage into the next. The most common and simple version of this is a traditional +bootstrap build. + +In a simple two-stage bootstrap build, we build clang using the system compiler, +then use that just-built clang to build clang again. In CMake this simplest form +of a bootstrap build can be configured with a single option, +CLANG_ENABLE_BOOTSTRAP. + +.. 
code-block:: console + + $ cmake -G Ninja -DCLANG_ENABLE_BOOTSTRAP=On <path to source> + $ ninja stage2 + +This command itself isn't terribly useful because it assumes default +configurations for each stage. The next series of examples utilize CMake cache +scripts to provide more complex options. + +The clang build system refers to builds as stages. A stage1 build is a standard +build using the compiler installed on the host, and a stage2 build is built +using the stage1 compiler. This nomenclature holds up to more stages too. In +general a stage*n* build is built using the output from stage*n-1*. + +Apple Clang Builds (A More Complex Bootstrap) +============================================= + +Apple's Clang builds are a slightly more complicated example of the simple +bootstrapping scenario. Apple Clang is built using a 2-stage build. + +The stage1 compiler is a host-only compiler with some options set. The stage1 +compiler is a balance of optimization vs build time because it is a throwaway. +The stage2 compiler is the fully optimized compiler intended to ship to users. + +Setting up these compilers requires a lot of options. To simplify the +configuration the Apple Clang build settings are contained in CMake Cache files. +You can build an Apple Clang compiler using the following commands: + +.. code-block:: console + + $ cmake -G Ninja -C <path to clang>/cmake/caches/Apple-stage1.cmake <path to source> + $ ninja stage2-distribution + +This CMake invocation configures the stage1 host compiler, and sets +CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the +stage2 configuration step. + +When you build the stage2-distribution target it builds the minimal stage1 +compiler and required tools, then configures and builds the stage2 compiler +based on the settings in Apple-stage2.cmake. 
+ +This pattern of using cache scripts to set complex settings, and specifically to +make later-stage builds include cache scripts, is common in our more advanced +build configurations. + +Multi-stage PGO +=============== + +Profile-Guided Optimization (PGO) is an effective way to optimize the code +clang generates. Our multi-stage PGO builds are a workflow for generating PGO +profiles that can be used to optimize clang. + +At a high level, the way PGO works is that you build an instrumented compiler, +then you run the instrumented compiler against sample source files. While the +instrumented compiler runs, it will output a set of files containing +performance counters (.profraw files). After generating all the profraw files, +you use llvm-profdata to merge the files into a single profdata file that you +can feed into the LLVM_PROFDATA_FILE option. + +Our PGO.cmake cache script automates that whole process. You can use it by +running: + +.. code-block:: console + + $ cmake -G Ninja -C <path_to_clang>/cmake/caches/PGO.cmake <source dir> + $ ninja stage2-instrumented-generate-profdata + +If you let that run for a few hours or so, it will place a profdata file in your +build directory. This takes a long time because it builds clang twice, +and you *must* have compiler-rt in your build tree. + +This process uses any source files under the perf-training directory as training +data as long as the source files are marked up with LIT-style RUN lines. + +After it finishes you can use ``find . -name clang.profdata`` to find it, but it +should be at a path something like: + +.. code-block:: console + + <build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata + +You can feed that file into the LLVM_PROFDATA_FILE option when you build your +optimized compiler. + +The PGO CMake cache has a slightly different stage naming scheme than other +multi-stage builds. It generates three stages: stage1, stage2-instrumented, and +stage2. 
Both of the stage2 builds are built using the stage1 compiler. + +The PGO CMake cache generates the following additional targets: + +**stage2-instrumented** + Builds a stage1 x86 compiler, runtime, and required tools (llvm-config, + llvm-profdata), then uses that compiler to build an instrumented stage2 compiler. + +**stage2-instrumented-generate-profdata** + Depends on "stage2-instrumented" and will use the instrumented compiler to + generate profdata based on the training files in <clang>/utils/perf-training. + +**stage2** + Depends on "stage2-instrumented-generate-profdata" and will use the stage1 + compiler with the stage2 profdata to build a PGO-optimized compiler. + +**stage2-check-llvm** + Depends on stage2 and runs check-llvm using the stage2 compiler. + +**stage2-check-clang** + Depends on stage2 and runs check-clang using the stage2 compiler. + +**stage2-check-all** + Depends on stage2 and runs check-all using the stage2 compiler. + +**stage2-test-suite** + Depends on stage2 and runs the test-suite using the stage2 compiler (requires + in-tree test-suite). + +3-Stage Non-Determinism +======================= + +In the ancient lore of compilers, non-determinism is like the multi-headed hydra. +Whenever its head pops up, terror and chaos ensue. + +Historically, one of the tests to verify that a compiler was deterministic was +a three-stage build. The idea of a three-stage build is that you take your sources +and build a compiler (stage1), then use that compiler to rebuild the sources +(stage2), then you use that compiler to rebuild the sources a third time +(stage3) with an identical configuration to the stage2 build. At the end of +this, you have a stage2 and stage3 compiler that should be bit-for-bit +identical. + +You can perform one of these 3-stage builds with LLVM & clang using the +following commands: + +.. 
code-block:: console + + $ cmake -G Ninja -C <path_to_clang>/cmake/caches/3-stage.cmake <source dir> + $ ninja stage3 + +After the build you can compare the stage2 & stage3 compilers. We have a bot +set up `here <http://lab.llvm.org:8011/builders/clang-3stage-ubuntu>`_ that runs +this build-and-compare configuration. diff --git a/docs/AliasAnalysis.rst b/docs/AliasAnalysis.rst index e055b4e1afbc3..097f7bf75cbcd 100644 --- a/docs/AliasAnalysis.rst +++ b/docs/AliasAnalysis.rst @@ -31,8 +31,7 @@ well together. This document contains information necessary to successfully implement this interface, use it, and to test both sides. It also explains some of the finer -points about what exactly results mean. If you feel that something is unclear -or should be added, please `let me know <mailto:sabre@nondot.org>`_. +points about what exactly results mean. ``AliasAnalysis`` Class Overview ================================ diff --git a/docs/Atomics.rst b/docs/Atomics.rst index 79ab74792dd47..4961348d0c97d 100644 --- a/docs/Atomics.rst +++ b/docs/Atomics.rst @@ -8,17 +8,13 @@ LLVM Atomic Instructions and Concurrency Guide Introduction ============ -Historically, LLVM has not had very strong support for concurrency; some minimal -intrinsics were provided, and ``volatile`` was used in some cases to achieve -rough semantics in the presence of concurrency. However, this is changing; -there are now new instructions which are well-defined in the presence of threads -and asynchronous signals, and the model for existing instructions has been -clarified in the IR. +LLVM supports instructions which are well-defined in the presence of threads and +asynchronous signals. The atomic instructions are designed specifically to provide readable IR and optimized code generation for the following: -* The new C++11 ``<atomic>`` header. (`C++11 draft available here +* The C++11 ``<atomic>`` header. (`C++11 draft available here <http://www.open-std.org/jtc1/sc22/wg21/>`_.) 
(`C11 draft available here <http://www.open-std.org/jtc1/sc22/wg14/>`_.) @@ -371,7 +367,7 @@ Predicates for optimizer writers to query: that they return true for any operation which is volatile or at least Monotonic. -* ``isAtLeastAcquire()``/``isAtLeastRelease()``: These are predicates on +* ``isStrongerThan`` / ``isAtLeastOrStrongerThan``: These are predicates on orderings. They can be useful for passes that are aware of atomics, for example to do DSE across a single atomic access, but not across a release-acquire pair (see MemoryDependencyAnalysis for an example of this) @@ -402,7 +398,7 @@ operations: MemoryDependencyAnalysis (which is also used by other passes like GVN). * Folding a load: Any atomic load from a constant global can be constant-folded, - because it cannot be observed. Similar reasoning allows scalarrepl with + because it cannot be observed. Similar reasoning allows sroa with atomic loads and stores. Atomics and Codegen @@ -417,19 +413,28 @@ The MachineMemOperand for all atomic operations is currently marked as volatile; this is not correct in the IR sense of volatile, but CodeGen handles anything marked volatile very conservatively. This should get fixed at some point. -Common architectures have some way of representing at least a pointer-sized -lock-free ``cmpxchg``; such an operation can be used to implement all the other -atomic operations which can be represented in IR up to that size. Backends are -expected to implement all those operations, but not operations which cannot be -implemented in a lock-free manner. It is expected that backends will give an -error when given an operation which cannot be implemented. (The LLVM code -generator is not very helpful here at the moment, but hopefully that will -change.) +One very important property of the atomic operations is that if your backend +supports any inline lock-free atomic operations of a given size, you should +support *ALL* operations of that size in a lock-free manner. 
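This all-or-nothing requirement is less burdensome than it sounds: given one lock-free compare-exchange of a size, every other read-modify-write operation of that size can be synthesized from it. A minimal C11 sketch of the idea (the function name ``cas_fetch_add`` is hypothetical, chosen for illustration):

```c
#include <stdatomic.h>

/* Sketch: any read-modify-write operation can be built from a lock-free
   compare-exchange loop; here, fetch-add on a 32-bit value. */
int cas_fetch_add(_Atomic int *p, int amount) {
    int old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, 'old' is refreshed with the current value of *p,
       so the loop retries with up-to-date data. */
    while (!atomic_compare_exchange_weak(p, &old, old + amount))
        ;
    return old; /* like atomic_fetch_add: returns the prior value */
}
```

This is the same shape of expansion a backend (or AtomicExpandPass) performs when only a compare-and-swap primitive is available.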
+ +When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do) +this is trivial: all the other operations can be implemented on top of those +primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there +are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is +invalid to implement ``atomic load`` using the native instruction, but +``cmpxchg`` using a library call to a function that uses a mutex, ``atomic +load`` must *also* expand to a library call on such architectures, so that it +can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same +mutex. + +AtomicExpandPass can help with that: it will expand all atomic operations to the +proper ``__atomic_*`` libcalls for any size above the maximum set by +``setMaxAtomicSizeInBitsSupported`` (which defaults to 0). On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent fences generate an ``MFENCE``, other fences do not cause any code to be -generated. cmpxchg uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` +generated. ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending on the users of the result, some ``atomicrmw`` operations can be translated into @@ -450,10 +455,151 @@ atomic constructs. 
Here are some lowerings it can do: ``emitStoreConditional()`` * large loads/stores -> ll-sc/cmpxchg by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()`` -* strong atomic accesses -> monotonic accesses + fences - by using ``setInsertFencesForAtomic()`` and overriding ``emitLeadingFence()`` - and ``emitTrailingFence()`` +* strong atomic accesses -> monotonic accesses + fences by overriding + ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and + ``emitTrailingFence()`` * atomic rmw -> loop with cmpxchg or load-linked/store-conditional by overriding ``expandAtomicRMWInIR()`` +* expansion to __atomic_* libcalls for unsupported sizes. For an example of all of these, look at the ARM backend. + +Libcalls: __atomic_* +==================== + +There are two kinds of atomic library calls that are generated by LLVM. Please +note that both sets of library functions somewhat confusingly share the names of +builtin functions defined by clang. Despite this, the library functions are +not directly related to the builtins: it is *not* the case that ``__atomic_*`` +builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower +to ``__sync_*`` library calls. + +The first set of library functions are named ``__atomic_*``. This set has been +"standardized" by GCC, and is described below. (See also `GCC's documentation +<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_) + +LLVM's AtomicExpandPass will translate atomic operations on data sizes above +``MaxAtomicSizeInBitsSupported`` into calls to these functions. 
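For sizes with no lock-free support, an implementation of these libcalls must fall back to locking. A rough sketch in the style of the generic load entry point (the name ``fallback_atomic_load`` is hypothetical; a spinlock is used here for brevity, where a real library would typically hash the address into a table of proper mutexes):

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical locking fallback, mirroring the shape of the generic
   __atomic_load libcall: copy 'size' bytes from *ptr to *ret under a
   lock shared with every other atomic operation on the same data. */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

void fallback_atomic_load(size_t size, void *ptr, void *ret, int ordering) {
    (void)ordering; /* the lock already provides sequential consistency */
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
    memcpy(ret, ptr, size);
    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
}
```

The shared-lock requirement is exactly why these routines must live in one shared library per process rather than being statically linked into each DSO.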
+ +There are four generic functions, which can be called with data of any size or +alignment:: + + void __atomic_load(size_t size, void *ptr, void *ret, int ordering) + void __atomic_store(size_t size, void *ptr, void *val, int ordering) + void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering) + bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order) + +There are also size-specialized versions of the above functions, which can only +be used with *naturally-aligned* pointers of the appropriate size. In the +signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate +integer type of that size; if no such integer type exists, the specialization +cannot be used:: + + iN __atomic_load_N(iN *ptr, iN val, int ordering) + void __atomic_store_N(iN *ptr, iN val, int ordering) + iN __atomic_exchange_N(iN *ptr, iN val, int ordering) + bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order) + +Finally, there are some read-modify-write functions, which are only available in +the size-specific variants (any other sizes use a ``__atomic_compare_exchange`` +loop):: + + iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering) + iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering) + iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering) + iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering) + iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering) + iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering) + +This set of library functions has some interesting implementation requirements +to take note of: + +- They support all sizes and alignments -- including those which cannot be + implemented natively on any existing hardware. Therefore, they will certainly + use mutexes for some sizes/alignments. 
+ +- As a consequence, they cannot be shipped in a statically linked + compiler-support library, as they have state which must be shared amongst all + DSOs loaded in the program. They must be provided in a shared library used by + all objects. + +- The set of atomic sizes supported lock-free must be a superset of the sizes + any compiler can emit. That is: if a new compiler introduces support for + inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a + lock-free implementation for size N. This is a requirement so that code + produced by an old compiler (which will have called the ``__atomic_*`` function) + interoperates with code produced by the new compiler (which will use the + native atomic instruction). + +Note that it's possible to write an entirely target-independent implementation +of these library functions by using the compiler atomic builtins themselves to +implement the operations on naturally-aligned pointers of supported sizes, and a +generic mutex implementation otherwise. + +Libcalls: __sync_* +================== + +Some targets or OS/target combinations can support lock-free atomics, but for +various reasons, it is not practical to emit the instructions inline. + +There are two typical examples of this. + +Some CPUs support multiple instruction sets which can be switched back and forth +on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which +has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly, +has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic +instructions are not encodable. However, those instructions are available via a +function call to a function with the longer encoding. + +Additionally, a few OS/target pairs provide kernel-supported lock-free +atomics. 
ARM/Linux is an example of this: the kernel `provides +<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a +function which on older CPUs contains a "magically-restartable" atomic sequence +(which looks atomic so long as there's only one CPU), and contains actual atomic +instructions on newer multicore models. This sort of functionality can typically +be provided on any architecture, if all CPUs which are missing atomic +compare-and-swap support are uniprocessor (no SMP). This is almost always the +case. The only common architecture without that property is SPARC -- SPARCV8 SMP +systems were common, yet it doesn't support any sort of compare-and-swap +operation. + +In either of these cases, the Target in LLVM can claim support for atomics of an +appropriate size, and then implement some subset of the operations via libcalls +to a ``__sync_*`` function. Such functions *must* not use locks in their +implementation, because unlike the ``__atomic_*`` routines used by +AtomicExpandPass, these may be mixed-and-matched with native instructions by the +target lowering. + +Further, these routines do not need to be shared, as they are stateless. So, +there is no issue with having multiple copies included in one binary. Thus, +typically these routines are implemented by the statically-linked compiler +runtime support library. + +LLVM will emit a call to an appropriate ``__sync_*`` routine if the target +ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``, +or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted-into the +availability of those library functions via a call to ``initSyncLibcalls()``. 
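The semantics these routines must provide match those of the ``__sync_*`` builtins exposed by clang and GCC (which, as noted earlier, are distinct from the library functions despite sharing names). A small illustration of that behavior; the wrapper ``sync_builtin_demo`` is only a demonstration device:

```c
#include <assert.h>

/* Exercise the __sync_* builtins of clang/GCC, which share semantics
   (though not implementation) with the library routines: each returns
   the value the location held before the update. */
static void sync_builtin_demo(void) {
    int x = 5;
    int old = __sync_fetch_and_add(&x, 3);       /* returns 5, x becomes 8 */
    assert(old == 5 && x == 8);

    int prev = __sync_val_compare_and_swap(&x, 8, 42);  /* succeeds */
    assert(prev == 8 && x == 42);

    prev = __sync_val_compare_and_swap(&x, 8, 7);       /* x != 8: no-op */
    assert(prev == 42 && x == 42);
}
```

On targets with inline lock-free atomics these builtins compile to the native instructions; on the targets discussed above they become calls to the corresponding ``__sync_*_N`` routine.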
+ +The full set of functions that may be called by LLVM is (for ``N`` being 1, 2, +4, 8, or 16):: + + iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired) + iN __sync_lock_test_and_set_N(iN *ptr, iN val) + iN __sync_fetch_and_add_N(iN *ptr, iN val) + iN __sync_fetch_and_sub_N(iN *ptr, iN val) + iN __sync_fetch_and_and_N(iN *ptr, iN val) + iN __sync_fetch_and_or_N(iN *ptr, iN val) + iN __sync_fetch_and_xor_N(iN *ptr, iN val) + iN __sync_fetch_and_nand_N(iN *ptr, iN val) + iN __sync_fetch_and_max_N(iN *ptr, iN val) + iN __sync_fetch_and_umax_N(iN *ptr, iN val) + iN __sync_fetch_and_min_N(iN *ptr, iN val) + iN __sync_fetch_and_umin_N(iN *ptr, iN val) + +This list doesn't include any function for atomic load or store; all known +architectures support atomic loads and stores directly (possibly by emitting a +fence on either side of a normal load or store.) + +There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to +``__sync_synchronize()``. This may happen or not happen independent of all the +above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``. diff --git a/docs/BitCodeFormat.rst b/docs/BitCodeFormat.rst index d6e3099bdb63d..ffa2176325275 100644 --- a/docs/BitCodeFormat.rst +++ b/docs/BitCodeFormat.rst @@ -467,10 +467,11 @@ Native Object File Wrapper Format ================================= Bitcode files for LLVM IR may also be wrapped in a native object file -(i.e. ELF, COFF, Mach-O). The bitcode must be stored in a section of the -object file named ``.llvmbc``. This wrapper format is useful for accommodating -LTO in compilation pipelines where intermediate objects must be native object -files which contain metadata in other sections. +(i.e. ELF, COFF, Mach-O). The bitcode must be stored in a section of the object +file named ``__LLVM,__bitcode`` for MachO and ``.llvmbc`` for the other object +formats. 
This wrapper format is useful for accommodating LTO in compilation +pipelines where intermediate objects must be native object files which contain +metadata in other sections. Not all tools support this format. @@ -689,6 +690,7 @@ global variable. The operand fields are: .. _linkage type: * *linkage*: An encoding of the linkage type for this variable: + * ``external``: code 0 * ``weak``: code 1 * ``appending``: code 2 @@ -713,20 +715,30 @@ global variable. The operand fields are: .. _visibility: * *visibility*: If present, an encoding of the visibility of this variable: + * ``default``: code 0 * ``hidden``: code 1 * ``protected``: code 2 +.. _bcthreadlocal: + * *threadlocal*: If present, an encoding of the thread local storage mode of the variable: + * ``not thread local``: code 0 * ``thread local; default TLS model``: code 1 * ``localdynamic``: code 2 * ``initialexec``: code 3 * ``localexec``: code 4 -* *unnamed_addr*: If present and non-zero, indicates that the variable has - ``unnamed_addr`` +.. _bcunnamedaddr: + +* *unnamed_addr*: If present, an encoding of the ``unnamed_addr`` attribute of this + variable: + + * not ``unnamed_addr``: code 0 + * ``unnamed_addr``: code 1 + * ``local_unnamed_addr``: code 2 .. _bcdllstorageclass: @@ -736,6 +748,8 @@ global variable. The operand fields are: * ``dllimport``: code 1 * ``dllexport``: code 2 +* *comdat*: An encoding of the COMDAT of this function + .. _FUNCTION: MODULE_CODE_FUNCTION Record @@ -756,6 +770,7 @@ function. The operand fields are: * ``anyregcc``: code 13 * ``preserve_mostcc``: code 14 * ``preserve_allcc``: code 15 + * ``swiftcc`` : code 16 * ``cxx_fast_tlscc``: code 17 * ``x86_stdcallcc``: code 64 * ``x86_fastcallcc``: code 65 @@ -782,8 +797,8 @@ function. The operand fields are: * *gc*: If present and nonzero, the 1-based garbage collector index in the table of `MODULE_CODE_GCNAME`_ entries. 
-* *unnamed_addr*: If present and non-zero, indicates that the function has - ``unnamed_addr`` +* *unnamed_addr*: If present, an encoding of the + :ref:`unnamed_addr<bcunnamedaddr>` attribute of this function * *prologuedata*: If non-zero, the value index of the prologue data for this function, plus 1. @@ -802,7 +817,7 @@ function. The operand fields are: MODULE_CODE_ALIAS Record ^^^^^^^^^^^^^^^^^^^^^^^^ -``[ALIAS, alias type, aliasee val#, linkage, visibility, dllstorageclass]`` +``[ALIAS, alias type, aliasee val#, linkage, visibility, dllstorageclass, threadlocal, unnamed_addr]`` The ``ALIAS`` record (code 9) marks the definition of an alias. The operand fields are @@ -818,6 +833,12 @@ fields are * *dllstorageclass*: If present, an encoding of the :ref:`dllstorageclass<bcdllstorageclass>` of the alias +* *threadlocal*: If present, an encoding of the + :ref:`thread local property<bcthreadlocal>` of the alias + +* *unnamed_addr*: If present, an encoding of the + :ref:`unnamed_addr<bcunnamedaddr>` attribute of this alias + MODULE_CODE_PURGEVALS Record ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/docs/BitSets.rst b/docs/BitSets.rst deleted file mode 100644 index 18dbf6df563f7..0000000000000 --- a/docs/BitSets.rst +++ /dev/null @@ -1,115 +0,0 @@ -======= -Bitsets -======= - -This is a mechanism that allows IR modules to co-operatively build pointer -sets corresponding to addresses within a given set of globals. One example -of a use case for this is to allow a C++ program to efficiently verify (at -each call site) that a vtable pointer is in the set of valid vtable pointers -for the type of the class or its derived classes. - -To use the mechanism, a client creates a global metadata node named -``llvm.bitsets``. Each element is a metadata node with three elements: - -1. a metadata object representing an identifier for the bitset -2. either a global variable or a function -3. 
a byte offset into the global (generally zero for functions) - -Each bitset must exclusively contain either global variables or functions. - -.. admonition:: Limitation - - The current implementation only supports functions as members of bitsets on - the x86-32 and x86-64 architectures. - -This will cause a link-time optimization pass to generate bitsets from the -memory addresses referenced from the elements of the bitset metadata. The -pass will lay out referenced global variables consecutively, so their -definitions must be available at LTO time. - -A bit set containing functions is transformed into a jump table, which -is a block of code consisting of one branch instruction for each of the -functions in the bit set that branches to the target function, and redirect -any taken function addresses to the corresponding jump table entry. In the -object file's symbol table, the jump table entries take the identities of -the original functions, so that addresses taken outside the module will pass -any verification done inside the module. - -Jump tables may call external functions, so their definitions need not -be available at LTO time. Note that if an externally defined function is a -member of a bitset, there is no guarantee that its identity within the module -will be the same as its identity outside of the module, as the former will -be the jump table entry if a jump table is necessary. - -The `GlobalLayoutBuilder`_ class is responsible for laying out the globals -efficiently to minimize the sizes of the underlying bitsets. An intrinsic, -:ref:`llvm.bitset.test <bitset.test>`, generates code to test whether a -given pointer is a member of a bitset. 
- -:Example: - -:: - - target datalayout = "e-p:32:32" - - @a = internal global i32 0 - @b = internal global i32 0 - @c = internal global i32 0 - @d = internal global [2 x i32] [i32 0, i32 0] - - define void @e() { - ret void - } - - define void @f() { - ret void - } - - declare void @g() - - !llvm.bitsets = !{!0, !1, !2, !3, !4, !5, !6} - - !0 = !{!"bitset1", i32* @a, i32 0} - !1 = !{!"bitset1", i32* @b, i32 0} - !2 = !{!"bitset2", i32* @b, i32 0} - !3 = !{!"bitset2", i32* @c, i32 0} - !4 = !{!"bitset2", i32* @d, i32 4} - !5 = !{!"bitset3", void ()* @e, i32 0} - !6 = !{!"bitset3", void ()* @g, i32 0} - - declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone - - define i1 @foo(i32* %p) { - %pi8 = bitcast i32* %p to i8* - %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset1") - ret i1 %x - } - - define i1 @bar(i32* %p) { - %pi8 = bitcast i32* %p to i8* - %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset2") - ret i1 %x - } - - define i1 @baz(void ()* %p) { - %pi8 = bitcast void ()* %p to i8* - %x = call i1 @llvm.bitset.test(i8* %pi8, metadata !"bitset3") - ret i1 %x - } - - define void @main() { - %a1 = call i1 @foo(i32* @a) ; returns 1 - %b1 = call i1 @foo(i32* @b) ; returns 1 - %c1 = call i1 @foo(i32* @c) ; returns 0 - %a2 = call i1 @bar(i32* @a) ; returns 0 - %b2 = call i1 @bar(i32* @b) ; returns 1 - %c2 = call i1 @bar(i32* @c) ; returns 1 - %d02 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 0)) ; returns 0 - %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 1)) ; returns 1 - %e = call i1 @baz(void ()* @e) ; returns 1 - %f = call i1 @baz(void ()* @f) ; returns 0 - %g = call i1 @baz(void ()* @g) ; returns 1 - ret void - } - -.. 
_GlobalLayoutBuilder: http://llvm.org/klaus/llvm/blob/master/include/llvm/Transforms/IPO/LowerBitSets.h diff --git a/docs/BuildingLLVMWithAutotools.rst b/docs/BuildingLLVMWithAutotools.rst deleted file mode 100644 index 083ead67ebb6d..0000000000000 --- a/docs/BuildingLLVMWithAutotools.rst +++ /dev/null @@ -1,338 +0,0 @@ -==================================== -Building LLVM With Autotools -==================================== - -.. contents:: - :local: - -.. warning:: - - Building LLVM with autoconf is deprecated as of 3.8. The autoconf build - system will be removed in 3.9. Please migrate to using CMake. For more - information see: `Building LLVM with CMake <CMake.html>`_ - -Overview -======== - -This document details how to use the LLVM autotools based build system to -configure and build LLVM from source. The normal developer process using CMake -is detailed `here <GettingStarted.html#check-here>`_. - -A Quick Summary ---------------- - -#. Configure and build LLVM and Clang: - - * ``cd where-you-want-to-build-llvm`` - * ``mkdir build`` (for building without polluting the source dir) - * ``cd build`` - * ``../llvm/configure [options]`` - Some common options: - - * ``--prefix=directory`` --- Specify for *directory* the full pathname of - where you want the LLVM tools and libraries to be installed (default - ``/usr/local``). - - * ``--enable-optimized`` --- Compile with optimizations enabled (default - is NO). - - * ``--enable-assertions`` --- Compile with assertion checks enabled - (default is YES). - - * ``make [-j]`` --- The ``-j`` specifies the number of jobs (commands) to run - simultaneously. This builds both LLVM and Clang for Debug+Asserts mode. - The ``--enable-optimized`` configure option is used to specify a Release - build. - - * ``make check-all`` --- This run the regression tests to ensure everything - is in working order. - - * If you get an "internal compiler error (ICE)" or test failures, see - `here <GettingStarted.html#check-here>`_. 
- -Local LLVM Configuration ------------------------- - -Once checked out from the Subversion repository, the LLVM suite source code must -be configured via the ``configure`` script. This script sets variables in the -various ``*.in`` files, most notably ``llvm/Makefile.config`` and -``llvm/include/Config/config.h``. It also populates *OBJ_ROOT* with the -Makefiles needed to begin building LLVM. - -The following environment variables are used by the ``configure`` script to -configure the build system: - -+------------+-----------------------------------------------------------+ -| Variable | Purpose | -+============+===========================================================+ -| CC | Tells ``configure`` which C compiler to use. By default, | -| | ``configure`` will check ``PATH`` for ``clang`` and GCC C | -| | compilers (in this order). Use this variable to override | -| | ``configure``\'s default behavior. | -+------------+-----------------------------------------------------------+ -| CXX | Tells ``configure`` which C++ compiler to use. By | -| | default, ``configure`` will check ``PATH`` for | -| | ``clang++`` and GCC C++ compilers (in this order). Use | -| | this variable to override ``configure``'s default | -| | behavior. | -+------------+-----------------------------------------------------------+ - -The following options can be used to set or enable LLVM specific options: - -``--enable-optimized`` - - Enables optimized compilation (debugging symbols are removed and GCC - optimization flags are enabled). Note that this is the default setting if you - are using the LLVM distribution. The default behavior of a Subversion - checkout is to use an unoptimized build (also known as a debug build). - -``--enable-debug-runtime`` - - Enables debug symbols in the runtime libraries. The default is to strip debug - symbols from the runtime libraries. - -``--enable-jit`` - - Compile the Just In Time (JIT) compiler functionality. This is not available - on all platforms. 
The default is dependent on platform, so it is best to - explicitly enable it if you want it. - -``--enable-targets=target-option`` - - Controls which targets will be built and linked into llc. The default value - for ``target-option`` is "all" which builds and links all available targets. - Specifying "host" selects the target that matches the build host. You can also - specify a comma separated list of target names that you want available in llc. - The target names use all lower case. The current set of targets is: - - ``aarch64, arm, arm64, cpp, hexagon, mips, mipsel, mips64, mips64el, msp430, - powerpc, nvptx, r600, sparc, systemz, x86, x86_64, xcore``. - -``--enable-doxygen`` - - Look for the doxygen program and enable construction of doxygen based - documentation from the source code. This is disabled by default because - generating the documentation can take a long time and produces hundreds of - megabytes of output. - -To configure LLVM, follow these steps: - -#. Change directory into the object root directory: - - .. code-block:: console - - % cd OBJ_ROOT - -#. Run the ``configure`` script located in the LLVM source tree: - - .. code-block:: console - - % $LLVM_SRC_DIR/configure --prefix=/install/path [other options] - -Compiling the LLVM Suite Source Code ------------------------------------- - -Once you have configured LLVM, you can build it. There are three types of -builds: - -Debug Builds - - These builds are the default when one is using a Subversion checkout and - types ``gmake`` (unless the ``--enable-optimized`` option was used during - configuration). The build system will compile the tools and libraries with - debugging information. To get a Debug Build using the LLVM distribution the - ``--disable-optimized`` option must be passed to ``configure``. - -Release (Optimized) Builds - - These builds are enabled with the ``--enable-optimized`` option to - ``configure`` or by specifying ``ENABLE_OPTIMIZED=1`` on the ``gmake`` command - line.
For these builds, the build system will compile the tools and libraries - with GCC optimizations enabled and strip debugging information from the - libraries and executables it generates. Note that Release Builds are the default - when using an LLVM distribution. - -Profile Builds - - These builds are for use with profiling. They compile profiling information - into the code for use with programs like ``gprof``. Profile builds must be - started by specifying ``ENABLE_PROFILING=1`` on the ``gmake`` command line. - -Once you have LLVM configured, you can build it by entering the *OBJ_ROOT* -directory and issuing the following command: - -.. code-block:: console - - % gmake - -If the build fails, please `check here <GettingStarted.html#check-here>`_ -to see if you are using a version of GCC that is known not to compile LLVM. - -If you have multiple processors in your machine, you may wish to use some of the -parallel build options provided by GNU Make. For example, you could use the -command: - -.. code-block:: console - - % gmake -j2 - -There are several special targets which are useful when working with the LLVM -source code: - -``gmake clean`` - - Removes all files generated by the build. This includes object files, - generated C/C++ files, libraries, and executables. - -``gmake dist-clean`` - - Removes everything that ``gmake clean`` does, but also removes files generated - by ``configure``. It attempts to return the source tree to the original state - in which it was shipped. - -``gmake install`` - - Installs LLVM header files, libraries, tools, and documentation in a hierarchy - under ``$PREFIX``, specified with ``$LLVM_SRC_DIR/configure --prefix=[dir]``, which - defaults to ``/usr/local``. - -``gmake -C runtime install-bytecode`` - - Assuming you built LLVM into $OBJDIR, when this command is run, it will - install bitcode libraries into the GCC front end's bitcode library directory.
- If you need to update your bitcode libraries, this is the target to use once - you've built them. - -Please see the `Makefile Guide <MakefileGuide.html>`_ for further details on -these ``make`` targets and descriptions of other targets available. - -It is also possible to override default values from ``configure`` by declaring -variables on the command line. The following are some examples: - -``gmake ENABLE_OPTIMIZED=1`` - - Perform a Release (Optimized) build. - -``gmake ENABLE_OPTIMIZED=1 DISABLE_ASSERTIONS=1`` - - Perform a Release (Optimized) build without assertions enabled. - -``gmake ENABLE_OPTIMIZED=0`` - - Perform a Debug build. - -``gmake ENABLE_PROFILING=1`` - - Perform a Profiling build. - -``gmake VERBOSE=1`` - - Print what ``gmake`` is doing on standard output. - -``gmake TOOL_VERBOSE=1`` - - Ask each tool invoked by the makefiles to print out what it is doing on - the standard output. This also implies ``VERBOSE=1``. - -Every directory in the LLVM object tree includes a ``Makefile`` to build it and -any subdirectories that it contains. Entering any directory inside the LLVM -object tree and typing ``gmake`` should rebuild anything in or below that -directory that is out of date. - -This does not apply to building the documentation. -LLVM's (non-Doxygen) documentation is produced with the -`Sphinx <http://sphinx-doc.org/>`_ documentation generation system. -There are some HTML documents that have not yet been converted to the new -system (which uses the easy-to-read and easy-to-write -`reStructuredText <http://sphinx-doc.org/rest.html>`_ plaintext markup -language). -The generated documentation is built in the ``$LLVM_SRC_DIR/docs`` directory using -a special makefile. -For instructions on how to install Sphinx, see -`Sphinx Introduction for LLVM Developers -<http://lld.llvm.org/sphinx_intro.html>`_. -After following the instructions there for installing Sphinx, build the LLVM -HTML documentation by doing the following: - -.. 
code-block:: console - - $ cd $LLVM_SRC_DIR/docs - $ make -f Makefile.sphinx - -This creates a ``_build/html`` sub-directory with all of the HTML files, not -just the generated ones. -This directory corresponds to ``llvm.org/docs``. -For example, ``_build/html/SphinxQuickstartTemplate.html`` corresponds to -``llvm.org/docs/SphinxQuickstartTemplate.html``. -The :doc:`SphinxQuickstartTemplate` is useful when creating a new document. - -Cross-Compiling LLVM -------------------- - -It is possible to cross-compile LLVM itself. That is, you can create LLVM -executables and libraries to be hosted on a platform different from the platform -where they are built (a Canadian Cross build). To configure a cross-compile, -supply the configure script with ``--build`` and ``--host`` options that are -different. The values of these options must be legal target triples that your -GCC compiler supports. - -The result of such a build is executables that are not runnable on the build -host (--build option) but can be executed on the host (--host option). - -Check :doc:`HowToCrossCompileLLVM` and `Clang docs on how to cross-compile in general -<http://clang.llvm.org/docs/CrossCompilation.html>`_ for more information -about cross-compiling. - -The Location of LLVM Object Files --------------------------------- - -The LLVM build system is capable of sharing a single LLVM source tree among -several LLVM builds. Hence, it is possible to build LLVM for several different -platforms or configurations using the same source tree. - -This is accomplished in the typical autoconf manner: - -* Change directory to where the LLVM object files should live: - - .. code-block:: console - - % cd OBJ_ROOT - -* Run the ``configure`` script found in the LLVM source directory: - - ..
code-block:: console - - % $LLVM_SRC_DIR/configure - -The LLVM build will place files underneath *OBJ_ROOT* in directories named after -the build type: - -Debug Builds with assertions enabled (the default) - - Tools - - ``OBJ_ROOT/Debug+Asserts/bin`` - - Libraries - - ``OBJ_ROOT/Debug+Asserts/lib`` - -Release Builds - - Tools - - ``OBJ_ROOT/Release/bin`` - - Libraries - - ``OBJ_ROOT/Release/lib`` - -Profile Builds - - Tools - - ``OBJ_ROOT/Profile/bin`` - - Libraries - - ``OBJ_ROOT/Profile/lib`` diff --git a/docs/CMake.rst b/docs/CMake.rst index 4e5feae99931a..5d57bc98596b3 100644 --- a/docs/CMake.rst +++ b/docs/CMake.rst @@ -12,12 +12,20 @@ Introduction does not build the project, it generates the files needed by your build tool (GNU make, Visual Studio, etc.) for building LLVM. +If **you are a new contributor**, please start with the :doc:`GettingStarted` +page. This page is geared for existing contributors moving from the +legacy configure/make system. + If you are really anxious about getting a functional LLVM build, go to the `Quick start`_ section. If you are a CMake novice, start with `Basic CMake usage`_ and then go back to the `Quick start`_ section once you know what you are doing. The `Options and variables`_ section is a reference for customizing your build. If you already have experience with CMake, this is the recommended starting point. +This page is geared towards users of the LLVM CMake build. If you're looking for +information about modifying the LLVM CMake build system you may want to see the +:doc:`CMakePrimer` page. It has a basic overview of the CMake language. + .. _Quick start: Quick start @@ -26,10 +34,7 @@ Quick start We use here the command-line, non-interactive CMake interface. #. `Download <http://www.cmake.org/cmake/resources/software.html>`_ and install - CMake. 
Version 2.8.8 is the minimum required, but if you're using the Ninja - backend, CMake v3.2 or newer is required to `get interactive output - <http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20141117/244797.html>`_ - when running :doc:`Lit <CommandGuide/lit>`. + CMake. Version 3.4.3 is the minimum required. #. Open a shell. Your development tools must be reachable from this shell through the PATH environment variable. @@ -259,6 +264,9 @@ LLVM-specific variables link against LLVM libraries and make use of C++ exceptions in your own code that need to propagate through LLVM code. Defaults to OFF. +**LLVM_ENABLE_EXPENSIVE_CHECKS**:BOOL + Enable additional time/memory expensive checking. Defaults to OFF. + **LLVM_ENABLE_PIC**:BOOL Add the ``-fPIC`` flag to the compiler command-line, if the compiler supports this flag. Some systems, like Windows, do not need this flag. Defaults to ON. @@ -328,6 +336,14 @@ LLVM-specific variables will not be used. If the variable for an external project does not point to a valid path, then that project will not be built. +**LLVM_EXTERNAL_PROJECTS**:STRING + Semicolon-separated list of additional external projects to build as part of + llvm. For each project LLVM_EXTERNAL_<NAME>_SOURCE_DIR has to be specified + with the path for the source code of the project. Example: + ``-DLLVM_EXTERNAL_PROJECTS="Foo;Bar" + -DLLVM_EXTERNAL_FOO_SOURCE_DIR=/src/foo + -DLLVM_EXTERNAL_BAR_SOURCE_DIR=/src/bar``. + **LLVM_USE_OPROFILE**:BOOL Enable building OProfile JIT support. Defaults to OFF. @@ -347,6 +363,11 @@ LLVM-specific variables are ``Address``, ``Memory``, ``MemoryWithOrigins``, ``Undefined``, ``Thread``, and ``Address;Undefined``. Defaults to empty string. +**LLVM_ENABLE_LTO**:STRING + Add ``-flto`` or ``-flto=`` flags to the compile and link command + lines, enabling link-time optimization. Possible values are ``Off``, + ``On``, ``Thin`` and ``Full``. Defaults to OFF.
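
As an illustrative sketch (the project name ``Foo``, the source paths, and the generator choice are placeholders taken from the examples above, not required values), the external-project and LTO options might be combined on a single configure line:

.. code-block:: console

   % cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
       -DLLVM_ENABLE_LTO=Thin \
       -DLLVM_EXTERNAL_PROJECTS="Foo" \
       -DLLVM_EXTERNAL_FOO_SOURCE_DIR=/src/foo \
       ../llvm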
+ +**LLVM_PARALLEL_COMPILE_JOBS**:STRING + Define the maximum number of concurrent compilation jobs. @@ -354,10 +375,12 @@ LLVM-specific variables Define the maximum number of concurrent link jobs. **LLVM_BUILD_DOCS**:BOOL - Enables all enabled documentation targets (i.e. Doxgyen and Sphinx targets) to - be built as part of the normal build. If the ``install`` target is run then - this also enables all built documentation targets to be installed. Defaults to - OFF. + Adds all *enabled* documentation targets (i.e. Doxygen and Sphinx targets) as + dependencies of the default build targets. This results in all of the (enabled) + documentation targets being built as part of a normal build. If the ``install`` + target is run then this also enables all built documentation targets to be + installed. Defaults to OFF. To enable a particular documentation target, see + LLVM_ENABLE_SPHINX and LLVM_ENABLE_DOXYGEN. **LLVM_ENABLE_DOXYGEN**:BOOL Enables the generation of browsable HTML documentation using doxygen. @@ -409,7 +432,7 @@ LLVM-specific variables Defaults to OFF. **LLVM_ENABLE_SPHINX**:BOOL - If enabled CMake will search for the ``sphinx-build`` executable and will make + If specified, CMake will search for the ``sphinx-build`` executable and will make the ``SPHINX_OUTPUT_HTML`` and ``SPHINX_OUTPUT_MAN`` CMake options available. Defaults to OFF. @@ -463,6 +486,47 @@ LLVM-specific variables If you want to build LLVM as a shared library, you should use the ``LLVM_BUILD_LLVM_DYLIB`` option. +**LLVM_OPTIMIZED_TABLEGEN**:BOOL + If enabled and building a debug or asserts build the CMake build system will + generate a Release build tree to build a fully optimized tablegen for use + during the build. Enabling this option can significantly speed up build times + especially when building LLVM in Debug configurations. + +CMake Caches +============ + +Recently LLVM and Clang have been adding some more complicated build system +features.
Utilizing these new features often involves a complicated chain of +CMake variables passed on the command line. Clang provides a collection of CMake +cache scripts to make these features more approachable. + +CMake cache files are utilized using CMake's -C flag: + +.. code-block:: console + + $ cmake -C <path to cache file> <path to sources> + +CMake cache scripts are processed in an isolated scope; only cached variables +remain set when the main configuration runs. CMake cached variables do not reset +variables that are already set unless the FORCE option is specified. + +A few notes about CMake Caches: + +- Order of command line arguments is important + + - -D arguments specified before -C are set before the cache is processed and + can be read inside the cache file + - -D arguments specified after -C are set after the cache is processed and + are unset inside the cache file + +- All -D arguments will override cache file settings +- CMAKE_TOOLCHAIN_FILE is evaluated after both the cache file and the command + line arguments +- It is recommended that all -D options be specified *before* -C + +For more information about some of the advanced build configurations supported +via Cache files see :doc:`AdvancedBuilds`. + Executing the test suite ======================== @@ -502,7 +566,7 @@ and uses them to build a simple application ``simple-tool``. .. code-block:: cmake - cmake_minimum_required(VERSION 2.8.8) + cmake_minimum_required(VERSION 3.4.3) project(SimpleProject) find_package(LLVM REQUIRED CONFIG) @@ -532,16 +596,16 @@ The ``find_package(...)`` directive when used in CONFIG mode (as in the above example) will look for the ``LLVMConfig.cmake`` file in various locations (see cmake manual for details). It creates a ``LLVM_DIR`` cache entry to save the directory where ``LLVMConfig.cmake`` is found or allows the user to specify the -directory (e.g.
by passing ``-DLLVM_DIR=/usr/lib/cmake/llvm`` to the ``cmake`` command or by setting it directly in ``ccmake`` or ``cmake-gui``). This file is available in two different locations. -* ``<INSTALL_PREFIX>/share/llvm/cmake/LLVMConfig.cmake`` where +* ``<INSTALL_PREFIX>/lib/cmake/llvm/LLVMConfig.cmake`` where ``<INSTALL_PREFIX>`` is the install prefix of an installed version of LLVM. - On Linux typically this is ``/usr/share/llvm/cmake/LLVMConfig.cmake``. + On Linux typically this is ``/usr/lib/cmake/llvm/LLVMConfig.cmake``. -* ``<LLVM_BUILD_ROOT>/share/llvm/cmake/LLVMConfig.cmake`` where +* ``<LLVM_BUILD_ROOT>/lib/cmake/llvm/LLVMConfig.cmake`` where ``<LLVM_BUILD_ROOT>`` is the root of the LLVM build tree. **Note: this is only available when building LLVM with CMake.** diff --git a/docs/CMakePrimer.rst b/docs/CMakePrimer.rst new file mode 100644 index 0000000000000..034779022142a --- /dev/null +++ b/docs/CMakePrimer.rst @@ -0,0 +1,465 @@ +============ +CMake Primer +============ + +.. contents:: + :local: + +.. warning:: + Disclaimer: This documentation is written by LLVM project contributors, *not* + anyone affiliated with the CMake project. This document may contain + inaccurate terminology, phrasing, or technical details. It is provided with + the best intentions. + + +Introduction +============ + +The LLVM project and many of the core projects built on LLVM build using CMake. +This document aims to provide a brief overview of CMake for developers modifying +LLVM projects or building their own projects on top of LLVM. + +The official CMake language reference is available in the cmake-language +manpage and `cmake-language online documentation +<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_. + +10,000 ft View +============== + +CMake is a tool that reads script files in its own language that describe how a +software project builds. As CMake evaluates the scripts it constructs an +internal representation of the software project.
Once the scripts have been +fully processed, if there are no errors, CMake will generate build files to +actually build the project. CMake supports generating build files for a variety +of command line build tools as well as for popular IDEs. + +When a user runs CMake it performs a variety of checks similar to how autoconf +worked historically. During the checks and the evaluation of the build +description scripts CMake caches values into the CMakeCache. This is useful +because it allows the build system to skip long-running checks during +incremental development. CMake caching also has some drawbacks, but that will be +discussed later. + +Scripting Overview +================== + +CMake's scripting language has a very simple grammar. Every language construct +is a command that matches the pattern _name_(_args_). Commands come in three +primary types: language-defined (commands implemented in C++ in CMake), defined +functions, and defined macros. The CMake distribution also contains a suite of +CMake modules that contain definitions for useful functionality. + +The example below is the full CMake build for building a C++ "Hello World" +program. The example uses only CMake language-defined functions. + +.. code-block:: cmake + + cmake_minimum_required(VERSION 3.2) + project(HelloWorld) + add_executable(HelloWorld HelloWorld.cpp) + +The CMake language provides control flow constructs in the form of foreach loops +and if blocks. To make the example above more complicated you could add an if +block to define "APPLE" when targeting Apple platforms: + +.. code-block:: cmake + + cmake_minimum_required(VERSION 3.2) + project(HelloWorld) + add_executable(HelloWorld HelloWorld.cpp) + if(APPLE) + target_compile_definitions(HelloWorld PUBLIC APPLE) + endif() + +Variables, Types, and Scope +=========================== + +Dereferencing +------------- + +In CMake variables are "stringly" typed. All variables are represented as +strings throughout evaluation. 
Wrapping a variable in ``${}`` dereferences it +and results in a literal substitution of the name for the value. CMake refers to +this as "variable evaluation" in their documentation. Dereferences are performed +*before* the command being called receives the arguments. This means +dereferencing a list results in multiple separate arguments being passed to the +command. + +Variable dereferences can be nested and be used to model complex data. For +example: + +.. code-block:: cmake + + set(var_name var1) + set(${var_name} foo) # same as "set(var1 foo)" + set(${${var_name}}_var bar) # same as "set(foo_var bar)" + +Dereferencing an unset variable results in an empty expansion. It is a common +pattern in CMake to conditionally set a variable, knowing that it will be used +in code paths where the variable may not be set. There are examples of this +throughout the LLVM CMake build system. + +An example of variable empty expansion is: + +.. code-block:: cmake + + if(APPLE) + set(extra_sources Apple.cpp) + endif() + add_executable(HelloWorld HelloWorld.cpp ${extra_sources}) + +In this example the ``extra_sources`` variable is only defined if you're +targeting an Apple platform. For all other targets the ``extra_sources`` will be +evaluated as empty before add_executable is given its arguments. + +One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly +dereference values. This has some unexpected results. For example: + +.. code-block:: cmake + + if("${SOME_VAR}" STREQUAL "MSVC") + +In this code sample MSVC will be implicitly dereferenced, which will result in +the if command comparing the value of the dereferenced variables ``SOME_VAR`` +and ``MSVC``. A common workaround to this problem is to prepend strings being +compared with an ``x``. + +.. code-block:: cmake + + if("x${SOME_VAR}" STREQUAL "xMSVC") + +This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This +pattern is uncommon, but it does occur in LLVM's CMake scripts. + ..
note:: + + Once the LLVM project upgrades its minimum CMake version to 3.1 or later we + can prevent this behavior by setting CMP0054 to new. For more information on + CMake policies please see the cmake-policies manpage or the `cmake-policies + online documentation + <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_. + +Lists +----- + +In CMake lists are semi-colon delimited strings, and it is strongly advised that +you avoid using semi-colons in lists; it doesn't go smoothly. A few examples of +defining lists: + +.. code-block:: cmake + + # Creates a list with members a, b, c, and d + set(my_list a b c d) + set(my_list "a;b;c;d") + + # Creates a string "a b c d" + set(my_string "a b c d") + +Lists of Lists +-------------- + +One of the more complicated patterns in CMake is lists of lists. Because a list +cannot contain an element with a semi-colon, to construct a list of lists you +make a list of variable names that refer to other lists. For example: + +.. code-block:: cmake + + set(list_of_lists a b c) + set(a 1 2 3) + set(b 4 5 6) + set(c 7 8 9) + +With this layout you can iterate through the list of lists printing each value +with the following code: + +.. code-block:: cmake + + foreach(list_name IN LISTS list_of_lists) + foreach(value IN LISTS ${list_name}) + message(${value}) + endforeach() + endforeach() + +You'll notice that the inner foreach loop's list is doubly dereferenced. This is +because the first dereference turns ``list_name`` into the name of the sub-list +(a, b, or c in the example), then the second dereference is to get the value of +the list. + +This pattern is used throughout CMake; the most common example is the compiler +flags options, which CMake refers to using the following variable expansions: +CMAKE_${LANGUAGE}_FLAGS and CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}. + +Other Types +----------- + +Variables that are cached or specified on the command line can have types +associated with them.
The variable's type is used by CMake's UI tool to display +the right input field. The variable's type generally doesn't impact evaluation. +One of the few examples is PATH variables, which CMake does have some special +handling for. You can read more about the special handling in `CMake's set +documentation +<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_. + +Scope +----- + +CMake inherently has directory-based scoping. Setting a variable in a +CMakeLists file will set the variable for that file, and all subdirectories. +Variables set in a CMake module that is included in a CMakeLists file will be +set in the scope they are included from, and all subdirectories. + +When a variable that is already set is set again in a subdirectory it overrides +the value in that scope and any deeper subdirectories. + +The CMake set command provides two scope-related options. PARENT_SCOPE sets a +variable into the parent scope, and not the current scope. The CACHE option sets +the variable in the CMakeCache, which results in it being set in all scopes. The +CACHE option will not set a variable that already exists in the CACHE unless the +FORCE option is specified. + +In addition to directory-based scope, CMake functions also have their own scope. +This means variables set inside functions do not bleed into the parent scope. +This is not true of macros, and it is for this reason LLVM prefers functions +over macros whenever reasonable. + +.. note:: + Unlike C-based languages, CMake's loop and control flow blocks do not have + their own scopes. + +Control Flow +============ + +CMake features the same basic control flow constructs you would expect in any +scripting language, but there are a few quirks because, as with everything in +CMake, control flow constructs are commands. + +If, ElseIf, Else +---------------- + +.. note:: + For the full documentation on the CMake if command go
That resource is + far more complete. + +In general CMake if blocks work the way you'd expect: + +.. code-block:: cmake + + if(<condition>) + .. do stuff + elseif(<condition>) + .. do other stuff + else() + .. do other other stuff + endif() + +The single most important thing to know about CMake's if blocks coming from a C +background is that they do not have their own scope. Variables set inside +conditional blocks persist after the ``endif()``. + +Loops +----- + +The most common form of the CMake ``foreach`` block is: + +.. code-block:: cmake + + foreach(var ...) + .. do stuff + endforeach() + +The variable argument portion of the ``foreach`` block can contain dereferenced +lists, values to iterate, or a mix of both: + +.. code-block:: cmake + + foreach(var foo bar baz) + message(${var}) + endforeach() + # prints: + # foo + # bar + # baz + + set(my_list 1 2 3) + foreach(var ${my_list}) + message(${var}) + endforeach() + # prints: + # 1 + # 2 + # 3 + + foreach(var ${my_list} out_of_bounds) + message(${var}) + endforeach() + # prints: + # 1 + # 2 + # 3 + # out_of_bounds + +There is also a more modern CMake foreach syntax. The code below is equivalent +to the code above: + +.. code-block:: cmake + + foreach(var IN ITEMS foo bar baz) + message(${var}) + endforeach() + # prints: + # foo + # bar + # baz + + set(my_list 1 2 3) + foreach(var IN LISTS my_list) + message(${var}) + endforeach() + # prints: + # 1 + # 2 + # 3 + + foreach(var IN LISTS my_list ITEMS out_of_bounds) + message(${var}) + endforeach() + # prints: + # 1 + # 2 + # 3 + # out_of_bounds + +Similar to the conditional statements, these generally behave how you would +expect, and they do not have their own scope. + +CMake also supports ``while`` loops, although they are not widely used in LLVM. + +Modules, Functions and Macros +============================= + +Modules +------- + +Modules are CMake's vehicle for enabling code reuse. CMake modules are just +CMake script files. 
They can contain code to execute on include as well as +definitions for commands. + +In CMake macros and functions are universally referred to as commands, and they +are the primary method of defining code that can be called multiple times. + +In LLVM we have several CMake modules that are included as part of our +distribution for developers who don't build our project from source. Those +modules are the fundamental pieces needed to build LLVM-based projects with +CMake. We also rely on modules as a way of organizing the build system's +functionality for maintainability and re-use within LLVM projects. + +Argument Handling +----------------- + +When defining a CMake command, handling arguments is very useful. The examples +in this section will all use the CMake ``function`` block, but this all applies +to the ``macro`` block as well. + +CMake commands can have named arguments, but all commands are implicitly +variadic. If the command has named arguments they are required and must +be specified at every call site. Below is a trivial example of providing a +wrapper function for CMake's built in function ``add_dependencies``. + +.. code-block:: cmake + + function(add_deps target) + add_dependencies(${target} ${ARGN}) + endfunction() + +This example defines a new function named ``add_deps`` which takes a required first +argument, and just calls another function passing through the first argument and +all trailing arguments. CMake makes all of a command's arguments available in a +list named ``ARGV``, the unnamed trailing arguments in a list named ``ARGN``, and +the argument count in ``ARGC``. + +CMake provides a module ``CMakeParseArguments`` which provides an implementation +of advanced argument parsing. We use this all over LLVM, and it is recommended +for any function that has complex argument-based behaviors or optional +arguments.
CMake's official documentation for the module is in the +``cmake-modules`` manpage, and is also available at the +`cmake-modules online documentation +<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_. + +.. note:: + As of CMake 3.5 the cmake_parse_arguments command has become a native command + and the CMakeParseArguments module is empty and only left around for + compatibility. + +Functions Vs Macros +------------------- + +Functions and Macros look very similar in how they are used, but there is one +fundamental difference between the two. Functions have their own scope, and +macros don't. This means variables set in macros will bleed out into the calling +scope. That makes macros suitable for defining very small bits of functionality +only. + +The other difference between CMake functions and macros is how arguments are +passed. Arguments to macros are not set as variables; instead, dereferences to +the parameters are resolved across the macro before executing it. This can +result in some unexpected behavior if using non-dereferenced variables. For example: + +.. code-block:: cmake + + macro(print_list my_list) + foreach(var IN LISTS my_list) + message("${var}") + endforeach() + endmacro() + + set(my_list a b c d) + set(my_list_of_numbers 1 2 3 4) + print_list(my_list_of_numbers) + # prints: + # a + # b + # c + # d + +Generally speaking this issue is uncommon because it requires using +non-dereferenced variables with names that overlap in the parent scope, but it +is important to be aware of because it can lead to subtle bugs. + +LLVM Project Wrappers +===================== + +LLVM projects provide lots of wrappers around critical CMake built-in commands. +We use these wrappers to provide consistent behaviors across LLVM components +and to reduce code duplication. + +We generally (but not always) follow the convention that commands prefaced with +``llvm_`` are intended to be used only as building blocks for other commands.
+Wrapper commands that are intended for direct use are generally named with the
+project in the middle of the command name (e.g. ``add_llvm_executable``
+is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
+all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
+distribution. It can be included and used by any LLVM sub-project that requires
+LLVM.
+
+.. note::
+
+  Not all LLVM projects require LLVM for all use cases. For example compiler-rt
+  can be built without LLVM, and the compiler-rt sanitizer libraries are used
+  with GCC.
+
+Useful Built-in Commands
+========================
+
+CMake has many useful built-in commands. This document isn't going to
+go into detail about them because the CMake project has excellent
+documentation. To highlight a few useful commands see:
+
+* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
+* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
+* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
+* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
+* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
+* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_
+
+The full documentation for CMake commands is in the ``cmake-commands`` manpage
+and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
diff --git a/docs/CodeGenerator.rst b/docs/CodeGenerator.rst index f3b949c7ad157..6a54343dfba62 100644 --- a/docs/CodeGenerator.rst +++ b/docs/CodeGenerator.rst @@ -45,7 +45,7 @@ components: ``include/llvm/CodeGen/``. At this level, concepts like "constant pool entries" and "jump tables" are explicitly exposed. -3. Classes and algorithms used to represent code as the object file level, the +3. Classes and algorithms used to represent code at the object file level, the `MC Layer`_.
These classes represent assembly level constructs like labels, sections, and instructions. At this level, concepts like "constant pool entries" and "jump tables" don't exist. @@ -386,32 +386,27 @@ functions make it easy to build arbitrary machine instructions. Usage of the .. code-block:: c++ // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
- // instruction. The '1' specifies how many operands will be added.
- MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42);
-
- // Create the same instr, but insert it at the end of a basic block.
+ // instruction and insert it at the end of the given MachineBasicBlock.
+ const TargetInstrInfo &TII = ...
MachineBasicBlock &MBB = ...
- BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42);
+ DebugLoc DL;
+ MachineInstr *MI = BuildMI(MBB, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);

// Create the same instr, but insert it before a specified iterator point.
MachineBasicBlock::iterator MBBI = ...
- BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42);
+ BuildMI(MBB, MBBI, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);

// Create a 'cmp Reg, 0' instruction, no destination reg.
- MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0);
+ MI = BuildMI(MBB, DL, TII.get(X86::CMP32ri8)).addReg(Reg).addImm(0);

// Create an 'sahf' instruction which takes no operands and stores nothing.
- MI = BuildMI(X86::SAHF, 0);
+ MI = BuildMI(MBB, DL, TII.get(X86::SAHF));

// Create a self looping branch instruction.
- BuildMI(MBB, X86::JNE, 1).addMBB(&MBB);
+ BuildMI(MBB, DL, TII.get(X86::JNE)).addMBB(&MBB);

-The key thing to remember with the ``BuildMI`` functions is that you have to
-specify the number of operands that the machine instruction will take. This
-allows for efficient memory allocation. You also need to specify if operands
-default to be uses of values, not definitions.
If you need to add a definition
-operand (other than the optional destination register), you must explicitly mark
-it as such:
+If you need to add a definition operand (other than the optional destination
+register), you must explicitly mark it as such:

.. code-block:: c++

@@ -632,7 +627,7 @@ directives through MCStreamer. On the implementation side of MCStreamer, there are two major implementations: one for writing out a .s file (MCAsmStreamer), and one for writing out a .o -file (MCObjectStreamer). MCAsmStreamer is a straight-forward implementation +file (MCObjectStreamer). MCAsmStreamer is a straightforward implementation that prints out a directive for each method (e.g. ``EmitValue -> .byte``), but MCObjectStreamer implements a full assembler. @@ -1771,13 +1766,11 @@ table that summarizes what features are supported by each target. Target Feature Matrix --------------------- -Note that this table does not include the C backend or Cpp backends, since they
-do not use the target independent code generator infrastructure. It also
-doesn't list features that are not supported fully by any target yet. It
-considers a feature to be supported if at least one subtarget supports it. A
-feature being supported means that it is useful and works for most cases, it
-does not indicate that there are zero known bugs in the implementation. Here is
-the key:
+Note that this table does not list features that are not supported fully by any
+target yet. It considers a feature to be supported if at least one subtarget
+supports it. A feature being supported means that it is useful and works for
+most cases; it does not indicate that there are zero known bugs in the
+implementation. Here is the key:

:raw-html:`<table border="1" cellspacing="0">` :raw-html:`<tr>` @@ -2197,9 +2190,9 @@ prefix byte on an instruction causes the instruction's memory access to go to the specified segment.
LLVM address space 0 is the default address space, which includes the stack, and any unqualified memory accesses in a program. Address spaces 1-255 are currently reserved for user-defined code. The GS-segment is -represented by address space 256, while the FS-segment is represented by address -space 257. Other x86 segments have yet to be allocated address space -numbers. +represented by address space 256, the FS-segment is represented by address space +257, and the SS-segment is represented by address space 258. Other x86 segments +have yet to be allocated address space numbers. While these address spaces may seem similar to TLS via the ``thread_local`` keyword, and often use the same underlying hardware, there are some fundamental @@ -2645,3 +2638,59 @@ of a program is limited to 4K instructions: this ensures fast termination and a limited number of kernel function calls. Prior to running an eBPF program, a verifier performs static analysis to prevent loops in the code and to ensure valid register usage and operand types.
+
+The AMDGPU backend
+------------------
+
+The AMDGPU code generator lives in the ``lib/Target/AMDGPU`` directory, and is
+an open source native AMD GCN ISA code generator.
+
+Target triples supported
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following are the known target triples that are supported by the AMDGPU
+backend.
+
+* **amdgcn--** --- AMD GCN GPUs (AMDGPU.7.0.0+)
+* **amdgcn--amdhsa** --- AMD GCN GPUs (AMDGPU.7.0.0+) with HSA support
+* **r600--** --- AMD GPUs HD2XXX-HD6XXX
+
+Relocations
+^^^^^^^^^^^
+
+Supported relocatable fields are:
+
+* **word32** --- This specifies a 32-bit field occupying 4 bytes with arbitrary
+  byte alignment. These values use the same byte order as other word values in
+  the AMD GPU architecture.
+* **word64** --- This specifies a 64-bit field occupying 8 bytes with arbitrary
+  byte alignment.
These values use the same byte order as other word values in
+  the AMD GPU architecture.
+
+The following notations are used for specifying relocation calculations:
+
+* **A** --- Represents the addend used to compute the value of the relocatable
+  field.
+* **G** --- Represents the offset into the global offset table at which the
+  relocation entry’s symbol will reside during execution.
+* **GOT** --- Represents the address of the global offset table.
+* **P** --- Represents the place (section offset or address) of the storage unit
+  being relocated (computed using ``r_offset``).
+* **S** --- Represents the value of the symbol whose index resides in the
+  relocation entry.
+
+The AMDGPU backend generates *Elf64_Rela* relocation records with the following
+supported relocation types:
+
+  ===================== ===== ========== ====================
+  Relocation type       Value Field      Calculation
+  ===================== ===== ========== ====================
+  ``R_AMDGPU_NONE``     0     ``none``   ``none``
+  ``R_AMDGPU_ABS32_LO`` 1     ``word32`` (S + A) & 0xFFFFFFFF
+  ``R_AMDGPU_ABS32_HI`` 2     ``word32`` (S + A) >> 32
+  ``R_AMDGPU_ABS64``    3     ``word64`` S + A
+  ``R_AMDGPU_REL32``    4     ``word32`` S + A - P
+  ``R_AMDGPU_REL64``    5     ``word64`` S + A - P
+  ``R_AMDGPU_ABS32``    6     ``word32`` S + A
+  ``R_AMDGPU_GOTPCREL`` 7     ``word32`` G + GOT + A - P
+  ===================== ===== ========== ====================
diff --git a/docs/CodeOfConduct.rst b/docs/CodeOfConduct.rst new file mode 100644 index 0000000000000..aa366f3514e5e --- /dev/null +++ b/docs/CodeOfConduct.rst @@ -0,0 +1,112 @@ +============================== +LLVM Community Code of Conduct +============================== + +.. note:: + + This document is currently a **DRAFT** document while it is being discussed + by the community. + +The LLVM community has always worked to be a welcoming and respectful +community, and we want to ensure that doesn't change as we grow and evolve.
To +that end, we have a few ground rules that we ask people to adhere to: + +* `be friendly and patient`_, +* `be welcoming`_, +* `be considerate`_, +* `be respectful`_, +* `be careful in the words that you choose and be kind to others`_, and +* `when we disagree, try to understand why`_. + +This isn't an exhaustive list of things that you can't do. Rather, take it in +the spirit in which it's intended - a guide to make it easier to communicate +and participate in the community. + +This code of conduct applies to all spaces managed by the LLVM project or The +LLVM Foundation. This includes IRC channels, mailing lists, bug trackers, LLVM +events such as the developer meetings and socials, and any other forums created +by the project that the community uses for communication. It applies to all of +your communication and conduct in these spaces, including emails, chats, things +you say, slides, videos, posters, signs, or even t-shirts you display in these +spaces. In addition, violations of this code outside these spaces may, in rare +cases, affect a person's ability to participate within them, when the conduct +amounts to an egregious violation of this code. + +If you believe someone is violating the code of conduct, we ask that you report +it by emailing conduct@llvm.org. For more details please see our +:doc:`Reporting Guide <ReportingGuide>`. + +.. _be friendly and patient: + +* **Be friendly and patient.** + +.. _be welcoming: + +* **Be welcoming.** We strive to be a community that welcomes and supports + people of all backgrounds and identities. This includes, but is not limited + to members of any race, ethnicity, culture, national origin, colour, + immigration status, social and economic class, educational level, sex, sexual + orientation, gender identity and expression, age, size, family status, + political belief, religion or lack thereof, and mental and physical ability. + +.. 
_be considerate: + +* **Be considerate.** Your work will be used by other people, and you in turn + will depend on the work of others. Any decision you take will affect users + and colleagues, and you should take those consequences into account. Remember + that we're a world-wide community, so you might not be communicating in + someone else's primary language. + +.. _be respectful: + +* **Be respectful.** Not all of us will agree all the time, but disagreement is + no excuse for poor behavior and poor manners. We might all experience some + frustration now and then, but we cannot allow that frustration to turn into + a personal attack. It's important to remember that a community where people + feel uncomfortable or threatened is not a productive one. Members of the LLVM + community should be respectful when dealing with other members as well as + with people outside the LLVM community. + +.. _be careful in the words that you choose and be kind to others: + +* **Be careful in the words that you choose and be kind to others.** Do not + insult or put down other participants. Harassment and other exclusionary + behavior aren't acceptable. This includes, but is not limited to: + + * Violent threats or language directed against another person. + * Discriminatory jokes and language. + * Posting sexually explicit or violent material. + * Posting (or threatening to post) other people's personally identifying + information ("doxing"). + * Personal insults, especially those using racist or sexist terms. + * Unwelcome sexual attention. + * Advocating for, or encouraging, any of the above behavior. + + In general, if someone asks you to stop, then stop. Persisting in such + behavior after being asked to stop is considered harassment. + +.. _when we disagree, try to understand why: + +* **When we disagree, try to understand why.** Disagreements, both social and + technical, happen all the time and LLVM is no exception. 
It is important that
+  we resolve disagreements and differing views constructively. Remember that
+  we're different. The strength of LLVM comes from its varied community, people
+  from a wide range of backgrounds. Different people have different
+  perspectives on issues. Being unable to understand why someone holds
+  a viewpoint doesn't mean that they're wrong. Don't forget that it is human to
+  err and blaming each other doesn't get us anywhere. Instead, focus on helping
+  to resolve issues and learning from mistakes.
+
+Questions?
+==========
+
+If you have questions, please feel free to contact the LLVM Foundation Code of
+Conduct Advisory Committee by emailing conduct@llvm.org.
+
+
+(This text is based on the `Django Project`_ Code of Conduct, which is in turn
+based on wording from the `Speak Up! project`_.)
+
+.. _Django Project: https://www.djangoproject.com/conduct/
+.. _Speak Up! project: http://speakup.io/coc.html
+
diff --git a/docs/CommandGuide/FileCheck.rst b/docs/CommandGuide/FileCheck.rst index 03c8829767760..a0ca1bfe52f92 100644 --- a/docs/CommandGuide/FileCheck.rst +++ b/docs/CommandGuide/FileCheck.rst @@ -38,10 +38,27 @@ OPTIONS prefixes to match. Multiple prefixes are useful for tests which might change for different run options, but most lines remain the same.
+
+.. option:: --check-prefixes prefix1,prefix2,...
+
+  An alias of :option:`--check-prefix` that allows multiple prefixes to be
+  specified as a comma-separated list.
+
.. option:: --input-file filename

File to check (defaults to stdin).

+.. option:: --match-full-lines
+
+  By default, FileCheck allows matches anywhere on a line. This
+  option will require all positive matches to cover an entire
+  line. Leading and trailing whitespace is ignored, unless
+  :option:`--strict-whitespace` is also specified. (Note: negative
+  matches from ``CHECK-NOT`` are not affected by this option!)
+ + Passing this option is equivalent to inserting ``{{^ *}}`` or + ``{{^}}`` before, and ``{{ *$}}`` or ``{{$}}`` after every positive + check pattern. + .. option:: --strict-whitespace By default, FileCheck canonicalizes input horizontal whitespace (spaces and @@ -444,3 +461,22 @@ relative line number references, for example: // CHECK-NEXT: {{^ ;}} int a +Matching Newline Characters +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To match newline characters in regular expressions the character class +``[[:space:]]`` can be used. For example, the following pattern: + +.. code-block:: c++ + + // CHECK: DW_AT_location [DW_FORM_sec_offset] ([[DLOC:0x[0-9a-f]+]]){{[[:space:]].*}}"intd" + +matches output of the form (from llvm-dwarfdump): + +.. code-block:: llvm + + DW_AT_location [DW_FORM_sec_offset] (0x00000233) + DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000c9] = "intd") + +letting us set the :program:`FileCheck` variable ``DLOC`` to the desired value +``0x00000233``, extracted from the line immediately preceding "``intd``". diff --git a/docs/CommandGuide/bugpoint.rst b/docs/CommandGuide/bugpoint.rst index f11585d359c65..8c2a0d124981b 100644 --- a/docs/CommandGuide/bugpoint.rst +++ b/docs/CommandGuide/bugpoint.rst @@ -15,7 +15,7 @@ can be used to debug three types of failures: optimizer crashes, miscompilations by optimizers, or bad native code generation (including problems in the static and JIT compilers). It aims to reduce large test cases to small, useful ones. For more information on the design and inner workings of **bugpoint**, as well as -advice for using bugpoint, see *llvm/docs/Bugpoint.html* in the LLVM +advice for using bugpoint, see :doc:`/Bugpoint` in the LLVM distribution. OPTIONS @@ -151,7 +151,12 @@ OPTIONS **--compile-command** *command* This option defines the command to use with the **--compile-custom** - option to compile the bitcode testcase. This can be useful for + option to compile the bitcode testcase. 
The command should exit with a + failure exit code if the file is "interesting" and should exit with a + success exit code (i.e. 0) otherwise (this is the same as if it crashed on + "interesting" inputs). + + This can be useful for testing compiler output without running any link or execute stages. To generate a reduced unit test, you may add CHECK directives to the testcase and pass the name of an executable compile-command script in this form: @@ -171,6 +176,14 @@ OPTIONS **--safe-{int,jit,llc,custom}** option. +**--verbose-errors**\ =\ *{true,false}* + + The default behavior of bugpoint is to print "<crash>" when it finds a reduced + test that crashes compilation. This flag prints the output of the crashing + program to stderr. This is useful to make sure it is the same error being + tracked down and not a different error that happens to crash the compiler as + well. Defaults to false. + EXIT STATUS ----------- diff --git a/docs/CommandGuide/lit.rst b/docs/CommandGuide/lit.rst index 0ec14bb2236ea..b2da58ec02c13 100644 --- a/docs/CommandGuide/lit.rst +++ b/docs/CommandGuide/lit.rst @@ -355,6 +355,35 @@ be used to define subdirectories of optional tests, or to change other configuration parameters --- for example, to change the test format, or the suffixes which identify test files. +PRE-DEFINED SUBSTITUTIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:program:`lit` provides various patterns that can be used with the RUN command. +These are defined in TestRunner.py. 
+
+  ========== ==============
+  Macro      Substitution
+  ========== ==============
+  %s         source path (path to the file currently being run)
+  %S         source dir (directory of the file currently being run)
+  %p         same as %S
+  %{pathsep} path separator
+  %t         temporary file name unique to the test
+  %T         temporary directory unique to the test
+  %%         %
+  %/s        same as %s but with all \\ replaced by /
+  %/S        same as %S but with all \\ replaced by /
+  %/p        same as %p but with all \\ replaced by /
+  %/t        same as %t but with all \\ replaced by /
+  %/T        same as %T but with all \\ replaced by /
+  ========== ==============
+
+Further substitution patterns might be defined by each test module.
+See the :ref:`local-configuration-files` section.
+
+More information on the testing infrastructure can be found in the
+:doc:`../TestingGuide`.
+
TEST RUN OUTPUT FORMAT ~~~~~~~~~~~~~~~~~~~~~~ diff --git a/docs/CommandGuide/llvm-cov.rst b/docs/CommandGuide/llvm-cov.rst index d0e78a9a1d11f..946b125a4529f 100644 --- a/docs/CommandGuide/llvm-cov.rst +++ b/docs/CommandGuide/llvm-cov.rst @@ -236,6 +236,26 @@ OPTIONS Show code coverage only for functions that match the given regular expression.
+
+.. option:: -format=<FORMAT>
+
+  Use the specified output format. The supported formats are: "text", "html".
+
+.. option:: -output-dir=PATH
+
+  Specify a directory to write coverage reports into. If the directory does not
+  exist, it is created. When used in function view mode (i.e. when -name or
+  -name-regex are used to select specific functions), the report is written to
+  PATH/functions.EXTENSION. When used in file view mode, a report for each file
+  is written to PATH/REL_PATH_TO_FILE.EXTENSION.
+
+.. option:: -Xdemangler=<TOOL>|<TOOL-OPTION>
+
+  Specify a symbol demangler. This can be used to make reports more
+  human-readable. This option can be specified multiple times to supply
+  arguments to the demangler (e.g. ``-Xdemangler c++filt -Xdemangler -n`` for C++).
The demangler is expected to read a newline-separated list of symbols from
+  stdin and write a newline-separated list of the same length to stdout.
+
.. option:: -line-coverage-gt=<N>

Show code coverage only for functions with line coverage greater than the diff --git a/docs/CommandGuide/llvm-nm.rst b/docs/CommandGuide/llvm-nm.rst index 83d9fbaf9e8cc..f666e1c35e35d 100644 --- a/docs/CommandGuide/llvm-nm.rst +++ b/docs/CommandGuide/llvm-nm.rst @@ -126,6 +126,11 @@ OPTIONS

Print only symbols referenced but not defined in this file.

+.. option:: --radix=RADIX, -t
+
+  Specify the radix of the symbol addresses. Accepted values are d (decimal),
+  x (hexadecimal), and o (octal).
+
BUGS ---- diff --git a/docs/CommandGuide/llvm-profdata.rst b/docs/CommandGuide/llvm-profdata.rst index 74fe4ee9d2194..f5508b5b2b8f2 100644 --- a/docs/CommandGuide/llvm-profdata.rst +++ b/docs/CommandGuide/llvm-profdata.rst @@ -44,6 +44,9 @@ interpreted as relatively more important than a shorter run. Depending on the nature of the training runs it may be useful to adjust the weight given to each input file by using the ``-weighted-input`` option.
+
+Profiles passed in via ``-weighted-input``, ``-input-files``, or via positional
+arguments are processed once for each time they are seen.
+
OPTIONS ^^^^^^^ @@ -59,10 +62,17 @@ OPTIONS

.. option:: -weighted-input=weight,filename

- Specify an input file name along with a weight. The profile counts of the input
- file will be scaled (multiplied) by the supplied ``weight``, where where ``weight``
- is a decimal integer >= 1. Input files specified without using this option are
- assigned a default weight of 1. Examples are shown below.
+ Specify an input file name along with a weight. The profile counts of the
+ supplied ``filename`` will be scaled (multiplied) by the supplied
+ ``weight``, where ``weight`` is a decimal integer >= 1.
+ Input files specified without using this option are assigned a default
+ weight of 1. Examples are shown below.
+
+..
option:: -input-files=path, -f=path
+
+  Specify a file which contains a list of files to merge. The entries in this
+  file are newline-separated. Lines starting with '#' are skipped. Entries may
+  be of the form <filename> or <weight>,<filename>.

.. option:: -instr (default)

@@ -90,6 +100,12 @@ OPTIONS

Emit the profile using GCC's gcov format (Not yet supported).

+.. option:: -sparse[=true|false]
+
+  Do not emit function records with 0 execution count. Can only be used in
+  conjunction with -instr. Defaults to false, since it can inhibit compiler
+  optimization during PGO.
+
EXAMPLES ^^^^^^^^ Basic Usage diff --git a/docs/CommandGuide/llvm-readobj.rst b/docs/CommandGuide/llvm-readobj.rst index b1918b548f856..417fcd05c8a20 100644 --- a/docs/CommandGuide/llvm-readobj.rst +++ b/docs/CommandGuide/llvm-readobj.rst @@ -80,6 +80,10 @@ input. Otherwise, it will read from the specified ``filenames``.

Display the ELF program headers (only for ELF object files).

+.. option:: -elf-section-groups, -g
+
+  Display section groups (only for ELF object files).
+
EXIT STATUS ----------- diff --git a/docs/CompileCudaWithLLVM.rst index a981ffe1e8f52..f57839cec9615 100644 --- a/docs/CompileCudaWithLLVM.rst +++ b/docs/CompileCudaWithLLVM.rst @@ -18,9 +18,11 @@ familiarity with CUDA. Information about CUDA programming can be found in the How to Build LLVM with CUDA Support =================================== -Below is a quick summary of downloading and building LLVM. Consult the `Getting -Started <http://llvm.org/docs/GettingStarted.html>`_ page for more details on -setting up LLVM. +CUDA support is still in development and works best in the trunk version +of LLVM. Below is a quick summary of downloading and building the trunk +version. Consult the `Getting Started +<http://llvm.org/docs/GettingStarted.html>`_ page for more details on setting +up LLVM. #.
Checkout LLVM @@ -51,7 +53,7 @@ How to Compile CUDA C/C++ with LLVM =================================== We assume you have installed the CUDA driver and runtime. Consult the `NVIDIA -CUDA installation Guide +CUDA installation guide <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_ if you have not. @@ -60,8 +62,6 @@ which multiplies a ``float`` array by a ``float`` scalar (AXPY). .. code-block:: c++ - #include <helper_cuda.h> // for checkCudaErrors - #include <iostream> __global__ void axpy(float a, float* x, float* y) { @@ -78,25 +78,25 @@ which multiplies a ``float`` array by a ``float`` scalar (AXPY). // Copy input data to device. float* device_x; float* device_y; - checkCudaErrors(cudaMalloc(&device_x, kDataLen * sizeof(float))); - checkCudaErrors(cudaMalloc(&device_y, kDataLen * sizeof(float))); - checkCudaErrors(cudaMemcpy(device_x, host_x, kDataLen * sizeof(float), - cudaMemcpyHostToDevice)); + cudaMalloc(&device_x, kDataLen * sizeof(float)); + cudaMalloc(&device_y, kDataLen * sizeof(float)); + cudaMemcpy(device_x, host_x, kDataLen * sizeof(float), + cudaMemcpyHostToDevice); // Launch the kernel. axpy<<<1, kDataLen>>>(a, device_x, device_y); // Copy output data to host. - checkCudaErrors(cudaDeviceSynchronize()); - checkCudaErrors(cudaMemcpy(host_y, device_y, kDataLen * sizeof(float), - cudaMemcpyDeviceToHost)); + cudaDeviceSynchronize(); + cudaMemcpy(host_y, device_y, kDataLen * sizeof(float), + cudaMemcpyDeviceToHost); // Print the results. for (int i = 0; i < kDataLen; ++i) { std::cout << "y[" << i << "] = " << host_y[i] << "\n"; } - checkCudaErrors(cudaDeviceReset()); + cudaDeviceReset(); return 0; } @@ -104,16 +104,89 @@ The command line for compilation is similar to what you would use for C++. .. 
code-block:: console - $ clang++ -o axpy -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cu -lcudart_static -lcuda -ldl -lrt -pthread + $ clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \ + -L<CUDA install path>/<lib64 or lib> \ + -lcudart_static -ldl -lrt -pthread $ ./axpy y[0] = 2 y[1] = 4 y[2] = 6 y[3] = 8 -Note that ``helper_cuda.h`` comes from the CUDA samples, so you need the -samples installed for this example. ``<CUDA install path>`` is the root -directory where you installed CUDA SDK, typically ``/usr/local/cuda``. +``<CUDA install path>`` is the root directory where you installed CUDA SDK, +typically ``/usr/local/cuda``. ``<GPU arch>`` is `the compute capability of +your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want +to run your program on a GPU with compute capability of 3.5, you should specify +``--cuda-gpu-arch=sm_35``. + +Detecting clang vs NVCC +======================= + +Although clang's CUDA implementation is largely compatible with NVCC's, you may +still want to detect when you're compiling CUDA code specifically with clang. + +This is tricky, because NVCC may invoke clang as part of its own compilation +process! For example, NVCC uses the host compiler's preprocessor when +compiling for device code, and that host compiler may in fact be clang. + +When clang is actually compiling CUDA code -- rather than being used as a +subtool of NVCC's -- it defines the ``__CUDA__`` macro. ``__CUDA_ARCH__`` is +defined only in device mode (but will be defined if NVCC is using clang as a +preprocessor). So you can use the following incantations to detect clang CUDA +compilation, in host and device modes: + +.. code-block:: c++ + + #if defined(__clang__) && defined(__CUDA__) && !defined(__CUDA_ARCH__) + // clang compiling CUDA code, host mode. + #endif + + #if defined(__clang__) && defined(__CUDA__) && defined(__CUDA_ARCH__) + // clang compiling CUDA code, device mode. 
+ #endif + +Both clang and nvcc define ``__CUDACC__`` during CUDA compilation. You can +detect NVCC specifically by looking for ``__NVCC__``. + +Flags that control numerical code +================================= + +If you're using GPUs, you probably care about making numerical code run fast. +GPU hardware allows for more control over numerical operations than most CPUs, +but this results in more compiler options for you to juggle. + +Flags you may wish to tweak include: + +* ``-ffp-contract={on,off,fast}`` (defaults to ``fast`` on host and device when + compiling CUDA) Controls whether the compiler emits fused multiply-add + operations. + + * ``off``: never emit fma operations, and prevent ptxas from fusing multiply + and add instructions. + * ``on``: fuse multiplies and adds within a single statement, but never + across statements (C11 semantics). Prevent ptxas from fusing other + multiplies and adds. + * ``fast``: fuse multiplies and adds wherever profitable, even across + statements. Doesn't prevent ptxas from fusing additional multiplies and + adds. + + Fused multiply-add instructions can be much faster than the unfused + equivalents, but because the intermediate result in an fma is not rounded, + this flag can affect numerical code. + +* ``-fcuda-flush-denormals-to-zero`` (default: off) When this is enabled, + floating point operations may flush `denormal + <https://en.wikipedia.org/wiki/Denormal_number>`_ inputs and/or outputs to 0. + Operations on denormal numbers are often much slower than the same operations + on normal numbers. + +* ``-fcuda-approx-transcendentals`` (default: off) When this is enabled, the + compiler may emit calls to faster, approximate versions of transcendental + functions, instead of using the slower, fully IEEE-compliant versions. For + example, this flag allows clang to emit the ptx ``sin.approx.f32`` + instruction. + + This is implied by ``-ffast-math``. 
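The reason ``-ffp-contract`` can change results is that an unfused ``a*b + c`` rounds twice (after the multiply and again after the add), while a fused multiply-add rounds only once. The Python sketch below is illustrative only; it emulates the single-rounding fused result with exact rational arithmetic rather than a hardware fma instruction:

```python
# Why fusing a multiply-add changes results: the unfused form rounds the
# product before the add, the fused form keeps the product exact.
from fractions import Fraction

a = 1.0 + 2.0**-27      # exactly representable in an IEEE double
c = -(1.0 + 2.0**-26)   # the nearest double to a*a, negated

# Unfused: a*a rounds away the tiny 2**-54 term, so the sum cancels to 0.
unfused = a * a + c

# Fused (emulated): the exact product 1 + 2**-26 + 2**-54 survives,
# and a single final rounding keeps the 2**-54 remainder.
fused = float(Fraction(a) * Fraction(a) + Fraction(c))

print(unfused)            # 0.0
print(fused == 2.0**-54)  # True
```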
Optimizations ============= @@ -134,10 +207,9 @@ customizable target-independent optimization pipeline. straight-line scalar optimizations <https://goo.gl/4Rb9As>`_. * **Inferring memory spaces**. `This optimization - <http://www.llvm.org/docs/doxygen/html/NVPTXFavorNonGenericAddrSpaces_8cpp_source.html>`_ + <https://github.com/llvm-mirror/llvm/blob/master/lib/Target/NVPTX/NVPTXInferAddressSpaces.cpp>`_ infers the memory space of an address so that the backend can emit faster - special loads and stores from it. Details can be found in the `design - document for memory space inference <https://goo.gl/5wH2Ct>`_. + special loads and stores from it. * **Aggressive loop unrolling and function inlining**. Loop unrolling and function inlining need to be more aggressive for GPUs than for CPUs because @@ -167,3 +239,22 @@ customizable target-independent optimization pipeline. 32-bit ones on NVIDIA GPUs due to the lack of a divide unit. Many of the 64-bit divides in our benchmarks have a divisor and dividend which fit in 32 bits at runtime. This optimization provides a fast path for this common case. + +Publication =========== + +| `gpucc: An Open-Source GPGPU Compiler <http://dl.acm.org/citation.cfm?id=2854041>`_ +| Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt +| *Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO 2016)* +| `Slides for the CGO talk <http://wujingyue.com/docs/gpucc-talk.pdf>`_ + +Tutorial ======== + +`CGO 2016 gpucc tutorial <http://wujingyue.com/docs/gpucc-tutorial.pdf>`_ + +Obtaining Help ============== + +To obtain help on LLVM in general and its CUDA support, see `the LLVM +community <http://llvm.org/docs/#mailing-lists>`_.
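The speculative 64-bit divide bypass mentioned in the Optimizations section above can be sketched as follows. This Python model shows only the emitted control flow (check whether both operands fit in 32 bits, then pick a path); it is not the actual LLVM IR pass:

```python
# Sketch of the 64-bit divide bypass: if both operands fit in 32 bits,
# take a cheap 32-bit divide path; otherwise fall back to the slow
# 64-bit divide. Models the emitted control flow only.

def div64_with_bypass(a, b):
    if (a | b) >> 32 == 0:
        # Fast path: a 32-bit unsigned divide gives the same answer.
        return ("div32", a // b)
    # Slow path: full 64-bit divide.
    return ("div64", a // b)

print(div64_with_bypass(100, 7))            # ('div32', 14)
print(div64_with_bypass(2**40 + 5, 2**33))  # ('div64', 128)
```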
diff --git a/docs/CompilerWriterInfo.rst b/docs/CompilerWriterInfo.rst index 6c3ff4b10f1e6..5ae47ea89fe2c 100644 --- a/docs/CompilerWriterInfo.rst +++ b/docs/CompilerWriterInfo.rst @@ -13,25 +13,18 @@ Architecture & Platform Information for Compiler Writers Hardware ======== -ARM ---- +AArch64 & ARM +------------- -* `ARM documentation <http://www.arm.com/documentation/>`_ (`Processor Cores <http://www.arm.com/documentation/ARMProcessor_Cores/>`_ Cores) +* `ARMv8-A Architecture Reference Manual <http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.h/index.html>`_ (authentication required, free sign-up). This document covers both AArch64 and ARM instructions -* `ABI <http://www.arm.com/products/DevTools/ABI.html>`_ +* `ARMv7-M Architecture Reference Manual <http://infocenter.arm.com/help/topic/com.arm.doc.ddi0403e.b/index.html>`_ (authentication required, free sign-up). This covers the Thumb2-only microcontrollers -* `ABI Addenda and Errata <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_addenda.pdf>`_ +* `ARMv6-M Architecture Reference Manual <http://infocenter.arm.com/help/topic/com.arm.doc.ddi0419c/index.html>`_ (authentication required, free sign-up).
This covers the Thumb1-only microcontrollers * `ARM C Language Extensions <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf>`_ -AArch64 -------- - -* `ARMv8 Architecture Reference Manual <http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.h/index.html>`_ - -* `ARMv8 Instruction Set Overview <http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.genc010197a/index.html>`_ - -* `ARM C Language Extensions <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf>`_ +* AArch32 `ABI Addenda and Errata <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_addenda.pdf>`_ Itanium (ia64) -------------- @@ -97,21 +90,10 @@ SystemZ X86 --- -AMD - Official manuals and docs -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - * `AMD processor manuals <http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739,00.html>`_ -* `X86-64 ABI <http://www.x86-64.org/documentation>`_ - -Intel - Official manuals and docs -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - * `Intel 64 and IA-32 manuals <http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html>`_ * `Intel Itanium documentation <http://www.intel.com/design/itanium/documentation.htm?iid=ipp_srvr_proc_itanium2+techdocs>`_ - -Other x86-specific information -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - +* `X86 and X86-64 SysV psABI <https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI>`_ * `Calling conventions for different C++ compilers and operating systems <http://www.agner.org/optimize/calling_conventions.pdf>`_ XCore @@ -134,6 +116,7 @@ ABI Linux ----- +* `Linux extensions to gabi <https://github.com/hjl-tools/linux-abi/wiki/Linux-Extensions-to-gABI>`_ * `PowerPC 64-bit ELF ABI Supplement <http://www.linuxbase.org/spec/ELF/ppc64/>`_ * `Procedure Call Standard for the AArch64 Architecture <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055a/IHI0055A_aapcs64.pdf>`_ * `ELF for the ARM Architecture 
<http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf>`_ diff --git a/docs/CoverageMappingFormat.rst b/docs/CoverageMappingFormat.rst index 84cddff5ed9ed..158255ab86397 100644 --- a/docs/CoverageMappingFormat.rst +++ b/docs/CoverageMappingFormat.rst @@ -251,27 +251,40 @@ The coverage mapping variable generated by Clang has 3 fields: .. code-block:: llvm - @__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x { i8*, i32, i32 }], [40 x i8] } + @__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x { i64, i32, i64 }], [40 x i8] } { { i32, i32, i32, i32 } ; Coverage map header { i32 2, ; The number of function records i32 20, ; The length of the string that contains the encoded translation unit filenames i32 20, ; The length of the string that contains the encoded coverage mapping data - i32 0, ; Coverage mapping format version + i32 1, ; Coverage mapping format version }, - [2 x { i8*, i32, i32 }] [ ; Function records - { i8*, i32, i32 } { i8* getelementptr inbounds ([3 x i8]* @__llvm_profile_name_foo, i32 0, i32 0), ; Function's name - i32 3, ; Function's name length - i32 9 ; Function's encoded coverage mapping data string length + [2 x { i64, i32, i64 }] [ ; Function records + { i64, i32, i64 } { + i64 0x5cf8c24cdb18bdac, ; Function's name MD5 + i32 9, ; Function's encoded coverage mapping data string length + i64 0 ; Function's structural hash }, - { i8*, i32, i32 } { i8* getelementptr inbounds ([3 x i8]* @__llvm_profile_name_bar, i32 0, i32 0), ; Function's name - i32 3, ; Function's name length - i32 9 ; Function's encoded coverage mapping data string length + { i64, i32, i64 } { + i64 0xe413754a191db537, ; Function's name MD5 + i32 9, ; Function's encoded coverage mapping data string length + i64 0 ; Function's structural hash }], [40 x i8] c"..." ; Encoded data (dissected later) }, section "__llvm_covmap", align 8 +The function record layout has evolved since version 1. 
In version 1, the function record for *foo* is defined as follows: + +.. code-block:: llvm + + { i8*, i32, i32, i64 } { i8* getelementptr inbounds ([3 x i8]* @__profn_foo, i32 0, i32 0), ; Function's name + i32 3, ; Function's name length + i32 9, ; Function's encoded coverage mapping data string length + i64 0 ; Function's structural hash + } + + Coverage Mapping Header: ------------------------ @@ -283,7 +296,7 @@ The coverage mapping header has the following fields: * The length of the string in the third field of *__llvm_coverage_mapping* that contains the encoded coverage mapping data. -* The format version. 0 is the first (current) version of the coverage mapping format. +* The format version. The current version is 2 (encoded as a 1). .. _function records: @@ -294,10 +307,10 @@ A function record is a structure of the following type: .. code-block:: llvm - { i8*, i32, i32 } + { i64, i32, i64 } -It contains the pointer to the function's name, function's name length, -and the length of the encoded mapping data for that function. +It contains the MD5 hash of the function's name, the length of the encoded mapping data for that function, and the function's +structural hash value. Encoded data: ------------- @@ -417,7 +430,7 @@ and can appear after ``:`` in the ``[foo : type]`` description. LEB128 ^^^^^^ -LEB128 is an unsigned interger value that is encoded using DWARF's LEB128 +LEB128 is an unsigned integer value that is encoded using DWARF's LEB128 encoding, optimizing for the case where values are small (1 byte for values less than 128). diff --git a/docs/DeveloperPolicy.rst b/docs/DeveloperPolicy.rst index 17baf2d27b134..23bdb2fcf17be 100644 --- a/docs/DeveloperPolicy.rst +++ b/docs/DeveloperPolicy.rst @@ -186,7 +186,7 @@ problem, we have a notion of an 'owner' for a piece of the code. The sole responsibility of a code owner is to ensure that a commit to their area of the code is appropriately reviewed, either by themself or by someone else. 
The list of current code owners can be found in the file -`CODE_OWNERS.TXT <http://llvm.org/viewvc/llvm-project/llvm/trunk/CODE_OWNERS.TXT?view=markup>`_ +`CODE_OWNERS.TXT <http://llvm.org/klaus/llvm/blob/master/CODE_OWNERS.TXT>`_ in the root of the LLVM source tree. Note that code ownership is completely different than reviewers: anyone can @@ -338,7 +338,7 @@ Obtaining Commit Access We grant commit access to contributors with a track record of submitting high quality patches. If you would like commit access, please send an email to -`Chris <mailto:sabre@nondot.org>`_ with the following information: +`Chris <mailto:clattner@llvm.org>`_ with the following information: #. The user name you want to commit with, e.g. "hacker". @@ -348,8 +348,10 @@ quality patches. If you would like commit access, please send an email to #. A "password hash" of the password you want to use, e.g. "``2ACR96qjUqsyM``". Note that you don't ever tell us what your password is; you just give it to us in an encrypted form. To get this, run "``htpasswd``" (a utility that - comes with apache) in crypt mode (often enabled with "``-d``"), or find a web - page that will do it for you. + comes with apache) in *crypt* mode (often enabled with "``-d``"), or find a web + page that will do it for you. Note that our system does not work with MD5 + hashes. These are significantly longer than a crypt hash, e.g. + "``$apr1$vea6bBV2$Z8IFx.AfeD8LhqlZFqJer0``"; we only accept the shorter crypt hash. Once you've been granted commit access, you should be able to check out an LLVM tree with an SVN URL of "https://username@llvm.org/..." instead of the normal diff --git a/docs/FAQ.rst b/docs/FAQ.rst index 0559a1ff21505..0ab99f3452a7a 100644 --- a/docs/FAQ.rst +++ b/docs/FAQ.rst @@ -75,149 +75,17 @@ reference. In fact, the names of dummy numbered temporaries like ``%1`` are not explicitly represented in the in-memory representation at all (see ``Value::getName()``). 
-Build Problems -============== - -When I run configure, it finds the wrong C compiler. ----------------------------------------------------- -The ``configure`` script attempts to locate first ``gcc`` and then ``cc``, -unless it finds compiler paths set in ``CC`` and ``CXX`` for the C and C++ -compiler, respectively. - -If ``configure`` finds the wrong compiler, either adjust your ``PATH`` -environment variable or set ``CC`` and ``CXX`` explicitly. - - -The ``configure`` script finds the right C compiler, but it uses the LLVM tools from a previous build. What do I do? ---------------------------------------------------------------------------------------------------------------------- -The ``configure`` script uses the ``PATH`` to find executables, so if it's -grabbing the wrong linker/assembler/etc, there are two ways to fix it: - -#. Adjust your ``PATH`` environment variable so that the correct program - appears first in the ``PATH``. This may work, but may not be convenient - when you want them *first* in your path for other work. - -#. Run ``configure`` with an alternative ``PATH`` that is correct. In a - Bourne compatible shell, the syntax would be: - -.. code-block:: console - - % PATH=[the path without the bad program] $LLVM_SRC_DIR/configure ... - -This is still somewhat inconvenient, but it allows ``configure`` to do its -work without having to adjust your ``PATH`` permanently. - - -When creating a dynamic library, I get a strange GLIBC error. -------------------------------------------------------------- -Under some operating systems (i.e. Linux), libtool does not work correctly if -GCC was compiled with the ``--disable-shared option``. To work around this, -install your own version of GCC that has shared libraries enabled by default. - - -I've updated my source tree from Subversion, and now my build is trying to use a file/directory that doesn't exist. 
-------------------------------------------------------------------------------------------------------------------- -You need to re-run configure in your object directory. When new Makefiles -are added to the source tree, they have to be copied over to the object tree -in order to be used by the build. - - -I've modified a Makefile in my source tree, but my build tree keeps using the old version. What do I do? ---------------------------------------------------------------------------------------------------------- -If the Makefile already exists in your object tree, you can just run the -following command in the top level directory of your object tree: - -.. code-block:: console - - % ./config.status <relative path to Makefile>; - -If the Makefile is new, you will have to modify the configure script to copy -it over. - - -I've upgraded to a new version of LLVM, and I get strange build errors. ------------------------------------------------------------------------ -Sometimes, changes to the LLVM source code alters how the build system works. -Changes in ``libtool``, ``autoconf``, or header file dependencies are -especially prone to this sort of problem. - -The best thing to try is to remove the old files and re-build. In most cases, -this takes care of the problem. To do this, just type ``make clean`` and then -``make`` in the directory that fails to build. - - -I've built LLVM and am testing it, but the tests freeze. --------------------------------------------------------- -This is most likely occurring because you built a profile or release -(optimized) build of LLVM and have not specified the same information on the -``gmake`` command line. - -For example, if you built LLVM with the command: - -.. code-block:: console - - % gmake ENABLE_PROFILING=1 - -...then you must run the tests with the following commands: - -.. code-block:: console - - % cd llvm/test - % gmake ENABLE_PROFILING=1 - -Why do test results differ when I perform different types of builds? 
--------------------------------------------------------------------- -The LLVM test suite is dependent upon several features of the LLVM tools and -libraries. - -First, the debugging assertions in code are not enabled in optimized or -profiling builds. Hence, tests that used to fail may pass. - -Second, some tests may rely upon debugging options or behavior that is only -available in the debug build. These tests will fail in an optimized or -profile build. - - -Compiling LLVM with GCC 3.3.2 fails, what should I do? ------------------------------------------------------- -This is `a bug in GCC <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13392>`_, -and affects projects other than LLVM. Try upgrading or downgrading your GCC. - - -After Subversion update, rebuilding gives the error "No rule to make target". ------------------------------------------------------------------------------ -If the error is of the form: - -.. code-block:: console - - gmake[2]: *** No rule to make target `/path/to/somefile', - needed by `/path/to/another/file.d'. - Stop. - -This may occur anytime files are moved within the Subversion repository or -removed entirely. In this case, the best solution is to erase all ``.d`` -files, which list dependencies for source files, and rebuild: - -.. code-block:: console - - % cd $LLVM_OBJ_DIR - % rm -f `find . -name \*\.d` - % gmake - -In other cases, it may be necessary to run ``make clean`` before rebuilding. - Source Languages ================ What source languages are supported? ------------------------------------ -LLVM currently has full support for C and C++ source languages. These are -available through both `Clang <http://clang.llvm.org/>`_ and `DragonEgg -<http://dragonegg.llvm.org/>`_. -The PyPy developers are working on integrating LLVM into the PyPy backend so -that PyPy language can translate to LLVM. +LLVM currently has full support for C and C++ source languages through +`Clang <http://clang.llvm.org/>`_. 
Many other language frontends have +been written using LLVM, and an incomplete list is available at +`projects with LLVM <http://llvm.org/ProjectsWithLLVM/>`_. I'd like to write a self-hosting LLVM compiler. How should I interface with the LLVM middle-end optimizers and back-end code generators? diff --git a/docs/GettingStarted.rst b/docs/GettingStarted.rst index 6aba500367939..54240b92b6af8 100644 --- a/docs/GettingStarted.rst +++ b/docs/GettingStarted.rst @@ -38,6 +38,9 @@ Here's the short story for getting up and running quickly with LLVM: #. Read the documentation. #. Read the documentation. #. Remember that you were warned twice about reading the documentation. + + * In particular, the *relative paths specified are important*. + #. Checkout LLVM: * ``cd where-you-want-llvm-to-live`` @@ -49,13 +52,13 @@ Here's the short story for getting up and running quickly with LLVM: * ``cd llvm/tools`` * ``svn co http://llvm.org/svn/llvm-project/cfe/trunk clang`` -#. Checkout Compiler-RT (required to build the sanitizers): +#. Checkout Compiler-RT (required to build the sanitizers) **[Optional]**: * ``cd where-you-want-llvm-to-live`` * ``cd llvm/projects`` * ``svn co http://llvm.org/svn/llvm-project/compiler-rt/trunk compiler-rt`` -#. Checkout Libomp (required for OpenMP support): +#. Checkout Libomp (required for OpenMP support) **[Optional]**: * ``cd where-you-want-llvm-to-live`` * ``cd llvm/projects`` @@ -76,10 +79,15 @@ Here's the short story for getting up and running quickly with LLVM: #. Configure and build LLVM and Clang: - The usual build uses `CMake <CMake.html>`_. If you would rather use - autotools, see `Building LLVM with autotools <BuildingLLVMWithAutotools.html>`_. - Although the build is known to work with CMake >= 2.8.8, we recommend CMake - >= v3.2, especially if you're generating Ninja build files. + *Warning:* Make sure you've checked out *all of* the source code + before trying to configure with cmake. cmake does not pick up newly
cmake does not pickup newly + added source directories in incremental builds. + + The build uses `CMake <CMake.html>`_. LLVM requires CMake 3.4.3 to build. It + is generally recommended to use a recent CMake, especially if you're + generating Ninja build files. This is because the CMake project is constantly + improving the quality of the generators, and the Ninja generator gets a lot + of attention. * ``cd where you want to build llvm`` * ``mkdir build`` @@ -89,10 +97,10 @@ Here's the short story for getting up and running quickly with LLVM: Some common generators are: * ``Unix Makefiles`` --- for generating make-compatible parallel makefiles. - * ``Ninja`` --- for generating `Ninja <http://martine.github.io/ninja/>` - build files. Most llvm developers use Ninja. + * ``Ninja`` --- for generating `Ninja <https://ninja-build.org>`_ + build files. Most llvm developers use Ninja. * ``Visual Studio`` --- for generating Visual Studio projects and - solutions. + solutions. * ``Xcode`` --- for generating Xcode projects. Some Common options: @@ -117,15 +125,17 @@ Here's the short story for getting up and running quickly with LLVM: * CMake will generate build targets for each tool and library, and most LLVM sub-projects generate their own ``check-<project>`` target. + * Running a serial build will be *slow*. Make sure you run a + parallel build; for ``make``, use ``make -j``. + * For more information see `CMake <CMake.html>`_ * If you get an "internal compiler error (ICE)" or test failures, see `below`_. Consult the `Getting Started with LLVM`_ section for detailed information on -configuring and compiling LLVM. See `Setting Up Your Environment`_ for tips -that simplify working with the Clang front end and LLVM tools. Go to `Program -Layout`_ to learn about the layout of the source code tree. +configuring and compiling LLVM. Go to `Directory Layout`_ to learn about the +layout of the source code tree. 
Requirements ============ @@ -161,16 +171,17 @@ Windows x64 x86-64 Visual Studio #. Code generation supported for Pentium processors and up #. Code generation supported for 32-bit ABI only #. To use LLVM modules on Win32-based system, you may configure LLVM - with ``-DBUILD_SHARED_LIBS=On`` for CMake builds or ``--enable-shared`` - for configure builds. + with ``-DBUILD_SHARED_LIBS=On``. #. MCJIT not working well pre-v7, old JIT engine not supported any more. -Note that you will need about 1-3 GB of space for a full LLVM build in Debug -mode, depending on the system (it is so large because of all the debugging -information and the fact that the libraries are statically linked into multiple -tools). If you do not need many of the tools and you are space-conscious, you -can pass ``ONLY_TOOLS="tools you need"`` to make. The Release build requires -considerably less space. +Note that Debug builds require a lot of time and disk space. An LLVM-only build +will need about 1-3 GB of space. A full build of LLVM and Clang will need around +15-20 GB of disk space. The exact space requirements will vary by system. (It +is so large because of all the debugging information and the fact that the +libraries are statically linked into multiple tools). + +If you are space-constrained, you can build only selected tools or only +selected targets. The Release build requires considerably less space. The LLVM suite *may* compile on other platforms, but it is not guaranteed to do so. 
If compilation is successful, the LLVM utilities should be able to @@ -193,11 +204,7 @@ Package Version Notes `GNU Make <http://savannah.gnu.org/projects/make>`_ 3.79, 3.79.1 Makefile/build processor `GCC <http://gcc.gnu.org/>`_ >=4.7.0 C/C++ compiler\ :sup:`1` `python <http://www.python.org/>`_ >=2.7 Automated test suite\ :sup:`2` -`GNU M4 <http://savannah.gnu.org/projects/m4>`_ 1.4 Macro processor for configuration\ :sup:`3` -`GNU Autoconf <http://www.gnu.org/software/autoconf/>`_ 2.60 Configuration script builder\ :sup:`3` -`GNU Automake <http://www.gnu.org/software/automake/>`_ 1.9.6 aclocal macro generator\ :sup:`3` -`libtool <http://savannah.gnu.org/projects/libtool>`_ 1.5.22 Shared library manager\ :sup:`3` -`zlib <http://zlib.net>`_ >=1.2.3.4 Compression library\ :sup:`4` +`zlib <http://zlib.net>`_ >=1.2.3.4 Compression library\ :sup:`3` =========================================================== ============ ========================================== .. note:: @@ -207,9 +214,6 @@ Package Version Notes info. #. Only needed if you want to run the automated test suite in the ``llvm/test`` directory. - #. If you want to make changes to the configure scripts, you will need GNU - autoconf (2.60), and consequently, GNU M4 (version 1.4 or higher). You - will also need automake (1.9.6). We only use aclocal from that package. #. Optional, adds compression / uncompression capabilities to selected LLVM tools. @@ -421,22 +425,6 @@ appropriate pathname on your local system. All these paths are absolute: object files and compiled programs will be placed. It can be the same as SRC_ROOT). -.. _Setting Up Your Environment: - -Setting Up Your Environment ---------------------------- - -In order to compile and use LLVM, you may need to set some environment -variables. - -``LLVM_LIB_SEARCH_PATH=/path/to/your/bitcode/libs`` - - [Optional] This environment variable helps LLVM linking tools find the - locations of your bitcode libraries. 
It is provided only as a convenience - since you can specify the paths using the -L options of the tools and the - C/C++ front-end will automatically use the bitcode files installed in its - ``lib`` directory. - Unpacking the LLVM Archives --------------------------- @@ -513,8 +501,7 @@ get it from the Subversion repository: % svn co http://llvm.org/svn/llvm-project/test-suite/trunk test-suite By placing it in the ``llvm/projects``, it will be automatically configured by -the LLVM configure script as well as automatically updated when you run ``svn -update``. +the LLVM cmake configuration. Git Mirror ---------- @@ -628,6 +615,8 @@ Then, your .git/config should have [imap] sections. ; example for Traditional Chinese folder = "[Gmail]/&g0l6Pw-" +.. _developers-work-with-git-svn: + For developers to work with git-svn ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -711,9 +700,8 @@ Local LLVM Configuration ------------------------ Once checked out from the Subversion repository, the LLVM suite source code must -be configured before being built. For instructions using autotools please see -`Building LLVM With Autotools <BuildingLLVMWithAutotools.html>`_. The -recommended process uses CMake. Unlinke the normal ``configure`` script, CMake +be configured before being built. This process uses CMake. +Unlike the normal ``configure`` script, CMake generates the build files in whatever format you request as well as various ``*.inc`` files, and ``llvm/include/Config/config.h``. @@ -744,9 +732,9 @@ used by people developing LLVM. | | the configure script. The default list is defined | | | as ``LLVM_ALL_TARGETS``, and can be set to include | | | out-of-tree targets. The default value includes: | -| | ``AArch64, AMDGPU, ARM, BPF, CppBackend, Hexagon, | -| | Mips, MSP430, NVPTX, PowerPC, Sparc, SystemZ | -| | X86, XCore``. | +| | ``AArch64, AMDGPU, ARM, BPF, Hexagon, Mips, | +| | MSP430, NVPTX, PowerPC, Sparc, SystemZ, X86, | +| | XCore``. 
| +-------------------------+----------------------------------------------------+ | LLVM_ENABLE_DOXYGEN | Build doxygen-based documentation from the source | | | code. This is disabled by default because it is | @@ -888,8 +876,6 @@ The LLVM build system is capable of sharing a single LLVM source tree among several LLVM builds. Hence, it is possible to build LLVM for several different platforms or configurations using the same source tree. -This is accomplished in the typical autoconf manner: - * Change directory to where the LLVM object files should live: .. code-block:: console @@ -942,40 +928,38 @@ use this command instead of the 'echo' command above: .. _Program Layout: .. _general layout: -Program Layout -============== +Directory Layout +================ One useful source of information about the LLVM source base is the LLVM `doxygen -<http://www.doxygen.org/>`_ documentation available at +<http://www.doxygen.org/>`_ documentation available at `<http://llvm.org/doxygen/>`_. The following is a brief introduction to code layout: ``llvm/examples`` ----------------- -This directory contains some simple examples of how to use the LLVM IR and JIT. +Simple examples using the LLVM IR and JIT. ``llvm/include`` ---------------- -This directory contains public header files exported from the LLVM library. The -three main subdirectories of this directory are: +Public header files exported from the LLVM library. The three main subdirectories: ``llvm/include/llvm`` - This directory contains all of the LLVM specific header files. This directory - also has subdirectories for different portions of LLVM: ``Analysis``, - ``CodeGen``, ``Target``, ``Transforms``, etc... + All LLVM-specific header files, and subdirectories for different portions of + LLVM: ``Analysis``, ``CodeGen``, ``Target``, ``Transforms``, etc... ``llvm/include/llvm/Support`` - This directory contains generic support libraries that are provided with LLVM - but not necessarily specific to LLVM. 
For example, some C++ STL utilities and - a Command Line option processing library store their header files here. + Generic support libraries provided with LLVM but not necessarily specific to + LLVM. For example, some C++ STL utilities and a Command Line option processing + library store header files here. ``llvm/include/llvm/Config`` - This directory contains header files configured by the ``configure`` script. + Header files configured by the ``configure`` script. They wrap "standard" UNIX and C header files. Source code can include these header files which automatically take care of the conditional #includes that the ``configure`` script generates. @@ -983,103 +967,76 @@ three main subdirectories of this directory are: ``llvm/lib`` ------------ -This directory contains most of the source files of the LLVM system. In LLVM, -almost all code exists in libraries, making it very easy to share code among the -different `tools`_. +Most source files are here. By putting code in libraries, LLVM makes it easy to +share code among the `tools`_. ``llvm/lib/IR/`` - This directory holds the core LLVM source files that implement core classes - like Instruction and BasicBlock. + Core LLVM source files that implement core classes like Instruction and + BasicBlock. ``llvm/lib/AsmParser/`` - This directory holds the source code for the LLVM assembly language parser - library. + Source code for the LLVM assembly language parser library. ``llvm/lib/Bitcode/`` - This directory holds code for reading and write LLVM bitcode. + Code for reading and writing bitcode. ``llvm/lib/Analysis/`` - This directory contains a variety of different program analyses, such as - Dominator Information, Call Graphs, Induction Variables, Interval - Identification, Natural Loop Identification, etc. + A variety of program analyses, such as Call Graphs, Induction Variables, + Natural Loop Identification, etc. 
``llvm/lib/Transforms/`` - This directory contains the source code for the LLVM to LLVM program - transformations, such as Aggressive Dead Code Elimination, Sparse Conditional - Constant Propagation, Inlining, Loop Invariant Code Motion, Dead Global - Elimination, and many others. + IR-to-IR program transformations, such as Aggressive Dead Code Elimination, + Sparse Conditional Constant Propagation, Inlining, Loop Invariant Code Motion, + Dead Global Elimination, and many others. ``llvm/lib/Target/`` - This directory contains files that describe various target architectures for - code generation. For example, the ``llvm/lib/Target/X86`` directory holds the - X86 machine description while ``llvm/lib/Target/ARM`` implements the ARM - backend. + Files describing target architectures for code generation. For example, + ``llvm/lib/Target/X86`` holds the X86 machine description. ``llvm/lib/CodeGen/`` - This directory contains the major parts of the code generator: Instruction - Selector, Instruction Scheduling, and Register Allocation. + The major parts of the code generator: Instruction Selector, Instruction + Scheduling, and Register Allocation. ``llvm/lib/MC/`` - (FIXME: T.B.D.) - -``llvm/lib/Debugger/`` - - This directory contains the source level debugger library that makes it - possible to instrument LLVM programs so that a debugger could identify source - code locations at which the program is executing. + (FIXME: T.B.D.) ....? ``llvm/lib/ExecutionEngine/`` - This directory contains libraries for executing LLVM bitcode directly at - runtime in both interpreted and JIT compiled fashions. + Libraries for directly executing bitcode at runtime in interpreted and + JIT-compiled scenarios. ``llvm/lib/Support/`` - This directory contains the source code that corresponds to the header files - located in ``llvm/include/ADT/`` and ``llvm/include/Support/``. + Source code corresponding to the header files in ``llvm/include/ADT/`` + and ``llvm/include/Support/``. 
``llvm/projects`` ----------------- -This directory contains projects that are not strictly part of LLVM but are -shipped with LLVM. This is also the directory where you should create your own -LLVM-based projects. - -``llvm/runtime`` ----------------- - -This directory contains libraries which are compiled into LLVM bitcode and used -when linking programs with the Clang front end. Most of these libraries are -skeleton versions of real libraries; for example, libc is a stripped down -version of glibc. - -Unlike the rest of the LLVM suite, this directory needs the LLVM GCC front end -to compile. +Projects not strictly part of LLVM but shipped with LLVM. This is also the +directory for creating your own LLVM-based projects which leverage the LLVM +build system. ``llvm/test`` ------------- -This directory contains feature and regression tests and other basic sanity -checks on the LLVM infrastructure. These are intended to run quickly and cover a -lot of territory without being exhaustive. +Feature and regression tests and other sanity checks on LLVM infrastructure. These +are intended to run quickly and cover a lot of territory without being exhaustive. ``test-suite`` -------------- -This is not a directory in the normal llvm module; it is a separate Subversion -module that must be checked out (usually to ``projects/test-suite``). This -module contains a comprehensive correctness, performance, and benchmarking test -suite for LLVM. It is a separate Subversion module because not every LLVM user -is interested in downloading or building such a comprehensive test suite. For -further details on this test suite, please see the :doc:`Testing Guide +A comprehensive correctness, performance, and benchmarking test suite for LLVM. +Comes in a separate Subversion module because not every LLVM user is interested +in such a comprehensive suite. For details see the :doc:`Testing Guide <TestingGuide>` document. .. 
_tools: @@ -1087,7 +1044,7 @@ further details on this test suite, please see the :doc:`Testing Guide ``llvm/tools`` -------------- -The **tools** directory contains the executables built out of the libraries +Executables built out of the libraries above, which form the main part of the user interface. You can always get help for a tool by typing ``tool_name -help``. The following is a brief introduction to the most important tools. More detailed information is in @@ -1135,72 +1092,67 @@ the `Command Guide <CommandGuide/index.html>`_. ``opt`` ``opt`` reads LLVM bitcode, applies a series of LLVM to LLVM transformations - (which are specified on the command line), and then outputs the resultant - bitcode. The '``opt -help``' command is a good way to get a list of the + (which are specified on the command line), and outputs the resultant + bitcode. '``opt -help``' is a good way to get a list of the program transformations available in LLVM. - ``opt`` can also be used to run a specific analysis on an input LLVM bitcode - file and print out the results. It is primarily useful for debugging + ``opt`` can also run a specific analysis on an input LLVM bitcode + file and print the results. Primarily useful for debugging analyses, or familiarizing yourself with what an analysis does. ``llvm/utils`` -------------- -This directory contains utilities for working with LLVM source code, and some of -the utilities are actually required as part of the build process because they -are code generators for parts of LLVM infrastructure. +Utilities for working with LLVM source code; some are part of the build process +because they are code generators for parts of the infrastructure. ``codegen-diff`` - ``codegen-diff`` is a script that finds differences between code that LLC - generates and code that LLI generates. This is a useful tool if you are + ``codegen-diff`` finds differences between code that LLC + generates and code that LLI generates. 
This is useful if you are debugging one of them, assuming that the other generates correct output. For the full user manual, run ```perldoc codegen-diff'``. ``emacs/`` - The ``emacs`` directory contains syntax-highlighting files which will work - with Emacs and XEmacs editors, providing syntax highlighting support for LLVM - assembly files and TableGen description files. For information on how to use - the syntax files, consult the ``README`` file in that directory. + Emacs and XEmacs syntax highlighting for LLVM assembly files and TableGen + description files. See the ``README`` for information on using them. ``getsrcs.sh`` - The ``getsrcs.sh`` script finds and outputs all non-generated source files, - which is useful if one wishes to do a lot of development across directories - and does not want to individually find each file. One way to use it is to run, - for example: ``xemacs `utils/getsources.sh``` from the top of your LLVM source + Finds and outputs all non-generated source files, + useful if one wishes to do a lot of development across directories + and does not want to find each file. One way to use it is to run, + for example: ``xemacs `utils/getsources.sh``` from the top of the LLVM source tree. ``llvmgrep`` - This little tool performs an ``egrep -H -n`` on each source file in LLVM and + Performs an ``egrep -H -n`` on each source file in LLVM and passes to it a regular expression provided on ``llvmgrep``'s command - line. This is a very efficient way of searching the source base for a + line. This is an efficient way of searching the source base for a particular regular expression. ``makellvm`` - The ``makellvm`` script compiles all files in the current directory and then + Compiles all files in the current directory, then compiles and links the tool that is the first argument. 
For example, assuming - you are in the directory ``llvm/lib/Target/Sparc``, if ``makellvm`` is in your - path, simply running ``makellvm llc`` will make a build of the current + you are in ``llvm/lib/Target/Sparc``, if ``makellvm`` is in your + path, running ``makellvm llc`` will make a build of the current directory, switch to directory ``llvm/tools/llc`` and build it, causing a re-linking of LLC. ``TableGen/`` - The ``TableGen`` directory contains the tool used to generate register + Contains the tool used to generate register descriptions, instruction set descriptions, and even assemblers from common TableGen description files. ``vim/`` - The ``vim`` directory contains syntax-highlighting files which will work with - the VIM editor, providing syntax highlighting support for LLVM assembly files - and TableGen description files. For information on how to use the syntax - files, consult the ``README`` file in that directory. + vim syntax-highlighting for LLVM assembly files + and TableGen description files. See the ``README`` for how to use them. .. _simple example: diff --git a/docs/GettingStartedVS.rst b/docs/GettingStartedVS.rst index 0ca50904ce44c..57ed875ca4f8d 100644 --- a/docs/GettingStartedVS.rst +++ b/docs/GettingStartedVS.rst @@ -45,10 +45,12 @@ approximately 3GB. Software -------- -You will need Visual Studio 2013 or higher. +You will need Visual Studio 2013 or higher, with the latest Update installed. You will also need the `CMake <http://www.cmake.org/>`_ build system since it -generates the project files you will use to build with. +generates the project files you will use to build with. CMake 2.8.12.2 is the +minimum required version for building with Visual Studio, though the latest +version of CMake is recommended. If you would like to run the LLVM tests you will need `Python <http://www.python.org/>`_. Version 2.7 and newer are known to work. 
You will @@ -91,6 +93,10 @@ Here's the short story for getting up and running quickly with LLVM: using LLVM. Another important option is ``LLVM_TARGETS_TO_BUILD``, which controls the LLVM target architectures that are included on the build. + * If CMake complains that it cannot find the compiler, make sure that + you have the Visual Studio C++ Tools installed, not just Visual Studio + itself (trying to create a C++ project in Visual Studio will generally + download the C++ tools if they haven't already been). * See the :doc:`LLVM CMake guide <CMake>` for detailed information about how to configure the LLVM build. * CMake generates project files for all build types. To select a specific diff --git a/docs/GoldPlugin.rst b/docs/GoldPlugin.rst index 6328934b37b34..88b944a2a0fdd 100644 --- a/docs/GoldPlugin.rst +++ b/docs/GoldPlugin.rst @@ -44,9 +44,7 @@ will either need to build gold or install a version with plugin support. the ``-plugin`` option. Running ``make`` will additionally build ``build/binutils/ar`` and ``nm-new`` binaries supporting plugins. -* Build the LLVMgold plugin. If building with autotools, run configure with - ``--with-binutils-include=/path/to/binutils/include`` and run ``make``. - If building with CMake, run cmake with +* Build the LLVMgold plugin. Run CMake with ``-DLLVM_BINUTILS_INCDIR=/path/to/binutils/include``. The correct include path will contain the file ``plugin-api.h``. diff --git a/docs/HowToCrossCompileLLVM.rst b/docs/HowToCrossCompileLLVM.rst index 1072517e4c2b6..e71c0b07a7a0e 100644 --- a/docs/HowToCrossCompileLLVM.rst +++ b/docs/HowToCrossCompileLLVM.rst @@ -39,6 +39,7 @@ For more information on how to configure CMake for LLVM/Clang, see :doc:`CMake`. 
The CMake options you need to add are: + * ``-DCMAKE_CROSSCOMPILING=True`` * ``-DCMAKE_INSTALL_PREFIX=<install-dir>`` * ``-DLLVM_TABLEGEN=<path-to-host-bin>/llvm-tblgen`` @@ -46,20 +47,40 @@ The CMake options you need to add are: * ``-DLLVM_DEFAULT_TARGET_TRIPLE=arm-linux-gnueabihf`` * ``-DLLVM_TARGET_ARCH=ARM`` * ``-DLLVM_TARGETS_TO_BUILD=ARM`` - * ``-DCMAKE_CXX_FLAGS='-target armv7a-linux-gnueabihf -mcpu=cortex-a9 -I/usr/arm-linux-gnueabihf/include/c++/4.7.2/arm-linux-gnueabihf/ -I/usr/arm-linux-gnueabihf/include/ -mfloat-abi=hard -ccc-gcc-name arm-linux-gnueabihf-gcc'`` + +If you're compiling with GCC, you can use architecture options for your target, +and the compiler driver will detect everything that it needs: + + * ``-DCMAKE_CXX_FLAGS='-march=armv7-a -mcpu=cortex-a9 -mfloat-abi=hard'`` + +However, if you're using Clang, the driver might not be up-to-date with your +specific Linux distribution, version or GCC layout, so you'll need to fudge. + +In addition to the ones above, you'll also need: + + * ``'-target arm-linux-gnueabihf'`` or whatever is the triple of your cross GCC. + * ``'--sysroot=/usr/arm-linux-gnueabihf'``, ``'--sysroot=/opt/gcc/arm-linux-gnueabihf'`` + or whatever is the location of your GCC's sysroot (where /lib, /bin etc are). + * Appropriate use of ``-I`` and ``-L``, depending on how the cross GCC is installed, + and where are the libraries and headers. The TableGen options are required to compile it with the host compiler, so you'll need to compile LLVM (or at least ``llvm-tblgen``) to your host -platform before you start. The CXX flags define the target, cpu (which +platform before you start. The CXX flags define the target, cpu (which in this case defaults to ``fpu=VFP3`` with NEON), and forcing the hard-float ABI. If you're -using Clang as a cross-compiler, you will *also* have to set ``-ccc-gcc-name``, +using Clang as a cross-compiler, you will *also* have to set ``--sysroot`` to make sure it picks the correct linker. 
+When using Clang, it's important that you choose the triple to be *identical* +to the GCC triple and the sysroot. This will make it easier for Clang to +find the correct tools and include headers. But that won't mean all headers and +libraries will be found. You'll still need to use ``-I`` and ``-L`` to locate +those extra ones, depending on your distribution. + Most of the time, what you want is to have a native compiler to the -platform itself, but not others. It might not even be feasible to -produce x86 binaries from ARM targets, so there's no point in compiling +platform itself, but not others. So there's rarely a point in compiling all back-ends. For that reason, you should also set the -``TARGETS_TO_BUILD`` to only build the ARM back-end. +``TARGETS_TO_BUILD`` to only build the back-end you're targeting. You must set the ``CMAKE_INSTALL_PREFIX``, otherwise a ``ninja install`` will copy ARM binaries to your root filesystem, which is not what you @@ -83,14 +104,23 @@ running CMake: This is not a problem, since Clang/LLVM libraries are statically linked anyway, it shouldn't affect much. -#. The ARM libraries won't be installed in your system, and possibly - not easily installable anyway, so you'll have to build/download - them separately. But the CMake prepare step, which checks for +#. The ARM libraries won't be installed in your system. + But the CMake prepare step, which checks for dependencies, will check the *host* libraries, not the *target* - ones. + ones. Below there's a list of some dependencies, but your project could + have more, or this document could be outdated. You'll see the errors + while linking as an indication of that. + + Debian-based distros have a way to add ``multiarch``, which adds + a new architecture and allows you to install packages for those + systems. See https://wiki.debian.org/Multiarch/HOWTO for more info.
+ + But not all distros will have that, and possibly no easy way to + install them anyway, so you'll have to build/download + them separately. A quick way of getting the libraries is to download them from - a distribution repository, like Debian (http://packages.debian.org/wheezy/), + a distribution repository, like Debian (http://packages.debian.org/jessie/), and download the missing libraries. Note that the ``libXXX`` will have the shared objects (``.so``) and the ``libXXX-dev`` will give you the headers and the static (``.a``) library. Just in diff --git a/docs/HowToReleaseLLVM.rst b/docs/HowToReleaseLLVM.rst index 33c547e97a889..d44ea04a9fafc 100644 --- a/docs/HowToReleaseLLVM.rst +++ b/docs/HowToReleaseLLVM.rst @@ -332,9 +332,26 @@ Below are the rules regarding patching the release branch: #. During the remaining rounds of testing, only patches that fix critical regressions may be applied. -#. For dot releases all patches must mantain both API and ABI compatibility with +#. For dot releases all patches must maintain both API and ABI compatibility with the previous major release. Only bugfixes will be accepted. +Merging Patches +^^^^^^^^^^^^^^^ + +The ``utils/release/merge.sh`` script can be used to merge individual revisions +into any one of the llvm projects. To merge revision ``$N`` into project +``$PROJ``, do: +#. ``svn co https://llvm.org/svn/llvm-project/$PROJ/branches/release_XX + $PROJ.src`` +#. ``$PROJ.src/utils/release/merge.sh --proj $PROJ --rev $N`` +#. Run regression tests. +#. ``cd $PROJ.src``. Run the ``svn commit`` command printed out by ``merge.sh`` + in step 2. + Release Final Tasks ------------------- diff --git a/docs/LLVMBuild.rst b/docs/LLVMBuild.rst index 58f6f4d20a041..0200f78bfb7f4 100644 --- a/docs/LLVMBuild.rst +++ b/docs/LLVMBuild.rst @@ -49,8 +49,7 @@ Build Integration The LLVMBuild files themselves are just a declarative way to describe the project structure.
The actual building of the LLVM project is -handled by another build system (currently we support both -:doc:`Makefiles <MakefileGuide>` and :doc:`CMake <CMake>`). +handled by another build system (See: :doc:`CMake <CMake>`). The build system implementation will load the relevant contents of the LLVMBuild files and use that to drive the actual project build. diff --git a/docs/LangRef.rst b/docs/LangRef.rst index 5f8a3a5a4a987..f6dda59fda255 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -250,6 +250,11 @@ linkage: together. This is the LLVM, typesafe, equivalent of having the system linker append together "sections" with identical names when .o files are linked. + + Unfortunately this doesn't correspond to any feature in .o files, so it + can only be used for variables like ``llvm.global_ctors`` which llvm + interprets specially. + ``extern_weak`` The semantics of this linkage follow the ELF object file model: the symbol is weak until linked, if not linked, the symbol becomes null @@ -427,6 +432,10 @@ added in the future: - On X86-64 the callee preserves all general purpose registers, except for RDI and RAX. +"``swiftcc``" - This calling convention is used for Swift language. + - On X86-64 RCX and R8 are available for additional integer returns, and + XMM2 and XMM3 are available for additional FP/vector returns. + - On iOS platforms, we use AAPCS-VFP calling convention. "``cc <n>``" - Numbered convention Any calling convention may be specified by number, allowing target-specific calling conventions to be used. Target specific @@ -580,6 +589,9 @@ initializer. Note that a constant with significant address *can* be merged with a ``unnamed_addr`` constant, the result being a constant whose address is significant. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + A global variable may be declared to reside in a target-specific numbered address space. 
For targets that support them, address spaces may affect how optimizations are performed and/or what target @@ -610,18 +622,20 @@ assume that the globals are densely packed in their section and try to iterate over them as an array, alignment padding would break this iteration. The maximum alignment is ``1 << 29``. -Globals can also have a :ref:`DLL storage class <dllstorageclass>`. +Globals can also have a :ref:`DLL storage class <dllstorageclass>` and +an optional list of attached :ref:`metadata <metadata>`, Variables and aliases can have a :ref:`Thread Local Storage Model <tls_model>`. Syntax:: - [@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] - [unnamed_addr] [AddrSpace] [ExternallyInitialized] + @<GlobalVarName> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] + [(unnamed_addr|local_unnamed_addr)] [AddrSpace] + [ExternallyInitialized] <global | constant> <Type> [<InitializerConstant>] [, section "name"] [, comdat [($name)]] - [, align <Alignment>] + [, align <Alignment>] (, !name !N)* For example, the following defines a global in a numbered address space with an initializer, section, and alignment: @@ -665,14 +679,14 @@ an optional list of attached :ref:`metadata <metadata>`, an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM function declarations consist of the "``declare``" keyword, an -optional :ref:`linkage type <linkage>`, an optional :ref:`visibility -style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, -an optional :ref:`calling convention <callingconv>`, -an optional ``unnamed_addr`` attribute, a return type, an optional -:ref:`parameter attribute <paramattrs>` for the return type, a function -name, a possibly empty list of arguments, an optional alignment, an optional -:ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, -and an optional :ref:`prologue <prologuedata>`. 
+optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style +<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an +optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` +or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter +attribute <paramattrs>` for the return type, a function name, a possibly +empty list of arguments, an optional alignment, an optional :ref:`garbage +collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional +:ref:`prologue <prologuedata>`. A function definition contains a list of basic blocks, forming the CFG (Control Flow Graph) for the function. Each basic block may optionally start with a label @@ -703,14 +717,17 @@ alignment. All alignments must be a power of 2. If the ``unnamed_addr`` attribute is given, the address is known to not be significant and two identical functions can be merged. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + Syntax:: define [linkage] [visibility] [DLLStorageClass] [cconv] [ret attrs] <ResultType> @<FunctionName> ([argument list]) - [unnamed_addr] [fn Attrs] [section "name"] [comdat [($name)]] - [align N] [gc] [prefix Constant] [prologue Constant] - [personality Constant] (!name !N)* { ... } + [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"] + [comdat [($name)]] [align N] [gc] [prefix Constant] + [prologue Constant] [personality Constant] (!name !N)* { ... 
} The argument list is a comma separated sequence of arguments where each argument is of the following form: @@ -737,7 +754,7 @@ Aliases may have an optional :ref:`linkage type <linkage>`, an optional Syntax:: - @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> + @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``, ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers @@ -747,6 +764,9 @@ Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point to the same content. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + Since aliases are only a second name, some restrictions apply, of which some can only be checked when producing an object file: @@ -760,6 +780,25 @@ some can only be checked when producing an object file: * No global value in the expression can be a declaration, since that would require a relocation, which is not possible. +.. _langref_ifunc: + +IFuncs +------- + +IFuncs, like aliases, don't create any new data or code. They are just a new +symbol that the dynamic linker resolves at runtime by calling a resolver function. + +IFuncs have a name and a resolver, a function called by the dynamic linker +that returns the address of another function associated with the name. + +An IFunc may have an optional :ref:`linkage type <linkage>` and an optional +:ref:`visibility style <visibility>`. + +Syntax:: + + @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver> + + ..
_langref_comdats: Comdats @@ -907,8 +946,7 @@ Currently, only the following parameter attributes are defined: ``zeroext`` This indicates to the code generator that the parameter or return value should be zero-extended to the extent required by the target's - ABI (which is usually 32-bits, but is 8-bits for a i1 on x86-64) by - the caller (for a parameter) or the callee (for a return value). + ABI by the caller (for a parameter) or the callee (for a return value). ``signext`` This indicates to the code generator that the parameter or return value should be sign-extended to the extent required by the target's @@ -1010,7 +1048,8 @@ Currently, only the following parameter attributes are defined: ``nocapture`` This indicates that the callee does not make any copies of the pointer that outlive the callee itself. This is not a valid - attribute for return values. + attribute for return values. Addresses used in volatile operations + are considered to be captured. .. _nest: @@ -1021,12 +1060,13 @@ Currently, only the following parameter attributes are defined: ``returned`` This indicates that the function always returns the argument as its return - value. This is an optimization hint to the code generator when generating - the caller, allowing tail call optimization and omission of register saves - and restores in some cases; it is not checked or enforced when generating - the callee. The parameter and the function return type must be valid - operands for the :ref:`bitcast instruction <i_bitcast>`. This is not a - valid attribute for return values and can only be applied to one parameter. + value. This is a hint to the optimizer and code generator used when + generating the caller, allowing value propagation, tail call optimization, + and omission of register saves and restores in some cases; it is not + checked or enforced when generating the callee. The parameter and the + function return type must be valid operands for the + :ref:`bitcast instruction <i_bitcast>`. 
This is not a valid attribute for + return values and can only be applied to one parameter. ``nonnull`` This indicates that the parameter or return pointer is not null. This @@ -1059,6 +1099,30 @@ Currently, only the following parameter attributes are defined: ``dereferenceable(<n>)``). This attribute may only be applied to pointer typed parameters. +``swiftself`` + This indicates that the parameter is the self/context parameter. This is not + a valid attribute for return values and can only be applied to one + parameter. + +``swifterror`` + This attribute is motivated to model and optimize Swift error handling. It + can be applied to a parameter with pointer to pointer type or a + pointer-sized alloca. At the call site, the actual argument that corresponds + to a ``swifterror`` parameter has to come from a ``swifterror`` alloca. A + ``swifterror`` value (either the parameter or the alloca) can only be loaded + and stored from, or used as a ``swifterror`` argument. This is not a valid + attribute for return values and can only be applied to one parameter. + + These constraints allow the calling convention to optimize access to + ``swifterror`` variables by associating them with a specific register at + call boundaries rather than placing them in memory. Since this does change + the calling convention, a function which uses the ``swifterror`` attribute + on a parameter is not ABI-compatible with one which does not. + + These constraints also allow LLVM to assume that a ``swifterror`` argument + does not alias any other memory visible within a function and that a + ``swifterror`` alloca passed as an argument does not escape. + .. _gc: Garbage Collector Strategy Names @@ -1223,6 +1287,15 @@ example: epilogue, the backend should forcibly align the stack pointer. Specify the desired alignment, which must be a power of two, in parentheses. 
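The ``swifterror`` rules above can be illustrated with a minimal sketch (the type ``%swift.error`` and the function names below are hypothetical, chosen only for illustration; only the placement of ``swifterror`` follows the rules described):

.. code-block:: llvm

    %swift.error = type opaque

    ; Callee: may report an error through its swifterror parameter.
    declare float @may_fail(float, %swift.error** swifterror)

    define float @caller(float %x) {
    entry:
      ; The actual argument for a swifterror parameter must come from a
      ; swifterror alloca.
      %err = alloca swifterror %swift.error*
      store %swift.error* null, %swift.error** %err
      %r = call float @may_fail(float %x, %swift.error** swifterror %err)
      ret float %r
    }

Note that because ``swifterror`` changes the calling convention, ``@may_fail`` is not ABI-compatible with an otherwise identical function that lacks the attribute.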
+``allocsize(<EltSizeParam>[, <NumEltsParam>])`` + This attribute indicates that the annotated function will always return at + least a given number of bytes (or null). Its arguments are zero-indexed + parameter numbers; if one argument is provided, then it's assumed that at + least ``CallSite.Args[EltSizeParam]`` bytes will be available at the + returned pointer. If two are provided, then it's assumed that + ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are + available. The referenced parameters must be integer types. No assumptions + are made about the contents of the returned block of memory. ``alwaysinline`` This attribute indicates that the inliner should attempt to inline this function into callers whenever possible, ignoring any active @@ -1239,10 +1312,26 @@ example: function call are also considered to be cold; and, thus, given low weight. ``convergent`` - This attribute indicates that the callee is dependent on a convergent - thread execution pattern under certain parallel execution models. - Transformations that are execution model agnostic may not make the execution - of a convergent operation control dependent on any additional values. + In some parallel execution models, there exist operations that cannot be + made control-dependent on any additional values. We call such operations + ``convergent``, and mark them with this attribute. + + The ``convergent`` attribute may appear on functions or call/invoke + instructions. When it appears on a function, it indicates that calls to + this function should not be made control-dependent on additional values. + For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so + calls to this intrinsic cannot be made control-dependent on additional + values. + + When it appears on a call/invoke, the ``convergent`` attribute indicates + that we should treat the call as though we're calling a convergent + function. 
This is particularly useful on indirect calls; without this we + may treat such calls as though the target is non-convergent. + + The optimizer may remove the ``convergent`` attribute on functions when it + can prove that the function does not execute any convergent operations. + Similarly, the optimizer may remove ``convergent`` on calls/invokes when it + can prove that the call/invoke cannot call a convergent function. ``inaccessiblememonly`` This attribute indicates that the function may only access memory that is not accessible by the module being compiled. This is a weaker form @@ -1334,6 +1423,31 @@ example: passes make choices that keep the code size of this function low, and otherwise do optimizations specifically to reduce code size as long as they do not significantly impact runtime performance. +``"patchable-function"`` + This attribute tells the code generator that the code + generated for this function needs to follow certain conventions that + make it possible for a runtime function to patch over it later. + The exact effect of this attribute depends on its string value, + for which there currently is one legal possibility: + + * ``"prologue-short-redirect"`` - This style of patchable + function is intended to support patching a function prologue to + redirect control away from the function in a thread-safe + manner. It guarantees that the first instruction of the + function will be large enough to accommodate a short jump + instruction, and will be sufficiently aligned to allow being + fully changed via an atomic compare-and-swap instruction. + While the first requirement can be satisfied by inserting a large + enough NOP, LLVM can and will try to re-purpose an existing + instruction (i.e. one that would have to be emitted anyway) as + a patchable instruction larger than a short jump. + + ``"prologue-short-redirect"`` is currently only supported on + x86-64. + + This attribute by itself does not imply restrictions on + inter-procedural optimizations. All of the semantic effects of the
All of the semantic effects the + patching may have to be separately conveyed via the linkage type. ``readnone`` On a function, this attribute indicates that the function computes its result (or decides to unwind an exception) based strictly on its arguments, @@ -1361,6 +1475,13 @@ example: On an argument, this attribute indicates that the function does not write through this pointer argument, even though it may write to the memory that the pointer points to. +``writeonly`` + On a function, this attribute indicates that the function may write to but + does not read from memory. + + On an argument, this attribute indicates that the function may write to but + does not read through this pointer argument (even though it may read from + the memory that the pointer points to). ``argmemonly`` This attribute indicates that the only memory accesses inside function are loads and stores from objects pointed to by its pointer-typed arguments, @@ -1511,7 +1632,7 @@ operand bundle to not miscompile programs containing it. ways before control is transferred to the callee or invokee. - Calls and invokes with operand bundles have unknown read / write effect on the heap on entry and exit (even if the call target is - ``readnone`` or ``readonly``), unless they're overriden with + ``readnone`` or ``readonly``), unless they're overridden with callsite specific attributes. - An operand bundle at a call site cannot change the implementation of the called function. Inter-procedural optimizations work as @@ -1519,6 +1640,8 @@ operand bundle to not miscompile programs containing it. More specific types of operand bundles are described below. +.. _deopt_opbundles: + Deoptimization Operand Bundles ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -1602,6 +1725,18 @@ it is undefined behavior to execute a ``call`` or ``invoke`` which: Similarly, if no funclet EH pads have been entered-but-not-yet-exited, executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior. 
+GC Transition Operand Bundles ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +GC transition operand bundles are characterized by the +``"gc-transition"`` operand bundle tag. These operand bundles mark a +call as a transition from a function with one GC strategy to a +function with a different GC strategy. If coordinating the transition +between GC strategies requires additional code generation at the call +site, these bundles may contain any values that are needed by the +generated code. For more details, see :ref:`GC Transitions +<gc_transition_args>`. + .. _moduleasm: Module-Level Inline Assembly @@ -2086,6 +2221,26 @@ function's scope. uselistorder i32 (i32) @bar, { 1, 0 } uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 } +.. _source_filename: + +Source Filename +--------------- + +The *source filename* string is set to the original module identifier, +which will be the name of the compiled source file when compiling from +source through the clang front end, for example. It is then preserved through +the IR and bitcode. + +This is currently necessary to generate a consistent unique global +identifier for local functions used in profile data, which prepends the +source file name to the local function name. + +The syntax for the source file name is simply: + +.. code-block:: llvm + + source_filename = "/path/to/source.c" + .. _typesystem: Type System @@ -3119,7 +3274,7 @@ the same register to an output and an input. If this is not safe (e.g. if the assembly contains two instructions, where the first writes to one output, and the second reads an input and writes to a second output), then the "``&``" modifier must be used (e.g. "``=&r``") to specify that the output is an -"early-clobber" output. Marking an ouput as "early-clobber" ensures that LLVM +"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM will not use the same register for any inputs (other than an input tied to this output).
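The early-clobber modifier discussed above might be used as follows (the mnemonics ``op1`` and ``op2`` are placeholders, not real instructions):

.. code-block:: llvm

    ; "=&r" marks the first output as early-clobber, so LLVM will not
    ; assign its register to the inputs %a or %b: the asm writes $0
    ; before it has finished reading $2 and $3.
    define i32 @ec(i32 %a, i32 %b) {
    entry:
      %pair = call { i32, i32 } asm "op1 $0, $2\0Aop2 $1, $3, $0",
          "=&r,=r,r,r"(i32 %a, i32 %b)
      %first = extractvalue { i32, i32 } %pair, 0
      ret i32 %first
    }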
@@ -3453,8 +3608,14 @@ SystemZ: - ``K``: An immediate signed 16-bit integer. - ``L``: An immediate signed 20-bit integer. - ``M``: An immediate integer 0x7fffffff. -- ``Q``, ``R``, ``S``, ``T``: A memory address operand, treated the same as - ``m``, at the moment. +- ``Q``: A memory address operand with a base address and a 12-bit immediate + unsigned displacement. +- ``R``: A memory address operand with a base address, a 12-bit immediate + unsigned displacement, and an index register. +- ``S``: A memory address operand with a base address and a 20-bit immediate + signed displacement. +- ``T``: A memory address operand with a base address, a 20-bit immediate + signed displacement, and an index register. - ``r`` or ``d``: A 32, 64, or 128-bit integer register. - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an address context evaluates as zero). @@ -3792,7 +3953,7 @@ references to them from instructions). !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang", isOptimized: true, flags: "-O2", runtimeVersion: 2, - splitDebugFilename: "abc.debug", emissionKind: 1, + splitDebugFilename: "abc.debug", emissionKind: FullDebug, enums: !2, retainedTypes: !3, subprograms: !4, globals: !5, imports: !6, macros: !7, dwoId: 0x0abcd) @@ -3878,21 +4039,28 @@ The following ``tag:`` values are valid: .. code-block:: llvm - DW_TAG_formal_parameter = 5 DW_TAG_member = 13 DW_TAG_pointer_type = 15 DW_TAG_reference_type = 16 DW_TAG_typedef = 22 + DW_TAG_inheritance = 28 DW_TAG_ptr_to_member_type = 31 DW_TAG_const_type = 38 + DW_TAG_friend = 42 DW_TAG_volatile_type = 53 DW_TAG_restrict_type = 55 +.. _DIDerivedTypeMember: + ``DW_TAG_member`` is used to define a member of a :ref:`composite type -<DICompositeType>` or :ref:`subprogram <DISubprogram>`. The type of the member -is the ``baseType:``. The ``offset:`` is the member's bit offset. -``DW_TAG_formal_parameter`` is used to define a member which is a formal -argument of a subprogram. 
+<DICompositeType>`. The type of the member is the ``baseType:``. The
+``offset:`` is the member's bit offset. If the composite type has an ODR
+``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
+uniqued based only on its ``name:`` and ``scope:``.
+
+``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
+field of :ref:`composite types <DICompositeType>` to describe parents and
+friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

@@ -3911,9 +4079,15 @@ DICompositeType

structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
-identifier used for type merging between modules. When specified, other types
-can refer to composite types indirectly via a :ref:`metadata string
-<metadata-string>` that matches their identifier.
+identifier used for type merging between modules. When specified,
+:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
+derived types <DIDerivedTypeMember>` that reference the ODR-type in their
+``scope:`` change uniquing rules.
+
+For a given ``identifier:``, there should only be a single composite type that
+does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
+together will unique such definitions at parse time via the ``identifier:``
+field, even if the nodes are ``distinct``.

.. code-block:: llvm

@@ -3933,9 +4107,6 @@ The following ``tag:`` values are valid:

   DW_TAG_enumeration_type   = 4
   DW_TAG_structure_type     = 19
   DW_TAG_union_type         = 23
-  DW_TAG_subroutine_type    = 21
-  DW_TAG_inheritance        = 28
-

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
@@ -3949,7 +4120,9 @@ value for the set.
All enumeration type descriptors are collected in the

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
-<DIDerivedType>` with ``tag: DW_TAG_member`` or ``tag: DW_TAG_inheritance``.
+<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
+``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
+``isDefinition: false``.

.. _DISubrange:

@@ -4038,6 +4211,14 @@ metadata.

The ``variables:`` field points at :ref:`variables <DILocalVariable>` that
must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at an :ref:`DISubroutineType`.

+.. _DISubprogramDeclaration:
+
+When ``isDefinition: false``, subprograms describe a declaration in the type
+tree as opposed to a definition of a function. If the scope is a composite
+type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
+then the subprogram declaration is uniqued based only on its ``linkageName:``
+and ``scope:``.
+
.. code-block:: llvm

   define void @_Z3foov() !dbg !0 {

@@ -4046,7 +4227,7 @@ the IR. The ``type:`` field must point at an :ref:`DISubroutineType`.

   !0 = distinct !DISubprogram(name: "foo", linkageName: "_Zfoov", scope: !1,
                               file: !2, line: 7, type: !3, isLocal: true,
-                              isDefinition: false, scopeLine: 8,
+                              isDefinition: true, scopeLine: 8,
                               containingType: !4,
                               virtuality: DW_VIRTUALITY_pure_virtual,
                               virtualIndex: 10, flags: DIFlagPrototyped,

@@ -4165,7 +4346,7 @@ DIMacro

``DIMacro`` nodes represent definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when
-definining a function-like macro, and the ``value`` field is the token-string
+defining a function-like macro, and the ``value`` field is the token-string
used to expand the macro identifier.

.. code-block:: llvm

@@ -4262,12 +4443,20 @@ instructions (loads, stores, memory-accessing calls, etc.)
that carry ``noalias`` metadata can specifically be specified not to alias with some other collection of memory access instructions that carry ``alias.scope`` metadata. Each type of metadata specifies a list of scopes where each scope has an id and -a domain. When evaluating an aliasing query, if for some domain, the set +a domain. + +When evaluating an aliasing query, if for some domain, the set of scopes with that domain in one instruction's ``alias.scope`` list is a subset of (or equal to) the set of scopes for that domain in another instruction's ``noalias`` list, then the two memory accesses are assumed not to alias. +Because scopes in one domain don't affect scopes in other domains, separate +domains can be used to compose multiple independent noalias sets. This is +used for example during inlining. As the noalias function parameters are +turned into noalias scope metadata, a new domain is used every time the +function is inlined. + The metadata identifying each domain is itself a list containing one or two entries. The first entry is the name of the domain. Note that if the name is a string then it can be combined across functions and translation units. A @@ -4329,8 +4518,8 @@ it. ULP is defined as follows: distance between the two non-equal finite floating-point numbers nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``. -The metadata node shall consist of a single positive floating point -number representing the maximum relative error, for example: +The metadata node shall consist of a single positive float type number +representing the maximum relative error, for example: .. code-block:: llvm @@ -4542,6 +4731,38 @@ For example: !0 = !{!"llvm.loop.unroll.full"} +'``llvm.loop.licm_versioning.disable``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This metadata indicates that the loop should not be versioned for the purpose +of enabling loop-invariant code motion (LICM). 
The metadata has a single operand
+which is the string ``llvm.loop.licm_versioning.disable``. For example:
+
+.. code-block:: llvm
+
+   !0 = !{!"llvm.loop.licm_versioning.disable"}
+
+'``llvm.loop.distribute.enable``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Loop distribution allows splitting a loop into multiple loops. Currently,
+this is only performed if the entire loop cannot be vectorized due to unsafe
+memory dependencies. The transformation will attempt to isolate the unsafe
+dependencies into their own loop.
+
+This metadata can be used to selectively enable or disable distribution of the
+loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
+second operand is a bit. If the bit operand value is 1, distribution is
+enabled. A value of 0 disables distribution:
+
+.. code-block:: llvm
+
+   !0 = !{!"llvm.loop.distribute.enable", i1 0}
+   !1 = !{!"llvm.loop.distribute.enable", i1 1}
+
+This metadata should be used in conjunction with ``llvm.loop`` loop
+identification metadata.
+
'``llvm.mem``'
^^^^^^^^^^^^^^^

@@ -4555,7 +4776,8 @@

The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops. The
metadata is attached to memory accessing instructions and denotes that no
loop carried memory dependence exists between it and other instructions denoted
-with the same loop identifier.
+with the same loop identifier. The metadata on memory reads also implies that
+if conversion (i.e. speculative execution within a loop iteration) is safe.
Precisely, given two instructions ``m1`` and ``m2`` that both have the ``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the @@ -4625,12 +4847,6 @@ the loop identifier metadata node directly: !1 = !{!1} ; an identifier for the inner loop !2 = !{!2} ; an identifier for the outer loop -'``llvm.bitsets``' -^^^^^^^^^^^^^^^^^^ - -The ``llvm.bitsets`` global metadata is used to implement -:doc:`bitsets <BitSets>`. - '``invariant.group``' Metadata ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -5267,7 +5483,7 @@ Syntax: :: - <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs] + <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [operand bundles] to label <normal label> unwind label <exception label> Overview: @@ -5303,12 +5519,16 @@ This instruction requires several arguments: #. The optional :ref:`Parameter Attributes <paramattrs>` list for return values. Only '``zeroext``', '``signext``', and '``inreg``' attributes are valid here. -#. '``ptr to function ty``': shall be the signature of the pointer to - function value being invoked. In most cases, this is a direct - function invocation, but indirect ``invoke``'s are just as possible, - branching off an arbitrary pointer to function value. -#. '``function ptr val``': An LLVM value containing a pointer to a - function to be invoked. +#. '``ty``': the type of the call instruction itself which is also the + type of the return value. Functions that return no value are marked + ``void``. +#. '``fnty``': shall be the signature of the function being invoked. The + argument types must match the types implied by this signature. This + type can be omitted if the function is not varargs. +#. '``fnptrval``': An LLVM value containing a pointer to a function to + be invoked. In most cases, this is a direct function invocation, but + indirect ``invoke``'s are just as possible, calling an arbitrary pointer + to function value. #. 
'``function args``': argument list whose types match the function signature argument types and parameter attributes. All arguments must be of :ref:`first class <t_firstclass>` type. If the function signature @@ -6767,7 +6987,7 @@ Syntax: :: <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>] - <result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] + <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] !<index> = !{ i32 1 } !<deref_bytes_node> = !{i64 <dereferenceable_bytes>} !<align_node> = !{ i64 <value_alignment> } @@ -6780,12 +7000,12 @@ The '``load``' instruction is used to read from memory. Arguments: """""""""" -The argument to the ``load`` instruction specifies the memory address -from which to load. The type specified must be a :ref:`first -class <t_firstclass>` type. If the ``load`` is marked as ``volatile``, -then the optimizer is not allowed to modify the number or order of -execution of this ``load`` with other :ref:`volatile -operations <volatile>`. +The argument to the ``load`` instruction specifies the memory address from which +to load. The type specified must be a :ref:`first class <t_firstclass>` type of +known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If +the ``load`` is marked as ``volatile``, then the optimizer is not allowed to +modify the number or order of execution of this ``load`` with other +:ref:`volatile operations <volatile>`. If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering <ordering>` and optional ``singlethread`` argument. The ``release`` and @@ -6805,7 +7025,12 @@ alignment for the target. 
It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe. The -maximum possible alignment is ``1 << 29``. +maximum possible alignment is ``1 << 29``. An alignment value higher +than the size of the loaded type implies memory up to the alignment +value bytes can be safely loaded without trapping in the default +address space. Access of the high bytes can interfere with debugging +tools, so should not be accessed if the function has the +``sanitize_thread`` or ``sanitize_address`` attributes. The optional ``!nontemporal`` metadata must reference a single metadata name ``<index>`` corresponding to a metadata node with one @@ -6903,13 +7128,14 @@ The '``store``' instruction is used to write to memory. Arguments: """""""""" -There are two arguments to the ``store`` instruction: a value to store -and an address at which to store it. The type of the ``<pointer>`` -operand must be a pointer to the :ref:`first class <t_firstclass>` type of -the ``<value>`` operand. If the ``store`` is marked as ``volatile``, -then the optimizer is not allowed to modify the number or order of -execution of this ``store`` with other :ref:`volatile -operations <volatile>`. +There are two arguments to the ``store`` instruction: a value to store and an +address at which to store it. The type of the ``<pointer>`` operand must be a +pointer to the :ref:`first class <t_firstclass>` type of the ``<value>`` +operand. If the ``store`` is marked as ``volatile``, then the optimizer is not +allowed to modify the number or order of execution of this ``store`` with other +:ref:`volatile operations <volatile>`. Only values of :ref:`first class +<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque +structural type <t_opaque>`) can be stored. 
If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``singlethread`` argument. The ``acquire`` and
@@ -6929,7 +7155,14 @@ alignment for the target.
It is the responsibility of the code emitter to ensure that the
alignment information is correct. Overestimating the alignment results
in undefined behavior. Underestimating the alignment may produce less
efficient code. An alignment of 1 is always
-safe. The maximum possible alignment is ``1 << 29``.
+safe. The maximum possible alignment is ``1 << 29``. An alignment
+value higher than the size of the stored type implies memory up to the
+alignment value bytes can be stored to without trapping in the default
+address space. Storing to the higher bytes however may result in data
+races if another thread can access the same address. Introducing a
+data race is not allowed. Storing to the extra bytes is not allowed
+even in situations where a data race is known to not exist if the
+function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
@@ -7044,13 +7277,13 @@ Arguments:

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
-are equal.
The type of '<cmp>' must be an integer or pointer type whose +bit width is a power of two greater than or equal to eight and less +than or equal to a target-specific size limit. '<cmp>' and '<new>' must +have the same type, and the type of '<pointer>' must be a pointer to +that type. If the ``cmpxchg`` is marked as ``volatile``, then the +optimizer is not allowed to modify the number or order of execution of +this ``cmpxchg`` with other :ref:`volatile operations <volatile>`. The success and failure :ref:`ordering <ordering>` arguments specify how this ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters @@ -7091,11 +7324,11 @@ Example: .. code-block:: llvm entry: - %orig = atomic load i32, i32* %ptr unordered ; yields i32 + %orig = load atomic i32, i32* %ptr unordered, align 4 ; yields i32 br label %loop loop: - %cmp = phi i32 [ %orig, %entry ], [%old, %loop] + %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop] %squared = mul i32 %cmp, %cmp %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 } %value_loaded = extractvalue { i32, i1 } %val_success, 0 @@ -7977,7 +8210,7 @@ Arguments: The '``icmp``' instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is -not a value, just a keyword. The possible condition code are: +not a value, just a keyword. The possible condition codes are: #. ``eq``: equal #. ``ne``: not equal @@ -8041,9 +8274,6 @@ Example: <result> = icmp ule i16 -4, 5 ; yields: result=false <result> = icmp sge i16 4, 5 ; yields: result=false -Note that the code generator does not yet support vector types with the -``icmp`` instruction. - .. _i_fcmp: '``fcmp``' Instruction @@ -8074,7 +8304,7 @@ Arguments: The '``fcmp``' instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is -not a value, just a keyword. 
The possible condition code are: +not a value, just a keyword. The possible condition codes are: #. ``false``: no comparison, always returns false #. ``oeq``: ordered and equal @@ -8156,9 +8386,6 @@ Example: <result> = fcmp olt float 4.0, 5.0 ; yields: result=true <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false -Note that the code generator does not yet support vector types with the -``fcmp`` instruction. - .. _i_phi: '``phi``' Instruction @@ -8270,7 +8497,7 @@ Syntax: :: - <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs] + <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ] Overview: @@ -8343,13 +8570,11 @@ This instruction requires several arguments: #. '``ty``': the type of the call instruction itself which is also the type of the return value. Functions that return no value are marked ``void``. -#. '``fnty``': shall be the signature of the pointer to function value - being invoked. The argument types must match the types implied by - this signature. This type can be omitted if the function is not - varargs and if the function type does not return a pointer to a - function. +#. '``fnty``': shall be the signature of the function being called. The + argument types must match the types implied by this signature. This + type can be omitted if the function is not varargs. #. '``fnptrval``': An LLVM value containing a pointer to a function to - be invoked. In most cases, this is a direct function invocation, but + be called. In most cases, this is a direct function call, but indirect ``call``'s are just as possible, calling an arbitrary pointer to function value. #. 
'``function args``': argument list whose types match the function @@ -8358,8 +8583,8 @@ This instruction requires several arguments: indicates the function accepts a variable number of arguments, the extra arguments can be specified. #. The optional :ref:`function attributes <fnattrs>` list. Only - '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' - attributes are valid here. + '``noreturn``', '``nounwind``', '``readonly``' , '``readnone``', + and '``convergent``' attributes are valid here. #. The optional :ref:`operand bundles <opbundles>` list. Semantics: @@ -9497,6 +9722,33 @@ pass will generate the appropriate data structures and replace the ``llvm.instrprof_value_profile`` intrinsic with the call to the profile runtime library with proper arguments. +'``llvm.thread.pointer``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i8* @llvm.thread.pointer() + +Overview: +""""""""" + +The '``llvm.thread.pointer``' intrinsic returns the value of the thread +pointer. + +Semantics: +"""""""""" + +The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area +for the current thread. The exact semantics of this value are target +specific: it may point to the start of TLS area, to the end, or somewhere +in the middle. Depending on the target, this intrinsic may read a register, +call a helper function, read from an alternate memory space, or perform +other operations necessary to locate the TLS area. Not all targets support +this intrinsic. + Standard C Library Intrinsics ----------------------------- @@ -10459,8 +10711,8 @@ Overview: """"""""" The '``llvm.bitreverse``' family of intrinsics is used to reverse the -bitpattern of an integer value; for example ``0b1234567`` becomes -``0b7654321``. +bitpattern of an integer value; for example ``0b10110110`` becomes +``0b01101101``. Semantics: """""""""" @@ -10558,7 +10810,7 @@ targets support all bit widths or vector types, however. 
declare i32 @llvm.ctlz.i32 (i32 <src>, i1 <is_zero_undef>) declare i64 @llvm.ctlz.i64 (i64 <src>, i1 <is_zero_undef>) declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>) - declase <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) + declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) Overview: """"""""" @@ -10605,7 +10857,7 @@ support all bit widths or vector types, however. declare i32 @llvm.cttz.i32 (i32 <src>, i1 <is_zero_undef>) declare i64 @llvm.cttz.i64 (i64 <src>, i1 <is_zero_undef>) declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>) - declase <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) + declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) Overview: """"""""" @@ -10640,7 +10892,26 @@ then the result is the size in bits of the type of ``src`` if Arithmetic with Overflow Intrinsics ----------------------------------- -LLVM provides intrinsics for some arithmetic with overflow operations. +LLVM provides intrinsics for fast arithmetic overflow checking. + +Each of these intrinsics returns a two-element struct. The first +element of this struct contains the result of the corresponding +arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of +the result. Therefore, for example, the first element of the struct +returned by ``llvm.sadd.with.overflow.i32`` is always the same as the +result of a 32-bit ``add`` instruction with the same operands, where +the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag. + +The second element of the result is an ``i1`` that is 1 if the +arithmetic operation overflowed and 0 otherwise. An operation +overflows if, for any values of its operands ``A`` and ``B`` and for +any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is +not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is +``sext`` for signed overflow and ``zext`` for unsigned overflow, and +``op`` is the underlying arithmetic operation. 
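As a short sketch of the struct-returning form described above, a signed 8-bit addition of 100 and 100 wraps modulo 2\ :sup:`8` and sets the overflow bit:

.. code-block:: llvm

    ; %sum holds 100 + 100 modulo 2^8 (the signed value -56); %obit is
    ; true because sext(100) + sext(100) = 200 does not fit in an i8
    %res  = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 100, i8 100)
    %sum  = extractvalue {i8, i1} %res, 0
    %obit = extractvalue {i8, i1} %res, 1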
+ +The behavior of these intrinsics is well-defined for all argument +values. '``llvm.sadd.with.overflow.*``' Intrinsics ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10980,7 +11251,7 @@ Examples of non-canonical encodings: - Many normal decimal floating point numbers have non-canonical alternative encodings. - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values. - These are treated as non-canonical encodings of zero and with be flushed to + These are treated as non-canonical encodings of zero and will be flushed to a zero of the same sign by this operation. Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with @@ -11304,12 +11575,12 @@ This is an overloaded intrinsic. The loaded data is a vector of any integer, flo :: - declare <16 x float> @llvm.masked.load.v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) - declare <2 x double> @llvm.masked.load.v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) + declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) + declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) ;; The data is a vector of pointers to double - declare <8 x double*> @llvm.masked.load.v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>) + declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>) ;; The data is a vector of function pointers - declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>) + declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>) Overview: """"""""" @@ -11332,7 +11603,7 @@ 
The result of this operation is equivalent to a regular vector load instruction :: - %res = call <16 x float> @llvm.masked.load.v16f32 (<16 x float>* %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru) + %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru) ;; The result of the two following instructions is identical aside from potential memory access exception %loadlal = load <16 x float>, <16 x float>* %ptr, align 4 @@ -11349,12 +11620,12 @@ This is an overloaded intrinsic. The data stored in memory is a vector of any in :: - declare void @llvm.masked.store.v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) - declare void @llvm.masked.store.v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) + declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) + declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) ;; The data is a vector of pointers to double - declare void @llvm.masked.store.v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>) + declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>) ;; The data is a vector of function pointers - declare void @llvm.masked.store.v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>) + declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>) Overview: """"""""" @@ -11375,7 +11646,7 @@ The result of this operation is equivalent to a load-modify-store sequence. 
Howe :: - call void @llvm.masked.store.v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask) + call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask) ;; The result of the following instructions is identical aside from potential data races and memory access exceptions %oldval = load <16 x float>, <16 x float>* %ptr, align 4 @@ -11475,7 +11746,7 @@ The '``llvm.masked.scatter``' intrinsics is designed for writing selected vector :: - ;; This instruction unconditionaly stores data vector in multiple addresses + ;; This instruction unconditionally stores data vector in multiple addresses call @llvm.masked.scatter.v8i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>) ;; It is equivalent to a list of scalar stores @@ -11859,43 +12130,40 @@ checked against the original guard by ``llvm.stackprotectorcheck``. If they are different, then ``llvm.stackprotectorcheck`` causes the program to abort by calling the ``__stack_chk_fail()`` function. -'``llvm.stackprotectorcheck``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +'``llvm.stackguard``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" :: - declare void @llvm.stackprotectorcheck(i8** <guard>) + declare i8* @llvm.stackguard() Overview: """"""""" -The ``llvm.stackprotectorcheck`` intrinsic compares ``guard`` against an already -created stack protector and if they are not equal calls the -``__stack_chk_fail()`` function. +The ``llvm.stackguard`` intrinsic returns the system stack guard value. + +It should not be generated by frontends, since it is only for internal usage. +The reason why we create this intrinsic is that we still support IR form Stack +Protector in FastISel. Arguments: """""""""" -The ``llvm.stackprotectorcheck`` intrinsic requires one pointer argument, the -the variable ``@__stack_chk_guard``. +None. 
Semantics: """""""""" -This intrinsic is provided to perform the stack protector check by comparing -``guard`` with the stack slot created by ``llvm.stackprotector`` and if the -values do not match call the ``__stack_chk_fail()`` function. +On some platforms, the value returned by this intrinsic remains unchanged +between loads in the same thread. On other platforms, it returns the same +global variable value, if any, e.g. ``@__stack_chk_guard``. -The reason to provide this as an IR level intrinsic instead of implementing it -via other IR operations is that in order to perform this operation at the IR -level without an intrinsic, one would need to create additional basic blocks to -handle the success/failure cases. This makes it difficult to stop the stack -protector check from disrupting sibling tail calls in Codegen. With this -intrinsic, we are able to generate the stack protector basic blocks late in -codegen after the tail call decision has occurred. +Currently some platforms have IR-level customized stack guard loading (e.g. +X86 Linux) that is not handled by ``llvm.stackguard()``, while they should be +in the future. '``llvm.objectsize``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -12010,9 +12278,9 @@ sufficient overall improvement in code quality. For this reason, that the optimizer can otherwise deduce or facts that are of little use to the optimizer. -.. _bitset.test: +.. _type.test: -'``llvm.bitset.test``' Intrinsic +'``llvm.type.test``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: @@ -12020,20 +12288,74 @@ Syntax: :: - declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone + declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone Arguments: """""""""" The first argument is a pointer to be tested. The second argument is a -metadata object representing an identifier for a :doc:`bitset <BitSets>`. +metadata object representing a :doc:`type identifier <TypeMetadata>`. 
Overview: """"""""" -The ``llvm.bitset.test`` intrinsic tests whether the given pointer is a -member of the given bitset. +The ``llvm.type.test`` intrinsic tests whether the given pointer is associated +with the given type identifier. + +'``llvm.type.checked.load``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly + + +Arguments: +"""""""""" + +The first argument is a pointer from which to load a function pointer. The +second argument is the byte offset from which to load the function pointer. The +third argument is a metadata object representing a :doc:`type identifier +<TypeMetadata>`. + +Overview: +""""""""" + +The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a +virtual table pointer using type metadata. This intrinsic is used to implement +control flow integrity in conjunction with virtual call optimization. The +virtual call optimization pass will optimize away ``llvm.type.checked.load`` +intrinsics associated with devirtualized calls, thereby removing the type +check in cases where it is not needed to enforce the control flow integrity +constraint. + +If the given pointer is associated with a type metadata identifier, this +function returns true as the second element of its return value. (Note that +the function may also return true if the given pointer is not associated +with a type metadata identifier.) If the function's return value's second +element is true, the following rules apply to the first element: + +- If the given pointer is associated with the given type metadata identifier, + it is the function pointer loaded from the given byte offset from the given + pointer. + +- If the given pointer is not associated with the given type metadata + identifier, it is one of the following (the choice of which is unspecified): + + 1. 
The function pointer that would have been loaded from an arbitrarily chosen
+     (through an unspecified mechanism) pointer associated with the type
+     metadata.
+
+  2. If the function has a non-void return type, a pointer to a function that
+     returns an unspecified value without causing side effects.
+
+If the function's return value's second element is false, the value of the
+first element is undefined.
+
 '``llvm.donothing``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -12049,8 +12371,9 @@ Overview:
 """""""""
 
 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
-two intrinsics (besides ``llvm.experimental.patchpoint``) that can be called
-with an invoke instruction.
+three intrinsics (besides ``llvm.experimental.patchpoint`` and
+``llvm.experimental.gc.statepoint``) that can be called with an invoke
+instruction.
 
 Arguments:
 """"""""""
@@ -12063,6 +12386,155 @@ Semantics:
 
 This intrinsic does nothing, and it's removed by optimizers and ignored by
 codegen.
 
+'``llvm.experimental.deoptimize``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
+
+Overview:
+"""""""""
+
+This intrinsic, together with :ref:`deoptimization operand bundles
+<deopt_opbundles>`, allows frontends to express transfer of control and
+frame-local state from the currently executing (typically more specialized,
+hence faster) version of a function into another (typically more generic, hence
+slower) version.
+
+In languages with a fully integrated managed runtime like Java and JavaScript
+this intrinsic can be used to implement "uncommon trap" or "side exit" like
+functionality. In unmanaged languages like C and C++, this intrinsic can be
+used to represent the slow paths of specialized functions.
+
+
+Arguments:
+""""""""""
+
+The intrinsic takes an arbitrary number of arguments, whose meaning is
+decided by the :ref:`lowering strategy<deoptimize_lowering>`.
+ +Semantics: +"""""""""" + +The ``@llvm.experimental.deoptimize`` intrinsic executes an attached +deoptimization continuation (denoted using a :ref:`deoptimization +operand bundle <deopt_opbundles>`) and returns the value returned by +the deoptimization continuation. Defining the semantic properties of +the continuation itself is out of scope of the language reference -- +as far as LLVM is concerned, the deoptimization continuation can +invoke arbitrary side effects, including reading from and writing to +the entire heap. + +Deoptimization continuations expressed using ``"deopt"`` operand bundles always +continue execution to the end of the physical frame containing them, so all +calls to ``@llvm.experimental.deoptimize`` must be in "tail position": + + - ``@llvm.experimental.deoptimize`` cannot be invoked. + - The call must immediately precede a :ref:`ret <i_ret>` instruction. + - The ``ret`` instruction must return the value produced by the + ``@llvm.experimental.deoptimize`` call if there is one, or void. + +Note that the above restrictions imply that the return type for a call to +``@llvm.experimental.deoptimize`` will match the return type of its immediate +caller. + +The inliner composes the ``"deopt"`` continuations of the caller into the +``"deopt"`` continuations present in the inlinee, and also updates calls to this +intrinsic to return directly from the frame of the function it inlined into. + +All declarations of ``@llvm.experimental.deoptimize`` must share the +same calling convention. + +.. _deoptimize_lowering: + +Lowering: +""""""""" + +Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the +symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to +ensure that this symbol is defined). The call arguments to +``@llvm.experimental.deoptimize`` are lowered as if they were formal +arguments of the specified types, and not as varargs. 
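
As an illustration of the tail-position rules above, a minimal hypothetical
example (the function name and values are invented for this sketch) might look
like:

.. code-block:: llvm

    define i32 @specialized_fn(i32 %x) {
    entry:
      %assumption_broken = icmp slt i32 %x, 0
      br i1 %assumption_broken, label %deopt, label %fast

    deopt:    ; transfer control and frame state to the generic version
      %r = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
      ret i32 %r    ; the ret must immediately follow the call and return its value

    fast:
      %y = add i32 %x, 1
      ret i32 %y
    }

    declare i32 @llvm.experimental.deoptimize.i32(...)

Note the name mangling: the overloaded intrinsic is declared with a suffix for
its return type, and the call site's return type matches that of its caller.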
+ + +'``llvm.experimental.guard``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ] + +Overview: +""""""""" + +This intrinsic, together with :ref:`deoptimization operand bundles +<deopt_opbundles>`, allows frontends to express guards or checks on +optimistic assumptions made during compilation. The semantics of +``@llvm.experimental.guard`` is defined in terms of +``@llvm.experimental.deoptimize`` -- its body is defined to be +equivalent to: + +.. code-block:: llvm + + define void @llvm.experimental.guard(i1 %pred, <args...>) { + %realPred = and i1 %pred, undef + br i1 %realPred, label %continue, label %leave [, !make.implicit !{}] + + leave: + call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ] + ret void + + continue: + ret void + } + + +with the optional ``[, !make.implicit !{}]`` present if and only if it +is present on the call site. For more details on ``!make.implicit``, +see :doc:`FaultMaps`. + +In words, ``@llvm.experimental.guard`` executes the attached +``"deopt"`` continuation if (but **not** only if) its first argument +is ``false``. Since the optimizer is allowed to replace the ``undef`` +with an arbitrary value, it can optimize guard to fail "spuriously", +i.e. without the original condition being false (hence the "not only +if"); and this allows for "check widening" type optimizations. + +``@llvm.experimental.guard`` cannot be invoked. + + +'``llvm.load.relative``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly + +Overview: +""""""""" + +This intrinsic loads a 32-bit value from the address ``%ptr + %offset``, +adds ``%ptr`` to that value and returns it. 
The constant folder specifically +recognizes the form of this intrinsic and the constant initializers it may +load from; if a loaded constant initializer is known to have the form +``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. + +LLVM provides that the calculation of such a constant initializer will +not overflow at link time under the medium code model if ``x`` is an +``unnamed_addr`` function. However, it does not provide this guarantee for +a constant initializer folded into a function body. This intrinsic can be +used to avoid the possibility of overflows when loading from such a constant. + Stack Map Intrinsics -------------------- diff --git a/docs/LibFuzzer.rst b/docs/LibFuzzer.rst index 84adff3616f7d..92937c2d0b529 100644 --- a/docs/LibFuzzer.rst +++ b/docs/LibFuzzer.rst @@ -1,90 +1,373 @@ -======================================================== -LibFuzzer -- a library for coverage-guided fuzz testing. -======================================================== +======================================================= +libFuzzer – a library for coverage-guided fuzz testing. +======================================================= .. contents:: :local: - :depth: 4 + :depth: 1 Introduction ============ -This library is intended primarily for in-process coverage-guided fuzz testing -(fuzzing) of other libraries. The typical workflow looks like this: - -* Build the Fuzzer library as a static archive (or just a set of .o files). - Note that the Fuzzer contains the main() function. - Preferably do *not* use sanitizers while building the Fuzzer. -* Build the library you are going to test with - `-fsanitize-coverage={bb,edge}[,indirect-calls,8bit-counters]` - and one of the sanitizers. We recommend to build the library in several - different modes (e.g. asan, msan, lsan, ubsan, etc) and even using different - optimizations options (e.g. -O0, -O1, -O2) to diversify testing. -* Build a test driver using the same options as the library. 
- The test driver is a C/C++ file containing interesting calls to the library - inside a single function ``extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);``. - Currently, the only expected return value is 0, others are reserved for future. -* Link the Fuzzer, the library and the driver together into an executable - using the same sanitizer options as for the library. -* Collect the initial corpus of inputs for the - fuzzer (a directory with test inputs, one file per input). - The better your inputs are the faster you will find something interesting. - Also try to keep your inputs small, otherwise the Fuzzer will run too slow. - By default, the Fuzzer limits the size of every input to 64 bytes - (use ``-max_len=N`` to override). -* Run the fuzzer with the test corpus. As new interesting test cases are - discovered they will be added to the corpus. If a bug is discovered by - the sanitizer (asan, etc) it will be reported as usual and the reproducer - will be written to disk. - Each Fuzzer process is single-threaded (unless the library starts its own - threads). You can run the Fuzzer on the same corpus in multiple processes - in parallel. - - -The Fuzzer is similar in concept to AFL_, -but uses in-process Fuzzing, which is more fragile, more restrictive, but -potentially much faster as it has no overhead for process start-up. -It uses LLVM's SanitizerCoverage_ instrumentation to get in-process -coverage-feedback - -The code resides in the LLVM repository, requires the fresh Clang compiler to build -and is used to fuzz various parts of LLVM, -but the Fuzzer itself does not (and should not) depend on any -part of LLVM and can be used for other projects w/o requiring the rest of LLVM. - -Flags -===== -The most important flags are:: - - seed 0 Random seed. If 0, seed is generated. - runs -1 Number of individual test runs (-1 for infinite runs). - max_len 64 Maximum length of the test input. - cross_over 1 If 1, cross over inputs. 
- mutate_depth 5 Apply this number of consecutive mutations to each input. - timeout 1200 Timeout in seconds (if positive). If one unit runs more than this number of seconds the process will abort. - max_total_time 0 If positive, indicates the maximal total time in seconds to run the fuzzer. - help 0 Print help. - merge 0 If 1, the 2-nd, 3-rd, etc corpora will be merged into the 1-st corpus. Only interesting units will be taken. - jobs 0 Number of jobs to run. If jobs >= 1 we spawn this number of jobs in separate worker processes with stdout/stderr redirected to fuzz-JOB.log. - workers 0 Number of simultaneous worker processes to run the jobs. If zero, "min(jobs,NumberOfCpuCores()/2)" is used. - sync_command 0 Execute an external command "<sync_command> <test_corpus>" to synchronize the test corpus. - sync_timeout 600 Minimum timeout between syncs. - use_traces 0 Experimental: use instruction traces - only_ascii 0 If 1, generate only ASCII (isprint+isspace) inputs. - test_single_input "" Use specified file content as test input. Test will be run only once. Useful for debugging a particular case. - artifact_prefix "" Write fuzzing artifacts (crash, timeout, or slow inputs) as $(artifact_prefix)file - exact_artifact_path "" Write the single artifact on failure (crash, timeout) as $(exact_artifact_path). This overrides -artifact_prefix and will not use checksum in the file name. Do not use the same path for several parallel processes. +LibFuzzer is a library for in-process, coverage-guided, evolutionary fuzzing +of other libraries. + +LibFuzzer is similar in concept to American Fuzzy Lop (AFL_), but it performs +all of its fuzzing inside a single process. This in-process fuzzing can be more +restrictive and fragile, but is potentially much faster as there is no overhead +for process start-up. 
+ +The fuzzer is linked with the library under test, and feeds fuzzed inputs to the +library via a specific fuzzing entrypoint (aka "target function"); the fuzzer +then tracks which areas of the code are reached, and generates mutations on the +corpus of input data in order to maximize the code coverage. The code coverage +information for libFuzzer is provided by LLVM's SanitizerCoverage_ +instrumentation. + +Contact: libfuzzer(#)googlegroups.com + +Versions +======== + +LibFuzzer is under active development so a current (or at least very recent) +version of Clang is the only supported variant. + +(If `building Clang from trunk`_ is too time-consuming or difficult, then +the Clang binaries that the Chromium developers build are likely to be +fairly recent: + +.. code-block:: console + + mkdir TMP_CLANG + cd TMP_CLANG + git clone https://chromium.googlesource.com/chromium/src/tools/clang + cd .. + TMP_CLANG/clang/scripts/update.py + +This installs the Clang binary as +``./third_party/llvm-build/Release+Asserts/bin/clang``) + +The libFuzzer code resides in the LLVM repository, and requires a recent Clang +compiler to build (and is used to `fuzz various parts of LLVM itself`_). +However the fuzzer itself does not (and should not) depend on any part of LLVM +infrastructure and can be used for other projects without requiring the rest +of LLVM. + + + +Getting Started +=============== + +.. contents:: + :local: + :depth: 1 + +Building +-------- + +The first step for using libFuzzer on a library is to implement a fuzzing +target function that accepts a sequence of bytes, like this: + +.. code-block:: c++ + + // fuzz_target.cc + extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { + DoSomethingInterestingWithMyAPI(Data, Size); + return 0; // Non-zero return values are reserved for future use. + } + +Next, build the libFuzzer library as a static archive, without any sanitizer +options. 
Note that the libFuzzer library contains the ``main()`` function:
+
+.. code-block:: console
+
+  svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
+  # Alternative: get libFuzzer from a dedicated git mirror:
+  # git clone https://chromium.googlesource.com/chromium/llvm-project/llvm/lib/Fuzzer
+  clang++ -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer
+  ar ruv libFuzzer.a Fuzzer*.o
+
+Then build the fuzzing target function and the library under test using
+the SanitizerCoverage_ option, which instruments the code so that the fuzzer
+can retrieve code coverage information (to guide the fuzzing). Linking with
+the libFuzzer code then gives a fuzzer executable.
+
+You should also enable one or more of the *sanitizers*, which help to expose
+latent bugs by making incorrect behavior generate errors at runtime:
+
+ - AddressSanitizer_ (ASAN) detects memory access errors. Use `-fsanitize=address`.
+ - UndefinedBehaviorSanitizer_ (UBSAN) detects the use of various features of C/C++ that are explicitly
+   listed as resulting in undefined behavior. Use `-fsanitize=undefined -fno-sanitize-recover=undefined`
+   or any individual UBSAN check, e.g. `-fsanitize=signed-integer-overflow -fno-sanitize-recover=undefined`.
+   You may combine ASAN and UBSAN in one build.
+ - MemorySanitizer_ (MSAN) detects uninitialized reads: code whose behavior relies on memory
+   contents that have not been initialized to a specific value. Use `-fsanitize=memory`.
+   MSAN cannot be combined with other sanitizers and should be used as a separate build.
+
+Finally, link with ``libFuzzer.a``::
+
+  clang -fsanitize-coverage=edge -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer
+
+Corpus
+------
+
+Coverage-guided fuzzers like libFuzzer rely on a corpus of sample inputs for the
+code under test.
This corpus should ideally be seeded with a varied collection +of valid and invalid inputs for the code under test; for example, for a graphics +library the initial corpus might hold a variety of different small PNG/JPG/GIF +files. The fuzzer generates random mutations based around the sample inputs in +the current corpus. If a mutation triggers execution of a previously-uncovered +path in the code under test, then that mutation is saved to the corpus for +future variations. + +LibFuzzer will work without any initial seeds, but will be less +efficient if the library under test accepts complex, +structured inputs. + +The corpus can also act as a sanity/regression check, to confirm that the +fuzzing entrypoint still works and that all of the sample inputs run through +the code under test without problems. + +If you have a large corpus (either generated by fuzzing or acquired by other means) +you may want to minimize it while still preserving the full coverage. One way to do that +is to use the `-merge=1` flag: + +.. code-block:: console + + mkdir NEW_CORPUS_DIR # Store minimized corpus here. + ./my_fuzzer -merge=1 NEW_CORPUS_DIR FULL_CORPUS_DIR + +You may use the same flag to add more interesting items to an existing corpus. +Only the inputs that trigger new coverage will be added to the first corpus. + +.. code-block:: console + + ./my_fuzzer -merge=1 CURRENT_CORPUS_DIR NEW_POTENTIALLY_INTERESTING_INPUTS_DIR + + +Running +------- + +To run the fuzzer, first create a Corpus_ directory that holds the +initial "seed" sample inputs: + +.. code-block:: console + + mkdir CORPUS_DIR + cp /some/input/samples/* CORPUS_DIR + +Then run the fuzzer on the corpus directory: + +.. code-block:: console + + ./my_fuzzer CORPUS_DIR # -max_len=1000 -jobs=20 ... + +As the fuzzer discovers new interesting test cases (i.e. test cases that +trigger coverage of new paths through the code under test), those test cases +will be added to the corpus directory. 
+ +By default, the fuzzing process will continue indefinitely – at least until +a bug is found. Any crashes or sanitizer failures will be reported as usual, +stopping the fuzzing process, and the particular input that triggered the bug +will be written to disk (typically as ``crash-<sha1>``, ``leak-<sha1>``, +or ``timeout-<sha1>``). + + +Parallel Fuzzing +---------------- + +Each libFuzzer process is single-threaded, unless the library under test starts +its own threads. However, it is possible to run multiple libFuzzer processes in +parallel with a shared corpus directory; this has the advantage that any new +inputs found by one fuzzer process will be available to the other fuzzer +processes (unless you disable this with the ``-reload=0`` option). + +This is primarily controlled by the ``-jobs=N`` option, which indicates that +that `N` fuzzing jobs should be run to completion (i.e. until a bug is found or +time/iteration limits are reached). These jobs will be run across a set of +worker processes, by default using half of the available CPU cores; the count of +worker processes can be overridden by the ``-workers=N`` option. For example, +running with ``-jobs=30`` on a 12-core machine would run 6 workers by default, +with each worker averaging 5 bugs by completion of the entire process. + + +Options +======= + +To run the fuzzer, pass zero or more corpus directories as command line +arguments. The fuzzer will read test inputs from each of these corpus +directories, and any new test inputs that are generated will be written +back to the first corpus directory: + +.. code-block:: console + + ./fuzzer [-flag1=val1 [-flag2=val2 ...] ] [dir1 [dir2 ...] ] + +If a list of files (rather than directories) are passed to the fuzzer program, +then it will re-run those files as test inputs but will not perform any fuzzing. +In this mode the fuzzer binary can be used as a regression test (e.g. 
on a +continuous integration system) to check the target function and saved inputs +still work. + +The most important command line options are: + +``-help`` + Print help message. +``-seed`` + Random seed. If 0 (the default), the seed is generated. +``-runs`` + Number of individual test runs, -1 (the default) to run indefinitely. +``-max_len`` + Maximum length of a test input. If 0 (the default), libFuzzer tries to guess + a good value based on the corpus (and reports it). +``-timeout`` + Timeout in seconds, default 1200. If an input takes longer than this timeout, + the process is treated as a failure case. +``-rss_limit_mb`` + Memory usage limit in Mb, default 2048. Use 0 to disable the limit. + If an input requires more than this amount of RSS memory to execute, + the process is treated as a failure case. + The limit is checked in a separate thread every second. + If running w/o ASAN/MSAN, you may use 'ulimit -v' instead. +``-timeout_exitcode`` + Exit code (default 77) to emit when terminating due to timeout, when + ``-abort_on_timeout`` is not set. +``-max_total_time`` + If positive, indicates the maximum total time in seconds to run the fuzzer. + If 0 (the default), run indefinitely. +``-merge`` + If set to 1, any corpus inputs from the 2nd, 3rd etc. corpus directories + that trigger new code coverage will be merged into the first corpus + directory. Defaults to 0. This flag can be used to minimize a corpus. +``-reload`` + If set to 1 (the default), the corpus directory is re-read periodically to + check for new inputs; this allows detection of new inputs that were discovered + by other fuzzing processes. +``-jobs`` + Number of fuzzing jobs to run to completion. Default value is 0, which runs a + single fuzzing process until completion. If the value is >= 1, then this + number of jobs performing fuzzing are run, in a collection of parallel + separate worker processes; each such worker process has its + ``stdout``/``stderr`` redirected to ``fuzz-<JOB>.log``. 
+``-workers``
+  Number of simultaneous worker processes to run the fuzzing jobs to completion
+  in. If 0 (the default), ``min(jobs, NumberOfCpuCores()/2)`` is used.
+``-dict``
+  Provide a dictionary of input keywords; see Dictionaries_.
+``-use_counters``
+  Use `coverage counters`_ to generate approximate counts of how often code
+  blocks are hit; defaults to 1.
+``-use_traces``
+  Use instruction traces (experimental, defaults to 0); see `Data-flow-guided fuzzing`_.
+``-only_ascii``
+  If 1, generate only ASCII (``isprint``+``isspace``) inputs. Defaults to 0.
+``-artifact_prefix``
+  Provide a prefix to use when saving fuzzing artifacts (crash, timeout, or
+  slow inputs) as ``$(artifact_prefix)file``. Defaults to empty.
+``-exact_artifact_path``
+  Ignored if empty (the default). If non-empty, write the single artifact on
+  failure (crash, timeout) as ``$(exact_artifact_path)``. This overrides
+  ``-artifact_prefix`` and will not use checksum in the file name. Do not use
+  the same path for several parallel processes.
+``-print_final_stats``
+  If 1, print statistics at exit. Defaults to 0.
+``-detect_leaks``
+  If 1 (the default) and if LeakSanitizer is enabled, try to detect memory
+  leaks during fuzzing (i.e. not only at shut down).
+``-close_fd_mask``
+  Indicate output streams to close at startup. Be careful, this will
+  remove diagnostic output from target code (e.g. messages on assert failure).
+
+   - 0 (default): close neither ``stdout`` nor ``stderr``
+   - 1 : close ``stdout``
+   - 2 : close ``stderr``
+   - 3 : close both ``stdout`` and ``stderr``.
 
 For the full list of flags run the fuzzer binary with ``-help=1``.
-Usage examples -============== +Output +====== + +During operation the fuzzer prints information to ``stderr``, for example:: + + INFO: Seed: 3338750330 + Loaded 1024/1211 files from corpus/ + INFO: -max_len is not provided, using 64 + #0 READ units: 1211 exec/s: 0 + #1211 INITED cov: 2575 bits: 8855 indir: 5 units: 830 exec/s: 1211 + #1422 NEW cov: 2580 bits: 8860 indir: 5 units: 831 exec/s: 1422 L: 21 MS: 1 ShuffleBytes- + #1688 NEW cov: 2581 bits: 8865 indir: 5 units: 832 exec/s: 1688 L: 19 MS: 2 EraseByte-CrossOver- + #1734 NEW cov: 2583 bits: 8879 indir: 5 units: 833 exec/s: 1734 L: 27 MS: 3 ChangeBit-EraseByte-ShuffleBytes- + ... + +The early parts of the output include information about the fuzzer options and +configuration, including the current random seed (in the ``Seed:`` line; this +can be overridden with the ``-seed=N`` flag). + +Further output lines have the form of an event code and statistics. The +possible event codes are: + +``READ`` + The fuzzer has read in all of the provided input samples from the corpus + directories. +``INITED`` + The fuzzer has completed initialization, which includes running each of + the initial input samples through the code under test. +``NEW`` + The fuzzer has created a test input that covers new areas of the code + under test. This input will be saved to the primary corpus directory. +``pulse`` + The fuzzer has generated 2\ :sup:`n` inputs (generated periodically to reassure + the user that the fuzzer is still working). +``DONE`` + The fuzzer has completed operation because it has reached the specified + iteration limit (``-runs``) or time limit (``-max_total_time``). +``MIN<n>`` + The fuzzer is minimizing the combination of input corpus directories into + a single unified corpus (due to the ``-merge`` command line option). 
+``RELOAD``
+  The fuzzer is performing a periodic reload of inputs from the corpus
+  directory; this allows it to pick up any inputs discovered by other
+  fuzzer processes (see `Parallel Fuzzing`_).
+
+Each output line also reports the following statistics (when non-zero):
+
+``cov:``
+  Total number of code blocks or edges covered by executing the current
+  corpus.
+``bits:``
+  Rough measure of the number of code blocks or edges covered, and how often;
+  only valid if the fuzzer is run with ``-use_counters=1``.
+``indir:``
+  Number of distinct function `caller-callee pairs`_ executed with the
+  current corpus; only valid if the code under test was built with
+  ``-fsanitize-coverage=indirect-calls``.
+``units:``
+  Number of entries in the current input corpus.
+``exec/s:``
+  Number of fuzzer iterations per second.
+
+For ``NEW`` events, the output line also includes information about the mutation
+operation that produced the new input:
+
+``L:``
+  Size of the new input in bytes.
+``MS: <n> <operations>``
+  Count and list of the mutation operations used to generate the input.
+
+
+Examples
+========
+.. contents::
+   :local:
+   :depth: 1
 
 Toy example
 -----------
 
-A simple function that does something interesting if it receives the input "HI!"::
+A simple function that does something interesting if it receives the input
+"HI!"::
 
-  cat << EOF >> test_fuzzer.cc
+  cat << EOF > test_fuzzer.cc
   #include <stdint.h>
   #include <stddef.h>
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
@@ -95,32 +378,37 @@ A simple function that does something interesting if it receives the input "HI!"
     return 0;
   }
   EOF
-  # Get lib/Fuzzer. Assuming that you already have fresh clang in PATH.
-  svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
-  # Build lib/Fuzzer files.
-  clang -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer
-  # Build test_fuzzer.cc with asan and link against lib/Fuzzer.
- clang++ -fsanitize=address -fsanitize-coverage=edge test_fuzzer.cc Fuzzer*.o + # Build test_fuzzer.cc with asan and link against libFuzzer.a + clang++ -fsanitize=address -fsanitize-coverage=edge test_fuzzer.cc libFuzzer.a # Run the fuzzer with no corpus. ./a.out -You should get ``Illegal instruction (core dumped)`` pretty quickly. +You should get an error pretty quickly:: + + #0 READ units: 1 exec/s: 0 + #1 INITED cov: 3 units: 1 exec/s: 0 + #2 NEW cov: 5 units: 2 exec/s: 0 L: 64 MS: 0 + #19237 NEW cov: 9 units: 3 exec/s: 0 L: 64 MS: 0 + #20595 NEW cov: 10 units: 4 exec/s: 0 L: 1 MS: 4 ChangeASCIIInt-ShuffleBytes-ChangeByte-CrossOver- + #34574 NEW cov: 13 units: 5 exec/s: 0 L: 2 MS: 3 ShuffleBytes-CrossOver-ChangeBit- + #34807 NEW cov: 15 units: 6 exec/s: 0 L: 3 MS: 1 CrossOver- + ==31511== ERROR: libFuzzer: deadly signal + ... + artifact_prefix='./'; Test unit written to ./crash-b13e8756b13a00cf168300179061fb4b91fefbed + PCRE2 ----- -Here we show how to use lib/Fuzzer on something real, yet simple: pcre2_:: +Here we show how to use libFuzzer on something real, yet simple: pcre2_:: COV_FLAGS=" -fsanitize-coverage=edge,indirect-calls,8bit-counters" # Get PCRE2 - svn co svn://vcs.exim.org/pcre2/code/trunk pcre - # Get lib/Fuzzer. Assuming that you already have fresh clang in PATH. - svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer - # Build PCRE2 with AddressSanitizer and coverage. - (cd pcre; ./autogen.sh; CC="clang -fsanitize=address $COV_FLAGS" ./configure --prefix=`pwd`/../inst && make -j && make install) - # Build lib/Fuzzer files. - clang -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer - # Build the actual function that does something interesting with PCRE2. + wget ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-10.20.tar.gz + tar xf pcre2-10.20.tar.gz + # Build PCRE2 with AddressSanitizer and coverage; requires autotools. 
+ (cd pcre2-10.20; ./autogen.sh; CC="clang -fsanitize=address $COV_FLAGS" ./configure --prefix=`pwd`/../inst && make -j && make install) + # Build the fuzzing target function that does something interesting with PCRE2. cat << EOF > pcre_fuzzer.cc #include <string.h> #include <stdint.h> @@ -141,61 +429,67 @@ Here we show how to use lib/Fuzzer on something real, yet simple: pcre2_:: EOF clang++ -g -fsanitize=address $COV_FLAGS -c -std=c++11 -I inst/include/ pcre_fuzzer.cc # Link. - clang++ -g -fsanitize=address -Wl,--whole-archive inst/lib/*.a -Wl,-no-whole-archive Fuzzer*.o pcre_fuzzer.o -o pcre_fuzzer + clang++ -g -fsanitize=address -Wl,--whole-archive inst/lib/*.a -Wl,-no-whole-archive libFuzzer.a pcre_fuzzer.o -o pcre_fuzzer This will give you a binary of the fuzzer, called ``pcre_fuzzer``. -Now, create a directory that will hold the test corpus:: +Now, create a directory that will hold the test corpus: + +.. code-block:: console mkdir -p CORPUS For simple input languages like regular expressions this is all you need. -For more complicated inputs populate the directory with some input samples. -Now run the fuzzer with the corpus dir as the only parameter:: +For more complicated/structured inputs, the fuzzer works much more efficiently +if you can populate the corpus directory with a variety of valid and invalid +inputs for the code under test. +Now run the fuzzer with the corpus directory as the only parameter: - ./pcre_fuzzer ./CORPUS +.. 
code-block:: console -You will see output like this:: + ./pcre_fuzzer ./CORPUS - Seed: 1876794929 - #0 READ cov 0 bits 0 units 1 exec/s 0 - #1 pulse cov 3 bits 0 units 1 exec/s 0 - #1 INITED cov 3 bits 0 units 1 exec/s 0 - #2 pulse cov 208 bits 0 units 1 exec/s 0 - #2 NEW cov 208 bits 0 units 2 exec/s 0 L: 64 - #3 NEW cov 217 bits 0 units 3 exec/s 0 L: 63 - #4 pulse cov 217 bits 0 units 3 exec/s 0 +Initially, you will see Output_ like this:: -* The ``Seed:`` line shows you the current random seed (you can change it with ``-seed=N`` flag). -* The ``READ`` line shows you how many input files were read (since you passed an empty dir there were inputs, but one dummy input was synthesised). -* The ``INITED`` line shows you that how many inputs will be fuzzed. -* The ``NEW`` lines appear with the fuzzer finds a new interesting input, which is saved to the CORPUS dir. If multiple corpus dirs are given, the first one is used. -* The ``pulse`` lines appear periodically to show the current status. + INFO: Seed: 2938818941 + INFO: -max_len is not provided, using 64 + INFO: A corpus is not provided, starting from an empty corpus + #0 READ units: 1 exec/s: 0 + #1 INITED cov: 3 bits: 3 units: 1 exec/s: 0 + #2 NEW cov: 176 bits: 176 indir: 3 units: 2 exec/s: 0 L: 64 MS: 0 + #8 NEW cov: 176 bits: 179 indir: 3 units: 3 exec/s: 0 L: 63 MS: 2 ChangeByte-EraseByte- + ... + #14004 NEW cov: 1500 bits: 4536 indir: 5 units: 406 exec/s: 0 L: 54 MS: 3 ChangeBit-ChangeBit-CrossOver- Now, interrupt the fuzzer and run it again the same way. 
You will see::
 
-  Seed: 1879995378
-  #0      READ   cov 0 bits 0 units 564 exec/s 0
-  #1      pulse  cov 502 bits 0 units 564 exec/s 0
+  INFO: Seed: 3398349082
+  INFO: -max_len is not provided, using 64
+  #0      READ   units: 405 exec/s: 0
+  #405    INITED cov: 1499 bits: 4535 indir: 5 units: 286 exec/s: 0
+  #587    NEW    cov: 1499 bits: 4540 indir: 5 units: 287 exec/s: 0 L: 52 MS: 2 InsertByte-EraseByte-
+  #667    NEW    cov: 1501 bits: 4542 indir: 5 units: 288 exec/s: 0 L: 39 MS: 2 ChangeBit-InsertByte-
+  #672    NEW    cov: 1501 bits: 4543 indir: 5 units: 289 exec/s: 0 L: 15 MS: 2 ChangeASCIIInt-ChangeBit-
+  #739    NEW    cov: 1501 bits: 4544 indir: 5 units: 290 exec/s: 0 L: 64 MS: 4 ShuffleBytes-ChangeASCIIInt-InsertByte-ChangeBit-
   ...
-  #512    pulse  cov 2933 bits 0 units 564 exec/s 512
-  #564    INITED cov 2991 bits 0 units 344 exec/s 564
-  #1024   pulse  cov 2991 bits 0 units 344 exec/s 1024
-  #1455   NEW    cov 2995 bits 0 units 345 exec/s 1455 L: 49
 
-This time you were running the fuzzer with a non-empty input corpus (564 items).
-As the first step, the fuzzer minimized the set to produce 344 interesting items (the ``INITED`` line)
+On the second execution the fuzzer has a non-empty input corpus (405 items). As
+the first step, the fuzzer minimized this corpus (the ``INITED`` line) to
+produce 286 interesting items, omitting inputs that do not hit any additional
+code.
 
-It is quite convenient to store test corpuses in git.
-As an example, here is a git repository with test inputs for the above PCRE2 fuzzer::
+(Aside: although the fuzzer only saves new inputs that hit additional code,
+this does not mean that the corpus as a whole is kept minimized. For example,
+if an input that hits A-B-C is generated and then an input that hits A-B-C-D,
+both will be saved, even though the latter subsumes the former.)
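
The corpus-growth behavior described in the aside can be sketched with a toy,
self-contained C++ example. This is illustrative only — it is not libFuzzer's
actual algorithm, and all names here are invented: "coverage" is faked as the
set of prefix lengths of the magic string ``"HI!"`` matched by an input, and an
input is saved whenever it contributes a coverage point not seen before.

```cpp
// Toy sketch of coverage-guided corpus growth (illustrative only; this is
// not libFuzzer's real algorithm and all names are invented).
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Pretend "coverage": the set of prefix lengths of "HI!" matched by the input.
static std::set<int> Coverage(const std::string &Input) {
  const std::string Magic = "HI!";
  std::set<int> Cov = {0};
  for (size_t I = 0; I < Magic.size() && I < Input.size(); ++I) {
    if (Input[I] != Magic[I])
      break;
    Cov.insert(static_cast<int>(I) + 1);
  }
  return Cov;
}

struct ToyFuzzer {
  std::set<int> GlobalCov;          // all coverage points seen so far
  std::vector<std::string> Corpus;  // saved inputs

  // Save Input iff it hits at least one previously-unseen coverage point.
  bool Feed(const std::string &Input) {
    bool New = false;
    for (int Point : Coverage(Input))
      if (GlobalCov.insert(Point).second)
        New = true;
    if (New)
      Corpus.push_back(Input);  // kept even if a later input subsumes it
    return New;
  }
};
```

Feeding ``"HI"`` and later ``"HI!"`` keeps both inputs in the corpus, even
though the second subsumes the first — exactly the non-minimality the aside
describes.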
-  git clone https://github.com/kcc/fuzzing-with-sanitizers.git
-  ./pcre_fuzzer ./fuzzing-with-sanitizers/pcre2/C1/

-You may run ``N`` independent fuzzer jobs in parallel on ``M`` CPUs::
+You may run ``N`` independent fuzzer jobs in parallel on ``M`` CPUs:
+
+.. code-block:: console

   N=100; M=4; ./pcre_fuzzer ./CORPUS -jobs=$N -workers=$M

-By default (``-reload=1``) the fuzzer processes will periodically scan the CORPUS directory
+By default (``-reload=1``) the fuzzer processes will periodically scan the corpus directory
 and reload any new tests. This way the test inputs found by one process will be picked up
 by all others.

@@ -205,15 +499,15 @@
 Heartbleed
 ----------

 Remember Heartbleed_?
 As it was recently `shown <https://blog.hboeck.de/archives/868-How-Heartbleed-couldve-been-found.html>`_,
-fuzzing with AddressSanitizer can find Heartbleed. Indeed, here are the step-by-step instructions
-to find Heartbleed with LibFuzzer::
+fuzzing with AddressSanitizer_ can find Heartbleed. Indeed, here are the step-by-step instructions
+to find Heartbleed with libFuzzer::

   wget https://www.openssl.org/source/openssl-1.0.1f.tar.gz
   tar xf openssl-1.0.1f.tar.gz
   COV_FLAGS="-fsanitize-coverage=edge,indirect-calls" # -fsanitize-coverage=8bit-counters
   (cd openssl-1.0.1f/ && ./config && make -j 32 CC="clang -g -fsanitize=address $COV_FLAGS")
-  # Get and build LibFuzzer
+  # Get and build libFuzzer
   svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
   clang -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer
   # Get examples of key/pem files.

@@ -267,14 +561,16 @@
 Voila::

   #2 0x580be3 in ssl3_read_bytes openssl-1.0.1f/ssl/s3_pkt.c:1092:4

 Note: a `similar fuzzer <https://boringssl.googlesource.com/boringssl/+/HEAD/FUZZING.md>`_
-is now a part of the boringssl source tree.
+is now a part of the BoringSSL_ source tree.

 Advanced features
 =================
+.. contents::
+   :local:
+   :depth: 1

 Dictionaries
 ------------
-*EXPERIMENTAL*.
 LibFuzzer supports user-supplied dictionaries with input language keywords
 or other interesting byte sequences (e.g. multi-byte magic values).
 Use ``-dict=DICTIONARY_FILE``. For some input languages using a dictionary

@@ -304,16 +600,51 @@
 It will later use those recorded inputs during mutations.

 This mode can be combined with DataFlowSanitizer_ to achieve better sensitivity.

+Fuzzer-friendly build mode
+--------------------------
+Sometimes the code under test is not fuzzing-friendly. Examples:
+
+  - The target code uses a PRNG seeded e.g. by system time and
+    thus two consecutive invocations may potentially execute different code paths
+    even if the end result will be the same. This will cause a fuzzer to treat
+    two similar inputs as significantly different and it will blow up the test corpus.
+    E.g. libxml uses ``rand()`` inside its hash table.
+  - The target code uses checksums to protect from invalid inputs.
+    E.g. png checks CRC for every chunk.
+
+In many cases it makes sense to build a special fuzzing-friendly build
+with certain fuzzing-unfriendly features disabled. We propose to use a common build macro
+for all such cases for consistency: ``FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION``.
+
+.. code-block:: c++
+
+  void MyInitPRNG() {
+  #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
+    // In fuzzing mode the behavior of the code should be deterministic.
+    srand(0);
+  #else
+    srand(time(0));
+  #endif
+  }
+
+
 AFL compatibility
 -----------------
-LibFuzzer can be used in parallel with AFL_ on the same test corpus.
+LibFuzzer can be used together with AFL_ on the same test corpus.
 Both fuzzers expect the test corpus to reside in a directory, one file per input.
-You can run both fuzzers on the same corpus in parallel::
+You can run both fuzzers on the same corpus, one after another:
+
+.. code-block:: console

-  ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program -r @@
+  ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
   ./llvm-fuzz testcase_dir findings_dir  # Will write new tests to testcase_dir

 Periodically restart both fuzzers so that they can use each other's findings.
+Currently, there is no simple way to run both fuzzing engines in parallel while sharing the same corpus dir.
+
+You may also use AFL on your target function ``LLVMFuzzerTestOneInput``:
+see an example `here <https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/afl/afl_driver.cpp>`__.

 How good is my fuzzer?
 ----------------------

@@ -321,14 +652,20 @@
 Once you implement your target function ``LLVMFuzzerTestOneInput`` and fuzz it
 to death, you will want to know whether the function or the corpus can be improved further.
 One easy to use metric is, of course, code coverage.
-You can get the coverage for your corpus like this::
+You can get the coverage for your corpus like this:
+
+.. code-block:: console
+
+  ASAN_OPTIONS=coverage=1:html_cov_report=1 ./fuzzer CORPUS_DIR -runs=0

-  ASAN_OPTIONS=coverage_pcs=1 ./fuzzer CORPUS_DIR -runs=0

-This will run all the tests in the CORPUS_DIR but will not generate any new tests
-and dump covered PCs to disk before exiting.
-Then you can subtract the set of covered PCs from the set of all instrumented PCs in the binary,
-see SanitizerCoverage_ for details.
+This will run all tests in the CORPUS_DIR but will not perform any fuzzing.
+At the end of the process it will dump a single html file with coverage information.
+See SanitizerCoverage_ for details.
+
+You may also use other ways to visualize coverage,
+e.g. using `Clang coverage <http://clang.llvm.org/docs/SourceBasedCodeCoverage.html>`_,
+but those will require
+you to rebuild the code with different compiler flags.
 User-supplied mutators
 ----------------------

@@ -336,21 +673,83 @@
 LibFuzzer allows to use custom (user-supplied) mutators, see FuzzerInterface.h_

+Startup initialization
+----------------------
+If the library being tested needs to be initialized, there are several options.
+
+The simplest way is to have a statically initialized global object inside
+`LLVMFuzzerTestOneInput` (or in global scope if that works for you):
+
+.. code-block:: c++
+
+  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
+    static bool Initialized = DoInitialization();
+    ...
+
+Alternatively, you may define an optional init function and it will receive
+the program arguments that you can read and modify. Do this **only** if you
+really need to access ``argv``/``argc``.
+
+.. code-block:: c++
+
+  extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
+    ReadAndMaybeModify(argc, argv);
+    return 0;
+  }
+
+
+Leaks
+-----
+
+Binaries built with AddressSanitizer_ or LeakSanitizer_ will try to detect
+memory leaks at process shutdown.
+For in-process fuzzing this is inconvenient
+since the fuzzer needs to report a leak with a reproducer as soon as the leaky
+mutation is found. However, running full leak detection after every mutation
+is expensive.
+
+By default (``-detect_leaks=1``) libFuzzer will count the number of
+``malloc`` and ``free`` calls when executing every mutation.
+If the numbers don't match (which by itself doesn't mean there is a leak)
+libFuzzer will invoke the more expensive LeakSanitizer_
+pass and if the actual leak is found, it will be reported with the reproducer
+and the process will exit.
+
+If your target has massive leaks and the leak detection is disabled
+you will eventually run out of RAM (see the ``-rss_limit_mb`` flag).
+
+
+Developing libFuzzer
+====================
+
+Building libFuzzer as part of the LLVM project and running its tests requires a
+fresh clang as the host compiler and a special CMake configuration:
+
+.. code-block:: console
+
+  cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=YES -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON /path/to/llvm
+  ninja check-fuzzer
+
+
 Fuzzing components of LLVM
 ==========================
+.. contents::
+   :local:
+   :depth: 1
+
+To build any of the LLVM fuzz targets use the build instructions above.

 clang-format-fuzzer
 -------------------
 The inputs are random pieces of C++-like text.

-Build (make sure to use fresh clang as the host compiler)::
+.. code-block:: console

-  cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=YES -DCMAKE_BUILD_TYPE=Release /path/to/llvm
   ninja clang-format-fuzzer
   mkdir CORPUS_DIR
   ./bin/clang-format-fuzzer CORPUS_DIR

-Optionally build other kinds of binaries (asan+Debug, msan, ubsan, etc).
+Optionally build other kinds of binaries (ASan+Debug, MSan, UBSan, etc).

 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052
-
-Pre-fuzzed test inputs in git
------------------------------
-
-The buildbot occumulates large test corpuses over time.
-The corpuses are stored in git on github and can be used like this::
-
-  git clone https://github.com/kcc/fuzzing-with-sanitizers.git
-  bin/clang-format-fuzzer fuzzing-with-sanitizers/llvm/clang-format/C1
-  bin/clang-fuzzer fuzzing-with-sanitizers/llvm/clang/C1/
-  bin/llvm-as-fuzzer fuzzing-with-sanitizers/llvm/llvm-as/C1 -only_ascii=1
-
+A buildbot continuously runs the above fuzzers for LLVM components, with results
+shown at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer .

 FAQ
 =========================

-Q. Why Fuzzer does not use any of the LLVM support?
----------------------------------------------------
+Q. Why doesn't libFuzzer use any of the LLVM support?
+-----------------------------------------------------

 There are two reasons.

-First, we want this library to be used outside of the LLVM w/o users having to
+First, we want this library to be used outside of LLVM without users having to
 build the rest of LLVM. This may sound unconvincing for many LLVM folks, but
 in practice the need for building the whole LLVM frightens many potential
 users -- and we want more users to use this code.
 ---------------------------------------------------------

 * If the test inputs are validated by the target library and the validator
-  asserts/crashes on invalid inputs, the in-process fuzzer is not applicable
-  (we could use fork() w/o exec, but it comes with extra overhead).
-* Bugs in the target library may accumulate w/o being detected. E.g. a memory
+  asserts/crashes on invalid inputs, in-process fuzzing is not applicable.
+* Bugs in the target library may accumulate without being detected. E.g. a memory
   corruption that goes undetected at first and then leads to a crash while
   testing another input. This is why it is highly recommended to run this
   in-process fuzzer with all sanitizers to detect most bugs on the spot.

@@ -442,7 +829,7 @@
   consumption and infinite loops in the target library (still possible).
 * The target library should not have significant global state that is not
   reset between the runs.
-* Many interesting target libs are not designed in a way that supports
+* Many interesting target libraries are not designed in a way that supports
   the in-process fuzzer interface (e.g. require a file path instead of a
   byte array).
 * If a single test run takes a considerable fraction of a second (or

@@ -454,18 +841,16 @@
 Q. So, what exactly this Fuzzer is good for?
 --------------------------------------------

 This Fuzzer might be a good choice for testing libraries that have relatively
-small inputs, each input takes < 1ms to run, and the library code is not expected
+small inputs, each input takes < 10ms to run, and the library code is not expected
 to crash on invalid inputs.
-Examples: regular expression matchers, text or binary format parsers.
+Examples: regular expression matchers, text or binary format parsers, compression,
+network, crypto.
 Trophies
 ========

 * GLIBC: https://sourceware.org/glibc/wiki/FuzzingLibc

-* MUSL LIBC:
-
-  * http://git.musl-libc.org/cgit/musl/commit/?id=39dfd58417ef642307d90306e1c7e50aaec5a35c
-  * http://www.openwall.com/lists/oss-security/2015/03/30/3
+* MUSL LIBC: `[1] <http://git.musl-libc.org/cgit/musl/commit/?id=39dfd58417ef642307d90306e1c7e50aaec5a35c>`__ `[2] <http://www.openwall.com/lists/oss-security/2015/03/30/3>`__

 * `pugixml <https://github.com/zeux/pugixml/issues/39>`_

@@ -482,23 +867,39 @@
 Trophies

 * `Python <http://bugs.python.org/issue25388>`_

-* OpenSSL/BoringSSL: `[1] <https://boringssl.googlesource.com/boringssl/+/cb852981cd61733a7a1ae4fd8755b7ff950e857d>`_
+* OpenSSL/BoringSSL: `[1] <https://boringssl.googlesource.com/boringssl/+/cb852981cd61733a7a1ae4fd8755b7ff950e857d>`_ `[2] <https://openssl.org/news/secadv/20160301.txt>`_ `[3] <https://boringssl.googlesource.com/boringssl/+/2b07fa4b22198ac02e0cee8f37f3337c3dba91bc>`_ `[4] <https://boringssl.googlesource.com/boringssl/+/6b6e0b20893e2be0e68af605a60ffa2cbb0ffa64>`_ `[5] <https://github.com/openssl/openssl/pull/931/commits/dd5ac557f052cc2b7f718ac44a8cb7ac6f77dca8>`_ `[6] <https://github.com/openssl/openssl/pull/931/commits/19b5b9194071d1d84e38ac9a952e715afbc85a81>`_

 * `Libxml2
-  <https://bugzilla.gnome.org/buglist.cgi?bug_status=__all__&content=libFuzzer&list_id=68957&order=Importance&product=libxml2&query_format=specific>`_
+  <https://bugzilla.gnome.org/buglist.cgi?bug_status=__all__&content=libFuzzer&list_id=68957&order=Importance&product=libxml2&query_format=specific>`_ and `[HT206167] <https://support.apple.com/en-gb/HT206167>`_ (CVE-2015-5312, CVE-2015-7500, CVE-2015-7942)

 * `Linux Kernel's BPF verifier <https://github.com/iovisor/bpf-fuzzer>`_

+* Capstone: `[1] <https://github.com/aquynh/capstone/issues/600>`__ `[2] <https://github.com/aquynh/capstone/commit/6b88d1d51eadf7175a8f8a11b690684443b11359>`__
+
+* file: `[1] <http://bugs.gw.com/view.php?id=550>`__ `[2] <http://bugs.gw.com/view.php?id=551>`__ `[3] <http://bugs.gw.com/view.php?id=553>`__ `[4] <http://bugs.gw.com/view.php?id=554>`__
+
+* Radare2: `[1] <https://github.com/revskills?tab=contributions&from=2016-04-09>`__
+
+* gRPC: `[1] <https://github.com/grpc/grpc/pull/6071/commits/df04c1f7f6aec6e95722ec0b023a6b29b6ea871c>`__ `[2] <https://github.com/grpc/grpc/pull/6071/commits/22a3dfd95468daa0db7245a4e8e6679a52847579>`__ `[3] <https://github.com/grpc/grpc/pull/6071/commits/9cac2a12d9e181d130841092e9d40fa3309d7aa7>`__ `[4] <https://github.com/grpc/grpc/pull/6012/commits/82a91c91d01ce9b999c8821ed13515883468e203>`__ `[5] <https://github.com/grpc/grpc/pull/6202/commits/2e3e0039b30edaf89fb93bfb2c1d0909098519fa>`__ `[6] <https://github.com/grpc/grpc/pull/6106/files>`__
+
+* WOFF2: `[1] <https://github.com/google/woff2/commit/a15a8ab>`__
+
 * LLVM: `Clang <https://llvm.org/bugs/show_bug.cgi?id=23057>`_,
   `Clang-format <https://llvm.org/bugs/show_bug.cgi?id=23052>`_,
   `libc++ <https://llvm.org/bugs/show_bug.cgi?id=24411>`_,
   `llvm-as <https://llvm.org/bugs/show_bug.cgi?id=24639>`_,
   Disassembler: http://reviews.llvm.org/rL247405, http://reviews.llvm.org/rL247414,
   http://reviews.llvm.org/rL247416, http://reviews.llvm.org/rL247417,
   http://reviews.llvm.org/rL247420, http://reviews.llvm.org/rL247422.

 .. _pcre2: http://www.pcre.org/
-
 .. _AFL: http://lcamtuf.coredump.cx/afl/
-
 .. _SanitizerCoverage: http://clang.llvm.org/docs/SanitizerCoverage.html
 .. _SanitizerCoverageTraceDataFlow: http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow
 .. _DataFlowSanitizer: http://clang.llvm.org/docs/DataFlowSanitizer.html
-
+.. _AddressSanitizer: http://clang.llvm.org/docs/AddressSanitizer.html
+.. _LeakSanitizer: http://clang.llvm.org/docs/LeakSanitizer.html
 .. _Heartbleed: http://en.wikipedia.org/wiki/Heartbleed
-
 .. _FuzzerInterface.h: https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/FuzzerInterface.h
+.. _3.7.0: http://llvm.org/releases/3.7.0/docs/LibFuzzer.html
+.. _building Clang from trunk: http://clang.llvm.org/get_started.html
+.. _MemorySanitizer: http://clang.llvm.org/docs/MemorySanitizer.html
+.. _UndefinedBehaviorSanitizer: http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
+.. _`coverage counters`: http://clang.llvm.org/docs/SanitizerCoverage.html#coverage-counters
+.. _`caller-callee pairs`: http://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage
+.. _BoringSSL: https://boringssl.googlesource.com/boringssl/
+.. _`fuzz various parts of LLVM itself`: `Fuzzing components of LLVM`_
diff --git a/docs/LinkTimeOptimization.rst b/docs/LinkTimeOptimization.rst
index 55a7486874a31..9c1e5607596bb 100644
--- a/docs/LinkTimeOptimization.rst
+++ b/docs/LinkTimeOptimization.rst
@@ -87,9 +87,9 @@ To compile, run:

 .. code-block:: console

-  % clang -emit-llvm -c a.c -o a.o   # <-- a.o is LLVM bitcode file
+  % clang -flto -c a.c -o a.o        # <-- a.o is LLVM bitcode file
   % clang -c main.c -o main.o        # <-- main.o is native object file
-  % clang a.o main.o -o main         # <-- standard link command without modifications
+  % clang -flto a.o main.o -o main   # <-- standard link command with -flto

 * In this example, the linker recognizes that ``foo2()`` is an externally
   visible symbol defined in LLVM bitcode file. The linker completes its usual
diff --git a/docs/Makefile b/docs/Makefile
deleted file mode 100644
index da649bc887321..0000000000000
--- a/docs/Makefile
+++ /dev/null
@@ -1,138 +0,0 @@
-##===- docs/Makefile ---------------------------------------*- Makefile -*-===##
-#
-# The LLVM Compiler Infrastructure
-#
-# This file is distributed under the University of Illinois Open Source
-# License. See LICENSE.TXT for details.
-#
-##===----------------------------------------------------------------------===##
-
-LEVEL := ..
-DIRS :=
-
-ifdef BUILD_FOR_WEBSITE
-PROJ_OBJ_DIR = .
-DOXYGEN = doxygen - -$(PROJ_OBJ_DIR)/doxygen.cfg: doxygen.cfg.in - cat $< | sed \ - -e 's/@DOT@/dot/g' \ - -e 's/@PACKAGE_VERSION@/mainline/' \ - -e 's/@abs_top_builddir@/../g' \ - -e 's/@abs_top_srcdir@/../g' \ - -e 's/@enable_external_search@/NO/g' \ - -e 's/@enable_searchengine@/NO/g' \ - -e 's/@enable_server_based_search@/NO/g' \ - -e 's/@extra_search_mappings@//g' \ - -e 's/@llvm_doxygen_generate_qhp@//g' \ - -e 's/@llvm_doxygen_qch_filename@//g' \ - -e 's/@llvm_doxygen_qhelpgenerator_path@//g' \ - -e 's/@llvm_doxygen_qhp_cust_filter_attrs@//g' \ - -e 's/@llvm_doxygen_qhp_cust_filter_name@//g' \ - -e 's/@llvm_doxygen_qhp_namespace@//g' \ - -e 's/@searchengine_url@//g' \ - -e 's/@DOT_IMAGE_FORMAT@/png/g' \ - > $@ -endif - -include $(LEVEL)/Makefile.common - -HTML := $(wildcard $(PROJ_SRC_DIR)/*.html) \ - $(wildcard $(PROJ_SRC_DIR)/*.css) -DOXYFILES := doxygen.cfg.in doxygen.intro - -.PHONY: install-html install-doxygen doxygen install-ocamldoc ocamldoc generated - -install_targets := install-html -ifeq ($(ENABLE_DOXYGEN),1) -install_targets += install-doxygen -endif -ifdef OCAMLFIND -ifneq (,$(filter ocaml,$(BINDINGS_TO_BUILD))) -install_targets += install-ocamldoc -endif -endif -install-local:: $(install_targets) - -generated_targets := doxygen -ifdef OCAMLFIND -generated_targets += ocamldoc -endif - -# Live documentation is generated for the web site using this target: -# 'make generated BUILD_FOR_WEBSITE=1' -generated:: $(generated_targets) - -install-html: $(PROJ_OBJ_DIR)/html.tar.gz - $(Echo) Installing HTML documentation - $(Verb) $(MKDIR) $(DESTDIR)$(PROJ_docsdir)/html - $(Verb) $(DataInstall) $(HTML) $(DESTDIR)$(PROJ_docsdir)/html - $(Verb) $(DataInstall) $(PROJ_OBJ_DIR)/html.tar.gz $(DESTDIR)$(PROJ_docsdir) - -$(PROJ_OBJ_DIR)/html.tar.gz: $(HTML) - $(Echo) Packaging HTML documentation - $(Verb) $(RM) -rf $@ $(PROJ_OBJ_DIR)/html.tar - $(Verb) cd $(PROJ_SRC_DIR) && \ - $(TAR) cf $(PROJ_OBJ_DIR)/html.tar *.html - $(Verb) $(GZIPBIN) 
$(PROJ_OBJ_DIR)/html.tar - -install-doxygen: doxygen - $(Echo) Installing doxygen documentation - $(Verb) $(DataInstall) $(PROJ_OBJ_DIR)/doxygen.tar.gz $(DESTDIR)$(PROJ_docsdir) - $(Verb) cd $(PROJ_OBJ_DIR)/doxygen/html && \ - for DIR in $$($(FIND) . -type d); do \ - DESTSUB="$(DESTDIR)$(PROJ_docsdir)/html/doxygen/$$(echo $$DIR | cut -c 3-)"; \ - $(MKDIR) $$DESTSUB && \ - $(FIND) $$DIR -maxdepth 1 -type f -exec $(DataInstall) {} $$DESTSUB \; ; \ - if [ $$? != 0 ]; then exit 1; fi \ - done - -doxygen: regendoc $(PROJ_OBJ_DIR)/doxygen.tar.gz - -regendoc: - $(Echo) Building doxygen documentation - $(Verb) $(RM) -rf $(PROJ_OBJ_DIR)/doxygen - $(Verb) $(DOXYGEN) $(PROJ_OBJ_DIR)/doxygen.cfg - -$(PROJ_OBJ_DIR)/doxygen.tar.gz: $(DOXYFILES) $(PROJ_OBJ_DIR)/doxygen.cfg - $(Echo) Packaging doxygen documentation - $(Verb) $(RM) -rf $@ $(PROJ_OBJ_DIR)/doxygen.tar - $(Verb) $(TAR) cf $(PROJ_OBJ_DIR)/doxygen.tar doxygen - $(Verb) $(GZIPBIN) $(PROJ_OBJ_DIR)/doxygen.tar - $(Verb) $(CP) $(PROJ_OBJ_DIR)/doxygen.tar.gz $(PROJ_OBJ_DIR)/doxygen/html/ - -userloc: $(LLVM_SRC_ROOT)/docs/userloc.html - -$(LLVM_SRC_ROOT)/docs/userloc.html: - $(Echo) Making User LOC Table - $(Verb) cd $(LLVM_SRC_ROOT) ; ./utils/userloc.pl -details -recurse \ - -html lib include tools runtime utils examples autoconf test > docs/userloc.html - -install-ocamldoc: ocamldoc - $(Echo) Installing ocamldoc documentation - $(Verb) $(MKDIR) $(DESTDIR)$(PROJ_docsdir)/ocamldoc/html - $(Verb) $(DataInstall) $(PROJ_OBJ_DIR)/ocamldoc.tar.gz $(DESTDIR)$(PROJ_docsdir) - $(Verb) cd $(PROJ_OBJ_DIR)/ocamldoc && \ - $(FIND) . 
-type f -exec \ - $(DataInstall) {} $(DESTDIR)$(PROJ_docsdir)/ocamldoc/html \; - -ocamldoc: regen-ocamldoc - $(Echo) Packaging ocamldoc documentation - $(Verb) $(RM) -rf $(PROJ_OBJ_DIR)/ocamldoc.tar* - $(Verb) $(TAR) cf $(PROJ_OBJ_DIR)/ocamldoc.tar ocamldoc - $(Verb) $(GZIPBIN) $(PROJ_OBJ_DIR)/ocamldoc.tar - $(Verb) $(CP) $(PROJ_OBJ_DIR)/ocamldoc.tar.gz $(PROJ_OBJ_DIR)/ocamldoc/html/ - -regen-ocamldoc: - $(Echo) Building ocamldoc documentation - $(Verb) $(RM) -rf $(PROJ_OBJ_DIR)/ocamldoc - $(Verb) $(MAKE) -C $(LEVEL)/bindings/ocaml ocamldoc - $(Verb) $(MKDIR) $(PROJ_OBJ_DIR)/ocamldoc/html - $(Verb) \ - $(OCAMLFIND) ocamldoc -d $(PROJ_OBJ_DIR)/ocamldoc/html -sort -colorize-code -html \ - `$(FIND) $(LEVEL)/bindings/ocaml -name "*.odoc" \ - -path "*/$(BuildMode)/*.odoc" -exec echo -load '{}' ';'` - -uninstall-local:: - $(Echo) Uninstalling Documentation - $(Verb) $(RM) -rf $(DESTDIR)$(PROJ_docsdir) diff --git a/docs/MakefileGuide.rst b/docs/MakefileGuide.rst deleted file mode 100644 index a5e273124a41c..0000000000000 --- a/docs/MakefileGuide.rst +++ /dev/null @@ -1,916 +0,0 @@ -=================== -LLVM Makefile Guide -=================== - -.. contents:: - :local: - -Introduction -============ - -This document provides *usage* information about the LLVM makefile system. While -loosely patterned after the BSD makefile system, LLVM has taken a departure from -BSD in order to implement additional features needed by LLVM. Although makefile -systems, such as ``automake``, were attempted at one point, it has become clear -that the features needed by LLVM and the ``Makefile`` norm are too great to use -a more limited tool. Consequently, LLVM requires simply GNU Make 3.79, a widely -portable makefile processor. LLVM unabashedly makes heavy use of the features of -GNU Make so the dependency on GNU Make is firm. If you're not familiar with -``make``, it is recommended that you read the `GNU Makefile Manual -<http://www.gnu.org/software/make/manual/make.html>`_. 
- -While this document is rightly part of the `LLVM Programmer's -Manual <ProgrammersManual.html>`_, it is treated separately here because of the -volume of content and because it is often an early source of bewilderment for -new developers. - -General Concepts -================ - -The LLVM Makefile System is the component of LLVM that is responsible for -building the software, testing it, generating distributions, checking those -distributions, installing and uninstalling, etc. It consists of a several files -throughout the source tree. These files and other general concepts are described -in this section. - -Projects --------- - -The LLVM Makefile System is quite generous. It not only builds its own software, -but it can build yours too. Built into the system is knowledge of the -``llvm/projects`` directory. Any directory under ``projects`` that has both a -``configure`` script and a ``Makefile`` is assumed to be a project that uses the -LLVM Makefile system. Building software that uses LLVM does not require the -LLVM Makefile System nor even placement in the ``llvm/projects`` -directory. However, doing so will allow your project to get up and running -quickly by utilizing the built-in features that are used to compile LLVM. LLVM -compiles itself using the same features of the makefile system as used for -projects. - -For further details, consult the `Projects <Projects.html>`_ page. - -Variable Values ---------------- - -To use the makefile system, you simply create a file named ``Makefile`` in your -directory and declare values for certain variables. The variables and values -that you select determine what the makefile system will do. These variables -enable rules and processing in the makefile system that automatically Do The -Right Thing (C). - -Including Makefiles -------------------- - -Setting variables alone is not enough. You must include into your Makefile -additional files that provide the rules of the LLVM Makefile system. 
The various -files involved are described in the sections that follow. - -``Makefile`` -^^^^^^^^^^^^ - -Each directory to participate in the build needs to have a file named -``Makefile``. This is the file first read by ``make``. It has three -sections: - -#. Settable Variables --- Required that must be set first. -#. ``include $(LEVEL)/Makefile.common`` --- include the LLVM Makefile system. -#. Override Variables --- Override variables set by the LLVM Makefile system. - -.. _$(LEVEL)/Makefile.common: - -``Makefile.common`` -^^^^^^^^^^^^^^^^^^^ - -Every project must have a ``Makefile.common`` file at its top source -directory. This file serves three purposes: - -#. It includes the project's configuration makefile to obtain values determined - by the ``configure`` script. This is done by including the - `$(LEVEL)/Makefile.config`_ file. - -#. It specifies any other (static) values that are needed throughout the - project. Only values that are used in all or a large proportion of the - project's directories should be placed here. - -#. It includes the standard rules for the LLVM Makefile system, - `$(LLVM_SRC_ROOT)/Makefile.rules`_. This file is the *guts* of the LLVM - ``Makefile`` system. - -.. _$(LEVEL)/Makefile.config: - -``Makefile.config`` -^^^^^^^^^^^^^^^^^^^ - -Every project must have a ``Makefile.config`` at the top of its *build* -directory. This file is **generated** by the ``configure`` script from the -pattern provided by the ``Makefile.config.in`` file located at the top of the -project's *source* directory. The contents of this file depend largely on what -configuration items the project uses, however most projects can get what they -need by just relying on LLVM's configuration found in -``$(LLVM_OBJ_ROOT)/Makefile.config``. - -.. _$(LLVM_SRC_ROOT)/Makefile.rules: - -``Makefile.rules`` -^^^^^^^^^^^^^^^^^^ - -This file, located at ``$(LLVM_SRC_ROOT)/Makefile.rules`` is the heart of the -LLVM Makefile System. 
It provides all the logic, dependencies, and rules for -building the targets supported by the system. What it does largely depends on -the values of ``make`` `variables`_ that have been set *before* -``Makefile.rules`` is included. - -Comments -^^^^^^^^ - -User ``Makefile``\s need not have comments in them unless the construction is -unusual or it does not strictly follow the rules and patterns of the LLVM -makefile system. Makefile comments are invoked with the pound (``#``) character. -The ``#`` character and any text following it, to the end of the line, are -ignored by ``make``. - -Tutorial -======== - -This section provides some examples of the different kinds of modules you can -build with the LLVM makefile system. In general, each directory you provide will -build a single object although that object may be composed of additionally -compiled components. - -Libraries ---------- - -Only a few variable definitions are needed to build a regular library. -Normally, the makefile system will build all the software into a single -``libname.o`` (pre-linked) object. This means the library is not searchable and -that the distinction between compilation units has been dissolved. Optionally, -you can ask for a shared library (.so) or archive library (.a) built. Archive -libraries are the default. For example: - -.. code-block:: makefile - - LIBRARYNAME = mylib - SHARED_LIBRARY = 1 - BUILD_ARCHIVE = 1 - -says to build a library named ``mylib`` with both a shared library -(``mylib.so``) and an archive library (``mylib.a``) version. The contents of all -the libraries produced will be the same, they are just constructed differently. -Note that you normally do not need to specify the sources involved. The LLVM -Makefile system will infer the source files from the contents of the source -directory. 
- -The ``LOADABLE_MODULE=1`` directive can be used in conjunction with -``SHARED_LIBRARY=1`` to indicate that the resulting shared library should be -openable with the ``dlopen`` function and searchable with the ``dlsym`` function -(or your operating system's equivalents). While this isn't strictly necessary on -Linux and a few other platforms, it is required on systems like HP-UX and -Darwin. You should use ``LOADABLE_MODULE`` for any shared library that you -intend to be loaded into an tool via the ``-load`` option. :ref:`Pass -documentation <writing-an-llvm-pass-makefile>` has an example of why you might -want to do this. - -Loadable Modules -^^^^^^^^^^^^^^^^ - -In some situations, you need to create a loadable module. Loadable modules can -be loaded into programs like ``opt`` or ``llc`` to specify additional passes to -run or targets to support. Loadable modules are also useful for debugging a -pass or providing a pass with another package if that pass can't be included in -LLVM. - -LLVM provides complete support for building such a module. All you need to do is -use the ``LOADABLE_MODULE`` variable in your ``Makefile``. For example, to build -a loadable module named ``MyMod`` that uses the LLVM libraries ``LLVMSupport.a`` -and ``LLVMSystem.a``, you would specify: - -.. code-block:: makefile - - LIBRARYNAME := MyMod - LOADABLE_MODULE := 1 - LINK_COMPONENTS := support system - -Use of the ``LOADABLE_MODULE`` facility implies several things: - -#. There will be no "``lib``" prefix on the module. This differentiates it from - a standard shared library of the same name. - -#. The `SHARED_LIBRARY`_ variable is turned on. - -#. The `LINK_LIBS_IN_SHARED`_ variable is turned on. - -A loadable module is loaded by LLVM via the facilities of libtool's libltdl -library which is part of ``lib/System`` implementation. 
Tools
-----

For building executable programs (tools), you must provide the name of the tool
and the names of the libraries you wish to link with the tool. For example:

.. code-block:: makefile

    TOOLNAME = mytool
    USEDLIBS = mylib
    LINK_COMPONENTS = support system

says that we are to build a tool named ``mytool`` and that it requires three
libraries: ``mylib``, ``LLVMSupport.a`` and ``LLVMSystem.a``.

Note that two different variables are used to indicate which libraries are
linked: ``USEDLIBS`` and ``LLVMLIBS``. This distinction is necessary to support
projects. ``LLVMLIBS`` refers to the LLVM libraries found in the LLVM object
directory. ``USEDLIBS`` refers to the libraries built by your project. In the
case of building LLVM tools, ``USEDLIBS`` and ``LLVMLIBS`` can be used
interchangeably since the "project" is LLVM itself and ``USEDLIBS`` refers to
the same place as ``LLVMLIBS``.

Also note that there are two different ways of specifying a library: with a
``.a`` suffix and without. Without the suffix, the entry refers to the re-linked
(.o) file which will include *all* symbols of the library. This is
useful, for example, to include all passes from a library of passes. If the
``.a`` suffix is used then the library is linked as a searchable library (with
the ``-l`` option). In this case, only the symbols that are unresolved *at
that point* will be resolved from the library, if they exist. Other
(unreferenced) symbols will not be included when the ``.a`` syntax is used. Note
that in order to use the ``.a`` suffix, the library in question must have been
built with the ``BUILD_ARCHIVE`` option set.

JIT Tools
^^^^^^^^^

Many tools will want to use the JIT features of LLVM. To do this, you simply
specify that you want an execution 'engine', and the makefiles will
automatically link in the appropriate JIT for the host or an interpreter if none
is available:
.. code-block:: makefile

    TOOLNAME = my_jit_tool
    USEDLIBS = mylib
    LINK_COMPONENTS = engine

Of course, any additional libraries may be listed as other components. To get a
full understanding of how this changes the linker command, it is recommended
that you:

.. code-block:: bash

    % cd examples/Fibonacci
    % make VERBOSE=1

Targets Supported
=================

This section describes each of the targets that can be built using the LLVM
Makefile system. Any target can be invoked from any directory but not all are
applicable to a given directory (e.g. "check", "dist" and "install" will always
operate as if invoked from the top level directory).

================= =============== ==================
Target Name       Implied Targets Target Description
================= =============== ==================
``all``           \               Compile the software recursively. Default target.
``all-local``     \               Compile the software in the local directory only.
``check``         \               Change to the ``test`` directory in a project and run the test suite there.
``check-local``   \               Run a local test suite. Generally this is only defined in the ``Makefile`` of the project's ``test`` directory.
``clean``         \               Remove built objects recursively.
``clean-local``   \               Remove built objects from the local directory only.
``dist``          ``all``         Prepare a source distribution tarball.
``dist-check``    ``all``         Prepare a source distribution tarball and check that it builds.
``dist-clean``    ``clean``       Clean source distribution tarball temporary files.
``install``       ``all``         Copy built objects to installation directory.
``preconditions`` ``all``         Check to make sure configuration and makefiles are up to date.
``printvars``     ``all``         Prints variables defined by the makefile system (for debugging).
``tags``          \               Make C and C++ tags files for emacs and vi.
``uninstall``     \               Remove built objects from installation directory.
================= =============== ==================
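For example, a typical edit-build-test cycle from the top level directory might
invoke these targets as follows (the ``-j4`` flag merely enables a parallel
build; adjust to taste):

.. code-block:: bash

    % make -j4 all
    % make check
    % make install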
.. _all:

``all`` (default)
-----------------

When you invoke ``make`` with no arguments, you are implicitly instructing it to
seek the ``all`` target (goal). This target is used for building the software
recursively and will do different things in different directories. For example,
in a ``lib`` directory, the ``all`` target will compile source files and
generate libraries. But, in a ``tools`` directory, it will link libraries and
generate executables.

``all-local``
-------------

This target is the same as `all`_ but it operates only on the current directory
instead of recursively.

``check``
---------

This target can be invoked from anywhere within a project's directories but
always invokes the `check-local`_ target in the project's ``test`` directory, if
it exists and has a ``Makefile``. A warning is produced otherwise. If
`TESTSUITE`_ is defined on the ``make`` command line, it will be passed down to
the invocation of ``make check-local`` in the ``test`` directory. The intended
usage for this is to assist in running specific suites of tests. If
``TESTSUITE`` is not set, the implementation of ``check-local`` should run all
normal tests. It is up to the project to define what different values for
``TESTSUITE`` will do. See the :doc:`Testing Guide <TestingGuide>` for further
details.

``check-local``
---------------

This target should be implemented by the ``Makefile`` in the project's ``test``
directory. It is invoked by the ``check`` target elsewhere. Each project is
free to define the actions of ``check-local`` as appropriate for that
project. The LLVM project itself uses the :doc:`Lit <CommandGuide/lit>` testing
tool to run a suite of feature and regression tests. Other projects may choose
to use :program:`lit` or any other testing mechanism.

``clean``
---------

This target cleans the build directory, recursively removing all things that the
Makefile builds.
The cleaning rules have been guarded so they shouldn't go
awry (e.g. via ``rm -f $(UNSET_VARIABLE)/*``, which would attempt to erase the
entire directory structure).

``clean-local``
---------------

This target does the same thing as ``clean`` but only for the current (local)
directory.

``dist``
--------

This target builds a distribution tarball. It first builds the entire project
using the ``all`` target and then tars up the necessary files and compresses
it. The generated tarball is sufficient for a casual source distribution, but
probably not for a release (see ``dist-check``).

``dist-check``
--------------

This target does the same thing as the ``dist`` target but also checks the
distribution tarball. The check is made by unpacking the tarball to a new
directory, configuring it, building it, installing it, and then verifying that
the installation results are correct (by comparing to the original build). This
target can take a long time to run but should be done before a release goes out
to make sure that the distributed tarball can actually be built into a working
release.

``dist-clean``
--------------

This is a special form of the ``clean`` target. It performs a normal
``clean`` but also removes things pertaining to building the distribution.

``install``
-----------

This target finalizes shared objects and executables and copies all libraries,
headers, executables and documentation to the directory given with the
``--prefix`` option to ``configure``. When completed, the prefix directory will
have everything needed to **use** LLVM.

The LLVM makefiles can generate complete **internal** documentation for all the
classes by using ``doxygen``. By default, this feature is **not** enabled
because it takes a long time and generates a massive amount of data (>100MB).
If you want this feature, you must configure LLVM with the ``--enable-doxygen``
switch and ensure that a modern version of doxygen (1.3.7 or later) is available
in your ``PATH``. You can download doxygen from `here
<http://www.stack.nl/~dimitri/doxygen/download.html#latestsrc>`_.

``preconditions``
-----------------

This utility target checks to see if the ``Makefile`` in the object directory is
older than the ``Makefile`` in the source directory and copies it if so. It also
reruns the ``configure`` script if that needs to be done and rebuilds the
``Makefile.config`` file similarly. Users may overload this target to ensure
that sanity checks are run *before* any building of targets, as all the targets
depend on ``preconditions``.

``printvars``
-------------

This utility target just causes the LLVM makefiles to print out some of the
makefile variables so that you can double check how things are set.

``reconfigure``
---------------

This utility target will force a reconfigure of LLVM or your project. It simply
runs ``$(PROJ_OBJ_ROOT)/config.status --recheck`` to rerun the configuration
tests and rebuild the configured files. This isn't generally useful as the
makefiles will reconfigure themselves whenever it is necessary.

``spotless``
------------

.. warning::

   Use with caution!

This utility target, only available when ``$(PROJ_OBJ_ROOT)`` is not the same as
``$(PROJ_SRC_ROOT)``, will completely clean the ``$(PROJ_OBJ_ROOT)`` directory
by removing its content entirely and reconfiguring the directory. This returns
the ``$(PROJ_OBJ_ROOT)`` directory to a completely fresh state. All content in
the directory except configured files and top-level makefiles will be lost.

``tags``
--------

This target will generate a ``TAGS`` file in the top-level source directory. It
is meant for use with emacs, XEmacs, or ViM.
The TAGS file provides an index of
symbol definitions so that the editor can jump you to the definition
quickly.

``uninstall``
-------------

This target is the opposite of the ``install`` target. It removes the header,
library and executable files from the installation directories. Note that the
directories themselves are not removed because it is not guaranteed that LLVM is
the only thing installing there (e.g. ``--prefix=/usr``).

.. _variables:

Variables
=========

Variables are used to tell the LLVM Makefile System what to do and to obtain
information from it. Variables are also used internally by the LLVM Makefile
System. Variable names that contain only the upper case alphabetic letters and
underscore are intended for use by the end user. All other variables are
internal to the LLVM Makefile System and should not be relied upon nor
modified. The sections below describe how to use the LLVM Makefile
variables.

Control Variables
-----------------

Variables listed in the table below should be set *before* the inclusion of
`$(LEVEL)/Makefile.common`_. These variables provide input to the LLVM make
system that tell it what to do for the current directory.

``BUILD_ARCHIVE``
  If set to any value, causes an archive (``.a``) library to be built.

``BUILT_SOURCES``
  Specifies a set of source files that are generated from other source
  files. These sources will be built before any other target processing to
  ensure they are present.

``CONFIG_FILES``
  Specifies a set of configuration files to be installed.

``DEBUG_SYMBOLS``
  If set to any value, causes the build to include debugging symbols even in
  optimized objects, libraries and executables. This alters the flags
  specified to the compilers and linkers. Debugging isn't fun in an optimized
  build, but it is possible.

``DIRS``
  Specifies a set of directories, usually children of the current directory,
  that should also be made using the same goal.
  These directories will be
  built serially.

``DISABLE_AUTO_DEPENDENCIES``
  If set to any value, causes the makefiles to **not** automatically generate
  dependencies when running the compiler. Use of this feature is discouraged
  and it may be removed at a later date.

``ENABLE_OPTIMIZED``
  If set to 1, causes the build to generate optimized objects, libraries and
  executables. This alters the flags specified to the compilers and
  linkers. Generally debugging won't be a fun experience with an optimized
  build.

``ENABLE_PROFILING``
  If set to 1, causes the build to generate both optimized and profiled
  objects, libraries and executables. This alters the flags specified to the
  compilers and linkers to ensure that profile data can be collected from the
  tools built. Use the ``gprof`` tool to analyze the output from the profiled
  tools (``gmon.out``).

``DISABLE_ASSERTIONS``
  If set to 1, causes the build to disable assertions, even if building a
  debug or profile build. This will exclude all assertion check code from the
  build. LLVM will execute faster, but with little help when things go
  wrong.

``EXPERIMENTAL_DIRS``
  Specify a set of directories that should be built, but if they fail, it
  should not cause the build to fail. Note that this should only be used
  temporarily while code is being written.

``EXPORTED_SYMBOL_FILE``
  Specifies the name of a single file that contains a list of the symbols to
  be exported by the linker. One symbol per line.

``EXPORTED_SYMBOL_LIST``
  Specifies a set of symbols to be exported by the linker.

``EXTRA_DIST``
  Specifies additional files that should be distributed with LLVM. All source
  files, all built sources, all Makefiles, and most documentation files will
  be automatically distributed. Use this variable to distribute any files that
  are not automatically distributed.
``KEEP_SYMBOLS``
  If set to any value, specifies that when linking executables the makefiles
  should retain debug symbols in the executable. Normally, symbols are
  stripped from the executable.

``LEVEL`` (required)
  Specify the level of nesting from the top level. This variable must be set
  in each makefile as it is used to find the top level and thus the other
  makefiles.

``LIBRARYNAME``
  Specify the name of the library to be built. (Required for libraries.)

``LINK_COMPONENTS``
  When specified for building a tool, the value of this variable will be
  passed to the ``llvm-config`` tool to generate a link line for the
  tool. Unlike ``USEDLIBS`` and ``LLVMLIBS``, not all libraries need to be
  specified. The ``llvm-config`` tool will figure out the library dependencies
  and add any libraries that are needed. The ``USEDLIBS`` variable can still
  be used in conjunction with ``LINK_COMPONENTS`` so that additional
  project-specific libraries can be linked with the LLVM libraries specified
  by ``LINK_COMPONENTS``.

.. _LINK_LIBS_IN_SHARED:

``LINK_LIBS_IN_SHARED``
  By default, shared library linking will ignore any libraries specified with
  `LLVMLIBS`_ or `USEDLIBS`_. This prevents shared libs from including
  things that will be in the LLVM tool the shared library will be loaded
  into. However, sometimes it is useful to link certain libraries into your
  shared library and this option enables that feature.

.. _LLVMLIBS:

``LLVMLIBS``
  Specifies the set of libraries from the LLVM ``$(ObjDir)`` that will be
  linked into the tool or library.

``LOADABLE_MODULE``
  If set to any value, causes the shared library being built to also be a
  loadable module. Loadable modules can be opened with the ``dlopen()`` function
  and searched with ``dlsym()`` (or the operating system's equivalent). Note that
  setting this variable without also setting ``SHARED_LIBRARY`` will have no
  effect.
``NO_INSTALL``
  Specifies that the build products of the directory should not be installed
  but should be built even if the ``install`` target is given. This is handy
  for directories that build libraries or tools that are only used as part of
  the build process, such as code generators (e.g. ``tblgen``).

``OPTIONAL_DIRS``
  Specify a set of directories that may be built, if they exist, but it is
  not an error for them not to exist.

``PARALLEL_DIRS``
  Specify a set of directories to build recursively and in parallel if the
  ``-j`` option was used with ``make``.

.. _SHARED_LIBRARY:

``SHARED_LIBRARY``
  If set to any value, causes a shared library (``.so``) to be built in
  addition to any other kinds of libraries. Note that this option will cause
  all source files to be built twice: once with options for position
  independent code and once without. Use it only where you really need a
  shared library.

``SOURCES`` (optional)
  Specifies the list of source files in the current directory to be
  built. Source files of any type may be specified (programs, documentation,
  config files, etc.). If not specified, the makefile system will infer the
  set of source files from the files present in the current directory.

``SUFFIXES``
  Specifies a set of filename suffixes that occur in suffix match rules. Only
  set this if your local ``Makefile`` specifies additional suffix match
  rules.

``TARGET``
  Specifies the name of the LLVM code generation target that the current
  directory builds. Setting this variable enables additional rules to build
  ``.inc`` files from ``.td`` files.

.. _TESTSUITE:

``TESTSUITE``
  Specifies the directory of tests to run in ``llvm/test``.

``TOOLNAME``
  Specifies the name of the tool that the current directory should build.

``TOOL_VERBOSE``
  Implies ``VERBOSE`` and also tells each tool invoked to be verbose.
  This is
  handy when you're trying to see the sub-tools invoked by each tool invoked
  by the makefile. For example, this will pass ``-v`` to the GCC compilers,
  which causes them to print out the command lines they use to invoke
  sub-tools (compiler, assembler, linker).

.. _USEDLIBS:

``USEDLIBS``
  Specifies the list of project libraries that will be linked into the tool or
  library.

``VERBOSE``
  Tells the Makefile system to produce detailed output of what it is doing
  instead of just summary comments. This will generate a LOT of output.

Override Variables
------------------

Override variables can be used to override the default values provided by the
LLVM makefile system. These variables can be set in several ways:

* In the environment (e.g. ``setenv``, ``export``) --- not recommended.
* On the ``make`` command line --- recommended.
* On the ``configure`` command line.
* In the Makefile (only *after* the inclusion of `$(LEVEL)/Makefile.common`_).

The override variables are given below:

``AR`` (defaulted)
  Specifies the path to the ``ar`` tool.

``PROJ_OBJ_DIR``
  The directory into which the products of build rules will be placed. This
  might be the same as `PROJ_SRC_DIR`_ but typically is not.

.. _PROJ_SRC_DIR:

``PROJ_SRC_DIR``
  The directory which contains the source files to be built.

``BUILD_EXAMPLES``
  If set to 1, build examples in ``examples`` and (if building Clang)
  ``tools/clang/examples`` directories.

``BZIP2`` (configured)
  The path to the ``bzip2`` tool.

``CC`` (configured)
  The path to the 'C' compiler.

``CFLAGS``
  Additional flags to be passed to the 'C' compiler.

``CPPFLAGS``
  Additional flags passed to the C/C++ preprocessor.

``CXX``
  Specifies the path to the C++ compiler.

``CXXFLAGS``
  Additional flags to be passed to the C++ compiler.
``DATE`` (configured)
  Specifies the path to the ``date`` program or any program that can generate
  the current date and time on its standard output.

``DOT`` (configured)
  Specifies the path to the ``dot`` tool or ``false`` if there isn't one.

``ECHO`` (configured)
  Specifies the path to the ``echo`` tool for printing output.

``EXEEXT`` (configured)
  Provides the extension to be used on executables built by the makefiles.
  The value may be empty on platforms that do not use file extensions for
  executables (e.g. Unix).

``INSTALL`` (configured)
  Specifies the path to the ``install`` tool.

``LDFLAGS`` (configured)
  Allows users to specify additional flags to pass to the linker.

``LIBS`` (configured)
  The list of libraries that should be linked with each tool.

``LIBTOOL`` (configured)
  Specifies the path to the ``libtool`` tool. This tool is renamed ``mklib``
  by the ``configure`` script.

``LLVMAS`` (defaulted)
  Specifies the path to the ``llvm-as`` tool.

``LLVMGCC`` (defaulted)
  Specifies the path to the LLVM version of the GCC 'C' Compiler.

``LLVMGXX`` (defaulted)
  Specifies the path to the LLVM version of the GCC C++ Compiler.

``LLVMLD`` (defaulted)
  Specifies the path to the LLVM bitcode linker tool.

``LLVM_OBJ_ROOT`` (configured)
  Specifies the top directory into which the output of the build is placed.

``LLVM_SRC_ROOT`` (configured)
  Specifies the top directory in which the sources are found.

``LLVM_TARBALL_NAME`` (configured)
  Specifies the name of the distribution tarball to create. This is configured
  from the name of the project and its version number.

``MKDIR`` (defaulted)
  Specifies the path to the ``mkdir`` tool that creates directories.

``ONLY_TOOLS``
  If set, specifies the list of tools to build.

``PLATFORMSTRIPOPTS``
  The options to provide to the linker to specify that a stripped (no symbols)
  executable should be built.
``RANLIB`` (defaulted)
  Specifies the path to the ``ranlib`` tool.

``RM`` (defaulted)
  Specifies the path to the ``rm`` tool.

``SED`` (defaulted)
  Specifies the path to the ``sed`` tool.

``SHLIBEXT`` (configured)
  Provides the filename extension to use for shared libraries.

``TBLGEN`` (defaulted)
  Specifies the path to the ``tblgen`` tool.

``TAR`` (defaulted)
  Specifies the path to the ``tar`` tool.

``ZIP`` (defaulted)
  Specifies the path to the ``zip`` tool.

Readable Variables
------------------

Variables listed in the table below can be used by the user's Makefile but
should not be changed. Changing the value will generally cause the build to go
wrong, so don't do it.

``bindir``
  The directory into which executables will ultimately be installed. This
  value is derived from the ``--prefix`` option given to ``configure``.

``BuildMode``
  The name of the type of build being performed: Debug, Release, or
  Profile.

``bytecode_libdir``
  The directory into which bitcode libraries will ultimately be installed.
  This value is derived from the ``--prefix`` option given to ``configure``.

``ConfigureScriptFLAGS``
  Additional flags given to the ``configure`` script when reconfiguring.

``DistDir``
  The *current* directory for which a distribution copy is being made.

.. _Echo:

``Echo``
  The LLVM Makefile System output command. This provides the ``llvm[n]``
  prefix and starts with ``@`` so the command itself is not printed by
  ``make``.

``EchoCmd``
  Same as `Echo`_ but without the leading ``@``.

``includedir``
  The directory into which include files will ultimately be installed. This
  value is derived from the ``--prefix`` option given to ``configure``.

``libdir``
  The directory into which native libraries will ultimately be installed.
  This value is derived from the ``--prefix`` option given to
  ``configure``.
``LibDir``
  The configuration specific directory into which libraries are placed before
  installation.

``MakefileConfig``
  Full path of the ``Makefile.config`` file.

``MakefileConfigIn``
  Full path of the ``Makefile.config.in`` file.

``ObjDir``
  The configuration and directory specific directory where build objects
  (compilation results) are placed.

``SubDirs``
  The complete list of sub-directories of the current directory as
  specified by other variables.

``Sources``
  The complete list of source files.

``sysconfdir``
  The directory into which configuration files will ultimately be
  installed. This value is derived from the ``--prefix`` option given to
  ``configure``.

``ToolDir``
  The configuration specific directory into which executables are placed
  before they are installed.

``TopDistDir``
  The topmost directory into which the distribution files are copied.

``Verb``
  Use this as the first thing on your build script lines to enable or disable
  verbose mode. It expands to either an ``@`` (quiet mode) or nothing (verbose
  mode).

Internal Variables
------------------

Variables listed below are used by the LLVM Makefile System and considered
internal. You should not use these variables under any circumstances.
.. code-block:: makefile

    Archive
    AR.Flags
    BaseNameSources
    BCLinkLib
    C.Flags
    Compile.C
    CompileCommonOpts
    Compile.CXX
    ConfigStatusScript
    ConfigureScript
    CPP.Flags
    CXX.Flags
    DependFiles
    DestArchiveLib
    DestBitcodeLib
    DestModule
    DestSharedLib
    DestTool
    DistAlways
    DistCheckDir
    DistCheckTop
    DistFiles
    DistName
    DistOther
    DistSources
    DistSubDirs
    DistTarBZ2
    DistTarGZip
    DistZip
    ExtraLibs
    FakeSources
    INCFiles
    InternalTargets
    LD.Flags
    LibName.A
    LibName.BC
    LibName.LA
    LibName.O
    LibTool.Flags
    Link
    LinkModule
    LLVMLibDir
    LLVMLibsOptions
    LLVMLibsPaths
    LLVMToolDir
    LLVMUsedLibs
    LocalTargets
    Module
    ObjectsLO
    ObjectsO
    ObjMakefiles
    ParallelTargets
    PreConditions
    ProjLibsOptions
    ProjLibsPaths
    ProjUsedLibs
    Ranlib
    RecursiveTargets
    SrcMakefiles
    Strip
    StripWarnMsg
    TableGen
    TDFiles
    ToolBuildPath
    TopLevelTargets
    UserTargets
diff --git a/docs/MergeFunctions.rst b/docs/MergeFunctions.rst
index b2f6030edc1cc..f808010f3acf6 100644
--- a/docs/MergeFunctions.rst
+++ b/docs/MergeFunctions.rst
@@ -56,7 +56,7 @@ As a good start point, Kaleidoscope tutorial could be used:
 
 Especially it's important to understand chapter 3 of tutorial:
 
-:doc:`tutorial/LangImpl3`
+:doc:`tutorial/LangImpl03`
 
 Reader also should know how passes work in LLVM, they could use next article
 as a reference and start point here:
@@ -697,7 +697,7 @@ Below is detailed body description.
 If “F” may be overridden
 ------------------------
 As follows from ``mayBeOverridden`` comments: “whether the definition of this
-global may be replaced by something non-equivalent at link time”. If so, thats
+global may be replaced by something non-equivalent at link time”. If so, that's
 ok: we can use alias to *F* instead of *G* or change call instructions itself.
 HasGlobalAliases, removeUsers
diff --git a/docs/NVPTXUsage.rst b/docs/NVPTXUsage.rst
index fc697ca004619..8b8c40f1fd7e7 100644
--- a/docs/NVPTXUsage.rst
+++ b/docs/NVPTXUsage.rst
@@ -39,7 +39,7 @@ declare a function as a kernel function. This metadata is attached to the
 
 .. code-block:: llvm
 
-  !0 = metadata !{<function-ref>, metadata !"kernel", i32 1}
+  !0 = !{<function-ref>, metadata !"kernel", i32 1}
 
 The first parameter is a reference to the kernel function. The following
 example shows a kernel function calling a device function in LLVM IR. The
@@ -54,14 +54,14 @@ function ``@my_kernel`` is callable from host code, but ``@my_fmad`` is not.
 }
 
 define void @my_kernel(float* %ptr) {
-  %val = load float* %ptr
+  %val = load float, float* %ptr
   %ret = call float @my_fmad(float %val, float %val, float %val)
   store float %ret, float* %ptr
   ret void
 }
 
 !nvvm.annotations = !{!1}
-!1 = metadata !{void (float*)* @my_kernel, metadata !"kernel", i32 1}
+!1 = !{void (float*)* @my_kernel, !"kernel", i32 1}
 
 When compiled, the PTX kernel functions are callable by host-side code.
@@ -361,7 +361,7 @@ With programmatic pass pipeline:
 
 .. code-block:: c++
 
-  extern ModulePass *llvm::createNVVMReflectPass(const StringMap<int>& Mapping);
+  extern FunctionPass *llvm::createNVVMReflectPass(const StringMap<int>& Mapping);
 
   StringMap<int> ReflectParams;
   ReflectParams["__CUDA_FTZ"] = 1;
@@ -395,7 +395,7 @@ JIT compiling a PTX string to a device binary:
 .. code-block:: c++
 
   CUmodule module;
-  CUfunction funcion;
+  CUfunction function;
 
   // JIT compile a null-terminated PTX string
   cuModuleLoadData(&module, (void*)PTXString);
@@ -446,13 +446,13 @@ The Kernel
   %id = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
 
   ; Compute pointers into A, B, and C
-  %ptrA = getelementptr float addrspace(1)* %A, i32 %id
-  %ptrB = getelementptr float addrspace(1)* %B, i32 %id
-  %ptrC = getelementptr float addrspace(1)* %C, i32 %id
+  %ptrA = getelementptr float, float addrspace(1)* %A, i32 %id
+  %ptrB = getelementptr float, float addrspace(1)* %B, i32 %id
+  %ptrC = getelementptr float, float addrspace(1)* %C, i32 %id
 
   ; Read A, B
-  %valA = load float addrspace(1)* %ptrA, align 4
-  %valB = load float addrspace(1)* %ptrB, align 4
+  %valA = load float, float addrspace(1)* %ptrA, align 4
+  %valB = load float, float addrspace(1)* %ptrB, align 4
 
   ; Compute C = A + B
   %valC = fadd float %valA, %valB
@@ -464,9 +464,9 @@ The Kernel
 }
 
 !nvvm.annotations = !{!0}
-!0 = metadata !{void (float addrspace(1)*,
-                      float addrspace(1)*,
-                      float addrspace(1)*)* @kernel, metadata !"kernel", i32 1}
+!0 = !{void (float addrspace(1)*,
+             float addrspace(1)*,
+             float addrspace(1)*)* @kernel, !"kernel", i32 1}
 
 We can use the LLVM ``llc`` tool to directly run the NVPTX code generator:
@@ -566,7 +566,7 @@ Intrinsic CUDA Equivalent
 ``i32 @llvm.nvvm.read.ptx.sreg.ctaid.{x,y,z}``   blockIdx.{x,y,z}
 ``i32 @llvm.nvvm.read.ptx.sreg.ntid.{x,y,z}``    blockDim.{x,y,z}
 ``i32 @llvm.nvvm.read.ptx.sreg.nctaid.{x,y,z}``  gridDim.{x,y,z}
-``void @llvm.cuda.syncthreads()``                __syncthreads()
+``void @llvm.nvvm.barrier0()``                   __syncthreads()
 ================================================ ====================
@@ -608,16 +608,16 @@ as a PTX `kernel` function. These metadata nodes take the form:
 
 .. code-block:: text
 
-  metadata !{<function ref>, metadata !"kernel", i32 1}
+  !{<function ref>, metadata !"kernel", i32 1}
 
 For the previous example, we have:
 .. code-block:: llvm
 
   !nvvm.annotations = !{!0}
-  !0 = metadata !{void (float addrspace(1)*,
-                        float addrspace(1)*,
-                        float addrspace(1)*)* @kernel, metadata !"kernel", i32 1}
+  !0 = !{void (float addrspace(1)*,
+               float addrspace(1)*,
+               float addrspace(1)*)* @kernel, !"kernel", i32 1}
 
 Here, we have a single metadata declaration in ``nvvm.annotations``. This
 metadata annotates our ``@kernel`` function with the ``kernel`` attribute.
@@ -830,13 +830,13 @@ Libdevice provides an ``__nv_powf`` function that we will use.
   %id = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
 
   ; Compute pointers into A, B, and C
-  %ptrA = getelementptr float addrspace(1)* %A, i32 %id
-  %ptrB = getelementptr float addrspace(1)* %B, i32 %id
-  %ptrC = getelementptr float addrspace(1)* %C, i32 %id
+  %ptrA = getelementptr float, float addrspace(1)* %A, i32 %id
+  %ptrB = getelementptr float, float addrspace(1)* %B, i32 %id
+  %ptrC = getelementptr float, float addrspace(1)* %C, i32 %id
 
   ; Read A, B
-  %valA = load float addrspace(1)* %ptrA, align 4
-  %valB = load float addrspace(1)* %ptrB, align 4
+  %valA = load float, float addrspace(1)* %ptrA, align 4
+  %valB = load float, float addrspace(1)* %ptrB, align 4
 
   ; Compute C = pow(A, B)
   %valC = call float @__nv_powf(float %valA, float %valB)
@@ -848,9 +848,9 @@ Libdevice provides an ``__nv_powf`` function that we will use.
 }
 
 !nvvm.annotations = !{!0}
-!0 = metadata !{void (float addrspace(1)*,
-                      float addrspace(1)*,
-                      float addrspace(1)*)* @kernel, metadata !"kernel", i32 1}
+!0 = !{void (float addrspace(1)*,
+             float addrspace(1)*,
+             float addrspace(1)*)* @kernel, !"kernel", i32 1}
 
 To compile this kernel, we perform the following steps:
diff --git a/docs/Passes.rst b/docs/Passes.rst
index cc0a853bc4deb..77461f3c52d9b 100644
--- a/docs/Passes.rst
+++ b/docs/Passes.rst
@@ -253,14 +253,6 @@ This pass decodes the debug info metadata in a module and prints in a
 For example, run this pass from ``opt`` along with the ``-analyze`` option, and
 it'll print to standard output.
 
-``-no-aa``: No Alias Analysis (always returns 'may' alias)
-----------------------------------------------------------
-
-This is the default implementation of the Alias Analysis interface. It always
-returns "I don't know" for alias queries. NoAA is unlike other alias analysis
-implementations, in that it does not chain to a previous analysis. As such it
-doesn't follow many of the rules that other alias analyses must.
-
 ``-postdomfrontier``: Post-Dominance Frontier Construction
 ----------------------------------------------------------
@@ -955,7 +947,7 @@ that this should make CFG hacking much easier.
 To make later hacking easier, the entry block is split into two, such that all
 introduced ``alloca`` instructions (and nothing else) are in the entry block.
 
-``-scalarrepl``: Scalar Replacement of Aggregates (DT)
+``-sroa``: Scalar Replacement of Aggregates
 ------------------------------------------------------
 
 The well-known scalar replacement of aggregates transformation. This transform
@@ -964,12 +956,6 @@ individual ``alloca`` instructions for each member if possible. Then, if
 possible, it transforms the individual ``alloca`` instructions into nice clean
 scalar SSA form.
-This combines a simple scalar replacement of aggregates algorithm with the
-:ref:`mem2reg <passes-mem2reg>` algorithm because they often interact,
-especially for C++ programs. As such, iterating between ``scalarrepl``, then
-:ref:`mem2reg <passes-mem2reg>` until we run out of things to promote works
-well.
-
 .. _passes-sccp:

``-sccp``: Sparse Conditional Constant Propagation

diff --git a/docs/Phabricator.rst b/docs/Phabricator.rst
index af1e4429fda9b..04319a9a378f6 100644
--- a/docs/Phabricator.rst
+++ b/docs/Phabricator.rst
@@ -127,37 +127,80 @@ a change from Phabricator.

Committing a change
-------------------

-Arcanist can manage the commit transparently. It will retrieve the description,
-reviewers, the ``Differential Revision``, etc from the review and commit it to the repository.
+Once a patch has been reviewed and approved on Phabricator, it can be
+committed to trunk. There are multiple workflows to achieve this. Whichever
+method you follow, it is recommended that your commit message end with the
+line:
+
+::
+
+  Differential Revision: <URL>
+
+where ``<URL>`` is the URL for the code review, starting with
+``http://reviews.llvm.org/``.
+
+This allows people reading the version history to see the review for
+context. It also allows Phabricator to detect the commit, close the
+review, and add a link from the review to the commit.
+
+Note that if you use the Arcanist tool the ``Differential Revision`` line will
+be added automatically. If you don't want to use Arcanist, you can add the
+``Differential Revision`` line (as the last line) to the commit message
+yourself.
+
+Using the Arcanist tool can simplify the process of committing reviewed code,
+as it will retrieve the reviewers, the ``Differential Revision``, etc. from the
+review and place them in the commit message. Several methods of using Arcanist
+to commit code are given below. If you do not wish to use Arcanist, simply
+commit the reviewed patch as you would normally.
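For those committing without Arcanist, adding the ``Differential Revision`` line by hand might look like the following sketch (the commit message shown is a hypothetical placeholder, and ``<Revision>`` stands for the actual review number):

```shell
svn commit -m "Teach FooPass to handle bar instructions.

Differential Revision: http://reviews.llvm.org/D<Revision>"
```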
+
+Note that if you commit the change without using Arcanist and forget to add the
+``Differential Revision`` line to your commit message, it is recommended
+that you close the review manually. In the web UI, under "Leap Into Action" put
+the SVN revision number in the Comment, set the Action to "Close Revision" and
+click Submit. Note that the review must have been Accepted first.
+
+Subversion and Arcanist
+^^^^^^^^^^^^^^^^^^^^^^^
+
+On a clean Subversion working copy run the following (where ``<Revision>`` is
+the Phabricator review number):

::

  arc patch D<Revision>
  arc commit --revision D<Revision>

+The first command will take the latest version of the reviewed patch and apply
+it to the working copy. The second command will commit this revision to trunk.
+
+git-svn and Arcanist
+^^^^^^^^^^^^^^^^^^^^

-When committing a change that has been reviewed using
-Phabricator, the convention is for the commit message to end with the
-line:
+This presumes that the git repository has been configured as described in
+:ref:`developers-work-with-git-svn`.
+
+On a clean Git repository on an up-to-date ``master`` branch run the
+following (where ``<Revision>`` is the Phabricator review number):

::

-  Differential Revision: <URL>
+  arc patch D<Revision>

-where ``<URL>`` is the URL for the code review, starting with
-``http://reviews.llvm.org/``.
-Note that Arcanist will add this automatically.
+This will create a new branch called ``arcpatch-D<Revision>`` based on the
+current ``master`` and will create a commit corresponding to ``D<Revision>``
+with a commit message derived from information in the Phabricator review.
+
+Check that you are happy with the commit message and amend it if necessary.
+Now switch to the ``master`` branch, add the new commit to it, and commit it to trunk.
This +can be done by running the following: + +:: + + git checkout master + git merge --ff-only arcpatch-D<Revision> + git svn dcommit -This allows people reading the version history to see the review for -context. This also allows Phabricator to detect the commit, close the -review, and add a link from the review to the commit. -If you use ``git`` or ``svn`` to commit the change and forget to add the line -to your commit message, you should close the review manually. In the web UI, -under "Leap Into Action" put the SVN revision number in the Comment, set the -Action to "Close Revision" and click Submit. Note the review must have been -Accepted first. Abandoning a change ------------------- diff --git a/docs/ProgrammersManual.rst b/docs/ProgrammersManual.rst index 665e30aeb6762..030637048bfb2 100644 --- a/docs/ProgrammersManual.rst +++ b/docs/ProgrammersManual.rst @@ -263,8 +263,193 @@ almost never be stored or mentioned directly. They are intended solely for use when defining a function which should be able to efficiently accept concatenated strings. +.. _error_apis: + +Error handling +-------------- + +Proper error handling helps us identify bugs in our code, and helps end-users +understand errors in their tool usage. Errors fall into two broad categories: +*programmatic* and *recoverable*, with different strategies for handling and +reporting. + +Programmatic Errors +^^^^^^^^^^^^^^^^^^^ + +Programmatic errors are violations of program invariants or API contracts, and +represent bugs within the program itself. Our aim is to document invariants, and +to abort quickly at the point of failure (providing some basic diagnostic) when +invariants are broken at runtime. + +The fundamental tools for handling programmatic errors are assertions and the +llvm_unreachable function. Assertions are used to express invariant conditions, +and should include a message describing the invariant: + +.. 
code-block:: c++ + + assert(isPhysReg(R) && "All virt regs should have been allocated already."); + +The llvm_unreachable function can be used to document areas of control flow +that should never be entered if the program invariants hold: + +.. code-block:: c++ + + enum { Foo, Bar, Baz } X = foo(); + + switch (X) { + case Foo: /* Handle Foo */; break; + case Bar: /* Handle Bar */; break; + default: + llvm_unreachable("X should be Foo or Bar here"); + } + +Recoverable Errors +^^^^^^^^^^^^^^^^^^ + +Recoverable errors represent an error in the program's environment, for example +a resource failure (a missing file, a dropped network connection, etc.), or +malformed input. These errors should be detected and communicated to a level of +the program where they can be handled appropriately. Handling the error may be +as simple as reporting the issue to the user, or it may involve attempts at +recovery. + +Recoverable errors are modeled using LLVM's ``Error`` scheme. This scheme +represents errors using function return values, similar to classic C integer +error codes, or C++'s ``std::error_code``. However, the ``Error`` class is +actually a lightweight wrapper for user-defined error types, allowing arbitrary +information to be attached to describe the error. This is similar to the way C++ +exceptions allow throwing of user-defined types. + +Success values are created by calling ``Error::success()``: + +.. code-block:: c++ + + Error foo() { + // Do something. + // Return success. + return Error::success(); + } + +Success values are very cheap to construct and return - they have minimal +impact on program performance. + +Failure values are constructed using ``make_error<T>``, where ``T`` is any class +that inherits from the ErrorInfo utility: + +.. 
code-block:: c++
+
+  class MyError : public ErrorInfo<MyError> {
+  public:
+    MyError(std::string Msg) : Msg(Msg) {}
+    void log(OStream &OS) const override { OS << "MyError - " << Msg; }
+    static char ID;
+  private:
+    std::string Msg;
+  };
+
+  char MyError::ID = 0; // In MyError.cpp
+
+  Error bar() {
+    if (checkErrorCondition)
+      return make_error<MyError>("Error condition detected");
+
+    // No error - proceed with bar.
+
+    // Return success value.
+    return Error::success();
+  }
+
+Error values can be implicitly converted to bool: true for error, false for
+success, enabling the following idiom:
+
+.. code-block:: c++
+
+  Error mayFail();
+
+  Error foo() {
+    if (auto Err = mayFail())
+      return Err;
+    // Success! We can proceed.
+    ...
+
+For functions that can fail but need to return a value, the ``Expected<T>``
+utility can be used. Values of this type can be constructed with either a
+``T`` or an ``Error``. ``Expected<T>`` values are also implicitly convertible
+to boolean, but with the opposite convention to ``Error``: true for success,
+false for error. On success, the ``T`` value can be accessed via the
+dereference operator. On failure, the ``Error`` value can be extracted using
+the ``takeError()`` method. Idiomatic usage looks like:
+
+.. code-block:: c++
+
+  Expected<float> parseAndSquareRoot(IStream &IS) {
+    float f;
+    IS >> f;
+    if (f < 0)
+      return make_error<FloatingPointError>(...);
+    return sqrt(f);
+  }
+
+  Error foo(IStream &IS) {
+    if (auto SqrtOrErr = parseAndSquareRoot(IS)) {
+      float Sqrt = *SqrtOrErr;
+      // ...
+    } else
+      return SqrtOrErr.takeError();
+  }
+
+All Error instances, whether success or failure, must be either checked or
+moved from (via std::move or a return) before they are destructed. Accidentally
+discarding an unchecked error will cause a program abort at the point where the
+unchecked value's destructor is run, making it easy to identify and fix
+violations of this rule.
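The check-before-destruction discipline described above can be modeled with a deliberately simplified, self-contained toy class. Note this is *not* LLVM's actual ``Error`` implementation; the names here are invented for illustration only:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

// Toy model of the checked-error discipline (NOT LLVM's real Error class):
// the destructor aborts the program if the value was never checked or moved.
class ToyError {
public:
  static ToyError success() { return ToyError(std::string()); }
  explicit ToyError(std::string Msg) : Msg(std::move(Msg)) {}

  ToyError(ToyError &&Other)
      : Msg(std::move(Other.Msg)), Checked(Other.Checked) {
    Other.Checked = true; // A moved-from value no longer needs checking.
  }

  ~ToyError() {
    if (!Checked)
      std::abort(); // Unchecked error value: fail fast at destruction.
  }

  // Testing the value counts as checking it; true means failure.
  explicit operator bool() {
    Checked = true;
    return !Msg.empty();
  }

private:
  std::string Msg;
  bool Checked = false;
};

ToyError mayFail(bool Fail) {
  if (Fail)
    return ToyError("error condition detected");
  return ToyError::success();
}
```

Destroying a ``ToyError`` without ever testing it aborts, which mirrors how the real scheme turns a forgotten check into a loud failure at the exact point of the leak.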
+
+Success values are considered checked once they have been tested (by invoking
+the boolean conversion operator):
+
+.. code-block:: c++
+
+  if (auto Err = canFail(...))
+    return Err; // Failure value - move error to caller.
+
+  // Safe to continue: Err was checked.
+
+In contrast, the following code will always cause an abort, regardless of the
+return value of ``canFail``:
+
+.. code-block:: c++
+
+  canFail();
+  // Program will always abort here, even if canFail() returns Success, since
+  // the value is not checked.
+
+Failure values are considered checked once a handler for the error type has
+been activated:
+
+.. code-block:: c++
+
+  auto Err = canFail(...);
+  if (auto Err2 =
+        handleErrors(std::move(Err),
+          [](std::unique_ptr<MyError> M) {
+            // Try to handle 'M'. If successful, return a success value from
+            // the handler.
+            if (tryToHandle(M))
+              return Error::success();
+
+            // We failed to handle 'M' - return it from the handler.
+            // This value will be passed back from handleErrors and
+            // wind up in Err2, where it will be returned from this function.
+            return Error(std::move(M));
+          }))
+    return Err2;
+
+More information on Error and its related utilities can be found in the
+``Error.h`` header file.
+
 .. _function_apis:

Passing functions and other callable objects
--------------------------------------------

@@ -295,7 +480,7 @@ The ``function_ref`` class template
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``function_ref``
-(`doxygen <http://llvm.org/doxygen/classllvm_1_1function_ref.html>`__) class
+(`doxygen <http://llvm.org/docs/doxygen/html/classllvm_1_1function__ref_3_01Ret_07Params_8_8_8_08_4.html>`__) class
template represents a reference to a callable object, templated over the type
of the callable. This is a good choice for passing a callback to a function,
if you don't need to hold onto the callback after the function returns. In this
@@ -420,7 +605,7 @@ system in place to ensure that names do not conflict.
If two different modules use the same string, they will all be turned on when the name is specified. This allows, for example, all debug information for instruction scheduling to be enabled with ``-debug-only=InstrSched``, even if the source lives in multiple -files. The name must not include a comma (,) as that is used to seperate the +files. The name must not include a comma (,) as that is used to separate the arguments of the ``-debug-only`` option. For performance reasons, -debug-only is not available in optimized build @@ -1135,7 +1320,7 @@ llvm/ADT/StringSet.h ``StringSet`` is a thin wrapper around :ref:`StringMap\<char\> <dss_stringmap>`, and it allows efficient storage and retrieval of unique strings. -Functionally analogous to ``SmallSet<StringRef>``, ``StringSet`` also suports +Functionally analogous to ``SmallSet<StringRef>``, ``StringSet`` also supports iteration. (The iterator dereferences to a ``StringMapEntry<char>``, so you need to call ``i->getKey()`` to access the item of the StringSet.) On the other hand, ``StringSet`` doesn't support range-insertion and @@ -1696,7 +1881,7 @@ pointer from an iterator is very straight-forward. Assuming that ``i`` is a However, the iterators you'll be working with in the LLVM framework are special: they will automatically convert to a ptr-to-instance type whenever they need to. -Instead of derferencing the iterator and then taking the address of the result, +Instead of dereferencing the iterator and then taking the address of the result, you can simply assign the iterator to the proper pointer type and you get the dereference and address-of operation as a result of the assignment (behind the scenes, this is a result of overloading casting mechanisms). Thus the second @@ -2036,7 +2221,7 @@ sequence of instructions that form a ``BasicBlock``: CallInst* callTwo = Builder.CreateCall(...); Value* result = Builder.CreateMul(callOne, callTwo); - See :doc:`tutorial/LangImpl3` for a practical use of the ``IRBuilder``. 
+ See :doc:`tutorial/LangImpl03` for a practical use of the ``IRBuilder``.

.. _schanges_deleting:

@@ -2234,11 +2419,6 @@ determine what context they belong to by looking at their own ``Type``. If you
are adding new entities to LLVM IR, please try to maintain this interface
design.

-For clients that do *not* require the benefits of isolation, LLVM provides a
-convenience API ``getGlobalContext()``. This returns a global, lazily
-initialized ``LLVMContext`` that may be used in situations where isolation is
-not a concern.
-
.. _jitthreading:

Threads and the JIT

diff --git a/docs/README.txt b/docs/README.txt
index 31764b2951b2f..6c6e5b90ecf27 100644
--- a/docs/README.txt
+++ b/docs/README.txt
@@ -11,12 +11,13 @@ updated after every commit. Manpage output is also supported, see below.

If you instead would like to generate and view the HTML locally, install
Sphinx <http://sphinx-doc.org/> and then do:

-    cd docs/
-    make -f Makefile.sphinx
-    $BROWSER _build/html/index.html
+    cd <build-dir>
+    cmake -DLLVM_ENABLE_SPHINX=true -DSPHINX_OUTPUT_HTML=true <src-dir>
+    make -j3 docs-llvm-html
+    $BROWSER <build-dir>/docs/html/index.html

The mapping between reStructuredText files and generated documentation is
-`docs/Foo.rst` <-> `_build/html/Foo.html` <-> `http://llvm.org/docs/Foo.html`.
+`docs/Foo.rst` <-> `<build-dir>/docs/html/Foo.html` <-> `http://llvm.org/docs/Foo.html`.

If you are interested in writing new documentation, you will want to read
`SphinxQuickstartTemplate.rst` which will get you writing documentation

@@ -29,14 +30,15 @@ Manpage Output

Building the manpages is similar to building the HTML documentation. The
primary difference is to use the `man` makefile target, instead of the
default (which is `html`). Sphinx then produces the man pages in the
-directory `_build/man/`.
+directory `<build-dir>/docs/man/`.
-    cd docs/
-    make -f Makefile.sphinx man
-    man -l _build/man/FileCheck.1
+    cd <build-dir>
+    cmake -DLLVM_ENABLE_SPHINX=true -DSPHINX_OUTPUT_MAN=true <src-dir>
+    make -j3 docs-llvm-man
+    man -l <build-dir>/docs/man/FileCheck.1

The correspondence between .rst files and man pages is
-`docs/CommandGuide/Foo.rst` <-> `_build/man/Foo.1`.
+`docs/CommandGuide/Foo.rst` <-> `<build-dir>/docs/man/Foo.1`.
These .rst files are also included during HTML generation so they are also
viewable online (as noted above) at e.g.
`http://llvm.org/docs/CommandGuide/Foo.html`.

diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst
index a25429734bbf1..54f2d530b1e7d 100644
--- a/docs/ReleaseNotes.rst
+++ b/docs/ReleaseNotes.rst
@@ -1,16 +1,21 @@
======================
-LLVM 3.8 Release Notes
+LLVM 3.9 Release Notes
======================

.. contents::
    :local:

+.. warning::
+   These are in-progress notes for the upcoming LLVM 3.9 release. You may
+   prefer the `LLVM 3.8 Release Notes <http://llvm.org/releases/3.8.0/docs
+   /ReleaseNotes.html>`_.
+
Introduction
============

This document contains the release notes for the LLVM Compiler Infrastructure,
-release 3.8. Here we describe the status of LLVM, including major improvements
+release 3.9. Here we describe the status of LLVM, including major improvements
from the previous release, improvements in various subprojects of LLVM, and
some of the current users of the code. All LLVM releases may be downloaded
from the `LLVM releases web site <http://llvm.org/releases/>`_.

@@ -21,272 +26,146 @@ have questions or comments, the `LLVM Developer's Mailing List
<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send
them.

+Note that if you are reading this file from a Subversion checkout or the main
+LLVM web page, this document applies to the *next* release, not the current
+one. To see the release notes for a specific release, please see the `releases
+page <http://llvm.org/releases/>`_.
+ Non-comprehensive list of changes in this release ================================================= -* With this release, the minimum Windows version required for running LLVM is - Windows 7. Earlier versions, including Windows Vista and XP are no longer - supported. - -* With this release, the autoconf build system is deprecated. It will be removed - in the 3.9 release. Please migrate to using CMake. For more information see: - `Building LLVM with CMake <CMake.html>`_ - -* We have documented our C API stability guarantees for both development and - release branches, as well as documented how to extend the C API. Please see - the `developer documentation <DeveloperPolicy.html#c-api-changes>`_ for more - information. +* The LLVMContext gains a new runtime check (see + LLVMContext::discardValueNames()) that can be set to discard Value names + (other than GlobalValue). This is intended to be used in release builds by + clients that are interested in saving CPU/memory as much as possible. -* The C API function ``LLVMLinkModules`` is deprecated. It will be removed in the - 3.9 release. Please migrate to ``LLVMLinkModules2``. Unlike the old function the - new one +* There is no longer a "global context" available in LLVM, except for the C API. - * Doesn't take an unused parameter. - * Destroys the source instead of only damaging it. - * Does not record a message. Use the diagnostic handler instead. +* .. note about autoconf build having been removed. -* The C API functions ``LLVMParseBitcode``, ``LLVMParseBitcodeInContext``, - ``LLVMGetBitcodeModuleInContext`` and ``LLVMGetBitcodeModule`` have been deprecated. - They will be removed in 3.9. Please migrate to the versions with a 2 suffix. - Unlike the old ones the new ones do not record a diagnostic message. Use - the diagnostic handler instead. +* .. note about C API functions LLVMParseBitcode, + LLVMParseBitcodeInContext, LLVMGetBitcodeModuleInContext and + LLVMGetBitcodeModule having been removed. 
LLVMGetTargetMachineData has been + removed (use LLVMGetDataLayout instead). -* The deprecated C APIs ``LLVMGetBitcodeModuleProviderInContext`` and - ``LLVMGetBitcodeModuleProvider`` have been removed. +* The C API function LLVMLinkModules has been removed. -* The deprecated C APIs ``LLVMCreateExecutionEngine``, ``LLVMCreateInterpreter``, - ``LLVMCreateJITCompiler``, ``LLVMAddModuleProvider`` and ``LLVMRemoveModuleProvider`` - have been removed. +* The C API function LLVMAddTargetData has been removed. -* With this release, the C API headers have been reorganized to improve build - time. Type specific declarations have been moved to Type.h, and error - handling routines have been moved to ErrorHandling.h. Both are included in - Core.h so nothing should change for projects directly including the headers, - but transitive dependencies may be affected. +* The C API function LLVMGetDataLayout is deprecated + in favor of LLVMGetDataLayoutStr. -* llvm-ar now supports thin archives. +* The C API enum LLVMAttribute and associated API is deprecated in favor of + the new LLVMAttributeRef API. The deprecated functions are + LLVMAddFunctionAttr, LLVMAddTargetDependentFunctionAttr, + LLVMRemoveFunctionAttr, LLVMGetFunctionAttr, LLVMAddAttribute, + LLVMRemoveAttribute, LLVMGetAttribute, LLVMAddInstrAttribute, + LLVMRemoveInstrAttribute and LLVMSetInstrParamAlignment. -* llvm doesn't produce ``.data.rel.ro.local`` or ``.data.rel`` sections anymore. +* ``TargetFrameLowering::eliminateCallFramePseudoInstr`` now returns an + iterator to the next instruction instead of ``void``. Targets that previously + did ``MBB.erase(I); return;`` now probably want ``return MBB.erase(I);``. -* Aliases to ``available_externally`` globals are now rejected by the verifier. +* ``SelectionDAGISel::Select`` now returns ``void``. Out of tree targets will + need to be updated to replace the argument node and remove any dead nodes in + cases where they currently return an ``SDNode *`` from this interface. 
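As a rough illustration of the kind of update the ``Select`` change requires (a hypothetical out-of-tree target sketched from the description above, not code from any in-tree backend; elided logic is marked with ``...``):

.. code-block:: c++

  // Before: Select returned the replacement node (or nullptr).
  SDNode *FooDAGToDAGISel::Select(SDNode *N) {
    ...
    return CurDAG->getMachineNode(Foo::ADDri, DL, VT, Ops);
  }

  // After: Select returns void; the target replaces the node itself.
  void FooDAGToDAGISel::Select(SDNode *N) {
    ...
    ReplaceNode(N, CurDAG->getMachineNode(Foo::ADDri, DL, VT, Ops));
  }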
-* The IR Linker has been split into ``IRMover`` that moves bits from one module to - another and Linker proper that decides what to link. +* Introduction of ThinLTO: [FIXME: needs to be documented more extensively in + /docs/ ; ping Mehdi/Teresa before the release if not done] -* Support for dematerializing has been dropped. +* Raised the minimum required CMake version to 3.4.3. -* ``RegisterScheduler::setDefault`` was removed. Targets that used to call into the - command line parser to set the ``DAGScheduler``, and that don't have enough - control with ``setSchedulingPreference``, should look into overriding the - ``SubTargetHook`` "``getDAGScheduler()``". +.. NOTE + For small 1-3 sentence descriptions, just add an entry at the end of + this list. If your description won't fit comfortably in one bullet + point (e.g. maybe you would like to give an example of the + functionality, or simply have a lot to talk about), see the `NOTE` below + for adding a new subsection. -* ``ilist_iterator<T>`` no longer has implicit conversions to and from ``T*``, - since ``ilist_iterator<T>`` may be pointing at the sentinel (which is usually - not of type ``T`` at all). To convert from an iterator ``I`` to a pointer, - use ``&*I``; to convert from a pointer ``P`` to an iterator, use - ``P->getIterator()``. Alternatively, explicit conversions via - ``static_cast<T>(U)`` are still available. +* ... next change ... -* ``ilist_node<T>::getNextNode()`` and ``ilist_node<T>::getPrevNode()`` now - fail at compile time when the node cannot access its parent list. - Previously, when the sentinel was was an ``ilist_half_node<T>``, this API - could return the sentinel instead of ``nullptr``. Frustrated callers should - be updated to use ``iplist<T>::getNextNode(T*)`` instead. Alternatively, if - the node ``N`` is guaranteed not to be the last in the list, it is safe to - call ``&*++N->getIterator()`` directly. +.. 
NOTE + If you would like to document a larger change, then you can add a + subsection about it right here. You can copy the following boilerplate + and un-indent it (the indentation causes it to be inside this comment). -* The `Kaleidoscope tutorials <tutorial/index.html>`_ have been updated to use - the ORC JIT APIs. + Special New Feature + ------------------- -* ORC now has a basic set of C bindings. + Makes programs 10x faster by doing Special New Thing. -* Optional support for linking clang and the LLVM tools with a single libLLVM - shared library. To enable this, pass ``-DLLVM_LINK_LLVM_DYLIB=ON`` to CMake. - See `Building LLVM with CMake`_ for more details. +Changes to the LLVM IR +---------------------- -* The optimization to move the prologue and epilogue of functions in colder - code path (shrink-wrapping) is now enabled by default. +* New intrinsics ``llvm.masked.load``, ``llvm.masked.store``, + ``llvm.masked.gather`` and ``llvm.masked.scatter`` were introduced to the + LLVM IR to allow selective memory access for vector data types. -* A new target-independent gcc-compatible emulated Thread Local Storage mode - is added. When ``-femultated-tls`` flag is used, all accesses to TLS - variables are converted to calls to ``__emutls_get_address`` in the runtime - library. - -* MSVC-compatible exception handling has been completely overhauled. New - instructions have been introduced to facilitate this: - `New exception handling instructions <ExceptionHandling.html#new-exception-handling-instructions>`_. - While we have done our best to test this feature thoroughly, it would - not be completely surprising if there were a few lingering issues that - early adopters might bump into. +Changes to LLVM's IPO model +--------------------------- +LLVM no longer does inter-procedural analysis and optimization (except +inlining) on functions with comdat linkage. 
Doing IPO over such
+functions is unsound because the implementation the linker chooses at
+link-time may be differently optimized than the one that was visible
+during optimization, and may have arbitrarily different observable
+behavior. See `PR26774 <http://llvm.org/PR26774>`_ for more details.

-Changes to the ARM Backends
----------------------------
+Changes to the ARM Backend
+--------------------------

-During this release the AArch64 target has:
-
-* Added support for more sanitizers (MSAN, TSAN) and made them compatible with
-  all VMA kernel configurations (currently tested on 39 and 42 bits).
-* Gained initial LLD support in the new ELF back-end
-* Extended the Load/Store optimiser and cleaned up some of the bad decisions
-  made earlier.
-* Expanded LLDB support, including watchpoints, native building, Renderscript,
-  LLDB-server, debugging 32-bit applications.
-* Added support for the ``Exynos M1`` chip.
-
-During this release the ARM target has:
-
-* Gained massive performance improvements on embedded benchmarks due to finally
-  running the stride vectorizer in full form, incrementing the performance gains
-  that we already had in the previous releases with limited stride vectorization.
-* Expanded LLDB support, including watchpoints, unwind tables
-* Extended the Load/Store optimiser and cleaned up some of the bad decisions
-  made earlier.
-* Simplified code generation for global variable addresses in ELF, resulting in
-  a significant (4% in Chromium) reduction in code size.
-* Gained some additional code size improvements, though there's still a long road
-  ahead, especially for older cores.
-* Added some EABI floating point comparison functions to Compiler-RT
-* Added support for Windows+GNU triple, ``+features`` in ``-mcpu``/``-march`` options.
+ During this release ...

Changes to the MIPS Target
--------------------------
See below for - more information -* Added support for the ``P5600`` processor. -* Added support for the ``interrupt`` attribute for MIPS32R2 and later. This - attribute will generate a function which can be used as a interrupt handler - on bare metal MIPS targets using the static relocation model. -* Added support for the ``ERETNC`` instruction found in MIPS32R5 and later. -* Added support for OpenCL. See http://portablecl.org/. - -* Address spaces 1 to 255 are now reserved for software use and conversions - between them are no-op casts. - -* Removed the ``mips16`` value for the ``-mcpu`` option since it is an :abbr:`ASE - (Application Specific Extension)` and not a processor. If you were using this, - please specify another CPU and use ``-mips16`` to enable MIPS16. -* Removed ``copy_u.w`` from 32-bit MSA and ``copy_u.d`` from 64-bit MSA since - they have been removed from the MSA specification due to forward compatibility - issues. For example, 32-bit MSA code containing ``copy_u.w`` would behave - differently on a 64-bit processor supporting MSA. The corresponding intrinsics - are still available and may expand to ``copy_s.[wd]`` where this is - appropriate for forward compatibility purposes. -* Relaxed the ``-mnan`` option to allow ``-mnan=2008`` on MIPS32R2/MIPS64R2 for - compatibility with GCC. -* Made MIPS64R6 the default CPU for 64-bit Android triples. - -The MIPS target has also fixed various bugs including the following notable -fixes: - -* Fixed reversed operands on ``mthi``/``mtlo`` in the DSP :abbr:`ASE - (Application Specific Extension)`. -* The code generator no longer uses ``jal`` for calls to absolute immediate - addresses. -* Disabled fast instruction selection on MIPS32R6 and MIPS64R6 since this is not - yet supported. -* Corrected addend for ``R_MIPS_HI16`` and ``R_MIPS_PCHI16`` in MCJIT -* The code generator no longer crashes when handling subregisters of an 64-bit - FPU register with undefined value. 
-* The code generator no longer attempts to use ``$zero`` for operands that do - not permit ``$zero``. -* Corrected the opcode used for ``ll``/``sc`` when using MIPS32R6/MIPS64R6 and - the Integrated Assembler. -* Added support for atomic load and atomic store. -* Corrected debug info when dynamically re-aligning the stack. - -We have made a large number of improvements to the integrated assembler for -MIPS. In this release, the integrated assembler isn't quite production-ready -since there are a few known issues related to bare-metal support, checking -immediates on instructions, and the N32/N64 ABI's. However, the current support -should be sufficient for many users of the O32 ABI, particularly those targeting -MIPS32 on Linux or bare-metal MIPS32. - -If you would like to try the integrated assembler, please use -``-fintegrated-as``. + During this release ... + Changes to the PowerPC Target ----------------------------- -There are numerous improvements to the PowerPC target in this release: - -* Shrink wrapping optimization has been enabled for PowerPC Little Endian - -* Direct move instructions are used when converting scalars to vectors - -* Thread Sanitizer (TSAN) is now supported for PowerPC - -* New MI peephole pass to clean up redundant XXPERMDI instructions - -* Add branch hints to highly biased branch instructions (code reaching - unreachable terminators and exceptional control flow constructs) - -* Promote boolean return values to integer to prevent excessive usage of - condition registers - -* Additional vector APIs for vector comparisons and vector merges have been - added to altivec.h - -* Many bugs have been identified and fixed + Moved some optimizations from O3 to O2 (D18562) +* Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi Changes to the X86 Target ------------------------------ - -* TLS is enabled for Cygwin as emutls. - -* Smaller code for materializing 32-bit 1 and -1 constants at ``-Os``. 
+------------------------- -* More efficient code for wide integer compares. (E.g. 64-bit compares - on 32-bit targets.) +* LLVM now supports the Intel CPU codenamed Skylake Server with AVX-512 + extensions using ``-march=skylake-avx512``. The switch enables the + ISA extensions AVX-512{F, CD, VL, BW, DQ}. -* Tail call support for ``thiscall``, ``stdcall``, ``vectorcall``, and - ``fastcall`` functions. +* LLVM now supports the Intel CPU codenamed Knights Landing with AVX-512 + extensions using ``-march=knl``. The switch enables the ISA extensions + AVX-512{F, CD, ER, PF}. -Changes to the Hexagon Target +Changes to the AMDGPU Target ----------------------------- -In addition to general code size and performance improvements, Hexagon target -now has basic support for Hexagon V60 architecture and Hexagon Vector -Extensions (HVX). + * Mesa 11.0.x is no longer supported -Changes to the AVR Target -------------------------- - -Slightly less than half of the AVR backend has been merged in at this point. It is still -missing a number large parts which cause it to be unusable, but is well on the -road to being completely merged and workable. Changes to the OCaml bindings ----------------------------- -* The ocaml function link_modules has been replaced with link_modules' which - uses LLVMLinkModules2. + During this release ... + +Support for attribute 'notail' has been added +--------------------------------------------- +This marker prevents optimization passes from adding 'tail' or +'musttail' markers to a call. It is used to prevent tail call +optimization from being performed on the call. -External Open Source Projects Using LLVM 3.8 +External Open Source Projects Using LLVM 3.9 ============================================ An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. This section lists some of the -projects that have already been updated to work with LLVM 3.8. 
+projects that have already been updated to work with LLVM 3.9. -LDC - the LLVM-based D compiler -------------------------------- - -`D <http://dlang.org>`_ is a language with C-like syntax and static typing. It -pragmatically combines efficiency, control, and modeling power, with safety and -programmer productivity. D supports powerful concepts like Compile-Time Function -Execution (CTFE) and Template Meta-Programming, provides an innovative approach -to concurrency and offers many classical paradigms. - -`LDC <http://wiki.dlang.org/LDC>`_ uses the frontend from the reference compiler -combined with LLVM as backend to produce efficient native code. LDC targets -x86/x86_64 systems like Linux, OS X and Windows and also PowerPC (32/64 bit) -and ARM. Ports to other architectures like AArch64 and MIPS64 are underway. +* A project Additional Information @@ -301,3 +180,4 @@ going into the ``llvm/docs/`` directory in the LLVM tree. If you have any questions or comments about LLVM, please feel free to contact us via the `mailing lists <http://llvm.org/docs/#maillist>`_. + diff --git a/docs/ReportingGuide.rst b/docs/ReportingGuide.rst new file mode 100644 index 0000000000000..f7ecbb38d45e4 --- /dev/null +++ b/docs/ReportingGuide.rst @@ -0,0 +1,143 @@ +=============== +Reporting Guide +=============== + +.. note:: + + This document is currently a **DRAFT** document while it is being discussed + by the community. + +If you believe someone is violating the :doc:`code of conduct <CodeOfConduct>` +you can always report it to the LLVM Foundation Code of Conduct Advisory +Committee by emailing conduct@llvm.org. **All reports will be kept +confidential.** This isn't a public list and only `members`_ of the advisory +committee will receive the report. + +If you believe anyone is in **physical danger**, please notify appropriate law +enforcement first. 
If you are unsure what law enforcement agency is
+appropriate, please include this in your report and we will attempt to notify
+them.
+
+If the violation occurs at an event such as a Developer Meeting and requires
+immediate attention, you can also reach out to any of the event organizers or
+staff. Event organizers and staff will be prepared to handle the incident and
+able to help. If you cannot find one of the organizers, the venue staff can
+locate one for you. We will also post detailed contact information for specific
+events as part of each event's information. In-person reports will still be
+kept confidential exactly as above, but also feel free to (anonymously if
+needed) email conduct@llvm.org.
+
+.. note::
+   The LLVM community has long handled inappropriate behavior on its own, using
+   both private communication and public responses. Nothing in this document is
+   intended to discourage this self-enforcement of community norms. Instead,
+   the mechanisms described here are intended to supplement any
+   self-enforcement within the community. They provide avenues for handling
+   severe cases or cases where the reporting party does not wish to respond
+   directly for any reason.
+
+Filing a report
+===============
+
+Reports can be as formal or informal as needed for the situation at hand. If
+possible, please include as much information as you can. If you feel
+comfortable, please consider including:
+
+* Your contact info (so we can get in touch with you if we need to follow up).
+* Names (real, nicknames, or pseudonyms) of any individuals involved. If there
+  were other witnesses besides you, please try to include them as well.
+* When and where the incident occurred. Please be as specific as possible.
+* Your account of what occurred. If there is a publicly available record (e.g.
+  a mailing list archive or a public IRC logger) please include a link.
+* Any extra context you believe existed for the incident.
+* If you believe this incident is ongoing.
+* Any other information you believe we should have.
+
+What happens after you file a report?
+=====================================
+
+You will receive an email from the advisory committee acknowledging receipt
+within 24 hours (and we will aim to respond much quicker than that).
+
+The advisory committee will immediately meet to review the incident and try to
+determine:
+
+* What happened and who was involved.
+* Whether this event constitutes a code of conduct violation.
+* Whether this is an ongoing situation, or if there is a threat to anyone's
+  physical safety.
+
+If this is determined to be an ongoing incident or a threat to physical safety,
+the working group's immediate priority will be to protect everyone involved.
+This means we may delay an "official" response until we believe that the
+situation has ended and that everyone is physically safe.
+
+The working group will try to contact other parties involved or witnessing the
+event to gain clarity on what happened and understand any different
+perspectives.
+
+Once the advisory committee has a complete account of the events, it will make
+a decision as to how to respond. Responses may include:
+
+* Nothing, if we determine no violation occurred or it has already been
+  appropriately resolved.
+* Providing either moderation or mediation to ongoing interactions (where
+  appropriate, safe, and desired by both parties).
+* A private reprimand from the working group to the individuals involved.
+* An imposed vacation (i.e. asking someone to "take a week off" from a mailing
+  list or IRC).
+* A public reprimand.
+* A permanent or temporary ban from some or all LLVM spaces (mailing lists,
+  IRC, etc.)
+* Involvement of relevant law enforcement if appropriate.
+
+If the situation is not resolved within one week, we'll respond within one week
+to the original reporter with an update and explanation.
+
+Once we've determined our response, we will separately contact the original
+reporter and other individuals to let them know what actions (if any) we'll be
+taking. We will take into account feedback from the individuals involved on the
+appropriateness of our response, but we don't guarantee we'll act on it.
+
+After any incident, the advisory committee will make a report on the situation
+to the LLVM Foundation board. The board may choose to make a public statement
+about the incident. If that's the case, the identities of anyone involved will
+remain confidential unless instructed by those individuals otherwise.
+
+Appealing
+=========
+
+Only permanent resolutions (such as bans) or requests for public actions may be
+appealed. To appeal a decision of the working group, contact the LLVM
+Foundation board at board@llvm.org with your appeal and the board will review
+the case.
+
+In general, it is **not** appropriate to appeal a particular decision on
+a public mailing list. Doing so would involve disclosure of information which
+would be confidential. Disclosing this kind of information publicly may be
+considered a separate and (potentially) more serious violation of the Code of
+Conduct. This is not meant to limit discussion of the Code of Conduct, the
+advisory board itself, or the appropriateness of responses in general, but
+**please** refrain from mentioning specific facts about cases without the
+explicit permission of all parties involved.
+
+.. _members:
+
+Members of the Code of Conduct Advisory Committee
+=================================================
+
+The members serving on the advisory committee are listed here with contact
+information in case you are more comfortable talking directly to a specific
+member of the committee.
+
+.. note::
+
+   FIXME: When we form the initial advisory committee, the members' names and
+   private contact info need to be added here.
+
+
+
+(This text is based on the `Django Project`_ Code of Conduct, which is in turn
+based on wording from the `Speak Up! project`_.)
+
+.. _Django Project: https://www.djangoproject.com/conduct/
+.. _Speak Up! project: http://speakup.io/coc.html
diff --git a/docs/ScudoHardenedAllocator.rst b/docs/ScudoHardenedAllocator.rst
new file mode 100644
index 0000000000000..5bc390eadd5c4
--- /dev/null
+++ b/docs/ScudoHardenedAllocator.rst
@@ -0,0 +1,117 @@
+========================
+Scudo Hardened Allocator
+========================
+
+.. contents::
+   :local:
+   :depth: 1
+
+Introduction
+============
+The Scudo Hardened Allocator is a user-mode allocator based on LLVM Sanitizer's
+CombinedAllocator, which aims at providing additional mitigations against
+heap-based vulnerabilities, while maintaining good performance.
+
+The name "Scudo" has been retained from the initial implementation (Escudo
+meaning Shield in Spanish and Portuguese).
+
+Design
+======
+Chunk Header
+------------
+Every chunk of heap memory will be preceded by a chunk header. This has two
+purposes: the first one being to store various information about the chunk,
+the second one being to detect potential heap overflows. In order to achieve
+this, the header will be checksummed, involving the pointer to the chunk itself
+and a global secret. Any corruption of the header will be detected when said
+header is accessed, and the process terminated.
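The checksum scheme described above can be sketched in a few lines. This is not Scudo's actual implementation (which uses a hardware CRC32 instruction and its own header layout); the secret value, field layout, and helper names below are hypothetical, purely to illustrate checksumming the header together with the chunk pointer and a global secret:

```python
import zlib

GLOBAL_SECRET = 0xDEADBEEF  # hypothetical; the real allocator picks a secret at startup

def header_checksum(chunk_ptr: int, header: bytes) -> int:
    """CRC32 of the global secret, the chunk pointer, and the 16-byte header
    with its checksum field (assumed here to be the first 2 bytes) zeroed out,
    truncated to the 16-bit checksum field."""
    zeroed = b"\x00\x00" + header[2:]
    data = (GLOBAL_SECRET.to_bytes(4, "little")
            + chunk_ptr.to_bytes(8, "little")
            + zeroed)
    return zlib.crc32(data) & 0xFFFF

def store_header(chunk_ptr: int, fields: bytes) -> bytes:
    # Build a 16-byte header: 2-byte checksum followed by 14 bytes of fields.
    csum = header_checksum(chunk_ptr, b"\x00\x00" + fields)
    return csum.to_bytes(2, "little") + fields

def check_header(chunk_ptr: int, header: bytes) -> bool:
    # Recompute the checksum and compare with the stored one.
    stored = int.from_bytes(header[:2], "little")
    return stored == header_checksum(chunk_ptr, header)

hdr = store_header(0x7F0000001000, bytes(14))
assert check_header(0x7F0000001000, hdr)            # intact header verifies
corrupted = hdr[:15] + b"\xFF"                      # e.g. an overflow into the header
assert not check_header(0x7F0000001000, corrupted)  # corruption is detected
```

Because the chunk pointer participates in the checksum, a valid header copied to a different address also fails verification, which is part of the point of the scheme.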
+
+The following information is stored in the header:
+
+- the 16-bit checksum;
+- the user-requested size for that chunk, which is necessary for reallocation
+  purposes;
+- the state of the chunk (available, allocated or quarantined);
+- the allocation type (malloc, new, new[] or memalign), to detect potential
+  mismatches in the allocation APIs used;
+- whether or not the chunk is offset (i.e. the beginning of the chunk is
+  different from the beginning of the backend allocation, which is most often
+  the case with some aligned allocations);
+- the associated offset;
+- a 16-bit salt.
+
+On x64, which is currently the only architecture supported, the header fits
+within 16 bytes, which works nicely with the minimum alignment requirements.
+
+The checksum is computed as a CRC32 (requiring the SSE 4.2 instruction set)
+of the global secret, the chunk pointer itself, and the 16 bytes of header with
+the checksum field zeroed out.
+
+The header is atomically loaded and stored to prevent races (this requires
+platform support such as the cmpxchg16b instruction). This is important as two
+consecutive chunks could belong to different threads. We also want to avoid
+any type of double fetches of information located in the header, and use local
+copies of the header for this purpose.
+
+Delayed Freelist
+----------------
+A delayed freelist allows us to not return a chunk directly to the backend, but
+to keep it aside for a while. Once a criterion is met, the delayed freelist is
+emptied, and the quarantined chunks are returned to the backend. This helps
+mitigate use-after-free vulnerabilities by reducing the determinism of the
+allocation and deallocation patterns.
+
+This feature uses the Sanitizer's Quarantine as its base, and the amount of
+memory that it can hold is configurable by the user (see the Options section
+below).
+
+Randomness
+----------
+It is important for the allocator to not make use of fixed addresses.
We use
+the dynamic base option for the SizeClassAllocator, allowing us to benefit
+from the randomness of mmap.
+
+Usage
+=====
+
+Library
+-------
+The allocator static library can be built from the LLVM build tree thanks to
+the "scudo" CMake rule. The associated tests can be exercised thanks to the
+"check-scudo" CMake rule.
+
+Linking the static library to your project can require the use of the
+"whole-archive" linker flag (or equivalent), depending on your linker.
+Additional flags might also be necessary.
+
+Your linked binary should now make use of the Scudo allocation and deallocation
+functions.
+
+Options
+-------
+Several aspects of the allocator can be configured through the environment
+variable SCUDO_OPTIONS, following the usual ASan options syntax.
+
+For example: SCUDO_OPTIONS="DeleteSizeMismatch=1:QuarantineSizeMb=16".
+
+The following options are available:
+
+- QuarantineSizeMb (integer, defaults to 64): the size (in Mb) of the
+  quarantine used to delay the actual deallocation of chunks. A lower value
+  may reduce memory usage but decrease the effectiveness of the mitigation; a
+  negative value will fall back to the default of 64Mb;
+
+- ThreadLocalQuarantineSizeKb (integer, defaults to 1024): the size (in Kb) of
+  the per-thread cache used to offload the global quarantine. A lower value
+  may reduce memory usage but might increase contention on the global
+  quarantine;
+
+- DeallocationTypeMismatch (boolean, defaults to true): whether or not we
+  report errors on malloc/delete, new/free, new/delete[], etc;
+
+- DeleteSizeMismatch (boolean, defaults to true): whether or not we report
+  errors on a size mismatch between new and delete;
+
+- ZeroContents (boolean, defaults to false): whether or not we zero chunk
+  contents on allocation and deallocation.
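The SCUDO_OPTIONS string follows the colon-separated ``key=value`` syntax shared with the other sanitizers. A tiny illustrative parser (a hypothetical helper, not part of Scudo) makes the format concrete:

```python
def parse_options(option_string: str) -> dict:
    """Split a sanitizer-style option string ("A=1:B=16") into a dict."""
    options = {}
    for pair in option_string.split(":"):
        if not pair:
            continue  # tolerate stray or trailing separators
        key, _, value = pair.partition("=")
        options[key] = value
    return options

opts = parse_options("DeleteSizeMismatch=1:QuarantineSizeMb=16")
assert opts == {"DeleteSizeMismatch": "1", "QuarantineSizeMb": "16"}
```

The allocator itself performs this parsing internally; the sketch is only meant to show what a well-formed SCUDO_OPTIONS value looks like.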
+
diff --git a/docs/SourceLevelDebugging.rst b/docs/SourceLevelDebugging.rst
index 270c44eb50baa..1815ee398e0c1 100644
--- a/docs/SourceLevelDebugging.rst
+++ b/docs/SourceLevelDebugging.rst
@@ -63,16 +63,18 @@ away during the compilation process. This meta information provides an LLVM
 user a relationship between generated code and the original program source
 code.
 
-Currently, debug information is consumed by DwarfDebug to produce dwarf
-information used by the gdb debugger. Other targets could use the same
-information to produce stabs or other debug forms.
+Currently, there are two backend consumers of debug info: DwarfDebug and
+CodeViewDebug. DwarfDebug produces DWARF suitable for use with GDB, LLDB, and
+other DWARF-based debuggers. :ref:`CodeViewDebug <codeview>` produces CodeView,
+the Microsoft debug info format, which is usable with Microsoft debuggers such
+as Visual Studio and WinDBG. LLVM's debug information format is mostly derived
+from and inspired by DWARF, but it is feasible to translate into other target
+debug info formats such as STABS.
 
 It would also be reasonable to use debug information to feed profiling tools
 for analysis of generated code, or, tools for reconstructing the original
 source from generated code.
 
-TODO - expound a bit more.
-
 .. _intro_debugopt:
 
 Debugging optimized code
@@ -197,7 +199,7 @@ value. The first argument is the new value (wrapped as metadata). The second
 argument is the offset in the user source variable where the new value is
 written. The third argument is a `local variable <LangRef.html#dilocalvariable>`_
 containing a description of the variable. The
-third argument is a `complex expression <LangRef.html#diexpression>`_.
+fourth argument is a `complex expression <LangRef.html#diexpression>`_.
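As a sketch of the four-argument form described by this hunk (the metadata numbering and variable name here are hypothetical), a 3.9-era ``llvm.dbg.value`` call looks like:

```llvm
; Record that source variable "x" (!12) now holds %x, at offset 0,
; with an empty complex expression (!13).
call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !12, metadata !13)

!12 = !DILocalVariable(name: "x", scope: !4, file: !1, line: 7, type: !9)
!13 = !DIExpression()
```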
Object lifetimes and scoping ============================ @@ -259,7 +261,7 @@ Compiled to LLVM, this function would be represented like this: !llvm.module.flags = !{!7, !8, !9} !llvm.ident = !{!10} - !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)", isOptimized: false, runtimeVersion: 0, emissionKind: 1, enums: !2, retainedTypes: !2, subprograms: !3, globals: !2, imports: !2) + !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !2, subprograms: !3, globals: !2, imports: !2) !1 = !DIFile(filename: "/dev/stdin", directory: "/Users/dexonsmith/data/llvm/debug-info") !2 = !{} !3 = !{!4} @@ -407,7 +409,7 @@ a C/C++ front-end would generate the following descriptors: !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)", - isOptimized: false, runtimeVersion: 0, emissionKind: 1, + isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !2, subprograms: !2, globals: !3, imports: !2) @@ -679,7 +681,13 @@ New DWARF Constants | DW_APPLE_PROPERTY_strong | 0x400 | +--------------------------------------+-------+ | DW_APPLE_PROPERTY_unsafe_unretained | 0x800 | -+--------------------------------+-----+-------+ ++--------------------------------------+-------+ +| DW_APPLE_PROPERTY_nullability | 0x1000| ++--------------------------------------+-------+ +| DW_APPLE_PROPERTY_null_resettable | 0x2000| ++--------------------------------------+-------+ +| DW_APPLE_PROPERTY_class | 0x4000| ++--------------------------------------+-------+ Name Accelerator Tables ----------------------- @@ -1333,3 +1341,74 @@ names as follows: * "``.apple_namespaces``" -> "``__apple_namespac``" (16 character limit) * "``.apple_objc``" -> "``__apple_objc``" +.. 
_codeview:
+
+CodeView Debug Info Format
+==========================
+
+LLVM supports emitting CodeView, the Microsoft debug info format, and this
+section describes the design and implementation of that support.
+
+Format Background
+-----------------
+
+CodeView as a format is clearly oriented around C++ debugging, and in C++, the
+majority of debug information tends to be type information. Therefore, the
+overriding design constraint of CodeView is the separation of type information
+from other "symbol" information so that type information can be efficiently
+merged across translation units. Both type information and symbol information
+are generally stored as a sequence of records, where each record begins with a
+16-bit record size and a 16-bit record kind.
+
+Type information is usually stored in the ``.debug$T`` section of the object
+file. All other debug info, such as line info, string table, symbol info, and
+inlinee info, is stored in one or more ``.debug$S`` sections. There may only be
+one ``.debug$T`` section per object file, since all other debug info refers to
+it. If a PDB (enabled by the ``/Zi`` MSVC option) was used during compilation,
+the ``.debug$T`` section will contain only an ``LF_TYPESERVER2`` record pointing
+to the PDB. When using PDBs, symbol information appears to remain in the object
+file ``.debug$S`` sections.
+
+Type records are referred to by their index, which is the number of records in
+the stream before a given record plus ``0x1000``. Many common basic types, such
+as the basic integral types and unqualified pointers to them, are represented
+using type indices less than ``0x1000``. Such basic types are built into
+CodeView consumers and do not require type records.
+
+Each type record may only contain type indices that are less than its own type
+index. This ensures that the graph of type stream references is acyclic.
While +the source-level type graph may contain cycles through pointer types (consider a +linked list struct), these cycles are removed from the type stream by always +referring to the forward declaration record of user-defined record types. Only +"symbol" records in the ``.debug$S`` streams may refer to complete, +non-forward-declaration type records. + +Working with CodeView +--------------------- + +These are instructions for some common tasks for developers working to improve +LLVM's CodeView support. Most of them revolve around using the CodeView dumper +embedded in ``llvm-readobj``. + +* Testing MSVC's output:: + + $ cl -c -Z7 foo.cpp # Use /Z7 to keep types in the object file + $ llvm-readobj -codeview foo.obj + +* Getting LLVM IR debug info out of Clang:: + + $ clang -g -gcodeview --target=x86_64-windows-msvc foo.cpp -S -emit-llvm + + Use this to generate LLVM IR for LLVM test cases. + +* Generate and dump CodeView from LLVM IR metadata:: + + $ llc foo.ll -filetype=obj -o foo.obj + $ llvm-readobj -codeview foo.obj > foo.txt + + Use this pattern in lit test cases and FileCheck the output of llvm-readobj + +Improving LLVM's CodeView support is a process of finding interesting type +records, constructing a C++ test case that makes MSVC emit those records, +dumping the records, understanding them, and then generating equivalent records +in LLVM's backend. diff --git a/docs/Statepoints.rst b/docs/Statepoints.rst index 442b1c269c479..a78ab3c217035 100644 --- a/docs/Statepoints.rst +++ b/docs/Statepoints.rst @@ -251,7 +251,9 @@ we get: Note that in this example %p and %obj.relocate are the same address and we could replace one with the other, potentially removing the derived pointer -from the live set at the safepoint entirely. +from the live set at the safepoint entirely. + +.. 
_gc_transition_args: GC Transitions ^^^^^^^^^^^^^^^^^^ @@ -260,7 +262,7 @@ As a practical consideration, many garbage-collected systems allow code that is collector-aware ("managed code") to call code that is not collector-aware ("unmanaged code"). It is common that such calls must also be safepoints, since it is desirable to allow the collector to run during the execution of -unmanaged code. Futhermore, it is common that coordinating the transition from +unmanaged code. Furthermore, it is common that coordinating the transition from managed to unmanaged code requires extra code generation at the call site to inform the collector of the transition. In order to support these needs, a statepoint may be marked as a GC transition, and data that is necessary to @@ -566,15 +568,36 @@ Each statepoint generates the following Locations: * Constant which describes number of following deopt *Locations* (not operands) * Variable number of Locations, one for each deopt parameter listed in - the IR statepoint (same number as described by previous Constant) -* Variable number of Locations pairs, one pair for each unique pointer - which needs relocated. The first Location in each pair describes - the base pointer for the object. The second is the derived pointer - actually being relocated. It is guaranteed that the base pointer - must also appear explicitly as a relocation pair if used after the - statepoint. There may be fewer pairs then gc parameters in the IR + the IR statepoint (same number as described by previous Constant). At + the moment, only deopt parameters with a bitwidth of 64 bits or less + are supported. Values of a type larger than 64 bits can be specified + and reported only if a) the value is constant at the call site, and b) + the constant can be represented with less than 64 bits (assuming zero + extension to the original bitwidth). +* Variable number of relocation records, each of which consists of + exactly two Locations. 
Relocation records are described in detail
+  below.
+
+Each relocation record provides sufficient information for a collector to
+relocate one or more derived pointers. Each record consists of a pair of
+Locations. The second element in the record represents the pointer (or
+pointers) which need to be updated. The first element in the record provides a
+pointer to the base of the object with which the pointer(s) being relocated is
+associated. This information is required for handling generalized derived
+pointers since a pointer may be outside the bounds of the original allocation,
+but still needs to be relocated with the allocation. Additionally:
+
+* It is guaranteed that the base pointer must also appear explicitly as a
+  relocation pair if used after the statepoint.
+* There may be fewer relocation records than gc parameters in the IR
  statepoint. Each *unique* pair will occur at least once; duplicates
-  are possible.
+  are possible.
+* The Locations within each record may either be of pointer size or a
+  multiple of pointer size. In the latter case, the record must be
+  interpreted as describing a sequence of pointers and their corresponding
+  base pointers. If the Location is of size N x sizeof(pointer), then
+  there will be N records of one pointer each contained within the Location.
+  Both Locations in a pair can be assumed to be of the same size.
 
 Note that the Locations used in each section may describe the same physical
 location. e.g. A stack slot may appear as a deopt location,
@@ -768,6 +791,41 @@ Supported Architectures
 
 Support for statepoint generation requires some code for each backend. Today,
 only X86_64 is supported.
 
+Problem Areas and Active Work
+=============================
+
+#. As the existing users of the late rewriting model have matured, we've found
+   cases where the optimizer breaks the assumption that an SSA value of
+   gc-pointer type actually contains a gc-pointer and vice-versa.
We need to
+   clarify our expectations and propose at least one small IR change. (Today,
+   the gc-pointer distinction is managed via address spaces. This turns out
+   not to be quite strong enough.)
+
+#. Support for languages which allow unmanaged pointers to garbage collected
+   objects (i.e. pass a pointer to an object to a C routine) via pinning.
+
+#. Support for garbage collected objects allocated on the stack. Specifically,
+   allocas are always assumed to be in address space 0 and we need a
+   cast/promotion operator to let rewriting identify them.
+
+#. The current statepoint lowering is known to be somewhat poor. In the very
+   long term, we'd like to integrate statepoints with the register allocator;
+   in the near term this is unlikely to happen. We've found the quality of
+   lowering to be relatively unimportant as hot statepoints are almost always
+   inliner bugs.
+
+#. Concerns have been raised that the statepoint representation results in a
+   large amount of IR being produced for some examples and that this
+   contributes to higher than expected memory usage and compile times. There
+   are no immediate plans to make changes due to this, but alternate models
+   may be explored in the future.
+
+#. Relocations along exceptional paths are currently broken in ToT. In
+   particular, there is currently no way to represent a rethrow on a path
+   which also has relocations. See `this llvm-dev discussion
+   <https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more
+   detail.
+
 Bugs and Enhancements
 =====================
diff --git a/docs/TableGen/LangRef.rst b/docs/TableGen/LangRef.rst
index 27b2c8beaa69a..58da6285c077e 100644
--- a/docs/TableGen/LangRef.rst
+++ b/docs/TableGen/LangRef.rst
@@ -154,7 +154,7 @@ programmer.
 
 .. productionlist::
    Declaration: `Type` `TokIdentifier` ["=" `Value`]
 
-It assigns the value to the identifer.
+It assigns the value to the identifier.
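To make the production concrete, here is a small illustrative TableGen class (the class and field names are hypothetical) whose body consists of declarations, with and without the optional initializer:

```tablegen
class Inst<int size> {
  int Size = size;   // Declaration: Type TokIdentifier "=" Value
  string AsmString;  // the ["=" Value] part is optional
}
```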
Types ----- diff --git a/docs/TestSuiteMakefileGuide.rst b/docs/TestSuiteMakefileGuide.rst index e2852a073518c..b6f32262b066f 100644 --- a/docs/TestSuiteMakefileGuide.rst +++ b/docs/TestSuiteMakefileGuide.rst @@ -1,6 +1,6 @@ -============================== -LLVM test-suite Makefile Guide -============================== +===================== +LLVM test-suite Guide +===================== .. contents:: :local: @@ -9,10 +9,11 @@ Overview ======== This document describes the features of the Makefile-based LLVM -test-suite. This way of interacting with the test-suite is deprecated in -favor of running the test-suite using LNT, but may continue to prove -useful for some users. See the Testing Guide's :ref:`test-suite Quickstart -<test-suite-quickstart>` section for more information. +test-suite as well as the cmake based replacement. This way of interacting +with the test-suite is deprecated in favor of running the test-suite using LNT, +but may continue to prove useful for some users. See the Testing +Guide's :ref:`test-suite Quickstart <test-suite-quickstart>` section for more +information. Test suite Structure ==================== @@ -83,8 +84,77 @@ generated. If a test fails, a large <program> FAILED message will be displayed. This will help you separate benign warnings from actual test failures. -Running the test suite -====================== +Running the test suite via CMake +================================ + +To run the test suite, you need to use the following steps: + +#. The test suite uses the lit test runner to run the test-suite, + you need to have lit installed first. Check out LLVM and install lit: + + .. code-block:: bash + + % svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm + % cd llvm/utils/lit + % sudo python setup.py install # Or without sudo, install in virtual-env. + running install + running bdist_egg + running egg_info + writing lit.egg-info/PKG-INFO + ... + % lit --version + lit 0.5.0dev + +#. 
Check out the ``test-suite`` module with: + + .. code-block:: bash + + % svn co http://llvm.org/svn/llvm-project/test-suite/trunk test-suite + +#. Use CMake to configure the test suite in a new directory. You cannot build + the test suite in the source tree. + + .. code-block:: bash + + % mkdir test-suite-build + % cd test-suite-build + % cmake ../test-suite + +#. Build the benchmarks, using the makefiles CMake generated. + +.. code-block:: bash + + % make + Scanning dependencies of target timeit-target + [ 0%] Building C object tools/CMakeFiles/timeit-target.dir/timeit.c.o + [ 0%] Linking C executable timeit-target + [ 0%] Built target timeit-target + Scanning dependencies of target fpcmp-host + [ 0%] [TEST_SUITE_HOST_CC] Building host executable fpcmp + [ 0%] Built target fpcmp-host + Scanning dependencies of target timeit-host + [ 0%] [TEST_SUITE_HOST_CC] Building host executable timeit + [ 0%] Built target timeit-host + + +#. Run the tests with lit: + +.. code-block:: bash + + % lit -v -j 1 . -o results.json + -- Testing: 474 tests, 1 threads -- + PASS: test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test (1 of 474) + ********** TEST 'test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test' RESULTS ********** + compile_time: 0.2192 + exec_time: 0.0462 + hash: "59620e187c6ac38b36382685ccd2b63b" + size: 83348 + ********** + PASS: test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test (2 of 474) + + +Running the test suite via Makefiles (deprecated) +================================================= First, all tests are executed within the LLVM object directory tree. They *are not* executed inside of the LLVM source tree. 
This is because
diff --git a/docs/TestingGuide.rst b/docs/TestingGuide.rst
index 134ddd88c87d5..5dac58309e45a 100644
--- a/docs/TestingGuide.rst
+++ b/docs/TestingGuide.rst
@@ -25,6 +25,10 @@ In order to use the LLVM testing infrastructure, you will need all of the
 software required to build LLVM, as well as `Python <http://python.org>`_ 2.7 or
 later.
 
+If you intend to run the :ref:`test-suite <test-suite-overview>`, you will also
+need a development version of zlib (zlib1g-dev is known to work on several Linux
+distributions).
+
 LLVM testing infrastructure organization
 ========================================
@@ -99,19 +103,11 @@ is in the ``test-suite`` module. See :ref:`test-suite Quickstart
 
 Regression tests
 ----------------
-To run all of the LLVM regression tests, use the master Makefile in the
-``llvm/test`` directory. LLVM Makefiles require GNU Make (read the :doc:`LLVM
-Makefile Guide <MakefileGuide>` for more details):
-
-.. code-block:: bash
-
-    % make -C llvm/test
-
-or:
+To run all of the LLVM regression tests, use the check-llvm target:
 
 .. code-block:: bash
 
-    % make check
+    % make check-llvm
 
 If you have `Clang <http://clang.llvm.org/>`_ checked out and built, you can
 run the LLVM and Clang tests simultaneously using:
@@ -391,6 +387,23 @@ depends on special features of sub-architectures, you must add the specific
 triple, test with the specific FileCheck and put it into the specific
 directory that will filter out all other architectures.
 
+REQUIRES and REQUIRES-ANY directives
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some tests can be enabled only in specific situations, such as a debug build.
+Use the ``REQUIRES`` directive to specify those requirements.
+
+.. code-block:: llvm
+
+    ; This test will only be enabled in a build with asserts.
+    ; REQUIRES: asserts
+
+You can separate requirements with a comma.
+``REQUIRES`` means all listed requirements must be satisfied.
+``REQUIRES-ANY`` means at least one must be satisfied.
+ +List of features that can be used in ``REQUIRES`` and ``REQUIRES-ANY`` can be +found in lit.cfg files. Substitutions ------------- @@ -543,6 +556,8 @@ the last RUN: line. This has two side effects: (b) it speeds things up for really big test cases by avoiding interpretation of the remainder of the file. +.. _test-suite-overview: + ``test-suite`` Overview ======================= diff --git a/docs/TypeMetadata.rst b/docs/TypeMetadata.rst new file mode 100644 index 0000000000000..98d58b71a6d3b --- /dev/null +++ b/docs/TypeMetadata.rst @@ -0,0 +1,226 @@ +============= +Type Metadata +============= + +Type metadata is a mechanism that allows IR modules to co-operatively build +pointer sets corresponding to addresses within a given set of globals. LLVM's +`control flow integrity`_ implementation uses this metadata to efficiently +check (at each call site) that a given address corresponds to either a +valid vtable or function pointer for a given class or function type, and its +whole-program devirtualization pass uses the metadata to identify potential +callees for a given virtual call. + +To use the mechanism, a client creates metadata nodes with two elements: + +1. a byte offset into the global (generally zero for functions) +2. a metadata object representing an identifier for the type + +These metadata nodes are associated with globals by using global object +metadata attachments with the ``!type`` metadata kind. + +Each type identifier must exclusively identify either global variables +or functions. + +.. admonition:: Limitation + + The current implementation only supports attaching metadata to functions on + the x86-32 and x86-64 architectures. + +An intrinsic, :ref:`llvm.type.test <type.test>`, is used to test whether a +given pointer is associated with a type identifier. + +.. 
_control flow integrity: http://clang.llvm.org/docs/ControlFlowIntegrity.html + +Representing Type Information using Type Metadata +================================================= + +This section describes how Clang represents C++ type information associated with +virtual tables using type metadata. + +Consider the following inheritance hierarchy: + +.. code-block:: c++ + + struct A { + virtual void f(); + }; + + struct B : A { + virtual void f(); + virtual void g(); + }; + + struct C { + virtual void h(); + }; + + struct D : A, C { + virtual void f(); + virtual void h(); + }; + +The virtual table objects for A, B, C and D look like this (under the Itanium ABI): + +.. csv-table:: Virtual Table Layout for A, B, C, D + :header: Class, 0, 1, 2, 3, 4, 5, 6 + + A, A::offset-to-top, &A::rtti, &A::f + B, B::offset-to-top, &B::rtti, &B::f, &B::g + C, C::offset-to-top, &C::rtti, &C::h + D, D::offset-to-top, &D::rtti, &D::f, &D::h, D::offset-to-top, &D::rtti, thunk for &D::h + +When an object of type A is constructed, the address of ``&A::f`` in A's +virtual table object is stored in the object's vtable pointer. In ABI parlance +this address is known as an `address point`_. Similarly, when an object of type +B is constructed, the address of ``&B::f`` is stored in the vtable pointer. In +this way, the vtable in B's virtual table object is compatible with A's vtable. + +D is a little more complicated, due to the use of multiple inheritance. Its +virtual table object contains two vtables, one compatible with A's vtable and +the other compatible with C's vtable. Objects of type D contain two virtual +pointers, one belonging to the A subobject and containing the address of +the vtable compatible with A's vtable, and the other belonging to the C +subobject and containing the address of the vtable compatible with C's vtable. + +The full set of compatibility information for the above class hierarchy is +shown below. 
The following table shows the name of a class, the offset of an +address point within that class's vtable and the name of one of the classes +with which that address point is compatible. + +.. csv-table:: Type Offsets for A, B, C, D + :header: VTable for, Offset, Compatible Class + + A, 16, A + B, 16, A + , , B + C, 16, C + D, 16, A + , , D + , 48, C + +The next step is to encode this compatibility information into the IR. The way +this is done is to create type metadata named after each of the compatible +classes, with which we associate each of the compatible address points in +each vtable. For example, these type metadata entries encode the compatibility +information for the above hierarchy: + +:: + + @_ZTV1A = constant [...], !type !0 + @_ZTV1B = constant [...], !type !0, !type !1 + @_ZTV1C = constant [...], !type !2 + @_ZTV1D = constant [...], !type !0, !type !3, !type !4 + + !0 = !{i64 16, !"_ZTS1A"} + !1 = !{i64 16, !"_ZTS1B"} + !2 = !{i64 16, !"_ZTS1C"} + !3 = !{i64 16, !"_ZTS1D"} + !4 = !{i64 48, !"_ZTS1C"} + +With this type metadata, we can now use the ``llvm.type.test`` intrinsic to +test whether a given pointer is compatible with a type identifier. Working +backwards, if ``llvm.type.test`` returns true for a particular pointer, +we can also statically determine the identities of the virtual functions +that a particular virtual call may call. For example, if a program assumes +a pointer to be a member of ``!"_ZTS1A"``, we know that the address can +only be one of ``_ZTV1A+16``, ``_ZTV1B+16`` or ``_ZTV1D+16`` (i.e. the +address points of the vtables of A, B and D respectively). If we then load +an address from that pointer, we know that the address can only be one of +``&A::f``, ``&B::f`` or ``&D::f``. + +..
_address point: https://mentorembedded.github.io/cxx-abi/abi.html#vtable-general + +Testing Addresses For Type Membership +===================================== + +If a program tests an address using ``llvm.type.test``, this will cause +a link-time optimization pass, ``LowerTypeTests``, to replace calls to this +intrinsic with efficient code to perform type member tests. At a high level, +the pass will lay out referenced globals in a consecutive memory region in +the object file, construct bit vectors that map onto that memory region, +and generate code at each of the ``llvm.type.test`` call sites to test +pointers against those bit vectors. Because of the layout manipulation, the +globals' definitions must be available at LTO time. For more information, +see the `control flow integrity design document`_. + +A type identifier that identifies functions is transformed into a jump table, +which is a block of code consisting of one branch instruction for each +of the functions associated with the type identifier that branches to the +target function. The pass will redirect any taken function addresses to the +corresponding jump table entry. In the object file's symbol table, the jump +table entries take the identities of the original functions, so that addresses +taken outside the module will pass any verification done inside the module. + +Jump tables may call external functions, so their definitions need not +be available at LTO time. Note that if an externally defined function is +associated with a type identifier, there is no guarantee that its identity +within the module will be the same as its identity outside of the module, +as the former will be the jump table entry if a jump table is necessary. + +The `GlobalLayoutBuilder`_ class is responsible for laying out the globals +efficiently to minimize the sizes of the underlying bitsets. + +.. 
_control flow integrity design document: http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html + +:Example: + +:: + + target datalayout = "e-p:32:32" + + @a = internal global i32 0, !type !0 + @b = internal global i32 0, !type !0, !type !1 + @c = internal global i32 0, !type !1 + @d = internal global [2 x i32] [i32 0, i32 0], !type !2 + + define void @e() !type !3 { + ret void + } + + define void @f() { + ret void + } + + declare void @g() !type !3 + + !0 = !{i32 0, !"typeid1"} + !1 = !{i32 0, !"typeid2"} + !2 = !{i32 4, !"typeid2"} + !3 = !{i32 0, !"typeid3"} + + declare i1 @llvm.type.test(i8* %ptr, metadata %typeid) nounwind readnone + + define i1 @foo(i32* %p) { + %pi8 = bitcast i32* %p to i8* + %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid1") + ret i1 %x + } + + define i1 @bar(i32* %p) { + %pi8 = bitcast i32* %p to i8* + %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid2") + ret i1 %x + } + + define i1 @baz(void ()* %p) { + %pi8 = bitcast void ()* %p to i8* + %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid3") + ret i1 %x + } + + define void @main() { + %a1 = call i1 @foo(i32* @a) ; returns 1 + %b1 = call i1 @foo(i32* @b) ; returns 1 + %c1 = call i1 @foo(i32* @c) ; returns 0 + %a2 = call i1 @bar(i32* @a) ; returns 0 + %b2 = call i1 @bar(i32* @b) ; returns 1 + %c2 = call i1 @bar(i32* @c) ; returns 1 + %d02 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 0)) ; returns 0 + %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 1)) ; returns 1 + %e = call i1 @baz(void ()* @e) ; returns 1 + %f = call i1 @baz(void ()* @f) ; returns 0 + %g = call i1 @baz(void ()* @g) ; returns 1 + ret void + } + +.. 
_GlobalLayoutBuilder: http://llvm.org/klaus/llvm/blob/master/include/llvm/Transforms/IPO/LowerTypeTests.h diff --git a/docs/WritingAnLLVMBackend.rst b/docs/WritingAnLLVMBackend.rst index fdadbb04e94f7..023f6ffc46029 100644 --- a/docs/WritingAnLLVMBackend.rst +++ b/docs/WritingAnLLVMBackend.rst @@ -135,14 +135,13 @@ First, you should create a subdirectory under ``lib/Target`` to hold all the files related to your target. If your target is called "Dummy", create the directory ``lib/Target/Dummy``. -In this new directory, create a ``Makefile``. It is easiest to copy a -``Makefile`` of another target and modify it. It should at least contain the -``LEVEL``, ``LIBRARYNAME`` and ``TARGET`` variables, and then include -``$(LEVEL)/Makefile.common``. The library can be named ``LLVMDummy`` (for -example, see the MIPS target). Alternatively, you can split the library into -``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which should be -implemented in a subdirectory below ``lib/Target/Dummy`` (for example, see the -PowerPC target). +In this new directory, create a ``CMakeLists.txt``. It is easiest to copy a +``CMakeLists.txt`` of another target and modify it. It should at least contain +the ``LLVM_TARGET_DEFINITIONS`` variable. The library can be named ``LLVMDummy`` +(for example, see the MIPS target). Alternatively, you can split the library +into ``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which +should be implemented in a subdirectory below ``lib/Target/Dummy`` (for example, +see the PowerPC target). Note that these two naming schemes are hardcoded into ``llvm-config``. Using any other naming scheme will confuse ``llvm-config`` and produce a lot of @@ -156,13 +155,12 @@ generator, you should do what all current machine backends do: create a subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a subclass of ``TargetMachine``.) 
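For a hypothetical Dummy target, the ``CMakeLists.txt`` might look roughly like the sketch below. Every file and component name here is invented for illustration; copy an existing target's ``CMakeLists.txt`` for the authoritative set of TableGen invocations:

```cmake
# lib/Target/Dummy/CMakeLists.txt (hypothetical sketch)
set(LLVM_TARGET_DEFINITIONS Dummy.td)

# Generate the .inc files that the C++ sources will include.
tablegen(LLVM DummyGenRegisterInfo.inc -gen-register-info)
tablegen(LLVM DummyGenInstrInfo.inc -gen-instr-info)
tablegen(LLVM DummyGenAsmWriter.inc -gen-asm-writer)
add_public_tablegen_target(DummyCommonTableGen)

# The LLVMDummy library itself.
add_llvm_target(DummyCodeGen
  DummyTargetMachine.cpp
  DummyInstrInfo.cpp
  DummyRegisterInfo.cpp
  )
```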
-To get LLVM to actually build and link your target, you need to add it to the -``TARGETS_TO_BUILD`` variable. To do this, you modify the configure script to -know about your target when parsing the ``--enable-targets`` option. Search -the configure script for ``TARGETS_TO_BUILD``, add your target to the lists -there (some creativity required), and then reconfigure. Alternatively, you can -change ``autoconf/configure.ac`` and regenerate configure by running -``./autoconf/AutoRegen.sh``. +To get LLVM to actually build and link your target, you need to run ``cmake`` +with ``-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=Dummy``. This will build your +target without needing to add it to the list of all the targets. + +Once your target is stable, you can add it to the ``LLVM_ALL_TARGETS`` variable +located in the main ``CMakeLists.txt``. Target Machine ============== diff --git a/docs/WritingAnLLVMPass.rst b/docs/WritingAnLLVMPass.rst index 241066842b7bb..9e9d9f1703a59 100644 --- a/docs/WritingAnLLVMPass.rst +++ b/docs/WritingAnLLVMPass.rst @@ -525,6 +525,14 @@ interface. Implementing a loop pass is usually straightforward. these methods should return ``true`` if they modified the program, or ``false`` if they didn't. +A ``LoopPass`` subclass which is intended to run as part of the main loop pass +pipeline needs to preserve all of the same *function* analyses that the other +loop passes in its pipeline require. To make that easier, +a ``getLoopAnalysisUsage`` function is provided by ``LoopUtils.h``. It can be +called within the subclass's ``getAnalysisUsage`` override to get consistent +and correct behavior. Analogously, ``INITIALIZE_PASS_DEPENDENCY(LoopPass)`` +will initialize this set of function analyses. + The ``doInitialization(Loop *, LPPassManager &)`` method ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -1392,7 +1400,7 @@ some with solutions, some without. * Restarting the program breaks breakpoints. 
After following the information above, you have succeeded in getting some breakpoints planted in your pass. - Nex thing you know, you restart the program (i.e., you type "``run``" again), + Next thing you know, you restart the program (i.e., you type "``run``" again), and you start getting errors about breakpoints being unsettable. The only way I have found to "fix" this problem is to delete the breakpoints that are already set in your pass, run the program, and re-set the breakpoints once diff --git a/docs/YamlIO.rst b/docs/YamlIO.rst index f0baeb4c69d49..04e63fac6a4ba 100644 --- a/docs/YamlIO.rst +++ b/docs/YamlIO.rst @@ -456,10 +456,11 @@ looks like: template <> struct ScalarTraits<MyCustomType> { - static void output(const T &value, void*, llvm::raw_ostream &out) { + static void output(const MyCustomType &value, void*, + llvm::raw_ostream &out) { out << value; // do custom formatting here } - static StringRef input(StringRef scalar, void*, T &value) { + static StringRef input(StringRef scalar, void*, MyCustomType &value) { // do custom parsing here. Return the empty string on success, // or an error message on failure. return StringRef(); diff --git a/docs/conf.py b/docs/conf.py index 6e3f16ceef1ae..224cca142884d 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -48,9 +48,9 @@ copyright = u'2003-%d, LLVM Project' % date.today().year # built documents. # # The short X.Y version. -version = '3.8' +version = '3.9' # The full version, including alpha/beta/rc tags. -release = '3.8' +release = '3.9' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. diff --git a/docs/doxygen.intro b/docs/doxygen-mainpage.dox index 699dadc27e858..02a74799bffc9 100644 --- a/docs/doxygen.intro +++ b/docs/doxygen-mainpage.dox @@ -1,18 +1,18 @@ -/// @mainpage LLVM +/// \mainpage LLVM /// -/// @section main_intro Introduction +/// \section main_intro Introduction /// Welcome to LLVM. 
/// -/// This documentation describes the @b internal software that makes -/// up LLVM, not the @b external use of LLVM. There are no instructions -/// here on how to use LLVM, only the APIs that make up the software. For usage +/// This documentation describes the **internal** software that makes +/// up LLVM, not the **external** use of LLVM. There are no instructions +/// here on how to use LLVM, only the APIs that make up the software. For usage /// instructions, please see the programmer's guide or reference manual. /// -/// @section main_caveat Caveat -/// This documentation is generated directly from the source code with doxygen. +/// \section main_caveat Caveat +/// This documentation is generated directly from the source code with doxygen. /// Since LLVM is constantly under active development, what you're about to /// read is out of date! However, it may still be useful since certain portions -/// of LLVM are very stable. +/// of LLVM are very stable. /// -/// @section main_changelog Change Log +/// \section main_changelog Change Log /// - Original content written 12/30/2003 by Reid Spencer diff --git a/docs/doxygen.cfg.in b/docs/doxygen.cfg.in index 5a74cecc8aac6..7699711adce90 100644 --- a/docs/doxygen.cfg.in +++ b/docs/doxygen.cfg.in @@ -745,7 +745,7 @@ WARN_LOGFILE = INPUT = @abs_top_srcdir@/include \ @abs_top_srcdir@/lib \ - @abs_top_srcdir@/docs/doxygen.intro + @abs_top_srcdir@/docs/doxygen-mainpage.dox # This tag can be used to specify the character encoding of the source files # that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses @@ -1791,18 +1791,6 @@ GENERATE_XML = NO XML_OUTPUT = xml -# The XML_SCHEMA tag can be used to specify a XML schema, which can be used by a -# validating XML parser to check the syntax of the XML files. -# This tag requires that the tag GENERATE_XML is set to YES. 
- -XML_SCHEMA = - -# The XML_DTD tag can be used to specify a XML DTD, which can be used by a -# validating XML parser to check the syntax of the XML files. -# This tag requires that the tag GENERATE_XML is set to YES. - -XML_DTD = - # If the XML_PROGRAMLISTING tag is set to YES doxygen will dump the program # listings (including syntax highlighting and cross-referencing information) to # the XML output. Note that enabling this will significantly increase the size @@ -2071,7 +2059,7 @@ DOT_NUM_THREADS = 0 # The default value is: Helvetica. # This tag requires that the tag HAVE_DOT is set to YES. -DOT_FONTNAME = FreeSans +DOT_FONTNAME = Helvetica # The DOT_FONTSIZE tag can be used to set the size (in points) of the font of # dot graphs. diff --git a/docs/index.rst b/docs/index.rst index 6cbce63216438..ef1d4ec64eb55 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,6 +1,11 @@ Overview ======== +.. warning:: + + If you are using a released version of LLVM, see `the download page + <http://llvm.org/releases/>`_ to find your documentation. + The LLVM compiler infrastructure supports a wide range of projects, from industrial strength compilers to specialized JIT applications to small research projects. @@ -60,12 +65,13 @@ representation. :hidden: CMake + CMakePrimer + AdvancedBuilds HowToBuildOnARM HowToCrossCompileLLVM CommandGuide/index GettingStarted GettingStartedVS - BuildingLLVMWithAutotools FAQ Lexicon HowToAddABuilder @@ -81,7 +87,9 @@ representation. GetElementPtr Frontend/PerformanceTips MCJITDesignAndImplementation + CodeOfConduct CompileCudaWithLLVM + ReportingGuide :doc:`GettingStarted` Discusses how to get up and running quickly with the LLVM infrastructure. @@ -102,10 +110,6 @@ representation. An addendum to the main Getting Started guide for those using Visual Studio on Windows. -:doc:`BuildingLLVMWithAutotools` - An addendum to the Getting Started guide with instructions for building LLVM - with the Autotools build system. 
- :doc:`tutorial/index` Tutorials about using LLVM. Includes a tutorial about making a custom language with LLVM. @@ -174,6 +178,7 @@ For developers of applications which use LLVM as a library. ProgrammersManual Extensions LibFuzzer + ScudoHardenedAllocator :doc:`LLVM Language Reference Manual <LangRef>` Defines the LLVM intermediate representation and the assembly form of the @@ -218,6 +223,9 @@ For developers of applications which use LLVM as a library. :doc:`LibFuzzer` A library for writing in-process guided fuzzers. +:doc:`ScudoHardenedAllocator` + A library that implements a security-hardened `malloc()`. + Subsystem Documentation ======================= @@ -255,7 +263,7 @@ For API clients and LLVM developers. CoverageMappingFormat Statepoints MergeFunctions - BitSets + TypeMetadata FaultMaps MIRLangRef @@ -379,7 +387,6 @@ Information about LLVM's development process. :hidden: DeveloperPolicy - MakefileGuide Projects LLVMBuild HowToReleaseLLVM @@ -400,9 +407,6 @@ Information about LLVM's development process. Describes the LLVMBuild organization and files used by LLVM to specify component descriptions. -:doc:`MakefileGuide` - Describes how the LLVM makefiles work and how to use them. - :doc:`HowToReleaseLLVM` This is a guide to preparing LLVM releases. Most developers can ignore it. diff --git a/docs/tutorial/BuildingAJIT1.rst b/docs/tutorial/BuildingAJIT1.rst new file mode 100644 index 0000000000000..f30b979579dcf --- /dev/null +++ b/docs/tutorial/BuildingAJIT1.rst @@ -0,0 +1,375 @@ +======================================================= +Building a JIT: Starting out with KaleidoscopeJIT +======================================================= + +.. contents:: + :local: + +Chapter 1 Introduction +====================== + +Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This +tutorial runs through the implementation of a JIT compiler using LLVM's +On-Request-Compilation (ORC) APIs. 
It begins with a simplified version of the +KaleidoscopeJIT class used in the +`Implementing a language with LLVM <LangImpl1.html>`_ tutorials and then +introduces new features like optimization, lazy compilation and remote +execution. + +The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how +these APIs interact with other parts of LLVM, and to teach you how to recombine +them to build a custom JIT that is suited to your use-case. + +The structure of the tutorial is: + +- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will + introduce some of the basic concepts of the ORC JIT APIs, including the + idea of an ORC *Layer*. + +- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding + a new layer that will optimize IR and generated code. + +- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a + Compile-On-Demand layer to lazily compile IR. + +- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by + replacing the Compile-On-Demand layer with a custom layer that uses the ORC + Compile Callbacks API directly to defer IR-generation until functions are + called. + +- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into + a remote process with reduced privileges using the JIT Remote APIs. + +To provide input for our JIT we will use the Kaleidoscope REPL from +`Chapter 7 <LangImpl7.html>`_ of the "Implementing a language in LLVM tutorial", +with one minor modification: We will remove the FunctionPassManager from the +code for that chapter and replace it with optimization support in our JIT class +in Chapter #2. + +Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API. +It was preceded by MCJIT, and before that by the (now deleted) legacy JIT. +These tutorials don't assume any experience with these earlier APIs, but +readers acquainted with them will see many familiar elements. 
Where appropriate +we will make this connection with the earlier APIs explicit to help people who +are transitioning from them to ORC. + +JIT API Basics +============== + +The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed, +rather than compiling whole programs to disk ahead of time as a traditional +compiler does. To support that aim our initial, bare-bones JIT API will be: + +1. Handle addModule(Module &M) -- Make the given IR module available for + execution. +2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to + symbols (functions or variables) that have been added to the JIT. +3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any + memory that had been used for the compiled code. + +A basic use-case for this API, executing the 'main' function from a module, +will look like: + +.. code-block:: c++ + + std::unique_ptr<Module> M = buildModule(); + JIT J; + Handle H = J.addModule(*M); + int (*Main)(int, char*[]) = + (int(*)(int, char*[]))J.findSymbol("main").getAddress(); + int Result = Main(0, nullptr); + J.removeModule(H); + +The APIs that we build in these tutorials will all be variations on this simple +theme. Behind the API we will refine the implementation of the JIT to add +support for optimization and lazy compilation. Eventually we will extend the +API itself to allow higher-level program representations (e.g. ASTs) to be +added to the JIT. + +KaleidoscopeJIT +=============== + +In the previous section we described our API; now we examine a simple +implementation of it: The KaleidoscopeJIT class [1]_ that was used in the +`Implementing a language with LLVM <LangImpl1.html>`_ tutorials. We will use +the REPL code from `Chapter 7 <LangImpl7.html>`_ of that tutorial to supply the +input for our JIT: Each time the user enters an expression the REPL will add a +new IR module containing the code for that expression to the JIT.
If the +expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also +use the findSymbol method of our JIT class to find and execute the code for the +expression, and then use the removeModule method to remove the code again +(since there's no way to re-invoke an anonymous expression). In later chapters +of this tutorial we'll modify the REPL to enable new interactions with our JIT +class, but for now we will take this setup for granted and focus our attention on +the implementation of our JIT itself. + +Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the +usual include guards and #includes [2]_, we get to the definition of our class: + +.. code-block:: c++ + + #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H + #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H + + #include "llvm/ExecutionEngine/ExecutionEngine.h" + #include "llvm/ExecutionEngine/RTDyldMemoryManager.h" + #include "llvm/ExecutionEngine/Orc/CompileUtils.h" + #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h" + #include "llvm/ExecutionEngine/Orc/LambdaResolver.h" + #include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h" + #include "llvm/IR/Mangler.h" + #include "llvm/Support/DynamicLibrary.h" + + namespace llvm { + namespace orc { + + class KaleidoscopeJIT { + private: + + std::unique_ptr<TargetMachine> TM; + const DataLayout DL; + ObjectLinkingLayer<> ObjectLayer; + IRCompileLayer<decltype(ObjectLayer)> CompileLayer; + + public: + + typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandleT; + +Our class begins with four members: a TargetMachine, TM, which will be used +to build our LLVM compiler instance; a DataLayout, DL, which will be used for +symbol mangling (more on that later), and two ORC *layers*: an +ObjectLinkingLayer and an IRCompileLayer. We'll be talking more about layers in +the next chapter, but for now you can think of them as analogous to LLVM +Passes: they wrap up useful JIT utilities behind an easy-to-compose interface.
+The first layer, ObjectLinkingLayer, is the foundation of our JIT: it takes +in-memory object files produced by a compiler and links them on the fly to make +them executable. This JIT-on-top-of-a-linker design was introduced in MCJIT, +however the linker was hidden inside the MCJIT class. In ORC we expose the +linker so that clients can access and configure it directly if they need to. In +this tutorial our ObjectLinkingLayer will just be used to support the next layer +in our stack: the IRCompileLayer, which will be responsible for taking LLVM IR, +compiling it, and passing the resulting in-memory object files down to the +object linking layer below. + +That's it for member variables, after that we have a single typedef: +ModuleHandle. This is the handle type that will be returned from our JIT's +addModule method, and can be passed to the removeModule method to remove a +module. The IRCompileLayer class already provides a convenient handle type +(IRCompileLayer::ModuleSetHandleT), so we just alias our ModuleHandle to this. + +.. code-block:: c++ + + KaleidoscopeJIT() + : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()), + CompileLayer(ObjectLayer, SimpleCompiler(*TM)) { + llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); + } + + TargetMachine &getTargetMachine() { return *TM; } + +Next up we have our class constructor. We begin by initializing TM using the +EngineBuilder::selectTarget helper method, which constructs a TargetMachine for +the current process. Next we use our newly created TargetMachine to initialize +DL, our DataLayout. Then we initialize our IRCompileLayer. Our IRCompile layer +needs two things: (1) A reference to our object linking layer, and (2) a +compiler instance to use to perform the actual compilation from IR to object +files. We use the off-the-shelf SimpleCompiler instance for now. Finally, in +the body of the constructor, we call the DynamicLibrary::LoadLibraryPermanently +method with a nullptr argument. 
Normally the LoadLibraryPermanently method is +called with the path of a dynamic library to load, but when passed a null +pointer it will 'load' the host process itself, making its exported symbols +available for execution. + +.. code-block:: c++ + + ModuleHandle addModule(std::unique_ptr<Module> M) { + // Build our symbol resolver: + // Lambda 1: Look back into the JIT itself to find symbols that are part of + // the same "logical dylib". + // Lambda 2: Search for external symbols in the host process. + auto Resolver = createLambdaResolver( + [&](const std::string &Name) { + if (auto Sym = CompileLayer.findSymbol(Name, false)) + return Sym.toRuntimeDyldSymbol(); + return RuntimeDyld::SymbolInfo(nullptr); + }, + [](const std::string &Name) { + if (auto SymAddr = + RTDyldMemoryManager::getSymbolAddressInProcess(Name)) + return RuntimeDyld::SymbolInfo(SymAddr, JITSymbolFlags::Exported); + return RuntimeDyld::SymbolInfo(nullptr); + }); + + // Build a singleton module set to hold our module. + std::vector<std::unique_ptr<Module>> Ms; + Ms.push_back(std::move(M)); + + // Add the set to the JIT with the resolver we created above and a newly + // created SectionMemoryManager. + return CompileLayer.addModuleSet(std::move(Ms), + make_unique<SectionMemoryManager>(), + std::move(Resolver)); + } + +Now we come to the first of our JIT API methods: addModule. This method is +responsible for adding IR to the JIT and making it available for execution. In +this initial implementation of our JIT we will make our modules "available for +execution" by adding them straight to the IRCompileLayer, which will +immediately compile them. In later chapters we will teach our JIT to be lazier +and instead add the Modules to a "pending" list to be compiled if and when they +are first executed. + +To add our module to the IRCompileLayer we need to supply two auxiliary objects +(as well as the module itself): a memory manager and a symbol resolver.
The +memory manager will be responsible for managing the memory allocated to JIT'd +machine code, setting memory permissions, and registering exception handling +tables (if the JIT'd code uses exceptions). For our memory manager we will use +the SectionMemoryManager class: another off-the-shelf utility that provides all +the basic functionality we need. The second auxiliary class, the symbol +resolver, is more interesting for us. It exists to tell the JIT where to look +when it encounters an *external symbol* in the module we are adding. External +symbols are any symbol not defined within the module itself, including calls to +functions outside the JIT and calls to functions defined in other modules that +have already been added to the JIT. It may seem as though modules added to the +JIT should "know about one another" by default, but since we would still have to +supply a symbol resolver for references to code outside the JIT it turns out to +be easier to just re-use this one mechanism for all symbol resolution. This has +the added benefit that the user has full control over the symbol resolution +process. Should we search for definitions within the JIT first, then fall back +on external definitions? Or should we prefer external definitions where +available and only JIT code if we don't already have an available +implementation? By using a single symbol resolution scheme we are free to choose +whatever makes the most sense for any given use case. + +Building a symbol resolver is made especially easy by the *createLambdaResolver* +function. This function takes two lambdas [3]_ and returns a +RuntimeDyld::SymbolResolver instance. The first lambda is used as the +implementation of the resolver's findSymbolInLogicalDylib method, which searches +for symbol definitions that should be thought of as being part of the same +"logical" dynamic library as this Module. 
If you are familiar with static +linking: this means that findSymbolInLogicalDylib should expose symbols with +common linkage and hidden visibility. If all this sounds foreign you can ignore +the details and just remember that this is the first method that the linker will +use to try to find a symbol definition. If the findSymbolInLogicalDylib method +returns a null result then the linker will call the second symbol resolver +method, called findSymbol, which searches for symbols that should be thought of +as external to (but visible from) the module and its logical dylib. In this +tutorial we will adopt the following simple scheme: All modules added to the JIT +will behave as if they were linked into a single, ever-growing logical dylib. To +implement this our first lambda (the one defining findSymbolInLogicalDylib) will +just search for JIT'd code by calling the CompileLayer's findSymbol method. If +we don't find a symbol in the JIT itself we'll fall back to our second lambda, +which implements findSymbol. This will use the +RTDyldMemoryManager::getSymbolAddressInProcess method to search for the symbol +within the program itself. If we can't find a symbol definition via either of +these paths the JIT will refuse to accept our module, returning a "symbol not +found" error. + +Now that we've built our symbol resolver we're ready to add our module to the +JIT. We do this by calling the CompileLayer's addModuleSet method [4]_. Since +we only have a single Module and addModuleSet expects a collection, we will +create a vector of modules and add our module as the only member. Since we +have already typedef'd our ModuleHandle type to be the same as the +CompileLayer's handle type, we can return the handle from addModuleSet +directly from our addModule method. + +..
code-block:: c++
+
+  JITSymbol findSymbol(const std::string Name) {
+    std::string MangledName;
+    raw_string_ostream MangledNameStream(MangledName);
+    Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
+    return CompileLayer.findSymbol(MangledNameStream.str(), true);
+  }
+
+  void removeModule(ModuleHandle H) {
+    CompileLayer.removeModuleSet(H);
+  }
+
+Now that we can add code to our JIT, we need a way to find the symbols we've
+added to it. To do that we call the findSymbol method on our IRCompileLayer,
+but with a twist: We have to *mangle* the name of the symbol we're searching
+for first. The reason for this is that the ORC JIT components use mangled
+symbols internally the same way a static compiler and linker would, rather
+than using plain IR symbol names. The kind of mangling will depend on the
+DataLayout, which in turn depends on the target platform. To allow us to
+remain portable and search based on the un-mangled name, we just reproduce
+this mangling ourselves.
+
+We now come to the last method in our JIT API: removeModule. This method is
+responsible for destructing the MemoryManager and SymbolResolver that were
+added with a given module, freeing any resources they were using in the
+process. In our Kaleidoscope demo we rely on this method to remove the module
+representing the most recent top-level expression, preventing it from being
+treated as a duplicate definition when the next top-level expression is
+entered. It is generally good to free any module that you know you won't need
+to call further, just to free up the resources dedicated to it. However, you
+don't strictly need to do this: All resources will be cleaned up when your
+JIT class is destructed, if they haven't been freed before then.
+
+This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
+but fully functioning JIT stack that you can use to take LLVM IR and make it
+executable within the context of your JIT process. 
In the next chapter we'll +look at how to extend this JIT to produce better quality code, and in the +process take a deeper look at the ORC layer concept. + +`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_ + +Full Code Listing +================= + +Here is the complete code listing for our running example. To build this +example, use: + +.. code-block:: bash + + # Compile + clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy + # Run + ./toy + +Here is the code: + +.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h + :language: c++ + +.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a + simplifying assumption: symbols cannot be re-defined. This will make it + impossible to re-define symbols in the REPL, but will make our symbol + lookup logic simpler. Re-introducing support for symbol redefinition is + left as an exercise for the reader. (The KaleidoscopeJIT.h used in the + original tutorials will be a helpful reference). + +.. [2] +-----------------------+-----------------------------------------------+ + | File | Reason for inclusion | + +=======================+===============================================+ + | ExecutionEngine.h | Access to the EngineBuilder::selectTarget | + | | method. | + +-----------------------+-----------------------------------------------+ + | | Access to the | + | RTDyldMemoryManager.h | RTDyldMemoryManager::getSymbolAddressInProcess| + | | method. | + +-----------------------+-----------------------------------------------+ + | CompileUtils.h | Provides the SimpleCompiler class. | + +-----------------------+-----------------------------------------------+ + | IRCompileLayer.h | Provides the IRCompileLayer class. 
| + +-----------------------+-----------------------------------------------+ + | | Access the createLambdaResolver function, | + | LambdaResolver.h | which provides easy construction of symbol | + | | resolvers. | + +-----------------------+-----------------------------------------------+ + | ObjectLinkingLayer.h | Provides the ObjectLinkingLayer class. | + +-----------------------+-----------------------------------------------+ + | Mangler.h | Provides the Mangler class for platform | + | | specific name-mangling. | + +-----------------------+-----------------------------------------------+ + | DynamicLibrary.h | Provides the DynamicLibrary class, which | + | | makes symbols in the host process searchable. | + +-----------------------+-----------------------------------------------+ + +.. [3] Actually they don't have to be lambdas, any object with a call operator + will do, including plain old functions or std::functions. + +.. [4] ORC layers accept sets of Modules, rather than individual ones, so that + all Modules in the set could be co-located by the memory manager, though + this feature is not yet implemented. diff --git a/docs/tutorial/BuildingAJIT2.rst b/docs/tutorial/BuildingAJIT2.rst new file mode 100644 index 0000000000000..8fa92317f54fe --- /dev/null +++ b/docs/tutorial/BuildingAJIT2.rst @@ -0,0 +1,336 @@ +===================================================================== +Building a JIT: Adding Optimizations -- An introduction to ORC Layers +===================================================================== + +.. contents:: + :local: + +**This tutorial is under active development. It is incomplete and details may +change frequently.** Nonetheless we invite you to try it out as it stands, and +we welcome any feedback. + +Chapter 2 Introduction +====================== + +Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. 
In
+`Chapter 1 <BuildingAJIT1.html>`_ of this series we examined a basic JIT
+class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
+executable code in memory. KaleidoscopeJIT was able to do this with relatively
+little code by composing two off-the-shelf *ORC layers*, IRCompileLayer and
+ObjectLinkingLayer, which do much of the heavy lifting.
+
+In this chapter we'll learn more about the ORC layer concept by using a new
+layer, IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
+
+Optimizing Modules using the IRTransformLayer
+=============================================
+
+In `Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
+tutorial series the LLVM *FunctionPassManager* is introduced as a means for
+optimizing LLVM IR. Interested readers may read that chapter for details, but
+in short: to optimize a Module we create an llvm::FunctionPassManager
+instance, configure it with a set of optimizations, then run the PassManager on
+a Module to mutate it into a (hopefully) more optimized but semantically
+equivalent form. In the original tutorial series the FunctionPassManager was
+created outside the KaleidoscopeJIT and modules were optimized before being
+added to it. In this chapter we will make optimization a phase of our JIT
+instead. For now this will provide us with a motivation to learn more about
+ORC layers, but in the long term making optimization part of our JIT will
+yield an important benefit: When we begin lazily compiling code (i.e. deferring
+compilation of each function until the first time it's run), having
+optimization managed by our JIT will allow us to optimize lazily too, rather
+than having to do all our optimization up-front.
+
+To add optimization support to our JIT we will take the KaleidoscopeJIT from
+Chapter 1 and compose an ORC *IRTransformLayer* on top. 
We will look at how the +IRTransformLayer works in more detail below, but the interface is simple: the +constructor for this layer takes a reference to the layer below (as all layers +do) plus an *IR optimization function* that it will apply to each Module that +is added via addModuleSet: + +.. code-block:: c++ + + class KaleidoscopeJIT { + private: + std::unique_ptr<TargetMachine> TM; + const DataLayout DL; + ObjectLinkingLayer<> ObjectLayer; + IRCompileLayer<decltype(ObjectLayer)> CompileLayer; + + typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)> + OptimizeFunction; + + IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer; + + public: + typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle; + + KaleidoscopeJIT() + : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()), + CompileLayer(ObjectLayer, SimpleCompiler(*TM)), + OptimizeLayer(CompileLayer, + [this](std::unique_ptr<Module> M) { + return optimizeModule(std::move(M)); + }) { + llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); + } + +Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1, +but after the CompileLayer we introduce a typedef for our optimization function. +In this case we use a std::function (a handy wrapper for "function-like" things) +from a single unique_ptr<Module> input to a std::unique_ptr<Module> output. With +our optimization function typedef in place we can declare our OptimizeLayer, +which sits on top of our CompileLayer. + +To initialize our OptimizeLayer we pass it a reference to the CompileLayer +below (standard practice for layers), and we initialize the OptimizeFunction +using a lambda that calls out to an "optimizeModule" function that we will +define below. + +.. code-block:: c++ + + // ... 
+    auto Resolver = createLambdaResolver(
+        [&](const std::string &Name) {
+          if (auto Sym = OptimizeLayer.findSymbol(Name, false))
+            return Sym.toRuntimeDyldSymbol();
+          return RuntimeDyld::SymbolInfo(nullptr);
+        },
+    // ...
+
+.. code-block:: c++
+
+    // ...
+    return OptimizeLayer.addModuleSet(std::move(Ms),
+                                      make_unique<SectionMemoryManager>(),
+                                      std::move(Resolver));
+    // ...
+
+.. code-block:: c++
+
+    // ...
+    return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
+    // ...
+
+.. code-block:: c++
+
+    // ...
+    OptimizeLayer.removeModuleSet(H);
+    // ...
+
+Next we need to replace references to 'CompileLayer' with references to
+OptimizeLayer in our key methods: addModule, findSymbol, and removeModule. In
+addModule we need to be careful to replace both references: the findSymbol call
+inside our resolver, and the call through to addModuleSet.
+
+.. code-block:: c++
+
+  std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
+    // Create a function pass manager.
+    auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());
+
+    // Add some optimizations.
+    FPM->add(createInstructionCombiningPass());
+    FPM->add(createReassociatePass());
+    FPM->add(createGVNPass());
+    FPM->add(createCFGSimplificationPass());
+    FPM->doInitialization();
+
+    // Run the optimizations over all functions in the module being added to
+    // the JIT.
+    for (auto &F : *M)
+      FPM->run(F);
+
+    return M;
+  }
+
+At the bottom of our JIT we add a private method to do the actual optimization:
+*optimizeModule*. This function sets up a FunctionPassManager, adds some passes
+to it, runs it over every function in the module, and then returns the mutated
+module. The specific optimizations are the same ones used in
+`Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
+tutorial series. Readers may visit that chapter for a more in-depth
+discussion of these, and of IR optimization in general. 
+ +And that's it in terms of changes to KaleidoscopeJIT: When a module is added via +addModule the OptimizeLayer will call our optimizeModule function before passing +the transformed module on to the CompileLayer below. Of course, we could have +called optimizeModule directly in our addModule function and not gone to the +bother of using the IRTransformLayer, but doing so gives us another opportunity +to see how layers compose. It also provides a neat entry point to the *layer* +concept itself, because IRTransformLayer turns out to be one of the simplest +implementations of the layer concept that can be devised: + +.. code-block:: c++ + + template <typename BaseLayerT, typename TransformFtor> + class IRTransformLayer { + public: + typedef typename BaseLayerT::ModuleSetHandleT ModuleSetHandleT; + + IRTransformLayer(BaseLayerT &BaseLayer, + TransformFtor Transform = TransformFtor()) + : BaseLayer(BaseLayer), Transform(std::move(Transform)) {} + + template <typename ModuleSetT, typename MemoryManagerPtrT, + typename SymbolResolverPtrT> + ModuleSetHandleT addModuleSet(ModuleSetT Ms, + MemoryManagerPtrT MemMgr, + SymbolResolverPtrT Resolver) { + + for (auto I = Ms.begin(), E = Ms.end(); I != E; ++I) + *I = Transform(std::move(*I)); + + return BaseLayer.addModuleSet(std::move(Ms), std::move(MemMgr), + std::move(Resolver)); + } + + void removeModuleSet(ModuleSetHandleT H) { BaseLayer.removeModuleSet(H); } + + JITSymbol findSymbol(const std::string &Name, bool ExportedSymbolsOnly) { + return BaseLayer.findSymbol(Name, ExportedSymbolsOnly); + } + + JITSymbol findSymbolIn(ModuleSetHandleT H, const std::string &Name, + bool ExportedSymbolsOnly) { + return BaseLayer.findSymbolIn(H, Name, ExportedSymbolsOnly); + } + + void emitAndFinalize(ModuleSetHandleT H) { + BaseLayer.emitAndFinalize(H); + } + + TransformFtor& getTransform() { return Transform; } + + const TransformFtor& getTransform() const { return Transform; } + + private: + BaseLayerT &BaseLayer; + TransformFtor 
Transform;
+  };
+
+This is the whole definition of IRTransformLayer, from
+``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h``, stripped of its
+comments. It is a template class with two template arguments, ``BaseLayerT``
+and ``TransformFtor``, that provide the type of the base layer and the type of
+the "transform functor" (in our case a std::function) respectively. This class
+is concerned with two very simple jobs: (1) Running every IR Module that is
+added with addModuleSet through the transform functor, and (2) conforming to
+the ORC layer interface. The interface consists of one typedef and five
+methods:
+
++------------------+-----------------------------------------------------------+
+| Interface        | Description                                               |
++==================+===========================================================+
+| ModuleSetHandleT | Provides a handle that can be used to identify a module   |
+|                  | set when calling findSymbolIn, removeModuleSet, or        |
+|                  | emitAndFinalize.                                          |
++------------------+-----------------------------------------------------------+
+| addModuleSet     | Takes a given set of Modules and makes them "available    |
+|                  | for execution". This means that symbols in those modules  |
+|                  | should be searchable via findSymbol and findSymbolIn, and |
+|                  | the address of the symbols should be read/writable (for   |
+|                  | data symbols), or executable (for function symbols) after |
+|                  | JITSymbol::getAddress() is called. Note: This means that  |
+|                  | addModuleSet doesn't have to compile (or do any other     |
+|                  | work) up-front. It *can*, like IRCompileLayer, act        |
+|                  | eagerly, but it can also simply record the module and     |
+|                  | take no further action until somebody calls               |
+|                  | JITSymbol::getAddress(). In IRTransformLayer's case       |
+|                  | addModuleSet eagerly applies the transform functor to     |
+|                  | each module in the set, then passes the resulting set of  |
+|                  | mutated modules down to the layer below.                  |
++------------------+-----------------------------------------------------------+
+| removeModuleSet  | Removes a set of modules from the JIT. Code or data       |
+|                  | defined in these modules will no longer be available, and |
+|                  | the memory holding the JIT'd definitions will be freed.   |
++------------------+-----------------------------------------------------------+
+| findSymbol       | Searches for the named symbol in all modules that have    |
+|                  | previously been added via addModuleSet (and not yet       |
+|                  | removed by a call to removeModuleSet). In                 |
+|                  | IRTransformLayer we just pass the query on to the layer   |
+|                  | below. In our REPL this is our default way to search for  |
+|                  | function definitions.                                     |
++------------------+-----------------------------------------------------------+
+| findSymbolIn     | Searches for the named symbol in the module set indicated |
+|                  | by the given ModuleSetHandleT. This is just an optimized  |
+|                  | search, better for lookup speed when you know exactly     |
+|                  | where a symbol definition should be found. In             |
+|                  | IRTransformLayer we just pass this query on to the layer  |
+|                  | below. In our REPL we use this method to search for       |
+|                  | functions representing top-level expressions, since we    |
+|                  | know exactly where we'll find them: in the top-level      |
+|                  | expression module we just added.                          |
++------------------+-----------------------------------------------------------+
+| emitAndFinalize  | Forces all of the actions required to make the code and   |
+|                  | data in a module set (represented by a ModuleSetHandleT)  |
+|                  | accessible. Behaves as if some symbol in the set had been |
+|                  | searched for and JITSymbol::getAddress called. This is    |
+|                  | rarely needed, but can be useful when dealing with layers |
+|                  | that usually behave lazily if the user wants to trigger   |
+|                  | early compilation (for example, to use idle CPU time to   |
+|                  | eagerly compile code in the background).                  |
++------------------+-----------------------------------------------------------+
+
+This interface attempts to capture the natural operations of a JIT (with some
+wrinkles like emitAndFinalize for performance), similar to the basic JIT API
+operations we identified in Chapter 1. Conforming to the layer concept allows
+classes to compose neatly by implementing their behaviors in terms of these
+same operations, carried out on the layer below. For example, an eager layer
+(like IRTransformLayer) can implement addModuleSet by running each module in
+the set through its transform up-front and immediately passing the result to
+the layer below. A lazy layer, by contrast, could implement addModuleSet by
+squirreling away the modules and doing no other up-front work, but applying
+the transform (and calling addModuleSet on the layer below) when the client
+calls findSymbol instead. The JIT'd program behavior will be the same either
+way, but these choices will have different performance characteristics: Doing
+work eagerly means the JIT takes longer up-front, but proceeds smoothly once
+this is done. Deferring work allows the JIT to get up-and-running quickly, but
+will force the JIT to pause and wait whenever some code or data is needed that
+hasn't already been processed.
+
+Our current REPL is eager: Each function definition is optimized and compiled
+as soon as it's typed in. If we were to make the transform layer lazy (but not
+change things otherwise) we could defer optimization until the first time we
+reference a function in a top-level expression (see if you can figure out why,
+then check out the answer below [1]_). In the next chapter, however, we'll
+introduce fully lazy compilation, in which functions aren't compiled until
+they're first called at run-time. 
At this point the trade-offs get much more
+interesting: the lazier we are, the quicker we can start executing the first
+function, but the more often we'll have to pause to compile newly encountered
+functions. If we only code-gen lazily, but optimize eagerly, we'll have a slow
+startup (in which everything is optimized) but relatively short pauses as each
+function just passes through code-gen. If we both optimize and code-gen lazily
+we can start executing the first function more quickly, but we'll have longer
+pauses as each function has to be both optimized and code-gen'd when it's first
+executed. Things become even more interesting if we consider interprocedural
+optimizations like inlining, which must be performed eagerly. These are
+complex trade-offs, and there is no one-size-fits-all solution to them, but by
+providing composable layers we leave the decisions to the person implementing
+the JIT, and make it easy for them to experiment with different configurations.
+
+`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with an
+IRTransformLayer added to enable optimization. To build this example, use:
+
+.. code-block:: bash
+
+    # Compile
+    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+    # Run
+    ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
+   :language: c++
+
+.. [1] When we add our top-level expression to the JIT, any calls to functions
+       that we defined earlier will appear to the ObjectLinkingLayer as
+       external symbols. The ObjectLinkingLayer will call the SymbolResolver
+       that we defined in addModuleSet, which in turn calls findSymbol on the
+       OptimizeLayer, at which point even a lazy transform layer will have to
+       do its work. 
diff --git a/docs/tutorial/BuildingAJIT3.rst b/docs/tutorial/BuildingAJIT3.rst
new file mode 100644
index 0000000000000..ba0dab91c4ef5
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT3.rst
@@ -0,0 +1,171 @@
+=============================================
+Building a JIT: Per-function Lazy Compilation
+=============================================
+
+.. contents::
+   :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 3 Introduction
+======================
+
+Welcome to Chapter 3 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter discusses lazy JITing and shows you how to enable it by adding an ORC
+CompileOnDemand layer to the JIT from `Chapter 2 <BuildingAJIT2.html>`_.
+
+Lazy Compilation
+================
+
+When we add a module to the KaleidoscopeJIT class described in Chapter 2 it is
+immediately optimized, compiled and linked for us by the IRTransformLayer,
+IRCompileLayer and ObjectLinkingLayer respectively. This scheme, where all the
+work to make a Module executable is done up front, is relatively simple to
+understand, and its performance characteristics are easy to reason about.
+However, it will lead to very high startup times if the amount of code to be
+compiled is large, and may also do a lot of unnecessary compilation if only a
+few compiled functions are ever called at runtime. A truly "just-in-time"
+compiler should allow us to defer the compilation of any given function until
+the moment that function is first called, improving launch times and
+eliminating redundant work. In fact, the ORC APIs provide us with a layer to
+lazily compile LLVM IR: *CompileOnDemandLayer*.
+
+The CompileOnDemandLayer conforms to the layer interface described in Chapter 2,
+but the addModuleSet method behaves quite differently from the layers we have
+seen so far: rather than doing any work up front, it just constructs a *stub*
+for each function in the module and arranges for the stub to trigger compilation
+of the actual function the first time it is called. Because stub functions are
+very cheap to produce, CompileOnDemand's addModuleSet method runs very quickly,
+reducing the time required to launch the first function to be executed, and
+saving us from doing any redundant compilation. By conforming to the layer
+interface, CompileOnDemand can be easily added on top of our existing JIT class.
+We just need a few changes:
+
+.. code-block:: c++
+
+  ...
+  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
+  #include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
+  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
+  ...
+
+  ...
+  class KaleidoscopeJIT {
+  private:
+    std::unique_ptr<TargetMachine> TM;
+    const DataLayout DL;
+    std::unique_ptr<JITCompileCallbackManager> CompileCallbackManager;
+    ObjectLinkingLayer<> ObjectLayer;
+    IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
+
+    typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
+      OptimizeFunction;
+
+    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;
+    CompileOnDemandLayer<decltype(OptimizeLayer)> CODLayer;
+
+  public:
+    typedef decltype(CODLayer)::ModuleSetHandleT ModuleHandle;
+
+First we need to include the CompileOnDemandLayer.h header, then add two new
+members, a std::unique_ptr<JITCompileCallbackManager> and a
+CompileOnDemandLayer, to our class. The JITCompileCallbackManager is a utility
+that enables us to create re-entry points into the compiler for functions that
+we want to lazily compile. 
In the next chapter we'll be looking at this class in detail, but for +now we'll be treating it as an opaque utility: We just need to pass a reference +to it into our new CompileOnDemandLayer, and the layer will do all the work of +setting up the callbacks using the callback manager we gave it. + +.. code-block:: c++ + + KaleidoscopeJIT() + : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()), + CompileLayer(ObjectLayer, SimpleCompiler(*TM)), + OptimizeLayer(CompileLayer, + [this](std::unique_ptr<Module> M) { + return optimizeModule(std::move(M)); + }), + CompileCallbackManager( + orc::createLocalCompileCallbackManager(TM->getTargetTriple(), 0)), + CODLayer(OptimizeLayer, + [this](Function &F) { return std::set<Function*>({&F}); }, + *CompileCallbackManager, + orc::createLocalIndirectStubsManagerBuilder( + TM->getTargetTriple())) { + llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); + } + +Next we have to update our constructor to initialize the new members. To create +an appropriate compile callback manager we use the +createLocalCompileCallbackManager function, which takes a TargetMachine and a +TargetAddress to call if it receives a request to compile an unknown function. +In our simple JIT this situation is unlikely to come up, so we'll cheat and +just pass '0' here. In a production quality JIT you could give the address of a +function that throws an exception in order to unwind the JIT'd code stack. + +Now we can construct our CompileOnDemandLayer. Following the pattern from +previous layers we start by passing a reference to the next layer down in our +stack -- the OptimizeLayer. Next we need to supply a 'partitioning function': +when a not-yet-compiled function is called, the CompileOnDemandLayer will call +this function to ask us what we would like to compile. 
At a minimum we need to
+compile the function being called (given by the argument to the partitioning
+function), but we could also request that the CompileOnDemandLayer compile other
+functions that are unconditionally called (or highly likely to be called) from
+the function being called. For KaleidoscopeJIT we'll keep it simple and just
+request compilation of the function that was called. Next we pass a reference to
+our CompileCallbackManager. Finally, we need to supply an "indirect stubs
+manager builder". This is a function that constructs IndirectStubManagers, which
+are in turn used to build the stubs for each module. The CompileOnDemandLayer
+will call the indirect stub manager builder once for each call to addModuleSet,
+and use the resulting indirect stubs manager to create stubs for all functions
+in all modules added. If/when the module set is removed from the JIT the
+indirect stubs manager will be deleted, freeing any memory allocated to the
+stubs. We supply this function by using the
+createLocalIndirectStubsManagerBuilder utility.
+
+.. code-block:: c++
+
+    // ...
+          if (auto Sym = CODLayer.findSymbol(Name, false))
+    // ...
+    return CODLayer.addModuleSet(std::move(Ms),
+                                 make_unique<SectionMemoryManager>(),
+                                 std::move(Resolver));
+    // ...
+
+    // ...
+    return CODLayer.findSymbol(MangledNameStream.str(), true);
+    // ...
+
+    // ...
+    CODLayer.removeModuleSet(H);
+    // ...
+
+Finally, we need to replace the references to OptimizeLayer in our addModule,
+findSymbol, and removeModule methods. With that, we're up and running.
+
+**To be done:**
+
+**Discuss CompileCallbackManagers and IndirectStubManagers in more detail.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with a CompileOnDemand
+layer added to enable lazy function-at-a-time compilation. To build this
+example, use:
+
+.. 
code-block:: bash
+
+    # Compile
+    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+    # Run
+    ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter3/KaleidoscopeJIT.h
+   :language: c++
+
+`Next: Extreme Laziness -- Using Compile Callbacks to JIT directly from ASTs <BuildingAJIT4.html>`_
diff --git a/docs/tutorial/BuildingAJIT4.rst b/docs/tutorial/BuildingAJIT4.rst
new file mode 100644
index 0000000000000..39d9198a85c3d
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT4.rst
@@ -0,0 +1,48 @@
+===========================================================================
+Building a JIT: Extreme Laziness - Using Compile Callbacks to JIT from ASTs
+===========================================================================
+
+.. contents::
+   :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the Compile Callbacks and Indirect Stubs APIs and shows how
+they can be used to replace the CompileOnDemand layer from
+`Chapter 3 <BuildingAJIT3.html>`_ with a custom lazy-JITing scheme that JITs
+directly from Kaleidoscope ASTs.
+
+**To be done:**
+
+**(1) Describe the drawbacks of JITing from IR (have to compile to IR first,
+which reduces the benefits of laziness).**
+
+**(2) Describe CompileCallbackManagers and IndirectStubManagers in detail.**
+
+**(3) Run through the implementation of addFunctionAST.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example that JITs lazily from
+Kaleidoscope ASTs. To build this example, use:
+
+.. 
code-block:: bash
+
+    # Compile
+    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+    # Run
+    ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter4/KaleidoscopeJIT.h
+   :language: c++
+
+`Next: Remote-JITing -- Process-isolation and laziness-at-a-distance <BuildingAJIT5.html>`_
diff --git a/docs/tutorial/BuildingAJIT5.rst b/docs/tutorial/BuildingAJIT5.rst
new file mode 100644
index 0000000000000..94ea92ce5ad2b
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT5.rst
@@ -0,0 +1,55 @@
+=============================================================================
+Building a JIT: Remote-JITing -- Process Isolation and Laziness at a Distance
+=============================================================================
+
+.. contents::
+   :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the ORC RemoteJIT Client/Server APIs and shows how to use
+them to build a JIT stack that will execute its code via a communications
+channel with a different process. This can be a separate process on the same
+machine, a process on a different machine, or even a process on a different
+platform/architecture. The code builds on top of the lazy-AST-compiling JIT
+stack from `Chapter 4 <BuildingAJIT4.html>`_.
+
+**To be done -- this is going to be a long one:**
+
+**(1) Introduce channels, RPC, RemoteJIT Client and Server APIs**
+
+**(2) Describe the client code in greater detail. 
Discuss modifications of the +KaleidoscopeJIT class, and the REPL itself.** + +**(3) Describe the server code.** + +**(4) Describe how to run the demo.** + +Full Code Listing +================= + +Here is the complete code listing for our running example that JITs lazily from +Kaleidoscope ASTS. To build this example, use: + +.. code-block:: bash + + # Compile + clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy + # Run + ./toy + +Here is the code for the modified KaleidoscopeJIT: + +.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/KaleidoscopeJIT.h + :language: c++ + +And the code for the JIT server: + +.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/Server/server.cpp + :language: c++ diff --git a/docs/tutorial/LangImpl1.rst b/docs/tutorial/LangImpl01.rst index b04cde10274e0..f7fbd150ef11a 100644 --- a/docs/tutorial/LangImpl1.rst +++ b/docs/tutorial/LangImpl01.rst @@ -42,45 +42,48 @@ in the various pieces. The structure of the tutorial is: to implement everything in C++ instead of using lexer and parser generators. LLVM obviously works just fine with such tools, feel free to use one if you prefer. -- `Chapter #2 <LangImpl2.html>`_: Implementing a Parser and AST - +- `Chapter #2 <LangImpl02.html>`_: Implementing a Parser and AST - With the lexer in place, we can talk about parsing techniques and basic AST construction. This tutorial describes recursive descent parsing and operator precedence parsing. Nothing in Chapters 1 or 2 is LLVM-specific, the code doesn't even link in LLVM at this point. :) -- `Chapter #3 <LangImpl3.html>`_: Code generation to LLVM IR - With +- `Chapter #3 <LangImpl03.html>`_: Code generation to LLVM IR - With the AST ready, we can show off how easy generation of LLVM IR really is. 
-- `Chapter #4 <LangImpl4.html>`_: Adding JIT and Optimizer Support +- `Chapter #4 <LangImpl04.html>`_: Adding JIT and Optimizer Support - Because a lot of people are interested in using LLVM as a JIT, we'll dive right into it and show you the 3 lines it takes to add JIT support. LLVM is also useful in many other ways, but this is one simple and "sexy" way to show off its power. :) -- `Chapter #5 <LangImpl5.html>`_: Extending the Language: Control +- `Chapter #5 <LangImpl05.html>`_: Extending the Language: Control Flow - With the language up and running, we show how to extend it with control flow operations (if/then/else and a 'for' loop). This gives us a chance to talk about simple SSA construction and control flow. -- `Chapter #6 <LangImpl6.html>`_: Extending the Language: +- `Chapter #6 <LangImpl06.html>`_: Extending the Language: User-defined Operators - This is a silly but fun chapter that talks about extending the language to let the user program define their own arbitrary unary and binary operators (with assignable precedence!). This lets us build a significant piece of the "language" as library routines. -- `Chapter #7 <LangImpl7.html>`_: Extending the Language: Mutable +- `Chapter #7 <LangImpl07.html>`_: Extending the Language: Mutable Variables - This chapter talks about adding user-defined local variables along with an assignment operator. The interesting part about this is how easy and trivial it is to construct SSA form in LLVM: no, LLVM does *not* require your front-end to construct SSA form! -- `Chapter #8 <LangImpl8.html>`_: Extending the Language: Debug +- `Chapter #8 <LangImpl08.html>`_: Compiling to Object Files - This + chapter explains how to take LLVM IR and compile it down to object + files. 
+- `Chapter #9 <LangImpl09.html>`_: Extending the Language: Debug Information - Having built a decent little programming language with control flow, functions and mutable variables, we consider what it takes to add debug information to standalone executables. This debug information will allow you to set breakpoints in Kaleidoscope functions, print out argument variables, and call functions - all from within the debugger! -- `Chapter #9 <LangImpl8.html>`_: Conclusion and other useful LLVM +- `Chapter #10 <LangImpl10.html>`_: Conclusion and other useful LLVM tidbits - This chapter wraps up the series by talking about potential ways to extend the language, but also includes a bunch of pointers to info about "special topics" like adding garbage @@ -146,7 +149,7 @@ useful for mutually recursive functions). For example: A more interesting example is included in Chapter 6 where we write a little Kaleidoscope application that `displays a Mandelbrot -Set <LangImpl6.html#kicking-the-tires>`_ at various levels of magnification. +Set <LangImpl06.html#kicking-the-tires>`_ at various levels of magnification. Lets dive into the implementation of this language! @@ -280,11 +283,11 @@ file. These are handled with this code: } With this, we have the complete lexer for the basic Kaleidoscope -language (the `full code listing <LangImpl2.html#full-code-listing>`_ for the Lexer -is available in the `next chapter <LangImpl2.html>`_ of the tutorial). +language (the `full code listing <LangImpl02.html#full-code-listing>`_ for the Lexer +is available in the `next chapter <LangImpl02.html>`_ of the tutorial). Next we'll `build a simple parser that uses this to build an Abstract -Syntax Tree <LangImpl2.html>`_. When we have that, we'll include a +Syntax Tree <LangImpl02.html>`_. When we have that, we'll include a driver so that you can use the lexer and parser together. 
-`Next: Implementing a Parser and AST <LangImpl2.html>`_ +`Next: Implementing a Parser and AST <LangImpl02.html>`_ diff --git a/docs/tutorial/LangImpl2.rst b/docs/tutorial/LangImpl02.rst index dab60172b9882..701cbc9611363 100644 --- a/docs/tutorial/LangImpl2.rst +++ b/docs/tutorial/LangImpl02.rst @@ -176,17 +176,17 @@ be parsed. .. code-block:: c++ - /// Error* - These are little helper functions for error handling. - std::unique_ptr<ExprAST> Error(const char *Str) { - fprintf(stderr, "Error: %s\n", Str); + /// LogError* - These are little helper functions for error handling. + std::unique_ptr<ExprAST> LogError(const char *Str) { + fprintf(stderr, "LogError: %s\n", Str); return nullptr; } - std::unique_ptr<PrototypeAST> ErrorP(const char *Str) { - Error(Str); + std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) { + LogError(Str); return nullptr; } -The ``Error`` routines are simple helper routines that our parser will +The ``LogError`` routines are simple helper routines that our parser will use to handle errors. The error recovery in our parser will not be the best and is not particular user-friendly, but it will be enough for our tutorial. These routines make it easier to handle errors in routines @@ -233,7 +233,7 @@ the parenthesis operator is defined like this: return nullptr; if (CurTok != ')') - return Error("expected ')'"); + return LogError("expected ')'"); getNextToken(); // eat ). return V; } @@ -241,7 +241,7 @@ the parenthesis operator is defined like this: This function illustrates a number of interesting things about the parser: -1) It shows how we use the Error routines. When called, this function +1) It shows how we use the LogError routines. When called, this function expects that the current token is a '(' token, but after parsing the subexpression, it is possible that there is no ')' waiting. 
For example, if the user types in "(4 x" instead of "(4)", the parser should emit an @@ -288,7 +288,7 @@ function calls: break; if (CurTok != ',') - return Error("Expected ')' or ',' in argument list"); + return LogError("Expected ')' or ',' in argument list"); getNextToken(); } } @@ -324,7 +324,7 @@ primary expression, we need to determine what sort of expression it is: static std::unique_ptr<ExprAST> ParsePrimary() { switch (CurTok) { default: - return Error("unknown token when expecting an expression"); + return LogError("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: @@ -571,20 +571,20 @@ expressions): /// ::= id '(' id* ')' static std::unique_ptr<PrototypeAST> ParsePrototype() { if (CurTok != tok_identifier) - return ErrorP("Expected function name in prototype"); + return LogErrorP("Expected function name in prototype"); std::string FnName = IdentifierStr; getNextToken(); if (CurTok != '(') - return ErrorP("Expected '(' in prototype"); + return LogErrorP("Expected '(' in prototype"); // Read the list of argument names. std::vector<std::string> ArgNames; while (getNextToken() == tok_identifier) ArgNames.push_back(IdentifierStr); if (CurTok != ')') - return ErrorP("Expected ')' in prototype"); + return LogErrorP("Expected ')' in prototype"); // success. getNextToken(); // eat ')'. @@ -731,5 +731,5 @@ Here is the code: .. literalinclude:: ../../examples/Kaleidoscope/Chapter2/toy.cpp :language: c++ -`Next: Implementing Code Generation to LLVM IR <LangImpl3.html>`_ +`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_ diff --git a/docs/tutorial/LangImpl3.rst b/docs/tutorial/LangImpl03.rst index 83ad35f14aeea..2bb3a300026e0 100644 --- a/docs/tutorial/LangImpl3.rst +++ b/docs/tutorial/LangImpl03.rst @@ -67,26 +67,26 @@ way to model this. Again, this tutorial won't dwell on good software engineering practices: for our purposes, adding a virtual method is simplest. 
-The second thing we want is an "Error" method like we used for the +The second thing we want is an "LogError" method like we used for the parser, which will be used to report errors found during code generation (for example, use of an undeclared parameter): .. code-block:: c++ - static std::unique_ptr<Module> *TheModule; - static IRBuilder<> Builder(getGlobalContext()); - static std::map<std::string, Value*> NamedValues; + static LLVMContext TheContext; + static IRBuilder<> Builder(TheContext); + static std::unique_ptr<Module> TheModule; + static std::map<std::string, Value *> NamedValues; - Value *ErrorV(const char *Str) { - Error(Str); + Value *LogErrorV(const char *Str) { + LogError(Str); return nullptr; } -The static variables will be used during code generation. ``TheModule`` -is an LLVM construct that contains functions and global variables. In many -ways, it is the top-level structure that the LLVM IR uses to contain code. -It will own the memory for all of the IR that we generate, which is why -the codegen() method returns a raw Value\*, rather than a unique_ptr<Value>. +The static variables will be used during code generation. ``TheContext`` +is an opaque object that owns a lot of core LLVM data structures, such as +the type and constant value tables. We don't need to understand it in +detail, we just need a single instance to pass into APIs that require it. The ``Builder`` object is a helper object that makes it easy to generate LLVM instructions. Instances of the @@ -94,6 +94,12 @@ LLVM instructions. Instances of the class template keep track of the current place to insert instructions and has methods to create new instructions. +``TheModule`` is an LLVM construct that contains functions and global +variables. In many ways, it is the top-level structure that the LLVM IR +uses to contain code. It will own the memory for all of the IR that we +generate, which is why the codegen() method returns a raw Value\*, +rather than a unique_ptr<Value>. 
+ The ``NamedValues`` map keeps track of which values are defined in the current scope and what their LLVM representation is. (In other words, it is a symbol table for the code). In this form of Kaleidoscope, the only @@ -116,7 +122,7 @@ First we'll do numeric literals: .. code-block:: c++ Value *NumberExprAST::codegen() { - return ConstantFP::get(getGlobalContext(), APFloat(Val)); + return ConstantFP::get(LLVMContext, APFloat(Val)); } In the LLVM IR, numeric constants are represented with the @@ -133,7 +139,7 @@ are all uniqued together and shared. For this reason, the API uses the // Look this variable up in the function. Value *V = NamedValues[Name]; if (!V) - ErrorV("Unknown variable name"); + LogErrorV("Unknown variable name"); return V; } @@ -165,10 +171,10 @@ variables <LangImpl7.html#user-defined-local-variables>`_. case '<': L = Builder.CreateFCmpULT(L, R, "cmptmp"); // Convert bool 0/1 to double 0.0 or 1.0 - return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), + return Builder.CreateUIToFP(L, Type::getDoubleTy(LLVMContext), "booltmp"); default: - return ErrorV("invalid binary operator"); + return LogErrorV("invalid binary operator"); } } @@ -214,11 +220,11 @@ would return 0.0 and -1.0, depending on the input value. // Look up the name in the global module table. Function *CalleeF = TheModule->getFunction(Callee); if (!CalleeF) - return ErrorV("Unknown function referenced"); + return LogErrorV("Unknown function referenced"); // If argument mismatch error. if (CalleeF->arg_size() != Args.size()) - return ErrorV("Incorrect # arguments passed"); + return LogErrorV("Incorrect # arguments passed"); std::vector<Value *> ArgsV; for (unsigned i = 0, e = Args.size(); i != e; ++i) { @@ -264,9 +270,9 @@ with: Function *PrototypeAST::codegen() { // Make the function type: double(double,double) etc. 
std::vector<Type*> Doubles(Args.size(), - Type::getDoubleTy(getGlobalContext())); + Type::getDoubleTy(LLVMContext)); FunctionType *FT = - FunctionType::get(Type::getDoubleTy(getGlobalContext()), Doubles, false); + FunctionType::get(Type::getDoubleTy(LLVMContext), Doubles, false); Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule); @@ -328,7 +334,7 @@ codegen and attach a function body. return nullptr; if (!TheFunction->empty()) - return (Function*)ErrorV("Function cannot be redefined."); + return (Function*)LogErrorV("Function cannot be redefined."); For function definitions, we start by searching TheModule's symbol table for an @@ -340,7 +346,7 @@ assert that the function is empty (i.e. has no body yet) before we start. .. code-block:: c++ // Create a new basic block to start insertion into. - BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); + BasicBlock *BB = BasicBlock::Create(LLVMContext, "entry", TheFunction); Builder.SetInsertPoint(BB); // Record the function arguments in the NamedValues map. @@ -557,5 +563,5 @@ Here is the code: .. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp :language: c++ -`Next: Adding JIT and Optimizer Support <LangImpl4.html>`_ +`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_ diff --git a/docs/tutorial/LangImpl4.rst b/docs/tutorial/LangImpl04.rst index a671d0c37f9d8..78596cd8eee5d 100644 --- a/docs/tutorial/LangImpl4.rst +++ b/docs/tutorial/LangImpl04.rst @@ -131,7 +131,8 @@ for us: void InitializeModuleAndPassManager(void) { // Open a new module. - TheModule = llvm::make_unique<Module>("my cool jit", getGlobalContext()); + Context LLVMContext; + TheModule = llvm::make_unique<Module>("my cool jit", LLVMContext); TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout()); // Create a new pass manager attached to it. @@ -605,5 +606,5 @@ Here is the code: .. 
literalinclude:: ../../examples/Kaleidoscope/Chapter4/toy.cpp :language: c++ -`Next: Extending the language: control flow <LangImpl5.html>`_ +`Next: Extending the language: control flow <LangImpl05.html>`_ diff --git a/docs/tutorial/LangImpl5-cfg.png b/docs/tutorial/LangImpl05-cfg.png Binary files differindex cdba92ff6c5c9..cdba92ff6c5c9 100644 --- a/docs/tutorial/LangImpl5-cfg.png +++ b/docs/tutorial/LangImpl05-cfg.png diff --git a/docs/tutorial/LangImpl5.rst b/docs/tutorial/LangImpl05.rst index d916f92bf99e9..ae0935d9ba1f9 100644 --- a/docs/tutorial/LangImpl5.rst +++ b/docs/tutorial/LangImpl05.rst @@ -127,7 +127,7 @@ First we define a new parsing function: return nullptr; if (CurTok != tok_then) - return Error("expected then"); + return LogError("expected then"); getNextToken(); // eat the then auto Then = ParseExpression(); @@ -135,7 +135,7 @@ First we define a new parsing function: return nullptr; if (CurTok != tok_else) - return Error("expected else"); + return LogError("expected else"); getNextToken(); @@ -154,7 +154,7 @@ Next we hook it up as a primary expression: static std::unique_ptr<ExprAST> ParsePrimary() { switch (CurTok) { default: - return Error("unknown token when expecting an expression"); + return LogError("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: @@ -217,7 +217,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll see this graph: -.. figure:: LangImpl5-cfg.png +.. figure:: LangImpl05-cfg.png :align: center :alt: Example CFG @@ -292,7 +292,7 @@ for ``IfExprAST``: // Convert condition to a bool by comparing equal to 0.0. CondV = Builder.CreateFCmpONE( - CondV, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "ifcond"); + CondV, ConstantFP::get(LLVMContext, APFloat(0.0)), "ifcond"); This code is straightforward and similar to what we saw before. 
We emit the expression for the condition, then compare that value to zero to get @@ -305,9 +305,9 @@ a truth value as a 1-bit (bool) value. // Create blocks for the then and else cases. Insert the 'then' block at the // end of the function. BasicBlock *ThenBB = - BasicBlock::Create(getGlobalContext(), "then", TheFunction); - BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else"); - BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont"); + BasicBlock::Create(LLVMContext, "then", TheFunction); + BasicBlock *ElseBB = BasicBlock::Create(LLVMContext, "else"); + BasicBlock *MergeBB = BasicBlock::Create(LLVMContext, "ifcont"); Builder.CreateCondBr(CondV, ThenBB, ElseBB); @@ -400,7 +400,7 @@ code: TheFunction->getBasicBlockList().push_back(MergeBB); Builder.SetInsertPoint(MergeBB); PHINode *PN = - Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, "iftmp"); + Builder.CreatePHI(Type::getDoubleTy(LLVMContext), 2, "iftmp"); PN->addIncoming(ThenV, ThenBB); PN->addIncoming(ElseV, ElseBB); @@ -518,13 +518,13 @@ value to null in the AST node: getNextToken(); // eat the for. if (CurTok != tok_identifier) - return Error("expected identifier after for"); + return LogError("expected identifier after for"); std::string IdName = IdentifierStr; getNextToken(); // eat identifier. if (CurTok != '=') - return Error("expected '=' after for"); + return LogError("expected '=' after for"); getNextToken(); // eat '='. @@ -532,7 +532,7 @@ value to null in the AST node: if (!Start) return nullptr; if (CurTok != ',') - return Error("expected ',' after for start value"); + return LogError("expected ',' after for start value"); getNextToken(); auto End = ParseExpression(); @@ -549,7 +549,7 @@ value to null in the AST node: } if (CurTok != tok_in) - return Error("expected 'in' after for"); + return LogError("expected 'in' after for"); getNextToken(); // eat 'in'. auto Body = ParseExpression(); @@ -625,7 +625,7 @@ expression). 
Function *TheFunction = Builder.GetInsertBlock()->getParent(); BasicBlock *PreheaderBB = Builder.GetInsertBlock(); BasicBlock *LoopBB = - BasicBlock::Create(getGlobalContext(), "loop", TheFunction); + BasicBlock::Create(LLVMContext, "loop", TheFunction); // Insert an explicit fall through from the current block to the LoopBB. Builder.CreateBr(LoopBB); @@ -642,7 +642,7 @@ the two blocks. Builder.SetInsertPoint(LoopBB); // Start the PHI node with an entry for Start. - PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), + PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(LLVMContext), 2, VarName.c_str()); Variable->addIncoming(StartVal, PreheaderBB); @@ -693,7 +693,7 @@ table. return nullptr; } else { // If not specified, use 1.0. - StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0)); + StepVal = ConstantFP::get(LLVMContext, APFloat(1.0)); } Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar"); @@ -712,7 +712,7 @@ iteration of the loop. // Convert condition to a bool by comparing equal to 0.0. EndCond = Builder.CreateFCmpONE( - EndCond, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "loopcond"); + EndCond, ConstantFP::get(LLVMContext, APFloat(0.0)), "loopcond"); Finally, we evaluate the exit value of the loop, to determine whether the loop should exit. This mirrors the condition evaluation for the @@ -723,7 +723,7 @@ if/then/else statement. // Create the "after loop" block and insert it. BasicBlock *LoopEndBB = Builder.GetInsertBlock(); BasicBlock *AfterBB = - BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction); + BasicBlock::Create(LLVMContext, "afterloop", TheFunction); // Insert the conditional branch into the end of LoopEndBB. Builder.CreateCondBr(EndCond, LoopBB, AfterBB); @@ -751,7 +751,7 @@ insertion position to it. NamedValues.erase(VarName); // for expr always returns 0.0. 
-  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
+  return Constant::getNullValue(Type::getDoubleTy(LLVMContext));
 }

The final code handles various cleanups: now that we have the "NextVar"
@@ -786,5 +786,5 @@ Here is the code:
 .. literalinclude:: ../../examples/Kaleidoscope/Chapter5/toy.cpp
    :language: c++

-`Next: Extending the language: user-defined operators <LangImpl6.html>`_
+`Next: Extending the language: user-defined operators <LangImpl06.html>`_
diff --git a/docs/tutorial/LangImpl6.rst b/docs/tutorial/LangImpl06.rst
index 827cd392effbb..7c9a2123e8f38 100644
--- a/docs/tutorial/LangImpl6.rst
+++ b/docs/tutorial/LangImpl06.rst
@@ -176,7 +176,7 @@ user-defined operator, we need to parse it:

   switch (CurTok) {
   default:
-    return ErrorP("Expected function name in prototype");
+    return LogErrorP("Expected function name in prototype");
   case tok_identifier:
     FnName = IdentifierStr;
     Kind = 0;
@@ -185,7 +185,7 @@ user-defined operator, we need to parse it:
   case tok_binary:
     getNextToken();
     if (!isascii(CurTok))
-      return ErrorP("Expected binary operator");
+      return LogErrorP("Expected binary operator");
     FnName = "binary";
     FnName += (char)CurTok;
     Kind = 2;
@@ -194,7 +194,7 @@ user-defined operator, we need to parse it:
     // Read the precedence if present.
     if (CurTok == tok_number) {
       if (NumVal < 1 || NumVal > 100)
-        return ErrorP("Invalid precedecnce: must be 1..100");
+        return LogErrorP("Invalid precedence: must be 1..100");
       BinaryPrecedence = (unsigned)NumVal;
       getNextToken();
     }
@@ -202,20 +202,20 @@ user-defined operator, we need to parse it:
   }

   if (CurTok != '(')
-    return ErrorP("Expected '(' in prototype");
+    return LogErrorP("Expected '(' in prototype");

   std::vector<std::string> ArgNames;
   while (getNextToken() == tok_identifier)
     ArgNames.push_back(IdentifierStr);
   if (CurTok != ')')
-    return ErrorP("Expected ')' in prototype");
+    return LogErrorP("Expected ')' in prototype");

   // success.
   getNextToken(); // eat ')'.
// Verify right number of names for operator. if (Kind && ArgNames.size() != Kind) - return ErrorP("Invalid number of operands for operator"); + return LogErrorP("Invalid number of operands for operator"); return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames), Kind != 0, BinaryPrecedence); @@ -251,7 +251,7 @@ default case for our existing binary operator node: case '<': L = Builder.CreateFCmpULT(L, R, "cmptmp"); // Convert bool 0/1 to double 0.0 or 1.0 - return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), + return Builder.CreateUIToFP(L, Type::getDoubleTy(LLVMContext), "booltmp"); default: break; @@ -288,7 +288,7 @@ The final piece of code we are missing, is a bit of top-level magic: BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence(); // Create a new basic block to start insertion into. - BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); + BasicBlock *BB = BasicBlock::Create(LLVMContext, "entry", TheFunction); Builder.SetInsertPoint(BB); if (Value *RetVal = Body->codegen()) { @@ -403,7 +403,7 @@ operator code above with: switch (CurTok) { default: - return ErrorP("Expected function name in prototype"); + return LogErrorP("Expected function name in prototype"); case tok_identifier: FnName = IdentifierStr; Kind = 0; @@ -412,7 +412,7 @@ operator code above with: case tok_unary: getNextToken(); if (!isascii(CurTok)) - return ErrorP("Expected unary operator"); + return LogErrorP("Expected unary operator"); FnName = "unary"; FnName += (char)CurTok; Kind = 1; @@ -435,7 +435,7 @@ unary operators. It looks like this: Function *F = TheModule->getFunction(std::string("unary")+Opcode); if (!F) - return ErrorV("Unknown unary operator"); + return LogErrorV("Unknown unary operator"); return Builder.CreateCall(F, OperandV, "unop"); } @@ -546,17 +546,17 @@ converge: # Determine whether the specific location diverges. # Solve for z = z^2 + c in the complex plane. 
- def mandleconverger(real imag iters creal cimag) + def mandelconverger(real imag iters creal cimag) if iters > 255 | (real*real + imag*imag > 4) then iters else - mandleconverger(real*real - imag*imag + creal, + mandelconverger(real*real - imag*imag + creal, 2*real*imag + cimag, iters+1, creal, cimag); # Return the number of iterations required for the iteration to escape - def mandleconverge(real imag) - mandleconverger(real, imag, 0, real, imag); + def mandelconverge(real imag) + mandelconverger(real, imag, 0, real, imag); This "``z = z2 + c``" function is a beautiful little creature that is the basis for computation of the `Mandelbrot @@ -570,12 +570,12 @@ but we can whip together something using the density plotter above: :: - # Compute and plot the mandlebrot set with the specified 2 dimensional range + # Compute and plot the mandelbrot set with the specified 2 dimensional range # info. def mandelhelp(xmin xmax xstep ymin ymax ystep) for y = ymin, y < ymax, ystep in ( (for x = xmin, x < xmax, xstep in - printdensity(mandleconverge(x,y))) + printdensity(mandelconverge(x,y))) : putchard(10) ) @@ -585,7 +585,7 @@ but we can whip together something using the density plotter above: mandelhelp(realstart, realstart+realmag*78, realmag, imagstart, imagstart+imagmag*40, imagmag); -Given this, we can try plotting out the mandlebrot set! Lets try it out: +Given this, we can try plotting out the mandelbrot set! 
Let's try it out:

::

@@ -764,5 +764,5 @@ Here is the code:
    :language: c++

 `Next: Extending the language: mutable variables / SSA
-construction <LangImpl7.html>`_
+construction <LangImpl07.html>`_
diff --git a/docs/tutorial/LangImpl7.rst b/docs/tutorial/LangImpl07.rst
index 1cd7d56fddb4b..4d86ecad38aaa 100644
--- a/docs/tutorial/LangImpl7.rst
+++ b/docs/tutorial/LangImpl07.rst
@@ -224,7 +224,7 @@ variables in certain circumstances:
    class <../LangRef.html#first-class-types>`_ values (such as pointers,
    scalars and vectors), and only if the array size of the allocation is
    1 (or missing in the .ll file). mem2reg is not capable of promoting
-   structs or arrays to registers. Note that the "scalarrepl" pass is
+   structs or arrays to registers. Note that the "sroa" pass is
    more powerful and can promote structs, "unions", and arrays in many
    cases.

@@ -252,13 +252,13 @@ is:
 technique dovetails very naturally with this style of debug info.

 If nothing else, this makes it much easier to get your front-end up and
-running, and is very simple to implement. Lets extend Kaleidoscope with
+running, and is very simple to implement. Let's extend Kaleidoscope with
 mutable variables now!

 Mutable Variables in Kaleidoscope
 =================================

-Now that we know the sort of problem we want to tackle, lets see what
+Now that we know the sort of problem we want to tackle, let's see what
 this looks like in the context of our little Kaleidoscope language.
 We're going to add two features:

@@ -306,7 +306,7 @@ Adjusting Existing Variables for Mutation
 The symbol table in Kaleidoscope is managed at code generation time by
 the '``NamedValues``' map. This map currently keeps track of the LLVM
 "Value\*" that holds the double value for the named variable. In order
-to support mutation, we need to change this slightly, so that it
+to support mutation, we need to change this slightly, so that ``NamedValues``
 holds the *memory location* of the variable in question.
Note that this change is a refactoring: it changes the structure of the code, but does not (by itself) change the behavior of the compiler. All @@ -339,7 +339,7 @@ the function: const std::string &VarName) { IRBuilder<> TmpB(&TheFunction->getEntryBlock(), TheFunction->getEntryBlock().begin()); - return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0, + return TmpB.CreateAlloca(Type::getDoubleTy(LLVMContext), 0, VarName.c_str()); } @@ -359,7 +359,7 @@ from the stack slot: // Look this variable up in the function. Value *V = NamedValues[Name]; if (!V) - return ErrorV("Unknown variable name"); + return LogErrorV("Unknown variable name"); // Load the value. return Builder.CreateLoad(V, Name.c_str()); @@ -578,7 +578,7 @@ implement codegen for the assignment operator. This looks like: // Assignment requires the LHS to be an identifier. VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get()); if (!LHSE) - return ErrorV("destination of '=' must be a variable"); + return LogErrorV("destination of '=' must be a variable"); Unlike the rest of the binary operators, our assignment operator doesn't follow the "emit LHS, emit RHS, do computation" model. As such, it is @@ -597,7 +597,7 @@ allowed. // Look up the name. Value *Variable = NamedValues[LHSE->getName()]; if (!Variable) - return ErrorV("Unknown variable name"); + return LogErrorV("Unknown variable name"); Builder.CreateStore(Val, Variable); return Val; @@ -632,7 +632,7 @@ When run, this example prints "123" and then "4", showing that we did actually mutate the value! Okay, we have now officially implemented our goal: getting this to work requires SSA construction in the general case. However, to be really useful, we want the ability to define our -own local variables, lets add this next! +own local variables, let's add this next! 
User-defined Local Variables ============================ @@ -703,7 +703,7 @@ do is add it as a primary expression: static std::unique_ptr<ExprAST> ParsePrimary() { switch (CurTok) { default: - return Error("unknown token when expecting an expression"); + return LogError("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: @@ -732,7 +732,7 @@ Next we define ParseVarExpr: // At least one variable name is required. if (CurTok != tok_identifier) - return Error("expected identifier after var"); + return LogError("expected identifier after var"); The first part of this code parses the list of identifier/expr pairs into the local ``VarNames`` vector. @@ -759,7 +759,7 @@ into the local ``VarNames`` vector. getNextToken(); // eat the ','. if (CurTok != tok_identifier) - return Error("expected identifier list after var"); + return LogError("expected identifier list after var"); } Once all the variables are parsed, we then parse the body and create the @@ -769,7 +769,7 @@ AST node: // At this point, we have to have 'in'. if (CurTok != tok_in) - return Error("expected 'in' keyword after 'var'"); + return LogError("expected 'in' keyword after 'var'"); getNextToken(); // eat 'in'. auto Body = ParseExpression(); @@ -812,7 +812,7 @@ previous value that we replace in OldBindings. if (!InitVal) return nullptr; } else { // If not specified, use 0.0. - InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0)); + InitVal = ConstantFP::get(LLVMContext, APFloat(0.0)); } AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName); @@ -877,5 +877,5 @@ Here is the code: .. 
literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
   :language: c++

-`Next: Adding Debug Information <LangImpl8.html>`_
+`Next: Compiling to Object Code <LangImpl08.html>`_
diff --git a/docs/tutorial/LangImpl08.rst b/docs/tutorial/LangImpl08.rst
new file mode 100644
index 0000000000000..96eccaebd3295
--- /dev/null
+++ b/docs/tutorial/LangImpl08.rst
@@ -0,0 +1,218 @@
+========================================
+ Kaleidoscope: Compiling to Object Code
+========================================
+
+.. contents::
+   :local:
+
+Chapter 8 Introduction
+======================
+
+Welcome to Chapter 8 of the "`Implementing a language with LLVM
+<index.html>`_" tutorial. This chapter describes how to compile our
+language down to object files.
+
+Choosing a target
+=================
+
+LLVM has native support for cross-compilation. You can compile to the
+architecture of your current machine, or just as easily compile for
+other architectures. In this tutorial, we'll target the current
+machine.
+
+To specify the architecture that you want to target, we use a string
+called a "target triple". This takes the form
+``<arch><sub>-<vendor>-<sys>-<abi>`` (see the `cross compilation docs
+<http://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_).
+
+As an example, we can see what clang thinks is our current target
+triple:
+
+::
+
+    $ clang --version | grep Target
+    Target: x86_64-unknown-linux-gnu
+
+Running this command may show something different on your machine, as
+you might be using a different architecture or operating system than I am.
+
+Fortunately, we don't need to hard-code a target triple to target the
+current machine. LLVM provides ``sys::getDefaultTargetTriple``, which
+returns the target triple of the current machine.
+
+.. code-block:: c++
+
+    auto TargetTriple = sys::getDefaultTargetTriple();
+
+LLVM doesn't require us to link in all the target
+functionality. For example, if we're just using the JIT, we don't need
+the assembly printers. Similarly, if we're only targeting certain
+architectures, we can link in only the functionality for those
+architectures.
+
+For this example, we'll initialize all the targets for emitting object
+code.
+
+.. code-block:: c++
+
+    InitializeAllTargetInfos();
+    InitializeAllTargets();
+    InitializeAllTargetMCs();
+    InitializeAllAsmParsers();
+    InitializeAllAsmPrinters();
+
+We can now use our target triple to get a ``Target``:
+
+.. code-block:: c++
+
+    std::string Error;
+    auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
+
+    // Print an error and exit if we couldn't find the requested target.
+    // This generally occurs if we've forgotten to initialise the
+    // TargetRegistry or we have a bogus target triple.
+    if (!Target) {
+      errs() << Error;
+      return 1;
+    }
+
+Target Machine
+==============
+
+We will also need a ``TargetMachine``. This class provides a complete
+machine description of the machine we're targeting. If we want to
+target a specific feature (such as SSE) or a specific CPU (such as
+Intel's Skylake), we do so now.
+
+To see which features and CPUs LLVM knows about, we can use
+``llc``. For example, let's look at x86:
+
+::
+
+    $ llvm-as < /dev/null | llc -march=x86 -mattr=help
+    Available CPUs for this target:
+
+      amdfam10 - Select the amdfam10 processor.
+      athlon - Select the athlon processor.
+      athlon-4 - Select the athlon-4 processor.
+      ...
+
+    Available features for this target:
+
+      16bit-mode - 16-bit mode (i8086).
+      32bit-mode - 32-bit mode (80386).
+      3dnow - Enable 3DNow! instructions.
+      3dnowa - Enable 3DNow! Athlon instructions.
+      ...
+
+For our example, we'll use the generic CPU without any additional
+features, options or relocation model.
+
+..
code-block:: c++ + + auto CPU = "generic"; + auto Features = ""; + + TargetOptions opt; + auto RM = Optional<Reloc::Model>(); + auto TargetMachine = Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM); + + +Configuring the Module +====================== + +We're now ready to configure our module, to specify the target and +data layout. This isn't strictly necessary, but the `frontend +performance guide <../Frontend/PerformanceTips.html>`_ recommends +this. Optimizations benefit from knowing about the target and data +layout. + +.. code-block:: c++ + + TheModule->setDataLayout(TargetMachine->createDataLayout()); + TheModule->setTargetTriple(TargetTriple); + +Emit Object Code +================ + +We're ready to emit object code! Let's define where we want to write +our file to: + +.. code-block:: c++ + + auto Filename = "output.o"; + std::error_code EC; + raw_fd_ostream dest(Filename, EC, sys::fs::F_None); + + if (EC) { + errs() << "Could not open file: " << EC.message(); + return 1; + } + +Finally, we define a pass that emits object code, then we run that +pass: + +.. code-block:: c++ + + legacy::PassManager pass; + auto FileType = TargetMachine::CGFT_ObjectFile; + + if (TargetMachine->addPassesToEmitFile(pass, dest, FileType)) { + errs() << "TargetMachine can't emit a file of this type"; + return 1; + } + + pass.run(*TheModule); + dest.flush(); + +Putting It All Together +======================= + +Does it work? Let's give it a try. We need to compile our code, but +note that the arguments to ``llvm-config`` are different from the previous chapters. + +:: + + $ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs all` -o toy + +Let's run it, and define a simple ``average`` function. Press Ctrl-D +when you're done. + +:: + + $ ./toy + ready> def average(x y) (x + y) * 0.5; + ^D + Wrote output.o + +We have an object file! To test it, let's write a simple program and +link it with our output. Here's the source code: + +..
code-block:: c++ + + #include <iostream> + + extern "C" { + double average(double, double); + } + + int main() { + std::cout << "average of 3.0 and 4.0: " << average(3.0, 4.0) << std::endl; + } + +We link our program to output.o and check the result is what we +expected: + +:: + + $ clang++ main.cpp output.o -o main + $ ./main + average of 3.0 and 4.0: 3.5 + +Full Code Listing +================= + +.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp + :language: c++ + +`Next: Adding Debug Information <LangImpl09.html>`_ diff --git a/docs/tutorial/LangImpl8.rst b/docs/tutorial/LangImpl09.rst index 3b0f443f08d54..0053960756d29 100644 --- a/docs/tutorial/LangImpl8.rst +++ b/docs/tutorial/LangImpl09.rst @@ -5,11 +5,11 @@ Kaleidoscope: Adding Debug Information .. contents:: :local: -Chapter 8 Introduction +Chapter 9 Introduction ====================== -Welcome to Chapter 8 of the "`Implementing a language with -LLVM <index.html>`_" tutorial. In chapters 1 through 7, we've built a +Welcome to Chapter 9 of the "`Implementing a language with +LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a decent little programming language with functions and variables. What happens if something goes wrong though, how do you debug your program? @@ -149,7 +149,7 @@ command line: .. code-block:: bash - Kaleidoscope-Ch8 < fib.ks | & clang -x ir - + Kaleidoscope-Ch9 < fib.ks | & clang -x ir - which gives an a.out/a.exe in the current working directory. @@ -455,8 +455,8 @@ debug information. To build this example, use: Here is the code: -.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp +.. 
literalinclude:: ../../examples/Kaleidoscope/Chapter9/toy.cpp :language: c++ -`Next: Conclusion and other useful LLVM tidbits <LangImpl9.html>`_ +`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_ diff --git a/docs/tutorial/LangImpl9.rst b/docs/tutorial/LangImpl10.rst index f02bba857c149..5799c99402c0c 100644 --- a/docs/tutorial/LangImpl9.rst +++ b/docs/tutorial/LangImpl10.rst @@ -51,10 +51,7 @@ For example, try adding: applications. Adding them is mostly an exercise in learning how the LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction works: it is so nifty/unconventional, it `has its own - FAQ <../GetElementPtr.html>`_! If you add support for recursive types - (e.g. linked lists), make sure to read the `section in the LLVM - Programmer's Manual <../ProgrammersManual.html#TypeResolve>`_ that - describes how to construct them. + FAQ <../GetElementPtr.html>`_! - **standard runtime** - Our current language allows the user to access arbitrary external functions, and we use it for things like "printd" and "putchard". As you extend the language to add higher-level @@ -103,8 +100,8 @@ LLVM's capabilities. Properties of the LLVM IR ========================= -We have a couple common questions about code in the LLVM IR form - lets -just get these out of the way right now, shall we? +We have a couple of common questions about code in the LLVM IR form - +let's just get these out of the way right now, shall we? Target Independence ------------------- diff --git a/docs/tutorial/OCamlLangImpl1.rst b/docs/tutorial/OCamlLangImpl1.rst index cf968b5ae89ce..9de92305a1c31 100644 --- a/docs/tutorial/OCamlLangImpl1.rst +++ b/docs/tutorial/OCamlLangImpl1.rst @@ -106,7 +106,7 @@ support the if/then/else construct, a for loop, user defined operators, JIT compilation with a simple command line interface, etc. Because we want to keep things simple, the only datatype in Kaleidoscope -is a 64-bit floating point type (aka 'float' in O'Caml parlance). 
As +is a 64-bit floating point type (aka 'float' in OCaml parlance). As such, all values are implicitly double precision and the language doesn't require type declarations. This gives the language a very nice and simple syntax. For example, the following simple example computes diff --git a/docs/tutorial/OCamlLangImpl5.rst b/docs/tutorial/OCamlLangImpl5.rst index 675b9bc1978b0..3a135b2333733 100644 --- a/docs/tutorial/OCamlLangImpl5.rst +++ b/docs/tutorial/OCamlLangImpl5.rst @@ -178,7 +178,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll see this graph: -.. figure:: LangImpl5-cfg.png +.. figure:: LangImpl05-cfg.png :align: center :alt: Example CFG diff --git a/docs/tutorial/OCamlLangImpl6.rst b/docs/tutorial/OCamlLangImpl6.rst index a3ae11fd7e549..2fa25f5c22fb5 100644 --- a/docs/tutorial/OCamlLangImpl6.rst +++ b/docs/tutorial/OCamlLangImpl6.rst @@ -496,17 +496,17 @@ converge: # determine whether the specific location diverges. # Solve for z = z^2 + c in the complex plane. 
- def mandleconverger(real imag iters creal cimag) + def mandelconverger(real imag iters creal cimag) if iters > 255 | (real*real + imag*imag > 4) then iters else - mandleconverger(real*real - imag*imag + creal, + mandelconverger(real*real - imag*imag + creal, 2*real*imag + cimag, iters+1, creal, cimag); # return the number of iterations required for the iteration to escape - def mandleconverge(real imag) - mandleconverger(real, imag, 0, real, imag); + def mandelconverge(real imag) + mandelconverger(real, imag, 0, real, imag); This "z = z\ :sup:`2`\ + c" function is a beautiful little creature that is the basis for computation of the `Mandelbrot @@ -520,12 +520,12 @@ but we can whip together something using the density plotter above: :: - # compute and plot the mandlebrot set with the specified 2 dimensional range + # compute and plot the mandelbrot set with the specified 2 dimensional range # info. def mandelhelp(xmin xmax xstep ymin ymax ystep) for y = ymin, y < ymax, ystep in ( (for x = xmin, x < xmax, xstep in - printdensity(mandleconverge(x,y))) + printdensity(mandelconverge(x,y))) : putchard(10) ) @@ -535,7 +535,7 @@ but we can whip together something using the density plotter above: mandelhelp(realstart, realstart+realmag*78, realmag, imagstart, imagstart+imagmag*40, imagmag); -Given this, we can try plotting out the mandlebrot set! Lets try it out: +Given this, we can try plotting out the mandelbrot set! Let's try it out: :: diff --git a/docs/tutorial/OCamlLangImpl7.rst b/docs/tutorial/OCamlLangImpl7.rst index c8c701b91012d..f36845c523434 100644 --- a/docs/tutorial/OCamlLangImpl7.rst +++ b/docs/tutorial/OCamlLangImpl7.rst @@ -224,7 +224,7 @@ variables in certain circumstances: class <../LangRef.html#first-class-types>`_ values (such as pointers, scalars and vectors), and only if the array size of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of promoting - structs or arrays to registers.
Note that the "scalarrepl" pass is + structs or arrays to registers. Note that the "sroa" pass is more powerful and can promote structs, "unions", and arrays in many cases. diff --git a/docs/tutorial/index.rst b/docs/tutorial/index.rst index dde53badd3ad8..494cfd0a33a77 100644 --- a/docs/tutorial/index.rst +++ b/docs/tutorial/index.rst @@ -22,6 +22,16 @@ Kaleidoscope: Implementing a Language with LLVM in Objective Caml OCamlLangImpl* +Building a JIT in LLVM +=============================================== + +.. toctree:: + :titlesonly: + :glob: + :numbered: + + BuildingAJIT* + External Tutorials ==================