Diffstat (limited to 'docs/LangRef.rst')
-rw-r--r-- | docs/LangRef.rst | 772 |
1 files changed, 622 insertions, 150 deletions
diff --git a/docs/LangRef.rst b/docs/LangRef.rst index 5f8a3a5a4a987..f6dda59fda255 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -250,6 +250,11 @@ linkage: together. This is the LLVM, typesafe, equivalent of having the system linker append together "sections" with identical names when .o files are linked. + + Unfortunately this doesn't correspond to any feature in .o files, so it + can only be used for variables like ``llvm.global_ctors`` which llvm + interprets specially. + ``extern_weak`` The semantics of this linkage follow the ELF object file model: the symbol is weak until linked, if not linked, the symbol becomes null @@ -427,6 +432,10 @@ added in the future: - On X86-64 the callee preserves all general purpose registers, except for RDI and RAX. +"``swiftcc``" - This calling convention is used for Swift language. + - On X86-64 RCX and R8 are available for additional integer returns, and + XMM2 and XMM3 are available for additional FP/vector returns. + - On iOS platforms, we use AAPCS-VFP calling convention. "``cc <n>``" - Numbered convention Any calling convention may be specified by number, allowing target-specific calling conventions to be used. Target specific @@ -580,6 +589,9 @@ initializer. Note that a constant with significant address *can* be merged with a ``unnamed_addr`` constant, the result being a constant whose address is significant. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + A global variable may be declared to reside in a target-specific numbered address space. For targets that support them, address spaces may affect how optimizations are performed and/or what target @@ -610,18 +622,20 @@ assume that the globals are densely packed in their section and try to iterate over them as an array, alignment padding would break this iteration. The maximum alignment is ``1 << 29``. -Globals can also have a :ref:`DLL storage class <dllstorageclass>`. 
+Globals can also have a :ref:`DLL storage class <dllstorageclass>` and +an optional list of attached :ref:`metadata <metadata>`, Variables and aliases can have a :ref:`Thread Local Storage Model <tls_model>`. Syntax:: - [@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] - [unnamed_addr] [AddrSpace] [ExternallyInitialized] + @<GlobalVarName> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] + [(unnamed_addr|local_unnamed_addr)] [AddrSpace] + [ExternallyInitialized] <global | constant> <Type> [<InitializerConstant>] [, section "name"] [, comdat [($name)]] - [, align <Alignment>] + [, align <Alignment>] (, !name !N)* For example, the following defines a global in a numbered address space with an initializer, section, and alignment: @@ -665,14 +679,14 @@ an optional list of attached :ref:`metadata <metadata>`, an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM function declarations consist of the "``declare``" keyword, an -optional :ref:`linkage type <linkage>`, an optional :ref:`visibility -style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, -an optional :ref:`calling convention <callingconv>`, -an optional ``unnamed_addr`` attribute, a return type, an optional -:ref:`parameter attribute <paramattrs>` for the return type, a function -name, a possibly empty list of arguments, an optional alignment, an optional -:ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, -and an optional :ref:`prologue <prologuedata>`. 
+optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style +<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an +optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` +or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter +attribute <paramattrs>` for the return type, a function name, a possibly +empty list of arguments, an optional alignment, an optional :ref:`garbage +collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional +:ref:`prologue <prologuedata>`. A function definition contains a list of basic blocks, forming the CFG (Control Flow Graph) for the function. Each basic block may optionally start with a label @@ -703,14 +717,17 @@ alignment. All alignments must be a power of 2. If the ``unnamed_addr`` attribute is given, the address is known to not be significant and two identical functions can be merged. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + Syntax:: define [linkage] [visibility] [DLLStorageClass] [cconv] [ret attrs] <ResultType> @<FunctionName> ([argument list]) - [unnamed_addr] [fn Attrs] [section "name"] [comdat [($name)]] - [align N] [gc] [prefix Constant] [prologue Constant] - [personality Constant] (!name !N)* { ... } + [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"] + [comdat [($name)]] [align N] [gc] [prefix Constant] + [prologue Constant] [personality Constant] (!name !N)* { ... 
} The argument list is a comma separated sequence of arguments where each argument is of the following form: @@ -737,7 +754,7 @@ Aliases may have an optional :ref:`linkage type <linkage>`, an optional Syntax:: - @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> + @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``, ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers @@ -747,6 +764,9 @@ Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point to the same content. +If the ``local_unnamed_addr`` attribute is given, the address is known to +not be significant within the module. + Since aliases are only a second name, some restrictions apply, of which some can only be checked when producing an object file: @@ -760,6 +780,25 @@ some can only be checked when producing an object file: * No global value in the expression can be a declaration, since that would require a relocation, which is not possible. +.. _langref_ifunc: + +IFuncs +------- + +IFuncs, like as aliases, don't create any new data or func. They are just a new +symbol that dynamic linker resolves at runtime by calling a resolver function. + +IFuncs have a name and a resolver that is a function called by dynamic linker +that returns address of another function associated with the name. + +IFunc may have an optional :ref:`linkage type <linkage>` and an optional +:ref:`visibility style <visibility>`. + +Syntax:: + + @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver> + + .. 
_langref_comdats: Comdats @@ -907,8 +946,7 @@ Currently, only the following parameter attributes are defined: ``zeroext`` This indicates to the code generator that the parameter or return value should be zero-extended to the extent required by the target's - ABI (which is usually 32-bits, but is 8-bits for a i1 on x86-64) by - the caller (for a parameter) or the callee (for a return value). + ABI by the caller (for a parameter) or the callee (for a return value). ``signext`` This indicates to the code generator that the parameter or return value should be sign-extended to the extent required by the target's @@ -1010,7 +1048,8 @@ Currently, only the following parameter attributes are defined: ``nocapture`` This indicates that the callee does not make any copies of the pointer that outlive the callee itself. This is not a valid - attribute for return values. + attribute for return values. Addresses used in volatile operations + are considered to be captured. .. _nest: @@ -1021,12 +1060,13 @@ Currently, only the following parameter attributes are defined: ``returned`` This indicates that the function always returns the argument as its return - value. This is an optimization hint to the code generator when generating - the caller, allowing tail call optimization and omission of register saves - and restores in some cases; it is not checked or enforced when generating - the callee. The parameter and the function return type must be valid - operands for the :ref:`bitcast instruction <i_bitcast>`. This is not a - valid attribute for return values and can only be applied to one parameter. + value. This is a hint to the optimizer and code generator used when + generating the caller, allowing value propagation, tail call optimization, + and omission of register saves and restores in some cases; it is not + checked or enforced when generating the callee. The parameter and the + function return type must be valid operands for the + :ref:`bitcast instruction <i_bitcast>`. 
This is not a valid attribute for + return values and can only be applied to one parameter. ``nonnull`` This indicates that the parameter or return pointer is not null. This @@ -1059,6 +1099,30 @@ Currently, only the following parameter attributes are defined: ``dereferenceable(<n>)``). This attribute may only be applied to pointer typed parameters. +``swiftself`` + This indicates that the parameter is the self/context parameter. This is not + a valid attribute for return values and can only be applied to one + parameter. + +``swifterror`` + This attribute is motivated to model and optimize Swift error handling. It + can be applied to a parameter with pointer to pointer type or a + pointer-sized alloca. At the call site, the actual argument that corresponds + to a ``swifterror`` parameter has to come from a ``swifterror`` alloca. A + ``swifterror`` value (either the parameter or the alloca) can only be loaded + and stored from, or used as a ``swifterror`` argument. This is not a valid + attribute for return values and can only be applied to one parameter. + + These constraints allow the calling convention to optimize access to + ``swifterror`` variables by associating them with a specific register at + call boundaries rather than placing them in memory. Since this does change + the calling convention, a function which uses the ``swifterror`` attribute + on a parameter is not ABI-compatible with one which does not. + + These constraints also allow LLVM to assume that a ``swifterror`` argument + does not alias any other memory visible within a function and that a + ``swifterror`` alloca passed as an argument does not escape. + .. _gc: Garbage Collector Strategy Names @@ -1223,6 +1287,15 @@ example: epilogue, the backend should forcibly align the stack pointer. Specify the desired alignment, which must be a power of two, in parentheses. 
+``allocsize(<EltSizeParam>[, <NumEltsParam>])`` + This attribute indicates that the annotated function will always return at + least a given number of bytes (or null). Its arguments are zero-indexed + parameter numbers; if one argument is provided, then it's assumed that at + least ``CallSite.Args[EltSizeParam]`` bytes will be available at the + returned pointer. If two are provided, then it's assumed that + ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are + available. The referenced parameters must be integer types. No assumptions + are made about the contents of the returned block of memory. ``alwaysinline`` This attribute indicates that the inliner should attempt to inline this function into callers whenever possible, ignoring any active @@ -1239,10 +1312,26 @@ example: function call are also considered to be cold; and, thus, given low weight. ``convergent`` - This attribute indicates that the callee is dependent on a convergent - thread execution pattern under certain parallel execution models. - Transformations that are execution model agnostic may not make the execution - of a convergent operation control dependent on any additional values. + In some parallel execution models, there exist operations that cannot be + made control-dependent on any additional values. We call such operations + ``convergent``, and mark them with this attribute. + + The ``convergent`` attribute may appear on functions or call/invoke + instructions. When it appears on a function, it indicates that calls to + this function should not be made control-dependent on additional values. + For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so + calls to this intrinsic cannot be made control-dependent on additional + values. + + When it appears on a call/invoke, the ``convergent`` attribute indicates + that we should treat the call as though we're calling a convergent + function. 
This is particularly useful on indirect calls; without this we + may treat such calls as though the target is non-convergent. + + The optimizer may remove the ``convergent`` attribute on functions when it + can prove that the function does not execute any convergent operations. + Similarly, the optimizer may remove ``convergent`` on calls/invokes when it + can prove that the call/invoke cannot call a convergent function. ``inaccessiblememonly`` This attribute indicates that the function may only access memory that is not accessible by the module being compiled. This is a weaker form @@ -1334,6 +1423,31 @@ example: passes make choices that keep the code size of this function low, and otherwise do optimizations specifically to reduce code size as long as they do not significantly impact runtime performance. +``"patchable-function"`` + This attribute tells the code generator that the code + generated for this function needs to follow certain conventions that + make it possible for a runtime function to patch over it later. + The exact effect of this attribute depends on its string value, + for which there currently is one legal possibility: + + * ``"prologue-short-redirect"`` - This style of patchable + function is intended to support patching a function prologue to + redirect control away from the function in a thread safe + manner. It guarantees that the first instruction of the + function will be large enough to accommodate a short jump + instruction, and will be sufficiently aligned to allow being + fully changed via an atomic compare-and-swap instruction. + While the first requirement can be satisfied by inserting large + enough NOP, LLVM can and will try to re-purpose an existing + instruction (i.e. one that would have to be emitted anyway) as + the patchable instruction larger than a short jump. + + ``"prologue-short-redirect"`` is currently only supported on + x86-64. + + This attribute by itself does not imply restrictions on + inter-procedural optimizations. 
All of the semantic effects the + patching may have to be separately conveyed via the linkage type. ``readnone`` On a function, this attribute indicates that the function computes its result (or decides to unwind an exception) based strictly on its arguments, @@ -1361,6 +1475,13 @@ example: On an argument, this attribute indicates that the function does not write through this pointer argument, even though it may write to the memory that the pointer points to. +``writeonly`` + On a function, this attribute indicates that the function may write to but + does not read from memory. + + On an argument, this attribute indicates that the function may write to but + does not read through this pointer argument (even though it may read from + the memory that the pointer points to). ``argmemonly`` This attribute indicates that the only memory accesses inside function are loads and stores from objects pointed to by its pointer-typed arguments, @@ -1511,7 +1632,7 @@ operand bundle to not miscompile programs containing it. ways before control is transferred to the callee or invokee. - Calls and invokes with operand bundles have unknown read / write effect on the heap on entry and exit (even if the call target is - ``readnone`` or ``readonly``), unless they're overriden with + ``readnone`` or ``readonly``), unless they're overridden with callsite specific attributes. - An operand bundle at a call site cannot change the implementation of the called function. Inter-procedural optimizations work as @@ -1519,6 +1640,8 @@ operand bundle to not miscompile programs containing it. More specific types of operand bundles are described below. +.. _deopt_opbundles: + Deoptimization Operand Bundles ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -1602,6 +1725,18 @@ it is undefined behavior to execute a ``call`` or ``invoke`` which: Similarly, if no funclet EH pads have been entered-but-not-yet-exited, executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior. 
+GC Transition Operand Bundles +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +GC transition operand bundles are characterized by the +``"gc-transition"`` operand bundle tag. These operand bundles mark a +call as a transition between a function with one GC strategy to a +function with a different GC strategy. If coordinating the transition +between GC strategies requires additional code generation at the call +site, these bundles may contain any values that are needed by the +generated code. For more details, see :ref:`GC Transitions +<gc_transition_args>`. + .. _moduleasm: Module-Level Inline Assembly @@ -2086,6 +2221,26 @@ function's scope. uselistorder i32 (i32) @bar, { 1, 0 } uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 } +.. _source_filename: + +Source Filename +--------------- + +The *source filename* string is set to the original module identifier, +which will be the name of the compiled source file when compiling from +source through the clang front end, for example. It is then preserved through +the IR and bitcode. + +This is currently necessary to generate a consistent unique global +identifier for local functions used in profile data, which prepends the +source file name to the local function name. + +The syntax for the source file name is simply: + +.. code-block:: llvm + + source_filename = "/path/to/source.c" + .. _typesystem: Type System @@ -3119,7 +3274,7 @@ the same register to an output and an input. If this is not safe (e.g. if the assembly contains two instructions, where the first writes to one output, and the second reads an input and writes to a second output), then the "``&``" modifier must be used (e.g. "``=&r``") to specify that the output is an -"early-clobber" output. Marking an ouput as "early-clobber" ensures that LLVM +"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM will not use the same register for any inputs (other than an input tied to this output). 
@@ -3453,8 +3608,14 @@ SystemZ: - ``K``: An immediate signed 16-bit integer. - ``L``: An immediate signed 20-bit integer. - ``M``: An immediate integer 0x7fffffff. -- ``Q``, ``R``, ``S``, ``T``: A memory address operand, treated the same as - ``m``, at the moment. +- ``Q``: A memory address operand with a base address and a 12-bit immediate + unsigned displacement. +- ``R``: A memory address operand with a base address, a 12-bit immediate + unsigned displacement, and an index register. +- ``S``: A memory address operand with a base address and a 20-bit immediate + signed displacement. +- ``T``: A memory address operand with a base address, a 20-bit immediate + signed displacement, and an index register. - ``r`` or ``d``: A 32, 64, or 128-bit integer register. - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an address context evaluates as zero). @@ -3792,7 +3953,7 @@ references to them from instructions). !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang", isOptimized: true, flags: "-O2", runtimeVersion: 2, - splitDebugFilename: "abc.debug", emissionKind: 1, + splitDebugFilename: "abc.debug", emissionKind: FullDebug, enums: !2, retainedTypes: !3, subprograms: !4, globals: !5, imports: !6, macros: !7, dwoId: 0x0abcd) @@ -3878,21 +4039,28 @@ The following ``tag:`` values are valid: .. code-block:: llvm - DW_TAG_formal_parameter = 5 DW_TAG_member = 13 DW_TAG_pointer_type = 15 DW_TAG_reference_type = 16 DW_TAG_typedef = 22 + DW_TAG_inheritance = 28 DW_TAG_ptr_to_member_type = 31 DW_TAG_const_type = 38 + DW_TAG_friend = 42 DW_TAG_volatile_type = 53 DW_TAG_restrict_type = 55 +.. _DIDerivedTypeMember: + ``DW_TAG_member`` is used to define a member of a :ref:`composite type -<DICompositeType>` or :ref:`subprogram <DISubprogram>`. The type of the member -is the ``baseType:``. The ``offset:`` is the member's bit offset. -``DW_TAG_formal_parameter`` is used to define a member which is a formal -argument of a subprogram. 
+<DICompositeType>`. The type of the member is the ``baseType:``. The +``offset:`` is the member's bit offset. If the composite type has an ODR +``identifier:`` and does not set ``flags: DIFwdDecl``, then the member is +uniqued based only on its ``name:`` and ``scope:``. + +``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:`` +field of :ref:`composite types <DICompositeType>` to describe parents and +friends. ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``. @@ -3911,9 +4079,15 @@ DICompositeType structures and unions. ``elements:`` points to a tuple of the composed types. If the source language supports ODR, the ``identifier:`` field gives the unique -identifier used for type merging between modules. When specified, other types -can refer to composite types indirectly via a :ref:`metadata string -<metadata-string>` that matches their identifier. +identifier used for type merging between modules. When specified, +:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member +derived types <DIDerivedTypeMember>` that reference the ODR-type in their +``scope:`` change uniquing rules. + +For a given ``identifier:``, there should only be a single composite type that +does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules +together will unique such definitions at parse time via the ``identifier:`` +field, even if the nodes are ``distinct``. .. code-block:: llvm @@ -3933,9 +4107,6 @@ The following ``tag:`` values are valid: DW_TAG_enumeration_type = 4 DW_TAG_structure_type = 19 DW_TAG_union_type = 23 - DW_TAG_subroutine_type = 21 - DW_TAG_inheritance = 28 - For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange descriptors <DISubrange>`, each representing the range of subscripts at that @@ -3949,7 +4120,9 @@ value for the set. 
All enumeration type descriptors are collected in the For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types -<DIDerivedType>` with ``tag: DW_TAG_member`` or ``tag: DW_TAG_inheritance``. +<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or +``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with +``isDefinition: false``. .. _DISubrange: @@ -4038,6 +4211,14 @@ metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>` that must be retained, even if their IR counterparts are optimized out of the IR. The ``type:`` field must point at an :ref:`DISubroutineType`. +.. _DISubprogramDeclaration: + +When ``isDefinition: false``, subprograms describe a declaration in the type +tree as opposed to a definition of a function. If the scope is a composite +type with an ODR ``identifier:`` and that does not set ``flags: DIFwdDecl``, +then the subprogram declaration is uniqued based only on its ``linkageName:`` +and ``scope:``. + .. code-block:: llvm define void @_Z3foov() !dbg !0 { @@ -4046,7 +4227,7 @@ the IR. The ``type:`` field must point at an :ref:`DISubroutineType`. !0 = distinct !DISubprogram(name: "foo", linkageName: "_Zfoov", scope: !1, file: !2, line: 7, type: !3, isLocal: true, - isDefinition: false, scopeLine: 8, + isDefinition: true, scopeLine: 8, containingType: !4, virtuality: DW_VIRTUALITY_pure_virtual, virtualIndex: 10, flags: DIFlagPrototyped, @@ -4165,7 +4346,7 @@ DIMacro ``DIMacro`` nodes represent definition or undefinition of a macro identifiers. The ``name:`` field is the macro identifier, followed by macro parameters when -definining a function-like macro, and the ``value`` field is the token-string +defining a function-like macro, and the ``value`` field is the token-string used to expand the macro identifier. .. code-block:: llvm @@ -4262,12 +4443,20 @@ instructions (loads, stores, memory-accessing calls, etc.) 
that carry ``noalias`` metadata can specifically be specified not to alias with some other collection of memory access instructions that carry ``alias.scope`` metadata. Each type of metadata specifies a list of scopes where each scope has an id and -a domain. When evaluating an aliasing query, if for some domain, the set +a domain. + +When evaluating an aliasing query, if for some domain, the set of scopes with that domain in one instruction's ``alias.scope`` list is a subset of (or equal to) the set of scopes for that domain in another instruction's ``noalias`` list, then the two memory accesses are assumed not to alias. +Because scopes in one domain don't affect scopes in other domains, separate +domains can be used to compose multiple independent noalias sets. This is +used for example during inlining. As the noalias function parameters are +turned into noalias scope metadata, a new domain is used every time the +function is inlined. + The metadata identifying each domain is itself a list containing one or two entries. The first entry is the name of the domain. Note that if the name is a string then it can be combined across functions and translation units. A @@ -4329,8 +4518,8 @@ it. ULP is defined as follows: distance between the two non-equal finite floating-point numbers nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``. -The metadata node shall consist of a single positive floating point -number representing the maximum relative error, for example: +The metadata node shall consist of a single positive float type number +representing the maximum relative error, for example: .. code-block:: llvm @@ -4542,6 +4731,38 @@ For example: !0 = !{!"llvm.loop.unroll.full"} +'``llvm.loop.licm_versioning.disable``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This metadata indicates that the loop should not be versioned for the purpose +of enabling loop-invariant code motion (LICM). 
The metadata has a single operand +which is the string ``llvm.loop.licm_versioning.disable``. For example: + +.. code-block:: llvm + + !0 = !{!"llvm.loop.licm_versioning.disable"} + +'``llvm.loop.distribute.enable``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Loop distribution allows splitting a loop into multiple loops. Currently, +this is only performed if the entire loop cannot be vectorized due to unsafe +memory dependencies. The transformation will atempt to isolate the unsafe +dependencies into their own loop. + +This metadata can be used to selectively enable or disable distribution of the +loop. The first operand is the string ``llvm.loop.distribute.enable`` and the +second operand is a bit. If the bit operand value is 1 distribution is +enabled. A value of 0 disables distribution: + +.. code-block:: llvm + + !0 = !{!"llvm.loop.distribute.enable", i1 0} + !1 = !{!"llvm.loop.distribute.enable", i1 1} + +This metadata should be used in conjunction with ``llvm.loop`` loop +identification metadata. + '``llvm.mem``' ^^^^^^^^^^^^^^^ @@ -4555,7 +4776,8 @@ The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops. The metadata is attached to memory accessing instructions and denotes that no loop carried memory dependence exist between it and other instructions denoted -with the same loop identifier. +with the same loop identifier. The metadata on memory reads also implies that +if conversion (i.e. speculative execution within a loop iteration) is safe. 
Precisely, given two instructions ``m1`` and ``m2`` that both have the ``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the @@ -4625,12 +4847,6 @@ the loop identifier metadata node directly: !1 = !{!1} ; an identifier for the inner loop !2 = !{!2} ; an identifier for the outer loop -'``llvm.bitsets``' -^^^^^^^^^^^^^^^^^^ - -The ``llvm.bitsets`` global metadata is used to implement -:doc:`bitsets <BitSets>`. - '``invariant.group``' Metadata ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -5267,7 +5483,7 @@ Syntax: :: - <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs] + <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [operand bundles] to label <normal label> unwind label <exception label> Overview: @@ -5303,12 +5519,16 @@ This instruction requires several arguments: #. The optional :ref:`Parameter Attributes <paramattrs>` list for return values. Only '``zeroext``', '``signext``', and '``inreg``' attributes are valid here. -#. '``ptr to function ty``': shall be the signature of the pointer to - function value being invoked. In most cases, this is a direct - function invocation, but indirect ``invoke``'s are just as possible, - branching off an arbitrary pointer to function value. -#. '``function ptr val``': An LLVM value containing a pointer to a - function to be invoked. +#. '``ty``': the type of the call instruction itself which is also the + type of the return value. Functions that return no value are marked + ``void``. +#. '``fnty``': shall be the signature of the function being invoked. The + argument types must match the types implied by this signature. This + type can be omitted if the function is not varargs. +#. '``fnptrval``': An LLVM value containing a pointer to a function to + be invoked. In most cases, this is a direct function invocation, but + indirect ``invoke``'s are just as possible, calling an arbitrary pointer + to function value. #. 
'``function args``': argument list whose types match the function signature argument types and parameter attributes. All arguments must be of :ref:`first class <t_firstclass>` type. If the function signature @@ -6767,7 +6987,7 @@ Syntax: :: <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>] - <result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] + <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] !<index> = !{ i32 1 } !<deref_bytes_node> = !{i64 <dereferenceable_bytes>} !<align_node> = !{ i64 <value_alignment> } @@ -6780,12 +7000,12 @@ The '``load``' instruction is used to read from memory. Arguments: """""""""" -The argument to the ``load`` instruction specifies the memory address -from which to load. The type specified must be a :ref:`first -class <t_firstclass>` type. If the ``load`` is marked as ``volatile``, -then the optimizer is not allowed to modify the number or order of -execution of this ``load`` with other :ref:`volatile -operations <volatile>`. +The argument to the ``load`` instruction specifies the memory address from which +to load. The type specified must be a :ref:`first class <t_firstclass>` type of +known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If +the ``load`` is marked as ``volatile``, then the optimizer is not allowed to +modify the number or order of execution of this ``load`` with other +:ref:`volatile operations <volatile>`. If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering <ordering>` and optional ``singlethread`` argument. The ``release`` and @@ -6805,7 +7025,12 @@ alignment for the target. 
It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe. The -maximum possible alignment is ``1 << 29``. +maximum possible alignment is ``1 << 29``. An alignment value higher +than the size of the loaded type implies memory up to the alignment +value bytes can be safely loaded without trapping in the default +address space. Access of the high bytes can interfere with debugging +tools, so should not be accessed if the function has the +``sanitize_thread`` or ``sanitize_address`` attributes. The optional ``!nontemporal`` metadata must reference a single metadata name ``<index>`` corresponding to a metadata node with one @@ -6903,13 +7128,14 @@ The '``store``' instruction is used to write to memory. Arguments: """""""""" -There are two arguments to the ``store`` instruction: a value to store -and an address at which to store it. The type of the ``<pointer>`` -operand must be a pointer to the :ref:`first class <t_firstclass>` type of -the ``<value>`` operand. If the ``store`` is marked as ``volatile``, -then the optimizer is not allowed to modify the number or order of -execution of this ``store`` with other :ref:`volatile -operations <volatile>`. +There are two arguments to the ``store`` instruction: a value to store and an +address at which to store it. The type of the ``<pointer>`` operand must be a +pointer to the :ref:`first class <t_firstclass>` type of the ``<value>`` +operand. If the ``store`` is marked as ``volatile``, then the optimizer is not +allowed to modify the number or order of execution of this ``store`` with other +:ref:`volatile operations <volatile>`. Only values of :ref:`first class +<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque +structural type <t_opaque>`) can be stored. 
If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering <ordering>` and optional ``singlethread`` argument. The ``acquire`` and @@ -6929,7 +7155,14 @@ alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always -safe. The maximum possible alignment is ``1 << 29``. +safe. The maximum possible alignment is ``1 << 29``. An alignment +value higher than the size of the stored type implies memory up to the +alignment value bytes can be stored to without trapping in the default +address space. Storing to the higher bytes however may result in data +races if another thread can access the same address. Introducing a +data race is not allowed. Storing to the extra bytes is not allowed +even in situations where a data race is known to not exist if the +function has the ``sanitize_address`` attribute. The optional ``!nontemporal`` metadata must reference a single metadata name ``<index>`` corresponding to a metadata node with one ``i32`` entry of @@ -7044,13 +7277,13 @@ Arguments: There are three arguments to the '``cmpxchg``' instruction: an address to operate on, a value to compare to the value currently be at that address, and a new value to place at that address if the compared values -are equal. The type of '<cmp>' must be an integer type whose bit width -is a power of two greater than or equal to eight and less than or equal -to a target-specific size limit. '<cmp>' and '<new>' must have the same -type, and the type of '<pointer>' must be a pointer to that type. If the -``cmpxchg`` is marked as ``volatile``, then the optimizer is not allowed -to modify the number or order of execution of this ``cmpxchg`` with -other :ref:`volatile operations <volatile>`. +are equal. 
The type of '<cmp>' must be an integer or pointer type whose +bit width is a power of two greater than or equal to eight and less +than or equal to a target-specific size limit. '<cmp>' and '<new>' must +have the same type, and the type of '<pointer>' must be a pointer to +that type. If the ``cmpxchg`` is marked as ``volatile``, then the +optimizer is not allowed to modify the number or order of execution of +this ``cmpxchg`` with other :ref:`volatile operations <volatile>`. The success and failure :ref:`ordering <ordering>` arguments specify how this ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters @@ -7091,11 +7324,11 @@ Example: .. code-block:: llvm entry: - %orig = atomic load i32, i32* %ptr unordered ; yields i32 + %orig = load atomic i32, i32* %ptr unordered, align 4 ; yields i32 br label %loop loop: - %cmp = phi i32 [ %orig, %entry ], [%old, %loop] + %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop] %squared = mul i32 %cmp, %cmp %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 } %value_loaded = extractvalue { i32, i1 } %val_success, 0 @@ -7977,7 +8210,7 @@ Arguments: The '``icmp``' instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is -not a value, just a keyword. The possible condition code are: +not a value, just a keyword. The possible condition codes are: #. ``eq``: equal #. ``ne``: not equal @@ -8041,9 +8274,6 @@ Example: <result> = icmp ule i16 -4, 5 ; yields: result=false <result> = icmp sge i16 4, 5 ; yields: result=false -Note that the code generator does not yet support vector types with the -``icmp`` instruction. - .. _i_fcmp: '``fcmp``' Instruction @@ -8074,7 +8304,7 @@ Arguments: The '``fcmp``' instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is -not a value, just a keyword. 
The possible condition code are: +not a value, just a keyword. The possible condition codes are: #. ``false``: no comparison, always returns false #. ``oeq``: ordered and equal @@ -8156,9 +8386,6 @@ Example: <result> = fcmp olt float 4.0, 5.0 ; yields: result=true <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false -Note that the code generator does not yet support vector types with the -``fcmp`` instruction. - .. _i_phi: '``phi``' Instruction @@ -8270,7 +8497,7 @@ Syntax: :: - <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs] + <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ] Overview: @@ -8343,13 +8570,11 @@ This instruction requires several arguments: #. '``ty``': the type of the call instruction itself which is also the type of the return value. Functions that return no value are marked ``void``. -#. '``fnty``': shall be the signature of the pointer to function value - being invoked. The argument types must match the types implied by - this signature. This type can be omitted if the function is not - varargs and if the function type does not return a pointer to a - function. +#. '``fnty``': shall be the signature of the function being called. The + argument types must match the types implied by this signature. This + type can be omitted if the function is not varargs. #. '``fnptrval``': An LLVM value containing a pointer to a function to - be invoked. In most cases, this is a direct function invocation, but + be called. In most cases, this is a direct function call, but indirect ``call``'s are just as possible, calling an arbitrary pointer to function value. #. 
'``function args``': argument list whose types match the function @@ -8358,8 +8583,8 @@ This instruction requires several arguments: indicates the function accepts a variable number of arguments, the extra arguments can be specified. #. The optional :ref:`function attributes <fnattrs>` list. Only - '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' - attributes are valid here. + '``noreturn``', '``nounwind``', '``readonly``' , '``readnone``', + and '``convergent``' attributes are valid here. #. The optional :ref:`operand bundles <opbundles>` list. Semantics: @@ -9497,6 +9722,33 @@ pass will generate the appropriate data structures and replace the ``llvm.instrprof_value_profile`` intrinsic with the call to the profile runtime library with proper arguments. +'``llvm.thread.pointer``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i8* @llvm.thread.pointer() + +Overview: +""""""""" + +The '``llvm.thread.pointer``' intrinsic returns the value of the thread +pointer. + +Semantics: +"""""""""" + +The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area +for the current thread. The exact semantics of this value are target +specific: it may point to the start of TLS area, to the end, or somewhere +in the middle. Depending on the target, this intrinsic may read a register, +call a helper function, read from an alternate memory space, or perform +other operations necessary to locate the TLS area. Not all targets support +this intrinsic. + Standard C Library Intrinsics ----------------------------- @@ -10459,8 +10711,8 @@ Overview: """"""""" The '``llvm.bitreverse``' family of intrinsics is used to reverse the -bitpattern of an integer value; for example ``0b1234567`` becomes -``0b7654321``. +bitpattern of an integer value; for example ``0b10110110`` becomes +``0b01101101``. Semantics: """""""""" @@ -10558,7 +10810,7 @@ targets support all bit widths or vector types, however. 
declare i32 @llvm.ctlz.i32 (i32 <src>, i1 <is_zero_undef>) declare i64 @llvm.ctlz.i64 (i64 <src>, i1 <is_zero_undef>) declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>) - declase <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) + declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) Overview: """"""""" @@ -10605,7 +10857,7 @@ support all bit widths or vector types, however. declare i32 @llvm.cttz.i32 (i32 <src>, i1 <is_zero_undef>) declare i64 @llvm.cttz.i64 (i64 <src>, i1 <is_zero_undef>) declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>) - declase <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) + declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>) Overview: """"""""" @@ -10640,7 +10892,26 @@ then the result is the size in bits of the type of ``src`` if Arithmetic with Overflow Intrinsics ----------------------------------- -LLVM provides intrinsics for some arithmetic with overflow operations. +LLVM provides intrinsics for fast arithmetic overflow checking. + +Each of these intrinsics returns a two-element struct. The first +element of this struct contains the result of the corresponding +arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of +the result. Therefore, for example, the first element of the struct +returned by ``llvm.sadd.with.overflow.i32`` is always the same as the +result of a 32-bit ``add`` instruction with the same operands, where +the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag. + +The second element of the result is an ``i1`` that is 1 if the +arithmetic operation overflowed and 0 otherwise. An operation +overflows if, for any values of its operands ``A`` and ``B`` and for +any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is +not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is +``sext`` for signed overflow and ``zext`` for unsigned overflow, and +``op`` is the underlying arithmetic operation. 
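As an illustration of the two-element result described above (a minimal
sketch that is not part of the patch; the function name ``@checked_add`` and
the fallback value ``0`` are assumptions made for this example), a caller
can split the returned struct with ``extractvalue`` and branch on the
overflow bit:

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      ; Hypothetical helper: returns %a + %b, or 0 if the signed add overflows.
      define i32 @checked_add(i32 %a, i32 %b) {
      entry:
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        ; Element 0: the sum modulo 2^32, same as a plain 'add' without nsw/nuw.
        %sum = extractvalue {i32, i1} %res, 0
        ; Element 1: 1 if the signed addition overflowed, 0 otherwise.
        %obit = extractvalue {i32, i1} %res, 1
        br i1 %obit, label %overflow, label %normal

      overflow:
        ret i32 0

      normal:
        ret i32 %sum
      }

For instance, with ``%a = 2147483647`` and ``%b = 1`` the first element wraps
to ``-2147483648``. The overflow bit is 1 because sign-extending both
operands to ``i64`` and adding gives ``2147483648``, which differs from the
sign-extended 32-bit result.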
+ +The behavior of these intrinsics is well-defined for all argument +values. '``llvm.sadd.with.overflow.*``' Intrinsics ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10980,7 +11251,7 @@ Examples of non-canonical encodings: - Many normal decimal floating point numbers have non-canonical alternative encodings. - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values. - These are treated as non-canonical encodings of zero and with be flushed to + These are treated as non-canonical encodings of zero and will be flushed to a zero of the same sign by this operation. Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with @@ -11304,12 +11575,12 @@ This is an overloaded intrinsic. The loaded data is a vector of any integer, flo :: - declare <16 x float> @llvm.masked.load.v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) - declare <2 x double> @llvm.masked.load.v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) + declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) + declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) ;; The data is a vector of pointers to double - declare <8 x double*> @llvm.masked.load.v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>) + declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>) ;; The data is a vector of function pointers - declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>) + declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>) Overview: """"""""" @@ -11332,7 +11603,7 @@ 
The result of this operation is equivalent to a regular vector load instruction :: - %res = call <16 x float> @llvm.masked.load.v16f32 (<16 x float>* %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru) + %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru) ;; The result of the two following instructions is identical aside from potential memory access exception %loadlal = load <16 x float>, <16 x float>* %ptr, align 4 @@ -11349,12 +11620,12 @@ This is an overloaded intrinsic. The data stored in memory is a vector of any in :: - declare void @llvm.masked.store.v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) - declare void @llvm.masked.store.v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) + declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) + declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) ;; The data is a vector of pointers to double - declare void @llvm.masked.store.v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>) + declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>) ;; The data is a vector of function pointers - declare void @llvm.masked.store.v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>) + declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>) Overview: """"""""" @@ -11375,7 +11646,7 @@ The result of this operation is equivalent to a load-modify-store sequence. 
Howe :: - call void @llvm.masked.store.v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask) + call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask) ;; The result of the following instructions is identical aside from potential data races and memory access exceptions %oldval = load <16 x float>, <16 x float>* %ptr, align 4 @@ -11475,7 +11746,7 @@ The '``llvm.masked.scatter``' intrinsics is designed for writing selected vector :: - ;; This instruction unconditionaly stores data vector in multiple addresses + ;; This instruction unconditionally stores data vector in multiple addresses call @llvm.masked.scatter.v8i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>) ;; It is equivalent to a list of scalar stores @@ -11859,43 +12130,40 @@ checked against the original guard by ``llvm.stackprotectorcheck``. If they are different, then ``llvm.stackprotectorcheck`` causes the program to abort by calling the ``__stack_chk_fail()`` function. -'``llvm.stackprotectorcheck``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +'``llvm.stackguard``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" :: - declare void @llvm.stackprotectorcheck(i8** <guard>) + declare i8* @llvm.stackguard() Overview: """"""""" -The ``llvm.stackprotectorcheck`` intrinsic compares ``guard`` against an already -created stack protector and if they are not equal calls the -``__stack_chk_fail()`` function. +The ``llvm.stackguard`` intrinsic returns the system stack guard value. + +It should not be generated by frontends, since it is only for internal usage. +The reason why we create this intrinsic is that we still support IR form Stack +Protector in FastISel. Arguments: """""""""" -The ``llvm.stackprotectorcheck`` intrinsic requires one pointer argument, the -the variable ``@__stack_chk_guard``. +None. 
Semantics: """""""""" -This intrinsic is provided to perform the stack protector check by comparing -``guard`` with the stack slot created by ``llvm.stackprotector`` and if the -values do not match call the ``__stack_chk_fail()`` function. +On some platforms, the value returned by this intrinsic remains unchanged +between loads in the same thread. On other platforms, it returns the same +global variable value, if any, e.g. ``@__stack_chk_guard``. -The reason to provide this as an IR level intrinsic instead of implementing it -via other IR operations is that in order to perform this operation at the IR -level without an intrinsic, one would need to create additional basic blocks to -handle the success/failure cases. This makes it difficult to stop the stack -protector check from disrupting sibling tail calls in Codegen. With this -intrinsic, we are able to generate the stack protector basic blocks late in -codegen after the tail call decision has occurred. +Currently some platforms have IR-level customized stack guard loading (e.g. +X86 Linux) that is not handled by ``llvm.stackguard()``, while they should be +in the future. '``llvm.objectsize``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -12010,9 +12278,9 @@ sufficient overall improvement in code quality. For this reason, that the optimizer can otherwise deduce or facts that are of little use to the optimizer. -.. _bitset.test: +.. _type.test: -'``llvm.bitset.test``' Intrinsic +'``llvm.type.test``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: @@ -12020,20 +12288,74 @@ Syntax: :: - declare i1 @llvm.bitset.test(i8* %ptr, metadata %bitset) nounwind readnone + declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone Arguments: """""""""" The first argument is a pointer to be tested. The second argument is a -metadata object representing an identifier for a :doc:`bitset <BitSets>`. +metadata object representing a :doc:`type identifier <TypeMetadata>`. 
Overview: """"""""" -The ``llvm.bitset.test`` intrinsic tests whether the given pointer is a -member of the given bitset. +The ``llvm.type.test`` intrinsic tests whether the given pointer is associated +with the given type identifier. + +'``llvm.type.checked.load``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly + + +Arguments: +"""""""""" + +The first argument is a pointer from which to load a function pointer. The +second argument is the byte offset from which to load the function pointer. The +third argument is a metadata object representing a :doc:`type identifier +<TypeMetadata>`. + +Overview: +""""""""" + +The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a +virtual table pointer using type metadata. This intrinsic is used to implement +control flow integrity in conjunction with virtual call optimization. The +virtual call optimization pass will optimize away ``llvm.type.checked.load`` +intrinsics associated with devirtualized calls, thereby removing the type +check in cases where it is not needed to enforce the control flow integrity +constraint. + +If the given pointer is associated with a type metadata identifier, this +function returns true as the second element of its return value. (Note that +the function may also return true if the given pointer is not associated +with a type metadata identifier.) If the function's return value's second +element is true, the following rules apply to the first element: + +- If the given pointer is associated with the given type metadata identifier, + it is the function pointer loaded from the given byte offset from the given + pointer. + +- If the given pointer is not associated with the given type metadata + identifier, it is one of the following (the choice of which is unspecified): + + 1. 
The function pointer that would have been loaded from an arbitrarily chosen + (through an unspecified mechanism) pointer associated with the type + metadata. + + 2. If the function has a non-void return type, a pointer to a function that + returns an unspecified value without causing side effects. + +If the function's return value's second element is false, the value of the +first element is undefined. + '``llvm.donothing``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -12049,8 +12371,9 @@ Overview: """"""""" The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only -two intrinsics (besides ``llvm.experimental.patchpoint``) that can be called -with an invoke instruction. +three intrinsics (besides ``llvm.experimental.patchpoint`` and +``llvm.experimental.gc.statepoint``) that can be called with an invoke +instruction. Arguments: """""""""" @@ -12063,6 +12386,155 @@ Semantics: This intrinsic does nothing, and it's removed by optimizers and ignored by codegen. +'``llvm.experimental.deoptimize``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ] + +Overview: +""""""""" + +This intrinsic, together with :ref:`deoptimization operand bundles +<deopt_opbundles>`, allow frontends to express transfer of control and +frame-local state from the currently executing (typically more specialized, +hence faster) version of a function into another (typically more generic, hence +slower) version. + +In languages with a fully integrated managed runtime like Java and JavaScript +this intrinsic can be used to implement "uncommon trap" or "side exit" like +functionality. In unmanaged languages like C and C++, this intrinsic can be +used to represent the slow paths of specialized functions. + + +Arguments: +"""""""""" + +The intrinsic takes an arbitrary number of arguments, whose meaning is +decided by the :ref:`lowering strategy<deoptimize_lowering>`. 
+ +Semantics: +"""""""""" + +The ``@llvm.experimental.deoptimize`` intrinsic executes an attached +deoptimization continuation (denoted using a :ref:`deoptimization +operand bundle <deopt_opbundles>`) and returns the value returned by +the deoptimization continuation. Defining the semantic properties of +the continuation itself is out of scope of the language reference -- +as far as LLVM is concerned, the deoptimization continuation can +invoke arbitrary side effects, including reading from and writing to +the entire heap. + +Deoptimization continuations expressed using ``"deopt"`` operand bundles always +continue execution to the end of the physical frame containing them, so all +calls to ``@llvm.experimental.deoptimize`` must be in "tail position": + + - ``@llvm.experimental.deoptimize`` cannot be invoked. + - The call must immediately precede a :ref:`ret <i_ret>` instruction. + - The ``ret`` instruction must return the value produced by the + ``@llvm.experimental.deoptimize`` call if there is one, or void. + +Note that the above restrictions imply that the return type for a call to +``@llvm.experimental.deoptimize`` will match the return type of its immediate +caller. + +The inliner composes the ``"deopt"`` continuations of the caller into the +``"deopt"`` continuations present in the inlinee, and also updates calls to this +intrinsic to return directly from the frame of the function it inlined into. + +All declarations of ``@llvm.experimental.deoptimize`` must share the +same calling convention. + +.. _deoptimize_lowering: + +Lowering: +""""""""" + +Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the +symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to +ensure that this symbol is defined). The call arguments to +``@llvm.experimental.deoptimize`` are lowered as if they were formal +arguments of the specified types, and not as varargs. 
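The "tail position" rules above can be sketched as follows. This is an
illustrative example, not part of the patch: the function ``@specialized``,
the operand values and the contents of the ``"deopt"`` bundle are
hypothetical, and the ``.i32`` suffix assumes the intrinsic name is
overloaded on its return type.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; Hypothetical specialized function: fast path for positive inputs,
      ; deoptimization for everything else.
      define i32 @specialized(i32 %x) {
      entry:
        %ok = icmp sgt i32 %x, 0
        br i1 %ok, label %fast, label %slow

      fast:
        %r = add i32 %x, 1
        ret i32 %r

      slow:
        ; A plain call (not an invoke), immediately followed by a 'ret' that
        ; returns the value produced by the call.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 0) [ "deopt"(i32 %x) ]
        ret i32 %v
      }

The return type of the call matches the return type of ``@specialized``, as
the restrictions above imply.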
+ + +'``llvm.experimental.guard``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ] + +Overview: +""""""""" + +This intrinsic, together with :ref:`deoptimization operand bundles +<deopt_opbundles>`, allows frontends to express guards or checks on +optimistic assumptions made during compilation. The semantics of +``@llvm.experimental.guard`` is defined in terms of +``@llvm.experimental.deoptimize`` -- its body is defined to be +equivalent to: + +.. code-block:: llvm + + define void @llvm.experimental.guard(i1 %pred, <args...>) { + %realPred = and i1 %pred, undef + br i1 %realPred, label %continue, label %leave [, !make.implicit !{}] + + leave: + call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ] + ret void + + continue: + ret void + } + + +with the optional ``[, !make.implicit !{}]`` present if and only if it +is present on the call site. For more details on ``!make.implicit``, +see :doc:`FaultMaps`. + +In words, ``@llvm.experimental.guard`` executes the attached +``"deopt"`` continuation if (but **not** only if) its first argument +is ``false``. Since the optimizer is allowed to replace the ``undef`` +with an arbitrary value, it can optimize guard to fail "spuriously", +i.e. without the original condition being false (hence the "not only +if"); and this allows for "check widening" type optimizations. + +``@llvm.experimental.guard`` cannot be invoked. + + +'``llvm.load.relative``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly + +Overview: +""""""""" + +This intrinsic loads a 32-bit value from the address ``%ptr + %offset``, +adds ``%ptr`` to that value and returns it. 
The constant folder specifically +recognizes the form of this intrinsic and the constant initializers it may +load from; if a loaded constant initializer is known to have the form +``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. + +LLVM provides that the calculation of such a constant initializer will +not overflow at link time under the medium code model if ``x`` is an +``unnamed_addr`` function. However, it does not provide this guarantee for +a constant initializer folded into a function body. This intrinsic can be +used to avoid the possibility of overflows when loading from such a constant. + Stack Map Intrinsics --------------------