Diffstat (limited to 'docs/NVPTXUsage.rst')
-rw-r--r-- docs/NVPTXUsage.rst | 46
1 files changed, 19 insertions, 27 deletions
diff --git a/docs/NVPTXUsage.rst b/docs/NVPTXUsage.rst
index fdfc8e41dc3b..159fe078653c 100644
--- a/docs/NVPTXUsage.rst
+++ b/docs/NVPTXUsage.rst
@@ -289,7 +289,7 @@ code often follows a pattern:
return my_function_precise(a);
}
-The default value for all unspecified reflection parameters is zero.
+The default value for all unspecified reflection parameters is zero.
The ``NVVMReflect`` pass should be executed early in the optimization
pipeline, immediately after the link stage. The ``internalize`` pass is also
@@ -326,6 +326,16 @@ often leave behind dead code of the form:
Therefore, it is recommended that ``NVVMReflect`` is executed early in the
optimization pipeline before dead-code elimination.
+The NVPTX TargetMachine knows how to schedule ``NVVMReflect`` at the beginning
+of your pass manager; just use the following code when setting up your pass
+manager:
+
+.. code-block:: c++
+
+ std::unique_ptr<TargetMachine> TM = ...;
+ PassManagerBuilder PMBuilder(...);
+ if (TM)
+ TM->adjustPassManager(PMBuilder);
Reflection Parameters
---------------------
@@ -339,35 +349,17 @@ Flag Description
``__CUDA_FTZ=[0,1]`` Use optimized code paths that flush subnormals to zero
==================== ======================================================
+The value of ``__CUDA_FTZ`` is determined by the ``nvvm-reflect-ftz`` module
+flag. The following sets it to 1:
-Invoking NVVMReflect
---------------------
-
-To ensure that all dead code caused by the reflection pass is eliminated, it
-is recommended that the reflection pass is executed early in the LLVM IR
-optimization pipeline. The pass takes an optional mapping of reflection
-parameter name to an integer value. This mapping can be specified as either a
-command-line option to ``opt`` or as an LLVM ``StringMap<int>`` object when
-programmatically creating a pass pipeline.
-
-With ``opt``:
-
-.. code-block:: text
-
- # opt -nvvm-reflect -nvvm-reflect-list=<var>=<value>,<var>=<value> module.bc -o module.reflect.bc
-
-
-With programmatic pass pipeline:
-
-.. code-block:: c++
-
- extern FunctionPass *llvm::createNVVMReflectPass(const StringMap<int>& Mapping);
-
- StringMap<int> ReflectParams;
- ReflectParams["__CUDA_FTZ"] = 1;
- Passes.add(createNVVMReflectPass(ReflectParams));
+.. code-block:: llvm
+ !llvm.module.flags = !{!0}
+ !0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}
+(``i32 4`` indicates that the value set here overrides the value in another
+module we link with. See the `LangRef <LangRef.html#module-flags-metadata>`_
+for details.)
Executing PTX
=============