Diffstat (limited to 'docs')
-rw-r--r-- docs/LanguageExtensions.rst 80
-rw-r--r-- docs/OpenMPSupport.rst 74
-rw-r--r-- docs/ReleaseNotes.rst 42
3 files changed, 116 insertions, 80 deletions
diff --git a/docs/LanguageExtensions.rst b/docs/LanguageExtensions.rst
index e155cefb7890..5782edd35370 100644
--- a/docs/LanguageExtensions.rst
+++ b/docs/LanguageExtensions.rst
@@ -474,44 +474,58 @@ Half-Precision Floating Point
=============================
Clang supports two half-precision (16-bit) floating point types: ``__fp16`` and
-``_Float16``. ``__fp16`` is defined in the ARM C Language Extensions (`ACLE
-<http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf>`_)
-and ``_Float16`` in ISO/IEC TS 18661-3:2015.
+``_Float16``. These types are supported in all language modes.
-``__fp16`` is a storage and interchange format only. This means that values of
-``__fp16`` promote to (at least) float when used in arithmetic operations.
-There are two ``__fp16`` formats. Clang supports the IEEE 754-2008 format and
-not the ARM alternative format.
+``__fp16`` is supported on every target, as it is purely a storage format; see below.
+``_Float16`` is currently supported only on the following targets:
+- 32-bit ARM
+- 64-bit ARM (AArch64)
+- SPIR
+``_Float16`` will be supported on more targets as they define ABIs for it.
-ISO/IEC TS 18661-3:2015 defines C support for additional floating point types.
-``_FloatN`` is defined as a binary floating type, where the N suffix denotes
-the number of bits and is 16, 32, 64, or greater and equal to 128 and a
-multiple of 32. Clang supports ``_Float16``. The difference from ``__fp16`` is
-that arithmetic on ``_Float16`` is performed in half-precision, thus it is not
-a storage-only format. ``_Float16`` is available as a source language type in
-both C and C++ mode.
+``__fp16`` is a storage and interchange format only. This means that values of
+``__fp16`` are immediately promoted to (at least) ``float`` when used in arithmetic
+operations, so that e.g. the result of adding two ``__fp16`` values has type ``float``.
+The behavior of ``__fp16`` is specified by the ARM C Language Extensions (`ACLE <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf>`_).
+Clang uses the ``binary16`` format from IEEE 754-2008 for ``__fp16``, not the ARM
+alternative format.
-It is recommended that portable code use the ``_Float16`` type because
-``__fp16`` is an ARM C-Language Extension (ACLE), whereas ``_Float16`` is
-defined by the C standards committee, so using ``_Float16`` will not prevent
-code from being ported to architectures other than Arm. Also, ``_Float16``
-arithmetic and operations will directly map on half-precision instructions when
-they are available (e.g. Armv8.2-A), avoiding conversions to/from
-single-precision, and thus will result in more performant code. If
-half-precision instructions are unavailable, values will be promoted to
-single-precision, similar to the semantics of ``__fp16`` except that the
-results will be stored in single-precision.
+``_Float16`` is an extended floating-point type. This means that, just like arithmetic on
+``float`` or ``double``, arithmetic on ``_Float16`` operands is formally performed in the
+``_Float16`` type, so that e.g. the result of adding two ``_Float16`` values has type
+``_Float16``. The behavior of ``_Float16`` is specified by ISO/IEC TS 18661-3:2015
+("Floating-point extensions for C"). As with ``__fp16``, Clang uses the ``binary16``
+format from IEEE 754-2008 for ``_Float16``.
-In an arithmetic operation where one operand is of ``__fp16`` type and the
-other is of ``_Float16`` type, the ``_Float16`` type is first converted to
-``__fp16`` type and then the operation is completed as if both operands were of
-``__fp16`` type.
+``_Float16`` arithmetic will be performed using native half-precision support
+when available on the target (e.g. on Armv8.2-A); otherwise it will be performed
+at a higher precision (currently always ``float``) and then truncated down to
+``_Float16``. Note that C and C++ allow intermediate floating-point operands
+of an expression to be computed with greater precision than is expressible in
+their type, so Clang may avoid intermediate truncations in certain cases; this may
+lead to results that are inconsistent with native arithmetic.
-To define a ``_Float16`` literal, suffix ``f16`` can be appended to the compile-time
-constant declaration. There is no default argument promotion for ``_Float16``; this
-applies to the standard floating types only. As a consequence, for example, an
-explicit cast is required for printing a ``_Float16`` value (there is no string
-format specifier for ``_Float16``).
+It is recommended that portable code use ``_Float16`` instead of ``__fp16``,
+as it has been defined by the C standards committee and has behavior that is
+more familiar to most programmers.
+
+Because ``__fp16`` operands are always immediately promoted to ``float``, the
+common real type of ``__fp16`` and ``_Float16`` for the purposes of the usual
+arithmetic conversions is ``float``.
+
+A literal can be given ``_Float16`` type using the suffix ``f16``; for example::
+
+  3.14f16
+
+Because default argument promotion only applies to the standard floating-point
+types, ``_Float16`` values are not promoted to ``double`` when passed as variadic
+or untyped arguments. As a consequence, some caution must be taken when using
+certain library facilities with ``_Float16``; for example, there is no ``printf`` format
+specifier for ``_Float16``, and (unlike ``float``) it will not be implicitly promoted to
+``double`` when passed to ``printf``, so the programmer must explicitly cast it to
+``double`` before using it with a ``%f`` or similar specifier.
Messages on ``deprecated`` and ``unavailable`` Attributes
=========================================================
diff --git a/docs/OpenMPSupport.rst b/docs/OpenMPSupport.rst
index 04a9648ca294..7b567c966ee5 100644
--- a/docs/OpenMPSupport.rst
+++ b/docs/OpenMPSupport.rst
@@ -17,60 +17,50 @@
OpenMP Support
==================
-Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
-PPC64[LE] and has `basic support for Cuda devices`_.
-
-Standalone directives
-=====================
-
-* #pragma omp [for] simd: :good:`Complete`.
-
-* #pragma omp declare simd: :partial:`Partial`. We support parsing/semantic
- analysis + generation of special attributes for X86 target, but still
- missing the LLVM pass for vectorization.
-
-* #pragma omp taskloop [simd]: :good:`Complete`.
-
-* #pragma omp target [enter|exit] data: :good:`Complete`.
-
-* #pragma omp target update: :good:`Complete`.
-
-* #pragma omp target: :good:`Complete`.
+Clang supports the following OpenMP 5.0 features:
-* #pragma omp declare target: :good:`Complete`.
+* The `reduction`-based clauses in the `task` and `target`-based directives.
-* #pragma omp teams: :good:`Complete`.
+* Support relational-op != (not-equal) as one of the canonical forms of random
+ access iterator.
-* #pragma omp distribute [simd]: :good:`Complete`.
+* Support for mapping of the lambdas in target regions.
-* #pragma omp distribute parallel for [simd]: :good:`Complete`.
+* Parsing/sema analysis for the requires directive.
-Combined directives
-===================
+* Nested declare target directives.
-* #pragma omp parallel for simd: :good:`Complete`.
+* Make the `this` pointer implicitly mapped as `map(this[:1])`.
-* #pragma omp target parallel: :good:`Complete`.
+* The `close` *map-type-modifier*.
-* #pragma omp target parallel for [simd]: :good:`Complete`.
-
-* #pragma omp target simd: :good:`Complete`.
-
-* #pragma omp target teams: :good:`Complete`.
-
-* #pragma omp teams distribute [simd]: :good:`Complete`.
-
-* #pragma omp target teams distribute [simd]: :good:`Complete`.
+Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
+PPC64[LE] and has `basic support for Cuda devices`_.
-* #pragma omp teams distribute parallel for [simd]: :good:`Complete`.
+* #pragma omp declare simd: :partial:`Partial`. We support parsing/semantic
+ analysis + generation of special attributes for X86 target, but still
+ missing the LLVM pass for vectorization.
-* #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
+In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
+Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and macOS.
-Clang does not support any constructs/updates from OpenMP 5.0 except
-for `reduction`-based clauses in the `task` and `target`-based directives.
+General improvements
+--------------------
+- New collapse clause scheme to avoid expensive remainder operations.
+ Compute loop index variables after collapsing a loop nest via the
+ collapse clause by replacing the expensive remainder operation with
+ multiplications and additions.
-In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
-Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
+- The default schedules for the `distribute` and `for` constructs in a
+ parallel region and in SPMD mode have changed to ensure coalesced
+ accesses. For the `distribute` construct, a static schedule is used
+ with a chunk size equal to the number of threads per team (default
+ value of threads or as specified by the `thread_limit` clause if
+ present). For the `for` construct, the schedule is static with chunk
+ size of one.
+
+- Simplified SPMD code generation for `distribute parallel for` when
+ the new default schedules are applicable.
.. _basic support for Cuda devices:
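To illustrate the default-schedule bullet in the OpenMPSupport.rst hunk above, here is a minimal model of which team and thread would own each iteration of a `distribute parallel for` loop under the new defaults. It is a sketch of the scheduling rule as described in the text, not Clang's actual code generation; the names `static_owner`, `spmd_owner`, and `Owner` are hypothetical, and the model assumes a zero-based iteration space.

```c
#include <assert.h>

/* Owner of iteration `iter` under a static schedule with the given chunk
   size: chunks are dealt out to the workers round-robin. */
long static_owner(long iter, long chunk, long workers) {
    assert(iter >= 0 && chunk > 0 && workers > 0);
    return (iter / chunk) % workers;
}

/* Hypothetical result type: which team and which thread run an iteration. */
typedef struct {
    long team;
    long thread;
} Owner;

/* Assumed model of the new defaults:
   - distribute: static schedule, chunk size = threads per team
   - for:        static schedule, chunk size = 1 */
Owner spmd_owner(long iter, long teams, long threads_per_team) {
    Owner o;
    o.team = static_owner(iter, threads_per_team, teams);
    /* Inside a team's chunk, consecutive iterations go to consecutive
       threads, so neighbouring threads touch neighbouring elements. */
    o.thread = static_owner(iter % threads_per_team, 1, threads_per_team);
    return o;
}
```

With 2 teams of 4 threads each, `spmd_owner(5, 2, 4)` assigns iteration 5 to team 1, thread 1. Consecutive iterations within a chunk land on consecutive threads, which is the access pattern the bullet refers to as coalesced.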
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst
index b6a405dbc78b..50bf636a51f4 100644
--- a/docs/ReleaseNotes.rst
+++ b/docs/ReleaseNotes.rst
@@ -127,6 +127,10 @@ Non-comprehensive list of changes in this release
manually and rely on the old behaviour you will need to add appropriate
compiler flags for finding the corresponding libc++ include directory.
+- The integrated assembler is now used by default for all MIPS targets.
+
+- Improved support for MIPS N32 ABI and MIPS R6 target triples.
+
New Compiler Flags
------------------
@@ -136,6 +140,13 @@ New Compiler Flags
instrumenting for gcov-based profiling.
See the :doc:`UsersManual` for details.
+- When using a custom stack alignment, the ``stackrealign`` attribute is now
+ implicitly set on the main function.
+
+- Emission of ``R_MIPS_JALR`` and ``R_MICROMIPS_JALR`` relocations can now
+ be controlled by the ``-mrelax-pic-calls`` and ``-mno-relax-pic-calls``
+ options.
+
- ...
Deprecated Compiler Flags
@@ -179,6 +190,15 @@ Windows Support
`dllexport` and `dllimport` attributes not apply to inline member functions.
This can significantly reduce compile and link times. See the `User's Manual
<UsersManual.html#the-zc-dllexportinlines-option>`_ for more info.
+
+- For MinGW, ``-municode`` now correctly defines ``UNICODE`` during
+ preprocessing.
+
+- For MinGW, clang now produces vtables and RTTI for dllexported classes
+ without key functions. This fixes building Qt in debug mode.
+
+- Allow using Address Sanitizer and Undefined Behaviour Sanitizer on MinGW.
+
- ...
@@ -233,12 +253,15 @@ ABI Changes in Clang
OpenMP Support in Clang
----------------------------------
-- Support relational-op != (not-equal) as one of the canonical forms of random
- access iterator.
-
-- Added support for mapping of the lambdas in target regions.
+- OpenMP 5.0 features
-- Added parsing/sema analysis for OpenMP 5.0 requires directive.
+ - Support relational-op != (not-equal) as one of the canonical forms of random
+ access iterator.
+ - Added support for mapping of the lambdas in target regions.
+ - Added parsing/sema analysis for the requires directive.
+ - Support nested declare target directives.
+ - Make the `this` pointer implicitly mapped as `map(this[:1])`.
+ - Added the `close` *map-type-modifier*.
- Various bugfixes and improvements.
@@ -250,6 +273,15 @@ New features supported for Cuda devices:
- Fixed support for lastprivate/reduction variables in SPMD constructs.
+- New collapse clause scheme to avoid expensive remainder operations.
+
+- New default schedule for distribute and parallel constructs.
+
+- Simplified code generation for distribute and parallel in SPMD mode.
+
+- Flag (``-fopenmp-optimistic-collapse``) that lets the user limit the collapsed
+  loop counter width when it is safe to do so.
+
- General performance improvement.
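The collapse clause entry in this list replaces remainder operations with multiplications and additions when recovering the original loop indices. A minimal sketch of that idea follows, assuming a three-deep rectangular loop nest with zero-based bounds. `Index3` and `decollapse3` are hypothetical names, and the code Clang actually emits may differ.

```c
#include <assert.h>

/* Hypothetical result type: the three original loop indices. */
typedef struct {
    long i, j, k;
} Index3;

/* Recover (i, j, k) from the collapsed index `iv` of a loop nest whose two
   inner loops have trip counts n1 and n2. Each remainder is rewritten as
   x - (x / d) * d, so only divisions, multiplications and subtractions
   are needed. */
Index3 decollapse3(long iv, long n1, long n2) {
    assert(iv >= 0 && n1 > 0 && n2 > 0);
    Index3 r;
    long inner = n1 * n2;          /* iterations per step of the outer loop */
    r.i = iv / inner;
    long rest = iv - r.i * inner;  /* same value as iv % inner */
    r.j = rest / n2;
    r.k = rest - r.j * n2;         /* same value as rest % n2 */
    return r;
}
```

For a nest with inner trip counts 3 and 4, `decollapse3(23, 3, 4)` yields `i = 1`, `j = 2`, `k = 3`, the same indices the remainder-based formulas would produce.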
CUDA Support in Clang