| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the new layout of the upstream repository, which was recently
migrated to GitHub, and converted into a "monorepo". That is, most of
the earlier separate sub-projects with their own branches and tags were
consolidated into one top-level directory, and are now branched and
tagged together.
Updating the vendor area to match this layout is next.
Notes:
svn path=/head/; revision=355940
|
|
|
|
|
|
|
| |
release 9.0.0 r372316, and update version numbers.
Notes:
svn path=/projects/clang900-import/; revision=352536
|
|\
| |
| |
| | |
Notes:
svn path=/projects/clang900-import/; revision=352105
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[IfConversion] Fix diamond conversion with unanalyzable branches.
The code was incorrectly counting the number of identical
instructions, and therefore tried to predicate an instruction which
should not have been predicated. This could have various effects: a
compiler crash, an assembler failure, a miscompile, or just
generating an extra, unnecessary instruction.
Instead of depending on TargetInstrInfo::removeBranch, which only
works on analyzable branches, just remove all branch instructions.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and
https://bugs.llvm.org/show_bug.cgi?id=41121 .
Differential Revision: https://reviews.llvm.org/D67203
This should fix "Unable to predicate BX killed renamable $r0" errors
when building the lang/spidermonkey170 and lang/spidermonkey38 ports for
armv7 and armv6.
PR: 236567
MFC after: 3 days
Notes:
svn path=/head/; revision=351938
|
| |
| |
| |
| |
| |
| |
| | |
release_90 branch r371301, and update version numbers.
Notes:
svn path=/projects/clang900-import/; revision=352010
|
| |
| |
| |
| |
| |
| |
| | |
release_90 branch r369369, and update version numbers.
Notes:
svn path=/projects/clang900-import/; revision=351708
|
|/
|
|
| |
Notes:
svn path=/projects/clang900-import/; revision=351344
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[CodeGen][NFC] Simplify checks for stack protector index checking
Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex()
>= 0`.
Pull in r366371 from upstream llvm trunk (by Francis Visoiu Mistrih):
[PEI] Don't re-allocate a pre-allocated stack protector slot
The LocalStackSlotPass pre-allocates a stack protector and makes sure
that it comes before the local variables on the stack.
We need to make sure that later during PEI we don't re-allocate a new
stack protector slot. If that happens, the new stack protector slot
will end up being **after** the local variables that it should be
protecting.
Therefore, we would have two slots assigned for two different stack
protectors, one at the top of the stack, and one at the bottom. Since
PEI will overwrite the assigned slot for the stack protector, the
load that is used to compare the value of the stack protector will
use the slot assigned by PEI, which is wrong.
For this, we need to check if the object is pre-allocated, and re-use
that pre-allocated slot.
Differential Revision: https://reviews.llvm.org/D64757
Pull in r367068 from upstream llvm trunk (by Francis Visoiu Mistrih):
[CodeGen] Don't resolve the stack protector frame accesses until PEI
Currently, stack protector loads and stores are resolved during
LocalStackSlotAllocation (if the pass needs to run). When this is the
case, the base register assigned to the frame access is going to be
one of the vregs created during LocalStackSlotAllocation. This means
that we are keeping a pointer to the stack protector slot, and we're
using this pointer to load and store to it.
In case register pressure goes up, we may end up spilling this
pointer to the stack, which can be a security concern.
Instead, leave it to PEI to resolve the frame accesses. In order to
do that, we make all stack protector accesses go through frame index
operands, then PEI will resolve this using an offset from sp/fp/bp.
Differential Revision: https://reviews.llvm.org/D64759
Together, these fix a issue where the stack protection feature in LLVM's
ARM backend can be rendered ineffective when the stack protector slot is
re-allocated so that it appears after the local variables that it is
meant to protect, leaving the function potentially vulnerable to a
stack-based buffer overflow.
Reported by: andrew
Security: https://kb.cert.org/vuls/id/129209/
MFC after: 3 days
Notes:
svn path=/head/; revision=350362
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[SelectionDAG] soften assertion when legalizing narrow vector FP ops
The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not be
so aggressive with an assert here. There's no telling what
out-of-tree targets look like.
This fixes an assertion when building the graphics/mesa-dri port for
PowerPC64.
Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
PR: 238082
MFC after: 3 days
Notes:
svn path=/head/; revision=348288
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[RegAlloc] Avoid compile time regression with multiple copy hints.
As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive
compile time building opencollada"), this patch makes sure that no
phys reg is hinted more than once from getRegAllocationHints().
This handles the case were many virtual registers are assigned to the
same physreg. The previous compile time fix (r343686) in
weightCalcHelper() only made sure that physical/virtual registers are
passed no more than once to addRegAllocationHint().
Review: Dimitry Andric, Quentin Colombet
https://reviews.llvm.org/D59201
This should fix a hang when compiling certain generated .cpp files in
the graphics/opencollada port.
PR: 236313
MFC after: 1 month
X-MFC-With: r344779
Notes:
svn path=/head/; revision=345021
|
|
|
|
|
|
|
| |
r353167, resolve conflicts, and bump version numbers.
Notes:
svn path=/projects/clang800-import/; revision=343806
|
|
|
|
| |
Notes:
svn path=/projects/clang800-import/; revision=343313
|
|
|
|
| |
Notes:
svn path=/projects/clang800-import/; revision=343210
|
|
|
|
|
|
|
|
|
|
|
| |
Remove debug printf leftover from r342397
PR: 234480
MFC after: 6 weeks
X-MFC-With: r341825
Notes:
svn path=/head/; revision=342593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Revert "Revert r342183 "[DAGCombine] Fix crash when store merging
created an extract_subvector with invalid index.""
Fixed the assertion failure.
Differential Revision: https://reviews.llvm.org/D51831
This fixes 'Assertion failed: ((VT.getVectorNumElements() +
N2C->getZExtValue() <= N1.getValueType().getVectorNumElements()) &&
"Extract subvector overflow!"), function getNode' when building the
multimedia/aom port (with AVX2 enabled).
Reported by: jbeich
PR: 234480
MFC after: 6 weeks
X-MFC-With: r341825
Notes:
svn path=/head/; revision=342592
|
|
|
|
|
|
|
|
|
|
| |
r348686 (effectively 7.0.1 rc3), resolve conflicts, and bump version
numbers.
PR: 230240, 230355
Notes:
svn path=/projects/clang700-import/; revision=341763
|
|
|
|
|
|
|
|
|
|
| |
r346007 (effectively 7.0.1 rc2), resolve conflicts, and bump version
numbers.
PR: 230240, 230355
Notes:
svn path=/projects/clang700-import/; revision=340125
|
|
|
|
|
|
|
|
|
| |
r340910, resolve conflicts, and bump version numbers.
PR: 230240, 230355
Notes:
svn path=/projects/clang700-import/; revision=338391
|
|
|
|
|
|
|
|
|
| |
r339999, resolve conflicts, and bump version numbers.
PR: 230240,230355
Notes:
svn path=/projects/clang700-import/; revision=338014
|
|
|
|
|
|
|
| |
r339355, resolve conflicts, and bump version numbers.
Notes:
svn path=/projects/clang700-import/; revision=337645
|
|
|
|
|
|
|
| |
resolve conflicts.
Notes:
svn path=/projects/clang700-import/; revision=337149
|
|
|
|
| |
Notes:
svn path=/projects/clang700-import/; revision=336916
|
|
|
|
|
|
|
|
|
|
| |
6.0.1 release (upstream r335540).
Relnotes: yes
MFC after: 2 weeks
Notes:
svn path=/head/; revision=335799
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PeepholeOpt cleanup/refactor; NFC
- Less unnecessary use of `auto`
- Add early `using RegSubRegPair(AndIdx) =` to avoid countless
`TargetInstrInfo::` qualifications.
- Use references instead of pointers where possible.
- Remove unused parameters.
- Rewrite the CopyRewriter class hierarchy:
- Pull out uncoalescable copy rewriting functionality into
PeepholeOptimizer class.
- Use an abstract base class to make it clear that rewriters are
independent.
- Remove unnecessary \brief in doxygen comments.
- Remove unused constructor and method from ValueTracker.
- Replace UseAdvancedTracking of ValueTracker with DisableAdvCopyOpt
use.
Even though upstream marked this as "No Functional Change", it does
contain some functional changes, and these fix a compiler hang for one
particular source file in the devel/godot port.
PR: 228261
MFC after: 3 days
Notes:
svn path=/head/; revision=333715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EFLAGS copy that lives out of a basic block!" errors on i386.
Pull in r325446 from upstream clang trunk (by me):
[X86] Add 'sahf' CPU feature to frontend
Summary:
Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
`+sahf` feature for the backend, for bug 36028 (Incorrect use of
pushf/popf enables/disables interrupts on amd64 kernels). This was
originally submitted in bug 36037 by Jonathan Looney
<jonlooney@gmail.com>.
As described there, GCC also uses `-msahf` for this feature, and the
backend already recognizes the `+sahf` feature. All that is needed is
to teach clang to pass this on to the backend.
The mapping of feature support onto CPUs may not be complete; rather,
it was chosen to match LLVM's idea of which CPUs support this feature
(see lib/Target/X86/X86.td).
I also updated the affected test case (CodeGen/attr-target-x86.c) to
match the emitted output.
Reviewers: craig.topper, coby, efriedma, rsmith
Reviewed By: craig.topper
Subscribers: emaste, cfe-commits
Differential Revision: https://reviews.llvm.org/D43394
Pull in r328944 from upstream llvm trunk (by Chandler Carruth):
[x86] Expose more of the condition conversion routines in the public
API for X86's instruction information. I've now got a second patch
under review that needs these same APIs. This bit is nicely
orthogonal and obvious, so landing it. NFC.
Pull in r329414 from upstream llvm trunk (by Craig Topper):
[X86] Merge itineraries for CLC, CMC, and STC.
These are very simple flag setting instructions that appear to only
be a single uop. They're unlikely to need this separation.
Pull in r329657 from upstream llvm trunk (by Chandler Carruth):
[x86] Introduce a pass to begin more systematically fixing PR36028
and similar issues.
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the
necessary state in a GPR. In the vast majority of cases, these uses
are cmovCC and jCC instructions. For such cases, we can very easily
save and restore the necessary information by simply inserting a
setCC into a GPR where the original flags are live, and then testing
that GPR directly to feed the cmov or conditional branch.
However, things are a bit more tricky if arithmetic is using the
flags. This patch handles the vast majority of cases that seem to
come up in practice: adc, adcx, adox, rcl, and rcr; all without
taking advantage of partially preserved EFLAGS as LLVM doesn't
currently model that at all.
There are a large number of operations that techinaclly observe
EFLAGS currently but shouldn't in this case -- they typically are
using DF. Currently, they will not be handled by this approach.
However, I have never seen this issue come up in practice. It is
already pretty rare to have these patterns come up in practical code
with LLVM. I had to resort to writing MIR tests to cover most of the
logic in this pass already. I suspect even with its current amount
of coverage of arithmetic users of EFLAGS it will be a significant
improvement over the current use of pushf/popf. It will also produce
substantially faster code in most of the common patterns.
This patch also removes all of the old lowering for EFLAGS copies,
and the hack that forced us to use a frame pointer when EFLAGS copies
were found anywhere in a function so that the dynamic stack
adjustment wasn't a problem. None of this is needed as we now lower
all of these copies directly in MI and without require stack
adjustments.
Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things
tripping me up while working on this.
Differential Revision: https://reviews.llvm.org/D45146
Pull in r329673 from upstream llvm trunk (by Chandler Carruth):
[x86] Model the direction flag (DF) separately from the rest of
EFLAGS.
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting
EFLAGS actually modify DF (other than things like popf) and so this
needlessly creates uses of EFLAGS that aren't really there.
In fact, DF is so restrictive it is pretty easy to model. Only STD,
CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
this.
I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.
Adding this extra register also uncovered a failure to use the
correct datatype to hold X86 registers, and I've corrected that as
necessary here.
Differential Revision: https://reviews.llvm.org/D45154
Pull in r330264 from upstream llvm trunk (by Chandler Carruth):
[x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite
uses across basic blocks in the limited cases where it is very
straight forward to do so.
This will also be useful for other places where we do some limited
EFLAGS propagation across CFG edges and need to handle copy rewrites
afterward. I think this is rapidly approaching the maximum we can and
should be doing here. Everything else begins to require either heroic
analysis to prove how to do PHI insertion manually, or somehow
managing arbitrary PHI-ing of EFLAGS with general PHI insertion.
Neither of these seem at all promising so if those cases come up,
we'll almost certainly need to rewrite the parts of LLVM that produce
those patterns.
We do now require dominator trees in order to reliably diagnose
patterns that would require PHI nodes. This is a bit unfortunate but
it seems better than the completely mysterious crash we would get
otherwise.
Differential Revision: https://reviews.llvm.org/D45673
Together, these should ensure clang does not use pushf/popf sequences to
save and restore flags, avoiding problems with unrelated flags (such as
the interrupt flag) being restored unexpectedly.
Requested by: jtl
PR: 225330
MFC after: 1 week
Notes:
svn path=/head/; revision=332833
|
|
|
|
|
|
|
|
|
|
| |
Reported upstream as <https://bugs.llvm.org/show_bug.cgi?id=37133>.
Reported by: emaste, ci.freebsd.org
PR: 225330
Notes:
svn path=/head/; revision=332503
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[X86] Add 'sahf' CPU feature to frontend
Summary:
Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
`+sahf` feature for the backend, for bug 36028 (Incorrect use of
pushf/popf enables/disables interrupts on amd64 kernels). This was
originally submitted in bug 36037 by Jonathan Looney
<jonlooney@gmail.com>.
As described there, GCC also uses `-msahf` for this feature, and the
backend already recognizes the `+sahf` feature. All that is needed is
to teach clang to pass this on to the backend.
The mapping of feature support onto CPUs may not be complete; rather,
it was chosen to match LLVM's idea of which CPUs support this feature
(see lib/Target/X86/X86.td).
I also updated the affected test case (CodeGen/attr-target-x86.c) to
match the emitted output.
Reviewers: craig.topper, coby, efriedma, rsmith
Reviewed By: craig.topper
Subscribers: emaste, cfe-commits
Differential Revision: https://reviews.llvm.org/D43394
Pull in r328944 from upstream llvm trunk (by Chandler Carruth):
[x86] Expose more of the condition conversion routines in the public
API for X86's instruction information. I've now got a second patch
under review that needs these same APIs. This bit is nicely
orthogonal and obvious, so landing it. NFC.
Pull in r329414 from upstream llvm trunk (by Craig Topper):
[X86] Merge itineraries for CLC, CMC, and STC.
These are very simple flag setting instructions that appear to only
be a single uop. They're unlikely to need this separation.
Pull in r329657 from upstream llvm trunk (by Chandler Carruth):
[x86] Introduce a pass to begin more systematically fixing PR36028
and similar issues.
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the
necessary state in a GPR. In the vast majority of cases, these uses
are cmovCC and jCC instructions. For such cases, we can very easily
save and restore the necessary information by simply inserting a
setCC into a GPR where the original flags are live, and then testing
that GPR directly to feed the cmov or conditional branch.
However, things are a bit more tricky if arithmetic is using the
flags. This patch handles the vast majority of cases that seem to
come up in practice: adc, adcx, adox, rcl, and rcr; all without
taking advantage of partially preserved EFLAGS as LLVM doesn't
currently model that at all.
There are a large number of operations that techinaclly observe
EFLAGS currently but shouldn't in this case -- they typically are
using DF. Currently, they will not be handled by this approach.
However, I have never seen this issue come up in practice. It is
already pretty rare to have these patterns come up in practical code
with LLVM. I had to resort to writing MIR tests to cover most of the
logic in this pass already. I suspect even with its current amount
of coverage of arithmetic users of EFLAGS it will be a significant
improvement over the current use of pushf/popf. It will also produce
substantially faster code in most of the common patterns.
This patch also removes all of the old lowering for EFLAGS copies,
and the hack that forced us to use a frame pointer when EFLAGS copies
were found anywhere in a function so that the dynamic stack
adjustment wasn't a problem. None of this is needed as we now lower
all of these copies directly in MI and without require stack
adjustments.
Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things
tripping me up while working on this.
Differential Revision: https://reviews.llvm.org/D45146
Pull in r329673 from upstream llvm trunk (by Chandler Carruth):
[x86] Model the direction flag (DF) separately from the rest of
EFLAGS.
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting
EFLAGS actually modify DF (other than things like popf) and so this
needlessly creates uses of EFLAGS that aren't really there.
In fact, DF is so restrictive it is pretty easy to model. Only STD,
CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
this.
I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.
Adding this extra register also uncovered a failure to use the
correct datatype to hold X86 registers, and I've corrected that as
necessary here.
Differential Revision: https://reviews.llvm.org/D45154
Together, these should ensure clang does not use pushf/popf sequences to
save and restore flags, avoiding problems with unrelated flags (such as
the interrupt flag) being restored unexpectedly.
Requested by: jtl
PR: 225330
MFC after: 1 week
Notes:
svn path=/head/; revision=332501
|
|
|
|
|
|
|
|
|
|
|
| |
6.0.0 (branches/release_60 r325330).
MFC after: 3 months
X-MFC-With: r327952
PR: 224669
Notes:
svn path=/head/; revision=329410
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
6.0.0 (branches/release_60 r324090).
This introduces retpoline support, with the -mretpoline flag. The
upstream initial commit message (r323155 by Chandler Carruth) contains
quite a bit of explanation. Quoting:
Introduce the "retpoline" x86 mitigation technique for variant #2 of
the speculative execution vulnerabilities disclosed today,
specifically identified by CVE-2017-5715, "Branch Target Injection",
and is one of the two halves to Spectre.
Summary:
First, we need to explain the core of the vulnerability. Note that
this is a very incomplete description, please see the Project Zero
blog post for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative
execution of the processor to some "gadget" of executable code by
poisoning the prediction of indirect branches with the address of
that gadget. The gadget in turn contains an operation that provides a
side channel for reading data. Most commonly, this will look like a
load of secret data followed by a branch on the loaded value and then
a load of some predictable cache line. The attacker then uses timing
of the processors cache to determine which direction the branch took
*in the speculative execution*, and in turn what one bit of the
loaded value was. Due to the nature of these timing side channels and
the branch predictor on Intel processors, this allows an attacker to
leak data only accessible to a privileged domain (like the kernel)
back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In
many cases, the compiler can simply use directed conditional branches
and a small search tree. LLVM already has support for lowering
switches in this way and the first step of this patch is to disable
jump-table lowering of switches and introduce a pass to rewrite
explicit indirectbr sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as a
trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures
the processor predicts the return to go to a controlled, known
location. The retpoline then "smashes" the return address pushed onto
the stack by the call with the desired target of the original
indirect call. The result is a predicted return to the next
instruction after a call (which can be used to trap speculative
execution within an infinite loop) and an actual indirect branch to
an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this
device. For 32-bit ABIs there isn't a guaranteed scratch register
and so several different retpoline variants are introduced to use a
scratch register if one is available in the calling convention and to
otherwise use direct stack push/pop sequences to pass the target
address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the
retpoline thunk by the compiler to allow for custom thunks if users
want them. These are particularly useful in environments like
kernels that routinely do hot-patching on boot and want to hot-patch
their thunk to different code sequences. They can write this custom
thunk and use `-mretpoline-external-thunk` *in addition* to
`-mretpoline`. In this case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are
from precompiled runtimes (such as crt0.o and similar). The ones we
have found are not really attackable, and so we have not focused on
them here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z
retpolineplt` (or use similar functionality from some other linker).
We strongly recommend also using `-z now` as non-lazy binding allows
the retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typic al workloads, and relatively minor hits (approximately
2%) even for extremely syscall-heavy applications. This is largely
due to the small number of indirect branches that occur in
performance sensitive paths of the kernel.
When using these patches on statically linked applications,
especially C++ applications, you should expect to see a much more
dramatic performance hit. For microbenchmarks that are switch,
indirect-, or virtual-call heavy we have seen overheads ranging from
10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically
reduce the impact of hot indirect calls (by speculatively promoting
them to direct calls) and allow optimized search trees to be used to
lower switches. If you need to deploy these techniques in C++
applications, we *strongly* recommend that you ensure all hot call
targets are statically linked (avoiding PLT indirection) and use both
PGO and ThinLTO. Well tuned servers using all of these techniques saw
5% - 10% overhead from the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality
available as soon as possible. Happy for more code review, but we'd
really like to get these patches landed and backported ASAP for
obvious reasons. We're planning to backport this to both 6.0 and 5.0
release streams and get a 5.0 release with just this cherry picked
ASAP for distros and vendors.
This patch is the work of a number of people over the past month:
Eric, Reid, Rui, and myself. I'm mailing it out as a single commit
due to the time sensitive nature of landing this and the need to
backport it. Huge thanks to everyone who helped out here, and
everyone at Intel who helped out in discussions about how to craft
this. Also, credit goes to Paul Turner (at Google, but not an LLVM
contributor) for much of the underlying retpoline design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
MFC after: 3 months
X-MFC-With: r327952
PR: 224669
Notes:
svn path=/head/; revision=328817
|
|
|
|
|
|
|
|
|
|
|
| |
6.0.0 (branches/release_60 r323948).
MFC after: 3 months
X-MFC-With: r327952
PR: 224669
Notes:
svn path=/head/; revision=328753
|
|
|
|
|
|
|
|
|
|
|
| |
6.0.0 (branches/release_60 r323338).
MFC after: 3 months
X-MFC-With: r327952
PR: 224669
Notes:
svn path=/head/; revision=328381
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[CGP] Fix Complex addressing mode for offset
If the offset is differ in two addressing mode we can continue only
if ScaleReg is not set due to we will use it as merge of different
offsets.
It should fix PR35799 and PR35805.
Reviewers: john.brawn, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41227
This should fix "ScaledReg == nullptr" assertions when building the
graphics/xpx, mail/alpine and editors/pico-alpine ports.
Reported by: jbeich
PR: 224866, 224995
Notes:
svn path=/projects/clang600-import/; revision=327734
|
|
|
|
|
|
|
| |
update build glue and version numbers.
Notes:
svn path=/projects/clang600-import/; revision=327657
|
|
|
|
|
|
|
|
| |
update build glue and version numbers, add new intrinsics headers, and
update OptionalObsoleteFiles.inc.
Notes:
svn path=/projects/clang600-import/; revision=327330
|
|
|
|
| |
Notes:
svn path=/projects/clang600-import/; revision=327134
|
|
|
|
| |
Notes:
svn path=/projects/clang600-import/; revision=327023
|
|
|
|
|
|
|
|
|
| |
upstream release_50 branch. This corresponds to 5.0.1 rc2.
MFC after: 2 weeks
Notes:
svn path=/head/; revision=326496
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the upstream release_50 branch. This corresponds to 5.0.0 rc4.
As of this version, the cad/stepcode port should now compile in a more
reasonable time on i386 (see bug 221836 for more information).
PR: 221836
MFC after: 2 months
X-MFC-with: r321369
Notes:
svn path=/head/; revision=323112
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the upstream release_50 branch.
As of this version, lib/msun's trig test should also work correctly
again (see bug 220989 for more information).
PR: 220989
MFC after: 2 months
X-MFC-with: r321369
Notes:
svn path=/head/; revision=322855
|
|
|
|
|
|
|
|
|
|
| |
upstream release_50 branch.
MFC after: 2 months
X-MFC-with: r321369
Notes:
svn path=/head/; revision=322740
|
|
|
|
|
|
|
|
|
|
| |
upstream release_50 branch.
MFC after: 2 months
X-MFC-with: r321369
Notes:
svn path=/head/; revision=322320
|
|
|
|
|
|
|
|
|
|
| |
upstream release_50 branch. This is just after upstream's 5.0.0-rc1.
MFC after: 2 months
X-MFC-with: r321369
Notes:
svn path=/head/; revision=321723
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.
This avoids excessive compile time. The case I'm looking at is
Function.cpp from an old version of LLVM that still had the giant
memcmp string matcher in it. Before r308322 this compiled in about 2
minutes, after it, clang takes infinite* time to compile it. With
this patch we're at 5 min, which is still bad but this is a
pathological case.
The cut off at 20 uses was chosen by looking at other cut-offs in LLVM
for user scanning. It's probably too high, but does the job and is
very unlikely to regress anything.
Fixes PR33900.
* I'm impatient and aborted after 15 minutes, on the bug report it was
killed after 2h.
Pull in r308986 from upstream llvm trunk (by Simon Pilgrim):
[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914)
D35067/rL308322 attempted to support up to 4 load pairs for memcmp
inlining which resulted in regressions for some optimized libc memcmp
implementations (PR33914).
Until we can match these more optimal cases, this patch reduces the
memcmp expansion to a maximum of 2 load pairs (which matches what we
do for -Os).
This patch should be considered for the 5.0.0 release branch as well
Differential Revision: https://reviews.llvm.org/D35830
These fix a hang (or extremely long compile time) when building older
LLVM ports.
Reported by: antoine
PR: 219139
Notes:
svn path=/head/; revision=321664
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=321238
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=320970
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=320572
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=320397
|
|
|
|
| |
Notes:
svn path=/projects/clang500-import/; revision=320053
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=320041
|
|
|
|
|
|
|
| |
build glue.
Notes:
svn path=/projects/clang500-import/; revision=319799
|