diff options
author | Ed Schouten <ed@FreeBSD.org> | 2009-06-02 17:52:33 +0000 |
---|---|---|
committer | Ed Schouten <ed@FreeBSD.org> | 2009-06-02 17:52:33 +0000 |
commit | 009b1c42aa6266385f2c37e227516b24077e6dd7 (patch) | |
tree | 64ba909838c23261cace781ece27d106134ea451 /lib/Target/X86/README-FPStack.txt | |
download | src-009b1c42aa6266385f2c37e227516b24077e6dd7.tar.gz src-009b1c42aa6266385f2c37e227516b24077e6dd7.zip |
Notes
Diffstat (limited to 'lib/Target/X86/README-FPStack.txt')
-rw-r--r-- | lib/Target/X86/README-FPStack.txt | 85 |
1 files changed, 85 insertions, 0 deletions
diff --git a/lib/Target/X86/README-FPStack.txt b/lib/Target/X86/README-FPStack.txt new file mode 100644 index 000000000000..be28e8b394a4 --- /dev/null +++ b/lib/Target/X86/README-FPStack.txt @@ -0,0 +1,85 @@ +//===---------------------------------------------------------------------===// +// Random ideas for the X86 backend: FP stack related stuff +//===---------------------------------------------------------------------===// + +//===---------------------------------------------------------------------===// + +Some targets (e.g. athlons) prefer freep to fstp ST(0): +http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html + +//===---------------------------------------------------------------------===// + +This should use fiadd on chips where it is profitable: +double foo(double P, int *I) { return P+*I; } + +We have fiadd patterns now but the followings have the same cost and +complexity. We need a way to specify the later is more profitable. + +def FpADD32m : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW, + [(set RFP:$dst, (fadd RFP:$src1, + (extloadf64f32 addr:$src2)))]>; + // ST(0) = ST(0) + [mem32] + +def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW, + [(set RFP:$dst, (fadd RFP:$src1, + (X86fild addr:$src2, i32)))]>; + // ST(0) = ST(0) + [mem32int] + +//===---------------------------------------------------------------------===// + +The FP stackifier needs to be global. Also, it should handle simple permutates +to reduce number of shuffle instructions, e.g. turning: + +fld P -> fld Q +fld Q fld P +fxch + +or: + +fxch -> fucomi +fucomi jl X +jg X + +Ideas: +http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html + + +//===---------------------------------------------------------------------===// + +Add a target specific hook to DAG combiner to handle SINT_TO_FP and +FP_TO_SINT when the source operand is already in memory. + +//===---------------------------------------------------------------------===// + +Open code rint,floor,ceil,trunc: +http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html +http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html + +Opencode the sincos[f] libcall. + +//===---------------------------------------------------------------------===// + +None of the FPStack instructions are handled in +X86RegisterInfo::foldMemoryOperand, which prevents the spiller from +folding spill code into the instructions. + +//===---------------------------------------------------------------------===// + +Currently the x86 codegen isn't very good at mixing SSE and FPStack +code: + +unsigned int foo(double x) { return x; } + +foo: + subl $20, %esp + movsd 24(%esp), %xmm0 + movsd %xmm0, 8(%esp) + fldl 8(%esp) + fisttpll (%esp) + movl (%esp), %eax + addl $20, %esp + ret + +This just requires being smarter when custom expanding fptoui. + +//===---------------------------------------------------------------------===// |