diff options
Diffstat (limited to 'contrib/gcc/doc/tree-ssa.texi')
-rw-r--r-- | contrib/gcc/doc/tree-ssa.texi | 1723 |
1 files changed, 1723 insertions, 0 deletions
diff --git a/contrib/gcc/doc/tree-ssa.texi b/contrib/gcc/doc/tree-ssa.texi new file mode 100644 index 0000000000000..4ed5a33f2329a --- /dev/null +++ b/contrib/gcc/doc/tree-ssa.texi @@ -0,0 +1,1723 @@ +@c Copyright (c) 2004, 2005 Free Software Foundation, Inc. +@c Free Software Foundation, Inc. +@c This is part of the GCC manual. +@c For copying conditions, see the file gcc.texi. + +@c --------------------------------------------------------------------- +@c Tree SSA +@c --------------------------------------------------------------------- + +@node Tree SSA +@chapter Analysis and Optimization of GIMPLE Trees +@cindex Tree SSA +@cindex Optimization infrastructure for GIMPLE + +GCC uses three main intermediate languages to represent the program +during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a +language-independent representation generated by each front end. It +is used to serve as an interface between the parser and optimizer. +GENERIC is a common representation that is able to represent programs +written in all the languages supported by GCC@. + +GIMPLE and RTL are used to optimize the program. GIMPLE is used for +target and language independent optimizations (e.g., inlining, +constant propagation, tail call elimination, redundancy elimination, +etc). Much like GENERIC, GIMPLE is a language independent, tree based +representation. However, it differs from GENERIC in that the GIMPLE +grammar is more restrictive: expressions contain no more than 3 +operands (except function calls), it has no control flow structures +and expressions with side-effects are only allowed on the right hand +side of assignments. See the chapter describing GENERIC and GIMPLE +for more details. + +This chapter describes the data structures and functions used in the +GIMPLE optimizers (also known as ``tree optimizers'' or ``middle +end''). In particular, it focuses on all the macros, data structures, +functions and programming constructs needed to implement optimization +passes for GIMPLE@. + +@menu +* GENERIC:: A high-level language-independent representation. +* GIMPLE:: A lower-level factored tree representation. +* Annotations:: Attributes for statements and variables. +* Statement Operands:: Variables referenced by GIMPLE statements. +* SSA:: Static Single Assignment representation. +* Alias analysis:: Representing aliased loads and stores. +@end menu + +@node GENERIC +@section GENERIC +@cindex GENERIC + +The purpose of GENERIC is simply to provide a language-independent way of +representing an entire function in trees. To this end, it was necessary to +add a few new tree codes to the back end, but most everything was already +there. If you can express it with the codes in @code{gcc/tree.def}, it's +GENERIC@. + +Early on, there was a great deal of debate about how to think about +statements in a tree IL@. In GENERIC, a statement is defined as any +expression whose value, if any, is ignored. A statement will always +have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a +non-statement expression may also have side effects. A +@code{CALL_EXPR}, for instance. + +It would be possible for some local optimizations to work on the +GENERIC form of a function; indeed, the adapted tree inliner works +fine on GENERIC, but the current compiler performs inlining after +lowering to GIMPLE (a restricted form described in the next section). +Indeed, currently the frontends perform this lowering before handing +off to @code{tree_rest_of_compilation}, but this seems inelegant. + +If necessary, a front end can use some language-dependent tree codes +in its GENERIC representation, so long as it provides a hook for +converting them to GIMPLE and doesn't expect them to work with any +(hypothetical) optimizers that run before the conversion to GIMPLE@. +The intermediate representation used while parsing C and C++ looks +very little like GENERIC, but the C and C++ gimplifier hooks are +perfectly happy to take it as input and spit out GIMPLE@. + +@node GIMPLE +@section GIMPLE +@cindex GIMPLE + +GIMPLE is a simplified subset of GENERIC for use in optimization. The +particular subset chosen (and the name) was heavily influenced by the +SIMPLE IL used by the McCAT compiler project at McGill University, +though we have made some different choices. For one thing, SIMPLE +doesn't support @code{goto}; a production compiler can't afford that +kind of restriction. + +GIMPLE retains much of the structure of the parse trees: lexical +scopes are represented as containers, rather than markers. However, +expressions are broken down into a 3-address form, using temporary +variables to hold intermediate values. Also, control structures are +lowered to gotos. + +In GIMPLE no container node is ever used for its value; if a +@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a +temporary within the controlled blocks, and that temporary is used in +place of the container. + +The compiler pass which lowers GENERIC to GIMPLE is referred to as the +@samp{gimplifier}. The gimplifier works recursively, replacing complex +statements with sequences of simple statements. + +@c Currently, the only way to +@c tell whether or not an expression is in GIMPLE form is by recursively +@c examining it; in the future there will probably be a flag to help avoid +@c redundant work. FIXME FIXME + +@menu +* Interfaces:: +* Temporaries:: +* GIMPLE Expressions:: +* Statements:: +* GIMPLE Example:: +* Rough GIMPLE Grammar:: +@end menu + +@node Interfaces +@subsection Interfaces +@cindex gimplification + +The tree representation of a function is stored in +@code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to +@code{gimplify_function_tree}. + +If a front end wants to include language-specific tree codes in the tree +representation which it provides to the back end, it must provide a +definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to +convert the front end trees to GIMPLE@. Usually such a hook will involve +much of the same code for expanding front end trees to RTL@. This function +can return fully lowered GIMPLE, or it can return GENERIC trees and let the +main gimplifier lower them the rest of the way; this is often simpler. +GIMPLE that is not fully lowered is known as ``high GIMPLE'' and +consists of the IL before the pass @code{pass_lower_cf}. High GIMPLE +still contains lexical scopes and nested expressions, while low GIMPLE +exposes all of the implicit jumps for control expressions like +@code{COND_EXPR}. + +The C and C++ front ends currently convert directly from front end +trees to GIMPLE, and hand that off to the back end rather than first +converting to GENERIC@. Their gimplifier hooks know about all the +@code{_STMT} nodes and how to convert them to GENERIC forms. There +was some work done on a genericization pass which would run first, but +the existence of @code{STMT_EXPR} meant that in order to convert all +of the C statements into GENERIC equivalents would involve walking the +entire tree anyway, so it was simpler to lower all the way. This +might change in the future if someone writes an optimization pass +which would work better with higher-level trees, but currently the +optimizers all expect GIMPLE@. + +A front end which wants to use the tree optimizers (and already has +some sort of whole-function tree representation) only needs to provide +a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call +@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to +@code{tree_rest_of_compilation} to compile and output the function. + +You can tell the compiler to dump a C-like representation of the GIMPLE +form with the flag @option{-fdump-tree-gimple}. + +@node Temporaries +@subsection Temporaries +@cindex Temporaries + +When gimplification encounters a subexpression which is too complex, it +creates a new temporary variable to hold the value of the subexpression, +and adds a new statement to initialize it before the current statement. +These special temporaries are known as @samp{expression temporaries}, and are +allocated using @code{get_formal_tmp_var}. The compiler tries to +always evaluate identical expressions into the same temporary, to simplify +elimination of redundant calculations. + +We can only use expression temporaries when we know that it will not be +reevaluated before its value is used, and that it will not be otherwise +modified@footnote{These restrictions are derived from those in Morgan 4.8.}. +Other temporaries can be allocated using +@code{get_initialized_tmp_var} or @code{create_tmp_var}. + +Currently, an expression like @code{a = b + 5} is not reduced any +further. We tried converting it to something like +@smallexample + T1 = b + 5; + a = T1; +@end smallexample +but this bloated the representation for minimal benefit. However, a +variable which must live in memory cannot appear in an expression; its +value is explicitly loaded into a temporary first. Similarly, storing +the value of an expression to a memory variable goes through a +temporary. + +@node GIMPLE Expressions +@subsection Expressions +@cindex GIMPLE Expressions + +In general, expressions in GIMPLE consist of an operation and the +appropriate number of simple operands; these operands must either be a +GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register +variable. More complex operands are factored out into temporaries, so +that +@smallexample + a = b + c + d +@end smallexample +becomes +@smallexample + T1 = b + c; + a = T1 + d; +@end smallexample + +The same rule holds for arguments to a @code{CALL_EXPR}. + +The target of an assignment is usually a variable, but can also be an +@code{INDIRECT_REF} or a compound lvalue as described below. + +@menu +* Compound Expressions:: +* Compound Lvalues:: +* Conditional Expressions:: +* Logical Operators:: +@end menu + +@node Compound Expressions +@subsubsection Compound Expressions +@cindex Compound Expressions + +The left-hand side of a C comma expression is simply moved into a separate +statement. + +@node Compound Lvalues +@subsubsection Compound Lvalues +@cindex Compound Lvalues + +Currently compound lvalues involving array and structure field references +are not broken down; an expression like @code{a.b[2] = 42} is not reduced +any further (though complex array subscripts are). This restriction is a +workaround for limitations in later optimizers; if we were to convert this +to + +@smallexample + T1 = &a.b; + T1[2] = 42; +@end smallexample + +alias analysis would not remember that the reference to @code{T1[2]} came +by way of @code{a.b}, so it would think that the assignment could alias +another member of @code{a}; this broke @code{struct-alias-1.c}. Future +optimizer improvements may make this limitation unnecessary. + +@node Conditional Expressions +@subsubsection Conditional Expressions +@cindex Conditional Expressions + +A C @code{?:} expression is converted into an @code{if} statement with +each branch assigning to the same temporary. So, + +@smallexample + a = b ? c : d; +@end smallexample +becomes +@smallexample + if (b) + T1 = c; + else + T1 = d; + a = T1; +@end smallexample + +Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate. +It is used to vectorize loops with conditions using vector conditional operations. + +Note that in GIMPLE, @code{if} statements are also represented using +@code{COND_EXPR}, as described below. + +@node Logical Operators +@subsubsection Logical Operators +@cindex Logical Operators + +Except when they appear in the condition operand of a @code{COND_EXPR}, +logical `and' and `or' operators are simplified as follows: +@code{a = b && c} becomes + +@smallexample + T1 = (bool)b; + if (T1) + T1 = (bool)c; + a = T1; +@end smallexample + +Note that @code{T1} in this example cannot be an expression temporary, +because it has two different assignments. + +@node Statements +@subsection Statements +@cindex Statements + +Most statements will be assignment statements, represented by +@code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can +also be a statement. No other C expressions can appear at statement level; +a reference to a volatile object is converted into a @code{MODIFY_EXPR}. +In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful. Instead, use type +of LHS or RHS@. + +There are also several varieties of complex statements. + +@menu +* Blocks:: +* Statement Sequences:: +* Empty Statements:: +* Loops:: +* Selection Statements:: +* Jumps:: +* Cleanups:: +* GIMPLE Exception Handling:: +@end menu + +@node Blocks +@subsubsection Blocks +@cindex Blocks + +Block scopes and the variables they declare in GENERIC and GIMPLE are +expressed using the @code{BIND_EXPR} code, which in previous versions of +GCC was primarily used for the C statement-expression extension. + +Variables in a block are collected into @code{BIND_EXPR_VARS} in +declaration order. Any runtime initialization is moved out of +@code{DECL_INITIAL} and into a statement in the controlled block. When +gimplifying from C or C++, this initialization replaces the +@code{DECL_STMT}. + +Variable-length arrays (VLAs) complicate this process, as their size often +refers to variables initialized earlier in the block. To handle this, we +currently split the block at that point, and move the VLA into a new, inner +@code{BIND_EXPR}. This strategy may change in the future. + +@code{DECL_SAVED_TREE} for a GIMPLE function will always be a +@code{BIND_EXPR} which contains declarations for the temporary variables +used in the function. + +A C++ program will usually contain more @code{BIND_EXPR}s than there are +syntactic blocks in the source code, since several C++ constructs have +implicit scopes associated with them. On the other hand, although the C++ +front end uses pseudo-scopes to handle cleanups for objects with +destructors, these don't translate into the GIMPLE form; multiple +declarations at the same level use the same @code{BIND_EXPR}. + +@node Statement Sequences +@subsubsection Statement Sequences +@cindex Statement Sequences + +Multiple statements at the same nesting level are collected into a +@code{STATEMENT_LIST}. Statement lists are modified and traversed +using the interface in @samp{tree-iterator.h}. + +@node Empty Statements +@subsubsection Empty Statements +@cindex Empty Statements + +Whenever possible, statements with no effect are discarded. But if they +are nested within another construct which cannot be discarded for some +reason, they are instead replaced with an empty statement, generated by +@code{build_empty_stmt}. Initially, all empty statements were shared, +after the pattern of the Java front end, but this caused a lot of trouble in +practice. + +An empty statement is represented as @code{(void)0}. + +@node Loops +@subsubsection Loops +@cindex Loops + +At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but +now they are lowered to explicit gotos. + +@node Selection Statements +@subsubsection Selection Statements +@cindex Selection Statements + +A simple selection statement, such as the C @code{if} statement, is +expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is +used, the other is filled with an empty statement. + +Normally, the condition expression is reduced to a simple comparison. If +it is a shortcut (@code{&&} or @code{||}) expression, however, we try to +break up the @code{if} into multiple @code{if}s so that the implied shortcut +is taken directly, much like the transformation done by @code{do_jump} in +the RTL expander. + +A @code{SWITCH_EXPR} in GIMPLE contains the condition and a +@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values +and corresponding @code{LABEL_DECL}s to jump to. The body of the +@code{switch} is moved after the @code{SWITCH_EXPR}. + +@node Jumps +@subsubsection Jumps +@cindex Jumps + +Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}. + +The operand of a @code{GOTO_EXPR} must be either a label or a variable +containing the address to jump to. + +The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE}, +@code{RESULT_DECL}, or a @code{MODIFY_EXPR} which sets the return value. It +would be nice to move the @code{MODIFY_EXPR} into a separate statement, but the +special return semantics in @code{expand_return} make that difficult. It may +still happen in the future, perhaps by moving most of that logic into +@code{expand_assignment}. + +@node Cleanups +@subsubsection Cleanups +@cindex Cleanups + +Destructors for local C++ objects and similar dynamic cleanups are +represented in GIMPLE by a @code{TRY_FINALLY_EXPR}. +@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence +of statements to execute. The first sequence is executed. When it +completes the second sequence is executed. + +The first sequence may complete in the following ways: + +@enumerate + +@item Execute the last statement in the sequence and fall off the +end. + +@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary +label outside the sequence. + +@item Execute a return statement (@code{RETURN_EXPR}). + +@item Throw an exception. This is currently not explicitly represented in +GIMPLE. + +@end enumerate + +The second sequence is not executed if the first sequence completes by +calling @code{setjmp} or @code{exit} or any other function that does +not return. The second sequence is also not executed if the first +sequence completes via a non-local goto or a computed goto (in general +the compiler does not know whether such a goto statement exits the +first sequence or not, so we assume that it doesn't). + +After the second sequence is executed, if it completes normally by +falling off the end, execution continues wherever the first sequence +would have continued, by falling off the end, or doing a goto, etc. + +@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup +needs to appear on every edge out of the controlled block; this +reduces the freedom to move code across these edges. Therefore, the +EH lowering pass which runs before most of the optimization passes +eliminates these expressions by explicitly adding the cleanup to each +edge. Rethrowing the exception is represented using @code{RESX_EXPR}. + + +@node GIMPLE Exception Handling +@subsubsection Exception Handling +@cindex GIMPLE Exception Handling + +Other exception handling constructs are represented using +@code{TRY_CATCH_EXPR}. @code{TRY_CATCH_EXPR} has two operands. The +first operand is a sequence of statements to execute. If executing +these statements does not throw an exception, then the second operand +is ignored. Otherwise, if an exception is thrown, then the second +operand of the @code{TRY_CATCH_EXPR} is checked. The second operand +may have the following forms: + +@enumerate + +@item A sequence of statements to execute. When an exception occurs, +these statements are executed, and then the exception is rethrown. + +@item A sequence of @code{CATCH_EXPR} expressions. Each @code{CATCH_EXPR} +has a list of applicable exception types and handler code. If the +thrown exception matches one of the caught types, the associated +handler code is executed. If the handler code falls off the bottom, +execution continues after the original @code{TRY_CATCH_EXPR}. + +@item An @code{EH_FILTER_EXPR} expression. This has a list of +permitted exception types, and code to handle a match failure. If the +thrown exception does not match one of the allowed types, the +associated match failure code is executed. If the thrown exception +does match, it continues unwinding the stack looking for the next +handler. + +@end enumerate + +Currently throwing an exception is not directly represented in GIMPLE, +since it is implemented by calling a function. At some point in the future +we will want to add some way to express that the call will throw an +exception of a known type. + +Just before running the optimizers, the compiler lowers the high-level +EH constructs above into a set of @samp{goto}s, magic labels, and EH +regions. Continuing to unwind at the end of a cleanup is represented +with a @code{RESX_EXPR}. + +@node GIMPLE Example +@subsection GIMPLE Example +@cindex GIMPLE Example + +@smallexample +struct A @{ A(); ~A(); @}; + +int i; +int g(); +void f() +@{ + A a; + int j = (--i, i ? 0 : 1); + + for (int x = 42; x > 0; --x) + @{ + i += g()*4 + 32; + @} +@} +@end smallexample + +becomes + +@smallexample +void f() +@{ + int i.0; + int T.1; + int iftmp.2; + int T.3; + int T.4; + int T.5; + int T.6; + + @{ + struct A a; + int j; + + __comp_ctor (&a); + try + @{ + i.0 = i; + T.1 = i.0 - 1; + i = T.1; + i.0 = i; + if (i.0 == 0) + iftmp.2 = 1; + else + iftmp.2 = 0; + j = iftmp.2; + @{ + int x; + + x = 42; + goto test; + loop:; + + T.3 = g (); + T.4 = T.3 * 4; + i.0 = i; + T.5 = T.4 + i.0; + T.6 = T.5 + 32; + i = T.6; + x = x - 1; + + test:; + if (x > 0) + goto loop; + else + goto break_; + break_:; + @} + @} + finally + @{ + __comp_dtor (&a); + @} + @} +@} +@end smallexample + +@node Rough GIMPLE Grammar +@subsection Rough GIMPLE Grammar +@cindex Rough GIMPLE Grammar + +@smallexample + function : FUNCTION_DECL + DECL_SAVED_TREE -> compound-stmt + + compound-stmt: STATEMENT_LIST + members -> stmt + + stmt : block + | if-stmt + | switch-stmt + | goto-stmt + | return-stmt + | resx-stmt + | label-stmt + | try-stmt + | modify-stmt + | call-stmt + + block : BIND_EXPR + BIND_EXPR_VARS -> chain of DECLs + BIND_EXPR_BLOCK -> BLOCK + BIND_EXPR_BODY -> compound-stmt + + if-stmt : COND_EXPR + op0 -> condition + op1 -> compound-stmt + op2 -> compound-stmt + + switch-stmt : SWITCH_EXPR + op0 -> val + op1 -> NULL + op2 -> TREE_VEC of CASE_LABEL_EXPRs + The CASE_LABEL_EXPRs are sorted by CASE_LOW, + and default is last. + + goto-stmt : GOTO_EXPR + op0 -> LABEL_DECL | val + + return-stmt : RETURN_EXPR + op0 -> return-value + + return-value : NULL + | RESULT_DECL + | MODIFY_EXPR + op0 -> RESULT_DECL + op1 -> lhs + + resx-stmt : RESX_EXPR + + label-stmt : LABEL_EXPR + op0 -> LABEL_DECL + + try-stmt : TRY_CATCH_EXPR + op0 -> compound-stmt + op1 -> handler + | TRY_FINALLY_EXPR + op0 -> compound-stmt + op1 -> compound-stmt + + handler : catch-seq + | EH_FILTER_EXPR + | compound-stmt + + catch-seq : STATEMENT_LIST + members -> CATCH_EXPR + + modify-stmt : MODIFY_EXPR + op0 -> lhs + op1 -> rhs + + call-stmt : CALL_EXPR + op0 -> val | OBJ_TYPE_REF + op1 -> call-arg-list + + call-arg-list: TREE_LIST + members -> lhs | CONST + + addr-expr-arg: ID + | compref + + addressable : addr-expr-arg + | indirectref + + with-size-arg: addressable + | call-stmt + + indirectref : INDIRECT_REF + op0 -> val + + lhs : addressable + | bitfieldref + | WITH_SIZE_EXPR + op0 -> with-size-arg + op1 -> val + + min-lval : ID + | indirectref + + bitfieldref : BIT_FIELD_REF + op0 -> inner-compref + op1 -> CONST + op2 -> var + + compref : inner-compref + | TARGET_MEM_REF + op0 -> ID + op1 -> val + op2 -> val + op3 -> CONST + op4 -> CONST + | REALPART_EXPR + op0 -> inner-compref + | IMAGPART_EXPR + op0 -> inner-compref + + inner-compref: min-lval + | COMPONENT_REF + op0 -> inner-compref + op1 -> FIELD_DECL + op2 -> val + | ARRAY_REF + op0 -> inner-compref + op1 -> val + op2 -> val + op3 -> val + | ARRAY_RANGE_REF + op0 -> inner-compref + op1 -> val + op2 -> val + op3 -> val + | VIEW_CONVERT_EXPR + op0 -> inner-compref + + condition : val + | RELOP + op0 -> val + op1 -> val + + val : ID + | CONST + + rhs : lhs + | CONST + | call-stmt + | ADDR_EXPR + op0 -> addr-expr-arg + | UNOP + op0 -> val + | BINOP + op0 -> val + op1 -> val + | RELOP + op0 -> val + op1 -> val + | COND_EXPR + op0 -> condition + op1 -> val + op2 -> val +@end smallexample + +@node Annotations +@section Annotations +@cindex annotations + +The optimizers need to associate attributes with statements and +variables during the optimization process. For instance, we need to +know what basic block a statement belongs to or whether a variable +has aliases. All these attributes are stored in data structures +called annotations which are then linked to the field @code{ann} in +@code{struct tree_common}. + +Presently, we define annotations for statements (@code{stmt_ann_t}), +variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}). +Annotations are defined and documented in @file{tree-flow.h}. + + +@node Statement Operands +@section Statement Operands +@cindex operands +@cindex virtual operands +@cindex real operands +@findex update_stmt + +Almost every GIMPLE statement will contain a reference to a variable +or memory location. Since statements come in different shapes and +sizes, their operands are going to be located at various spots inside +the statement's tree. To facilitate access to the statement's +operands, they are organized into lists associated inside each +statement's annotation. Each element in an operand list is a pointer +to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. +This provides a very convenient way of examining and replacing +operands. + +Data flow analysis and optimization is done on all tree nodes +representing variables. Any node for which @code{SSA_VAR_P} returns +nonzero is considered when scanning statement operands. However, not +all @code{SSA_VAR_P} variables are processed in the same way. For the +purposes of optimization, we need to distinguish between references to +local scalar variables and references to globals, statics, structures, +arrays, aliased variables, etc. The reason is simple, the compiler +can gather complete data flow information for a local scalar. On the +other hand, a global variable may be modified by a function call, it +may not be possible to keep track of all the elements of an array or +the fields of a structure, etc. + +The operand scanner gathers two kinds of operands: @dfn{real} and +@dfn{virtual}. An operand for which @code{is_gimple_reg} returns true +is considered real, otherwise it is a virtual operand. We also +distinguish between uses and definitions. An operand is used if its +value is loaded by the statement (e.g., the operand at the RHS of an +assignment). If the statement assigns a new value to the operand, the +operand is considered a definition (e.g., the operand at the LHS of +an assignment). + +Virtual and real operands also have very different data flow +properties. Real operands are unambiguous references to the +full object that they represent. For instance, given + +@smallexample +@{ + int a, b; + a = b +@} +@end smallexample + +Since @code{a} and @code{b} are non-aliased locals, the statement +@code{a = b} will have one real definition and one real use because +variable @code{b} is completely modified with the contents of +variable @code{a}. Real definition are also known as @dfn{killing +definitions}. Similarly, the use of @code{a} reads all its bits. + +In contrast, virtual operands are used with variables that can have +a partial or ambiguous reference. This includes structures, arrays, +globals, and aliased variables. In these cases, we have two types of +definitions. For globals, structures, and arrays, we can determine from +a statement whether a variable of these types has a killing definition. +If the variable does, then the statement is marked as having a +@dfn{must definition} of that variable. However, if a statement is only +defining a part of the variable (i.e.@: a field in a structure), or if we +know that a statement might define the variable but we cannot say for sure, +then we mark that statement as having a @dfn{may definition}. For +instance, given + +@smallexample +@{ + int a, b, *p; + + if (...) + p = &a; + else + p = &b; + *p = 5; + return *p; +@} +@end smallexample + +The assignment @code{*p = 5} may be a definition of @code{a} or +@code{b}. If we cannot determine statically where @code{p} is +pointing to at the time of the store operation, we create virtual +definitions to mark that statement as a potential definition site for +@code{a} and @code{b}. Memory loads are similarly marked with virtual +use operands. Virtual operands are shown in tree dumps right before +the statement that contains them. To request a tree dump with virtual +operands, use the @option{-vops} option to @option{-fdump-tree}: + +@smallexample +@{ + int a, b, *p; + + if (...) + p = &a; + else + p = &b; + # a = V_MAY_DEF <a> + # b = V_MAY_DEF <b> + *p = 5; + + # VUSE <a> + # VUSE <b> + return *p; +@} +@end smallexample + +Notice that @code{V_MAY_DEF} operands have two copies of the referenced +variable. This indicates that this is not a killing definition of +that variable. In this case we refer to it as a @dfn{may definition} +or @dfn{aliased store}. The presence of the second copy of the +variable in the @code{V_MAY_DEF} operand will become important when the +function is converted into SSA form. This will be used to link all +the non-killing definitions to prevent optimizations from making +incorrect assumptions about them. + +Operands are updated as soon as the statement is finished via a call +to @code{update_stmt}. If statement elements are changed via +@code{SET_USE} or @code{SET_DEF}, then no further action is required +(i.e., those macros take care of updating the statement). If changes +are made by manipulating the statement's tree directly, then a call +must be made to @code{update_stmt} when complete. Calling one of the +@code{bsi_insert} routines or @code{bsi_replace} performs an implicit +call to @code{update_stmt}. + +@subsection Operand Iterators And Access Routines +@cindex Operand Iterators +@cindex Operand Access Routines + +Operands are collected by @file{tree-ssa-operands.c}. They are stored +inside each statement's annotation and can be accessed through either the +operand iterators or an access routine. + +The following access routines are available for examining operands: + +@enumerate +@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return +NULL unless there is exactly one operand matching the specified flags. If +there is exactly one operand, the operand is returned as either a @code{tree}, +@code{def_operand_p}, or @code{use_operand_p}. + +@smallexample +tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags); +use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES); +def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS); +@end smallexample + +@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no +operands matching the specified flags. + +@smallexample +if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS)) + return; +@end smallexample + +@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands +matching 'flags'. This actually executes a loop to perform the count, so +only use this if it is really needed. + +@smallexample +int count = NUM_SSA_OPERANDS (stmt, flags) +@end smallexample +@end enumerate + + +If you wish to iterate over some or all operands, use the +@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print +all the operands for a statement: + +@smallexample +void +print_ops (tree stmt) +@{ + ssa_op_iter; + tree var; + + FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS) + print_generic_expr (stderr, var, TDF_SLIM); +@} +@end smallexample + + +How to choose the appropriate iterator: + +@enumerate +@item Determine whether you are need to see the operand pointers, or just the + trees, and choose the appropriate macro: + +@smallexample +Need Macro: +---- ------- +use_operand_p FOR_EACH_SSA_USE_OPERAND +def_operand_p FOR_EACH_SSA_DEF_OPERAND +tree FOR_EACH_SSA_TREE_OPERAND +@end smallexample + +@item You need to declare a variable of the type you are interested + in, and an ssa_op_iter structure which serves as the loop + controlling variable. + +@item Determine which operands you wish to use, and specify the flags of + those you are interested in. They are documented in + @file{tree-ssa-operands.h}: + +@smallexample +#define SSA_OP_USE 0x01 /* @r{Real USE operands.} */ +#define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */ +#define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */ +#define SSA_OP_VMAYUSE 0x08 /* @r{USE portion of V_MAY_DEFS.} */ +#define SSA_OP_VMAYDEF 0x10 /* @r{DEF portion of V_MAY_DEFS.} */ +#define SSA_OP_VMUSTDEF 0x20 /* @r{V_MUST_DEF definitions.} */ + +/* @r{These are commonly grouped operand flags.} */ +#define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE | SSA_OP_VMAYUSE) +#define SSA_OP_VIRTUAL_DEFS (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF) +#define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE) +#define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF) +#define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS) +@end smallexample +@end enumerate + +So if you want to look at the use pointers for all the @code{USE} and +@code{VUSE} operands, you would do something like: + +@smallexample + use_operand_p use_p; + ssa_op_iter iter; + + FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE)) + @{ + process_use_ptr (use_p); + @} +@end smallexample + +The @code{TREE} macro is basically the same as the @code{USE} and +@code{DEF} macros, only with the use or def dereferenced via +@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we +aren't using operand pointers, use and defs flags can be mixed. + +@smallexample + tree var; + ssa_op_iter iter; + + FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF) + @{ + print_generic_expr (stderr, var, TDF_SLIM); + @} +@end smallexample + +@code{V_MAY_DEF}s are broken into two flags, one for the +@code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion +(@code{SSA_OP_VMAYUSE}). If all you want to look at are the +@code{V_MAY_DEF}s together, there is a fourth iterator macro for this, +which returns both a def_operand_p and a use_operand_p for each +@code{V_MAY_DEF} in the statement. Note that you don't need any flags for +this one. + +@smallexample + use_operand_p use_p; + def_operand_p def_p; + ssa_op_iter iter; + + FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter) + @{ + my_code; + @} +@end smallexample + +@code{V_MUST_DEF}s are broken into two flags, one for the +@code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion +(@code{SSA_OP_VMUSTKILL}). If all you want to look at are the +@code{V_MUST_DEF}s together, there is a fourth iterator macro for this, +which returns both a def_operand_p and a use_operand_p for each +@code{V_MUST_DEF} in the statement. Note that you don't need any flags for +this one. + +@smallexample + use_operand_p kill_p; + def_operand_p def_p; + ssa_op_iter iter; + + FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter) + @{ + my_code; + @} +@end smallexample + + +There are many examples in the code as well, as well as the +documentation in @file{tree-ssa-operands.h}. + +There are also a couple of variants on the stmt iterators regarding PHI +nodes. + +@code{FOR_EACH_PHI_ARG} Works exactly like +@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments +instead of statement operands. + +@smallexample +/* Look at every virtual PHI use. */ +FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES) +@{ + my_code; +@} + +/* Look at every real PHI use. */ +FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES) + my_code; + +/* Look at every every PHI use. */ +FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES) + my_code; +@end smallexample + +@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like +@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on +either a statement or a @code{PHI} node. These should be used when it is +appropriate but they are not quite as efficient as the individual +@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines. + +@smallexample +FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags) + @{ + my_code; + @} + +FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags) + @{ + my_code; + @} +@end smallexample + +@subsection Immediate Uses +@cindex Immediate Uses + +Immediate use information is now always available. Using the immediate use +iterators, you may examine every use of any @code{SSA_NAME}. For instance, +to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on +each stmt after that is done: + +@smallexample + use_operand_p imm_use_p; + imm_use_iterator iterator; + tree ssa_var, stmt; + + + FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) + @{ + FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) + SET_USE (imm_use_p, ssa_var_2); + fold_stmt (stmt); + @} +@end smallexample + +There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is +used when the immediate uses are not changed, i.e., you are looking at the +uses, but not setting them. + +If they do get changed, then care must be taken that things are not changed +under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and +@code{FOR_EACH_IMM_USE_ON_STMT} iterators. They attempt to preserve the +sanity of the use list by moving all the uses for a statement into +a controlled position, and then iterating over those uses. Then the +optimization can manipulate the stmt when all the uses have been +processed. This is a little slower than the FAST version since it adds a +placeholder element and must sort through the list a bit for each statement. +This placeholder element must be also be removed if the loop is +terminated early. The macro @code{BREAK_FROM_IMM_USE_SAFE} is provided +to do this : + +@smallexample + FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) + @{ + if (stmt == last_stmt) + BREAK_FROM_SAFE_IMM_USE (iter); + + FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) + SET_USE (imm_use_p, ssa_var_2); + fold_stmt (stmt); + @} +@end smallexample + +There are checks in @code{verify_ssa} which verify that the immediate use list +is up to date, as well as checking that an optimization didn't break from the +loop without using this macro. It is safe to simply 'break'; from a +@code{FOR_EACH_IMM_USE_FAST} traverse. + +Some useful functions and macros: +@enumerate +@item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of +@code{ssa_var}. +@item @code{has_single_use (ssa_var)} : Returns true if there is only a +single use of @code{ssa_var}. +@item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} : +Returns true if there is only a single use of @code{ssa_var}, and also returns +the use pointer and statement it occurs in in the second and third parameters. +@item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of +@code{ssa_var}. It is better not to use this if possible since it simply +utilizes a loop to count the uses. +@item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI} +node, return the index number for the use. An assert is triggered if the use +isn't located in a @code{PHI} node. +@item @code{USE_STMT (use_p)} : Return the statement a use occurs in. +@end enumerate + +Note that uses are not put into an immediate use list until their statement is +actually inserted into the instruction stream via a @code{bsi_*} routine. + +It is also still possible to utilize lazy updating of statements, but this +should be used only when absolutely required. Both alias analysis and the +dominator optimizations currently do this. + +When lazy updating is being used, the immediate use information is out of date +and cannot be used reliably. Lazy updating is achieved by simply marking +statements modified via calls to @code{mark_stmt_modified} instead of +@code{update_stmt}. When lazy updating is no longer required, all the +modified statements must have @code{update_stmt} called in order to bring them +up to date. This must be done before the optimization is finished, or +@code{verify_ssa} will trigger an abort. + +This is done with a simple loop over the instruction stream: +@smallexample + block_stmt_iterator bsi; + basic_block bb; + FOR_EACH_BB (bb) + @{ + for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi)) + update_stmt_if_modified (bsi_stmt (bsi)); + @} +@end smallexample + +@node SSA +@section Static Single Assignment +@cindex SSA +@cindex static single assignment + +Most of the tree optimizers rely on the data flow information provided +by the Static Single Assignment (SSA) form. We implement the SSA form +as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and +K. Zadeck. Efficiently Computing Static Single Assignment Form and the +Control Dependence Graph. ACM Transactions on Programming Languages +and Systems, 13(4):451-490, October 1991}. + +The SSA form is based on the premise that program variables are +assigned in exactly one location in the program. Multiple assignments +to the same variable create new versions of that variable. Naturally, +actual programs are seldom in SSA form initially because variables +tend to be assigned multiple times. The compiler modifies the program +representation so that every time a variable is assigned in the code, +a new version of the variable is created. Different versions of the +same variable are distinguished by subscripting the variable name with +its version number. Variables used in the right-hand side of +expressions are renamed so that their version number matches that of +the most recent assignment. + +We represent variable versions using @code{SSA_NAME} nodes. The +renaming process in @file{tree-ssa.c} wraps every real and +virtual operand with an @code{SSA_NAME} node which contains +the version number and the statement that created the +@code{SSA_NAME}. Only definitions and virtual definitions may +create new @code{SSA_NAME} nodes. + +Sometimes, flow of control makes it impossible to determine what is the +most recent version of a variable. In these cases, the compiler +inserts an artificial definition for that variable called +@dfn{PHI function} or @dfn{PHI node}. This new definition merges +all the incoming versions of the variable to create a new name +for it. For instance, + +@smallexample +if (...) + a_1 = 5; +else if (...) + a_2 = 2; +else + a_3 = 13; + +# a_4 = PHI <a_1, a_2, a_3> +return a_4; +@end smallexample + +Since it is not possible to determine which of the three branches +will be taken at runtime, we don't know which of @code{a_1}, +@code{a_2} or @code{a_3} to use at the return statement. So, the +SSA renamer creates a new version @code{a_4} which is assigned +the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. +Hence, PHI nodes mean ``one of these operands. I don't know +which''. + +The following macros can be used to examine PHI nodes + +@defmac PHI_RESULT (@var{phi}) +Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., +@var{phi}'s LHS)@. +@end defmac + +@defmac PHI_NUM_ARGS (@var{phi}) +Returns the number of arguments in @var{phi}. This number is exactly +the number of incoming edges to the basic block holding @var{phi}@. +@end defmac + +@defmac PHI_ARG_ELT (@var{phi}, @var{i}) +Returns a tuple representing the @var{i}th argument of @var{phi}@. +Each element of this tuple contains an @code{SSA_NAME} @var{var} and +the incoming edge through which @var{var} flows. +@end defmac + +@defmac PHI_ARG_EDGE (@var{phi}, @var{i}) +Returns the incoming edge for the @var{i}th argument of @var{phi}. +@end defmac + +@defmac PHI_ARG_DEF (@var{phi}, @var{i}) +Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. +@end defmac + + +@subsection Preserving the SSA form +@findex update_ssa +@cindex preserving SSA form +Some optimization passes make changes to the function that +invalidate the SSA property. This can happen when a pass has +added new symbols or changed the program so that variables that +were previously aliased aren't anymore. Whenever something like this +happens, the affected symbols must be renamed into SSA form again. +Transformations that emit new code or replicate existing statements +will also need to update the SSA form@. + +Since GCC implements two different SSA forms for register and virtual +variables, keeping the SSA form up to date depends on whether you are +updating register or virtual names. In both cases, the general idea +behind incremental SSA updates is similar: when new SSA names are +created, they typically are meant to replace other existing names in +the program@. + +For instance, given the following code: + +@smallexample + 1 L0: + 2 x_1 = PHI (0, x_5) + 3 if (x_1 < 10) + 4 if (x_1 > 7) + 5 y_2 = 0 + 6 else + 7 y_3 = x_1 + x_7 + 8 endif + 9 x_5 = x_1 + 1 + 10 goto L0; + 11 endif +@end smallexample + +Suppose that we insert new names @code{x_10} and @code{x_11} (lines +@code{4} and @code{8})@. + +@smallexample + 1 L0: + 2 x_1 = PHI (0, x_5) + 3 if (x_1 < 10) + 4 x_10 = ... + 5 if (x_1 > 7) + 6 y_2 = 0 + 7 else + 8 x_11 = ... + 9 y_3 = x_1 + x_7 + 10 endif + 11 x_5 = x_1 + 1 + 12 goto L0; + 13 endif +@end smallexample + +We want to replace all the uses of @code{x_1} with the new definitions +of @code{x_10} and @code{x_11}. Note that the only uses that should +be replaced are those at lines @code{5}, @code{9} and @code{11}. +Also, the use of @code{x_7} at line @code{9} should @emph{not} be +replaced (this is why we cannot just mark symbol @code{x} for +renaming)@. + +Additionally, we may need to insert a PHI node at line @code{11} +because that is a merge point for @code{x_10} and @code{x_11}. So the +use of @code{x_1} at line @code{11} will be replaced with the new PHI +node. The insertion of PHI nodes is optional. They are not strictly +necessary to preserve the SSA form, and depending on what the caller +inserted, they may not even be useful for the optimizers@. + +Updating the SSA form is a two step process. First, the pass has to +identify which names need to be updated and/or which symbols need to +be renamed into SSA form for the first time. When new names are +introduced to replace existing names in the program, the mapping +between the old and the new names are registered by calling +@code{register_new_name_mapping} (note that if your pass creates new +code by duplicating basic blocks, the call to @code{tree_duplicate_bb} +will set up the necessary mappings automatically). On the other hand, +if your pass exposes a new symbol that should be put in SSA form for +the first time, the new symbol should be registered with +@code{mark_sym_for_renaming}. + +After the replacement mappings have been registered and new symbols +marked for renaming, a call to @code{update_ssa} makes the registered +changes. This can be done with an explicit call or by creating +@code{TODO} flags in the @code{tree_opt_pass} structure for your pass. +There are several @code{TODO} flags that control the behavior of +@code{update_ssa}: + +@itemize @bullet +@item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes + for newly exposed symbols and virtual names marked for updating. + When updating real names, only insert PHI nodes for a real name + @code{O_j} in blocks reached by all the new and old definitions for + @code{O_j}. If the iterated dominance frontier for @code{O_j} + is not pruned, we may end up inserting PHI nodes in blocks that + have one or more edges with no incoming definition for + @code{O_j}. This would lead to uninitialized warnings for + @code{O_j}'s symbol@. + +@item @code{TODO_update_ssa_no_phi}. Update the SSA form without + inserting any new PHI nodes at all. This is used by passes that + have either inserted all the PHI nodes themselves or passes that + need only to patch use-def and def-def chains for virtuals + (e.g., DCE)@. + + +@item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere + they are needed. No pruning of the IDF is done. This is used + by passes that need the PHI nodes for @code{O_j} even if it + means that some arguments will come from the default definition + of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@. + + WARNING: If you need to use this flag, chances are that your + pass may be doing something wrong. Inserting PHI nodes for an + old name where not all edges carry a new replacement may lead to + silent codegen errors or spurious uninitialized warnings@. + +@item @code{TODO_update_ssa_only_virtuals}. Passes that update the + SSA form on their own may want to delegate the updating of + virtual names to the generic updater. Since FUD chains are + easier to maintain, this simplifies the work they need to do. + NOTE: If this flag is used, any OLD->NEW mappings for real names + are explicitly destroyed and only the symbols marked for + renaming are processed@. +@end itemize + +@subsection Preserving the virtual SSA form +@cindex preserving virtual SSA form + +The virtual SSA form is harder to preserve than the non-virtual SSA form +mainly because the set of virtual operands for a statement may change at +what some would consider unexpected times. In general, any time you +have modified a statement that has virtual operands, you should verify +whether the list of virtual operands has changed, and if so, mark the +newly exposed symbols by calling @code{mark_new_vars_to_rename}. + +There is one additional caveat to preserving virtual SSA form. When the +entire set of virtual operands may be eliminated due to better +disambiguation, a bare SMT will be added to the list of virtual +operands, to signify the non-visible aliases that the are still being +referenced. If the set of bare SMT's may change, +@code{TODO_update_smt_usage} should be added to the todo flags. + +With the current pruning code, this can only occur when constants are +propagated into array references that were previously non-constant, or +address expressions are propagated into their uses. + +@subsection Examining @code{SSA_NAME} nodes +@cindex examining SSA_NAMEs + +The following macros can be used to examine @code{SSA_NAME} nodes + +@defmac SSA_NAME_DEF_STMT (@var{var}) +Returns the statement @var{s} that creates the @code{SSA_NAME} +@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT +(@var{s})} returns @code{true}), it means that the first reference to +this variable is a USE or a VUSE@. +@end defmac + +@defmac SSA_NAME_VERSION (@var{var}) +Returns the version number of the @code{SSA_NAME} object @var{var}. +@end defmac + + +@subsection Walking use-def chains + +@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data}) + +Walks use-def chains starting at the @code{SSA_NAME} node @var{var}. +Calls function @var{fn} at each reaching definition found. Function +@var{FN} takes three arguments: @var{var}, its defining statement +(@var{def_stmt}) and a generic pointer to whatever state information +that @var{fn} may want to maintain (@var{data}). Function @var{fn} is +able to stop the walk by returning @code{true}, otherwise in order to +continue the walk, @var{fn} should return @code{false}. + +Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are +slightly different. For each argument @var{arg} of the PHI node, this +function will: + +@enumerate +@item Walk the use-def chains for @var{arg}. +@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}. +@end enumerate + +Note how the first argument to @var{fn} is no longer the original +variable @var{var}, but the PHI argument currently being examined. +If @var{fn} wants to get at @var{var}, it should call +@code{PHI_RESULT} (@var{phi}). +@end deftypefn + +@subsection Walking the dominator tree + +@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) + +This function walks the dominator tree for the current CFG calling a +set of callback functions defined in @var{struct dom_walk_data} in +@file{domwalk.h}. The call back functions you need to define give you +hooks to execute custom code at various points during traversal: + +@enumerate +@item Once to initialize any local data needed while processing + @var{bb} and its children. This local data is pushed into an + internal stack which is automatically pushed and popped as the + walker traverses the dominator tree. + +@item Once before traversing all the statements in the @var{bb}. + +@item Once for every statement inside @var{bb}. + +@item Once after traversing all the statements and before recursing + into @var{bb}'s dominator children. + +@item It then recurses into all the dominator children of @var{bb}. + +@item After recursing into all the dominator children of @var{bb} it + can, optionally, traverse every statement in @var{bb} again + (i.e., repeating steps 2 and 3). + +@item Once after walking the statements in @var{bb} and @var{bb}'s + dominator children. At this stage, the block local data stack + is popped. +@end enumerate +@end deftypefn + +@node Alias analysis +@section Alias analysis +@cindex alias +@cindex flow-sensitive alias analysis +@cindex flow-insensitive alias analysis + +Alias analysis proceeds in 4 main phases: + +@enumerate +@item Structural alias analysis. + +This phase walks the types for structure variables, and determines which +of the fields can overlap using offset and size of each field. For each +field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is +created, which represents that field as a separate variable. All +accesses that could possibly overlap with a given field will have +virtual operands for the SFT of that field. + +@smallexample +struct foo +@{ + int a; + int b; +@} +struct foo temp; +int bar (void) +@{ + int tmp1, tmp2, tmp3; + SFT.0_2 = V_MUST_DEF <SFT.0_1> + temp.a = 5; + SFT.1_4 = V_MUST_DEF <SFT.1_3> + temp.b = 6; + + VUSE <SFT.1_4> + tmp1_5 = temp.b; + VUSE <SFT.0_2> + tmp2_6 = temp.a; + + tmp3_7 = tmp1_5 + tmp2_6; + return tmp3_7; +@} +@end smallexample + +If you copy the symbol tag for a variable for some reason, you probably +also want to copy the subvariables for that variable. + +@item Points-to and escape analysis. + +This phase walks the use-def chains in the SSA web looking for +three things: + + @itemize @bullet + @item Assignments of the form @code{P_i = &VAR} + @item Assignments of the form P_i = malloc() + @item Pointers and ADDR_EXPR that escape the current function. + @end itemize + +The concept of `escaping' is the same one used in the Java world. +When a pointer or an ADDR_EXPR escapes, it means that it has been +exposed outside of the current function. So, assignment to +global variables, function arguments and returning a pointer are +all escape sites. + +This is where we are currently limited. Since not everything is +renamed into SSA, we lose track of escape properties when a +pointer is stashed inside a field in a structure, for instance. +In those cases, we are assuming that the pointer does escape. + +We use escape analysis to determine whether a variable is +call-clobbered. Simply put, if an ADDR_EXPR escapes, then the +variable is call-clobbered. If a pointer P_i escapes, then all +the variables pointed-to by P_i (and its memory tag) also escape. + +@item Compute flow-sensitive aliases + +We have two classes of memory tags. Memory tags associated with +the pointed-to data type of the pointers in the program. These +tags are called ``symbol memory tag'' (SMT)@. The other class are +those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@. +The basic idea is that when adding operands for an INDIRECT_REF +*P_i, we will first check whether P_i has a name tag, if it does +we use it, because that will have more precise aliasing +information. Otherwise, we use the standard symbol tag. + +In this phase, we go through all the pointers we found in +points-to analysis and create alias sets for the name memory tags +associated with each pointer P_i. If P_i escapes, we mark +call-clobbered the variables it points to and its tag. + + +@item Compute flow-insensitive aliases + +This pass will compare the alias set of every symbol memory tag and +every addressable variable found in the program. Given a symbol +memory tag SMT and an addressable variable V@. If the alias sets +of SMT and V conflict (as computed by may_alias_p), then V is +marked as an alias tag and added to the alias set of SMT@. +@end enumerate + +For instance, consider the following function: + +@smallexample +foo (int i) +@{ + int *p, *q, a, b; + + if (i > 10) + p = &a; + else + q = &b; + + *p = 3; + *q = 5; + a = b + 2; + return *p; +@} +@end smallexample + +After aliasing analysis has finished, the symbol memory tag for +pointer @code{p} will have two aliases, namely variables @code{a} and +@code{b}. +Every time pointer @code{p} is dereferenced, we want to mark the +operation as a potential reference to @code{a} and @code{b}. + +@smallexample +foo (int i) +@{ + int *p, a, b; + + if (i_2 > 10) + p_4 = &a; + else + p_6 = &b; + # p_1 = PHI <p_4(1), p_6(2)>; + + # a_7 = V_MAY_DEF <a_3>; + # b_8 = V_MAY_DEF <b_5>; + *p_1 = 3; + + # a_9 = V_MAY_DEF <a_7> + # VUSE <b_8> + a_9 = b_8 + 2; + + # VUSE <a_9>; + # VUSE <b_8>; + return *p_1; +@} +@end smallexample + +In certain cases, the list of may aliases for a pointer may grow +too large. This may cause an explosion in the number of virtual +operands inserted in the code. Resulting in increased memory +consumption and compilation time. + +When the number of virtual operands needed to represent aliased +loads and stores grows too large (configurable with @option{--param +max-aliased-vops}), alias sets are grouped to avoid severe +compile-time slow downs and memory consumption. The alias +grouping heuristic proceeds as follows: + +@enumerate +@item Sort the list of pointers in decreasing number of contributed +virtual operands. + +@item Take the first pointer from the list and reverse the role +of the memory tag and its aliases. Usually, whenever an +aliased variable Vi is found to alias with a memory tag +T, we add Vi to the may-aliases set for T@. Meaning that +after alias analysis, we will have: + +@smallexample +may-aliases(T) = @{ V1, V2, V3, ..., Vn @} +@end smallexample + +This means that every statement that references T, will get +@code{n} virtual operands for each of the Vi tags. But, when +alias grouping is enabled, we make T an alias tag and add it +to the alias set of all the Vi variables: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +... +may-aliases(Vn) = @{ T @} +@end smallexample + +This has two effects: (a) statements referencing T will only get +a single virtual operand, and, (b) all the variables Vi will now +appear to alias each other. So, we lose alias precision to +improve compile time. But, in theory, a program with such a high +level of aliasing should not be very optimizable in the first +place. + +@item Since variables may be in the alias set of more than one +memory tag, the grouping done in step (2) needs to be extended +to all the memory tags that have a non-empty intersection with +the may-aliases set of tag T@. For instance, if we originally +had these may-aliases sets: + +@smallexample +may-aliases(T) = @{ V1, V2, V3 @} +may-aliases(R) = @{ V2, V4 @} +@end smallexample + +In step (2) we would have reverted the aliases for T as: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +may-aliases(V3) = @{ T @} +@end smallexample + +But note that now V2 is no longer aliased with R@. We could +add R to may-aliases(V2), but we are in the process of +grouping aliases to reduce virtual operands so what we do is +add V4 to the grouping to obtain: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +may-aliases(V3) = @{ T @} +may-aliases(V4) = @{ T @} +@end smallexample + +@item If the total number of virtual operands due to aliasing is +still above the threshold set by max-alias-vops, go back to (2). +@end enumerate |