aboutsummaryrefslogtreecommitdiff
path: root/docs/tutorial/LangImpl4.html
diff options
context:
space:
mode:
Diffstat (limited to 'docs/tutorial/LangImpl4.html')
-rw-r--r--docs/tutorial/LangImpl4.html1152
1 files changed, 0 insertions, 1152 deletions
diff --git a/docs/tutorial/LangImpl4.html b/docs/tutorial/LangImpl4.html
deleted file mode 100644
index 5e9c65676c9e..000000000000
--- a/docs/tutorial/LangImpl4.html
+++ /dev/null
@@ -1,1152 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
- "http://www.w3.org/TR/html4/strict.dtd">
-
-<html>
-<head>
- <title>Kaleidoscope: Adding JIT and Optimizer Support</title>
- <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
- <meta name="author" content="Chris Lattner">
- <link rel="stylesheet" href="../_static/llvm.css" type="text/css">
-</head>
-
-<body>
-
-<h1>Kaleidoscope: Adding JIT and Optimizer Support</h1>
-
-<ul>
-<li><a href="index.html">Up to Tutorial Index</a></li>
-<li>Chapter 4
- <ol>
- <li><a href="#intro">Chapter 4 Introduction</a></li>
- <li><a href="#trivialconstfold">Trivial Constant Folding</a></li>
- <li><a href="#optimizerpasses">LLVM Optimization Passes</a></li>
- <li><a href="#jit">Adding a JIT Compiler</a></li>
- <li><a href="#code">Full Code Listing</a></li>
- </ol>
-</li>
-<li><a href="LangImpl5.html">Chapter 5</a>: Extending the Language: Control
-Flow</li>
-</ul>
-
-<div class="doc_author">
- <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
-</div>
-
-<!-- *********************************************************************** -->
-<h2><a name="intro">Chapter 4 Introduction</a></h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>Welcome to Chapter 4 of the "<a href="index.html">Implementing a language
-with LLVM</a>" tutorial. Chapters 1-3 described the implementation of a simple
-language and added support for generating LLVM IR. This chapter describes
-two new techniques: adding optimizer support to your language, and adding JIT
-compiler support. These additions will demonstrate how to get nice, efficient code
-for the Kaleidoscope language.</p>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2><a name="trivialconstfold">Trivial Constant Folding</a></h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>
-Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately,
-it does not produce wonderful code. The IRBuilder, however, does give us
-obvious optimizations when compiling simple code:</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>def test(x) 1+2+x;</b>
-Read function definition:
-define double @test(double %x) {
-entry:
- %addtmp = fadd double 3.000000e+00, %x
- ret double %addtmp
-}
-</pre>
-</div>
-
-<p>This code is not a literal transcription of the AST built by parsing the
-input. That would be:
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>def test(x) 1+2+x;</b>
-Read function definition:
-define double @test(double %x) {
-entry:
- %addtmp = fadd double 2.000000e+00, 1.000000e+00
- %addtmp1 = fadd double %addtmp, %x
- ret double %addtmp1
-}
-</pre>
-</div>
-
-<p>Constant folding, as seen above, in particular, is a very common and very
-important optimization: so much so that many language implementors implement
-constant folding support in their AST representation.</p>
-
-<p>With LLVM, you don't need this support in the AST. Since all calls to build
-LLVM IR go through the LLVM IR builder, the builder itself checked to see if
-there was a constant folding opportunity when you call it. If so, it just does
-the constant fold and return the constant instead of creating an instruction.
-
-<p>Well, that was easy :). In practice, we recommend always using
-<tt>IRBuilder</tt> when generating code like this. It has no
-"syntactic overhead" for its use (you don't have to uglify your compiler with
-constant checks everywhere) and it can dramatically reduce the amount of
-LLVM IR that is generated in some cases (particular for languages with a macro
-preprocessor or that use a lot of constants).</p>
-
-<p>On the other hand, the <tt>IRBuilder</tt> is limited by the fact
-that it does all of its analysis inline with the code as it is built. If you
-take a slightly more complex example:</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>def test(x) (1+2+x)*(x+(1+2));</b>
-ready> Read function definition:
-define double @test(double %x) {
-entry:
- %addtmp = fadd double 3.000000e+00, %x
- %addtmp1 = fadd double %x, 3.000000e+00
- %multmp = fmul double %addtmp, %addtmp1
- ret double %multmp
-}
-</pre>
-</div>
-
-<p>In this case, the LHS and RHS of the multiplication are the same value. We'd
-really like to see this generate "<tt>tmp = x+3; result = tmp*tmp;</tt>" instead
-of computing "<tt>x+3</tt>" twice.</p>
-
-<p>Unfortunately, no amount of local analysis will be able to detect and correct
-this. This requires two transformations: reassociation of expressions (to
-make the add's lexically identical) and Common Subexpression Elimination (CSE)
-to delete the redundant add instruction. Fortunately, LLVM provides a broad
-range of optimizations that you can use, in the form of "passes".</p>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2><a name="optimizerpasses">LLVM Optimization Passes</a></h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>LLVM provides many optimization passes, which do many different sorts of
-things and have different tradeoffs. Unlike other systems, LLVM doesn't hold
-to the mistaken notion that one set of optimizations is right for all languages
-and for all situations. LLVM allows a compiler implementor to make complete
-decisions about what optimizations to use, in which order, and in what
-situation.</p>
-
-<p>As a concrete example, LLVM supports both "whole module" passes, which look
-across as large of body of code as they can (often a whole file, but if run
-at link time, this can be a substantial portion of the whole program). It also
-supports and includes "per-function" passes which just operate on a single
-function at a time, without looking at other functions. For more information
-on passes and how they are run, see the <a href="../WritingAnLLVMPass.html">How
-to Write a Pass</a> document and the <a href="../Passes.html">List of LLVM
-Passes</a>.</p>
-
-<p>For Kaleidoscope, we are currently generating functions on the fly, one at
-a time, as the user types them in. We aren't shooting for the ultimate
-optimization experience in this setting, but we also want to catch the easy and
-quick stuff where possible. As such, we will choose to run a few per-function
-optimizations as the user types the function in. If we wanted to make a "static
-Kaleidoscope compiler", we would use exactly the code we have now, except that
-we would defer running the optimizer until the entire file has been parsed.</p>
-
-<p>In order to get per-function optimizations going, we need to set up a
-<a href="../WritingAnLLVMPass.html#passmanager">FunctionPassManager</a> to hold and
-organize the LLVM optimizations that we want to run. Once we have that, we can
-add a set of optimizations to run. The code looks like this:</p>
-
-<div class="doc_code">
-<pre>
- FunctionPassManager OurFPM(TheModule);
-
- // Set up the optimizer pipeline. Start with registering info about how the
- // target lays out data structures.
- OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
- // Provide basic AliasAnalysis support for GVN.
- OurFPM.add(createBasicAliasAnalysisPass());
- // Do simple "peephole" optimizations and bit-twiddling optzns.
- OurFPM.add(createInstructionCombiningPass());
- // Reassociate expressions.
- OurFPM.add(createReassociatePass());
- // Eliminate Common SubExpressions.
- OurFPM.add(createGVNPass());
- // Simplify the control flow graph (deleting unreachable blocks, etc).
- OurFPM.add(createCFGSimplificationPass());
-
- OurFPM.doInitialization();
-
- // Set the global so the code gen can use this.
- TheFPM = &amp;OurFPM;
-
- // Run the main "interpreter loop" now.
- MainLoop();
-</pre>
-</div>
-
-<p>This code defines a <tt>FunctionPassManager</tt>, "<tt>OurFPM</tt>". It
-requires a pointer to the <tt>Module</tt> to construct itself. Once it is set
-up, we use a series of "add" calls to add a bunch of LLVM passes. The first
-pass is basically boilerplate, it adds a pass so that later optimizations know
-how the data structures in the program are laid out. The
-"<tt>TheExecutionEngine</tt>" variable is related to the JIT, which we will get
-to in the next section.</p>
-
-<p>In this case, we choose to add 4 optimization passes. The passes we chose
-here are a pretty standard set of "cleanup" optimizations that are useful for
-a wide variety of code. I won't delve into what they do but, believe me,
-they are a good starting place :).</p>
-
-<p>Once the PassManager is set up, we need to make use of it. We do this by
-running it after our newly created function is constructed (in
-<tt>FunctionAST::Codegen</tt>), but before it is returned to the client:</p>
-
-<div class="doc_code">
-<pre>
- if (Value *RetVal = Body->Codegen()) {
- // Finish off the function.
- Builder.CreateRet(RetVal);
-
- // Validate the generated code, checking for consistency.
- verifyFunction(*TheFunction);
-
- <b>// Optimize the function.
- TheFPM-&gt;run(*TheFunction);</b>
-
- return TheFunction;
- }
-</pre>
-</div>
-
-<p>As you can see, this is pretty straightforward. The
-<tt>FunctionPassManager</tt> optimizes and updates the LLVM Function* in place,
-improving (hopefully) its body. With this in place, we can try our test above
-again:</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>def test(x) (1+2+x)*(x+(1+2));</b>
-ready> Read function definition:
-define double @test(double %x) {
-entry:
- %addtmp = fadd double %x, 3.000000e+00
- %multmp = fmul double %addtmp, %addtmp
- ret double %multmp
-}
-</pre>
-</div>
-
-<p>As expected, we now get our nicely optimized code, saving a floating point
-add instruction from every execution of this function.</p>
-
-<p>LLVM provides a wide variety of optimizations that can be used in certain
-circumstances. Some <a href="../Passes.html">documentation about the various
-passes</a> is available, but it isn't very complete. Another good source of
-ideas can come from looking at the passes that <tt>Clang</tt> runs to get
-started. The "<tt>opt</tt>" tool allows you to experiment with passes from the
-command line, so you can see if they do anything.</p>
-
-<p>Now that we have reasonable code coming out of our front-end, lets talk about
-executing it!</p>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2><a name="jit">Adding a JIT Compiler</a></h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>Code that is available in LLVM IR can have a wide variety of tools
-applied to it. For example, you can run optimizations on it (as we did above),
-you can dump it out in textual or binary forms, you can compile the code to an
-assembly file (.s) for some target, or you can JIT compile it. The nice thing
-about the LLVM IR representation is that it is the "common currency" between
-many different parts of the compiler.
-</p>
-
-<p>In this section, we'll add JIT compiler support to our interpreter. The
-basic idea that we want for Kaleidoscope is to have the user enter function
-bodies as they do now, but immediately evaluate the top-level expressions they
-type in. For example, if they type in "1 + 2;", we should evaluate and print
-out 3. If they define a function, they should be able to call it from the
-command line.</p>
-
-<p>In order to do this, we first declare and initialize the JIT. This is done
-by adding a global variable and a call in <tt>main</tt>:</p>
-
-<div class="doc_code">
-<pre>
-<b>static ExecutionEngine *TheExecutionEngine;</b>
-...
-int main() {
- ..
- <b>// Create the JIT. This takes ownership of the module.
- TheExecutionEngine = EngineBuilder(TheModule).create();</b>
- ..
-}
-</pre>
-</div>
-
-<p>This creates an abstract "Execution Engine" which can be either a JIT
-compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler
-for you if one is available for your platform, otherwise it will fall back to
-the interpreter.</p>
-
-<p>Once the <tt>ExecutionEngine</tt> is created, the JIT is ready to be used.
-There are a variety of APIs that are useful, but the simplest one is the
-"<tt>getPointerToFunction(F)</tt>" method. This method JIT compiles the
-specified LLVM Function and returns a function pointer to the generated machine
-code. In our case, this means that we can change the code that parses a
-top-level expression to look like this:</p>
-
-<div class="doc_code">
-<pre>
-static void HandleTopLevelExpression() {
- // Evaluate a top-level expression into an anonymous function.
- if (FunctionAST *F = ParseTopLevelExpr()) {
- if (Function *LF = F-&gt;Codegen()) {
- LF->dump(); // Dump the function for exposition purposes.
-
- <b>// JIT the function, returning a function pointer.
- void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
-
- // Cast it to the right type (takes no arguments, returns a double) so we
- // can call it as a native function.
- double (*FP)() = (double (*)())(intptr_t)FPtr;
- fprintf(stderr, "Evaluated to %f\n", FP());</b>
- }
-</pre>
-</div>
-
-<p>Recall that we compile top-level expressions into a self-contained LLVM
-function that takes no arguments and returns the computed double. Because the
-LLVM JIT compiler matches the native platform ABI, this means that you can just
-cast the result pointer to a function pointer of that type and call it directly.
-This means, there is no difference between JIT compiled code and native machine
-code that is statically linked into your application.</p>
-
-<p>With just these two changes, lets see how Kaleidoscope works now!</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>4+5;</b>
-Read top-level expression:
-define double @0() {
-entry:
- ret double 9.000000e+00
-}
-
-<em>Evaluated to 9.000000</em>
-</pre>
-</div>
-
-<p>Well this looks like it is basically working. The dump of the function
-shows the "no argument function that always returns double" that we synthesize
-for each top-level expression that is typed in. This demonstrates very basic
-functionality, but can we do more?</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>def testfunc(x y) x + y*2; </b>
-Read function definition:
-define double @testfunc(double %x, double %y) {
-entry:
- %multmp = fmul double %y, 2.000000e+00
- %addtmp = fadd double %multmp, %x
- ret double %addtmp
-}
-
-ready&gt; <b>testfunc(4, 10);</b>
-Read top-level expression:
-define double @1() {
-entry:
- %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
- ret double %calltmp
-}
-
-<em>Evaluated to 24.000000</em>
-</pre>
-</div>
-
-<p>This illustrates that we can now call user code, but there is something a bit
-subtle going on here. Note that we only invoke the JIT on the anonymous
-functions that <em>call testfunc</em>, but we never invoked it
-on <em>testfunc</em> itself. What actually happened here is that the JIT
-scanned for all non-JIT'd functions transitively called from the anonymous
-function and compiled all of them before returning
-from <tt>getPointerToFunction()</tt>.</p>
-
-<p>The JIT provides a number of other more advanced interfaces for things like
-freeing allocated machine code, rejit'ing functions to update them, etc.
-However, even with this simple code, we get some surprisingly powerful
-capabilities - check this out (I removed the dump of the anonymous functions,
-you should get the idea by now :) :</p>
-
-<div class="doc_code">
-<pre>
-ready&gt; <b>extern sin(x);</b>
-Read extern:
-declare double @sin(double)
-
-ready&gt; <b>extern cos(x);</b>
-Read extern:
-declare double @cos(double)
-
-ready&gt; <b>sin(1.0);</b>
-Read top-level expression:
-define double @2() {
-entry:
- ret double 0x3FEAED548F090CEE
-}
-
-<em>Evaluated to 0.841471</em>
-
-ready&gt; <b>def foo(x) sin(x)*sin(x) + cos(x)*cos(x);</b>
-Read function definition:
-define double @foo(double %x) {
-entry:
- %calltmp = call double @sin(double %x)
- %multmp = fmul double %calltmp, %calltmp
- %calltmp2 = call double @cos(double %x)
- %multmp4 = fmul double %calltmp2, %calltmp2
- %addtmp = fadd double %multmp, %multmp4
- ret double %addtmp
-}
-
-ready&gt; <b>foo(4.0);</b>
-Read top-level expression:
-define double @3() {
-entry:
- %calltmp = call double @foo(double 4.000000e+00)
- ret double %calltmp
-}
-
-<em>Evaluated to 1.000000</em>
-</pre>
-</div>
-
-<p>Whoa, how does the JIT know about sin and cos? The answer is surprisingly
-simple: in this
-example, the JIT started execution of a function and got to a function call. It
-realized that the function was not yet JIT compiled and invoked the standard set
-of routines to resolve the function. In this case, there is no body defined
-for the function, so the JIT ended up calling "<tt>dlsym("sin")</tt>" on the
-Kaleidoscope process itself.
-Since "<tt>sin</tt>" is defined within the JIT's address space, it simply
-patches up calls in the module to call the libm version of <tt>sin</tt>
-directly.</p>
-
-<p>The LLVM JIT provides a number of interfaces (look in the
-<tt>ExecutionEngine.h</tt> file) for controlling how unknown functions get
-resolved. It allows you to establish explicit mappings between IR objects and
-addresses (useful for LLVM global variables that you want to map to static
-tables, for example), allows you to dynamically decide on the fly based on the
-function name, and even allows you to have the JIT compile functions lazily the
-first time they're called.</p>
-
-<p>One interesting application of this is that we can now extend the language
-by writing arbitrary C++ code to implement operations. For example, if we add:
-</p>
-
-<div class="doc_code">
-<pre>
-/// putchard - putchar that takes a double and returns 0.
-extern "C"
-double putchard(double X) {
- putchar((char)X);
- return 0;
-}
-</pre>
-</div>
-
-<p>Now we can produce simple output to the console by using things like:
-"<tt>extern putchard(x); putchard(120);</tt>", which prints a lowercase 'x' on
-the console (120 is the ASCII code for 'x'). Similar code could be used to
-implement file I/O, console input, and many other capabilities in
-Kaleidoscope.</p>
-
-<p>This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At
-this point, we can compile a non-Turing-complete programming language, optimize
-and JIT compile it in a user-driven way. Next up we'll look into <a
-href="LangImpl5.html">extending the language with control flow constructs</a>,
-tackling some interesting LLVM IR issues along the way.</p>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2><a name="code">Full Code Listing</a></h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>
-Here is the complete code listing for our running example, enhanced with the
-LLVM JIT and optimizer. To build this example, use:
-</p>
-
-<div class="doc_code">
-<pre>
-# Compile
-clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
-# Run
-./toy
-</pre>
-</div>
-
-<p>
-If you are compiling this on Linux, make sure to add the "-rdynamic" option
-as well. This makes sure that the external functions are resolved properly
-at runtime.</p>
-
-<p>Here is the code:</p>
-
-<div class="doc_code">
-<pre>
-#include "llvm/DerivedTypes.h"
-#include "llvm/ExecutionEngine/ExecutionEngine.h"
-#include "llvm/ExecutionEngine/JIT.h"
-#include "llvm/IRBuilder.h"
-#include "llvm/LLVMContext.h"
-#include "llvm/Module.h"
-#include "llvm/PassManager.h"
-#include "llvm/Analysis/Verifier.h"
-#include "llvm/Analysis/Passes.h"
-#include "llvm/DataLayout.h"
-#include "llvm/Transforms/Scalar.h"
-#include "llvm/Support/TargetSelect.h"
-#include &lt;cstdio&gt;
-#include &lt;string&gt;
-#include &lt;map&gt;
-#include &lt;vector&gt;
-using namespace llvm;
-
-//===----------------------------------------------------------------------===//
-// Lexer
-//===----------------------------------------------------------------------===//
-
-// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
-// of these for known things.
-enum Token {
- tok_eof = -1,
-
- // commands
- tok_def = -2, tok_extern = -3,
-
- // primary
- tok_identifier = -4, tok_number = -5
-};
-
-static std::string IdentifierStr; // Filled in if tok_identifier
-static double NumVal; // Filled in if tok_number
-
-/// gettok - Return the next token from standard input.
-static int gettok() {
- static int LastChar = ' ';
-
- // Skip any whitespace.
- while (isspace(LastChar))
- LastChar = getchar();
-
- if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
- IdentifierStr = LastChar;
- while (isalnum((LastChar = getchar())))
- IdentifierStr += LastChar;
-
- if (IdentifierStr == "def") return tok_def;
- if (IdentifierStr == "extern") return tok_extern;
- return tok_identifier;
- }
-
- if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
- std::string NumStr;
- do {
- NumStr += LastChar;
- LastChar = getchar();
- } while (isdigit(LastChar) || LastChar == '.');
-
- NumVal = strtod(NumStr.c_str(), 0);
- return tok_number;
- }
-
- if (LastChar == '#') {
- // Comment until end of line.
- do LastChar = getchar();
- while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
-
- if (LastChar != EOF)
- return gettok();
- }
-
- // Check for end of file. Don't eat the EOF.
- if (LastChar == EOF)
- return tok_eof;
-
- // Otherwise, just return the character as its ascii value.
- int ThisChar = LastChar;
- LastChar = getchar();
- return ThisChar;
-}
-
-//===----------------------------------------------------------------------===//
-// Abstract Syntax Tree (aka Parse Tree)
-//===----------------------------------------------------------------------===//
-
-/// ExprAST - Base class for all expression nodes.
-class ExprAST {
-public:
- virtual ~ExprAST() {}
- virtual Value *Codegen() = 0;
-};
-
-/// NumberExprAST - Expression class for numeric literals like "1.0".
-class NumberExprAST : public ExprAST {
- double Val;
-public:
- NumberExprAST(double val) : Val(val) {}
- virtual Value *Codegen();
-};
-
-/// VariableExprAST - Expression class for referencing a variable, like "a".
-class VariableExprAST : public ExprAST {
- std::string Name;
-public:
- VariableExprAST(const std::string &amp;name) : Name(name) {}
- virtual Value *Codegen();
-};
-
-/// BinaryExprAST - Expression class for a binary operator.
-class BinaryExprAST : public ExprAST {
- char Op;
- ExprAST *LHS, *RHS;
-public:
- BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
- : Op(op), LHS(lhs), RHS(rhs) {}
- virtual Value *Codegen();
-};
-
-/// CallExprAST - Expression class for function calls.
-class CallExprAST : public ExprAST {
- std::string Callee;
- std::vector&lt;ExprAST*&gt; Args;
-public:
- CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
- : Callee(callee), Args(args) {}
- virtual Value *Codegen();
-};
-
-/// PrototypeAST - This class represents the "prototype" for a function,
-/// which captures its name, and its argument names (thus implicitly the number
-/// of arguments the function takes).
-class PrototypeAST {
- std::string Name;
- std::vector&lt;std::string&gt; Args;
-public:
- PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
- : Name(name), Args(args) {}
-
- Function *Codegen();
-};
-
-/// FunctionAST - This class represents a function definition itself.
-class FunctionAST {
- PrototypeAST *Proto;
- ExprAST *Body;
-public:
- FunctionAST(PrototypeAST *proto, ExprAST *body)
- : Proto(proto), Body(body) {}
-
- Function *Codegen();
-};
-
-//===----------------------------------------------------------------------===//
-// Parser
-//===----------------------------------------------------------------------===//
-
-/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
-/// token the parser is looking at. getNextToken reads another token from the
-/// lexer and updates CurTok with its results.
-static int CurTok;
-static int getNextToken() {
- return CurTok = gettok();
-}
-
-/// BinopPrecedence - This holds the precedence for each binary operator that is
-/// defined.
-static std::map&lt;char, int&gt; BinopPrecedence;
-
-/// GetTokPrecedence - Get the precedence of the pending binary operator token.
-static int GetTokPrecedence() {
- if (!isascii(CurTok))
- return -1;
-
- // Make sure it's a declared binop.
- int TokPrec = BinopPrecedence[CurTok];
- if (TokPrec &lt;= 0) return -1;
- return TokPrec;
-}
-
-/// Error* - These are little helper functions for error handling.
-ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
-PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
-FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
-
-static ExprAST *ParseExpression();
-
-/// identifierexpr
-/// ::= identifier
-/// ::= identifier '(' expression* ')'
-static ExprAST *ParseIdentifierExpr() {
- std::string IdName = IdentifierStr;
-
- getNextToken(); // eat identifier.
-
- if (CurTok != '(') // Simple variable ref.
- return new VariableExprAST(IdName);
-
- // Call.
- getNextToken(); // eat (
- std::vector&lt;ExprAST*&gt; Args;
- if (CurTok != ')') {
- while (1) {
- ExprAST *Arg = ParseExpression();
- if (!Arg) return 0;
- Args.push_back(Arg);
-
- if (CurTok == ')') break;
-
- if (CurTok != ',')
- return Error("Expected ')' or ',' in argument list");
- getNextToken();
- }
- }
-
- // Eat the ')'.
- getNextToken();
-
- return new CallExprAST(IdName, Args);
-}
-
-/// numberexpr ::= number
-static ExprAST *ParseNumberExpr() {
- ExprAST *Result = new NumberExprAST(NumVal);
- getNextToken(); // consume the number
- return Result;
-}
-
-/// parenexpr ::= '(' expression ')'
-static ExprAST *ParseParenExpr() {
- getNextToken(); // eat (.
- ExprAST *V = ParseExpression();
- if (!V) return 0;
-
- if (CurTok != ')')
- return Error("expected ')'");
- getNextToken(); // eat ).
- return V;
-}
-
-/// primary
-/// ::= identifierexpr
-/// ::= numberexpr
-/// ::= parenexpr
-static ExprAST *ParsePrimary() {
- switch (CurTok) {
- default: return Error("unknown token when expecting an expression");
- case tok_identifier: return ParseIdentifierExpr();
- case tok_number: return ParseNumberExpr();
- case '(': return ParseParenExpr();
- }
-}
-
-/// binoprhs
-/// ::= ('+' primary)*
-static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
- // If this is a binop, find its precedence.
- while (1) {
- int TokPrec = GetTokPrecedence();
-
- // If this is a binop that binds at least as tightly as the current binop,
- // consume it, otherwise we are done.
- if (TokPrec &lt; ExprPrec)
- return LHS;
-
- // Okay, we know this is a binop.
- int BinOp = CurTok;
- getNextToken(); // eat binop
-
- // Parse the primary expression after the binary operator.
- ExprAST *RHS = ParsePrimary();
- if (!RHS) return 0;
-
- // If BinOp binds less tightly with RHS than the operator after RHS, let
- // the pending operator take RHS as its LHS.
- int NextPrec = GetTokPrecedence();
- if (TokPrec &lt; NextPrec) {
- RHS = ParseBinOpRHS(TokPrec+1, RHS);
- if (RHS == 0) return 0;
- }
-
- // Merge LHS/RHS.
- LHS = new BinaryExprAST(BinOp, LHS, RHS);
- }
-}
-
-/// expression
-/// ::= primary binoprhs
-///
-static ExprAST *ParseExpression() {
- ExprAST *LHS = ParsePrimary();
- if (!LHS) return 0;
-
- return ParseBinOpRHS(0, LHS);
-}
-
-/// prototype
-/// ::= id '(' id* ')'
-static PrototypeAST *ParsePrototype() {
- if (CurTok != tok_identifier)
- return ErrorP("Expected function name in prototype");
-
- std::string FnName = IdentifierStr;
- getNextToken();
-
- if (CurTok != '(')
- return ErrorP("Expected '(' in prototype");
-
- std::vector&lt;std::string&gt; ArgNames;
- while (getNextToken() == tok_identifier)
- ArgNames.push_back(IdentifierStr);
- if (CurTok != ')')
- return ErrorP("Expected ')' in prototype");
-
- // success.
- getNextToken(); // eat ')'.
-
- return new PrototypeAST(FnName, ArgNames);
-}
-
-/// definition ::= 'def' prototype expression
-static FunctionAST *ParseDefinition() {
- getNextToken(); // eat def.
- PrototypeAST *Proto = ParsePrototype();
- if (Proto == 0) return 0;
-
- if (ExprAST *E = ParseExpression())
- return new FunctionAST(Proto, E);
- return 0;
-}
-
-/// toplevelexpr ::= expression
-static FunctionAST *ParseTopLevelExpr() {
- if (ExprAST *E = ParseExpression()) {
- // Make an anonymous proto.
- PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
- return new FunctionAST(Proto, E);
- }
- return 0;
-}
-
-/// external ::= 'extern' prototype
-static PrototypeAST *ParseExtern() {
- getNextToken(); // eat extern.
- return ParsePrototype();
-}
-
-//===----------------------------------------------------------------------===//
-// Code Generation
-//===----------------------------------------------------------------------===//
-
-static Module *TheModule;
-static IRBuilder&lt;&gt; Builder(getGlobalContext());
-static std::map&lt;std::string, Value*&gt; NamedValues;
-static FunctionPassManager *TheFPM;
-
-Value *ErrorV(const char *Str) { Error(Str); return 0; }
-
-Value *NumberExprAST::Codegen() {
- return ConstantFP::get(getGlobalContext(), APFloat(Val));
-}
-
-Value *VariableExprAST::Codegen() {
- // Look this variable up in the function.
- Value *V = NamedValues[Name];
- return V ? V : ErrorV("Unknown variable name");
-}
-
-Value *BinaryExprAST::Codegen() {
- Value *L = LHS-&gt;Codegen();
- Value *R = RHS-&gt;Codegen();
- if (L == 0 || R == 0) return 0;
-
- switch (Op) {
- case '+': return Builder.CreateFAdd(L, R, "addtmp");
- case '-': return Builder.CreateFSub(L, R, "subtmp");
- case '*': return Builder.CreateFMul(L, R, "multmp");
- case '&lt;':
- L = Builder.CreateFCmpULT(L, R, "cmptmp");
- // Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
- "booltmp");
- default: return ErrorV("invalid binary operator");
- }
-}
-
-Value *CallExprAST::Codegen() {
- // Look up the name in the global module table.
- Function *CalleeF = TheModule-&gt;getFunction(Callee);
- if (CalleeF == 0)
- return ErrorV("Unknown function referenced");
-
- // If argument mismatch error.
- if (CalleeF-&gt;arg_size() != Args.size())
- return ErrorV("Incorrect # arguments passed");
-
- std::vector&lt;Value*&gt; ArgsV;
- for (unsigned i = 0, e = Args.size(); i != e; ++i) {
- ArgsV.push_back(Args[i]-&gt;Codegen());
- if (ArgsV.back() == 0) return 0;
- }
-
- return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
-}
-
-Function *PrototypeAST::Codegen() {
- // Make the function type: double(double,double) etc.
- std::vector&lt;Type*&gt; Doubles(Args.size(),
- Type::getDoubleTy(getGlobalContext()));
- FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
- Doubles, false);
-
- Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
-
- // If F conflicted, there was already something named 'Name'. If it has a
- // body, don't allow redefinition or reextern.
- if (F-&gt;getName() != Name) {
- // Delete the one we just made and get the existing one.
- F-&gt;eraseFromParent();
- F = TheModule-&gt;getFunction(Name);
-
- // If F already has a body, reject this.
- if (!F-&gt;empty()) {
- ErrorF("redefinition of function");
- return 0;
- }
-
- // If F took a different number of args, reject.
- if (F-&gt;arg_size() != Args.size()) {
- ErrorF("redefinition of function with different # args");
- return 0;
- }
- }
-
- // Set names for all arguments.
- unsigned Idx = 0;
- for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
- ++AI, ++Idx) {
- AI-&gt;setName(Args[Idx]);
-
- // Add arguments to variable symbol table.
- NamedValues[Args[Idx]] = AI;
- }
-
- return F;
-}
-
-Function *FunctionAST::Codegen() {
- NamedValues.clear();
-
- Function *TheFunction = Proto-&gt;Codegen();
- if (TheFunction == 0)
- return 0;
-
- // Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
- Builder.SetInsertPoint(BB);
-
- if (Value *RetVal = Body-&gt;Codegen()) {
- // Finish off the function.
- Builder.CreateRet(RetVal);
-
- // Validate the generated code, checking for consistency.
- verifyFunction(*TheFunction);
-
- // Optimize the function.
- TheFPM-&gt;run(*TheFunction);
-
- return TheFunction;
- }
-
- // Error reading body, remove function.
- TheFunction-&gt;eraseFromParent();
- return 0;
-}
-
-//===----------------------------------------------------------------------===//
-// Top-Level parsing and JIT Driver
-//===----------------------------------------------------------------------===//
-
-static ExecutionEngine *TheExecutionEngine;
-
-static void HandleDefinition() {
- if (FunctionAST *F = ParseDefinition()) {
- if (Function *LF = F-&gt;Codegen()) {
- fprintf(stderr, "Read function definition:");
- LF-&gt;dump();
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
-}
-
-static void HandleExtern() {
- if (PrototypeAST *P = ParseExtern()) {
- if (Function *F = P-&gt;Codegen()) {
- fprintf(stderr, "Read extern: ");
- F-&gt;dump();
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
-}
-
-static void HandleTopLevelExpression() {
- // Evaluate a top-level expression into an anonymous function.
- if (FunctionAST *F = ParseTopLevelExpr()) {
- if (Function *LF = F-&gt;Codegen()) {
- fprintf(stderr, "Read top-level expression:");
- LF->dump();
-
- // JIT the function, returning a function pointer.
- void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
-
- // Cast it to the right type (takes no arguments, returns a double) so we
- // can call it as a native function.
- double (*FP)() = (double (*)())(intptr_t)FPtr;
- fprintf(stderr, "Evaluated to %f\n", FP());
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
-}
-
-/// top ::= definition | external | expression | ';'
-static void MainLoop() {
- while (1) {
- fprintf(stderr, "ready&gt; ");
- switch (CurTok) {
- case tok_eof: return;
- case ';': getNextToken(); break; // ignore top-level semicolons.
- case tok_def: HandleDefinition(); break;
- case tok_extern: HandleExtern(); break;
- default: HandleTopLevelExpression(); break;
- }
- }
-}
-
-//===----------------------------------------------------------------------===//
-// "Library" functions that can be "extern'd" from user code.
-//===----------------------------------------------------------------------===//
-
-/// putchard - putchar that takes a double and returns 0.
-extern "C"
-double putchard(double X) {
- putchar((char)X);
- return 0;
-}
-
-//===----------------------------------------------------------------------===//
-// Main driver code.
-//===----------------------------------------------------------------------===//
-
-int main() {
- InitializeNativeTarget();
- LLVMContext &amp;Context = getGlobalContext();
-
- // Install standard binary operators.
- // 1 is lowest precedence.
- BinopPrecedence['&lt;'] = 10;
- BinopPrecedence['+'] = 20;
- BinopPrecedence['-'] = 20;
- BinopPrecedence['*'] = 40; // highest.
-
- // Prime the first token.
- fprintf(stderr, "ready&gt; ");
- getNextToken();
-
- // Make the module, which holds all the code.
- TheModule = new Module("my cool jit", Context);
-
- // Create the JIT. This takes ownership of the module.
- std::string ErrStr;
- TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
- if (!TheExecutionEngine) {
- fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
- exit(1);
- }
-
- FunctionPassManager OurFPM(TheModule);
-
- // Set up the optimizer pipeline. Start with registering info about how the
- // target lays out data structures.
- OurFPM.add(new DataLayout(*TheExecutionEngine-&gt;getDataLayout()));
- // Provide basic AliasAnalysis support for GVN.
- OurFPM.add(createBasicAliasAnalysisPass());
- // Do simple "peephole" optimizations and bit-twiddling optzns.
- OurFPM.add(createInstructionCombiningPass());
- // Reassociate expressions.
- OurFPM.add(createReassociatePass());
- // Eliminate Common SubExpressions.
- OurFPM.add(createGVNPass());
- // Simplify the control flow graph (deleting unreachable blocks, etc).
- OurFPM.add(createCFGSimplificationPass());
-
- OurFPM.doInitialization();
-
- // Set the global so the code gen can use this.
- TheFPM = &amp;OurFPM;
-
- // Run the main "interpreter loop" now.
- MainLoop();
-
- TheFPM = 0;
-
- // Print out all of the generated code.
- TheModule-&gt;dump();
-
- return 0;
-}
-</pre>
-</div>
-
-<a href="LangImpl5.html">Next: Extending the language: control flow</a>
-</div>
-
-<!-- *********************************************************************** -->
-<hr>
-<address>
- <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
- src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
- <a href="http://validator.w3.org/check/referer"><img
- src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
-
- <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
- <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
- Last modified: $Date: 2012-10-08 18:39:34 +0200 (Mon, 08 Oct 2012) $
-</address>
-</body>
-</html>