Diffstat (limited to 'docs/tutorial')

 docs/tutorial/BuildingAJIT1.rst                                          | 375
 docs/tutorial/BuildingAJIT2.rst                                          | 336
 docs/tutorial/BuildingAJIT3.rst                                          | 171
 docs/tutorial/BuildingAJIT4.rst                                          |  48
 docs/tutorial/BuildingAJIT5.rst                                          |  55
 docs/tutorial/LangImpl01.rst (renamed from docs/tutorial/LangImpl1.rst)  |  29
 docs/tutorial/LangImpl02.rst (renamed from docs/tutorial/LangImpl2.rst)  |  28
 docs/tutorial/LangImpl03.rst (renamed from docs/tutorial/LangImpl3.rst)  |  50
 docs/tutorial/LangImpl04.rst (renamed from docs/tutorial/LangImpl4.rst)  |   5
 docs/tutorial/LangImpl05-cfg.png (renamed from docs/tutorial/LangImpl5-cfg.png) | bin 38586 -> 38586 bytes
 docs/tutorial/LangImpl05.rst (renamed from docs/tutorial/LangImpl5.rst)  |  40
 docs/tutorial/LangImpl06.rst (renamed from docs/tutorial/LangImpl6.rst)  |  38
 docs/tutorial/LangImpl07.rst (renamed from docs/tutorial/LangImpl7.rst)  |  30
 docs/tutorial/LangImpl08.rst                                             | 218
 docs/tutorial/LangImpl09.rst (renamed from docs/tutorial/LangImpl8.rst)  |  12
 docs/tutorial/LangImpl10.rst (renamed from docs/tutorial/LangImpl9.rst)  |   9
 docs/tutorial/OCamlLangImpl1.rst                                         |   2
 docs/tutorial/OCamlLangImpl5.rst                                         |   2
 docs/tutorial/OCamlLangImpl6.rst                                         |  14
 docs/tutorial/OCamlLangImpl7.rst                                         |   2
 docs/tutorial/index.rst                                                  |  10
 21 files changed, 1347 insertions(+), 127 deletions(-)
diff --git a/docs/tutorial/BuildingAJIT1.rst b/docs/tutorial/BuildingAJIT1.rst
new file mode 100644
index 0000000000000..f30b979579dcf
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT1.rst
@@ -0,0 +1,375 @@
+=======================================================
+Building a JIT: Starting out with KaleidoscopeJIT
+=======================================================
+
+.. contents::
+ :local:
+
+Chapter 1 Introduction
+======================
+
+Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
+tutorial runs through the implementation of a JIT compiler using LLVM's
+On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
+KaleidoscopeJIT class used in the
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
+introduces new features like optimization, lazy compilation and remote
+execution.
+
+The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
+these APIs interact with other parts of LLVM, and to teach you how to recombine
+them to build a custom JIT that is suited to your use-case.
+
+The structure of the tutorial is:
+
+- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
+ introduce some of the basic concepts of the ORC JIT APIs, including the
+ idea of an ORC *Layer*.
+
+- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
+ a new layer that will optimize IR and generated code.
+
+- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
+ Compile-On-Demand layer to lazily compile IR.
+
+- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
+ replacing the Compile-On-Demand layer with a custom layer that uses the ORC
+ Compile Callbacks API directly to defer IR-generation until functions are
+ called.
+
+- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
+ a remote process with reduced privileges using the JIT Remote APIs.
+
+To provide input for our JIT we will use the Kaleidoscope REPL from
+`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language in LLVM" tutorial,
+with one minor modification: We will remove the FunctionPassManager from the
+code for that chapter and replace it with optimization support in our JIT class
+in Chapter #2.
+
+Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
+It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
+These tutorials don't assume any experience with these earlier APIs, but
+readers acquainted with them will see many familiar elements. Where appropriate
+we will make this connection with the earlier APIs explicit to help people who
+are transitioning from them to ORC.
+
+JIT API Basics
+==============
+
+The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
+rather than compiling whole programs to disk ahead of time as a traditional
+compiler does. To support that aim our initial, bare-bones JIT API will be:
+
+1. Handle addModule(Module &M) -- Make the given IR module available for
+ execution.
+2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to
+ symbols (functions or variables) that have been added to the JIT.
+3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any
+ memory that had been used for the compiled code.
+
+A basic use-case for this API, executing the 'main' function from a module,
+will look like:
+
+.. code-block:: c++
+
+ std::unique_ptr<Module> M = buildModule();
+ JIT J;
+ Handle H = J.addModule(*M);
+  int (*Main)(int, char*[]) =
+    (int(*)(int, char*[]))J.findSymbol("main").getAddress();
+  int Result = Main(0, nullptr);
+ J.removeModule(H);
+
+The APIs that we build in these tutorials will all be variations on this simple
+theme. Behind the API we will refine the implementation of the JIT to add
+support for optimization and lazy compilation. Eventually we will extend the
+API itself to allow higher-level program representations (e.g. ASTs) to be
+added to the JIT.
+
+KaleidoscopeJIT
+===============
+
+In the previous section we described our API; now we examine a simple
+implementation of it: The KaleidoscopeJIT class [1]_ that was used in the
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
+the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
+input for our JIT: Each time the user enters an expression the REPL will add a
+new IR module containing the code for that expression to the JIT. If the
+expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
+use the findSymbol method of our JIT class to find and execute the code for the
+expression, and then use the removeModule method to remove the code again
+(since there's no way to re-invoke an anonymous expression). In later chapters
+of this tutorial we'll modify the REPL to enable new interactions with our JIT
+class, but for now we will take this setup for granted and focus our attention on
+the implementation of our JIT itself.
+
+Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
+usual include guards and #includes [2]_, we get to the definition of our class:
+
+.. code-block:: c++
+
+ #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
+ #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
+
+ #include "llvm/ExecutionEngine/ExecutionEngine.h"
+ #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
+ #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
+ #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
+ #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
+ #include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h"
+ #include "llvm/IR/Mangler.h"
+ #include "llvm/Support/DynamicLibrary.h"
+
+ namespace llvm {
+ namespace orc {
+
+ class KaleidoscopeJIT {
+ private:
+
+ std::unique_ptr<TargetMachine> TM;
+ const DataLayout DL;
+ ObjectLinkingLayer<> ObjectLayer;
+ IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
+
+ public:
+
+  typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandle;
+
+Our class begins with four members: a TargetMachine, TM, which will be used
+to build our LLVM compiler instance; a DataLayout, DL, which will be used for
+symbol mangling (more on that later); and two ORC *layers*: an
+ObjectLinkingLayer and an IRCompileLayer. We'll be talking more about layers in
+the next chapter, but for now you can think of them as analogous to LLVM
+Passes: they wrap up useful JIT utilities behind an easy to compose interface.
+The first layer, ObjectLinkingLayer, is the foundation of our JIT: it takes
+in-memory object files produced by a compiler and links them on the fly to make
+them executable. This JIT-on-top-of-a-linker design was introduced in MCJIT,
+however the linker was hidden inside the MCJIT class. In ORC we expose the
+linker so that clients can access and configure it directly if they need to. In
+this tutorial our ObjectLinkingLayer will just be used to support the next layer
+in our stack: the IRCompileLayer, which will be responsible for taking LLVM IR,
+compiling it, and passing the resulting in-memory object files down to the
+object linking layer below.
+
+That's it for member variables, after that we have a single typedef:
+ModuleHandle. This is the handle type that will be returned from our JIT's
+addModule method, and can be passed to the removeModule method to remove a
+module. The IRCompileLayer class already provides a convenient handle type
+(IRCompileLayer::ModuleSetHandleT), so we just alias our ModuleHandle to this.
+
+.. code-block:: c++
+
+ KaleidoscopeJIT()
+ : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
+ CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
+ llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
+ }
+
+ TargetMachine &getTargetMachine() { return *TM; }
+
+Next up we have our class constructor. We begin by initializing TM using the
+EngineBuilder::selectTarget helper method, which constructs a TargetMachine for
+the current process. Next we use our newly created TargetMachine to initialize
+DL, our DataLayout. Then we initialize our IRCompileLayer. Our IRCompileLayer
+needs two things: (1) A reference to our object linking layer, and (2) a
+compiler instance to use to perform the actual compilation from IR to object
+files. We use the off-the-shelf SimpleCompiler instance for now. Finally, in
+the body of the constructor, we call the DynamicLibrary::LoadLibraryPermanently
+method with a nullptr argument. Normally the LoadLibraryPermanently method is
+called with the path of a dynamic library to load, but when passed a null
+pointer it will 'load' the host process itself, making its exported symbols
+available for execution.
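+
+As a concrete illustration (a sketch, not part of the tutorial code): since
+the host process's own symbols are now exported, JIT'd Kaleidoscope code can
+resolve and call helpers defined in the REPL binary itself, such as the
+putchard function used elsewhere in the Kaleidoscope tutorials:
+
+.. code-block:: c++
+
+  /// putchard - putchar that takes a double and returns 0. With the host
+  /// process 'loaded' as above, a JIT'd call to putchard will resolve to
+  /// this definition in the REPL binary.
+  extern "C" double putchard(double X) {
+    fputc((char)X, stderr);
+    return 0;
+  }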
+
+.. code-block:: c++
+
+ ModuleHandle addModule(std::unique_ptr<Module> M) {
+ // Build our symbol resolver:
+ // Lambda 1: Look back into the JIT itself to find symbols that are part of
+ // the same "logical dylib".
+ // Lambda 2: Search for external symbols in the host process.
+ auto Resolver = createLambdaResolver(
+ [&](const std::string &Name) {
+ if (auto Sym = CompileLayer.findSymbol(Name, false))
+ return Sym.toRuntimeDyldSymbol();
+ return RuntimeDyld::SymbolInfo(nullptr);
+ },
+      [](const std::string &Name) {
+ if (auto SymAddr =
+ RTDyldMemoryManager::getSymbolAddressInProcess(Name))
+ return RuntimeDyld::SymbolInfo(SymAddr, JITSymbolFlags::Exported);
+ return RuntimeDyld::SymbolInfo(nullptr);
+ });
+
+    // Build a singleton module set to hold our module.
+ std::vector<std::unique_ptr<Module>> Ms;
+ Ms.push_back(std::move(M));
+
+ // Add the set to the JIT with the resolver we created above and a newly
+ // created SectionMemoryManager.
+ return CompileLayer.addModuleSet(std::move(Ms),
+ make_unique<SectionMemoryManager>(),
+ std::move(Resolver));
+ }
+
+Now we come to the first of our JIT API methods: addModule. This method is
+responsible for adding IR to the JIT and making it available for execution. In
+this initial implementation of our JIT we will make our modules "available for
+execution" by adding them straight to the IRCompileLayer, which will
+immediately compile them. In later chapters we will teach our JIT to be lazier
+and instead add the Modules to a "pending" list to be compiled if and when they
+are first executed.
+
+To add our module to the IRCompileLayer we need to supply two auxiliary objects
+(as well as the module itself): a memory manager and a symbol resolver. The
+memory manager will be responsible for managing the memory allocated to JIT'd
+machine code, setting memory permissions, and registering exception handling
+tables (if the JIT'd code uses exceptions). For our memory manager we will use
+the SectionMemoryManager class: another off-the-shelf utility that provides all
+the basic functionality we need. The second auxiliary class, the symbol
+resolver, is more interesting for us. It exists to tell the JIT where to look
+when it encounters an *external symbol* in the module we are adding. External
+symbols are any symbol not defined within the module itself, including calls to
+functions outside the JIT and calls to functions defined in other modules that
+have already been added to the JIT. It may seem as though modules added to the
+JIT should "know about one another" by default, but since we would still have to
+supply a symbol resolver for references to code outside the JIT it turns out to
+be easier to just re-use this one mechanism for all symbol resolution. This has
+the added benefit that the user has full control over the symbol resolution
+process. Should we search for definitions within the JIT first, then fall back
+on external definitions? Or should we prefer external definitions where
+available and only JIT code if we don't already have an available
+implementation? By using a single symbol resolution scheme we are free to choose
+whatever makes the most sense for any given use case.
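+
+As a sketch of that flexibility (a hypothetical variant, not the scheme this
+tutorial uses), a resolver that preferred definitions from the host process
+over JIT'd definitions would simply search the process first:
+
+.. code-block:: c++
+
+  // Hypothetical: prefer pre-compiled definitions from the host process and
+  // fall back to JIT'd code only when the process has no definition.
+  auto Resolver = createLambdaResolver(
+      [&](const std::string &Name) {
+        if (auto SymAddr =
+                RTDyldMemoryManager::getSymbolAddressInProcess(Name))
+          return RuntimeDyld::SymbolInfo(SymAddr, JITSymbolFlags::Exported);
+        if (auto Sym = CompileLayer.findSymbol(Name, false))
+          return Sym.toRuntimeDyldSymbol();
+        return RuntimeDyld::SymbolInfo(nullptr);
+      },
+      [](const std::string &Name) {
+        return RuntimeDyld::SymbolInfo(nullptr);
+      });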
+
+Building a symbol resolver is made especially easy by the *createLambdaResolver*
+function. This function takes two lambdas [3]_ and returns a
+RuntimeDyld::SymbolResolver instance. The first lambda is used as the
+implementation of the resolver's findSymbolInLogicalDylib method, which searches
+for symbol definitions that should be thought of as being part of the same
+"logical" dynamic library as this Module. If you are familiar with static
+linking: this means that findSymbolInLogicalDylib should expose symbols with
+common linkage and hidden visibility. If all this sounds foreign you can ignore
+the details and just remember that this is the first method that the linker will
+use to try to find a symbol definition. If the findSymbolInLogicalDylib method
+returns a null result then the linker will call the second symbol resolver
+method, called findSymbol, which searches for symbols that should be thought of
+as external to (but visible from) the module and its logical dylib. In this
+tutorial we will adopt the following simple scheme: All modules added to the JIT
+will behave as if they were linked into a single, ever-growing logical dylib. To
+implement this our first lambda (the one defining findSymbolInLogicalDylib) will
+just search for JIT'd code by calling the CompileLayer's findSymbol method. If
+we don't find a symbol in the JIT itself we'll fall back to our second lambda,
+which implements findSymbol. This will use the
+RTDyldMemoryManager::getSymbolAddressInProcess method to search for the symbol
+within the program itself. If we can't find a symbol definition via either of
+these paths the JIT will refuse to accept our module, returning a "symbol not
+found" error.
+
+Now that we've built our symbol resolver we're ready to add our module to the
+JIT. We do this by calling the CompileLayer's addModuleSet method [4]_. Since
+we only have a single Module and addModuleSet expects a collection, we will
+create a vector of modules and add our module as the only member. Since we
+have already typedef'd our ModuleHandle type to be the same as the
+CompileLayer's handle type, we can return the handle from addModuleSet
+directly from our addModule method.
+
+.. code-block:: c++
+
+ JITSymbol findSymbol(const std::string Name) {
+ std::string MangledName;
+ raw_string_ostream MangledNameStream(MangledName);
+ Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
+ return CompileLayer.findSymbol(MangledNameStream.str(), true);
+ }
+
+ void removeModule(ModuleHandle H) {
+ CompileLayer.removeModuleSet(H);
+ }
+
+Now that we can add code to our JIT, we need a way to find the symbols we've
+added to it. To do that we call the findSymbol method on our IRCompileLayer,
+but with a twist: We have to *mangle* the name of the symbol we're searching
+for first. The reason for this is that the ORC JIT components use mangled
+symbols internally the same way a static compiler and linker would, rather
+than using plain IR symbol names. The kind of mangling will depend on the
+DataLayout, which in turn depends on the target platform. To allow us to
+remain portable and search based on the unmangled name, we just reproduce
+this mangling ourselves.
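+
+For example (illustrative only, since the prefix is platform-dependent): on a
+target whose DataLayout specifies '_' as the global prefix, such as x86-64
+macOS, searching for the IR symbol "main" requires looking up "_main":
+
+.. code-block:: c++
+
+  // Illustrative: mangling "main" under a Mach-O-style DataLayout.
+  std::string MangledName;
+  raw_string_ostream MangledNameStream(MangledName);
+  Mangler::getNameWithPrefix(MangledNameStream, "main", DL);
+  // MangledNameStream.str() is "_main" on such a target.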
+
+We now come to the last method in our JIT API: removeModule. This method is
+responsible for destructing the MemoryManager and SymbolResolver that were
+added with a given module, freeing any resources they were using in the
+process. In our Kaleidoscope demo we rely on this method to remove the module
+representing the most recent top-level expression, preventing it from being
+treated as a duplicate definition when the next top-level expression is
+entered. It is generally good to free any module that you know you won't need
+to call further, just to free up the resources dedicated to it. However, you
+don't strictly need to do this: All resources will be cleaned up when your
+JIT class is destructed, if they haven't been freed before then.
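+
+Putting the three methods together, the REPL's handling of a top-level
+expression looks roughly like this (a sketch based on the Chapter 7 driver
+code, assuming the anonymous expression function is emitted under the name
+"__anon_expr"):
+
+.. code-block:: c++
+
+  // JIT the module containing the anonymous expression...
+  auto H = TheJIT->addModule(std::move(M));
+
+  // ...search for the expression's function and run it...
+  auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
+  double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
+  fprintf(stderr, "Evaluated to %f\n", FP());
+
+  // ...then remove the module so the anonymous symbol can be reused.
+  TheJIT->removeModule(H);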
+
+This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
+but fully functioning JIT stack that you can use to take LLVM IR and make it
+executable within the context of your JIT process. In the next chapter we'll
+look at how to extend this JIT to produce better quality code, and in the
+process take a deeper look at the ORC layer concept.
+
+`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example. To build this
+example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
+ :language: c++
+
+.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
+ simplifying assumption: symbols cannot be re-defined. This will make it
+ impossible to re-define symbols in the REPL, but will make our symbol
+ lookup logic simpler. Re-introducing support for symbol redefinition is
+ left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
+ original tutorials will be a helpful reference).
+
+.. [2] +-----------------------+-----------------------------------------------+
+ | File | Reason for inclusion |
+ +=======================+===============================================+
+ | ExecutionEngine.h | Access to the EngineBuilder::selectTarget |
+ | | method. |
+ +-----------------------+-----------------------------------------------+
+ | | Access to the |
+ | RTDyldMemoryManager.h | RTDyldMemoryManager::getSymbolAddressInProcess|
+ | | method. |
+ +-----------------------+-----------------------------------------------+
+ | CompileUtils.h | Provides the SimpleCompiler class. |
+ +-----------------------+-----------------------------------------------+
+ | IRCompileLayer.h | Provides the IRCompileLayer class. |
+ +-----------------------+-----------------------------------------------+
+ | | Access the createLambdaResolver function, |
+ | LambdaResolver.h | which provides easy construction of symbol |
+ | | resolvers. |
+ +-----------------------+-----------------------------------------------+
+ | ObjectLinkingLayer.h | Provides the ObjectLinkingLayer class. |
+ +-----------------------+-----------------------------------------------+
+ | Mangler.h | Provides the Mangler class for platform |
+ | | specific name-mangling. |
+ +-----------------------+-----------------------------------------------+
+ | DynamicLibrary.h | Provides the DynamicLibrary class, which |
+ | | makes symbols in the host process searchable. |
+ +-----------------------+-----------------------------------------------+
+
+.. [3] Actually, they don't have to be lambdas; any object with a call
+   operator will do, including plain old functions or std::functions.
+
+.. [4] ORC layers accept sets of Modules, rather than individual ones, so that
+ all Modules in the set could be co-located by the memory manager, though
+ this feature is not yet implemented.
diff --git a/docs/tutorial/BuildingAJIT2.rst b/docs/tutorial/BuildingAJIT2.rst
new file mode 100644
index 0000000000000..8fa92317f54fe
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT2.rst
@@ -0,0 +1,336 @@
+=====================================================================
+Building a JIT: Adding Optimizations -- An introduction to ORC Layers
+=====================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 2 Introduction
+======================
+
+Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. In
+`Chapter 1 <BuildingAJIT1.html>`_ of this series we examined a basic JIT
+class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
+executable code in memory. KaleidoscopeJIT was able to do this with relatively
+little code by composing two off-the-shelf *ORC layers*: IRCompileLayer and
+ObjectLinkingLayer, to do much of the heavy lifting.
+
+In this chapter we'll learn more about the ORC layer concept by using a new layer,
+IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
+
+Optimizing Modules using the IRTransformLayer
+=============================================
+
+In `Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
+tutorial series the llvm *FunctionPassManager* is introduced as a means for
+optimizing LLVM IR. Interested readers may read that chapter for details, but
+in short: to optimize a Module we create an llvm::FunctionPassManager
+instance, configure it with a set of optimizations, then run the PassManager on
+a Module to mutate it into a (hopefully) more optimized but semantically
+equivalent form. In the original tutorial series the FunctionPassManager was
+created outside the KaleidoscopeJIT and modules were optimized before being
+added to it. In this chapter we will make optimization a phase of our JIT
+instead. For now this will provide us with a motivation to learn more about ORC
+layers, but in the long term making optimization part of our JIT will yield an
+important benefit: When we begin lazily compiling code (i.e. deferring
+compilation of each function until the first time it's run), having
+optimization managed by our JIT will allow us to optimize lazily too, rather
+than having to do all our optimization up-front.
+
+To add optimization support to our JIT we will take the KaleidoscopeJIT from
+Chapter 1 and compose an ORC *IRTransformLayer* on top. We will look at how the
+IRTransformLayer works in more detail below, but the interface is simple: the
+constructor for this layer takes a reference to the layer below (as all layers
+do) plus an *IR optimization function* that it will apply to each Module that
+is added via addModuleSet:
+
+.. code-block:: c++
+
+ class KaleidoscopeJIT {
+ private:
+ std::unique_ptr<TargetMachine> TM;
+ const DataLayout DL;
+ ObjectLinkingLayer<> ObjectLayer;
+ IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
+
+ typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
+ OptimizeFunction;
+
+ IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;
+
+ public:
+ typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle;
+
+ KaleidoscopeJIT()
+ : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
+ CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
+ OptimizeLayer(CompileLayer,
+ [this](std::unique_ptr<Module> M) {
+ return optimizeModule(std::move(M));
+ }) {
+ llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
+ }
+
+Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1,
+but after the CompileLayer we introduce a typedef for our optimization function.
+In this case we use a std::function (a handy wrapper for "function-like" things)
+from a single std::unique_ptr<Module> input to a std::unique_ptr<Module> output. With
+our optimization function typedef in place we can declare our OptimizeLayer,
+which sits on top of our CompileLayer.
+
+To initialize our OptimizeLayer we pass it a reference to the CompileLayer
+below (standard practice for layers), and we initialize the OptimizeFunction
+using a lambda that calls out to an "optimizeModule" function that we will
+define below.
+
+.. code-block:: c++
+
+ // ...
+ auto Resolver = createLambdaResolver(
+ [&](const std::string &Name) {
+ if (auto Sym = OptimizeLayer.findSymbol(Name, false))
+ return Sym.toRuntimeDyldSymbol();
+ return RuntimeDyld::SymbolInfo(nullptr);
+ },
+ // ...
+
+.. code-block:: c++
+
+ // ...
+ return OptimizeLayer.addModuleSet(std::move(Ms),
+ make_unique<SectionMemoryManager>(),
+ std::move(Resolver));
+ // ...
+
+.. code-block:: c++
+
+ // ...
+ return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
+ // ...
+
+.. code-block:: c++
+
+ // ...
+ OptimizeLayer.removeModuleSet(H);
+ // ...
+
+Next we need to replace references to 'CompileLayer' with references to
+OptimizeLayer in our key methods: addModule, findSymbol, and removeModule. In
+addModule we need to be careful to replace both references: the findSymbol call
+inside our resolver, and the call through to addModuleSet.
+
+.. code-block:: c++
+
+ std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
+ // Create a function pass manager.
+ auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());
+
+ // Add some optimizations.
+ FPM->add(createInstructionCombiningPass());
+ FPM->add(createReassociatePass());
+ FPM->add(createGVNPass());
+ FPM->add(createCFGSimplificationPass());
+ FPM->doInitialization();
+
+ // Run the optimizations over all functions in the module being added to
+ // the JIT.
+ for (auto &F : *M)
+ FPM->run(F);
+
+ return M;
+ }
+
+At the bottom of our JIT we add a private method to do the actual optimization:
+*optimizeModule*. This function sets up a FunctionPassManager, adds some passes
+to it, runs it over every function in the module, and then returns the mutated
+module. The specific optimizations are the same ones used in
+`Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
+tutorial series. Readers may visit that chapter for a more in-depth
+discussion of these, and of IR optimization in general.
+
+And that's it in terms of changes to KaleidoscopeJIT: When a module is added via
+addModule the OptimizeLayer will call our optimizeModule function before passing
+the transformed module on to the CompileLayer below. Of course, we could have
+called optimizeModule directly in our addModule function and not gone to the
+bother of using the IRTransformLayer, but doing so gives us another opportunity
+to see how layers compose. It also provides a neat entry point to the *layer*
+concept itself, because IRTransformLayer turns out to be one of the simplest
+implementations of the layer concept that can be devised:
+
+.. code-block:: c++
+
+ template <typename BaseLayerT, typename TransformFtor>
+ class IRTransformLayer {
+ public:
+ typedef typename BaseLayerT::ModuleSetHandleT ModuleSetHandleT;
+
+ IRTransformLayer(BaseLayerT &BaseLayer,
+ TransformFtor Transform = TransformFtor())
+ : BaseLayer(BaseLayer), Transform(std::move(Transform)) {}
+
+ template <typename ModuleSetT, typename MemoryManagerPtrT,
+ typename SymbolResolverPtrT>
+ ModuleSetHandleT addModuleSet(ModuleSetT Ms,
+ MemoryManagerPtrT MemMgr,
+ SymbolResolverPtrT Resolver) {
+
+ for (auto I = Ms.begin(), E = Ms.end(); I != E; ++I)
+ *I = Transform(std::move(*I));
+
+ return BaseLayer.addModuleSet(std::move(Ms), std::move(MemMgr),
+ std::move(Resolver));
+ }
+
+ void removeModuleSet(ModuleSetHandleT H) { BaseLayer.removeModuleSet(H); }
+
+ JITSymbol findSymbol(const std::string &Name, bool ExportedSymbolsOnly) {
+ return BaseLayer.findSymbol(Name, ExportedSymbolsOnly);
+ }
+
+ JITSymbol findSymbolIn(ModuleSetHandleT H, const std::string &Name,
+ bool ExportedSymbolsOnly) {
+ return BaseLayer.findSymbolIn(H, Name, ExportedSymbolsOnly);
+ }
+
+ void emitAndFinalize(ModuleSetHandleT H) {
+ BaseLayer.emitAndFinalize(H);
+ }
+
+ TransformFtor& getTransform() { return Transform; }
+
+ const TransformFtor& getTransform() const { return Transform; }
+
+ private:
+ BaseLayerT &BaseLayer;
+ TransformFtor Transform;
+ };
+
+This is the whole definition of IRTransformLayer, from
+``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h``, stripped of its
+comments. It is a template class with two template arguments: ``BaseLayerT`` and
+``TransformFtor``, which provide the type of the base layer and the type of the
+"transform functor" (in our case a std::function) respectively. This class is
+concerned with two very simple jobs: (1) Running every IR Module that is added
+with addModuleSet through the transform functor, and (2) conforming to the ORC
+layer interface. The interface consists of one typedef and five methods:
+
++------------------+-----------------------------------------------------------+
+| Interface | Description |
++==================+===========================================================+
+| | Provides a handle that can be used to identify a module |
+| ModuleSetHandleT | set when calling findSymbolIn, removeModuleSet, or |
+| | emitAndFinalize. |
++------------------+-----------------------------------------------------------+
+| | Takes a given set of Modules and makes them "available |
+| | for execution. This means that symbols in those modules |
+| | should be searchable via findSymbol and findSymbolIn, and |
+| | the address of the symbols should be read/writable (for |
+| | data symbols), or executable (for function symbols) after |
+| | JITSymbol::getAddress() is called. Note: This means that |
+| addModuleSet | addModuleSet doesn't have to compile (or do any other |
+| | work) up-front. It *can*, like IRCompileLayer, act |
+| | eagerly, but it can also simply record the module and |
+| | take no further action until somebody calls |
+| | JITSymbol::getAddress(). In IRTransformLayer's case |
+| | addModuleSet eagerly applies the transform functor to |
+| | each module in the set, then passes the resulting set |
+| | of mutated modules down to the layer below. |
++------------------+-----------------------------------------------------------+
+| | Removes a set of modules from the JIT. Code or data |
+| removeModuleSet | defined in these modules will no longer be available, and |
+| | the memory holding the JIT'd definitions will be freed. |
++------------------+-----------------------------------------------------------+
+| | Searches for the named symbol in all modules that have |
+| | previously been added via addModuleSet (and not yet |
+| findSymbol | removed by a call to removeModuleSet). In |
+| | IRTransformLayer we just pass the query on to the layer |
+| | below. In our REPL this is our default way to search for |
+| | function definitions. |
++------------------+-----------------------------------------------------------+
+| | Searches for the named symbol in the module set indicated |
+| | by the given ModuleSetHandleT. This is just an optimized |
+| | search, better for lookup-speed when you know exactly |
+| | a symbol definition should be found. In IRTransformLayer |
+| findSymbolIn | we just pass this query on to the layer below. In our |
+| | REPL we use this method to search for functions |
+| | representing top-level expressions, since we know exactly |
+| | where we'll find them: in the top-level expression module |
+| | we just added. |
++------------------+-----------------------------------------------------------+
+| | Forces all of the actions required to make the code and |
+| | data in a module set (represented by a ModuleSetHandleT) |
+| | accessible. Behaves as if some symbol in the set had been |
+| | searched for and JITSymbol::getSymbolAddress called. This |
+| emitAndFinalize | is rarely needed, but can be useful when dealing with |
+| | layers that usually behave lazily if the user wants to |
+| | trigger early compilation (for example, to use idle CPU |
+| | time to eagerly compile code in the background). |
++------------------+-----------------------------------------------------------+
+
+This interface attempts to capture the natural operations of a JIT (with some
+wrinkles like emitAndFinalize for performance), similar to the basic JIT API
+operations we identified in Chapter 1. Conforming to the layer concept allows
+classes to compose neatly by implementing their behaviors in terms of these
+same operations, carried out on the layer below. For example, an eager layer
+(like IRTransformLayer) can implement addModuleSet by running each module in the
+set through its transform up-front and immediately passing the result to the
+layer below. A lazy layer, by contrast, could implement addModuleSet by
+squirreling away the modules and doing no other up-front work, applying the
+transform (and calling addModuleSet on the layer below) when the client calls
+findSymbol instead. The JIT'd program behavior will be the same either way, but
+these choices will have different performance characteristics: Doing work
+eagerly means the JIT takes longer up-front, but proceeds smoothly once this is
+done. Deferring work allows the JIT to get up-and-running quickly, but will
+force the JIT to pause and wait whenever some code or data is needed that hasn't
+already been processed.
+
+Our current REPL is eager: Each function definition is optimized and compiled as
+soon as it's typed in. If we were to make the transform layer lazy (but not
+change things otherwise) we could defer optimization until the first time we
+reference a function in a top-level expression (see if you can figure out why,
+then check out the answer below [1]_). In the next chapter, however, we'll
+introduce fully lazy compilation, in which functions aren't compiled until
+they're first called at run-time. At this point the trade-offs get much more
+interesting: the lazier we are, the quicker we can start executing the first
+function, but the more often we'll have to pause to compile newly encountered
+functions. If we only code-gen lazily, but optimize eagerly, we'll have a slow
+startup (during which everything is optimized) but relatively short pauses as each
+function just passes through code-gen. If we both optimize and code-gen lazily
+we can start executing the first function more quickly, but we'll have longer
+pauses as each function has to be both optimized and code-gen'd when it's first
+executed. Things become even more interesting if we consider interprocedural
+optimizations like inlining, which must be performed eagerly. These are
+complex trade-offs, and there is no one-size-fits-all solution to them, but by
+providing composable layers we leave the decisions to the person implementing
+the JIT, and make it easy for them to experiment with different configurations.
+
+`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with an
+IRTransformLayer added to enable optimization. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
+ :language: c++
+
+.. [1] When we add our top-level expression to the JIT, any calls to functions
+ that we defined earlier will appear to the ObjectLinkingLayer as
+ external symbols. The ObjectLinkingLayer will call the SymbolResolver
+ that we defined in addModuleSet, which in turn calls findSymbol on the
+ OptimizeLayer, at which point even a lazy transform layer will have to
+ do its work.
diff --git a/docs/tutorial/BuildingAJIT3.rst b/docs/tutorial/BuildingAJIT3.rst
new file mode 100644
index 0000000000000..ba0dab91c4ef5
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT3.rst
@@ -0,0 +1,171 @@
+=============================================
+Building a JIT: Per-function Lazy Compilation
+=============================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 3 Introduction
+======================
+
+Welcome to Chapter 3 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter discusses lazy JITing and shows you how to enable it by adding an ORC
+CompileOnDemand layer to the JIT from `Chapter 2 <BuildingAJIT2.html>`_.
+
+Lazy Compilation
+================
+
+When we add a module to the KaleidoscopeJIT class described in Chapter 2 it is
+immediately optimized, compiled and linked for us by the IRTransformLayer,
+IRCompileLayer and ObjectLinkingLayer respectively. This scheme, where all the
+work to make a Module executable is done up front, is relatively simple to
+understand and its performance characteristics are easy to reason about. However,
+it will lead to very high startup times if the amount of code to be compiled is
+large, and may also do a lot of unnecessary compilation if only a few compiled
+functions are ever called at runtime. A truly "just-in-time" compiler should
+allow us to defer the compilation of any given function until the moment that
+function is first called, improving launch times and eliminating redundant work.
+In fact, the ORC APIs provide us with a layer to lazily compile LLVM IR:
+*CompileOnDemandLayer*.
+
+The CompileOnDemandLayer conforms to the layer interface described in Chapter 2,
+but the addModuleSet method behaves quite differently from the layers we have
+seen so far: rather than doing any work up front, it just constructs a *stub*
+for each function in the module and arranges for the stub to trigger compilation
+of the actual function the first time it is called. Because stub functions are
+very cheap to produce, CompileOnDemand's addModuleSet method runs very quickly,
+reducing the time required to launch the first function to be executed, and
+saving us from doing any redundant compilation. By conforming to the layer
+interface, CompileOnDemand can be easily added on top of our existing JIT class.
+We just need a few changes:
+
+.. code-block:: c++
+
+ ...
+ #include "llvm/ExecutionEngine/SectionMemoryManager.h"
+ #include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
+ #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
+ ...
+
+ ...
+ class KaleidoscopeJIT {
+ private:
+ std::unique_ptr<TargetMachine> TM;
+ const DataLayout DL;
+ std::unique_ptr<JITCompileCallbackManager> CompileCallbackManager;
+ ObjectLinkingLayer<> ObjectLayer;
+ IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
+
+ typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
+ OptimizeFunction;
+
+ IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;
+ CompileOnDemandLayer<decltype(OptimizeLayer)> CODLayer;
+
+ public:
+ typedef decltype(CODLayer)::ModuleSetHandleT ModuleHandle;
+
+First we need to include the CompileOnDemandLayer.h header, then add two new
+members: a std::unique_ptr<JITCompileCallbackManager> and a CompileOnDemandLayer,
+to our class. The CompileCallbackManager is a utility that enables us to
+create re-entry points into the compiler for functions that we want to lazily
+compile. In the next chapter we'll be looking at this class in detail, but for
+now we'll be treating it as an opaque utility: We just need to pass a reference
+to it into our new CompileOnDemandLayer, and the layer will do all the work of
+setting up the callbacks using the callback manager we gave it.
+
+.. code-block:: c++
+
+ KaleidoscopeJIT()
+ : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
+ CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
+ OptimizeLayer(CompileLayer,
+ [this](std::unique_ptr<Module> M) {
+ return optimizeModule(std::move(M));
+ }),
+ CompileCallbackManager(
+ orc::createLocalCompileCallbackManager(TM->getTargetTriple(), 0)),
+ CODLayer(OptimizeLayer,
+ [this](Function &F) { return std::set<Function*>({&F}); },
+ *CompileCallbackManager,
+ orc::createLocalIndirectStubsManagerBuilder(
+ TM->getTargetTriple())) {
+ llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
+ }
+
+Next we have to update our constructor to initialize the new members. To create
+an appropriate compile callback manager we use the
+createLocalCompileCallbackManager function, which takes a target triple and a
+TargetAddress to call if it receives a request to compile an unknown function.
+In our simple JIT this situation is unlikely to come up, so we'll cheat and
+just pass '0' here. In a production quality JIT you could give the address of a
+function that throws an exception in order to unwind the JIT'd code stack.
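+
+A sketch of that production-quality alternative (hypothetical -- the tutorial
+itself just passes '0'):
+
+.. code-block:: c++
+
+  // Hypothetical handler for requests to compile unknown functions. A real
+  // JIT might instead throw an exception to unwind the JIT'd code stack.
+  extern "C" void handleLazyCompileFailure() {
+    fprintf(stderr, "could not compile a lazily JIT'd function\n");
+    exit(1);
+  }
+
+  // ...and in the constructor, pass its address instead of 0:
+  CompileCallbackManager(orc::createLocalCompileCallbackManager(
+      TM->getTargetTriple(),
+      static_cast<TargetAddress>(
+          reinterpret_cast<uintptr_t>(&handleLazyCompileFailure)))),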
+
+Now we can construct our CompileOnDemandLayer. Following the pattern from
+previous layers we start by passing a reference to the next layer down in our
+stack -- the OptimizeLayer. Next we need to supply a 'partitioning function':
+when a not-yet-compiled function is called, the CompileOnDemandLayer will call
+this function to ask us what we would like to compile. At a minimum we need to
+compile the function being called (given by the argument to the partitioning
+function), but we could also request that the CompileOnDemandLayer compile other
+functions that are unconditionally called (or highly likely to be called) from
+the function being called. For KaleidoscopeJIT we'll keep it simple and just
+request compilation of the function that was called. Next we pass a reference to
+our CompileCallbackManager. Finally, we need to supply an "indirect stubs
+manager builder". This is a function that constructs IndirectStubManagers, which
+are in turn used to build the stubs for each module. The CompileOnDemandLayer
+will call the indirect stub manager builder once for each call to addModuleSet,
+and use the resulting indirect stubs manager to create stubs for all functions
+in all modules added. If/when the module set is removed from the JIT the
+indirect stubs manager will be deleted, freeing any memory allocated to the
+stubs. We supply this function by using the
+createLocalIndirectStubsManagerBuilder utility.
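+
+For example (a sketch of an alternative that we don't use here), a
+partitioning function that requested compilation of a called function's
+entire module would look like this:
+
+.. code-block:: c++
+
+  // Hypothetical: when any function is first called, compile every function
+  // defined in the same module as it.
+  [](Function &F) {
+    std::set<Function*> FnSet;
+    for (auto &G : *F.getParent())
+      if (!G.isDeclaration())
+        FnSet.insert(&G);
+    return FnSet;
+  }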
+
+.. code-block:: c++
+
+ // ...
+ if (auto Sym = CODLayer.findSymbol(Name, false))
+ // ...
+ return CODLayer.addModuleSet(std::move(Ms),
+ make_unique<SectionMemoryManager>(),
+ std::move(Resolver));
+ // ...
+
+ // ...
+ return CODLayer.findSymbol(MangledNameStream.str(), true);
+ // ...
+
+ // ...
+ CODLayer.removeModuleSet(H);
+ // ...
+
+Finally, we need to replace the references to OptimizeLayer in our addModule,
+findSymbol, and removeModule methods. With that, we're up and running.
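+
+To see the effect (a sketch of the expected behavior, not code from the
+tutorial): looking up a function still succeeds immediately, but the address
+returned is that of a stub, and the function body is only optimized and
+compiled the first time the stub is called:
+
+.. code-block:: c++
+
+  auto H = J.addModule(std::move(M));  // Cheap: only builds stubs.
+  auto FooSym = J.findSymbol("foo");   // Finds the stub for foo.
+  auto *Foo = (double (*)())(intptr_t)FooSym.getAddress();
+  double R = Foo();  // First call triggers optimization and compilation.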
+
+**To be done:**
+
+**Discuss CompileCallbackManagers and IndirectStubManagers in more detail.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with a CompileOnDemand
+layer added to enable lazy function-at-a-time compilation. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter3/KaleidoscopeJIT.h
+ :language: c++
+
+`Next: Extreme Laziness -- Using Compile Callbacks to JIT directly from ASTs <BuildingAJIT4.html>`_
diff --git a/docs/tutorial/BuildingAJIT4.rst b/docs/tutorial/BuildingAJIT4.rst
new file mode 100644
index 0000000000000..39d9198a85c3d
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT4.rst
@@ -0,0 +1,48 @@
+===========================================================================
+Building a JIT: Extreme Laziness - Using Compile Callbacks to JIT from ASTs
+===========================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the Compile Callbacks and Indirect Stubs APIs and shows how
+they can be used to replace the CompileOnDemand layer from
+`Chapter 3 <BuildingAJIT3.html>`_ with a custom lazy-JITing scheme that JITs
+directly from Kaleidoscope ASTs.
+
+**To be done:**
+
+**(1) Describe the drawbacks of JITing from IR (have to compile to IR first,
+which reduces the benefits of laziness).**
+
+**(2) Describe CompileCallbackManagers and IndirectStubManagers in detail.**
+
+**(3) Run through the implementation of addFunctionAST.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example that JITs lazily from
+Kaleidoscope ASTs. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter4/KaleidoscopeJIT.h
+ :language: c++
+
+`Next: Remote-JITing -- Process-isolation and laziness-at-a-distance <BuildingAJIT5.html>`_
diff --git a/docs/tutorial/BuildingAJIT5.rst b/docs/tutorial/BuildingAJIT5.rst
new file mode 100644
index 0000000000000..94ea92ce5ad2b
--- /dev/null
+++ b/docs/tutorial/BuildingAJIT5.rst
@@ -0,0 +1,55 @@
+=============================================================================
+Building a JIT: Remote-JITing -- Process Isolation and Laziness at a Distance
+=============================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the ORC RemoteJIT Client/Server APIs and shows how to use
+them to build a JIT stack that will execute its code via a communications
+channel with a different process. This can be a separate process on the same
+machine, a process on a different machine, or even a process on a different
+platform/architecture. The code builds on top of the lazy-AST-compiling JIT
+stack from `Chapter 4 <BuildingAJIT4.html>`_.
+
+**To be done -- this is going to be a long one:**
+
+**(1) Introduce channels, RPC, RemoteJIT Client and Server APIs**
+
+**(2) Describe the client code in greater detail. Discuss modifications of the
+KaleidoscopeJIT class, and the REPL itself.**
+
+**(3) Describe the server code.**
+
+**(4) Describe how to run the demo.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example that JITs lazily from
+Kaleidoscope ASTs. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orc native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code for the modified KaleidoscopeJIT:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/KaleidoscopeJIT.h
+ :language: c++
+
+And the code for the JIT server:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/Server/server.cpp
+ :language: c++
diff --git a/docs/tutorial/LangImpl1.rst b/docs/tutorial/LangImpl01.rst
index b04cde10274e0..f7fbd150ef11a 100644
--- a/docs/tutorial/LangImpl1.rst
+++ b/docs/tutorial/LangImpl01.rst
@@ -42,45 +42,48 @@ in the various pieces. The structure of the tutorial is:
to implement everything in C++ instead of using lexer and parser
generators. LLVM obviously works just fine with such tools, feel free
to use one if you prefer.
-- `Chapter #2 <LangImpl2.html>`_: Implementing a Parser and AST -
+- `Chapter #2 <LangImpl02.html>`_: Implementing a Parser and AST -
With the lexer in place, we can talk about parsing techniques and
basic AST construction. This tutorial describes recursive descent
parsing and operator precedence parsing. Nothing in Chapters 1 or 2
is LLVM-specific, the code doesn't even link in LLVM at this point.
:)
-- `Chapter #3 <LangImpl3.html>`_: Code generation to LLVM IR - With
+- `Chapter #3 <LangImpl03.html>`_: Code generation to LLVM IR - With
the AST ready, we can show off how easy generation of LLVM IR really
is.
-- `Chapter #4 <LangImpl4.html>`_: Adding JIT and Optimizer Support
+- `Chapter #4 <LangImpl04.html>`_: Adding JIT and Optimizer Support
- Because a lot of people are interested in using LLVM as a JIT,
we'll dive right into it and show you the 3 lines it takes to add JIT
support. LLVM is also useful in many other ways, but this is one
simple and "sexy" way to show off its power. :)
-- `Chapter #5 <LangImpl5.html>`_: Extending the Language: Control
+- `Chapter #5 <LangImpl05.html>`_: Extending the Language: Control
Flow - With the language up and running, we show how to extend it
with control flow operations (if/then/else and a 'for' loop). This
gives us a chance to talk about simple SSA construction and control
flow.
-- `Chapter #6 <LangImpl6.html>`_: Extending the Language:
+- `Chapter #6 <LangImpl06.html>`_: Extending the Language:
User-defined Operators - This is a silly but fun chapter that talks
about extending the language to let the user program define their own
arbitrary unary and binary operators (with assignable precedence!).
This lets us build a significant piece of the "language" as library
routines.
-- `Chapter #7 <LangImpl7.html>`_: Extending the Language: Mutable
+- `Chapter #7 <LangImpl07.html>`_: Extending the Language: Mutable
Variables - This chapter talks about adding user-defined local
variables along with an assignment operator. The interesting part
about this is how easy and trivial it is to construct SSA form in
LLVM: no, LLVM does *not* require your front-end to construct SSA
form!
-- `Chapter #8 <LangImpl8.html>`_: Extending the Language: Debug
+- `Chapter #8 <LangImpl08.html>`_: Compiling to Object Files - This
+ chapter explains how to take LLVM IR and compile it down to object
+ files.
+- `Chapter #9 <LangImpl09.html>`_: Extending the Language: Debug
Information - Having built a decent little programming language with
control flow, functions and mutable variables, we consider what it
takes to add debug information to standalone executables. This debug
information will allow you to set breakpoints in Kaleidoscope
functions, print out argument variables, and call functions - all
from within the debugger!
-- `Chapter #9 <LangImpl8.html>`_: Conclusion and other useful LLVM
+- `Chapter #10 <LangImpl10.html>`_: Conclusion and other useful LLVM
tidbits - This chapter wraps up the series by talking about
potential ways to extend the language, but also includes a bunch of
pointers to info about "special topics" like adding garbage
@@ -146,7 +149,7 @@ useful for mutually recursive functions). For example:
A more interesting example is included in Chapter 6 where we write a
little Kaleidoscope application that `displays a Mandelbrot
-Set <LangImpl6.html#kicking-the-tires>`_ at various levels of magnification.
+Set <LangImpl06.html#kicking-the-tires>`_ at various levels of magnification.
Let's dive into the implementation of this language!
@@ -280,11 +283,11 @@ file. These are handled with this code:
}
With this, we have the complete lexer for the basic Kaleidoscope
-language (the `full code listing <LangImpl2.html#full-code-listing>`_ for the Lexer
-is available in the `next chapter <LangImpl2.html>`_ of the tutorial).
+language (the `full code listing <LangImpl02.html#full-code-listing>`_ for the Lexer
+is available in the `next chapter <LangImpl02.html>`_ of the tutorial).
Next we'll `build a simple parser that uses this to build an Abstract
-Syntax Tree <LangImpl2.html>`_. When we have that, we'll include a
+Syntax Tree <LangImpl02.html>`_. When we have that, we'll include a
driver so that you can use the lexer and parser together.
-`Next: Implementing a Parser and AST <LangImpl2.html>`_
+`Next: Implementing a Parser and AST <LangImpl02.html>`_
diff --git a/docs/tutorial/LangImpl2.rst b/docs/tutorial/LangImpl02.rst
index dab60172b9882..701cbc9611363 100644
--- a/docs/tutorial/LangImpl2.rst
+++ b/docs/tutorial/LangImpl02.rst
@@ -176,17 +176,17 @@ be parsed.
.. code-block:: c++
- /// Error* - These are little helper functions for error handling.
- std::unique_ptr<ExprAST> Error(const char *Str) {
- fprintf(stderr, "Error: %s\n", Str);
+ /// LogError* - These are little helper functions for error handling.
+ std::unique_ptr<ExprAST> LogError(const char *Str) {
+ fprintf(stderr, "LogError: %s\n", Str);
return nullptr;
}
- std::unique_ptr<PrototypeAST> ErrorP(const char *Str) {
- Error(Str);
+ std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) {
+ LogError(Str);
return nullptr;
}
-The ``Error`` routines are simple helper routines that our parser will
+The ``LogError`` routines are simple helper routines that our parser will
use to handle errors. The error recovery in our parser will not be the
best and is not particularly user-friendly, but it will be enough for our
tutorial. These routines make it easier to handle errors in routines
@@ -233,7 +233,7 @@ the parenthesis operator is defined like this:
return nullptr;
if (CurTok != ')')
- return Error("expected ')'");
+ return LogError("expected ')'");
getNextToken(); // eat ).
return V;
}
@@ -241,7 +241,7 @@ the parenthesis operator is defined like this:
This function illustrates a number of interesting things about the
parser:
-1) It shows how we use the Error routines. When called, this function
+1) It shows how we use the LogError routines. When called, this function
expects that the current token is a '(' token, but after parsing the
subexpression, it is possible that there is no ')' waiting. For example,
if the user types in "(4 x" instead of "(4)", the parser should emit an
@@ -288,7 +288,7 @@ function calls:
break;
if (CurTok != ',')
- return Error("Expected ')' or ',' in argument list");
+ return LogError("Expected ')' or ',' in argument list");
getNextToken();
}
}
@@ -324,7 +324,7 @@ primary expression, we need to determine what sort of expression it is:
static std::unique_ptr<ExprAST> ParsePrimary() {
switch (CurTok) {
default:
- return Error("unknown token when expecting an expression");
+ return LogError("unknown token when expecting an expression");
case tok_identifier:
return ParseIdentifierExpr();
case tok_number:
@@ -571,20 +571,20 @@ expressions):
/// ::= id '(' id* ')'
static std::unique_ptr<PrototypeAST> ParsePrototype() {
if (CurTok != tok_identifier)
- return ErrorP("Expected function name in prototype");
+ return LogErrorP("Expected function name in prototype");
std::string FnName = IdentifierStr;
getNextToken();
if (CurTok != '(')
- return ErrorP("Expected '(' in prototype");
+ return LogErrorP("Expected '(' in prototype");
// Read the list of argument names.
std::vector<std::string> ArgNames;
while (getNextToken() == tok_identifier)
ArgNames.push_back(IdentifierStr);
if (CurTok != ')')
- return ErrorP("Expected ')' in prototype");
+ return LogErrorP("Expected ')' in prototype");
// success.
getNextToken(); // eat ')'.
@@ -731,5 +731,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter2/toy.cpp
:language: c++
-`Next: Implementing Code Generation to LLVM IR <LangImpl3.html>`_
+`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_
diff --git a/docs/tutorial/LangImpl3.rst b/docs/tutorial/LangImpl03.rst
index 83ad35f14aeea..2bb3a300026e0 100644
--- a/docs/tutorial/LangImpl3.rst
+++ b/docs/tutorial/LangImpl03.rst
@@ -67,26 +67,26 @@ way to model this. Again, this tutorial won't dwell on good software
engineering practices: for our purposes, adding a virtual method is
simplest.
-The second thing we want is an "Error" method like we used for the
+The second thing we want is a "LogError" method like we used for the
parser, which will be used to report errors found during code generation
(for example, use of an undeclared parameter):
.. code-block:: c++
- static std::unique_ptr<Module> *TheModule;
- static IRBuilder<> Builder(getGlobalContext());
- static std::map<std::string, Value*> NamedValues;
+ static LLVMContext TheContext;
+ static IRBuilder<> Builder(TheContext);
+ static std::unique_ptr<Module> TheModule;
+ static std::map<std::string, Value *> NamedValues;
- Value *ErrorV(const char *Str) {
- Error(Str);
+ Value *LogErrorV(const char *Str) {
+ LogError(Str);
return nullptr;
}
-The static variables will be used during code generation. ``TheModule``
-is an LLVM construct that contains functions and global variables. In many
-ways, it is the top-level structure that the LLVM IR uses to contain code.
-It will own the memory for all of the IR that we generate, which is why
-the codegen() method returns a raw Value\*, rather than a unique_ptr<Value>.
+The static variables will be used during code generation. ``TheContext``
+is an opaque object that owns a lot of core LLVM data structures, such as
+the type and constant value tables. We don't need to understand it in
+detail; we just need a single instance to pass into APIs that require it.
The ``Builder`` object is a helper object that makes it easy to generate
LLVM instructions. Instances of the
@@ -94,6 +94,12 @@ LLVM instructions. Instances of the
class template keep track of the current place to insert instructions
and have methods to create new instructions.
+``TheModule`` is an LLVM construct that contains functions and global
+variables. In many ways, it is the top-level structure that the LLVM IR
+uses to contain code. It will own the memory for all of the IR that we
+generate, which is why the codegen() method returns a raw Value\*,
+rather than a unique_ptr<Value>.
+
The ``NamedValues`` map keeps track of which values are defined in the
current scope and what their LLVM representation is. (In other words, it
is a symbol table for the code). In this form of Kaleidoscope, the only
@@ -116,7 +122,7 @@ First we'll do numeric literals:
.. code-block:: c++
Value *NumberExprAST::codegen() {
- return ConstantFP::get(getGlobalContext(), APFloat(Val));
+ return ConstantFP::get(TheContext, APFloat(Val));
}
In the LLVM IR, numeric constants are represented with the
@@ -133,7 +139,7 @@ are all uniqued together and shared. For this reason, the API uses the
// Look this variable up in the function.
Value *V = NamedValues[Name];
if (!V)
- ErrorV("Unknown variable name");
+ LogErrorV("Unknown variable name");
return V;
}
@@ -165,10 +171,10 @@ variables <LangImpl7.html#user-defined-local-variables>`_.
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
"booltmp");
default:
- return ErrorV("invalid binary operator");
+ return LogErrorV("invalid binary operator");
}
}
@@ -214,11 +220,11 @@ would return 0.0 and -1.0, depending on the input value.
// Look up the name in the global module table.
Function *CalleeF = TheModule->getFunction(Callee);
if (!CalleeF)
- return ErrorV("Unknown function referenced");
+ return LogErrorV("Unknown function referenced");
// If argument mismatch error.
if (CalleeF->arg_size() != Args.size())
- return ErrorV("Incorrect # arguments passed");
+ return LogErrorV("Incorrect # arguments passed");
std::vector<Value *> ArgsV;
for (unsigned i = 0, e = Args.size(); i != e; ++i) {
@@ -264,9 +270,9 @@ with:
Function *PrototypeAST::codegen() {
// Make the function type: double(double,double) etc.
std::vector<Type*> Doubles(Args.size(),
- Type::getDoubleTy(getGlobalContext()));
+ Type::getDoubleTy(TheContext));
FunctionType *FT =
- FunctionType::get(Type::getDoubleTy(getGlobalContext()), Doubles, false);
+ FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);
Function *F =
Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
@@ -328,7 +334,7 @@ codegen and attach a function body.
return nullptr;
if (!TheFunction->empty())
- return (Function*)ErrorV("Function cannot be redefined.");
+ return (Function*)LogErrorV("Function cannot be redefined.");
For function definitions, we start by searching TheModule's symbol table for an
@@ -340,7 +346,7 @@ assert that the function is empty (i.e. has no body yet) before we start.
.. code-block:: c++
// Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
Builder.SetInsertPoint(BB);
// Record the function arguments in the NamedValues map.
@@ -557,5 +563,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp
:language: c++
-`Next: Adding JIT and Optimizer Support <LangImpl4.html>`_
+`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_
diff --git a/docs/tutorial/LangImpl4.rst b/docs/tutorial/LangImpl04.rst
index a671d0c37f9d8..78596cd8eee5d 100644
--- a/docs/tutorial/LangImpl4.rst
+++ b/docs/tutorial/LangImpl04.rst
@@ -131,7 +131,8 @@ for us:
void InitializeModuleAndPassManager(void) {
// Open a new module.
- TheModule = llvm::make_unique<Module>("my cool jit", getGlobalContext());
+ TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
// Create a new pass manager attached to it.
@@ -605,5 +606,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter4/toy.cpp
:language: c++
-`Next: Extending the language: control flow <LangImpl5.html>`_
+`Next: Extending the language: control flow <LangImpl05.html>`_
diff --git a/docs/tutorial/LangImpl5-cfg.png b/docs/tutorial/LangImpl05-cfg.png
index cdba92ff6c5c9..cdba92ff6c5c9 100644
--- a/docs/tutorial/LangImpl5-cfg.png
+++ b/docs/tutorial/LangImpl05-cfg.png
Binary files differ
diff --git a/docs/tutorial/LangImpl5.rst b/docs/tutorial/LangImpl05.rst
index d916f92bf99e9..ae0935d9ba1f9 100644
--- a/docs/tutorial/LangImpl5.rst
+++ b/docs/tutorial/LangImpl05.rst
@@ -127,7 +127,7 @@ First we define a new parsing function:
return nullptr;
if (CurTok != tok_then)
- return Error("expected then");
+ return LogError("expected then");
getNextToken(); // eat the then
auto Then = ParseExpression();
@@ -135,7 +135,7 @@ First we define a new parsing function:
return nullptr;
if (CurTok != tok_else)
- return Error("expected else");
+ return LogError("expected else");
getNextToken();
@@ -154,7 +154,7 @@ Next we hook it up as a primary expression:
static std::unique_ptr<ExprAST> ParsePrimary() {
switch (CurTok) {
default:
- return Error("unknown token when expecting an expression");
+ return LogError("unknown token when expecting an expression");
case tok_identifier:
return ParseIdentifierExpr();
case tok_number:
@@ -217,7 +217,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
-.. figure:: LangImpl5-cfg.png
+.. figure:: LangImpl05-cfg.png
:align: center
:alt: Example CFG
@@ -292,7 +292,7 @@ for ``IfExprAST``:
// Convert condition to a bool by comparing equal to 0.0.
CondV = Builder.CreateFCmpONE(
- CondV, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "ifcond");
+ CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");
This code is straightforward and similar to what we saw before. We emit
the expression for the condition, then compare that value to zero to get
@@ -305,9 +305,9 @@ a truth value as a 1-bit (bool) value.
// Create blocks for the then and else cases. Insert the 'then' block at the
// end of the function.
BasicBlock *ThenBB =
- BasicBlock::Create(getGlobalContext(), "then", TheFunction);
- BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
- BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
+ BasicBlock::Create(TheContext, "then", TheFunction);
+ BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
+ BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");
Builder.CreateCondBr(CondV, ThenBB, ElseBB);
@@ -400,7 +400,7 @@ code:
TheFunction->getBasicBlockList().push_back(MergeBB);
Builder.SetInsertPoint(MergeBB);
PHINode *PN =
- Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, "iftmp");
+ Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
PN->addIncoming(ThenV, ThenBB);
PN->addIncoming(ElseV, ElseBB);
@@ -518,13 +518,13 @@ value to null in the AST node:
getNextToken(); // eat the for.
if (CurTok != tok_identifier)
- return Error("expected identifier after for");
+ return LogError("expected identifier after for");
std::string IdName = IdentifierStr;
getNextToken(); // eat identifier.
if (CurTok != '=')
- return Error("expected '=' after for");
+ return LogError("expected '=' after for");
getNextToken(); // eat '='.
@@ -532,7 +532,7 @@ value to null in the AST node:
if (!Start)
return nullptr;
if (CurTok != ',')
- return Error("expected ',' after for start value");
+ return LogError("expected ',' after for start value");
getNextToken();
auto End = ParseExpression();
@@ -549,7 +549,7 @@ value to null in the AST node:
}
if (CurTok != tok_in)
- return Error("expected 'in' after for");
+ return LogError("expected 'in' after for");
getNextToken(); // eat 'in'.
auto Body = ParseExpression();
@@ -625,7 +625,7 @@ expression).
Function *TheFunction = Builder.GetInsertBlock()->getParent();
BasicBlock *PreheaderBB = Builder.GetInsertBlock();
BasicBlock *LoopBB =
- BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
+ BasicBlock::Create(TheContext, "loop", TheFunction);
// Insert an explicit fall through from the current block to the LoopBB.
Builder.CreateBr(LoopBB);
@@ -642,7 +642,7 @@ the two blocks.
Builder.SetInsertPoint(LoopBB);
// Start the PHI node with an entry for Start.
- PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
+ PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(TheContext),
2, VarName.c_str());
Variable->addIncoming(StartVal, PreheaderBB);
@@ -693,7 +693,7 @@ table.
return nullptr;
} else {
// If not specified, use 1.0.
- StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
+ StepVal = ConstantFP::get(TheContext, APFloat(1.0));
}
Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
@@ -712,7 +712,7 @@ iteration of the loop.
// Convert condition to a bool by comparing equal to 0.0.
EndCond = Builder.CreateFCmpONE(
- EndCond, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "loopcond");
+ EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");
Finally, we evaluate the exit value of the loop, to determine whether
the loop should exit. This mirrors the condition evaluation for the
@@ -723,7 +723,7 @@ if/then/else statement.
// Create the "after loop" block and insert it.
BasicBlock *LoopEndBB = Builder.GetInsertBlock();
BasicBlock *AfterBB =
- BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
+ BasicBlock::Create(TheContext, "afterloop", TheFunction);
// Insert the conditional branch into the end of LoopEndBB.
Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
@@ -751,7 +751,7 @@ insertion position to it.
NamedValues.erase(VarName);
// for expr always returns 0.0.
- return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
+ return Constant::getNullValue(Type::getDoubleTy(TheContext));
}
The final code handles various cleanups: now that we have the "NextVar"
@@ -786,5 +786,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter5/toy.cpp
:language: c++
-`Next: Extending the language: user-defined operators <LangImpl6.html>`_
+`Next: Extending the language: user-defined operators <LangImpl06.html>`_
diff --git a/docs/tutorial/LangImpl6.rst b/docs/tutorial/LangImpl06.rst
index 827cd392effbb..7c9a2123e8f38 100644
--- a/docs/tutorial/LangImpl6.rst
+++ b/docs/tutorial/LangImpl06.rst
@@ -176,7 +176,7 @@ user-defined operator, we need to parse it:
switch (CurTok) {
default:
- return ErrorP("Expected function name in prototype");
+ return LogErrorP("Expected function name in prototype");
case tok_identifier:
FnName = IdentifierStr;
Kind = 0;
@@ -185,7 +185,7 @@ user-defined operator, we need to parse it:
case tok_binary:
getNextToken();
if (!isascii(CurTok))
- return ErrorP("Expected binary operator");
+ return LogErrorP("Expected binary operator");
FnName = "binary";
FnName += (char)CurTok;
Kind = 2;
@@ -194,7 +194,7 @@ user-defined operator, we need to parse it:
// Read the precedence if present.
if (CurTok == tok_number) {
if (NumVal < 1 || NumVal > 100)
- return ErrorP("Invalid precedecnce: must be 1..100");
+ return LogErrorP("Invalid precedecnce: must be 1..100");
BinaryPrecedence = (unsigned)NumVal;
getNextToken();
}
@@ -202,20 +202,20 @@ user-defined operator, we need to parse it:
}
if (CurTok != '(')
- return ErrorP("Expected '(' in prototype");
+ return LogErrorP("Expected '(' in prototype");
std::vector<std::string> ArgNames;
while (getNextToken() == tok_identifier)
ArgNames.push_back(IdentifierStr);
if (CurTok != ')')
- return ErrorP("Expected ')' in prototype");
+ return LogErrorP("Expected ')' in prototype");
// success.
getNextToken(); // eat ')'.
// Verify right number of names for operator.
if (Kind && ArgNames.size() != Kind)
- return ErrorP("Invalid number of operands for operator");
+ return LogErrorP("Invalid number of operands for operator");
return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames), Kind != 0,
BinaryPrecedence);
@@ -251,7 +251,7 @@ default case for our existing binary operator node:
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
"booltmp");
default:
break;
@@ -288,7 +288,7 @@ The final piece of code we are missing, is a bit of top-level magic:
BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
// Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
Builder.SetInsertPoint(BB);
if (Value *RetVal = Body->codegen()) {
@@ -403,7 +403,7 @@ operator code above with:
switch (CurTok) {
default:
- return ErrorP("Expected function name in prototype");
+ return LogErrorP("Expected function name in prototype");
case tok_identifier:
FnName = IdentifierStr;
Kind = 0;
@@ -412,7 +412,7 @@ operator code above with:
case tok_unary:
getNextToken();
if (!isascii(CurTok))
- return ErrorP("Expected unary operator");
+ return LogErrorP("Expected unary operator");
FnName = "unary";
FnName += (char)CurTok;
Kind = 1;
@@ -435,7 +435,7 @@ unary operators. It looks like this:
Function *F = TheModule->getFunction(std::string("unary")+Opcode);
if (!F)
- return ErrorV("Unknown unary operator");
+ return LogErrorV("Unknown unary operator");
return Builder.CreateCall(F, OperandV, "unop");
}
@@ -546,17 +546,17 @@ converge:
# Determine whether the specific location diverges.
# Solve for z = z^2 + c in the complex plane.
- def mandleconverger(real imag iters creal cimag)
+ def mandelconverger(real imag iters creal cimag)
if iters > 255 | (real*real + imag*imag > 4) then
iters
else
- mandleconverger(real*real - imag*imag + creal,
+ mandelconverger(real*real - imag*imag + creal,
2*real*imag + cimag,
iters+1, creal, cimag);
# Return the number of iterations required for the iteration to escape
- def mandleconverge(real imag)
- mandleconverger(real, imag, 0, real, imag);
+ def mandelconverge(real imag)
+ mandelconverger(real, imag, 0, real, imag);
This "``z = z2 + c``" function is a beautiful little creature that is
the basis for computation of the `Mandelbrot
@@ -570,12 +570,12 @@ but we can whip together something using the density plotter above:
::
- # Compute and plot the mandlebrot set with the specified 2 dimensional range
+ # Compute and plot the mandelbrot set with the specified 2 dimensional range
# info.
def mandelhelp(xmin xmax xstep ymin ymax ystep)
for y = ymin, y < ymax, ystep in (
(for x = xmin, x < xmax, xstep in
- printdensity(mandleconverge(x,y)))
+ printdensity(mandelconverge(x,y)))
: putchard(10)
)
@@ -585,7 +585,7 @@ but we can whip together something using the density plotter above:
mandelhelp(realstart, realstart+realmag*78, realmag,
imagstart, imagstart+imagmag*40, imagmag);
-Given this, we can try plotting out the mandlebrot set! Lets try it out:
+Given this, we can try plotting out the Mandelbrot set! Let's try it out:
::
@@ -764,5 +764,5 @@ Here is the code:
:language: c++
`Next: Extending the language: mutable variables / SSA
-construction <LangImpl7.html>`_
+construction <LangImpl07.html>`_
diff --git a/docs/tutorial/LangImpl7.rst b/docs/tutorial/LangImpl07.rst
index 1cd7d56fddb4b..4d86ecad38aaa 100644
--- a/docs/tutorial/LangImpl7.rst
+++ b/docs/tutorial/LangImpl07.rst
@@ -224,7 +224,7 @@ variables in certain circumstances:
class <../LangRef.html#first-class-types>`_ values (such as pointers,
scalars and vectors), and only if the array size of the allocation is
1 (or missing in the .ll file). mem2reg is not capable of promoting
- structs or arrays to registers. Note that the "scalarrepl" pass is
+ structs or arrays to registers. Note that the "sroa" pass is
more powerful and can promote structs, "unions", and arrays in many
cases.
@@ -252,13 +252,13 @@ is:
technique dovetails very naturally with this style of debug info.
If nothing else, this makes it much easier to get your front-end up and
-running, and is very simple to implement. Lets extend Kaleidoscope with
+running, and is very simple to implement. Let's extend Kaleidoscope with
mutable variables now!
Mutable Variables in Kaleidoscope
=================================
-Now that we know the sort of problem we want to tackle, lets see what
+Now that we know the sort of problem we want to tackle, let's see what
this looks like in the context of our little Kaleidoscope language.
We're going to add two features:
@@ -306,7 +306,7 @@ Adjusting Existing Variables for Mutation
The symbol table in Kaleidoscope is managed at code generation time by
the '``NamedValues``' map. This map currently keeps track of the LLVM
"Value\*" that holds the double value for the named variable. In order
-to support mutation, we need to change this slightly, so that it
+to support mutation, we need to change this slightly, so that
``NamedValues`` holds the *memory location* of the variable in question.
Note that this change is a refactoring: it changes the structure of the
code, but does not (by itself) change the behavior of the compiler. All
@@ -339,7 +339,7 @@ the function:
const std::string &VarName) {
IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
TheFunction->getEntryBlock().begin());
- return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
+ return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
VarName.c_str());
}
@@ -359,7 +359,7 @@ from the stack slot:
// Look this variable up in the function.
Value *V = NamedValues[Name];
if (!V)
- return ErrorV("Unknown variable name");
+ return LogErrorV("Unknown variable name");
// Load the value.
return Builder.CreateLoad(V, Name.c_str());
@@ -578,7 +578,7 @@ implement codegen for the assignment operator. This looks like:
// Assignment requires the LHS to be an identifier.
VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
if (!LHSE)
- return ErrorV("destination of '=' must be a variable");
+ return LogErrorV("destination of '=' must be a variable");
Unlike the rest of the binary operators, our assignment operator doesn't
follow the "emit LHS, emit RHS, do computation" model. As such, it is
@@ -597,7 +597,7 @@ allowed.
// Look up the name.
Value *Variable = NamedValues[LHSE->getName()];
if (!Variable)
- return ErrorV("Unknown variable name");
+ return LogErrorV("Unknown variable name");
Builder.CreateStore(Val, Variable);
return Val;
@@ -632,7 +632,7 @@ When run, this example prints "123" and then "4", showing that we did
actually mutate the value! Okay, we have now officially implemented our
goal: getting this to work requires SSA construction in the general
case. However, to be really useful, we want the ability to define our
-own local variables, lets add this next!
+own local variables, let's add this next!
User-defined Local Variables
============================
@@ -703,7 +703,7 @@ do is add it as a primary expression:
static std::unique_ptr<ExprAST> ParsePrimary() {
switch (CurTok) {
default:
- return Error("unknown token when expecting an expression");
+ return LogError("unknown token when expecting an expression");
case tok_identifier:
return ParseIdentifierExpr();
case tok_number:
@@ -732,7 +732,7 @@ Next we define ParseVarExpr:
// At least one variable name is required.
if (CurTok != tok_identifier)
- return Error("expected identifier after var");
+ return LogError("expected identifier after var");
The first part of this code parses the list of identifier/expr pairs
into the local ``VarNames`` vector.
@@ -759,7 +759,7 @@ into the local ``VarNames`` vector.
getNextToken(); // eat the ','.
if (CurTok != tok_identifier)
- return Error("expected identifier list after var");
+ return LogError("expected identifier list after var");
}
Once all the variables are parsed, we then parse the body and create the
@@ -769,7 +769,7 @@ AST node:
// At this point, we have to have 'in'.
if (CurTok != tok_in)
- return Error("expected 'in' keyword after 'var'");
+ return LogError("expected 'in' keyword after 'var'");
getNextToken(); // eat 'in'.
auto Body = ParseExpression();
@@ -812,7 +812,7 @@ previous value that we replace in OldBindings.
if (!InitVal)
return nullptr;
} else { // If not specified, use 0.0.
- InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
+ InitVal = ConstantFP::get(TheContext, APFloat(0.0));
}
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
@@ -877,5 +877,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
:language: c++
-`Next: Adding Debug Information <LangImpl8.html>`_
+`Next: Compiling to Object Code <LangImpl08.html>`_
diff --git a/docs/tutorial/LangImpl08.rst b/docs/tutorial/LangImpl08.rst
new file mode 100644
index 0000000000000..96eccaebd3295
--- /dev/null
+++ b/docs/tutorial/LangImpl08.rst
@@ -0,0 +1,218 @@
+========================================
+ Kaleidoscope: Compiling to Object Code
+========================================
+
+.. contents::
+ :local:
+
+Chapter 8 Introduction
+======================
+
+Welcome to Chapter 8 of the "`Implementing a language with LLVM
+<index.html>`_" tutorial. This chapter describes how to compile our
+language down to object files.
+
+Choosing a target
+=================
+
+LLVM has native support for cross-compilation. You can compile to the
+architecture of your current machine, or just as easily compile for
+other architectures. In this tutorial, we'll target the current
+machine.
+
+To specify the architecture that you want to target, we use a string
+called a "target triple". This takes the form
+``<arch><sub>-<vendor>-<sys>-<abi>`` (see the `cross compilation docs
+<http://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_).
+
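+For instance, in the triple ``armv7a-unknown-linux-gnueabihf``, the
+architecture is ``armv7a``, the vendor is ``unknown``, the system is
+``linux``, and the ABI is ``gnueabihf``.
+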
+As an example, we can see what clang thinks is our current target
+triple:
+
+::
+
+ $ clang --version | grep Target
+ Target: x86_64-unknown-linux-gnu
+
+Running this command may show something different on your machine, as
+you might be using a different architecture or operating system.
+
+Fortunately, we don't need to hard-code a target triple to target the
+current machine. LLVM provides ``sys::getDefaultTargetTriple``, which
+returns the target triple of the current machine.
+
+.. code-block:: c++
+
+ auto TargetTriple = sys::getDefaultTargetTriple();
+
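+If we wanted to cross-compile, we could just as easily hard-code a
+triple here instead. A sketch (the target here is hypothetical;
+substitute whichever triple you need):
+
+.. code-block:: c++
+
+ auto TargetTriple = std::string("armv7a-unknown-linux-gnueabihf");
+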
+LLVM doesn't require us to link in all the target
+functionality. For example, if we're just using the JIT, we don't need
+the assembly printers. Similarly, if we're only targeting certain
+architectures, we can link in just the functionality for those
+architectures.
+
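+A JIT that only ever runs code on the host, for instance, could get
+away with initializing just the native target. A minimal sketch, using
+the helpers from ``llvm/Support/TargetSelect.h``:
+
+.. code-block:: c++
+
+ InitializeNativeTarget();
+ InitializeNativeTargetAsmPrinter();
+ InitializeNativeTargetAsmParser();
+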
+For this example, we'll initialize all the targets for emitting object
+code.
+
+.. code-block:: c++
+
+ InitializeAllTargetInfos();
+ InitializeAllTargets();
+ InitializeAllTargetMCs();
+ InitializeAllAsmParsers();
+ InitializeAllAsmPrinters();
+
+We can now use our target triple to get a ``Target``:
+
+.. code-block:: c++
+
+ std::string Error;
+ auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
+
+ // Print an error and exit if we couldn't find the requested target.
+ // This generally occurs if we've forgotten to initialize the
+ // TargetRegistry or we have a bogus target triple.
+ if (!Target) {
+ errs() << Error;
+ return 1;
+ }
+
+Target Machine
+==============
+
+We will also need a ``TargetMachine``. This class provides a complete
+machine description of the machine we're targeting. If we want to
+target a specific feature (such as SSE) or a specific CPU (such as
+Intel's Skylake), we do so now.
+
+To see which features and CPUs LLVM knows about, we can use
+``llc``. For example, let's look at x86:
+
+::
+
+ $ llvm-as < /dev/null | llc -march=x86 -mattr=help
+ Available CPUs for this target:
+
+ amdfam10 - Select the amdfam10 processor.
+ athlon - Select the athlon processor.
+ athlon-4 - Select the athlon-4 processor.
+ ...
+
+ Available features for this target:
+
+ 16bit-mode - 16-bit mode (i8086).
+ 32bit-mode - 32-bit mode (80386).
+ 3dnow - Enable 3DNow! instructions.
+ 3dnowa - Enable 3DNow! Athlon instructions.
+ ...
+
+For our example, we'll use the generic CPU without any additional
+features, options or relocation model.
+
+.. code-block:: c++
+
+ auto CPU = "generic";
+ auto Features = "";
+
+ TargetOptions opt;
+ auto RM = Optional<Reloc::Model>();
+ auto TargetMachine = Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
+
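+If we did want to tune for a specific CPU or enable particular
+features, we could pass them in the same way. A hypothetical sketch
+(the names must match what ``llc`` reports for your target):
+
+.. code-block:: c++
+
+ auto CPU = "skylake";
+ auto Features = "+avx2,+fma";
+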
+
+Configuring the Module
+======================
+
+We're now ready to configure our module, to specify the target and
+data layout. This isn't strictly necessary, but the `frontend
+performance guide <../Frontend/PerformanceTips.html>`_ recommends
+this. Optimizations benefit from knowing about the target and data
+layout.
+
+.. code-block:: c++
+
+ TheModule->setDataLayout(TargetMachine->createDataLayout());
+ TheModule->setTargetTriple(TargetTriple);
+
+Emit Object Code
+================
+
+We're ready to emit object code! Let's define where we want to write
+our file:
+
+.. code-block:: c++
+
+ auto Filename = "output.o";
+ std::error_code EC;
+ raw_fd_ostream dest(Filename, EC, sys::fs::F_None);
+
+ if (EC) {
+ errs() << "Could not open file: " << EC.message();
+ return 1;
+ }
+
+Finally, we define a pass that emits object code, and then we run that
+pass:
+
+.. code-block:: c++
+
+ legacy::PassManager pass;
+ auto FileType = TargetMachine::CGFT_ObjectFile;
+
+ if (TargetMachine->addPassesToEmitFile(pass, dest, FileType)) {
+ errs() << "TargetMachine can't emit a file of this type";
+ return 1;
+ }
+
+ pass.run(*TheModule);
+ dest.flush();
+
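+If you'd like to inspect the machine code we generate, a disassembler
+such as ``llvm-objdump`` can display it once the file is written:
+
+::
+
+ $ llvm-objdump -d output.o
+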
+Putting It All Together
+=======================
+
+Does it work? Let's give it a try. We need to compile our code, but
+note that the arguments to ``llvm-config`` are different from those in
+the previous chapters.
+
+::
+
+ $ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs all` -o toy
+
+Let's run it, and define a simple ``average`` function. Press Ctrl-D
+when you're done.
+
+::
+
+ $ ./toy
+ ready> def average(x y) (x + y) * 0.5;
+ ^D
+ Wrote output.o
+
+We have an object file! To test it, let's write a simple program and
+link it with our output. Here's the source code:
+
+.. code-block:: c++
+
+ #include <iostream>
+
+ extern "C" {
+ double average(double, double);
+ }
+
+ int main() {
+ std::cout << "average of 3.0 and 4.0: " << average(3.0, 4.0) << std::endl;
+ }
+
+We link our program against output.o and check that the result is what
+we expect:
+
+::
+
+ $ clang++ main.cpp output.o -o main
+ $ ./main
+ average of 3.0 and 4.0: 3.5
+
+Full Code Listing
+=================
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
+ :language: c++
+
+`Next: Adding Debug Information <LangImpl09.html>`_
diff --git a/docs/tutorial/LangImpl8.rst b/docs/tutorial/LangImpl09.rst
index 3b0f443f08d54..0053960756d29 100644
--- a/docs/tutorial/LangImpl8.rst
+++ b/docs/tutorial/LangImpl09.rst
@@ -5,11 +5,11 @@ Kaleidoscope: Adding Debug Information
.. contents::
:local:
-Chapter 8 Introduction
+Chapter 9 Introduction
======================
-Welcome to Chapter 8 of the "`Implementing a language with
-LLVM <index.html>`_" tutorial. In chapters 1 through 7, we've built a
+Welcome to Chapter 9 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
decent little programming language with functions and variables.
What happens if something goes wrong though, how do you debug your
program?
@@ -149,7 +149,7 @@ command line:
.. code-block:: bash
- Kaleidoscope-Ch8 < fib.ks | & clang -x ir -
+ Kaleidoscope-Ch9 < fib.ks | & clang -x ir -
which gives an a.out/a.exe in the current working directory.
@@ -455,8 +455,8 @@ debug information. To build this example, use:
Here is the code:
-.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter9/toy.cpp
:language: c++
-`Next: Conclusion and other useful LLVM tidbits <LangImpl9.html>`_
+`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_
diff --git a/docs/tutorial/LangImpl9.rst b/docs/tutorial/LangImpl10.rst
index f02bba857c149..5799c99402c0c 100644
--- a/docs/tutorial/LangImpl9.rst
+++ b/docs/tutorial/LangImpl10.rst
@@ -51,10 +51,7 @@ For example, try adding:
applications. Adding them is mostly an exercise in learning how the
LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction
works: it is so nifty/unconventional, it `has its own
- FAQ <../GetElementPtr.html>`_! If you add support for recursive types
- (e.g. linked lists), make sure to read the `section in the LLVM
- Programmer's Manual <../ProgrammersManual.html#TypeResolve>`_ that
- describes how to construct them.
+ FAQ <../GetElementPtr.html>`_!
- **standard runtime** - Our current language allows the user to access
arbitrary external functions, and we use it for things like "printd"
and "putchard". As you extend the language to add higher-level
@@ -103,8 +100,8 @@ LLVM's capabilities.
Properties of the LLVM IR
=========================
-We have a couple common questions about code in the LLVM IR form - lets
-just get these out of the way right now, shall we?
+We have a couple of common questions about code in the LLVM IR form -
+let's just get these out of the way right now, shall we?
Target Independence
-------------------
diff --git a/docs/tutorial/OCamlLangImpl1.rst b/docs/tutorial/OCamlLangImpl1.rst
index cf968b5ae89ce..9de92305a1c31 100644
--- a/docs/tutorial/OCamlLangImpl1.rst
+++ b/docs/tutorial/OCamlLangImpl1.rst
@@ -106,7 +106,7 @@ support the if/then/else construct, a for loop, user defined operators,
JIT compilation with a simple command line interface, etc.
Because we want to keep things simple, the only datatype in Kaleidoscope
-is a 64-bit floating point type (aka 'float' in O'Caml parlance). As
+is a 64-bit floating point type (aka 'float' in OCaml parlance). As
such, all values are implicitly double precision and the language
doesn't require type declarations. This gives the language a very nice
and simple syntax. For example, the following simple example computes
diff --git a/docs/tutorial/OCamlLangImpl5.rst b/docs/tutorial/OCamlLangImpl5.rst
index 675b9bc1978b0..3a135b2333733 100644
--- a/docs/tutorial/OCamlLangImpl5.rst
+++ b/docs/tutorial/OCamlLangImpl5.rst
@@ -178,7 +178,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
-.. figure:: LangImpl5-cfg.png
+.. figure:: LangImpl05-cfg.png
:align: center
:alt: Example CFG
diff --git a/docs/tutorial/OCamlLangImpl6.rst b/docs/tutorial/OCamlLangImpl6.rst
index a3ae11fd7e549..2fa25f5c22fb5 100644
--- a/docs/tutorial/OCamlLangImpl6.rst
+++ b/docs/tutorial/OCamlLangImpl6.rst
@@ -496,17 +496,17 @@ converge:
# determine whether the specific location diverges.
# Solve for z = z^2 + c in the complex plane.
- def mandleconverger(real imag iters creal cimag)
+ def mandelconverger(real imag iters creal cimag)
if iters > 255 | (real*real + imag*imag > 4) then
iters
else
- mandleconverger(real*real - imag*imag + creal,
+ mandelconverger(real*real - imag*imag + creal,
2*real*imag + cimag,
iters+1, creal, cimag);
# return the number of iterations required for the iteration to escape
- def mandleconverge(real imag)
- mandleconverger(real, imag, 0, real, imag);
+ def mandelconverge(real imag)
+ mandelconverger(real, imag, 0, real, imag);
This "z = z\ :sup:`2`\ + c" function is a beautiful little creature
that is the basis for computation of the `Mandelbrot
@@ -520,12 +520,12 @@ but we can whip together something using the density plotter above:
::
- # compute and plot the mandlebrot set with the specified 2 dimensional range
+ # compute and plot the mandelbrot set with the specified 2 dimensional range
# info.
def mandelhelp(xmin xmax xstep ymin ymax ystep)
for y = ymin, y < ymax, ystep in (
(for x = xmin, x < xmax, xstep in
- printdensity(mandleconverge(x,y)))
+ printdensity(mandelconverge(x,y)))
: putchard(10)
)
@@ -535,7 +535,7 @@ but we can whip together something using the density plotter above:
mandelhelp(realstart, realstart+realmag*78, realmag,
imagstart, imagstart+imagmag*40, imagmag);
-Given this, we can try plotting out the mandlebrot set! Lets try it out:
+Given this, we can try plotting out the Mandelbrot set! Let's try it out:
::
diff --git a/docs/tutorial/OCamlLangImpl7.rst b/docs/tutorial/OCamlLangImpl7.rst
index c8c701b91012d..f36845c523434 100644
--- a/docs/tutorial/OCamlLangImpl7.rst
+++ b/docs/tutorial/OCamlLangImpl7.rst
@@ -224,7 +224,7 @@ variables in certain circumstances:
class <../LangRef.html#first-class-types>`_ values (such as pointers,
scalars and vectors), and only if the array size of the allocation is
1 (or missing in the .ll file). mem2reg is not capable of promoting
- structs or arrays to registers. Note that the "scalarrepl" pass is
+ structs or arrays to registers. Note that the "sroa" pass is
more powerful and can promote structs, "unions", and arrays in many
cases.
diff --git a/docs/tutorial/index.rst b/docs/tutorial/index.rst
index dde53badd3ad8..494cfd0a33a77 100644
--- a/docs/tutorial/index.rst
+++ b/docs/tutorial/index.rst
@@ -22,6 +22,16 @@ Kaleidoscope: Implementing a Language with LLVM in Objective Caml
OCamlLangImpl*
+Building a JIT in LLVM
+======================
+
+.. toctree::
+ :titlesonly:
+ :glob:
+ :numbered:
+
+ BuildingAJIT*
+
External Tutorials
==================