summaryrefslogtreecommitdiff
path: root/docs/XRayExample.rst
diff options
context:
space:
mode:
authorDimitry Andric <dim@FreeBSD.org>2017-04-16 16:01:22 +0000
committerDimitry Andric <dim@FreeBSD.org>2017-04-16 16:01:22 +0000
commit71d5a2540a98c81f5bcaeb48805e0e2881f530ef (patch)
tree5343938942df402b49ec7300a1c25a2d4ccd5821 /docs/XRayExample.rst
parent31bbf64f3a4974a2d6c8b3b27ad2f519caf74057 (diff)
Diffstat (limited to 'docs/XRayExample.rst')
-rw-r--r--docs/XRayExample.rst273
1 files changed, 273 insertions, 0 deletions
diff --git a/docs/XRayExample.rst b/docs/XRayExample.rst
new file mode 100644
index 000000000000..5dfb0bfaf298
--- /dev/null
+++ b/docs/XRayExample.rst
@@ -0,0 +1,273 @@
+===================
+Debugging with XRay
+===================
+
+This document shows an example of how you would go about analyzing applications
+built with XRay instrumentation. Here we will attempt to debug ``llc``
+compiling some sample LLVM IR generated by Clang.
+
+.. contents::
+ :local:
+
+Building with XRay
+------------------
+
+To debug an application with XRay instrumentation, we need to build it with a
+Clang that supports the ``-fxray-instrument`` option. See `XRay <XRay.html>`_
+for more technical details of how XRay works for background information.
+
+In our example, we need to add ``-fxray-instrument`` to the list of flags
+passed to Clang when building a binary. Note that we need to link with Clang as
+well to get the XRay runtime linked in appropriately. For building ``llc`` with
+XRay, we do something similar below for our LLVM build:
+
+::
+
+ $ mkdir -p llvm-build && cd llvm-build
+ # Assume that the LLVM sources are at ../llvm
+ $ cmake -GNinja ../llvm -DCMAKE_BUILD_TYPE=Release \
+ -DCMAKE_C_FLAGS_RELEASE="-fxray-instrument" -DCMAKE_CXX_FLAGS="-fxray-instrument" \
+ # Once this finishes, we should build llc
+ $ ninja llc
+
+
+To verify that we have an XRay instrumented binary, we can use ``objdump`` to
+look for the ``xray_instr_map`` section.
+
+::
+
+ $ objdump -h -j xray_instr_map ./bin/llc
+ ./bin/llc: file format elf64-x86-64
+
+ Sections:
+ Idx Name Size VMA LMA File off Algn
+ 14 xray_instr_map 00002fc0 00000000041516c6 00000000041516c6 03d516c6 2**0
+ CONTENTS, ALLOC, LOAD, READONLY, DATA
+
+Getting Traces
+--------------
+
+By default, XRay does not write out the trace files or patch the application
+before main starts. If we just run ``llc`` it should just work like a normally
+built binary. However, if we want to get a full trace of the application's
+operations (of the functions we do end up instrumenting with XRay) then we need
+to enable XRay at application start. To do this, XRay checks the
+``XRAY_OPTIONS`` environment variable.
+
+::
+
+ # The following doesn't create an XRay trace by default.
+ $ ./bin/llc input.ll
+
+ # We need to set the XRAY_OPTIONS to enable some features.
+ $ XRAY_OPTIONS="patch_premain=true" ./bin/llc input.ll
+ ==69819==XRay: Log file in 'xray-log.llc.m35qPB'
+
+At this point we now have an XRay trace we can start analysing.
+
+The ``llvm-xray`` Tool
+----------------------
+
+Having a trace then allows us to do basic accounting of the functions that were
+instrumented, and how much time we're spending in parts of the code. To make
+sense of this data, we use the ``llvm-xray`` tool which has a few subcommands
+to help us understand our trace.
+
+One of the simplest things we can do is to get an accounting of the functions
+that have been instrumented. We can see an example accounting with ``llvm-xray
+account``:
+
+::
+
+ $ llvm-xray account xray-log.llc.m35qPB -top=10 -sort=sum -sortorder=dsc -instr_map ./bin/llc
+ Functions with latencies: 29
+ funcid count [ min, med, 90p, 99p, max] sum function
+ 187 360 [ 0.000000, 0.000001, 0.000014, 0.000032, 0.000075] 0.001596 LLLexer.cpp:446:0: llvm::LLLexer::LexIdentifier()
+ 85 130 [ 0.000000, 0.000000, 0.000018, 0.000023, 0.000156] 0.000799 X86ISelDAGToDAG.cpp:1984:0: (anonymous namespace)::X86DAGToDAGISel::Select(llvm::SDNode*)
+ 138 130 [ 0.000000, 0.000000, 0.000017, 0.000155, 0.000155] 0.000774 SelectionDAGISel.cpp:2963:0: llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int)
+ 188 103 [ 0.000000, 0.000000, 0.000003, 0.000123, 0.000214] 0.000737 LLParser.cpp:2692:0: llvm::LLParser::ParseValID(llvm::ValID&, llvm::LLParser::PerFunctionState*)
+ 88 1 [ 0.000562, 0.000562, 0.000562, 0.000562, 0.000562] 0.000562 X86ISelLowering.cpp:83:0: llvm::X86TargetLowering::X86TargetLowering(llvm::X86TargetMachine const&, llvm::X86Subtarget const&)
+ 125 102 [ 0.000001, 0.000003, 0.000010, 0.000017, 0.000049] 0.000471 Verifier.cpp:3714:0: (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&)
+ 90 8 [ 0.000023, 0.000035, 0.000106, 0.000106, 0.000106] 0.000342 X86ISelLowering.cpp:3363:0: llvm::X86TargetLowering::LowerCall(llvm::TargetLowering::CallLoweringInfo&, llvm::SmallVectorImpl<llvm::SDValue>&) const
+ 124 32 [ 0.000003, 0.000007, 0.000016, 0.000041, 0.000041] 0.000310 Verifier.cpp:1967:0: (anonymous namespace)::Verifier::visitFunction(llvm::Function const&)
+ 123 1 [ 0.000302, 0.000302, 0.000302, 0.000302, 0.000302] 0.000302 LLVMContextImpl.cpp:54:0: llvm::LLVMContextImpl::~LLVMContextImpl()
+ 139 46 [ 0.000000, 0.000002, 0.000006, 0.000008, 0.000019] 0.000138 TargetLowering.cpp:506:0: llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, llvm::APInt const&, llvm::APInt&, llvm::APInt&, llvm::TargetLowering::TargetLoweringOpt&, unsigned int, bool) const
+
+This shows us that for our input file, ``llc`` spent the most cumulative time
+in the lexer (a total of 1 millisecond). If we wanted for example to work with
+this data in a spreadsheet, we can output the results as CSV using the
+``-format=csv`` option to the command for further analysis.
+
+If we want to get a textual representation of the raw trace we can use the
+``llvm-xray convert`` tool to get YAML output. The first few lines of that
+ouput for an example trace would look like the following:
+
+::
+
+ $ llvm-xray convert -f yaml -symbolize -instr_map=./bin/llc xray-log.llc.m35qPB
+ ---
+ header:
+ version: 1
+ type: 0
+ constant-tsc: true
+ nonstop-tsc: true
+ cycle-frequency: 2601000000
+ records:
+ - { type: 0, func-id: 110, function: __cxx_global_var_init.8, cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426023268520 }
+ - { type: 0, func-id: 110, function: __cxx_global_var_init.8, cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426023523052 }
+ - { type: 0, func-id: 164, function: __cxx_global_var_init, cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426029925386 }
+ - { type: 0, func-id: 164, function: __cxx_global_var_init, cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426030031128 }
+ - { type: 0, func-id: 142, function: '(anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*)', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426046951388 }
+ - { type: 0, func-id: 142, function: '(anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*)', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426047282020 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426047857332 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426047984152 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426048036584 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426048042292 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426048055056 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426048067316 }
+
+Controlling Fidelity
+--------------------
+
+So far in our examples, we haven't been getting full coverage of the functions
+we have in the binary. To get that, we need to modify the compiler flags so
+that we can instrument more (if not all) the functions we have in the binary.
+We have two options for doing that, and we explore both of these below.
+
+Instruction Threshold
+`````````````````````
+
+The first "blunt" way of doing this is by setting the minimum threshold for
+function bodies to 1. We can do that with the
+``-fxray-instruction-threshold=N`` flag when building our binary. We rebuild
+``llc`` with this option and observe the results:
+
+::
+
+ $ rm CMakeCache.txt
+ $ cmake -GNinja ../llvm -DCMAKE_BUILD_TYPE=Release \
+ -DCMAKE_C_FLAGS_RELEASE="-fxray-instrument -fxray-instruction-threshold=1" \
+ -DCMAKE_CXX_FLAGS="-fxray-instrument -fxray-instruction-threshold=1"
+ $ ninja llc
+ $ XRAY_OPTIONS="patch_premain=true" ./bin/llc input.ll
+ ==69819==XRay: Log file in 'xray-log.llc.5rqxkU'
+
+ $ llvm-xray account xray-log.llc.5rqxkU -top=10 -sort=sum -sortorder=dsc -instr_map ./bin/llc
+ Functions with latencies: 36652
+ funcid count [ min, med, 90p, 99p, max] sum function
+ 75 1 [ 0.672368, 0.672368, 0.672368, 0.672368, 0.672368] 0.672368 llc.cpp:271:0: main
+ 78 1 [ 0.626455, 0.626455, 0.626455, 0.626455, 0.626455] 0.626455 llc.cpp:381:0: compileModule(char**, llvm::LLVMContext&)
+ 139617 1 [ 0.472618, 0.472618, 0.472618, 0.472618, 0.472618] 0.472618 LegacyPassManager.cpp:1723:0: llvm::legacy::PassManager::run(llvm::Module&)
+ 139610 1 [ 0.472618, 0.472618, 0.472618, 0.472618, 0.472618] 0.472618 LegacyPassManager.cpp:1681:0: llvm::legacy::PassManagerImpl::run(llvm::Module&)
+ 139612 1 [ 0.470948, 0.470948, 0.470948, 0.470948, 0.470948] 0.470948 LegacyPassManager.cpp:1564:0: (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&)
+ 139607 2 [ 0.147345, 0.315994, 0.315994, 0.315994, 0.315994] 0.463340 LegacyPassManager.cpp:1530:0: llvm::FPPassManager::runOnModule(llvm::Module&)
+ 139605 21 [ 0.000002, 0.000002, 0.102593, 0.213336, 0.213336] 0.463331 LegacyPassManager.cpp:1491:0: llvm::FPPassManager::runOnFunction(llvm::Function&)
+ 139563 26096 [ 0.000002, 0.000002, 0.000037, 0.000063, 0.000215] 0.225708 LegacyPassManager.cpp:1083:0: llvm::PMDataManager::findAnalysisPass(void const*, bool)
+ 108055 188 [ 0.000002, 0.000120, 0.001375, 0.004523, 0.062624] 0.159279 MachineFunctionPass.cpp:38:0: llvm::MachineFunctionPass::runOnFunction(llvm::Function&)
+ 62635 22 [ 0.000041, 0.000046, 0.000050, 0.126744, 0.126744] 0.127715 X86TargetMachine.cpp:242:0: llvm::X86TargetMachine::getSubtargetImpl(llvm::Function const&) const
+
+
+Instrumentation Attributes
+``````````````````````````
+
+The other way is to use configuration files for selecting which functions
+should always be instrumented by the compiler. This gives us a way of ensuring
+that certain functions are either always or never instrumented by not having to
+add the attribute to the source.
+
+To use this feature, you can define one file for the functions to always
+instrument, and another for functions to never instrument. The format of these
+files are exactly the same as the SanitizerLists files that control similar
+things for the sanitizer implementations. For example, we can have two
+different files like below:
+
+::
+
+ # always-instrument.txt
+ # always instrument functions that match the following filters:
+ fun:main
+
+ # never-instrument.txt
+ # never instrument functions that match the following filters:
+ fun:__cxx_*
+
+Given the above two files we can re-build by providing those two files as
+arguments to clang as ``-fxray-always-instrument=always-instrument.txt`` or
+``-fxray-never-instrument=never-instrument.txt``.
+
+Further Exploration
+-------------------
+
+The ``llvm-xray`` tool has a few other subcommands that are in various stages
+of being developed. One interesting subcommand that can highlight a few
+interesting things is the ``graph`` subcommand. Given for example the following
+toy program that we build with XRay instrumentation, we can see how the
+generated graph may be a helpful indicator of where time is being spent for the
+application.
+
+.. code-block:: c++
+
+ // sample.cc
+ #include <iostream>
+ #include <thread>
+
+ [[clang::xray_always_intrument]] void f() {
+ std::cerr << '.';
+ }
+
+ [[clang::xray_always_intrument]] void g() {
+ for (int i = 0; i < 1 << 10; ++i) {
+ std::cerr << '-';
+ }
+ }
+
+ int main(int argc, char* argv[]) {
+ std::thread t1([] {
+ for (int i = 0; i < 1 << 10; ++i)
+ f();
+ });
+ std::thread t2([] {
+ g();
+ });
+ t1.join();
+ t2.join();
+ std::cerr << '\n';
+ }
+
+We then build the above with XRay instrumentation:
+
+::
+
+ $ clang++ -o sample -O3 sample.cc -std=c++11 -fxray-instrument -fxray-instruction-threshold=1
+ $ XRAY_OPTIONS="patch_premain=true" ./sample
+
+We can then explore the graph rendering of the trace generated by this sample
+application. We assume you have the graphviz toosl available in your system,
+including both ``unflatten`` and ``dot``. If you prefer rendering or exploring
+the graph using another tool, then that should be feasible as well. ``llvm-xray
+graph`` will create DOT format graphs which should be usable in most graph
+rendering applications. One example invocation of the ``llvm-xray graph``
+command should yield some interesting insights to the workings of C++
+applications:
+
+::
+
+ $ llvm-xray graph xray-log.sample.* -m sample -color-edges=sum -edge-label=sum \
+ | unflatten -f -l10 | dot -Tsvg -o sample.svg
+
+Next Steps
+----------
+
+If you have some interesting analyses you'd like to implement as part of the
+llvm-xray tool, please feel free to propose them on the llvm-dev@ mailing list.
+The following are some ideas to inspire you in getting involved and potentially
+making things better.
+
+ - Implement a query/filtering library that allows for finding patterns in the
+ XRay traces.
+ - A conversion from the XRay trace onto something that can be visualised
+ better by other tools (like the Chrome trace viewer for example).
+ - Collecting function call stacks and how often they're encountered in the
+ XRay trace.
+
+