Diffstat (limited to 'www/scripting.html')
| -rwxr-xr-x | www/scripting.html | 586 | 
1 files changed, 586 insertions, 0 deletions
| diff --git a/www/scripting.html b/www/scripting.html new file mode 100755 index 000000000000..10ba05b6a109 --- /dev/null +++ b/www/scripting.html @@ -0,0 +1,586 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml"> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> +<link href="style.css" rel="stylesheet" type="text/css" /> +<title>LLDB Example - Python Scripting to Debug a Problem</title> +</head> + +<body> +    <div class="www_title"> +      Example - Using Scripting and Python to Debug in LLDB +    </div> +     +<div id="container"> +	<div id="content"> +         <!--#include virtual="sidebar.incl"--> +		<div id="middle"> +			<div class="post"> +				<h1 class ="postheader">Introduction</h1> +				<div class="postcontent"> + +                    <p>LLDB has been structured from the beginning to be scriptable in two ways  +                    -- a Unix Python session can initiate/run a debug session non-interactively  +                    using LLDB; and within the LLDB debugger tool, Python scripts can be used to  +                    help with many tasks, including inspecting program data, iterating over  +                    containers and determining if a breakpoint should stop execution or continue.   
+                    This document will show how to do some of these things by going through an  +                    example, explaining how to use Python scripting to find a bug in a program  +                    that searches for text in a large binary tree.</p> + +				</div> +				<div class="postfooter"></div> + +			<div class="post"> +				<h1 class ="postheader">The Test Program and Input</h1> +				<div class="postcontent"> + +                    <p>We have a simple C program (dictionary.c) that reads in a text file, and  +                    stores all the words from the file in a Binary Search Tree, sorted  +                    alphabetically.  It then enters a loop prompting the user for a word, searching +                    for the word in the tree (using Binary Search), and reporting to the user  +                    whether or not it found the word in the tree.</p> + +                    <p>The input text file we are using to test our program contains the text for  +                    William Shakespeare's famous tragedy "Romeo and Juliet".</p> + +				</div> +				<div class="postfooter"></div> + +    			<div class="post"> +    				<h1 class ="postheader">The Bug</h1> +    				<div class="postcontent"> + +		   <p>When we try running our program, we find there is a problem.  
While it  +                   successfully finds some of the words we would expect to find, such as "love"  +                   or "sun", it fails to find the word "Romeo", which MUST be in the input text  +                   file:</p> + +                   <code color=#ff0000> +                   % ./dictionary Romeo-and-Juliet.txt<br> +                   Dictionary loaded.<br> +                   Enter search word: love<br> +                   Yes!<br> +                   Enter search word: sun<br> +                   Yes!<br> +                   Enter search word: Romeo<br> +                   No!<br> +                   Enter search word: ^D<br> +                   %<br> +                   </code> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">Is the word in our tree: Using Depth First Search</h1> +    				<div class="postcontent"> + +                   <p>Our first job is to determine if the word "Romeo" actually got inserted into +                   the tree or not.  Since "Romeo and Juliet" has thousands of words, trying to  +                   examine our binary search tree by hand is completely impractical.  Therefore we  +                   will write a Python script to search the tree for us.  We will write a recursive +                   Depth First Search function that traverses the entire tree searching for a word, +                   and maintaining information about the path from the root of the tree to the  +                   current node.  If it finds the word in the tree, it returns the path from the  +                   root to the node containing the word.  
This is what our DFS function in Python  +                   would look like, with line numbers added for easy reference in later  +                   explanations:</p> + +                   <code> +<pre><tt> + 1: def DFS (root, word, cur_path): + 2:     root_word_ptr = root.GetChildMemberWithName ("word") + 3:     left_child_ptr = root.GetChildMemberWithName ("left") + 4:     right_child_ptr = root.GetChildMemberWithName ("right") + 5:     root_word = root_word_ptr.GetSummary() + 6:     end = len (root_word) - 1 + 7:     if root_word[0] == '"' and root_word[end] == '"': + 8:         root_word = root_word[1:end] + 9:     end = len (root_word) - 1 +10:     if root_word[0] == '\'' and root_word[end] == '\'': +11:        root_word = root_word[1:end] +12:     if root_word == word: +13:         return cur_path +14:     elif word < root_word: +15:         if left_child_ptr.GetValue() == None: +16:             return "" +17:         else: +18:             cur_path = cur_path + "L" +19:             return DFS (left_child_ptr, word, cur_path) +20:     else: +21:         if right_child_ptr.GetValue() == None: +22:             return "" +23:         else: +24:             cur_path = cur_path + "R" +25:             return DFS (right_child_ptr, word, cur_path) +</tt></pre> +                   </code> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader"><a name="accessing-variables">Accessing & Manipulating <strong>Program</strong> Variables in Python</a> +</h1> +    				<div class="postcontent"> + +                   <p>Before we can call any Python function on any of our program's variables, we  +                   need to get the variable into a form that Python can access.  To show you how to +                   do this we will look at the parameters for the DFS function.  The first  +                   parameter is going to be a node in our binary search tree, put into a Python  +                   variable.  
The second parameter is the word we are searching for (a string), and +                   the third parameter is a string representing the path from the root of the tree  +                   to our current node.</p> + +                   <p>The most interesting parameter is the first one, the Python variable that +                   needs to contain a node in our search tree. How can we take a variable out of  +                   our program and put it into a Python variable?  What kind of Python variable  +                   will it be?  The answers are to use the LLDB API functions, provided as part of  +                   the LLDB Python module.  Running Python from inside LLDB, LLDB will  +                   automatically give us our current frame object as a Python variable,  +                   "lldb.frame".  This variable has the type "SBFrame" (see the LLDB API for +                   more information about SBFrame objects).  One of the things we can do with a  +                   frame object, is to ask it to find and return its local variable.  We will call  +                   the API function "FindVariable" on the lldb.frame object to give us our  +                   dictionary variable as a Python variable:</p> + +                   <code> +                      root = lldb.frame.FindVariable ("dictionary") +                   </code> + +                   <p>The line above, executed in the Python script interpreter in LLDB, asks the  +                   current frame to find the variable named "dictionary" and return it.  We then  +                   store the returned value in the Python variable named "root".  This answers the  +                   question of HOW to get the variable, but it still doesn't explain WHAT actually +                   gets put into "root".  If you examine the LLDB API, you will find that the  +                   SBFrame method "FindVariable" returns an object of type SBValue. 
SBValue  +                   objects are used, among other things, to wrap up program variables and values. +                   There are many useful methods defined in the SBValue class to allow you to get  +                   information or children values out of SBValues.  For complete information, see  +                   the header file <a href="http://llvm.org/svn/llvm-project/lldb/trunk/include/lldb/API/SBValue.h">SBValue.h</a>.  The  +                   SBValue methods that we use in our DFS function are  +                   <code>GetChildMemberWithName()</code>,  +                   <code>GetSummary()</code>, and <code>GetValue()</code>.</p> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">Explaining Depth First Search Script in Detail</h1> +    				<div class="postcontent"> + +                   <p><strong>"DFS" Overview.</strong>  Before diving into the details of this  +                   code, it would be best to give a high-level overview of what it does.  The nodes +                   in our binary search tree were defined to have type <code>tree_node *</code>,  +                   which is defined as: + +                   <code> +<pre><tt>typedef struct tree_node +{ +  const char *word; +  struct tree_node *left; +  struct tree_node *right; +} tree_node;</tt></pre></code> + +                   <p>Lines 2-11 of DFS are getting data out of the current tree node and getting  +                   ready to do the actual search; lines 12-25 are the actual depth-first search.   +                   Lines 2-4 of our DFS function get the <code>word</code>, <code>left</code> and  +                   <code>right</code> fields out of the current node and store them in Python  +                   variables.  
Since <code>root_word_ptr</code> is a pointer to our word, and we  +                   want the actual word, line 5 calls <code>GetSummary()</code> to get a string  +                   containing the value out of the pointer.  Since <code>GetSummary()</code> adds  +                   quotes around its result, lines 6-11 strip surrounding quotes off the word.</p> + +                   <p>Line 12 checks to see if the word in the current node is the one we are  +                   searching for.  If so, we are done, and line 13 returns the current path.   +                   Otherwise, line 14 checks to see if we should go left (search word comes before  +                   the current word).  If we decide to go left, line 15 checks to see if the left  +                   pointer child is NULL ("None" is the Python equivalent of NULL). If the left  +                   pointer is NULL, then the word is not in this tree and we return an empty path  +                   (line 16).   Otherwise, we add an "L" to the end of our current path string, to  +                   indicate we are going left (line 18), and then recurse on the left child (line  +                   19).  Lines 20-25 are the same as lines 14-19, except for going right rather  +                   than going left.</p> + +                   <p>One other note:  Typing something as long as our DFS function directly into  +                   the interpreter can be difficult, as making a single typing mistake means having +                   to start all over.  
Therefore we recommend doing as we have done:  Writing your  +                   longer, more complicated script functions in a separate file (in this case  +                   tree_utils.py) and then importing it into your LLDB Python interpreter.</p> +                    +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">Seeing the DFS Script in Action</h1> +    				<div class="postcontent"> + + +                   <p>At this point we are ready to use the DFS function to see if the word "Romeo" +                   is in our tree or not.  To actually use it in LLDB on our dictionary program,  +                   you would do something like this:</p> + +                   <code> +                     % <strong>lldb</strong><br> +                     (lldb) <strong>process attach -n "dictionary"</strong><br> +                     Architecture set to: x86_64.<br> +                     Process 521 stopped<br> +                     * thread #1: tid = 0x2c03, 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8, stop reason = signal SIGSTOP<br> +                     frame #0: 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8<br> +                     (lldb) <strong>breakpoint set -n find_word</strong><br> +                     Breakpoint created: 1: name = 'find_word', locations = 1, resolved = 1<br> +                     (lldb) <strong>continue</strong><br> +                     Process 521 resuming<br> +                     Process 521 stopped<br> +                     * thread #1: tid = 0x2c03, 0x0000000100001830 dictionary`find_word + 16 <br> +                     at dictionary.c:105, stop reason = breakpoint 1.1<br> +                     frame #0: 0x0000000100001830 dictionary`find_word + 16 at dictionary.c:105<br> +                     102 int<br> +                     103 find_word (tree_node *dictionary, char *word)<br> +                     104 {<br> +                     -> 105   if (!word || 
!dictionary)<br> +                     106     return 0;<br> +                     107 <br> +                     108   int compare_value = strcmp (word, dictionary->word);<br> +                     (lldb) <strong>script</strong><br> +                     Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.<br> +                     >>> <strong>import tree_utils</strong><br> +                     >>> <strong>root = lldb.frame.FindVariable ("dictionary")</strong><br> +                     >>> <strong>current_path = ""</strong><br> +                     >>> <strong>path = tree_utils.DFS (root, "Romeo", current_path)</strong><br> +                     >>> <strong>print path</strong><br> +                     LLRRL<br> +                     >>> <strong>^D</strong><br> +                     (lldb) <br> +                   </code> + +                   <p>The first bit of code above shows starting lldb, attaching to the dictionary  +                   program, and getting to the find_word function in LLDB.  The interesting part  +                   (as far as this example is concerned) begins when we enter the  +                   <code>script</code> command and drop into the embedded interactive Python  +                   interpreter.  We will go over this Python code line by line.  The first line</p> + +                   <code> +                     import tree_utils +                   </code> + +                   <p>imports the file where we wrote our DFS function, tree_utils.py, into Python.  +                   Notice that to import the file we leave off the ".py" extension.  We can now  +                   call any function in that file, giving it the prefix "tree_utils.", so that  +                   Python knows where to look for the function. 
The line</p> + +                   <code> +                     root = lldb.frame.FindVariable ("dictionary") +                   </code> + +                   <p>gets our program variable "dictionary" (which contains the binary search  +                   tree) and puts it into the Python variable "root".  See  +                   <a href="#accessing-variables">Accessing & Manipulating Program Variables in Python</a>  +                   above for more details about how this works. The next line is</p> + +                   <code> +                     current_path = "" +                   </code> + +                   <p>This line initializes the current_path from the root of the tree to our  +                   current node.  Since we are starting at the root of the tree, our current path  +                   starts as an empty string.  As we go right and left through the tree, the DFS  +                   function will append an 'R' or an 'L' to the current path, as appropriate. The  +                   line</p> + +                   <code> +                     path = tree_utils.DFS (root, "Romeo", current_path) +                   </code> + +                   <p>calls our DFS function (prefixing it with the module name so that Python can  +                   find it).  We pass in our binary tree stored in the variable <code>root</code>,  +                   the word we are searching for, and our current path.  We assign whatever path  +                   the DFS function returns to the Python variable <code>path</code>.</p> + + +                   <p>Finally, we want to see if the word was found or not, and if so we want to  +                   see the path through the tree to the word. 
So we do</p> + +                   <code> +                     print path +                   </code> + +                   <p>From this we can see that the word "Romeo" was indeed found in the tree, and +                   the path from the root of the tree to the node containing "Romeo" is  +                   left-left-right-right-left.</p> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">What next?  Using Breakpoint Command Scripts...</h1> +    				<div class="postcontent"> + +                   <p>We are halfway to figuring out what the problem is.  We know the word we are +                   looking for is in the binary tree, and we know exactly where it is in the binary +                   tree.  Now we need to figure out why our binary search algorithm is not finding  +                   the word.  We will do this using breakpoint command scripts.</p> + + +                   <p>The idea is as follows.  The binary search algorithm has two main decision  +                   points:  the decision to follow the right branch; and, the decision to follow  +                   the left branch.  We will set a breakpoint at each of these decision points, and +                   attach a Python breakpoint command script to each breakpoint.  The breakpoint +                   commands will use the global <code>path</code> Python variable that we got from  +                   our DFS function. Each time one of these decision breakpoints is hit, the script +                   will compare the actual decision with the decision the front of the  +                   <code>path</code> variable says should be made (the first character of the  +                   path).  If the actual decision and the path agree, then the front character is  +                   stripped off the path, and execution is resumed.  In this case the user never  +                   even sees the breakpoint being hit.  
But if the decision differs from what the  +                   path says it should be, then the script prints out a message and does NOT resume +                   execution, leaving the user sitting at the first point where a wrong decision is +                   being made.</p> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">Side Note: Python Breakpoint Command Scripts are NOT What They Seem</h1> +    				<div class="postcontent"> + +				</div> +				<div class="postfooter"></div> + +                   <p>What do we mean by that?  When you enter a Python breakpoint command in LLDB, +                   it appears that you are entering one or more plain lines of Python. BUT LLDB  +                   then takes what you entered and wraps it into a Python FUNCTION (just like using +                   the "def" Python command).   It automatically gives the function an obscure,  +                   unique, hard-to-stumble-across function name, and gives it two parameters:  +                   <code>frame</code> and <code>bp_loc</code>.  When the breakpoint gets hit, LLDB  +                   wraps up the frame object where the breakpoint was hit, and the breakpoint  +                   location object for the breakpoint that was hit, and puts them into Python  +                   variables for you.  It then calls the Python function that was created for the  +                   breakpoint command, and passes in the frame and breakpoint location objects.</p> + +                   <p>So, being practical, what does this mean for you when you write your Python  +                   breakpoint commands?  It means that there are two things you need to keep in  +                   mind: 1. If you want to access any Python variables created outside your script, +                   <strong>you must declare such variables to be global</strong>.  
If you do not +                   declare them as global, then the Python function will treat them as local  +                   variables, and you will get unexpected behavior.  2. <strong>All Python  +                   breakpoint command scripts automatically have a <code>frame</code> and a  +                   <code>bp_loc</code> variable.</strong>  The variables are pre-loaded by LLDB  +                   with the correct context for the breakpoint.  You do not have to use these  +                   variables, but they are there if you want them.</p> + +				</div> +				<div class="postfooter"></div> + + +    			<div class="post"> +    				<h1 class ="postheader">The Decision Point Breakpoint Commands</h1> +    				<div class="postcontent"> + +                   <p>This is what the Python breakpoint command script would look like for the  +                   decision to go right:</p> + +<code><pre><tt> +global path +if path[0] == 'R': +    path = path[1:] +    thread = frame.GetThread() +    process = thread.GetProcess() +    process.Continue() +else: +    print "Here is the problem; going right, should go left!" +</tt></pre></code> + +                   <p>Just as a reminder, LLDB is going to take this script and wrap it up in a  +                   function, like this:</p> + +<code><pre><tt> +def some_unique_and_obscure_function_name (frame, bp_loc): +    global path +    if path[0] == 'R': +        path = path[1:] +        thread = frame.GetThread() +        process = thread.GetProcess() +        process.Continue() +    else: +        print "Here is the problem; going right, should go left!" +</tt></pre></code> + +                   <p>LLDB will call the function, passing in the correct frame and breakpoint  +                   location whenever the breakpoint gets hit.  There are several things to notice  +                   about this function.  
The first one is that we are accessing and updating a  +                   piece of state (the <code>path</code> variable), and actually conditioning our +                   behavior based upon this variable.  Since the variable was defined outside of  +                   our script (and therefore outside of the corresponding function) we need to tell +                   Python that we are accessing a global variable. That is what the first line of  +                   the script does.  Next we check where the path says we should go and compare it to  +                   our decision (recall that we are at the breakpoint for the decision to go  +                   right). If the path agrees with our decision, then  we strip the first character +                   off of the path.</p> + +                   <p>Since the decision matched the path, we want to resume execution.  To do this +                   we make use of the <code>frame</code> parameter that LLDB guarantees will be  +                   there for us.  We use LLDB API functions to get the current thread from the  +                   current frame, and then to get the process from the thread.  Once we have the  +                   process, we tell it to resume execution (using the <code>Continue()</code> API  +                   function).</p> + +                   <p>If the decision to go right does not agree with the path, then we do not  +                   resume execution.  
We leave the process stopped at the breakpoint (by doing nothing), +                   and we print an informational message telling the user we have found the  +                   problem, and what the problem is.</p> + +				</div> +				<div class="postfooter"></div> + +    			<div class="post"> +    				<h1 class ="postheader">Actually Using the Breakpoint Commands</h1> +    				<div class="postcontent"> + +                   <p>Now we will look at what happens when we actually use these breakpoint  +                   commands on our program.  Doing a <code>source list -n find_word</code> shows  +                   us the function containing our two decision points.  Looking at the code below,  +                   we see that we want to set our breakpoints on lines 113 and 115:</p> + +<code><pre><tt> +(lldb) source list -n find_word +File: /Volumes/Data/HD2/carolinetice/Desktop/LLDB-Web-Examples/dictionary.c. +101  +102 int +103 find_word (tree_node *dictionary, char *word) +104 { +105   if (!word || !dictionary) +106     return 0; +107  +108   int compare_value = strcmp (word, dictionary->word); +109  +110   if (compare_value == 0) +111     return 1; +112   else if (compare_value < 0) +113     return find_word (dictionary->left, word); +114   else +115     return find_word (dictionary->right, word); +116 } +117  +</tt></pre></code> + +                   <p>So, we set our breakpoints, enter our breakpoint command scripts, and see  +                   what happens:</p> + +<code><pre><tt> +(lldb) breakpoint set -l 113 +Breakpoint created: 2: file ='dictionary.c', line = 113, locations = 1, resolved = 1 +(lldb) breakpoint set -l 115 +Breakpoint created: 3: file ='dictionary.c', line = 115, locations = 1, resolved = 1 +(lldb) breakpoint command add -s python 2 +Enter your Python command(s). Type 'DONE' to end. 
+> global path +> if (path[0] == 'L'): +>     path = path[1:] +>     thread = frame.GetThread() +>     process = thread.GetProcess() +>     process.Continue() +> else: +>     print "Here is the problem. Going left, should go right!" +> DONE +(lldb) breakpoint command add -s python 3 +Enter your Python command(s). Type 'DONE' to end. +> global path +> if (path[0] == 'R'): +>     path = path[1:] +>     thread = frame.GetThread() +>     process = thread.GetProcess() +>     process.Continue() +> else: +>     print "Here is the problem. Going right, should go left!" +> DONE +(lldb) continue +Process 696 resuming +Here is the problem. Going right, should go left! +Process 696 stopped +* thread #1: tid = 0x2d03, 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115, stop reason = breakpoint 3.1 +  frame #0: 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115 +    112   else if (compare_value < 0) +    113     return find_word (dictionary->left, word); +    114   else + -> 115     return find_word (dictionary->right, word); +    116 } +    117  +    118 void +(lldb) +</tt></pre></code> + + +                   <p>After setting our breakpoints, adding our breakpoint commands and continuing, +                   we run for a little bit and then hit one of our breakpoints, printing out the  +                   error message from the breakpoint command.  Apparently at this point in the +                   tree, our search algorithm decided to go right, but our path says the node we  +                   want is to the left. 
Examining the word at the node where we stopped, and our  +                   search word, we see:</p> + +                   <code> +                     (lldb) expr dictionary->word<br> +                     (const char *) $1 = 0x0000000100100080 "dramatis"<br> +                     (lldb) expr word<br> +                     (char *) $2 = 0x00007fff5fbff108 "romeo"<br> +                   </code> + +                   <p>So the word at our current node is "dramatis", and the word we are searching +                   for is "romeo".  "romeo" comes after "dramatis" alphabetically, so it seems like +                   going right would be the correct decision.  Let's ask Python what it thinks the +                   path from the current node to our word is:</p> + +                   <code> +                     (lldb) script print path<br> +                     LLRRL<br> +                   </code> + +                   <p>According to Python we need to go left-left-right-right-left from our current +                   node to find the word we are looking for.  Let's double check our tree, and see  +                   what word it has at that node:</p> + +                   <code> +                     (lldb) expr dictionary->left->left->right->right->left->word<br> +                     (const char *) $4 = 0x0000000100100880 "Romeo"<br> +                   </code> + +                   <p>So the word we are searching for is "romeo" and the word at our DFS location +                   is "Romeo".  Aha!  
One is uppercase and the other is lowercase:  We seem to have +                   a case conversion problem somewhere in our program (we do).</p> + +                   <p>This is the end of our example on how you might use Python scripting in LLDB  +                   to help you find bugs in your program.</p> + +				</div> +				<div class="postfooter"></div> + +    			<div class="post"> +    				<h1 class ="postheader">Source Files for The Example</h1> +    				<div class="postcontent"> + + +                </div> +          	    <div class="postfooter"></div> + +                  <p> The complete code for the Dictionary program (with case-conversion bug),  +                  the DFS function and other Python script examples (tree_utils.py) used for this  +                  example are available via the following file links:</p> + +<a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/scripting/tree_utils.py">tree_utils.py</a>  -  Example Python functions using LLDB's API, including DFS<br> +<a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/scripting/dictionary.c">dictionary.c</a>  -  Sample dictionary program, with bug<br> +    			 +                    <p>The text for "Romeo and Juliet" can be obtained from Project Gutenberg +                    (http://www.gutenberg.org).</p> +            </div> +      	</div> +	</div> +</div> +</body> +</html> | 
