Diffstat (limited to 'doc/howto_libipt.md')
| -rw-r--r-- | doc/howto_libipt.md | 1271 |
1 files changed, 1271 insertions, 0 deletions
diff --git a/doc/howto_libipt.md b/doc/howto_libipt.md new file mode 100644 index 000000000000..3d3c12f0bb16 --- /dev/null +++ b/doc/howto_libipt.md @@ -0,0 +1,1271 @@ +Decoding Intel(R) Processor Trace Using libipt {#libipt} +======================================================== + +<!--- + ! Copyright (c) 2013-2019, Intel Corporation + ! + ! Redistribution and use in source and binary forms, with or without + ! modification, are permitted provided that the following conditions are met: + ! + ! * Redistributions of source code must retain the above copyright notice, + ! this list of conditions and the following disclaimer. + ! * Redistributions in binary form must reproduce the above copyright notice, + ! this list of conditions and the following disclaimer in the documentation + ! and/or other materials provided with the distribution. + ! * Neither the name of Intel Corporation nor the names of its contributors + ! may be used to endorse or promote products derived from this software + ! without specific prior written permission. + ! + ! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + ! AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + ! IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ! ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + ! LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + ! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + ! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + ! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + ! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + ! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + ! POSSIBILITY OF SUCH DAMAGE. + !--> + +This chapter describes how to use libipt for various tasks around Intel +Processor Trace (Intel PT). 
For code examples, refer to the sample tools that +are contained in the source tree: + + * *ptdump* A packet dumper example. + * *ptxed* A control-flow reconstruction example. + * *pttc* A packet encoder example. + + +For detailed information about Intel PT, please refer to the respective chapter +in Volume 3 of the Intel Software Developer's Manual at +http://www.intel.com/sdm. + + +## Introduction + +The libipt decoder library provides multiple layers of abstraction ranging from +packet encoding and decoding to full execution flow reconstruction. The layers +are organized as follows: + + * *packets* This layer deals with raw Intel PT packets. + + * *events* This layer deals with packet combinations that + encode higher-level events. + + * *instruction flow* This layer deals with the execution flow on the + instruction level. + + * *block* This layer deals with the execution flow on the + instruction level. + + It is faster than the instruction flow decoder but + requires a small amount of post-processing. + + +Each layer provides its own encoder or decoder struct plus a set of functions +for allocating and freeing encoder or decoder objects and for synchronizing +decoders onto the Intel PT packet stream. Function names are prefixed with +`pt_<lyr>_` where `<lyr>` is an abbreviation of the layer name. The following +abbreviations are used: + + * *enc* Packet encoding (packet layer). + * *pkt* Packet decoding (packet layer). + * *qry* Event (or query) layer. + * *insn* Instruction flow layer. + * *blk* Block layer. + + +Here is some generic example code for working with decoders: + +~~~{.c} + struct pt_<layer>_decoder *decoder; + struct pt_config config; + int errcode; + + memset(&config, 0, sizeof(config)); + config.size = sizeof(config); + config.begin = <pt buffer begin>; + config.end = <pt buffer end>; + config.cpu = <cpu identifier>; + config... 
+ + decoder = pt_<lyr>_alloc_decoder(&config); + if (!decoder) + <handle error>(errcode); + + errcode = pt_<lyr>_sync_<where>(decoder); + if (errcode < 0) + <handle error>(errcode); + + <use decoder>(decoder); + + pt_<lyr>_free_decoder(decoder); +~~~ + +First, configure the decoder. As a minimum, the size of the config struct and +the `begin` and `end` of the buffer containing the Intel PT data need to be set. +Configuration options details will be discussed later in this chapter. In the +case of packet encoding, this is the begin and end address of the pre-allocated +buffer, into which Intel PT packets shall be written. + +Next, allocate a decoder object for the layer you are interested in. A return +value of NULL indicates an error. There is no further information available on +the exact error condition. Most of the time, however, the error is the result +of an incomplete or inconsistent configuration. + +Before the decoder can be used, it needs to be synchronized onto the Intel PT +packet stream specified in the configuration. The only exception to this is the +packet encoder, which is implicitly synchronized onto the beginning of the Intel +PT buffer. + +Depending on the type of decoder, one or more synchronization options are +available. + + * `pt_<lyr>_sync_forward()` Synchronize onto the next PSB in forward + direction (or the first PSB if not yet + synchronized). + + * `pt_<lyr>_sync_backward()` Synchronize onto the next PSB in backward + direction (or the last PSB if not yet + synchronized). + + * `pt_<lyr>_sync_set()` Set the synchronization position to a + user-defined location in the Intel PT packet + stream. + There is no check whether the specified + location makes sense or is valid. + + +After synchronizing, the decoder can be used. While decoding, the decoder +stores the location of the last PSB it encountered during normal decode. +Subsequent calls to pt_<lyr>_sync_forward() will start searching from that +location. 
This is useful for re-synchronizing onto the Intel PT packet stream +in case of errors. An example of a typical decode loop is given below: + +~~~{.c} + for (;;) { + int errcode; + + errcode = <use decoder>(decoder); + if (errcode >= 0) + continue; + + if (errcode == -pte_eos) + return; + + <report error>(errcode); + + do { + errcode = pt_<lyr>_sync_forward(decoder); + + if (errcode == -pte_eos) + return; + } while (errcode < 0); + } +~~~ + +You can get the current decoder position as offset into the Intel PT buffer via: + + pt_<lyr>_get_offset() + + +You can get the position of the last synchronization point as offset into the +Intel PT buffer via: + + pt_<lyr>_get_sync_offset() + + +Each layer will be discussed in detail below. In the remainder of this section, +general functionality will be considered. + + +### Version + +You can query the library version using: + + * `pt_library_version()` + + +This function returns a version structure that can be used for compatibility +checks or simply for reporting the version of the decoder library. + + +### Errors + +The library uses a single error enum for all layers. + + * `enum pt_error_code` An enumeration of encode and decode errors. + + +Errors are typically represented as negative pt_error_code enumeration constants +and returned as an int. The library provides two functions for dealing with +errors: + + * `pt_errcode()` Translate an int return value into a pt_error_code + enumeration constant. + + * `pt_errstr()` Returns a human-readable error string. + + +Not all errors may occur on every layer. Every API function specifies the +errors it may return. + + +### Configuration + +Every encoder or decoder allocation function requires a configuration argument. +Some of its fields have already been discussed in the example above. Refer to +the `intel-pt.h` header for detailed and up-to-date documentation of each field. 
+
+As a minimum, the `size` field needs to be set to `sizeof(struct pt_config)` and
+`begin` and `end` need to be set to the Intel PT buffer to use.
+
+The size is used for detecting library version mismatches and to provide
+backwards compatibility. Without the proper `size`, decoder allocation will
+fail.
+
+Although not strictly required, it is recommended to also set the `cpu` field to
+the processor on which Intel PT has been collected (for decoders), or for which
+Intel PT shall be generated (for encoders). This allows implementing
+processor-specific behavior such as erratum workarounds.
+
+
+## The Packet Layer
+
+This layer deals with Intel PT packet encoding and decoding. It can further be
+split into three sub-layers: opcodes, encoding, and decoding.
+
+
+### Opcodes
+
+The opcodes layer provides enumerations for all the bits necessary for Intel PT
+encoding and decoding. The enumeration constants can be used without linking to
+the decoder library. There is no encoder or decoder struct associated with this
+layer. See the intel-pt.h header file for details.
+
+
+### Packet Encoding
+
+The packet encoding layer provides support for encoding Intel PT
+packet-by-packet. Start by configuring and allocating a `pt_packet_encoder` as
+shown below:
+
+~~~{.c}
+    struct pt_packet_encoder *encoder;
+    struct pt_config config;
+    int errcode;
+
+    memset(&config, 0, sizeof(config));
+    config.size = sizeof(config);
+    config.begin = <pt buffer begin>;
+    config.end = <pt buffer end>;
+    config.cpu = <cpu identifier>;
+
+    encoder = pt_alloc_encoder(&config);
+    if (!encoder)
+        <handle error>(errcode);
+~~~
+
+For packet encoding, only the mandatory config fields need to be filled in.
+
+The allocated encoder object will be implicitly synchronized onto the beginning
+of the Intel PT buffer. You may change the encoder's position at any time by
+calling `pt_enc_sync_set()` with the desired buffer offset.
+ +Next, fill in a `pt_packet` object with details about the packet to be encoded. +You do not need to fill in the `size` field. The needed size is computed by the +encoder. There is no consistency check with the size specified in the packet +object. The following example encodes a TIP packet: + +~~~{.c} + struct pt_packet_encoder *encoder = ...; + struct pt_packet packet; + int errcode; + + packet.type = ppt_tip; + packet.payload.ip.ipc = pt_ipc_update_16; + packet.payload.ip.ip = <ip>; +~~~ + +For IP packets, for example FUP or TIP.PGE, there is no need to mask out bits in +the `ip` field that will not be encoded in the packet due to the specified IP +compression in the `ipc` field. The encoder will ignore them. + +There are no consistency checks whether the specified IP compression in the +`ipc` field is allowed in the current context or whether decode will result in +the full IP specified in the `ip` field. + +Once the packet object has been filled, it can be handed over to the encoder as +shown here: + +~~~{.c} + errcode = pt_enc_next(encoder, &packet); + if (errcode < 0) + <handle error>(errcode); +~~~ + +The encoder will encode the packet, write it into the Intel PT buffer, and +advance its position to the next byte after the packet. On a successful encode, +it will return the number of bytes that have been written. In case of errors, +nothing will be written and the encoder returns a negative error code. + + +### Packet Decoding + +The packet decoding layer provides support for decoding Intel PT +packet-by-packet. 
Start by configuring and allocating a `pt_packet_decoder` as +shown here: + +~~~{.c} + struct pt_packet_decoder *decoder; + struct pt_config config; + int errcode; + + memset(&config, 0, sizeof(config)); + config.size = sizeof(config); + config.begin = <pt buffer begin>; + config.end = <pt buffer end>; + config.cpu = <cpu identifier>; + config.decode.callback = <decode function>; + config.decode.context = <decode context>; + + decoder = pt_pkt_alloc_decoder(&config); + if (!decoder) + <handle error>(errcode); +~~~ + +For packet decoding, an optional decode callback function may be specified in +addition to the mandatory config fields. If specified, the callback function +will be called for packets the decoder does not know about. If there is no +decode callback specified, the decoder will return `-pte_bad_opc`. In addition +to the callback function pointer, an optional pointer to user-defined context +information can be specified. This context will be passed to the decode +callback function. + +Before the decoder can be used, it needs to be synchronized onto the Intel PT +packet stream. Packet decoders offer three synchronization functions. To +iterate over synchronization points in the Intel PT packet stream in forward or +backward direction, use one of the following two functions respectively: + + pt_pkt_sync_forward() + pt_pkt_sync_backward() + + +To manually synchronize the decoder at a particular offset into the Intel PT +packet stream, use the following function: + + pt_pkt_sync_set() + + +There are no checks to ensure that the specified offset is at the beginning of a +packet. The example below shows synchronization to the first synchronization +point: + +~~~{.c} + struct pt_packet_decoder *decoder; + int errcode; + + errcode = pt_pkt_sync_forward(decoder); + if (errcode < 0) + <handle error>(errcode); +~~~ + +The decoder will remember the last synchronization packet it decoded. 
+Subsequent calls to `pt_pkt_sync_forward` and `pt_pkt_sync_backward` will use +this as their starting point. + +You can get the current decoder position as offset into the Intel PT buffer via: + + pt_pkt_get_offset() + + +You can get the position of the last synchronization point as offset into the +Intel PT buffer via: + + pt_pkt_get_sync_offset() + + +Once the decoder is synchronized, you can iterate over packets by repeated calls +to `pt_pkt_next()` as shown in the following example: + +~~~{.c} + struct pt_packet_decoder *decoder; + int errcode; + + for (;;) { + struct pt_packet packet; + + errcode = pt_pkt_next(decoder, &packet, sizeof(packet)); + if (errcode < 0) + break; + + <process packet>(&packet); + } +~~~ + + +## The Event Layer + +The event layer deals with packet combinations that encode higher-level events. +It is used for reconstructing execution flow for users who need finer-grain +control not available via the instruction flow layer or for users who want to +integrate execution flow reconstruction with other functionality more tightly +than it would be possible otherwise. + +This section describes how to use the query decoder for reconstructing execution +flow. See the instruction flow decoder as an example. Start by configuring and +allocating a `pt_query_decoder` as shown below: + +~~~{.c} + struct pt_query_decoder *decoder; + struct pt_config config; + int errcode; + + memset(&config, 0, sizeof(config)); + config.size = sizeof(config); + config.begin = <pt buffer begin>; + config.end = <pt buffer end>; + config.cpu = <cpu identifier>; + config.decode.callback = <decode function>; + config.decode.context = <decode context>; + + decoder = pt_qry_alloc_decoder(&config); + if (!decoder) + <handle error>(errcode); +~~~ + +An optional packet decode callback function may be specified in addition to the +mandatory config fields. If specified, the callback function will be called for +packets the decoder does not know about. 
The query decoder will ignore the +unknown packet except for its size in order to skip it. If there is no decode +callback specified, the decoder will abort with `-pte_bad_opc`. In addition to +the callback function pointer, an optional pointer to user-defined context +information can be specified. This context will be passed to the decode +callback function. + +Before the decoder can be used, it needs to be synchronized onto the Intel PT +packet stream. To iterate over synchronization points in the Intel PT packet +stream in forward or backward direction, the query decoders offer the following +two synchronization functions respectively: + + pt_qry_sync_forward() + pt_qry_sync_backward() + + +To manually synchronize the decoder at a synchronization point (i.e. PSB packet) +in the Intel PT packet stream, use the following function: + + pt_qry_sync_set() + + +After successfully synchronizing, the query decoder will start reading the PSB+ +header to initialize its internal state. If tracing is enabled at this +synchronization point, the IP of the instruction, at which decoding should be +started, is returned. If tracing is disabled at this synchronization point, it +will be indicated in the returned status bits (see below). In this example, +synchronization to the first synchronization point is shown: + +~~~{.c} + struct pt_query_decoder *decoder; + uint64_t ip; + int status; + + status = pt_qry_sync_forward(decoder, &ip); + if (status < 0) + <handle error>(status); +~~~ + +In addition to a query decoder, you will need an instruction decoder for +decoding and classifying instructions. + + +#### In A Nutshell + +After synchronizing, you begin decoding instructions starting at the returned +IP. As long as you can determine the next instruction in execution order, you +continue on your own. 
Only when the next instruction cannot be determined by +examining the current instruction, you would ask the query decoder for guidance: + + * If the current instruction is a conditional branch, the + `pt_qry_cond_branch()` function will tell whether it was taken. + + * If the current instruction is an indirect branch, the + `pt_qry_indirect_branch()` function will provide the IP of its destination. + + +~~~{.c} + struct pt_query_decoder *decoder; + uint64_t ip; + + for (;;) { + struct <instruction> insn; + + insn = <decode instruction>(ip); + + ip += <instruction size>(insn); + + if (<is cond branch>(insn)) { + int status, taken; + + status = pt_qry_cond_branch(decoder, &taken); + if (status < 0) + <handle error>(status); + + if (taken) + ip += <branch displacement>(insn); + } else if (<is indirect branch>(insn)) { + int status; + + status = pt_qry_indirect_branch(decoder, &ip); + if (status < 0) + <handle error>(status); + } + } +~~~ + + +Certain aspects such as, for example, asynchronous events or synchronizing at a +location where tracing is disabled, have been ignored so far. Let us consider +them now. + + +#### Queries + +The query decoder provides four query functions: + + * `pt_qry_cond_branch()` Query whether the next conditional branch was + taken. + + * `pt_qry_indirect_branch()` Query for the destination IP of the next + indirect branch. + + * `pt_qry_event()` Query for the next event. + + * `pt_qry_time()` Query for the current time. + + +Each function returns either a positive vector of status bits or a negative +error code. For details on status bits and error conditions, please refer to +the `pt_status_flag` and `pt_error_code` enumerations in the intel-pt.h header. + +The `pts_ip_suppressed` status bit is used to indicate that no IP is available +at functions that are supposed to return an IP. Examples are the indirect +branch query function and both synchronization functions. 
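The return-value convention described above can be sketched as follows. This is a minimal illustration and not part of libipt: all `demo_` names are hypothetical, and the flag constants are local stand-ins so that the sketch compiles without the library. Real code would include `intel-pt.h` and test `pts_event_pending` and `pts_ip_suppressed` directly. Checking for pending events before looking at the IP is one reasonable ordering, not the only one.

```c
#include <assert.h>

/* Local stand-ins for the library's status flags (assumption: real code
 * uses the pt_status_flag constants from intel-pt.h instead).
 */
enum demo_status_flag {
    demo_event_pending = 1 << 0,
    demo_ip_suppressed = 1 << 1
};

/* What the caller should do next, based on a query's return value. */
enum demo_action {
    demo_act_error,   /* Negative return: report and re-synchronize. */
    demo_act_events,  /* Query pending events before continuing. */
    demo_act_no_ip,   /* No IP was provided; do not use the ip output. */
    demo_act_use_ip   /* The returned IP is valid. */
};

/* Test a status bit. Only meaningful for non-negative returns; a negative
 * value is an error code, not a bit-vector.
 */
static int demo_status_has(int status, int flag)
{
    assert(status >= 0);

    return (status & flag) != 0;
}

static enum demo_action demo_next_action(int status)
{
    if (status < 0)
        return demo_act_error;

    if (demo_status_has(status, demo_event_pending))
        return demo_act_events;

    if (demo_status_has(status, demo_ip_suppressed))
        return demo_act_no_ip;

    return demo_act_use_ip;
}
```

A caller would pass the value returned by a query or synchronization function to `demo_next_action()` and branch on the result.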
+ +The `pts_event_pending` status bit is used to indicate that there is an event +pending. You should query for this event before continuing execution flow +reconstruction. + +The `pts_eos` status bit is used to indicate the end of the trace. Any +subsequent query will return -pte_eos. + + +#### Events + +Events are signaled ahead of time. When you query for pending events as soon as +they are indicated, you will be aware of asynchronous events before you reach +the instruction associated with the event. + +For example, if tracing is disabled at the synchronization point, the IP will be +suppressed. In this case, it is very likely that a tracing enabled event is +signaled. You will also get events for initializing the decoder state after +synchronizing onto the Intel PT packet stream. For example, paging or execution +mode events. + +See the `enum pt_event_type` and `struct pt_event` in the intel-pt.h header for +details on possible events. This document does not give an example of event +processing. Refer to the implementation of the instruction flow decoder in +pt_insn.c for details. + + +#### Timing + +To be able to signal events, the decoder reads ahead until it arrives at a query +relevant packet. Errors encountered during that time will be postponed until +the respective query call. This reading ahead affects timing. The decoder will +always be a few packets ahead. When querying for the current time, the query +will return the time at the decoder's current packet. This corresponds to the +time at our next query. + + +#### Return Compression + +If Intel PT has been configured to compress returns, a successfully compressed +return is represented as a conditional branch instead of an indirect branch. +For a RET instruction, you first query for a conditional branch. If the query +succeeds, it should indicate that the branch was taken. In that case, the +return has been compressed. A not taken branch indicates an error. 
If the +query fails, the return has not been compressed and you query for an indirect +branch. + +There is no guarantee that returns will be compressed. Even though return +compression has been enabled, returns may still be represented as indirect +branches. + +To reconstruct the execution flow for compressed returns, you would maintain a +stack of return addresses. For each call instruction, push the IP of the +instruction following the call onto the stack. For compressed returns, pop the +topmost IP from the stack. See pt_retstack.h and pt_retstack.c for a sample +implementation. + + +## The Instruction Flow Layer + +The instruction flow layer provides a simple API for iterating over instructions +in execution order. Start by configuring and allocating a `pt_insn_decoder` as +shown below: + +~~~{.c} + struct pt_insn_decoder *decoder; + struct pt_config config; + int errcode; + + memset(&config, 0, sizeof(config)); + config.size = sizeof(config); + config.begin = <pt buffer begin>; + config.end = <pt buffer end>; + config.cpu = <cpu identifier>; + config.decode.callback = <decode function>; + config.decode.context = <decode context>; + + decoder = pt_insn_alloc_decoder(&config); + if (!decoder) + <handle error>(errcode); +~~~ + +An optional packet decode callback function may be specified in addition to the +mandatory config fields. If specified, the callback function will be called for +packets the decoder does not know about. The decoder will ignore the unknown +packet except for its size in order to skip it. If there is no decode callback +specified, the decoder will abort with `-pte_bad_opc`. In addition to the +callback function pointer, an optional pointer to user-defined context +information can be specified. This context will be passed to the decode +callback function. + +The image argument is optional. 
If no image is given, the decoder will use an +empty default image that can be populated later on and that is implicitly +destroyed when the decoder is freed. See below for more information on this. + + +#### The Traced Image + +In addition to the Intel PT configuration, the instruction flow decoder needs to +know the memory image for which Intel PT has been recorded. This memory image +is represented by a `pt_image` object. If decoding failed due to an IP lying +outside of the traced memory image, `pt_insn_next()` will return `-pte_nomap`. + +Use `pt_image_alloc()` to allocate and `pt_image_free()` to free an image. +Images may not be shared. Every decoder must use a different image. Use this +to prepare the image in advance or if you want to switch between images. + +Every decoder provides an empty default image that is used if no image is +specified during allocation. The default image is implicitly destroyed when the +decoder is freed. It can be obtained by calling `pt_insn_get_image()`. Use +this if you only use one decoder and one image. + +An image is a collection of contiguous, non-overlapping memory regions called +`sections`. Starting with an empty image, it may be populated with repeated +calls to `pt_image_add_file()` or `pt_image_add_cached()`, one for each section, +or with a call to `pt_image_copy()` to add all sections from another image. If +a newly added section overlaps with an existing section, the existing section +will be truncated or split to make room for the new section. + +In some cases, the memory image may change during the execution. You can use +the `pt_image_remove_by_filename()` function to remove previously added sections +by their file name and `pt_image_remove_by_asid()` to remove all sections for an +address-space. + +In addition to adding sections, you can register a callback function for reading +memory using `pt_image_set_callback()`. 
The `context` parameter you pass +together with the callback function pointer will be passed to your callback +function every time it is called. There can only be one callback at any time. +Adding a new callback will remove any previously added callback. To remove the +callback function, pass `NULL` to `pt_image_set_callback()`. + +Callback and files may be combined. The callback function is used whenever +the memory cannot be found in any of the image's sections. + +If more than one process is traced, the memory image may change when the process +context is switched. To simplify handling this case, an address-space +identifier may be passed to each of the above functions to define separate +images for different processes at the same time. The decoder will select the +correct image based on context switch information in the Intel PT trace. If +you want to manage this on your own, you can use `pt_insn_set_image()` to +replace the image a decoder uses. + + +#### The Traced Image Section Cache + +When using multiple decoders that work on related memory images it is desirable +to share image sections between decoders. The underlying file sections will be +mapped only once per image section cache. + +Use `pt_iscache_alloc()` to allocate and `pt_iscache_free()` to free an image +section cache. Freeing the cache does not destroy sections added to the cache. +They remain valid until they are no longer used. + +Use `pt_iscache_add_file()` to add a file section to an image section cache. +The function returns an image section identifier (ISID) that uniquely identifies +the section in this cache. Use `pt_image_add_cached()` to add a file section +from an image section cache to an image. + +Multiple image section caches may be used at the same time but it is recommended +not to mix sections from different image section caches in one image. + +A traced image section cache can also be used for reading an instruction's +memory via its IP and ISID as provided in `struct pt_insn`. 
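To make the idea of reading memory by ISID and IP more concrete, here is a toy model. It is not libipt's implementation: `demo_section`, `demo_iscache`, and `demo_iscache_read()` are hypothetical names, and treating an ISID as a positive, 1-based index into an array is an assumption made only for this sketch. With libipt, the corresponding call is `pt_iscache_read()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A contiguous region of memory loaded at a virtual address. */
struct demo_section {
    uint64_t vaddr;
    const uint8_t *data;
    size_t size;
};

/* A toy cache: section identifiers are 1-based indices into @sections. */
struct demo_iscache {
    const struct demo_section *sections;
    int nsections;
};

/* Copy up to @size bytes starting at @vaddr from section @isid.
 *
 * Returns the number of bytes copied, or -1 if the section does not exist
 * or does not contain @vaddr. Reads are truncated at the section's end.
 */
static int demo_iscache_read(const struct demo_iscache *cache,
                             uint8_t *buffer, size_t size, int isid,
                             uint64_t vaddr)
{
    const struct demo_section *section;
    uint64_t offset;
    size_t avail;

    if (!cache || !buffer || isid <= 0 || isid > cache->nsections)
        return -1;

    section = &cache->sections[isid - 1];
    assert(section->data != NULL);

    if (vaddr < section->vaddr)
        return -1;

    offset = vaddr - section->vaddr;
    if (offset >= section->size)
        return -1;

    avail = section->size - (size_t) offset;
    if (size > avail)
        size = avail;

    memcpy(buffer, section->data + offset, size);

    return (int) size;
}
```

Because the section identifier selects the section directly, the lookup does not depend on which image or address space the instruction came from.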
+
+The image section cache provides a cache of recently mapped sections and keeps
+them mapped when they are unmapped by the images that used them. This avoids
+repeated unmapping and re-mapping of image sections in some parallel debug
+scenarios or when reading memory from the image section cache.
+
+Use `pt_iscache_set_limit()` to set the limit of this cache in bytes. This
+accounts for the extra memory that will be used for keeping image sections
+mapped including any block caches associated with image sections. To disable
+caching, set the limit to zero.
+
+
+#### Synchronizing
+
+Before the decoder can be used, it needs to be synchronized onto the Intel PT
+packet stream. To iterate over synchronization points in the Intel PT packet
+stream in forward or backward direction, the instruction flow decoder offers
+the following two synchronization functions respectively:
+
+    pt_insn_sync_forward()
+    pt_insn_sync_backward()
+
+
+To manually synchronize the decoder at a synchronization point (i.e. PSB packet)
+in the Intel PT packet stream, use the following function:
+
+    pt_insn_sync_set()
+
+
+The example below shows synchronization to the first synchronization point:
+
+~~~{.c}
+    struct pt_insn_decoder *decoder;
+    int errcode;
+
+    errcode = pt_insn_sync_forward(decoder);
+    if (errcode < 0)
+        <handle error>(errcode);
+~~~
+
+The decoder will remember the last synchronization packet it decoded.
+Subsequent calls to `pt_insn_sync_forward` and `pt_insn_sync_backward` will use
+this as their starting point.
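As an illustration of what iterating over synchronization points in forward direction involves, the sketch below scans a raw buffer for the PSB pattern. It is not libipt's implementation, which does considerably more after finding a PSB; `demo_psb` and `demo_find_psb_forward()` are hypothetical names. The sketch assumes the PSB encoding given in the Intel SDM: a 16-byte packet consisting of the byte pair `02 82` repeated eight times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The PSB packet: the byte pair 02 82 repeated eight times. */
static const uint8_t demo_psb[16] = {
    0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82,
    0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82
};

/* Search @buf for the first PSB that starts at or after offset @from.
 *
 * Returns the offset of the PSB, or -1 if there is none.
 */
static long demo_find_psb_forward(const uint8_t *buf, size_t size,
                                  size_t from)
{
    size_t pos;

    assert(buf != NULL);

    if (size < sizeof(demo_psb))
        return -1;

    /* The last position at which a full PSB still fits into the buffer
     * is size - sizeof(demo_psb).
     */
    for (pos = from; pos <= size - sizeof(demo_psb); ++pos) {
        if (memcmp(buf + pos, demo_psb, sizeof(demo_psb)) == 0)
            return (long) pos;
    }

    return -1;
}
```

Passing the previous match offset plus one as `from` mimics moving from one synchronization point to the next.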
+ +You can get the current decoder position as offset into the Intel PT buffer via: + + pt_insn_get_offset() + + +You can get the position of the last synchronization point as offset into the +Intel PT buffer via: + + pt_insn_get_sync_offset() + + +#### Iterating + +Once the decoder is synchronized, you can iterate over instructions in execution +flow order by repeated calls to `pt_insn_next()` as shown in the following +example: + +~~~{.c} + struct pt_insn_decoder *decoder; + int status; + + for (;;) { + struct pt_insn insn; + + status = pt_insn_next(decoder, &insn, sizeof(insn)); + + if (insn.iclass != ptic_error) + <process instruction>(&insn); + + if (status < 0) + break; + + ... + } +~~~ + +Note that the example ignores non-error status returns. + +For each instruction, you get its IP, its size in bytes, the raw memory, an +identifier for the image section that contained it, the current execution mode, +and the speculation state, that is whether the instruction has been executed +speculatively. In addition, you get a coarse classification that can be used +for further processing without the need for a full instruction decode. + +If a traced image section cache is used the image section identifier can be used +to trace an instruction back to the binary file that contained it. This allows +mapping the instruction back to source code using the debug information +contained in or reachable via the binary file. + +Beware that `pt_insn_next()` may indicate errors that occur after the returned +instruction. The returned instruction is valid if its `iclass` field is set. + + +#### Events + +The instruction flow decoder uses an event system similar to the query +decoder's. Pending events are indicated by the `pts_event_pending` flag in the +status flag bit-vector returned from `pt_insn_sync_<where>()`, `pt_insn_next()` +and `pt_insn_event()`. 
+ +When the `pts_event_pending` flag is set on return from `pt_insn_next()`, use +repeated calls to `pt_insn_event()` to drain all queued events. Then switch +back to calling `pt_insn_next()` to resume with instruction flow decode as +shown in the following example: + +~~~{.c} + struct pt_insn_decoder *decoder; + int status; + + for (;;) { + struct pt_insn insn; + + status = pt_insn_next(decoder, &insn, sizeof(insn)); + if (status < 0) + break; + + <process instruction>(&insn); + + while (status & pts_event_pending) { + struct pt_event event; + + status = pt_insn_event(decoder, &event, sizeof(event)); + if (status < 0) + <handle error>(status); + + <process event>(&event); + } + } +~~~ + + +#### The Instruction Flow Decode Loop + +If we put all of the above examples together, we end up with a decode loop as +shown below: + +~~~{.c} + int handle_events(struct pt_insn_decoder *decoder, int status) + { + while (status & pts_event_pending) { + struct pt_event event; + + status = pt_insn_event(decoder, &event, sizeof(event)); + if (status < 0) + break; + + <process event>(&event); + } + + return status; + } + + int decode(struct pt_insn_decoder *decoder) + { + int status; + + for (;;) { + status = pt_insn_sync_forward(decoder); + if (status < 0) + break; + + for (;;) { + struct pt_insn insn; + + status = handle_events(decoder, status); + if (status < 0) + break; + + status = pt_insn_next(decoder, &insn, sizeof(insn)); + + if (insn.iclass != ptic_error) + <process instruction>(&insn); + + if (status < 0) + break; + } + + <handle error>(status); + } + + <handle error>(status); + + return status; + } +~~~ + + +## The Block Layer + +The block layer provides a simple API for iterating over blocks of sequential +instructions in execution order. The instructions in a block are sequential in +the sense that no trace is required for reconstructing the instructions. 
The IP +of the first instruction is given in `struct pt_block` and the IP of other +instructions in the block can be determined by decoding and examining the +previous instruction. + +Start by configuring and allocating a `pt_block_decoder` as shown below: + +~~~{.c} + struct pt_block_decoder *decoder; + struct pt_config config; + + memset(&config, 0, sizeof(config)); + config.size = sizeof(config); + config.begin = <pt buffer begin>; + config.end = <pt buffer end>; + config.cpu = <cpu identifier>; + config.decode.callback = <decode function>; + config.decode.context = <decode context>; + + decoder = pt_blk_alloc_decoder(&config); +~~~ + +An optional packet decode callback function may be specified in addition to the +mandatory config fields. If specified, the callback function will be called for +packets the decoder does not know about. The decoder will ignore the unknown +packet except for its size in order to skip it. If there is no decode callback +specified, the decoder will abort with `-pte_bad_opc`. In addition to the +callback function pointer, an optional pointer to user-defined context +information can be specified. This context will be passed to the decode +callback function. + + +#### Synchronizing + +Before the decoder can be used, it needs to be synchronized onto the Intel PT +packet stream. To iterate over synchronization points in the Intel PT packet +stream in forward or backward directions, the block decoder offers the following +two synchronization functions respectively: + + pt_blk_sync_forward() + pt_blk_sync_backward() + + +To manually synchronize the decoder at a synchronization point (i.e. 
PSB packet) +in the Intel PT packet stream, use the following function: + + pt_blk_sync_set() + + +The example below shows synchronization to the first synchronization point: + +~~~{.c} + struct pt_block_decoder *decoder; + int errcode; + + errcode = pt_blk_sync_forward(decoder); + if (errcode < 0) + <handle error>(errcode); +~~~ + +The decoder will remember the last synchronization packet it decoded. +Subsequent calls to `pt_blk_sync_forward()` and `pt_blk_sync_backward()` will use +this as their starting point. + +You can get the current decoder position as an offset into the Intel PT buffer via: + + pt_blk_get_offset() + + +You can get the position of the last synchronization point as an offset into the +Intel PT buffer via: + + pt_blk_get_sync_offset() + + +#### Iterating + +Once the decoder is synchronized, it can be used to iterate over blocks of +instructions in execution flow order by repeated calls to `pt_blk_next()` as +shown in the following example: + +~~~{.c} + struct pt_block_decoder *decoder; + int status; + + for (;;) { + struct pt_block block; + + status = pt_blk_next(decoder, &block, sizeof(block)); + + if (block.ninsn > 0) + <process block>(&block); + + if (status < 0) + break; + + ... + } +~~~ + +Note that the example ignores non-error status returns. + +A block contains enough information to reconstruct the instructions. See +`struct pt_block` in `intel-pt.h` for details. Note that errors returned by +`pt_blk_next()` apply after the last instruction in the provided block. + +It is recommended to use a traced image section cache so the image section +identifier contained in a block can be used for reading the memory containing +the instructions in the block. This also allows mapping the instructions back +to source code using the debug information contained in or reachable via the +binary file. + +In some cases, the last instruction in a block may cross image section +boundaries.
This can happen when a code segment is split into more than one +image section. The block is marked truncated in this case and provides the raw +bytes of the last instruction. + +The following example shows how instructions can be reconstructed from a block: + +~~~{.c} + struct pt_image_section_cache *iscache; + struct pt_block *block; + uint16_t ninsn; + uint64_t ip; + + ip = block->ip; + for (ninsn = 0; ninsn < block->ninsn; ++ninsn) { + uint8_t raw[pt_max_insn_size]; + <struct insn> insn; + int size, errcode; + + if (block->truncated && ((ninsn + 1) == block->ninsn)) { + memcpy(raw, block->raw, block->size); + size = block->size; + } else { + size = pt_iscache_read(iscache, raw, sizeof(raw), block->isid, ip); + if (size < 0) + break; + } + + errcode = <decode instruction>(&insn, raw, size, block->mode); + if (errcode < 0) + break; + + <process instruction>(&insn); + + ip = <determine next ip>(&insn); + } +~~~ + + +#### Events + +The block decoder uses an event system similar to the query decoder's. Pending +events are indicated by the `pts_event_pending` flag in the status flag +bit-vector returned from `pt_blk_sync_<where>()`, `pt_blk_next()` and +`pt_blk_event()`. + +When the `pts_event_pending` flag is set on return from `pt_blk_sync_<where>()` +or `pt_blk_next()`, use repeated calls to `pt_blk_event()` to drain all queued +events.
Then switch back to calling `pt_blk_next()` to resume with block decode +as shown in the following example: + +~~~{.c} + struct pt_block_decoder *decoder; + int status; + + for (;;) { + struct pt_block block; + + status = pt_blk_next(decoder, &block, sizeof(block)); + if (status < 0) + break; + + <process block>(&block); + + while (status & pts_event_pending) { + struct pt_event event; + + status = pt_blk_event(decoder, &event, sizeof(event)); + if (status < 0) + <handle error>(status); + + <process event>(&event); + } + } +~~~ + + +#### The Block Decode Loop + +If we put all of the above examples together, we end up with a decode loop as +shown below: + +~~~{.c} + int process_block(struct pt_block *block, + struct pt_image_section_cache *iscache) + { + uint16_t ninsn; + uint64_t ip; + + ip = block->ip; + for (ninsn = 0; ninsn < block->ninsn; ++ninsn) { + struct pt_insn insn; + + memset(&insn, 0, sizeof(insn)); + insn.speculative = block->speculative; + insn.isid = block->isid; + insn.mode = block->mode; + insn.ip = ip; + + if (block->truncated && ((ninsn + 1) == block->ninsn)) { + insn.truncated = 1; + insn.size = block->size; + + memcpy(insn.raw, block->raw, insn.size); + } else { + int size; + + size = pt_iscache_read(iscache, insn.raw, sizeof(insn.raw), + insn.isid, insn.ip); + if (size < 0) + return size; + + insn.size = (uint8_t) size; + } + + <decode instruction>(&insn); + <process instruction>(&insn); + + ip = <determine next ip>(&insn); + } + + return 0; + } + + int handle_events(struct pt_block_decoder *decoder, int status) + { + while (status & pts_event_pending) { + struct pt_event event; + + status = pt_blk_event(decoder, &event, sizeof(event)); + if (status < 0) + break; + + <process event>(&event); + } + + return status; + } + + int decode(struct pt_block_decoder *decoder, + struct pt_image_section_cache *iscache) + { + int status; + + for (;;) { + status = pt_blk_sync_forward(decoder); + if (status < 0) + break; + + for (;;) { + struct pt_block
block; + int errcode; + + status = handle_events(decoder, status); + if (status < 0) + break; + + status = pt_blk_next(decoder, &block, sizeof(block)); + + errcode = process_block(&block, iscache); + if (errcode < 0) + status = errcode; + + if (status < 0) + break; + } + + <handle error>(status); + } + + <handle error>(status); + + return status; + } +~~~ + + +## Parallel Decode + +Intel PT splits naturally into self-contained PSB segments that can be decoded +independently. Use the packet or query decoder to search for PSBs using +repeated calls to `pt_pkt_sync_forward()` and `pt_pkt_get_sync_offset()` (or +`pt_qry_sync_forward()` and `pt_qry_get_sync_offset()`). The following example +shows this using the query decoder, which already provides the IP needed in +the next step. + +~~~{.c} + struct pt_query_decoder *decoder; + uint64_t offset, ip; + int status, errcode; + + for (;;) { + status = pt_qry_sync_forward(decoder, &ip); + if (status < 0) + break; + + errcode = pt_qry_get_sync_offset(decoder, &offset); + if (errcode < 0) + <handle error>(errcode); + + <split trace>(offset, ip, status); + } +~~~ + +The individual trace segments can then be decoded using the query, instruction +flow, or block decoder as shown in the previous examples. + +When stitching decoded trace segments together, a sequence of linear (in the +sense that it can be decoded without Intel PT) code has to be filled in. Use +the `pts_eos` status indication to stop decoding early enough. Then proceed +until the IP at the start of the succeeding trace segment is reached.
When +using the instruction flow decoder, `pt_insn_next()` may be used for that as +shown in the following example: + +~~~{.c} + struct pt_insn_decoder *decoder; + struct pt_insn insn; + int status; + + for (;;) { + status = pt_insn_next(decoder, &insn, sizeof(insn)); + if (status < 0) + <handle error>(status); + + if (status & pts_eos) + break; + + <process instruction>(&insn); + } + + while (insn.ip != <next segment's start IP>) { + <process instruction>(&insn); + + status = pt_insn_next(decoder, &insn, sizeof(insn)); + if (status < 0) + <handle error>(status); + } +~~~ + + +## Threading + +The decoder library API is not thread-safe. Different threads may allocate and +use different decoder objects at the same time. Different decoders must not use +the same image object. Use `pt_image_copy()` to give each decoder its own copy +of a shared master image.
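
Taking the Parallel Decode and Threading notes together, each worker thread would get its own decoder over one PSB segment. The following is a minimal sketch of the bookkeeping behind the `<split trace>` placeholder only. It makes no libipt calls, and `struct trace_segment` and `split_trace()` are hypothetical names chosen for illustration, not part of the library. It assumes the synchronization offsets were collected in ascending order, for example via `pt_qry_get_sync_offset()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping type, not part of libipt: one PSB segment given
 * as byte offsets [begin, end) into the Intel PT buffer.
 */
struct trace_segment {
	uint64_t begin;
	uint64_t end;
};

/* Turn ascending synchronization offsets into trace segments.
 *
 * Segment i begins at offsets[i] and ends where segment i + 1 begins.  The
 * last segment ends at @size, the size of the trace buffer in bytes.
 *
 * Returns the number of segments written to @segs on success.
 * Returns -1 if @segs is too small or the offsets are not strictly ascending
 * and below @size.
 */
int split_trace(struct trace_segment *segs, size_t max,
		const uint64_t *offsets, size_t noffsets, uint64_t size)
{
	size_t idx;

	/* Passing NULL arrays is a programming error in this sketch. */
	assert(segs && offsets);

	if (max < noffsets)
		return -1;

	for (idx = 0; idx < noffsets; ++idx) {
		uint64_t end;

		end = (idx + 1 < noffsets) ? offsets[idx + 1] : size;
		if (end <= offsets[idx])
			return -1;

		segs[idx].begin = offsets[idx];
		segs[idx].end = end;
	}

	return (int) noffsets;
}
```

A worker could then presumably point its own `struct pt_config` at one segment by setting `config.begin` and `config.end` to the buffer address plus `begin` and `end`, so that its decoder reports `pts_eos` at the segment boundary, and give its decoder a private image obtained with `pt_image_copy()`.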
