StarTrace/Instrumentation Backplane

From RAD Lab

Instrumentation BackPlane (IBP)

[Producers] <=> [IBP] <=> [Consumers]

Executive Summary

The StarTrace Instrumentation BackPlane (IBP) attempts to solve the problem of getting information out of an end-to-end (e2e) path-based tracing system such as StarTrace or X-Trace. An e2e tracing system is of little use unless the information it gathers can be used effectively to fix or improve the system being traced.

There are several challenges:

  • There is a lot of information, especially in a large data center, but not all of it is needed all the time.
  • Paths must be reconstructed efficiently from many instrumentation points, and this should be transparent to users.
  • It must be fast and impose low overhead.
  • It should be dynamic, preferably with some type of feedback, i.e., collecting more information as it is needed.
  • It must work well with the existing software stack.

The IBP is our first stab at this problem.

Logical View

The IBP is designed to facilitate information flow in a path-based instrumentation system. In such a system, the primary object of interest is a path, which consists of one or more events. These events may be of heterogeneous types, and events from the same path are connected by their identifiers. Every event has a parent, except for the root event.

In the IBP, there are two types of actors. One is the producer, which produces events. The other is the consumer, which consumes events or paths. The IBP is therefore responsible not only for storing events but also for reconstructing paths from them. This relieves the consumer of path reconstruction and lets it focus on path analysis.
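
To make the reconstruction step concrete, here is a minimal Python sketch. It is not the IBP implementation: the field names (path_id, event_id, parent_id, label) and both helper functions are assumptions made for illustration. It only shows how events that arrive in arbitrary order could be grouped into paths and linked to their parents, so that a consumer receives whole paths.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    path_id: str              # identifier shared by all events of one path (assumed)
    event_id: str             # identifier of this event (assumed)
    parent_id: Optional[str]  # None marks the root event
    label: str                # free-form description, for illustration only


def reconstruct_paths(events):
    """Group events by path and link each event to its children.

    Returns {path_id: (root_event, {parent_event_id: [child events]})}.
    """
    by_path = defaultdict(list)
    for ev in events:
        by_path[ev.path_id].append(ev)

    paths = {}
    for path_id, evs in by_path.items():
        children = defaultdict(list)
        root = None
        for ev in evs:
            if ev.parent_id is None:
                root = ev
            else:
                children[ev.parent_id].append(ev)
        paths[path_id] = (root, dict(children))
    return paths


def walk(root, children):
    """Yield the events of one path in depth-first order from the root."""
    stack = [root]
    while stack:
        ev = stack.pop()
        yield ev
        # Reverse so that siblings come out in their arrival order.
        stack.extend(reversed(children.get(ev.event_id, [])))


# Events deliberately arrive out of order and from two different paths.
events = [
    Event("p1", "b", "a", "app server handles request"),
    Event("p1", "a", None, "http request received"),
    Event("p2", "x", None, "http request received"),
    Event("p1", "c", "b", "sql query"),
]
paths = reconstruct_paths(events)
root, children = paths["p1"]
print([ev.event_id for ev in walk(root, children)])  # ['a', 'b', 'c']
```

A real backplane would also have to cope with events that arrive late or never arrive; this sketch assumes that every event of a path is already present.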

We can think of IBP as a logical bus, with many producers and consumers attached to it. Information flows from the producers to consumers on this bus.
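
The bus analogy can be sketched as a tiny in-process publish/subscribe object. The class name LogicalBus, its methods, and the dictionary-shaped events are hypothetical; a real IBP would presumably sit between processes or machines rather than inside a single Python program.

```python
class LogicalBus:
    """Toy in-process bus: every published event goes to every consumer."""

    def __init__(self):
        self._consumers = []

    def attach_consumer(self, callback):
        # A consumer is modelled as any callable that accepts one event.
        self._consumers.append(callback)

    def publish(self, event):
        # Called by producers; delivery is synchronous and in attach order.
        for callback in self._consumers:
            callback(event)


bus = LogicalBus()
seen = []    # first consumer: keeps every event it receives
counts = {}  # second consumer: counts events per producer


def count_by_producer(event):
    counts[event["producer"]] = counts.get(event["producer"], 0) + 1


bus.attach_consumer(seen.append)
bus.attach_consumer(count_by_producer)

# Two producers ("http1" and "db17") put events on the bus.
bus.publish({"producer": "http1", "msg": "request received"})
bus.publish({"producer": "db17", "msg": "sql query"})
bus.publish({"producer": "http1", "msg": "response sent"})
```

Producers and consumers never refer to each other directly; each side only knows the bus, which is the property the analogy is meant to capture.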

Examples of Producers/Consumers/Usage

Examples of Producers could include:

  • A log file that is appended to by some application component
  • An Apache module that generates events for each request
  • A modified database or database connector that generates events for each SQL query
  • Possibly a DTrace probe

Examples of Consumers could include:

  • A visualization website that webapp admins could use to "see" how the application is performing
  • A machine learning algorithm that does clustering/classification based on paths
  • Some type of interactive path query tool that could answer the following questions:
    • "what are all the requests that went through machine x and machine y?"
    • "what is the average time for requests of type x that didn't require an SQL query?"
    • "which paths involved more than x steps?"
    • "what are the 10 previous paths that involved machine x before machine x threw error y on a request?"
    • "which paths took more than 1 second to complete?"
    • "is there an app server that is being underutilized compared to all the other app servers?"
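
A few of the questions above can be sketched as plain functions over already reconstructed paths. The Step record, its fields, the sample data, and the function names are all assumptions for illustration; the Consumer API itself is not yet specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    machine: str  # machine that handled this step (assumed field)
    start: float  # start time in seconds (assumed field)
    end: float    # end time in seconds (assumed field)


# Hypothetical reconstructed paths, keyed by path identifier.
paths = {
    "p1": [Step("web1", 0.0, 0.2), Step("app1", 0.2, 0.9), Step("db1", 0.9, 1.4)],
    "p2": [Step("web1", 0.0, 0.1), Step("app2", 0.1, 0.3)],
}


def duration(steps):
    """Wall-clock time from the earliest start to the latest end."""
    return max(s.end for s in steps) - min(s.start for s in steps)


def paths_through(paths, *machines):
    """Which requests went through all of the given machines?"""
    wanted = set(machines)
    return sorted(
        pid for pid, steps in paths.items()
        if wanted.issubset({s.machine for s in steps})
    )


def paths_slower_than(paths, seconds):
    """Which paths took more than `seconds` to complete?"""
    return sorted(pid for pid, steps in paths.items() if duration(steps) > seconds)


def paths_with_more_steps_than(paths, n):
    """Which paths involved more than `n` steps?"""
    return sorted(pid for pid, steps in paths.items() if len(steps) > n)


print(paths_through(paths, "web1", "app1"))   # ['p1']
print(paths_slower_than(paths, 1.0))          # ['p1']
print(paths_with_more_steps_than(paths, 2))   # ['p1']
```

Each query is a simple filter once whole paths are available, which is one argument for having the backplane, rather than each consumer, perform path reconstruction.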

Specifications

Producer API

 register_producer(name): int
  • name: producer name (e.g., 'http1' or 'db17')
  • returns a producer id
 unregister_producer(id): bool
  • id: the producer id
 add_event(xid, pid, data*): void
  • xid: X-trace/StarTrace id
  • pid: producer id
  • data: producer-specific payload. Open question: is there a better way to represent this?
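
As a rough illustration of how the three calls fit together, here is an in-memory Python sketch. Only the function names and parameters come from the specification above; the class name InMemoryBackplane, the storage layout, the reading of data* as variadic arguments, and the decision to reject events from unregistered producers are assumptions.

```python
import itertools


class InMemoryBackplane:
    """Toy stand-in for the IBP that keeps everything in process memory."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._producers = {}  # producer id -> producer name
        self.events = []      # (xid, pid, data) tuples, in arrival order

    def register_producer(self, name: str) -> int:
        """Register a producer and return its new producer id."""
        pid = next(self._next_id)
        self._producers[pid] = name
        return pid

    def unregister_producer(self, pid: int) -> bool:
        """Return True if the id was registered, False otherwise."""
        return self._producers.pop(pid, None) is not None

    def add_event(self, xid: str, pid: int, *data) -> None:
        """Record one event; `data` is whatever the producer supplies."""
        if pid not in self._producers:
            # Assumption: the spec does not say how unknown ids are handled.
            raise KeyError(f"unknown producer id {pid}")
        self.events.append((xid, pid, data))


ibp = InMemoryBackplane()
http1 = ibp.register_producer("http1")
ibp.add_event("xid-0001", http1, "GET /index.html", 200)
first_unregister = ibp.unregister_producer(http1)   # True
second_unregister = ibp.unregister_producer(http1)  # False, already removed
```

Treating data as an opaque tuple keeps the producer side trivial but pushes all interpretation onto consumers, which is the trade-off behind the open question about the data parameter.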

Consumer API

Data-exchange Format

  • YAML?
  • JSON?
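
The format is still undecided. Purely as an illustration of the JSON option, the sketch below encodes one event with Python's standard json module and decodes it again. The keys mirror the add_event parameters (xid, pid, data); the layout of data and all values are hypothetical.

```python
import json

# One hypothetical event; the nested "data" layout is an assumption.
event = {
    "xid": "xid-0001",
    "pid": 1,
    "data": {"url": "/index.html", "status": 200},
}

# sort_keys gives a stable key order, which helps when comparing log lines.
encoded = json.dumps(event, sort_keys=True)
decoded = json.loads(encoded)

print(encoded)
```

YAML could carry the same structure; the choice mostly affects readability and parser availability in each producer's language.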

IBP Reference Implementation

Demonstration of Usefulness

Relevant Papers