Dapper notes

Published: at 12:08 PM

Two solutions to associate all record entries with a given initiator

Black-box

use statistical regression techniques to infer that association

Annotation-based schemes

rely on applications or middleware to explicitly tag every record with a global identifier that links these message records back to the originating request
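A minimal sketch of the annotation-based idea, assuming a hypothetical `Record` type and helper names that are not part of Dapper: every record carries the identifier of its originating request, so interleaved records can be grouped afterwards without any statistical inference.

```python
from collections import defaultdict
from dataclasses import dataclass
import uuid


@dataclass
class Record:
    trace_id: str   # global identifier shared by every record of one request
    service: str    # hypothetical field: which component emitted the record
    message: str


def new_trace_id() -> str:
    """Allocate a globally unique identifier for a new originating request."""
    return uuid.uuid4().hex


def group_by_trace(records):
    """Link records back to their originating request via the shared identifier."""
    by_trace = defaultdict(list)
    for rec in records:
        by_trace[rec.trace_id].append(rec)
    return dict(by_trace)


# Two concurrent requests whose records arrive interleaved in one log.
req_a, req_b = new_trace_id(), new_trace_id()
log = [
    Record(req_a, "frontend", "received request"),
    Record(req_b, "frontend", "received request"),
    Record(req_a, "backend", "query index"),
    Record(req_b, "backend", "query index"),
    Record(req_a, "frontend", "sent response"),
]
grouped = group_by_trace(log)
```

The cost of this scheme is that every component on the request path has to propagate the identifier, which is why it depends on instrumenting applications or shared middleware.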

Trace trees and spans

Instrumentation point

Entry:

Computation deferred or made asynchronous

Inter-process communication

Annotation API

Trace collection

Out-of-band trace collection

The Dapper system as described performs trace logging and collection out-of-band with the request tree itself. Because:

Security and privacy considerations

Default:

Opt-in:

Unanticipated benefits:

Tracing Overhead

Generation Overhead

Trace generation overhead is the most critical segment of Dapper’s performance footprint, since it is harder to turn off in an emergency than collection and analysis.

Most important sources of trace generation overhead:

Creation time cost (average, measured on a 2.2 GHz x86 server):

the difference between root and non-root span creation is the added cost of allocating a globally unique trace id for root spans

Collection overhead

Effect on production workloads

Adaptive sampling

The Dapper overhead attributed to any given process is proportional to the number of traces that process samples per unit time.
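Because overhead scales with sampled traces per unit time, one way to keep it roughly constant is to target a trace rate instead of a fixed probability. The sketch below is an assumption about how such a controller could look, not Dapper's implementation; `AdaptiveSampler`, its parameters, and the windowed update rule are hypothetical.

```python
import random


class AdaptiveSampler:
    """Derives a sampling probability from a target rate of sampled traces."""

    def __init__(self, target_traces_per_sec: float, initial_probability: float = 1.0):
        self.target = target_traces_per_sec
        self.probability = initial_probability

    def update(self, requests_seen: int, window_sec: float) -> float:
        """Recompute the probability from the request rate seen in the last window."""
        request_rate = requests_seen / window_sec
        if request_rate <= 0:
            self.probability = 1.0  # idle process: trace everything
        else:
            self.probability = min(1.0, self.target / request_rate)
        return self.probability

    def should_sample(self, rng=random) -> bool:
        """Decide at the root of a request whether to trace it."""
        return rng.random() < self.probability


sampler = AdaptiveSampler(target_traces_per_sec=10.0)
low_traffic = sampler.update(requests_seen=50, window_sec=10.0)          # 5 requests/s
high_traffic = sampler.update(requests_seen=1_000_000, window_sec=10.0)  # 100,000 requests/s
```

With these made-up numbers the low-traffic process traces every request, while the high-traffic one drops to a probability of about 0.0001, so both sample at most roughly ten traces per second.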

Coping with aggressive sampling

low sampling probabilities, often as low as 0.01% for high-traffic services, do not hinder the most important analyses for high-throughput services

If a notable execution pattern surfaces once in such systems, it will surface thousands of times.
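A quick back-of-the-envelope check of that argument. Only the 0.01% sampling probability comes from the text; the request volume and the frequency of the pattern are hypothetical numbers chosen for illustration.

```python
def expected_sampled_occurrences(requests: int, pattern_rate: float,
                                 sampling_probability: float) -> float:
    """Expected number of sampled traces that exhibit a given pattern."""
    return requests * pattern_rate * sampling_probability


sampling_probability = 0.0001      # 0.01%, the figure quoted above
requests_per_day = 1_000_000_000   # hypothetical high-throughput service
pattern_rate = 0.001               # hypothetical: pattern appears in 1 of 1,000 requests

expected = expected_sampled_occurrences(
    requests_per_day, pattern_rate, sampling_probability
)
```

Under these assumptions the pattern still shows up in roughly 100 sampled traces per day, which is enough to notice and investigate it.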

Additional sampling during collection

Why:

How:
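The Dapper paper describes hashing each span's trace id to a scalar z between 0 and 1 and keeping the span only if z is below a collection sampling coefficient. The sketch below illustrates that idea; the choice of SHA-256, the 53-bit truncation, and the function names are assumptions, not Dapper's implementation.

```python
import hashlib


def trace_id_to_unit_interval(trace_id: str) -> float:
    """Hash a trace id to a scalar z with 0 <= z < 1."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Keep 53 bits so the quotient is exactly representable and strictly below 1.
    bits = int.from_bytes(digest[:8], "big") >> 11
    return bits / 2**53


def keep_span(trace_id: str, collection_coefficient: float) -> bool:
    """Keep the span only if its trace hashes below the coefficient."""
    return trace_id_to_unit_interval(trace_id) < collection_coefficient


# Every span of one trace gets the same decision, so traces are
# either collected whole or dropped whole.
decisions = [keep_span("trace-1234", 0.25) for _ in range(3)]
```

Keying the decision on the trace id rather than on a random draw per span is what prevents partially collected traces, and lowering the coefficient only ever removes traces from the kept set.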

Remark

General-purpose Dapper Tools

Depot API

to access trace data: