benzyx / magic-trace

magic-trace collects and displays high-resolution traces of what a process is doing

Home Page: https://magic-trace.org


magic-trace

Overview

magic-trace collects and displays high-resolution traces of what a process is doing. People have used it to:

  • figure out why an application running in production handles some requests slowly while simultaneously handling a sea of uninteresting requests,
  • look at what their code is actually doing instead of what they think it's doing,
  • get a history of what their application was doing before it crashed, instead of a mere stacktrace at that final instant,
  • ...and much more!

magic-trace:

  • has low overhead [1],
  • doesn't require application changes to use,
  • traces every function call with ~40ns resolution, and
  • renders a timeline of call stacks going back (a configurable) ~10ms.

You use it like perf: point it to a process and off it goes. The key difference from perf is that instead of sampling call stacks throughout time, magic-trace uses Intel Processor Trace to snapshot a ring buffer of all control flow leading up to a chosen point in time [2]. Then, you can explore an interactive timeline of what happened.

You can point magic-trace at a function such that when your application calls it, magic-trace takes a snapshot. Alternatively, you can attach it to a running process and detach it with Ctrl+C, to see a trace of an arbitrary point in your program.

Testimonials

"Magic-trace is one of the simplest command-line debugging tools I have ever used."

  • Francis Ricci, Jane Street

"Magic-trace is not just for performance. The tool gives insight directly into what happens in your program, when, and why. Consider using it for all your introspective goals!"

  • Andrew Hunter, Jane Street

"I can't believe you open sourced that."

  • Anonymous, Jump Trading

more testimonials...

Getting started

  1. Make sure the system you want to trace is supported. The constraints that most commonly trip people up are: VMs are mostly not supported, Intel only (Skylake or later), Linux only.

  2. Grab a release binary from the latest release page.

    1. If downloading the prebuilt binary (not package), run chmod +x magic-trace [3]
    2. If downloading the package, run sudo dpkg -i magic-trace*.deb

    Then, test it by running magic-trace -help, which should bring up some help text.

  3. Here's a sample C program to try out. It's a slightly modified version of the example in man 3 dlopen. Download it, build it with gcc -ldl demo.c -o demo, then leave it running with ./demo. We're going to use that program to learn how dlopen works.

  4. Run magic-trace attach -pid $(pidof demo). When you see the message that it has successfully attached, wait a couple of seconds and Ctrl+C magic-trace. It will output a file called trace.ftf in your working directory.

  5. Open Perfetto, click "Open trace file" in the top-left-hand corner and give it the trace file generated in the previous step.

  6. Once it's loaded, expand the trace by clicking the two little arrows in the main trace area.

  7. That should have expanded into a trace. Your screen should now look something like this:

  8. Use WASD and the scroll wheel to navigate around. W zooms in (you'll need to zoom in a bunch to see anything useful), S zooms out, A moves left, D moves right, and the scroll wheel moves your viewport up and down the stack. You'll only need to scroll to see particularly deep stack traces, so it's probably not useful for this example. Zoom in until you can see an individual loop through dlopen/dlsym/cos/printf/dlclose.

  9. Click and drag on the white space around the call stacks to measure. Plant flags by clicking in the timeline along the top. Using the measurement tool, measure how long it takes to run cos. On my screen it takes ~3us.

Congratulations, you just magically traced your first program!
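The sample program mentioned in the steps above is not reproduced here. As a rough stand-in, here is a minimal sketch modeled on the example in man 3 dlopen. It is not the actual demo.c: the helper name cos_via_dlopen, the DEMO_MAIN switch, the argument 2.0, and the library name "libm.so.6" (the usual glibc soname) are all assumptions made for illustration.

```c
/* demo.c: hypothetical stand-in for the sample program, after `man 3 dlopen`. */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Load libm, look up `cos`, call it once, and unload the library again.
 * Returns 0 on success and -1 if the library or the symbol cannot be found. */
int cos_via_dlopen(double x, double *result) {
    assert(result != NULL);

    /* Assumption: "libm.so.6" is the glibc soname; adjust for other systems. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    double (*cosine)(double);
    *(void **)(&cosine) = dlsym(handle, "cos"); /* cast idiom from the man page */
    const char *error = dlerror();
    if (error != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", error);
        dlclose(handle);
        return -1;
    }

    *result = cosine(x);
    dlclose(handle);
    return 0;
}

#ifdef DEMO_MAIN
/* Loop forever so there is a long-lived process to attach to. */
int main(void) {
    for (;;) {
        double y;
        if (cos_via_dlopen(2.0, &y) == 0) {
            printf("%f\n", y);
        }
    }
}
#endif
```

To get a long-running process out of this sketch, build it with the main function enabled, for example gcc -DDEMO_MAIN demo.c -o demo -ldl. Each pass through the loop should then show up in the trace as one dlopen/dlsym/cos/printf/dlclose group.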

In contrast to traditional perf workflows, magic-trace excels at hypothesis generation. For example, you might notice that taking 3us to run cos is a really long time! If you zoom in even more, you'll see that there are actually 4 pink "[untraced]" cells in there. If you re-run magic-trace as root and pass it -include-kernel, you'll see stack traces for those. They're page fault handlers! If you change the demo program to call cos twice in a row and retrace it, you'll see that the second call takes far less time and does not page fault.
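The "call cos twice in a row" experiment can also be cross-checked without a trace. The sketch below swaps in a different technique, plain clock_gettime timing around each call, so it only approximates what the timeline shows. The function name time_two_cos_calls and the library name "libm.so.6" are assumptions, and the numbers it reports depend on the machine, so no expected output is given.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime under strict -std= modes */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>

/* Current time on the monotonic clock, in nanoseconds. */
static long long now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Open libm, call cos twice back to back, and report how long each call took.
 * Returns 0 on success and -1 if libm or `cos` could not be loaded. */
int time_two_cos_calls(double x, long long *first_ns, long long *second_ns) {
    assert(first_ns != NULL && second_ns != NULL);

    /* Assumption: "libm.so.6" is the glibc soname; adjust for other systems. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        return -1;
    }
    double (*cosine)(double);
    *(void **)(&cosine) = dlsym(handle, "cos");
    if (cosine == NULL) {
        dlclose(handle);
        return -1;
    }

    volatile double sink; /* results are stored so the calls are clearly used */
    long long t0 = now_ns();
    sink = cosine(x); /* first call after dlopen */
    long long t1 = now_ns();
    sink = cosine(x); /* second call, immediately afterwards */
    long long t2 = now_ns();
    (void)sink;

    *first_ns = t1 - t0;
    *second_ns = t2 - t1;
    dlclose(handle);
    return 0;
}
```

If the page fault hypothesis holds on your machine, the first duration should come out noticeably larger than the second. The clock reads add their own overhead, so treat the trace as the more precise view.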

How to use it

magic-trace continuously records control flow into a ring buffer. Upon some sort of trigger, it takes a snapshot of that buffer and reconstructs call stacks.
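To make the ring buffer idea concrete, here is a toy sketch in C. It is not how magic-trace or Intel PT is implemented. It only illustrates the concept of recording continuously, overwriting the oldest entries, and copying out what survives when a trigger fires. All names (ring_t, ring_record, ring_snapshot) and the capacity of 4 are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4 /* tiny on purpose, for illustration only */

typedef struct {
    uint64_t events[RING_CAPACITY];
    size_t next;  /* slot the next event will overwrite */
    size_t count; /* number of valid events, at most RING_CAPACITY */
} ring_t;

void ring_init(ring_t *r) {
    r->next = 0;
    r->count = 0;
}

/* Record an event, overwriting the oldest one once the buffer is full. */
void ring_record(ring_t *r, uint64_t event) {
    r->events[r->next] = event;
    r->next = (r->next + 1) % RING_CAPACITY;
    if (r->count < RING_CAPACITY) {
        r->count++;
    }
}

/* Copy the surviving events into `out`, oldest first.
 * `out` must have room for RING_CAPACITY entries. Returns the number copied. */
size_t ring_snapshot(const ring_t *r, uint64_t *out) {
    assert(out != NULL);
    size_t start = (r->next + RING_CAPACITY - r->count) % RING_CAPACITY;
    for (size_t i = 0; i < r->count; i++) {
        out[i] = r->events[(start + i) % RING_CAPACITY];
    }
    return r->count;
}
```

Recording the events 1 through 10 into this buffer and then taking a snapshot yields 7, 8, 9, 10: only the most recent history leading up to the trigger survives, which is the property the snapshot relies on.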

There are two ways to take a snapshot:

You just did this one: Ctrl+C magic-trace. If magic-trace terminates without already having taken a snapshot, it takes a snapshot of the end of the program.

You can also trigger snapshots when the application calls a function. To do so, pass magic-trace the -trigger flag.

  • -trigger ? brings up a fuzzy-finding selector that lets you choose from all symbols in your executable,
  • -trigger SYMBOL selects a specific, fully mangled, symbol you know ahead of time, and
  • -trigger . selects the default symbol magic_trace_stop_indicator.

Stop indicators are powerful. Here are some ideas for where you might want to place one:

  • If you're using an asynchronous runtime, any time a scheduler cycle takes too long.
  • In a server, when a request takes a surprisingly long time.
  • After the garbage collector runs, to see what it's doing and what it interrupted.
  • After a compiler pass has completed.

You may leave the stop indicator in production code. It doesn't need to do anything in particular; magic-trace just needs the name. It is simply an empty function that must not be inlined. Calling it costs ~10us, but only when magic-trace actually uses it to take a snapshot.
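As a sketch of what such a stop indicator could look like in a C application: the symbol name magic_trace_stop_indicator comes from the -trigger . default described above, while the noinline attribute (GCC/Clang syntax), the empty asm statement, and the helper stop_if_slow are assumptions chosen for illustration, not code taken from magic-trace.

```c
#include <assert.h>
#include <stdint.h>

/* Empty marker function. Only its symbol name matters to the tracer.
 * noinline (GCC/Clang) keeps it a real, separate function. */
__attribute__((noinline)) void magic_trace_stop_indicator(void) {
    /* Empty asm statement so the optimizer does not discard calls to this function. */
    __asm__ volatile("");
}

/* Hypothetical helper: call the marker when a request overran its time budget.
 * Returns 1 if the marker was called and 0 otherwise. */
int stop_if_slow(uint64_t elapsed_ns, uint64_t budget_ns) {
    assert(budget_ns > 0); /* a zero budget would flag almost every request */
    if (elapsed_ns > budget_ns) {
        magic_trace_stop_indicator();
        return 1;
    }
    return 0;
}
```

A server could call stop_if_slow(elapsed, budget) at the end of each request. With magic-trace attached and -trigger . passed, the first request that exceeds its budget would then be the point at which the snapshot is taken.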

Documentation

More documentation is available on the magic-trace wiki.

Contributing

If you'd like to contribute:

  1. read the build instructions,
  2. set up your editor,
  3. take a quick tour through the codebase, then
  4. hit up the issue tracker for a good starter project.

Privacy policy

magic-trace does not send your code or derivatives of your code (including traces) anywhere.

Perfetto runs entirely in your browser and, as far as we can tell, also does not send your trace anywhere. If you're worried about that changing one day, build your own local copy of the perfetto UI and use that instead.

Acknowledgements

Tristan Hume is the original author of magic-trace. He wrote it while working at Jane Street, who currently maintains it.

Intel PT is the foundational technology upon which magic-trace rests. We'd like to thank the people at Intel for their years-long efforts to make it available, despite its slow uptake in the greater software community.

magic-trace would not be possible without perf's extensive support for Intel PT. perf does most of the work in interpreting Intel PT's output, and magic-trace likely wouldn't exist were it not for the perf developers' efforts. Thank you, everyone who contributed.

magic-trace doesn't do any visualization itself; it relies on Perfetto. We'd like to thank the people at Google who worked on it. It solves a hard problem well so we don't have to.

Footnotes

  1. Less than perf -g, more than perf -glbr.

  2. perf can do this too, but that's not how most people use it. In fact, if you peek under the hood you'll discover that magic-trace uses perf for exactly this.

  3. https://github.com/actions/upload-artifact/issues/38

About

License: MIT License


Languages

OCaml 87.7%, C 11.4%, Standard ML 0.5%, Makefile 0.3%