Workshop

Tracing Bitcoin Core v23.0

Insights into Bitcoin Core process internals


0xB10C
bitcoin++ 2022
pls no photos, ty

p2p_monitor.py


Realtime Bitcoin Core P2P monitoring with USDT and eBPF

I'm Timo aka @0xB10C

supported by a Brink.dev grant

Workshop Structure

  1. Background and Theory (15 min)
  2. Hands on Tracing (60+ min)

Why tracing for Bitcoin Core?

Buzzword: observability

  • Goal: Gather low-level, process internal information during execution
  • Bitcoin Core is a black box (to many) doing complex things
  • Tracing helps with Bitcoin Core's security and robustness


What interfaces currently expose Bitcoin Core process internal state?
Making Bitcoin Core more stable, robust, and secure

Tracing with eBPF and USDT

Meet eBPF

extended Berkeley Packet Filter

"user-defined, sandboxed programs executed in the Linux kernel"


  • programs can be attached to kernel- and user-space tracepoints
  • can never crash, hang, or interfere with the kernel negatively
Use cases: Security, Networking, Profiling, Observability (pictures from ebpf.io)

Observability with eBPF

Tracing the kernel and userspace applications

               | dynamic               | static
  kernel space | kprobes / kretprobes  | (raw) tracepoints
  user space   | uprobes / uretprobes  | Userspace, Statically Defined Tracing (USDT)

Meet USDT

Userspace, Statically Defined Tracing

  • can hook into individual USDT tracepoints
  • once the tracepoint is reached, internal state is passed to the eBPF VM
  • USDT also supported by: PostgreSQL, MySQL, Python, NodeJS, glibc, ...

Tracepoints in Bitcoin Core

  • Implemented with SystemTap as compile-time dependency
  • ELF note in the binary pointing to the address of a tracepoint (assembly NOP)
  • TRACE macros for tracepoints with 0 to 12 arguments

              #define TRACE(context, event)
              #define TRACE1(context, event, a)
              #define TRACE2(context, event, a, b)
              ...
              #define TRACE11(context, event, a, b, c, d, e, f, g, h, i, j, k)
              #define TRACE12(context, event, a, b, c, d, e, f, g, h, i, j, k, l)
            
in src/util/trace.h
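
To make the ELF note bullet concrete, here is a minimal Python sketch that decodes the descriptor of a single stapsdt note, assuming a 64-bit little-endian binary. The helper name `parse_stapsdt_desc` and the example bytes (addresses, argument string) are made up for illustration and are not taken from a real bitcoind build.

```python
import struct


def parse_stapsdt_desc(desc: bytes) -> dict:
    """Decode the descriptor of one 64-bit little-endian stapsdt ELF note."""
    # Three 8-byte addresses: probe location (the NOP), base, and semaphore.
    location, base, semaphore = struct.unpack_from("<QQQ", desc, 0)
    # Followed by three NUL-terminated strings: provider, name, argument spec.
    provider, name, args = desc[24:].split(b"\x00")[:3]
    return {
        "location": location,
        "base": base,
        "semaphore": semaphore,
        "provider": provider.decode(),
        "name": name.decode(),
        "args": args.decode().split(),
    }


# Hand-built example descriptor; addresses and argument spec are hypothetical.
example = (
    struct.pack("<QQQ", 0x107C05, 0x579C90, 0)
    + b"net\x00outbound_message\x00-8@%r12 8@%rbx\x00"
)
probe = parse_stapsdt_desc(example)
print(f"{probe['provider']}:{probe['name']} at {probe['location']:#x}")
```

The TRACE macro's context ends up as the note's provider and the event as its name, which is why tools address a tracepoint as `context:event`.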

Example tracepoint

net:outbound_message: sending an outbound P2P message to a peer

              TRACE6(net, outbound_message,
                peer->GetId(),
                peer->m_addr_name.c_str(),
                peer->ConnectionTypeAsString().c_str(),
                msg.m_type.c_str(),
                msg.data.size(),
                msg.data.data()
              );
            
in src/net.cpp

Tooling with USDT support

bpftrace and bcc

  bpftrace                            | BPF Compiler Collection (bcc)
  high-level tracing language         | toolkit for creating tracing programs
  basic, quick scripting              | more complex tools
  only prints information             | no limitations
  (as logs, histograms, CSV, ...)     |
  github.com/iovisor/bpftrace         | github.com/iovisor/bcc

bpftrace example


              #!/usr/bin/env bpftrace

              BEGIN {
                printf("Logging P2P traffic\n")
              }

              usdt:./src/bitcoind:net:inbound_message {
                $peer_id = (int64) arg0;
                $peer_addr = str(arg1);
                $peer_type = str(arg2);
                $msg_type = str(arg3);
                $msg_len = arg4;
                printf("inbound '%s' msg from peer %d (%s, %s) with %d bytes\n", $msg_type, $peer_id, $peer_type, $peer_addr, $msg_len);
              }
            
contrib/tracing/log_p2p_traffic.bt

bcc example (I)

C program that's compiled to eBPF bytecode by BCC


              BPF_PERF_OUTPUT(inbound_messages);

              int trace_inbound_message(struct pt_regs *ctx) {
                struct p2p_message msg = {};
                bpf_usdt_readarg(1, ctx, &msg.peer_id);
                bpf_usdt_readarg_p(2, ctx, &msg.peer_addr, MAX_PEER_ADDR_LENGTH);
                bpf_usdt_readarg_p(3, ctx, &msg.peer_conn_type, MAX_PEER_CONN_TYPE_LENGTH);
                bpf_usdt_readarg_p(4, ctx, &msg.msg_type, MAX_MSG_TYPE_LENGTH);
                bpf_usdt_readarg(5, ctx, &msg.msg_size);
                inbound_messages.perf_submit(ctx, &msg, sizeof(msg));
                return 0;
              };
            
Snippet from contrib/tracing/p2p_monitor.py
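
The snippet fills a `struct p2p_message` whose definition is not shown on the slide. As a rough sketch of what reaches user space, the struct can be mirrored with Python's `ctypes`. The three buffer lengths below are assumptions for illustration (the real constants are defined in p2p_monitor.py), and the sample event is fabricated. bcc normally generates an equivalent class for you.

```python
import ctypes

# Assumed buffer sizes; check p2p_monitor.py for the real constants.
MAX_PEER_ADDR_LENGTH = 68
MAX_PEER_CONN_TYPE_LENGTH = 20
MAX_MSG_TYPE_LENGTH = 20


class P2PMessage(ctypes.Structure):
    """Python mirror of the C struct the eBPF program submits per message."""
    _fields_ = [
        ("peer_id", ctypes.c_uint64),
        ("peer_addr", ctypes.c_char * MAX_PEER_ADDR_LENGTH),
        ("peer_conn_type", ctypes.c_char * MAX_PEER_CONN_TYPE_LENGTH),
        ("msg_type", ctypes.c_char * MAX_MSG_TYPE_LENGTH),
        ("msg_size", ctypes.c_uint64),
    ]


# Fabricated event standing in for the raw bytes of one perf buffer record.
sample = P2PMessage(7, b"127.0.0.1:18333", b"outbound-full-relay", b"ping", 8)
raw = bytes(sample)

# Decoding the raw bytes back into named fields.
decoded = P2PMessage.from_buffer_copy(raw)
print(decoded.peer_id, decoded.msg_type.decode(), decoded.msg_size)
```

The fixed-size char arrays are why the eBPF side passes a maximum length to `bpf_usdt_readarg_p`: strings are copied into a bounded buffer and anything longer is cut off.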

bcc example (II)

Python script loading the C program, hooking into the tracepoints, and waiting for inbound messages


              ctx = USDT(path=str(bitcoind_path))
              ctx.enable_probe(probe="net:inbound_message", fn_name="trace_inbound_message")
              bpf = BPF(text=c_prog_from_prev_slide, usdt_contexts=[ctx])

              def handle_inbound(_, data, __):
                event = bpf["inbound_messages"].event(data)
                print(f"inbound from {event.peer_id}: {event.msg_type}")

              bpf["inbound_messages"].open_perf_buffer(handle_inbound)

              while True:
                bpf.perf_buffer_poll()
            
Snippet from contrib/tracing/p2p_monitor.py
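
Because the callback is plain Python, it can do more than print. The following sketch tallies inbound messages per message type. `FakeEvent` is a hypothetical stand-in for the event objects bcc passes to the callback, so the example runs without bcc or a running node.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for a bcc event; only the fields used here.
FakeEvent = namedtuple("FakeEvent", ["peer_id", "msg_type"])

msg_counts = Counter()


def count_inbound(event):
    """Tally inbound messages per message type instead of printing each one."""
    # Char arrays arrive as bytes, so decode before using them as keys.
    msg_counts[event.msg_type.decode("utf-8")] += 1


# Fabricated events for illustration.
for ev in [FakeEvent(0, b"ping"), FakeEvent(1, b"inv"), FakeEvent(0, b"ping")]:
    count_inbound(ev)

print(msg_counts.most_common())
```

In a real script, `count_inbound` would be called from the perf buffer callback after `bpf["inbound_messages"].event(data)` has decoded the record.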

Hands on Tracing

open b10c.me/bpp22

Task 0: Logging in

Log in and get familiar with the system.
Your IP should be listed on b10c.me/bpp22

              ssh workshop@[IPv4] -i [priv key file] # without the .pub

              # for example
              ssh workshop@13.37.83.33 -i ~/.ssh/id_bpp22workshop
            

  1. Which kernel version does your server have?
  2. How much memory does your server have?
  3. Are you able to switch to the root user with `sudo su`?

Task 1: Listing tracepoints

List the available tracepoints in the `bitcoind` binary.

  1. Which tracepoints does the binary contain?
  2. Which of the three methods show you information about the arguments passed to the tracepoints?
  3. Are all tracepoints documented in the tracing documentation?
  4. Are you able to find a tracepoint in the Bitcoin Core source code?
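
If a listing method prints stapsdt argument descriptors such as `-8@%r12`, they read as `size@location`, where a negative size marks a signed value. The sketch below splits such a descriptor. The function name `parse_arg_spec` and the sample descriptors are illustrative and not copied from a real bitcoind build.

```python
import re


def parse_arg_spec(spec: str) -> dict:
    """Split one stapsdt argument descriptor such as '-8@%r12'."""
    match = re.fullmatch(r"(-?\d+)@(.+)", spec)
    if match is None:
        raise ValueError(f"unexpected argument descriptor: {spec!r}")
    size = int(match.group(1))
    # Negative size means signed; the part after '@' is a register or address.
    return {"signed": size < 0, "bytes": abs(size), "location": match.group(2)}


# Illustrative descriptors only.
for spec in ["-8@%r12", "8@%rbx", "4@192(%rsp)"]:
    print(spec, parse_arg_spec(spec))
```

Note that the descriptor only gives size, signedness, and location. The meaning of each argument still has to come from the documentation or the source code.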

Task 2: Run `p2p_monitor.py`

Run the `p2p_monitor.py` Python script to observe the testnet P2P traffic.
  1. How many peers is your node connected to?
  2. Does your node have any inbound peers? What are the connection types of your peers?
  3. Were you able to observe, for example, a handshake with a peer or a ping-pong?

Task 3: Tracing a unit test

Tracing with eBPF is programmable

  1. How many UTXOs are added and spent during the test?
  2. What are the most common values of UTXOs added and removed?
  3. Which binary is the bpftrace script hooking into?
  4. What data does the fourth argument (i.e. `arg3`; zero-indexed) of the `add` and `spent` tracepoints contain?

Task 4: Tracepoint interface tests

Run the interface tests for the tracepoints.
  1. Do the tests pass?
  2. We require permissions to do BPF syscalls and read BPF maps for the tests. What happens when you run the tests with the workshop user?
  3. Is blindly running Python scripts downloaded from the internet as root user on your own machine a good idea?

Task 5: Add a new tracepoint

Add a tracepoint for the startup and shutdown of Bitcoin Core.
  1. What name did you choose for context and event?
  2. Where did you choose to place your tracepoints in the functions responsible for startup and shutdown?
  3. Is your shutdown tracepoint being triggered in case Bitcoin Core does not shut down cleanly? If not, would a tracepoint for this make sense?
  4. Who wants to open a PR to Bitcoin Core adding these tracepoints?

Thanks!

Extra

Hooking into USDT tracepoints

                    ┌──────────────────┐            ┌──────────────┐
                    │ tracing script   │            │ bitcoind     │
                    │==================│      2.    │==============│
                    │  eBPF  │ tracing │      hooks │              │
                    │  code  │ logic   │      into┌─┤►tracepoint 1─┼───┐ 3.
                    └────┬───┴──▲──────┘          ├─┤►tracepoint 2 │   │ pass args
                1.       │      │ 4.              │ │ ...          │   │ to eBPF
        User    compiles │      │ pass data to    │ └──────────────┘   │ program
        Space    & loads │      │ tracing script  │                    │
        ─────────────────┼──────┼─────────────────┼────────────────────┼───
        Kernel           │      │                 │                    │
        Space       ┌──┬─▼──────┴─────────────────┴────────────┐       │
                    │  │  eBPF program                         │◄──────┘
                    │  └───────────────────────────────────────┤
                    │ eBPF kernel Virtual Machine (sandboxed)  │
                    └──────────────────────────────────────────┘
            

Guidelines and best practices

  • Tracepoints need a clear motivation, use-case, and example
  • No expensive computations solely for a tracepoint
  • semi-stable API
  • eBPF VM limits

more details in doc/tracing.md