From 10aad2459bc3ded03f1863b0ced76c18ea6432e4 Mon Sep 17 00:00:00 2001 From: Dongdong Tao Date: Sat, 13 Jun 2026 16:42:06 +0900 Subject: [PATCH 1/2] tools: add cephtrace_report.py, a visual summary of trace output radostrace and osdtrace stream one line per IO - thousands of rows. The existing analyzers turn that into numeric tables but draw nothing, and require knowing which flags to run. cephtrace_report is the zero-flag visual glance: it auto-detects the tool and prints bar charts plus a log-scale latency histogram so a capture's shape is understandable at sight. For osdtrace the headline is a 'where the time goes' block - the share of total latency per stage (messenger/queue/osd/bluestore, with kv_commit broken out) - which turns the dense per-stage numbers into an instant 'it's RocksDB commit, not the network'. radostrace gets a read/write split, per-pool bars, and a slowest-acting-set culprit hint. It deliberately does NOT add a second parser: the osdtrace path reuses analyze_osdtrace_output.parse_line and the radostrace path reuses analyze_radostrace_output.detect_file_format and the same column layout, so it can't drift from the analyzers. It complements them (and links to them for the deep-dive numbers) rather than replacing them. Reads a file or stdin, and both the space format and the CSV that 'radostrace -o' writes. Pure stdlib; flake8-clean and pylint 10/10 (the snake_case filename is required for both the linter and the pytest import). Adds a committed small osdtrace sample and offline pytest smoke tests that run via the existing tox -e test in CI. Docs: a new visualizing-output.md plus pointers from the two analyzer docs and the README. --- README.md | 1 + doc/analyze-osdtrace.md | 4 + doc/analyze-radostrace.md | 4 + doc/visualizing-output.md | 104 ++++++++++ tests/test_cephtrace_report.py | 106 ++++++++++ tools/cephtrace_report.py | 310 ++++++++++++++++++++++++++++ tools/sample-logs/osdtrace_data.log | 11 + 7 files changed, 540 insertions(+) create mode 100644 doc/visualizing-output.md create mode 100644 tests/test_cephtrace_report.py create mode 100755 tools/cephtrace_report.py create mode 100644 tools/sample-logs/osdtrace_data.log diff --git a/README.md b/README.md index 12010cc..92bb901 100644 --- a/README.md +++ b/README.md @@ -127,6 +127,7 @@ See cephtrace in action troubleshooting real performance issues. | **DWARF Files** | Managing debug information for tracing | [Guide](doc/dwarf-json-files.md) | ### ๐Ÿ“Š Analysis & Tools +- **[Visualizing Trace Output](doc/visualizing-output.md)** - One-screen visual summary (latency histogram, where the time goes) of any capture - **[Analyzing Radostrace Logs](doc/analyze-radostrace.md)** - Extract insights from client traces - **[Analyzing Osdtrace Logs](doc/analyze-osdtrace.md)** - Deep-dive into OSD performance data diff --git a/doc/analyze-osdtrace.md b/doc/analyze-osdtrace.md index edcaa3c..5bf3c65 100644 --- a/doc/analyze-osdtrace.md +++ b/doc/analyze-osdtrace.md @@ -1,5 +1,9 @@ # Osdtrace Analysis Tool +> For a quick visual overview (where latency is spent, distribution shape, +> per-OSD outliers) before diving into the numbers, see +> [Visualizing trace output](visualizing-output.md). + The `analyze_osdtrace_output.py` script provides automated statistical analysis of osdtrace log files with fio-style latency distribution output. ## Overview diff --git a/doc/analyze-radostrace.md b/doc/analyze-radostrace.md index d875c80..fce6f74 100644 --- a/doc/analyze-radostrace.md +++ b/doc/analyze-radostrace.md @@ -1,5 +1,9 @@ # Radostrace Analysis Tool +> For a quick visual overview (latency histogram, read/write split, slow-OSD +> hint) before diving into the numbers, see +> [Visualizing trace output](visualizing-output.md). + The `analyze_radostrace_output.py` script provides automated analysis of radostrace log files to identify problematic OSDs involved in high-latency operations. ## Overview diff --git a/doc/visualizing-output.md b/doc/visualizing-output.md new file mode 100644 index 0000000..eb09872 --- /dev/null +++ b/doc/visualizing-output.md @@ -0,0 +1,104 @@ +# Visualizing trace output + +`radostrace` and `osdtrace` stream one line per IO - thousands of rows in a +typical capture. `tools/cephtrace_report.py` turns a capture into a +**one-screen visual summary** that answers the three questions you usually have +at a glance: *is it slow, what's slow, and where is the time going.* + +It has no flags to remember - point it at a log (or pipe one in) and it +auto-detects which tool produced it. + +```bash +# osdtrace +sudo ./osdtrace --id 0 -t 30 > osd.log +./tools/cephtrace_report.py osd.log + +# radostrace (also reads the CSV that `radostrace -o` writes) +sudo ./radostrace -p -t 30 | ./tools/cephtrace_report.py +``` + +## osdtrace example + +Real output from a ~116k-op capture (a single OSD under a 4 KiB write+read +workload): + +``` +osdtrace summary - 115829 OSD ops + op_r 86926 op_w 28903 + +where the time goes (share of total latency, avg 4.1ms/op) + messenger โ– 0% + queue โ–ˆโ–ˆโ–ˆโ–‹ 9% + osd โ–‰ 2% + bluestore โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Œ 41% + +kv_commit โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Š 37% (rocksdb commit) + +op latency distribution p50 4.0ms p95 9.5ms p99 11.3ms max 30.1ms + <250us โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 11896 + <500us โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Œ 10420 + <1ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Š 15594 + <2ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–‹ 13440 + <4ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Š 6647 + <8ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 39612 + <16ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ– 18164 + <32ms 56 + +slowest ops + 30.1ms op_w osd.0 pg 2.12 4.0K + 28.0ms op_w osd.0 pg 2.e 4.0K + + -> deeper: ./tools/analyze_osdtrace_output.py [-i | -f kv_commit | -o ] +``` + +The **where the time goes** block is the headline: it turns osdtrace's dense +per-stage numbers into an instant "your latency is RocksDB commit, not network +or queueing." Because the stages overlap (`op_lat` is not their sum), the +shares are of total latency and need not add up to 100% - the point is which +stage dominates. + +When a capture spans **more than one OSD**, an extra **per-OSD p95 latency** +block is drawn (highest p95 first), so an outlier OSD stands out immediately - +the single-OSD capture above omits it. + +## radostrace example + +``` +radostrace summary - 353 client ops + reads 152 โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–Ž 43% + writes 201 โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 56% + +latency distribution p50 3.5ms p95 15.6ms p99 19.4ms max 19.7ms + <1ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–‹ 36 + <8ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 92 + <16ms โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–‹ 68 + +OSDs in the acting set of the slowest 17 ops + osd.0 โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 17 + + -> deeper: ./tools/analyze_radostrace_output.py [threshold_us] +``` + +The **OSDs in the acting set of the slowest ops** block is a quick culprit +pointer; for the full iterative ranking mapped to physical hosts, use the +analyzer it points to. + +## Relationship to the analyzers + +This tool is the visual *glance*. It deliberately reuses the existing parsers +(`analyze_osdtrace_output.parse_line` and `analyze_radostrace_output`'s format +detection) rather than maintaining a second one. When you need the numbers, +not the shape, use the detailed analyzers: + +- [Analyzing osdtrace logs](analyze-osdtrace.md) - fio-style percentile tables, + per-stage contribution (`-i`), per-OSD / per-field filtering +- [Analyzing radostrace logs](analyze-radostrace.md) - iterative culprit-OSD + ranking mapped to hosts via `ceph osd tree` + +## Notes + +- Pure Python 3 standard library - no dependencies to install. +- The bars use Unicode block characters; any modern terminal renders them. + Section headings are bold only when writing to a terminal, so piping or + redirecting to a file stays clean. +- Reads either the default space-separated output or the CSV produced by + `radostrace -o`. diff --git a/tests/test_cephtrace_report.py b/tests/test_cephtrace_report.py new file mode 100644 index 0000000..36129c6 --- /dev/null +++ b/tests/test_cephtrace_report.py @@ -0,0 +1,106 @@ +""" +Smoke tests for cephtrace_report.py. + +These run against the small committed sample logs in tools/sample-logs/ so +they are offline and fast. The module is imported directly (tox sets +PYTHONPATH=tools); cephtrace_report has a hyphen-free name precisely so it is +importable here. +""" +import os +import runpy +import sys + +import pytest + +import cephtrace_report # pylint: disable=E0401 + +SAMPLES = os.path.join( + os.path.dirname(os.path.dirname(os.path.abspath(__file__))), + "tools", "sample-logs", +) +OSD_LOG = os.path.join(SAMPLES, "osdtrace_data.log") +RADOS_LOG = os.path.join(SAMPLES, "radostrace_data.log") +RADOS_CSV = os.path.join(SAMPLES, "radostrace_data.csv") + + +def _run(path, capsys): + """Run the tool's main() against a file path and return stdout.""" + argv = sys.argv + sys.argv = ["cephtrace_report.py", path] + try: + rc = cephtrace_report.main() + finally: + sys.argv = argv + return rc, capsys.readouterr().out + + +def test_osdtrace_report(capsys): + """osdtrace sample renders every headline section and the op types.""" + rc, out = _run(OSD_LOG, capsys) + assert rc == 0 + assert "osdtrace summary - 11 OSD ops" in out + for token in ("op_r", "op_w", "subop_w"): + assert token in out + assert "where the time goes" in out + assert "op latency distribution" in out + # bluestore dominates this write-heavy sample; kv_commit must surface + assert "kv_commit" in out + # two distinct OSDs in the sample -> per-OSD section is drawn + assert "per-OSD p95 latency" in out + assert "slowest ops" in out + assert "analyze_osdtrace_output.py" in out + + +def test_radostrace_space_report(capsys): + """Space-formatted radostrace sample renders the client-side sections.""" + rc, out = _run(RADOS_LOG, capsys) + assert rc == 0 + assert "radostrace summary" in out + assert "reads" in out and "writes" in out + assert "latency distribution" in out + assert "slowest ops" in out + assert "analyze_radostrace_output.py" in out + + +def test_radostrace_csv_report(capsys): + """CSV-formatted (radostrace -o) sample is detected and rendered.""" + rc, out = _run(RADOS_CSV, capsys) + assert rc == 0 + assert "radostrace summary" in out + assert "latency distribution" in out + + +def test_no_rows_returns_error(tmp_path, capsys): + """A file with no recognizable trace rows fails cleanly (rc=1).""" + junk = tmp_path / "junk.log" + junk.write_text("hello\nworld\n", encoding="utf-8") + rc, _ = _run(str(junk), capsys) + assert rc == 1 + + +def test_help_runs(capsys): + """--help prints the module docstring and exits 0.""" + argv = sys.argv + sys.argv = ["cephtrace_report.py", "--help"] + try: + rc = cephtrace_report.main() + finally: + sys.argv = argv + assert rc == 0 + assert "one-screen visual summary" in capsys.readouterr().out + + +def test_runs_as_script(): + """The committed osdtrace sample runs end-to-end as a script (no import + side-effects, exit code 0).""" + argv = sys.argv + sys.argv = ["cephtrace_report.py", OSD_LOG] + try: + with pytest.raises(SystemExit) as exc: + runpy.run_path( + os.path.join(SAMPLES, "..", "cephtrace_report.py"), + run_name="__main__", + ) + assert exc.value.code == 0 + finally: + sys.argv = argv diff --git a/tools/cephtrace_report.py b/tools/cephtrace_report.py new file mode 100755 index 0000000..b0038b4 --- /dev/null +++ b/tools/cephtrace_report.py @@ -0,0 +1,310 @@ +#!/usr/bin/env python3 +""" +cephtrace_report - a one-screen visual summary of radostrace / osdtrace output. + +This is the "is it slow, what's slow, where's the time going" glance: it draws +bar charts and a latency histogram so the shape of a capture is understandable +at sight, with no flags to remember. + +It complements - it does not replace - the two detailed analyzers: + * analyze_osdtrace_output.py fio-style percentile tables, per-stage + contribution (-i), per-OSD/field filtering + * analyze_radostrace_output.py iterative culprit-OSD ranking mapped to hosts + +To avoid a second, divergent parser, the osdtrace path reuses +analyze_osdtrace_output.parse_line, and the radostrace path reuses +analyze_radostrace_output.detect_file_format and the same column layout. + +Usage: + radostrace -t 30 > rados.log && ./tools/cephtrace_report.py rados.log + sudo ./osdtrace --id 0 -t 30 | ./tools/cephtrace_report.py +""" +import csv +import os +import re +import shutil +import sys +from collections import Counter, defaultdict + +# Reuse the existing, battle-tested parsers from the sibling analyzers rather +# than maintain a second one that could drift. They live in this directory, +# so put it on the path first; pylint can't follow that runtime edit, hence +# the targeted disables. +sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) +# pylint: disable=wrong-import-position,import-error +from analyze_osdtrace_output import parse_line as parse_osd_line # noqa: E402 +from analyze_radostrace_output import detect_file_format # noqa: E402 +# pylint: enable=wrong-import-position,import-error + +# 1/8th-width cells give smooth, sub-character-resolution bars. +_CELLS = " โ–โ–Žโ–โ–Œโ–‹โ–Šโ–‰โ–ˆ" + + +def _bar(frac, width): + """Render a proportional unicode bar padded to ``width`` columns.""" + frac = 0.0 if frac < 0 else 1.0 if frac > 1 else frac + full = int(frac * width) + rem = int(round((frac * width - full) * 8)) + return ("โ–ˆ" * full + (_CELLS[rem] if rem else "")).ljust(width) + + +def _pct(sorted_vals, percentile): + """Linear-interpolated percentile of an already-sorted list.""" + if not sorted_vals: + return 0 + pos = (len(sorted_vals) - 1) * percentile / 100.0 + low = int(pos) + if low + 1 >= len(sorted_vals): + return sorted_vals[low] + return sorted_vals[low] + (sorted_vals[low + 1] - sorted_vals[low]) * ( + pos - low) + + +def _us(value): + """Human-readable latency from a microsecond value.""" + value = float(value) + if value < 1000: + return f"{value:.0f}us" + if value < 1_000_000: + return f"{value / 1000:.1f}ms" + return f"{value / 1_000_000:.2f}s" + + +def _hsize(value): + """Human-readable byte size.""" + value = float(value) + for unit in ("B", "K", "M", "G"): + if value < 1024: + fmt = "{:.0f}{}" if unit == "B" else "{:.1f}{}" + return fmt.format(value, unit) + value /= 1024 + return f"{value:.1f}T" + + +def _heading(text): + """Print a section heading, bold only when writing to a terminal.""" + print(f"\n\033[1m{text}\033[0m" if sys.stdout.isatty() else f"\n{text}") + + +def _bar_width(term_width): + """Bar column count that leaves room for labels at this terminal width.""" + return max(16, min(40, term_width - 34)) + + +def latency_histogram(lats, bar_w, label): + """Draw a log-scale latency histogram plus a p50/p95/p99/max line. + + The percentile *tables* already live in the analyzers; what they lack is + the distribution *shape*, which is what this renders. + """ + edges = [250, 500, 1000, 2000, 4000, 8000, 16000, 32000, 64000, 1e18] + names = ["<250us", "<500us", "<1ms", "<2ms", "<4ms", "<8ms", "<16ms", + "<32ms", "<64ms", ">=64ms"] + buckets = [0] * len(edges) + for value in lats: + for i, edge in enumerate(edges): + if value < edge: + buckets[i] += 1 + break + ordered = sorted(lats) + _heading(f"{label} distribution " + f"p50 {_us(_pct(ordered, 50))} p95 {_us(_pct(ordered, 95))} " + f"p99 {_us(_pct(ordered, 99))} max {_us(ordered[-1])}") + peak = max(buckets) or 1 + for name, count in zip(names, buckets): + if count: + print(f" {name:>7} {_bar(count / peak, bar_w)} {count}") + + +# --------------------------------------------------------------------------- +# osdtrace +# --------------------------------------------------------------------------- + +def _osd_where_time_goes(rows, bar_w): + """Aggregate share of total latency per stage (stages overlap, so the + shares need not sum to 100% - this shows which stage dominates).""" + total = sum(r["lat"] for r in rows) or 1 + stage_of = { + "messenger": lambda r: (r["throttle_lat"] + r["recv_lat"] + + r["dispatch_lat"]), + "queue": lambda r: r["queue_lat"], + "osd": lambda r: r["osd_lat"], + "bluestore": lambda r: r["bluestore_lat"], + } + _heading("where the time goes (share of total latency, " + f"avg {_us(total / len(rows))}/op)") + for stage, func in stage_of.items(): + frac = sum(func(r) for r in rows) / total + print(f" {stage:>10} {_bar(frac, bar_w)} {100 * frac:.0f}%") + kv_commit = sum(r["bluestore_details"].get("kv_commit", 0) for r in rows) + if kv_commit: + print(f" +kv_commit {_bar(kv_commit / total, bar_w)} " + f"{100 * kv_commit / total:.0f}% (rocksdb commit)") + + +def _osd_per_osd_p95(rows, bar_w): + """Per-OSD p95 latency bars - only meaningful with >1 OSD captured.""" + by_osd = defaultdict(list) + for row in rows: + by_osd[row["osd"]].append(row["lat"]) + if len(by_osd) <= 1: + return + _heading("per-OSD p95 latency") + ranked = sorted(by_osd.items(), + key=lambda kv: -_pct(sorted(kv[1]), 95))[:8] + top = _pct(sorted(ranked[0][1]), 95) or 1 + for osd, lats in ranked: + p95 = _pct(sorted(lats), 95) + print(f" osd.{osd:<5} {_bar(p95 / top, bar_w)} " + f"{_us(p95)} ({len(lats)} ops)") + + +def report_osdtrace(rows, width): + """Print the full osdtrace visual summary.""" + bar_w = _bar_width(width) + bytype = Counter(r["op"] for r in rows) + _heading(f"osdtrace summary - {len(rows)} OSD ops") + print(" " + " ".join(f"{t} {c}" for t, c in bytype.most_common())) + + _osd_where_time_goes(rows, bar_w) + latency_histogram([r["lat"] for r in rows], bar_w, "op latency") + _osd_per_osd_p95(rows, bar_w) + + _heading("slowest ops") + for row in sorted(rows, key=lambda r: -r["lat"])[:5]: + print(f" {_us(row['lat']):>7} {row['op']:<8} osd.{row['osd']} " + f"pg {row['pg']} {_hsize(row['size'])}") + + print("\n -> deeper: ./tools/analyze_osdtrace_output.py " + "[-i | -f kv_commit | -o ]") + + +# --------------------------------------------------------------------------- +# radostrace +# --------------------------------------------------------------------------- + +# Column layout shared with analyze_radostrace_output.py (space and CSV): +# 0 pid 1 client 2 tid 3 pool 4 pg 5 acting 6 WR 7 size 8 latency +# 9.. object + [ops] +def _parse_radostrace(lines, is_csv): + """Parse radostrace rows into dicts: wr, size, lat, pool, obj, acting.""" + rows = [] + if is_csv: + for row in csv.reader(lines): + if len(row) < 10 or not row[8].isdigit(): + continue + obj = row[9] + (" " + row[10] if len(row) > 10 and row[10] else "") + rows.append({"wr": row[6], "size": int(row[7]), + "lat": int(row[8]), "pool": row[3], "obj": obj, + "acting": [int(x) for x in re.findall(r"\d+", + row[5])]}) + return rows + for line in lines: + if not line.strip() or line.startswith(("Execution", "Version")): + continue + parts = line.split() + if len(parts) < 10 or not parts[8].isdigit(): + continue + rows.append({"wr": parts[6], "size": int(parts[7]), + "lat": int(parts[8]), "pool": parts[3], + "obj": " ".join(parts[9:]), + "acting": [int(x) for x in re.findall(r"\d+", parts[5])]}) + return rows + + +def _rados_pools(rows, bar_w): + """Ops-per-pool bars - only shown for multi-pool captures (RGW etc.).""" + pools = Counter(r["pool"] for r in rows) + if len(pools) <= 1: + return + _heading("ops per pool") + pmax = pools.most_common(1)[0][1] + for pool, count in pools.most_common(6): + print(f" pool {pool:<5} {_bar(count / pmax, bar_w)} {count}") + + +def _rados_slow_osds(rows, bar_w): + """OSDs in the acting set of the slowest ops - a quick culprit pointer + (the radostrace analyzer does the full iterative, host-mapped ranking).""" + cut = max(1, len(rows) // 20) + slow = sorted(rows, key=lambda r: -r["lat"])[:cut] + culprit = Counter(osd for r in slow for osd in r["acting"]) + if not culprit: + return + _heading(f"OSDs in the acting set of the slowest {cut} ops") + cmax = culprit.most_common(1)[0][1] + for osd, count in culprit.most_common(6): + print(f" osd.{osd:<5} {_bar(count / cmax, bar_w)} {count}") + + +def report_radostrace(rows, width): + """Print the full radostrace visual summary.""" + bar_w = _bar_width(width) + total = len(rows) + reads = sum(1 for r in rows if r["wr"] == "R") + writes = sum(1 for r in rows if r["wr"] == "W") + _heading(f"radostrace summary - {total} client ops") + rmax = max(reads, writes) or 1 + print(f" reads {reads:>6} {_bar(reads / rmax, bar_w)} " + f"{100 * reads // total}%") + print(f" writes {writes:>6} {_bar(writes / rmax, bar_w)} " + f"{100 * writes // total}%") + + latency_histogram([r["lat"] for r in rows], bar_w, "latency") + _rados_pools(rows, bar_w) + _rados_slow_osds(rows, bar_w) + + _heading("slowest ops") + for row in sorted(rows, key=lambda r: -r["lat"])[:5]: + print(f" {_us(row['lat']):>7} {row['wr']} " + f"{row['obj'][:max(20, width - 18)]}") + + print("\n -> deeper: ./tools/analyze_radostrace_output.py " + " [threshold_us]") + + +# --------------------------------------------------------------------------- + +def _is_osdtrace(lines): + """True when the input looks like osdtrace ('osd pg ...') output.""" + return any(re.match(r"osd\s+\d+\s+pg\s", ln) for ln in lines) + + +def _dispatch(lines, is_csv, width): + """Route buffered lines to the matching report; return exit code.""" + if _is_osdtrace(lines): + rows = [d[0] for d in (parse_osd_line(ln) for ln in lines) if d] + if not rows: + print("no osdtrace rows recognized", file=sys.stderr) + return 1 + report_osdtrace(rows, width) + else: + rows = _parse_radostrace(lines, is_csv) + if not rows: + print("no radostrace rows recognized", file=sys.stderr) + return 1 + report_radostrace(rows, width) + return 0 + + +def main(): + """Read a file arg or stdin, detect the tool, print a report.""" + if "-h" in sys.argv or "--help" in sys.argv: + print(__doc__) + return 0 + args = [a for a in sys.argv[1:] if not a.startswith("-")] + width = shutil.get_terminal_size((90, 24)).columns + + if args: + path = args[0] + with open(path, encoding="utf-8", errors="replace") as handle: + lines = handle.readlines() + is_csv = detect_file_format(path) == "csv" + else: + lines = sys.stdin.readlines() + is_csv = any("," in ln for ln in lines[:5] if ln.strip()) + return _dispatch(lines, is_csv, width) + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tools/sample-logs/osdtrace_data.log b/tools/sample-logs/osdtrace_data.log new file mode 100644 index 0000000..9bec46c --- /dev/null +++ b/tools/sample-logs/osdtrace_data.log @@ -0,0 +1,11 @@ +osd 0 pg 2.1d op_r size 24 client 14164 tid 1 throttle_lat 2 recv_lat 12 dispatch_lat 20 queue_lat 84 osd_lat 69 bluestore_lat 33 op_lat 220 +osd 0 pg 2.d op_r size 4096 client 14164 tid 2 throttle_lat 2 recv_lat 13 dispatch_lat 19 queue_lat 105 osd_lat 133 bluestore_lat 510 op_lat 782 +osd 0 pg 2.a op_r size 4096 client 14164 tid 5 throttle_lat 0 recv_lat 4 dispatch_lat 4 queue_lat 350 osd_lat 87 bluestore_lat 224 op_lat 670 +osd 0 pg 2.1d op_r size 4096 client 14164 tid 3 throttle_lat 1 recv_lat 52 dispatch_lat 6 queue_lat 118 osd_lat 87 bluestore_lat 826 op_lat 1091 +osd 0 pg 2.d op_w size 4096 client 14160 tid 1 throttle_lat 2 recv_lat 14 dispatch_lat 39 queue_lat 176 osd_lat 304 peers [(-1, 0), (-1, 0)] bluestore_lat 4363 (prepare 132 aio_wait 0 (aio_size 0) seq_wait 13 kv_commit 4217) op_lat 4975 +osd 0 pg 2.1d op_w size 4096 client 14160 tid 2 throttle_lat 0 recv_lat 21 dispatch_lat 6 queue_lat 141 osd_lat 230 peers [(-1, 0), (-1, 0)] bluestore_lat 4342 (prepare 87 aio_wait 0 (aio_size 0) seq_wait 12 kv_commit 4241) op_lat 4862 +osd 0 pg 2.1b op_w size 4096 client 14160 tid 3 throttle_lat 0 recv_lat 9 dispatch_lat 5 queue_lat 412 osd_lat 121 peers [(-1, 0), (-1, 0)] bluestore_lat 8309 (prepare 79 aio_wait 0 (aio_size 0) seq_wait 14 kv_commit 8214) op_lat 9012 +osd 0 pg 2.a op_w size 4096 client 14160 tid 4 throttle_lat 0 recv_lat 8 dispatch_lat 5 queue_lat 229 osd_lat 126 peers [(-1, 0), (-1, 0)] bluestore_lat 8207 (prepare 76 aio_wait 0 (aio_size 0) seq_wait 14 kv_commit 8116) op_lat 8659 +osd 2 pg 4.a subop_w size 4461 client 34161 tid 14 throttle_lat 2 recv_lat 53 dispatch_lat 38 queue_lat 389 osd_lat 116 bluestore_lat 8041 (prepare 107 aio_wait 0 (aio_size 0) seq_wait 19 kv_commit 7914) subop_lat 8714 +osd 2 pg 4.c subop_w size 4460 client 34161 tid 2 throttle_lat 1 recv_lat 24 dispatch_lat 7 queue_lat 321 osd_lat 106 bluestore_lat 8214 (prepare 253 aio_wait 0 (aio_size 0) seq_wait 21 kv_commit 7940) subop_lat 8822 +osd 2 pg 4.c subop_w size 4461 client 34161 tid 11 throttle_lat 1 recv_lat 15 dispatch_lat 12 queue_lat 79 osd_lat 136 bluestore_lat 7499 (prepare 389 aio_wait 0 (aio_size 0) seq_wait 20 kv_commit 7089) subop_lat 8619 From 15eb713af6b01456956f24e897c7a3a8e205d93e Mon Sep 17 00:00:00 2001 From: Dongdong Tao Date: Sat, 13 Jun 2026 17:05:14 +0900 Subject: [PATCH 2/2] tools: add --html, a self-contained interactive report Adds a browser-viewable companion to the terminal summary. cephtrace_report.py --html writes a single self-contained .html (no CDN, no server, works offline - attach it to a ticket) with everything the terminal view shows plus: - latency over the capture (p95 per slice, arrival order) - reveals workloads that degrade or run in bursts, which a terminal can't show - hover tooltips on every bar; sortable per-OSD/per-pool and slowest-ops tables; click a group row to filter the histogram to that OSD or pool The HTML/CSS/JS lives in a separate report_template.html (so the Python stays flake8/pylint clean - long JS lines in an embedded string would trip E501). Python precomputes compact aggregates (overall + per-group histograms, per-slice percentiles, group stats, top-50 slowest) and inlines them as JSON; all rendering is vanilla JS over that payload, so a ~116k-op capture is a ~20KB file. The osdtrace path reuses parse_line and the radostrace path the shared column layout - still no second parser. '<' in the payload is escaped so an object name can't break out of the " that could ride in on an object + # name by escaping "<". + blob = json.dumps(payload, separators=(",", ":")).replace("<", "\\u003c") + with open(out_path, "w", encoding="utf-8") as handle: + handle.write(html.replace("__CEPHTRACE_DATA__", blob)) def main(): - """Read a file arg or stdin, detect the tool, print a report.""" + """Read a file arg or stdin, detect the tool, print or write a report.""" if "-h" in sys.argv or "--help" in sys.argv: print(__doc__) return 0 - args = [a for a in sys.argv[1:] if not a.startswith("-")] - width = shutil.get_terminal_size((90, 24)).columns - - if args: - path = args[0] + argv = sys.argv[1:] + html_out = None + if "--html" in argv: + idx = argv.index("--html") + if idx + 1 >= len(argv): + print("--html requires an output filename", file=sys.stderr) + return 2 + html_out = argv[idx + 1] + del argv[idx:idx + 2] + positional = [a for a in argv if not a.startswith("-")] + + if positional: + path = positional[0] with open(path, encoding="utf-8", errors="replace") as handle: lines = handle.readlines() is_csv = detect_file_format(path) == "csv" else: lines = sys.stdin.readlines() is_csv = any("," in ln for ln in lines[:5] if ln.strip()) - return _dispatch(lines, is_csv, width) + + kind, rows = _load(lines, is_csv) + if not rows: + print(f"no {kind} rows recognized", file=sys.stderr) + return 1 + + if html_out: + _write_html(_build_payload(kind, rows), html_out) + print(f"wrote {html_out} ({kind}, {len(rows)} ops)") + return 0 + + width = shutil.get_terminal_size((90, 24)).columns + if kind == "osdtrace": + report_osdtrace(rows, width) + else: + report_radostrace(rows, width) + return 0 if __name__ == "__main__": diff --git a/tools/report_template.html b/tools/report_template.html new file mode 100644 index 0000000..c32bbbc --- /dev/null +++ b/tools/report_template.html @@ -0,0 +1,364 @@ + + + + + + +cephtrace report + + + +
+

cephtrace report

+

+ +
+ + + +
+

Latency distribution

+ +
+
+
+ +
+

Latency over the capture + - p95 per slice, in arrival order →

+
+
+ +
+

+

+
+
+ +
+

Slowest ops

+
+
+ + +
+ +
+ + + +