feat: minimal observability, zero-dependency metrics + periodic slog snapshots (Direction C) #136

Merged
NeverENG merged 2 commits into main from feat/observability-metrics on Jun 14, 2026

NeverENG (Owner) commented Jun 14, 2026

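Background (data-collector pain point #4: flying blind in the field)

BanDB previously had no metrics at all: how many frames were dropped, whether memory had hit its ceiling, whether writes were stalled by backpressure, none of it was visible. This PR fits a dashboard onto this headless edge "server".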

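Choice of exposure method

Periodic slog snapshots (rather than expvar/Prometheus/Grafana): the edge device is headless, so during a mission nobody can curl a port or open Grafana, but the logs can always be read. Zero dependencies and zero new ports, in line with the "single-binary edge" positioning. Instrumentation is decoupled from exposure, so a future HTTP/Prometheus exporter would not require changing the instrumentation.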

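Changes (two atomic commits)

1. pkg/metrics core: zero dependencies

  • Atomic counters (three classes of dropped frames / reads and writes / write errors / backpressure stalls) + gauge callbacks (MemTable unflushed bytes / budget)
  • Take() captures a snapshot; StartLogger(ctx, interval) logs it periodically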
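
As a rough illustration of the counter-plus-gauge-callback design described above, here is a minimal sketch. It is not the code from pkg/metrics: the counter subset, the Snapshot field names, and the SetMemTableGauges signature are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Counters are bumped in place on the hot path. The names follow the
// review walkthrough; the exact set in pkg/metrics may differ.
var (
	Writes             atomic.Int64
	WriteErrors        atomic.Int64
	BackpressureStalls atomic.Int64
)

// inflightFn stores a func() int64 that reports unflushed MemTable bytes;
// budgetBytes holds the fixed budget. Both are assumptions for this sketch.
var (
	inflightFn  atomic.Value
	budgetBytes atomic.Int64
)

// SetMemTableGauges registers the live gauge callback and the budget.
// The signature is hypothetical.
func SetMemTableGauges(inflight func() int64, budget int64) {
	inflightFn.Store(inflight)
	budgetBytes.Store(budget)
}

// Snapshot is a point-in-time copy of every metric.
type Snapshot struct {
	Writes, WriteErrors, BackpressureStalls    int64
	MemTableInflightBytes, MemTableBudgetBytes int64
}

// Take loads each counter atomically and evaluates the gauge callback,
// so the gauge value is read at snapshot time rather than cached.
func Take() Snapshot {
	s := Snapshot{
		Writes:              Writes.Load(),
		WriteErrors:         WriteErrors.Load(),
		BackpressureStalls:  BackpressureStalls.Load(),
		MemTableBudgetBytes: budgetBytes.Load(),
	}
	if fn, ok := inflightFn.Load().(func() int64); ok && fn != nil {
		s.MemTableInflightBytes = fn()
	}
	return s
}

func main() {
	var unflushed atomic.Int64 // stands in for the MemTable's own accounting
	SetMemTableGauges(unflushed.Load, 16<<20)
	Writes.Add(2)
	unflushed.Store(4096)
	fmt.Printf("%+v\n", Take())
}
```

Because the gauge is a callback, the owner of the value (here the stand-in for the MemTable) keeps its own accounting and the metrics package never needs to be told about each change.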

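2. Instrumentation + wiring

  • ingesthook: the three classes of dropped frames are counted separately
  • router: PUT/GET/DEL counts + write-error count
  • memtable: stall count on the backpressure slow path + registration of the unflushed-bytes gauge
  • Both Server mains: log one snapshot line every 10s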
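
To make the per-class drop counting concrete, the sketch below bumps exactly one counter on each drop path. The frame type, the filter type, the maxPayload limit, and the drop conditions are all hypothetical; only the three counter names come from the review walkthrough.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

var (
	FramesDroppedMalformed    atomic.Int64
	FramesDroppedOversized    atomic.Int64
	FramesDroppedNonMonotonic atomic.Int64
)

// maxPayload is a hypothetical size limit chosen for the demo.
const maxPayload = 8

// frame is a hypothetical stand-in for an ingested record.
type frame struct {
	TS      int64
	Payload []byte
}

// filter remembers the last accepted timestamp.
type filter struct{ lastTS int64 }

// accept reports whether fr passes, bumping exactly one counter per drop.
func (f *filter) accept(fr frame) bool {
	switch {
	case len(fr.Payload) == 0: // treated as malformed in this sketch
		FramesDroppedMalformed.Add(1)
	case len(fr.Payload) > maxPayload:
		FramesDroppedOversized.Add(1)
	case fr.TS <= f.lastTS: // timestamp did not advance
		FramesDroppedNonMonotonic.Add(1)
	default:
		f.lastTS = fr.TS
		return true
	}
	return false
}

func main() {
	f := &filter{}
	frames := []frame{
		{TS: 1, Payload: []byte("ok")},
		{TS: 2, Payload: nil},
		{TS: 3, Payload: []byte("far too long")},
		{TS: 1, Payload: []byte("late")},
	}
	for _, fr := range frames {
		f.accept(fr)
	}
	// prints: 1 1 1
	fmt.Println(FramesDroppedMalformed.Load(),
		FramesDroppedOversized.Load(),
		FramesDroppedNonMonotonic.Load())
}
```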

Result

Visible simply by tailing the log:

level=INFO msg=metrics dropped_malformed=12 dropped_non_monotonic=3 backpressure_stalls=47 memtable_inflight_bytes=8597504 memtable_budget_bytes=16777216 writes=120493 write_errors=0 ...
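
A line in that key=value shape can be produced with the standard library's slog text handler. The sketch below is an assumption about how such a line might be emitted, not the PR's LogSnapshot: the Snapshot fields and the logSnapshot and render helpers are hypothetical, and the timestamp is stripped only to keep the output reproducible.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// Snapshot carries a hypothetical subset of the logged fields.
type Snapshot struct {
	DroppedMalformed, BackpressureStalls, Writes, WriteErrors int64
}

// dropTime removes the top-level timestamp attribute.
func dropTime(groups []string, a slog.Attr) slog.Attr {
	if len(groups) == 0 && a.Key == slog.TimeKey {
		return slog.Attr{}
	}
	return a
}

// logSnapshot writes one "metrics" record with one attribute per field.
func logSnapshot(l *slog.Logger, s Snapshot) {
	l.Info("metrics",
		"dropped_malformed", s.DroppedMalformed,
		"backpressure_stalls", s.BackpressureStalls,
		"writes", s.Writes,
		"write_errors", s.WriteErrors,
	)
}

// render formats a snapshot through a text handler into a string.
func render(s Snapshot) string {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{ReplaceAttr: dropTime})
	logSnapshot(slog.New(h), s)
	return buf.String()
}

func main() {
	// prints: level=INFO msg=metrics dropped_malformed=12 backpressure_stalls=47 writes=120493 write_errors=0
	fmt.Print(render(Snapshot{DroppedMalformed: 12, BackpressureStalls: 47, Writes: 120493}))
}
```

Taking the logger as a parameter keeps the formatting testable against a buffer; a production variant would more likely use the process-wide default logger.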

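Testing

go build ./... passes; the pkg/metrics unit tests cover counter increments and live reads through the gauge callback; the existing service/ingesthook/storage tests show no regressions. The design is atomic-only with no lock contention (the race detector was not run because this machine has no gcc).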

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added built-in observability metrics to track key system events, including read/write/delete operations, dropped frames, errors, and backpressure stalls.
    • Metrics are automatically logged at configurable intervals for operational visibility.
  • Tests

    • Added unit tests for metrics functionality.
  • Chores

    • Integrated metrics initialization into server startup.

NeverENG and others added 2 commits June 14, 2026 21:01
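Add pkg/metrics: instrumentation is decoupled from exposure. Counters (dropped frames / reads and writes / backpressure) are accumulated in place, gauges (MemTable unflushed bytes / budget) are read live through a callback, and StartLogger periodically prints a one-line snapshot via the default slog logger, so a headless edge device can be observed by tailing its log, with no port or Prometheus needed. An HTTP exporter can be added later without changing the instrumentation.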

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
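- ingesthook: count the three classes of dropped frames separately (malformed / oversized / non-monotonic)
- router: PUT/GET/DEL read/write counts and write-error count
- memtable: stall count on the backpressure slow path, plus registration of the unflushed-bytes gauge
- Server (both mains): print one metrics snapshot line every 10s, observable by tailing the log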

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented Jun 14, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f91c78a5-66e5-42c4-813c-23acd56fa535

📥 Commits

Reviewing files that changed from the base of the PR and between e2ef3e3 and 7bd3480.

📒 Files selected for processing (7)
  • Server/server.go
  • Server/server_pprof.go
  • pkg/metrics/metrics.go
  • pkg/metrics/metrics_test.go
  • service/ingesthook/filter.go
  • service/router.go
  • storage/zstorage/memtable.go

📝 Walkthrough


A new pkg/metrics package is introduced with exported atomic.Int64 counters for frame drops, KV operations, write errors, and backpressure stalls, plus a MemTable inflight/budget gauge, a Snapshot type, Take(), LogSnapshot(), and StartLogger(). These counters are then wired into the ingest filter, router, and memtable, and both server entry points start the periodic logger at a 10-second interval.

Changes

In-process metrics observability

| Layer / File(s) | Summary |
| --- | --- |
| metrics package: counters, gauges, Snapshot, and periodic logger (pkg/metrics/metrics.go, pkg/metrics/metrics_test.go) | Introduces the metrics package with eight atomic.Int64 counters, an atomic.Value-backed MemTable inflight callback + budget gauge, a Snapshot struct, and Take(), LogSnapshot(), SetMemTableGauges(), and StartLogger(). Tests validate counter delta behavior and gauge callback registration/update. |
| Counter wiring in ingest filter, router, and memtable (service/ingesthook/filter.go, service/router.go, storage/zstorage/memtable.go) | Increments FramesDroppedMalformed, FramesDroppedOversized, and FramesDroppedNonMonotonic on each drop path in the ingest filter; increments Writes, Reads, Deletes, and WriteErrors in router handlers; registers MemTable gauges via SetMemTableGauges in NewMemTable() and increments BackpressureStalls in acquireCredit. |
| Periodic logger startup in server entry points (Server/server.go, Server/server_pprof.go) | Both main() functions call metrics.StartLogger(context.Background(), 10*time.Second) to start the background periodic logging goroutine. |
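
The periodic-logger pattern summarized above can be sketched as a ticker loop that stops when its context is cancelled. The startTicker and demoTicks helpers are hypothetical: the PR's StartLogger takes only (ctx, interval), and the emit callback is added here so the loop can be exercised without a real logger.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// startTicker runs emit once per interval on its own goroutine until ctx
// is done. Hypothetical helper, not the PR's StartLogger.
func startTicker(ctx context.Context, interval time.Duration, emit func()) {
	go func() {
		t := time.NewTicker(interval)
		defer t.Stop() // release the ticker when the loop exits
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				emit()
			}
		}
	}()
}

// demoTicks runs the ticker for totalMs milliseconds and returns how many
// times emit fired before cancellation.
func demoTicks(totalMs, intervalMs int) int64 {
	var n atomic.Int64
	ctx, cancel := context.WithCancel(context.Background())
	startTicker(ctx, time.Duration(intervalMs)*time.Millisecond, func() { n.Add(1) })
	time.Sleep(time.Duration(totalMs) * time.Millisecond)
	cancel()
	return n.Load()
}

func main() {
	fmt.Println("ticks:", demoTicks(100, 10))
}
```

With context.Background(), as in the call quoted in the table, the context is never cancelled, so the goroutine would simply run for the lifetime of the process; passing a cancellable context is what allows a clean stop.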

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • NeverENG/BanDB#125: Introduced the MemTable byte-budget backpressure logic (TryAcquire, MemTableMaxInflightBytes) that this PR directly instruments with BackpressureStalls and the MemTable gauge registration.
  • NeverENG/BanDB#130: Added the PreHandle hook and the malformed/oversized/non-monotonic drop paths in service/ingesthook/filter.go that this PR now instruments with metrics counters.

Poem

🐇 Hop hop, a new package appears,
Counting every write, read, and drop with care,
Atomic integers tick away the years,
While slog prints metrics into the air.
The MemTable whispers its inflight bytes,
And StartLogger watches through the night!


@NeverENG NeverENG merged commit bb79547 into main Jun 14, 2026
3 of 4 checks passed
@github-actions

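🐯 BanGD database kernel review

Overall risk: 🟢 Low

Change summary: this PR introduces a zero-dependency in-process observability layer (pkg/metrics) into BanDB, consisting of:

  • A set of global atomic.Int64 counters (three classes of dropped frames, reads and writes, write errors, backpressure stalls)
  • A set of gauge callbacks (MemTable unflushed bytes / budget), with the callback function pointer stored in an atomic.Value
  • Take() snapshots + LogSnapshot() periodic slog output, emitting one metrics line every 10s from a dedicated goroutine built on go func + time.Ticker

The instrumentation is spread across three places: ingesthook/filter.go (three classes of dropped frames), service/router.go (read/write/delete/write-error counts), and storage/zstorage/memtable.go (backpressure stall count + gauge callback registration). Each of the two Server main entry points starts StartLogger once.

This is a purely "observational" change: it alters no business logic and touches neither the storage format nor the WAL. The overall design is sound: instrumentation is decoupled from exposure, and the underlying layer uses sync/atomic to avoid lock contention.

This review does not block the merge; architecture-level suggestions are tracked as Issues, and ordinary findings are listed inline below.

Architecture issues (2 total)


Tokens consumed by this review: 63504 total (input 61516, output 1988, cache hits 0, cache writes 0) | Dimensions [concurrency, memory, lock, storage, performance] | Additional surrounding files read [pkg/credit/credit.go] | Adversarial re-check at 3 votes per finding, 0 suspected false positives filtered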
