feat: comprehensive wandb logging (main-outcomes, panel routing, diag norms) by RyanKim17920 · Pull Request #15 · MedARC-AI/nanopath

RyanKim17920 · 2026-06-18T00:15:58Z

Summary

Panel routing: all wandb keys are now routed into logical panels instead of a single flat namespace. Headline losses (dino, ibot, kde, total, grad_norm, lr, and their val_* counterparts) land in main-train-results/; bookkeeping metrics (throughput, memory, ETA, etc.) land in train-tracking/; diagnostic keys keep their diag/ prefix; probe keys keep their probe/ prefix.
main-outcomes/ panel: after each probe run, headline downstream scores (linear, kNN, 16-shot, segmentation, progression, mutation, survival, robustness, mean) are logged to a clean main-outcomes/ panel and per-dataset breakdowns to main-outcomes-datasets/. This surfaces the numbers that actually matter at a glance rather than hunting through flat probe/* keys.
Diagnostic token/attention norms (diag/): when train.log_diagnostics_every is set in the config (default 0 = off), the student backbone runs a diagnostic forward pass that captures per-layer q/k/v norms, qk similarity, and cls/reg/patch token norms every N log steps. Four evenly-spaced layers are sampled plus a per-metric mean across all layers.

What was NOT changed

No architecture, dataset, batch size, probe logic, or config defaults were modified. The diagnostic forward path is opt-in and unreachable unless log_diagnostics_every > 0. The normal (non-diagnostic) forward path in both model.py and train.py is identical to before this PR.

Test plan

Smoke-test passes (WANDB_MODE=disabled python train.py ...) — existing training loop unchanged for default config
With log_diagnostics_every: 10 in config, confirm diag/ keys appear in wandb at the correct steps
After a probe run, confirm main-outcomes/mean and main-outcomes-datasets/ keys appear in wandb
Val loop still works (unpacks 4 values from compute_losses, discards diag)

🤖 Generated with Claude Code

… diag norms) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

PaulScotti · 2026-06-18T18:10:28Z

Thanks for the PR! I appreciate trying to clean up our logging/plotting to be more informative

I have several concerns though, you made lots of changes and I'm not sure all are good/intended:

model.py (line 168): diagnostic mode bypasses activation checkpointing. With checkpoint=True, normal forward called checkpoint 4 times
train.py (line 571): if log_diagnostics_every: 1, diagnostics run on step 1 while FlopCounterMode measures measured_flops_per_step, so the permanent FLOP estimate includes diagnostic-only work
train.py (line 477): log_main_outcomes() runs every time log_probe_results() runs, not only when a new probe result is collected. Testing it out, two calls logged twice at the same W&B step
train.py (line 508): not sure why you need to rename these probe outputs, isnt "probe/bracs_score" fine?
train.py (line 488): usage of broad try/except conflicts with our preferred fail-loudly rule and MedARC coding standards
train.py (line 297): renames existing W&B keys from stable train/* and val/* namespaces into new namespaces? Why do this
train.py (line 536): train.log_diagnostics_every is an undocumented .get(..., 0) fallback, again not desirably coding style

Keep the new main-outcomes/ panel; fix the diagnostic path and stop renaming stable namespaces per review: - Keep train/* and val/* in their stable namespaces (cross-run history stays intact). Additionally mirror the headline losses into a clean main-train-results/ glance panel (dino/ibot/kde/total/grad_norm/lr + val_*), as an additive copy, not a rename. Bookkeeping stays under train/*. (point 6) - log_main_outcomes now emits each probe result exactly once via a last-logged train_step guard instead of re-logging the latest at the same W&B step on every train log step. (point 3) - Drop the broad try/except in log_main_outcomes to fail loudly, matching collect_probe_results. (point 5) - main-outcomes-datasets/<dataset>_score keeps the _score suffix and, like collect_probe_results, strips the probe_ prefix so the keys match the existing probe/<dataset>_score names. (point 4) - Diagnostic forward now honors activation checkpointing exactly like the normal path. (point 1) - Diagnostics never run on the FLOP-measurement step, so the reused per-step FLOP estimate excludes diagnostic-only work. (point 2) - log_diagnostics_every is now a documented config key (configs/main.yaml, configs/smoke.yaml) read via direct indexing, no .get(...,0) fallback. (point 7) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

RyanKim17920 · 2026-06-18T22:30:27Z

Thanks for the feedback! I've updated the code accordingly and addressed all your major points:

The diagnostic forward now runs through the activation-checkpoint path too (matches the normal forward), verified on a GPU smoke test
Diagnostics no longer run on the FLOP-measurement step
Double logging is fixed now, main-outcomes logs on probe results instead of train log steps
We use the same names now (keeping _score for consistency)
Removed the try/excepts to match with code standards
train/* and val/* are unchanged from beforehand, but I mirror the main losses into a main-train-results/ panel for visibility since train/* had lots of diagnostic/throughput metrics.
log_diagnostics_every is defaulted to 0 in the yaml and uses this instead of a get fallback

feat: comprehensive wandb logging (main-outcomes, main-train-results,…

635cf7a

… diag norms) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: comprehensive wandb logging (main-outcomes, panel routing, diag norms)#15

feat: comprehensive wandb logging (main-outcomes, panel routing, diag norms)#15
RyanKim17920 wants to merge 2 commits into
MedARC-AI:mainfrom
RyanKim17920:feat/wandb-logging-only

RyanKim17920 commented Jun 18, 2026 •

edited

Loading

Uh oh!

PaulScotti commented Jun 18, 2026

Uh oh!

RyanKim17920 commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

RyanKim17920 commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What was NOT changed

Test plan

Uh oh!

PaulScotti commented Jun 18, 2026

Uh oh!

RyanKim17920 commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

RyanKim17920 commented Jun 18, 2026 •

edited

Loading