feat: comprehensive wandb logging (main-outcomes, panel routing, diag norms) #15
Open
RyanKim17920 wants to merge 2 commits into
… diag norms) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
Thanks for the PR! I appreciate you trying to clean up our logging/plotting to make it more informative. I have several concerns, though. You made a lot of changes, and I'm not sure all of them are good/intended:
Keep the new main-outcomes/ panel; fix the diagnostic path and stop renaming stable namespaces per review:
- Keep train/* and val/* in their stable namespaces (cross-run history stays intact). Additionally mirror the headline losses into a clean main-train-results/ glance panel (dino/ibot/kde/total/grad_norm/lr + val_*), as an additive copy, not a rename. Bookkeeping stays under train/*. (point 6)
- log_main_outcomes now emits each probe result exactly once via a last-logged train_step guard instead of re-logging the latest at the same W&B step on every train log step. (point 3)
- Drop the broad try/except in log_main_outcomes to fail loudly, matching collect_probe_results. (point 5)
- main-outcomes-datasets/<dataset>_score keeps the _score suffix and, like collect_probe_results, strips the probe_ prefix so the keys match the existing probe/<dataset>_score names. (point 4)
- Diagnostic forward now honors activation checkpointing exactly like the normal path. (point 1)
- Diagnostics never run on the FLOP-measurement step, so the reused per-step FLOP estimate excludes diagnostic-only work. (point 2)
- log_diagnostics_every is now a documented config key (configs/main.yaml, configs/smoke.yaml) read via direct indexing, no .get(...,0) fallback. (point 7)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
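For readers following point 3, here is a minimal sketch of what a "last-logged train_step guard" could look like. It is not the code from this PR: the class name ProbeOutcomeLogger, the maybe_log method, and the injected log_fn callback are all hypothetical, chosen only to show the idea of forwarding each probe result once and skipping repeats.

```python
from typing import Callable, Dict, Optional


class ProbeOutcomeLogger:
    """Hypothetical helper: forwards each probe result to the logger at most once."""

    def __init__(self, log_fn: Callable[[Dict[str, float], int], None]) -> None:
        self._log_fn = log_fn
        # train_step of the probe result that was forwarded most recently
        self._last_logged_train_step: Optional[int] = None

    def maybe_log(self, probe_train_step: int, scores: Dict[str, float], wandb_step: int) -> bool:
        """Log scores unless this probe result (by train_step) was already logged."""
        if probe_train_step == self._last_logged_train_step:
            return False  # same probe result as last time: skip the re-log
        payload = {f"main-outcomes/{name}": value for name, value in scores.items()}
        self._log_fn(payload, wandb_step)
        self._last_logged_train_step = probe_train_step
        return True


# Stand-in for a real logging call so the sketch runs without W&B.
logged = []
outcome_logger = ProbeOutcomeLogger(lambda payload, step: logged.append((step, payload)))

outcome_logger.maybe_log(100, {"mean": 0.5}, wandb_step=100)  # new probe result: logged
outcome_logger.maybe_log(100, {"mean": 0.5}, wandb_step=110)  # same result again: skipped
outcome_logger.maybe_log(200, {"mean": 0.6}, wandb_step=200)  # newer probe result: logged
```

In a real training loop, log_fn would presumably wrap something like wandb.log(payload, step=step); the callback is injected here only so the guard can be exercised in isolation.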
Contributor
Author
Thanks for the feedback! I've updated the code accordingly and addressed all your major points:
Summary
- … (dino, ibot, kde, total, grad_norm, lr, and their val_* counterparts) land in main-train-results/; bookkeeping metrics (throughput, memory, ETA, etc.) land in train-tracking/; diagnostic keys keep their diag/ prefix; probe keys keep their probe/ prefix.
- main-outcomes/ panel: after each probe run, headline downstream scores (linear, kNN, 16-shot, segmentation, progression, mutation, survival, robustness, mean) are logged to a clean main-outcomes/ panel and per-dataset breakdowns to main-outcomes-datasets/. This surfaces the numbers that actually matter at a glance rather than hunting through flat probe/* keys.
- … (diag/): when train.log_diagnostics_every is set in the config (default 0 = off), the student backbone runs a diagnostic forward pass that captures per-layer q/k/v norms, qk similarity, and cls/reg/patch token norms every N log steps. Four evenly-spaced layers are sampled plus a per-metric mean across all layers.
What was NOT changed
No architecture, dataset, batch size, probe logic, or config defaults were modified. The diagnostic forward path is opt-in and unreachable unless log_diagnostics_every > 0. The normal (non-diagnostic) forward path in both model.py and train.py is identical to before this PR.
Test plan
- … (WANDB_MODE=disabled python train.py ...): existing training loop unchanged for default config
- … log_diagnostics_every: 10 in config, confirm diag/ keys appear in wandb at the correct steps
- … main-outcomes/mean and main-outcomes-datasets/ keys appear in wandb
- … compute_losses, discards diag)
🤖 Generated with Claude Code
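The Summary says four evenly-spaced layers are sampled plus a per-metric mean across all layers, but it does not say how the indices are chosen. The sketch below shows one plausible reading, assuming the first and last layers are always included and the rest are spread by integer division. The function names and the diag/<metric>/layer_<i> key layout are assumptions made for illustration, not the names used in this PR.

```python
from typing import Dict, List


def pick_evenly_spaced_layers(num_layers: int, num_samples: int = 4) -> List[int]:
    """Return up to num_samples layer indices spread from the first to the last layer."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    if num_layers <= num_samples:
        # Fewer layers than requested samples: report every layer.
        return list(range(num_layers))
    # Integer positions from 0 to num_layers - 1, endpoints included.
    return [i * (num_layers - 1) // (num_samples - 1) for i in range(num_samples)]


def summarize_layer_metric(name: str, per_layer: List[float], num_samples: int = 4) -> Dict[str, float]:
    """Build diag/ keys (hypothetical layout): sampled layers plus the mean over all layers."""
    summary = {
        f"diag/{name}/layer_{i}": per_layer[i]
        for i in pick_evenly_spaced_layers(len(per_layer), num_samples)
    }
    if per_layer:
        summary[f"diag/{name}/mean"] = sum(per_layer) / len(per_layer)
    return summary


# Toy per-layer values for a 12-layer backbone (made-up numbers, not measurements).
q_norms = [float(v) for v in range(1, 13)]
summary = summarize_layer_metric("q_norm", q_norms)
```

With 12 layers this picks indices 0, 3, 7, and 11, so five keys per metric are emitted instead of thirteen, which keeps the diag/ panel readable while the mean still reflects every layer.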