feat: S-Level anti-cheat sandbox with 7-layer defense system #39

Merged: factnn merged 7 commits into main from feat/sandbox-full on Jun 17, 2026

Conversation

@factnn factnn commented Jun 17, 2026

Collaborator

Summary

Implements a complete S-Level anti-cheat sandbox covering both daily verify (automatic) and competition mode (on-demand).

Daily Verify (automatic for both LLM/Agent tracks)

Applied via Verifier._setup_sandbox(), which is called from Verifier.__init__:

| Layer | Mechanism |
|---|---|
| Env vars | TRITON_DISABLE_AUTOTUNE=1, CUDA_CACHE_DISABLE=1, cache dirs |
| CUDA protection | Disable CUDA Graph, TF32, reset CUDA state |
| AST whitelist scan | Torch API whitelist + alias tracking + getattr detection + print/data_ptr/global-mutable-state blocking |
| Dual-execution check | Reuses accuracy test output (50% faster), anti_hack=True by default |
| GPU profiling | Detects missing triton kernel launches |

Competition Mode (on-demand)

New modules at src/sandbox/:

| File | Section | Classes |
|---|---|---|
| cache_isolator.py | Home/tmpfs isolation | CacheIsolator |
| import_hook.py | Runtime import hook + print block | ForbiddenModuleLoader, RuntimeSandbox, SecureBuiltins, enable_competition_sandbox() |
| cuda_protector.py | CUDA Graph/TF32/state | CUDALayerProtector, DisabledCUDAGraphContext |
| shape_generator.py | Bucketed random shapes | ShapeBucket, BucketedShapeGenerator, TensorLayoutRandomizer |
| process_isolator.py | Per-test subprocess | TestConfig, ProcessIsolatedEvaluator |
| timing_validator.py | CV/IQR statistical checks | TimingAnomalyType, StatisticalTimingValidator, AdvancedTimingValidator |
| competition_evaluator.py | Full orchestrator | CompetitionConfig, TestCaseGenerator, CompetitionEvaluator |

Key defenses against real-world attacks

  • print() → AST blocked + runtime no-op
  • data_ptr() / storage() → AST blocked
  • Module-level _cache = {} → AST blocked
  • Inter-iteration caching → per-iteration random seeds + clone
  • Hardcoded lookup tables → bucketed random shapes + seed-per-iteration
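
The AST-level blocks in the list above could look roughly like the following. This is a minimal sketch, not the PR's HackDetector: the class name `ForbiddenCallScanner` and the helper `scan_source` are hypothetical, and the sketch leaves out the alias tracking and getattr detection that the summary also mentions.

```python
import ast

# Attribute names treated as forbidden memory access (taken from the PR text).
FORBIDDEN_ATTRS = {"data_ptr", "untyped_storage", "storage", "storage_offset"}


class ForbiddenCallScanner(ast.NodeVisitor):
    """Hypothetical scanner: collects (line, reason) pairs for flagged nodes."""

    def __init__(self):
        self.violations = []

    def visit_Call(self, node):
        # Flag direct calls to the built-in name `print`.
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            self.violations.append((node.lineno, "print() call"))
        self.generic_visit(node)

    def visit_Attribute(self, node):
        # Flag any access such as `x.data_ptr`, whether or not it is called.
        if node.attr in FORBIDDEN_ATTRS:
            self.violations.append((node.lineno, f"forbidden attribute .{node.attr}"))
        self.generic_visit(node)


def scan_source(source):
    """Parse the source text and return every violation found."""
    scanner = ForbiddenCallScanner()
    scanner.visit(ast.parse(source))
    return scanner.violations
```

A name-based check like this is easy to evade with `p = print` or `getattr(x, "data_ptr")`, which is presumably why the PR pairs it with alias tracking and a runtime no-op.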

automerge-bot and others added 7 commits June 17, 2026 11:25
Implements full competition-grade anti-cheat sandbox from anti-cheat.md:

1. CacheIsolator (cache_isolator.py) — File system isolation:
   - Isolated HOME directory per test
   - Disabled triton/torch/cuda caches
   - Auto cleanup via context manager

2. ImportHookSandbox (import_hook.py) — Runtime import enforcement:
   - sys.meta_path hook for live import interception
   - Auto-patches triton.autotune/heuristics/Config at import time
   - Auto-patches torch.compile/CUDA Graph at import time
   - Blocks multiprocessing.shared_memory/posix_ipc/mmap
   - Secure exec/eval wrapper with keyword scanning

3. CUDALayerProtector (cuda_protector.py) — CUDA protection:
   - Disables CUDA Graph capture/replay
   - Resets CUDA state between tests
   - Disables TF32 for consistent precision

4. BucketedShapeGenerator (shape_generator.py) — Shape randomization:
   - GPU-alignment-friendly random shapes
   - GEMM/Attention/Conv specialized generators
   - TensorLayoutRandomizer for stride randomization

5. ProcessIsolatedEvaluator (process_isolator.py) — Process isolation:
   - Each test in fresh subprocess (mp.spawn)
   - All sandbox layers auto-applied in worker
   - Batch evaluation support

6. StatisticalTimingValidator (timing_validator.py) — Statistical checks:
   - CV/IQR/convergence scoring
   - Outlier detection (1.5*IQR rule)
   - Retest consistency check

FullSandbox (full_sandbox.py) ties all layers together into a single API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strictly follows triton_competition_anti_cheat_guide.md:

Layer 1 - cache_isolator.py (Section 3):
  CacheIsolator with HOME isolation, triton/torch cache dirs, cleanup
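
A rough illustration of HOME isolation with automatic cleanup, assuming a context-manager design. The helper name `isolated_home` and the directory layout are hypothetical; only the environment variable names come from this PR.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def isolated_home(prefix="sandbox_home_"):
    """Hypothetical helper: point HOME and cache dirs at a throwaway directory."""
    root = tempfile.mkdtemp(prefix=prefix)
    overrides = {
        "HOME": root,
        "TRITON_CACHE_DIR": os.path.join(root, "triton_cache"),
        "CUDA_CACHE_DISABLE": "1",
    }
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield root
    finally:
        # Restore the previous environment, then delete the temporary tree.
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old
        shutil.rmtree(root, ignore_errors=True)
```

Restoring the saved values in `finally` keeps one test's isolation from leaking into the next, even when the test body raises.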

Layer 2 - import_hook.py (Section 4):
  ForbiddenModuleLoader, DisabledCUDAGraph, ImportHookSandbox,
  RuntimeSandbox, SecureBuiltins, SecurityError, enable_competition_sandbox
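
One way a sys.meta_path hook can reject forbidden imports is sketched below. `ForbiddenModuleFinder` and the install/uninstall helpers are hypothetical names, and making `SecurityError` a subclass of `ImportError` is an assumption; the blocked module names are the ones listed in the first commit message.

```python
import importlib.abc
import sys

# Module names the first commit message lists as blocked.
FORBIDDEN_MODULES = {"multiprocessing.shared_memory", "posix_ipc", "mmap"}


class SecurityError(ImportError):
    """Raised when sandboxed code imports a forbidden module."""


class ForbiddenModuleFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that rejects forbidden imports before other finders run."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname in FORBIDDEN_MODULES:
            raise SecurityError(f"import of {fullname!r} is blocked by the sandbox")
        return None  # Defer to the normal import machinery.


def install_import_hook():
    """Put the finder first on sys.meta_path and return it for later removal."""
    finder = ForbiddenModuleFinder()
    sys.meta_path.insert(0, finder)
    return finder


def uninstall_import_hook(finder):
    if finder in sys.meta_path:
        sys.meta_path.remove(finder)
```

Modules already present in `sys.modules` never reach `sys.meta_path`, so a hook of this kind only helps if it is installed before the submission is imported.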

Layer 3 - cuda_protector.py (Section 5):
  CUDALayerProtector, DisabledCUDAGraphContext,
  CUDA Graph/TF32/profiler disable, CUDA state reset

Layer 4 - shape_generator.py (Section 6):
  ShapeBucket, BucketedShapeGenerator (STANDARD_BUCKETS, GEMM_BUCKETS,
  generate_gemm/conv/attention_shape),
  TensorLayoutRandomizer (randomize_layout/contiguity/strides)
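
Bucketed, alignment-friendly shape sampling might be sketched as follows. The bucket bounds, the 16-element alignment, and the `generate_gemm_shape(seed, buckets)` signature are assumptions for illustration; the PR's actual STANDARD_BUCKETS and GEMM_BUCKETS values are not shown here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeBucket:
    """Inclusive size range; sampled sizes are multiples of `align`."""
    low: int
    high: int
    align: int = 16


# Illustrative buckets only; not the values used in the PR.
EXAMPLE_BUCKETS = (
    ShapeBucket(64, 256),
    ShapeBucket(256, 1024),
    ShapeBucket(1024, 4096),
)


def sample_dim(bucket, rng):
    """Pick a random multiple of `align` inside [low, high]."""
    first = -(-bucket.low // bucket.align)  # ceil(low / align)
    last = bucket.high // bucket.align      # floor(high / align)
    return rng.randint(first, last) * bucket.align


def generate_gemm_shape(seed, buckets=EXAMPLE_BUCKETS):
    """Return (M, N, K), each drawn from a randomly chosen bucket."""
    rng = random.Random(seed)
    return tuple(sample_dim(rng.choice(buckets), rng) for _ in range(3))
```

Seeding a private `random.Random` keeps shape generation reproducible for retests without touching the global random state.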

Layer 5 - process_isolator.py (Section 7):
  TestConfig, isolated_test_worker (7-step isolation),
  ProcessIsolatedEvaluator (evaluate_single/batch with mp.spawn)
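
The idea of one fresh process per test can be illustrated with the standard library. Note the substitution: this sketch launches a new interpreter with `subprocess.run` rather than the mp.spawn approach the PR names, and the worker payload (summing a list) is a placeholder for the real evaluation.

```python
import json
import subprocess
import sys

# Code executed by the child interpreter; it reads a JSON payload from stdin.
WORKER_CODE = """
import json, sys
payload = json.load(sys.stdin)
result = {"test_id": payload["test_id"], "total": sum(payload["values"])}
json.dump(result, sys.stdout)
"""


def evaluate_single(test_id, values, timeout=30.0):
    """Run one test in a fresh interpreter and return its decoded result."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_CODE],
        input=json.dumps({"test_id": test_id, "values": values}),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        return {"test_id": test_id, "error": proc.stderr.strip()}
    return json.loads(proc.stdout)


def evaluate_batch(cases):
    """Each case gets its own process, so no module-level state is shared."""
    return [evaluate_single(test_id, values) for test_id, values in cases]
```

Because every call starts from a clean interpreter, anything a submission caches in module globals is gone before the next test runs.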

Layer 6 - timing_validator.py (Section 8):
  TimingAnomalyType, TimingValidationResult,
  StatisticalTimingValidator (CV/IQR/convergence/outliers/retest),
  AdvancedTimingValidator
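
The CV and 1.5*IQR checks can be sketched with the `statistics` module. The function names and the `max_cv=0.10` threshold are illustrative assumptions, not values from the PR, and convergence scoring and retest consistency are left out.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def iqr_outliers(samples):
    """Return samples outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [s for s in samples if s < low or s > high]


def validate_timings(samples, max_cv=0.10):
    """Summarize a timing run; max_cv is an illustrative threshold."""
    cv = coefficient_of_variation(samples)
    outliers = iqr_outliers(samples)
    return {"cv": cv, "outliers": outliers, "ok": cv <= max_cv and not outliers}
```

For example, a run of eight timings near 1.0 ms passes, while the same run with a single 5.0 ms sample appended is flagged on both the CV and the outlier check.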

Layer 7 - competition_evaluator.py (Section 9):
  CompetitionConfig, TestCase, TestCaseGenerator,
  isolated_worker_main, CompetitionEvaluator, main()

Plus full_sandbox.py convenience wrapper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ccess)

Three new defenses against real-world competition attacks:

1. print() blocked:
   - AST scan: 'print()' calls flagged as hack
   - Runtime: builtins.print replaced with no-op in SecureBuiltins

2. data_ptr() / storage() blocked:
   - AST scan: .data_ptr, .untyped_storage, .storage, .storage_offset
     all flagged as forbidden memory access

3. Per-iteration random seeds:
   - Each timed run uses a different seed -> different input values
   - Kills hardcoded lookup tables and inter-iteration caching
   - Fresh clone() per iteration prevents pointer equality checks
   - Separate warmup seed to isolate warmup from scoring

These close the "200x speedup" hacks: input sniffing via print(),
inter-iteration result caching, and raw memory pointer reading.
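
A framework-free sketch of per-iteration seeding is shown below. It uses plain Python lists as a stand-in for tensors, so `make_inputs` and `timed_runs` are hypothetical; a real harness would presumably seed the tensor generator and clone inputs instead.

```python
import random
import time


def make_inputs(seed, size):
    """Build a fresh input list from a seed; a stand-in for seeded tensor creation."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]


def timed_runs(fn, base_seed, iterations=5, size=1024):
    """Time fn on different inputs each iteration; returns (seed, seconds) pairs."""
    # Warm up on a seed that is never reused for scoring.
    fn(make_inputs(base_seed - 1, size))
    timings = []
    for i in range(iterations):
        seed = base_seed + i
        data = make_inputs(seed, size)  # new values and a new object every run
        start = time.perf_counter()
        fn(data)
        timings.append((seed, time.perf_counter() - start))
    return timings
```

Since no two timed runs see the same values or the same object, a submission that memoizes results by input content or identity gains nothing between iterations.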

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds scope tracking to AST HackDetector:
- visit_FunctionDef/AsyncFunctionDef/ClassDef -> increment scope depth
- visit_Assign/visit_AnnAssign -> at depth 0, detect:
  - name = {} / name = [] / name = {...} (literals)
  - name = dict() / name = list() / name = set() (constructor calls)

Module-level dict/list/set declarations are banned in competition mode
because they enable inter-iteration result caching attacks.

Local mutable state inside functions is NOT flagged (legitimate use).
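
The scope-depth rule described above could be implemented along these lines. `ModuleStateDetector` is a hypothetical name for this sketch, which covers only the literal and constructor forms the commit message lists.

```python
import ast

MUTABLE_CONSTRUCTORS = {"dict", "list", "set"}


class ModuleStateDetector(ast.NodeVisitor):
    """Hypothetical detector: flags mutable containers assigned at module level."""

    def __init__(self):
        self.depth = 0
        self.flagged = []

    def _enter_scope(self, node):
        # Function and class bodies are visited one level deeper.
        self.depth += 1
        self.generic_visit(node)
        self.depth -= 1

    visit_FunctionDef = _enter_scope
    visit_AsyncFunctionDef = _enter_scope
    visit_ClassDef = _enter_scope

    @staticmethod
    def _is_mutable(value):
        if isinstance(value, (ast.Dict, ast.List, ast.Set)):
            return True
        return (
            isinstance(value, ast.Call)
            and isinstance(value.func, ast.Name)
            and value.func.id in MUTABLE_CONSTRUCTORS
        )

    def _check(self, node, targets):
        if self.depth == 0 and node.value is not None and self._is_mutable(node.value):
            names = [t.id for t in targets if isinstance(t, ast.Name)]
            self.flagged.extend((node.lineno, name) for name in names)

    def visit_Assign(self, node):
        self._check(node, node.targets)
        self.generic_visit(node)

    def visit_AnnAssign(self, node):
        self._check(node, [node.target])
        self.generic_visit(node)
```

Run on a file containing `_cache = {}` at the top level and `local = []` inside a function, the detector reports only `_cache`.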

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add dual_execution_check_with_ref() that accepts already-computed
reference output from the accuracy test, only running the
triton.jit-disabled execution and comparing with the saved result.
Saves 50% of Layer 2 execution time.

Also enables anti_hack (Layer 2 + Layer 3) by default in VerifyConfig.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Applied automatically to both LLM track and Agent track:

1. Env vars: TRITON_DISABLE_AUTOTUNE, TRITON_CACHE_DIR, TORCHINDUCTOR_DISABLE, CUDA_CACHE_DISABLE
2. CUDA layer protection: disable CUDA Graph, TF32, reset CUDA state
3. Runtime import hook: patch triton.autotune/torch.compile at import time

All set via Verifier._setup_sandbox() called in __init__.
Failures are non-fatal (try/except).
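
The env-var step with non-fatal failure handling might look like this sketch. `setup_env_sandbox` is a hypothetical stand-in for the relevant part of Verifier._setup_sandbox(); the variable names come from the commit message, while the cache path and the value "1" for TORCHINDUCTOR_DISABLE are assumptions.

```python
import logging
import os
import tempfile

logger = logging.getLogger("sandbox")


def setup_env_sandbox(cache_root=None):
    """Hypothetical stand-in for the env-var step; returns the names that were set."""
    applied = []
    try:
        cache_root = cache_root or tempfile.mkdtemp(prefix="verify_cache_")
        settings = {
            "TRITON_DISABLE_AUTOTUNE": "1",
            "TRITON_CACHE_DIR": os.path.join(cache_root, "triton"),
            "TORCHINDUCTOR_DISABLE": "1",  # assumed value
            "CUDA_CACHE_DISABLE": "1",
        }
        for name, value in settings.items():
            os.environ[name] = value
            applied.append(name)
    except OSError as exc:
        # Non-fatal by design: verification continues without this layer.
        logger.warning("sandbox setup skipped: %s", exc)
    return applied
```

Returning the list of applied names lets the caller log which protections are actually active when setup partially fails.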

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… daily verify

1. RuntimeSandbox.enable() now auto-enables SecureBuiltins (print noop)
2. Verifier._setup_sandbox: removed the import hook, which is too heavy for
   daily use; the lightweight path keeps only env vars + CUDA protector
3. Both fixes verified: all 8 modules pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@factnn factnn merged commit 182b5b0 into main Jun 17, 2026
1 check passed