Issues · strands-agents/evals · GitHub

[FEATURE] CLI improvements

#262

· poshinchen opened

on Jun 12, 2026

[FEATURE] Skill evaluation: with vs without skill batch-run, aggregation, comparison, analysis

area-evaluators

area-simulation

#247

· sangminwoo opened

on Jun 9, 2026

[FEATURE] Strands-evals CLI

#242

· poshinchen opened

on Jun 4, 2026

[FEATURE] Paired N-run experiments with batch aggregation and comparative analysis

area-evaluators

#228

· sangminwoo opened

on May 12, 2026

[FEATURE] [P0] Pipeline refinements

#226

· poshinchen opened

on May 12, 2026

[FEATURE] [P0] Attack strategies

#225

· poshinchen opened

on May 12, 2026

[FEATURE][P1] Red-Teaming: Tool-graph case generation, deterministic verification, adaptive red-teaming agent

area-generators

#222

· yeomjiwonyeom opened

on May 11, 2026

[FEATURE][P0.5] Red-Teaming: Agent-topology attacks & skill-based red-teaming

#221

· yeomjiwonyeom opened

on May 11, 2026

[FEATURE] Wrap / Deploy Strands-evals evaluators to Agentcore Code-base Evaluators

area-evaluators

#204

· poshinchen opened

on Apr 29, 2026

feat(experiment): ExperimentTrendAnalyzer — cross-run overall_score regression detection

#186

· nanookclaw opened

on Apr 3, 2026

[FEATURE] Built-in red teaming support

#177

· kevmyung opened

on Mar 24, 2026

[Experiment] Evals - Coding agent evaluation and iteration

area-evaluators

#152

· poshinchen opened

on Mar 3, 2026