feat(agent-loop): add finish reason state machine#40
Conversation
Add pure TypeScript types and classifiers for Codra's long-horizon agent loop, mirror enums in codra-protocol, and cover finish-reason routing with table-driven tests. No runtime, UI, or memory behavior changes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d6b976e283
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (state === "truncated") { | ||
| return !options?.allowTruncatedRecovery; |
There was a problem hiding this comment.
Treat truncation as recoverable by default
When a length finish reason is routed to truncated, callers using the default terminal helper will stop the loop before the truncation recovery transition can run, so the checkpoint/rebuild path described for truncation is skipped unless every caller remembers to pass allowTruncatedRecovery. Since the state machine already has an explicit truncation_terminal event for the cases that should stop, truncated should not be terminal by default.
Useful? React with 👍 / 👎.
| if (statusCode >= 400) { | ||
| return "client_error"; |
There was a problem hiding this comment.
Classify HTTP 408 as a retryable timeout
If a provider returns HTTP 408, this falls through to client_error, and shouldRetryApiError will not retry it even though the default retry policy explicitly includes timeout. Add an explicit 408 mapping to timeout so request-timeout responses follow the bounded retry path instead of failing as non-retriable client errors.
Useful? React with 👍 / 👎.
| const nextState = contentClassificationToNextState( | ||
| event.contentClassification, | ||
| Boolean(event.hasGoal), | ||
| ); |
There was a problem hiding this comment.
Require explicit goal context before finalizing
If a content_classified event omits hasGoal for a task that actually has a goal, Boolean(undefined) routes a final_answer straight to final, bypassing the mandatory goal_verify state. Make the event require an explicit goal/no-goal decision or fail closed when it is missing so an integration mistake cannot accept a stop response as complete.
Useful? React with 👍 / 👎.
| "AgentGoalVerdict", | ||
| "AgentLoopEventType", | ||
| ], | ||
| interfaces: ["AgentLoopTransition", "AgentLoopDecision"], |
There was a problem hiding this comment.
Mirror AgentLoopEvent in the protocol sync
The new shared AgentLoopEvent payload drives nextAgentLoopState, but the agent-loop sync list only checks AgentLoopTransition and AgentLoopDecision, and there is no matching Rust struct in codra-protocol. This lets the check pass while the Rust-owned loop cannot share or validate the event payload fields (finishReason, contentClassification, hasGoal) against the TypeScript contract.
Useful? React with 👍 / 👎.
| if (attempt >= policy.maxAttempts) { | ||
| return false; |
There was a problem hiding this comment.
Allow the final documented retry attempt
With the documented one-based attempt counts, this rejects retriable errors at attempt 3 even though the PR's acceptance criteria call for 429 retries to remain allowed until attempt 4. In that integration, a provider that hits a transient 429/5xx/timeout loses one retry from the configured budget, so compare after the max attempt or rename/normalize the counter semantics.
Useful? React with 👍 / 👎.
Summary
Implements PR 1 from the MiMo-inspired long-horizon agent loop research (
docs/research/MIMO_LONG_HORIZON_AGENT_LOOP.md).This PR adds Codra’s pure agent loop foundation: types, finish-reason classifiers, and table-driven tests. No runtime behavior changed — task execution, UI, checkpointing, memory, dream/distill are untouched.
Types added
AgentLoopState— 18 harness states (start→call_model→ … →final/failed)AgentFinishReason—tool_calls,function_call,length,content_filter,stop,api_error,streaming_null,unknownAgentApiResponseStatus,AgentContentClassification,AgentApiErrorClass,AgentGoalVerdictAgentLoopTransition,AgentLoopDecision,AgentLoopEventClassifier functions
classifyFinishReason()— maps provider finish reasons to Codra routesclassifyContent()— detects final answer vs think-only vs empty vs questionshouldRetryApiError()— bounded retry policy (429/5xx/timeout)nextAgentLoopState()— event-driven state transitionsisTerminalAgentLoopState()— terminal detection with optional truncated recoveryfinishReasonToNextState(),contentClassificationToNextState(),classifyApiErrorFromStatus()Rust mirror
crates/codra-protocol/src/agent_loop.rs— serde enums/structs mirroring shared TS typesTests
packages/shared/agent-loop.test.tscodra-protocolValidation
pnpm --filter @codra/shared test— 41 passedpnpm --filter @codra/shared build— passedpnpm run check:task-loop-protocol-sync— passedcargo check -p codra-protocol— passedcargo test -p codra-protocol— 3 passedNext PR (not in this change)
Recommended follow-up: goal verification classifier — independent judge before accepting
final, preventing agents from stopping when the task is not actually complete. Runtime integration and checkpoint/memory layers come after that.