Fail AI agent step on deterministic Claude Code failures #2027

Merged
sathvikkumar-octo merged 3 commits into main from sk/ai-agent-deterministic-failures on Jun 18, 2026

@sathvikkumar-octo (Contributor) commented Jun 18, 2026

Per ADR-004, fail the step on any of: a non-zero CLI exit, a non-success terminal status (is_error or subtype != success), or a populated permission_denials list.

The processor now captures the final result event and a pure ClaudeAgentOutcomeEvaluator decides the outcome in the runner.

While testing this, I discovered that built-in tools appear to be allowed even when they are not listed in the allowed tools list. I'm tackling that separately.

Resolves MD-2077
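
The three failure conditions in the description can be sketched as a pure function. This is a minimal illustration in Python, not the `ClaudeAgentOutcomeEvaluator` from this PR (which lives in the C# runner). The names `Outcome` and `evaluate_outcome` are hypothetical. The field names `is_error`, `subtype`, and `permission_denials` come from the description above, and the sketch assumes the final result event has been parsed into a dictionary. How the real code treats a missing result event is not stated here, so the sketch simply skips the event checks in that case.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Outcome:
    """Hypothetical result type: success flag plus human-readable failure reasons."""
    succeeded: bool
    reasons: list[str] = field(default_factory=list)


def evaluate_outcome(exit_code: int, result_event: Optional[dict[str, Any]] = None) -> Outcome:
    """Decide the step outcome from the CLI exit code and the final result event.

    Pure function: no I/O and no state, so it is easy to unit test.
    """
    reasons: list[str] = []

    # Condition 1: a non-zero CLI exit.
    if exit_code != 0:
        reasons.append(f"CLI exited with code {exit_code}")

    # Assumption: if no result event was captured, only the exit code is checked.
    if result_event is not None:
        # Condition 2: a non-success terminal status.
        is_error = bool(result_event.get("is_error"))
        subtype = result_event.get("subtype")
        if is_error or subtype != "success":
            reasons.append(
                f"terminal status was not success (subtype={subtype!r}, is_error={is_error})"
            )

        # Condition 3: a populated permission_denials list.
        denials = result_event.get("permission_denials") or []
        if denials:
            reasons.append(f"{len(denials)} permission denial(s) reported")

    return Outcome(succeeded=not reasons, reasons=reasons)
```

Collecting every reason, rather than returning on the first match, lets a single failed step report all of the conditions it tripped.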

@zentron (Contributor) commented Jun 18, 2026

Looks like I get a failure, which is a good sign.
[image]

Comment thread source/Calamari.AiAgent/ClaudeCodeBehaviour/ClaudeCodeCliRunner.cs Outdated
It is stateless with a single method, so an instance adds no value.
Addresses PR review feedback.
@sathvikkumar-octo enabled auto-merge (squash) June 18, 2026 05:29
@sathvikkumar-octo enabled auto-merge (squash) June 18, 2026 06:45
@sathvikkumar-octo merged commit cc43573 into main Jun 18, 2026
34 checks passed
@sathvikkumar-octo deleted the sk/ai-agent-deterministic-failures branch June 18, 2026 07:22