Summary
When the host agent layer (the IDE / agent that runs evolver) gets an unrecoverable client error from its LLM provider (HTTP 400 / 401 / 403: for example a malformed-request rejection such as "field MaxTokens invalid, should be in [1, 65536]", an auth failure, or a hard quota denial), every evolution attempt fails for a reason that has nothing to do with the Gene being run. Today evolver counts each of these toward the consecutive-failure streak, so after ~5 failures it trips failure_loop_detected, bans an innocent Gene (ban_gene:<gene>), and then emits force_innovation_after_repair_loop.
This was surfaced by the auto-report in #534, whose root cause was a host-side LLM 400 on every call — evolver faithfully recorded 8 straight failures and banned gene_gep_repair_from_errors for an error it had no part in.
Current behaviour (no error-class distinction)
src/gep/signals.js increments consecutiveFailureCount for any outcome.status === 'failed', with no inspection of the failure cause.
- At streak ≥ 5 it emits failure_loop_detected + ban_gene:<topGene> (and force_innovation_after_repair_loop).
- The host's error text is already in hand at the collector (src/evolve/pipeline/collect.js, where a host errorMessage is captured and prefixed [LLM ERROR] …), but it never feeds the streak/ban logic.
The only 4xx classification that exists today lives in unrelated subsystems (hub heartbeat circuit breaker, ATP/stake terminal-status handling) and never touches the evolution-outcome path.
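The behaviour described above can be modelled with a small sketch. This is not the code in src/gep/signals.js; the function name naiveSignals, the outcome shape ({ status, gene, errorMessage }), and the way the top Gene is chosen are all assumptions made for illustration. Only the threshold of 5 and the signal names come from this report.

```javascript
// Hypothetical model of the current behaviour; NOT the real src/gep/signals.js.
// Assumed outcome shape: { status: 'failed' | 'success', gene: string, errorMessage?: string }
function naiveSignals(outcomes, threshold = 5) {
  let consecutiveFailureCount = 0;
  const failuresByGene = new Map();

  for (const outcome of outcomes) {
    if (outcome.status === 'failed') {
      // Every failure extends the streak; the cause is never inspected.
      consecutiveFailureCount += 1;
      failuresByGene.set(outcome.gene, (failuresByGene.get(outcome.gene) || 0) + 1);
    } else {
      // Any non-failure resets the streak.
      consecutiveFailureCount = 0;
      failuresByGene.clear();
    }
  }

  if (consecutiveFailureCount < threshold) return [];

  // The Gene with the most failures in the streak takes the blame.
  const [topGene] = [...failuresByGene.entries()].sort((a, b) => b[1] - a[1])[0];
  return ['failure_loop_detected', `ban_gene:${topGene}`, 'force_innovation_after_repair_loop'];
}
```

Feeding this model eight failed outcomes that all carry a host-side 400 message still produces a ban for the Gene that happened to be running, which is the scenario reported in #534.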
Proposed improvement
Classify an evolution outcome whose underlying failure is an unrecoverable host/LLM client error (4xx — request-invalid / auth / quota) as a distinct, non-Gene-attributable failure that:
- Aborts the current intent early instead of retrying into the streak.
- Is excluded from consecutiveFailureCount / per-Gene failure frequency, so it cannot trigger ban_gene / failure_loop_detected.
- Surfaces a clear, actionable signal to the operator (e.g. host_llm_client_error) pointing at the host LLM config rather than at evolver/the Gene.
This keeps the failure-loop detector focused on genuine repair/optimize failures and stops a misconfigured host model from poisoning Gene reputation.
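One possible shape for this classification is sketched below. It is a minimal illustration, not a patch. The helper names (classifyFailure, countGeneFailureStreak, deriveSignals), the outcome shape, and the assumption that the captured message looks like "[LLM ERROR] HTTP <status>: …" are all hypothetical; the real format produced in src/evolve/pipeline/collect.js may differ.

```javascript
// Hypothetical sketch; none of these helpers are existing evolver APIs.

// Status codes treated as unrecoverable host/LLM client errors (assumption).
const HOST_CLIENT_ERROR_CODES = new Set([400, 401, 403]);

// Assumes the host error text looks like "[LLM ERROR] HTTP 400: ...".
function classifyFailure(errorMessage) {
  if (typeof errorMessage !== 'string' || !errorMessage.startsWith('[LLM ERROR]')) {
    return 'gene_failure';
  }
  const match = errorMessage.match(/\bHTTP (\d{3})\b/);
  const status = match ? Number(match[1]) : null;
  return HOST_CLIENT_ERROR_CODES.has(status) ? 'host_llm_client_error' : 'gene_failure';
}

// Count the trailing run of Gene-attributable failures. Host client errors
// are skipped: they neither extend nor reset the streak.
function countGeneFailureStreak(outcomes) {
  let streak = 0;
  for (let i = outcomes.length - 1; i >= 0; i--) {
    const outcome = outcomes[i];
    if (outcome.status !== 'failed') break;
    if (classifyFailure(outcome.errorMessage) === 'host_llm_client_error') continue;
    streak += 1;
  }
  return streak;
}

// Emit an operator-facing signal for host errors and keep the loop
// detector limited to genuine Gene failures.
function deriveSignals(outcomes, threshold = 5) {
  const signals = [];
  const sawHostError = outcomes.some(
    (o) => o.status === 'failed' && classifyFailure(o.errorMessage) === 'host_llm_client_error'
  );
  if (sawHostError) signals.push('host_llm_client_error');
  if (countGeneFailureStreak(outcomes) >= threshold) signals.push('failure_loop_detected');
  return signals;
}
```

In this sketch a host client error is skipped rather than treated as a streak reset, so a genuine failure loop that is briefly interrupted by a misconfigured host is still detected once the host recovers. Whether skipping or resetting is the better policy is a design choice left open here.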
Notes
400, not an evolver bug).