Fix Bedrock reasoning for adaptive-thinking models (Claude Opus 4.8) #558
gruel-coveo wants to merge 2 commits into
Models such as Claude Opus 4.8 on Bedrock reject the legacy
"thinking.type=enabled" payload that LiteLLM emits when it converts the
"reasoning_effort" argument, returning:
"thinking.type.enabled" is not supported for this model. Use
"thinking.type.adaptive" and "output_config.effort" to control
thinking behavior.
LiteLLM's model metadata already flags these models with
"supports_adaptive_thinking", but its request conversion still sends the
old shape. Detect that flag and pass the adaptive-thinking API via
extra_args (thinking.type=adaptive + output_config.effort) instead of the
reasoning_effort scalar. All other models keep their existing behavior.
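A minimal sketch of the branching this commit message describes, assuming the flag has already been looked up. The function name `build_reasoning_kwargs` and the exact nesting under `extra_args` are illustrative assumptions, not the PR's actual code:

```python
from __future__ import annotations

from typing import Any


def build_reasoning_kwargs(effort: str, supports_adaptive_thinking: bool) -> dict[str, Any]:
    """Return the reasoning-related keyword arguments for one request.

    Hypothetical helper: the real change lives in the project's model-settings
    code and may structure these arguments differently.
    """
    if supports_adaptive_thinking:
        # Adaptive-thinking models: send the new shape explicitly instead of
        # letting reasoning_effort be converted to thinking.type=enabled.
        return {
            "extra_args": {
                "thinking": {"type": "adaptive"},
                "output_config": {"effort": effort},
            }
        }
    # Every other model keeps the existing scalar argument.
    return {"reasoning_effort": effort}
```

The point of returning only one of the two shapes is that an adaptive-thinking model never receives `reasoning_effort`, so the legacy conversion is never triggered for it.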
Greptile Summary
This PR fixes a Bedrock 400 error that occurs when running against Claude Opus 4.8 inference profiles, which reject the legacy
Confidence Score: 4/5
Safe to merge for the described scenario; edge cases around non-standard effort strings and future models with only the adaptive-thinking flag are low-risk today but worth documenting.
The core fix is correct and well-scoped: Claude Opus 4.8 gets the new API shape, all other models are untouched. Two minor concerns remain open: strix/core/inputs.py: the branching logic in
Important Files Changed
Bedrock's output_config.effort only accepts low/medium/high (plus xhigh on models that advertise it via supports_xhigh_reasoning_effort). The ReasoningEffort range also includes "minimal" and "xhigh", which were previously embedded verbatim and would trigger an opaque 400 on adaptive-thinking models. Map "minimal" to "low", and downgrade "xhigh" to "high" on models that don't support it.
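The mapping described above can be sketched as a small pure function. The name `normalize_bedrock_effort` and the boolean parameter are assumptions for illustration; in the PR the xhigh capability would come from the model's `supports_xhigh_reasoning_effort` metadata:

```python
def normalize_bedrock_effort(effort: str, supports_xhigh: bool) -> str:
    """Map a ReasoningEffort value onto what output_config.effort accepts.

    Hypothetical helper illustrating the rule in the commit message:
    "minimal" becomes "low", and "xhigh" falls back to "high" unless the
    model advertises xhigh support. Other values pass through unchanged.
    """
    if effort == "minimal":
        return "low"
    if effort == "xhigh" and not supports_xhigh:
        return "high"
    return effort
```

For example, `normalize_bedrock_effort("xhigh", supports_xhigh=False)` yields `"high"`, while the same call with `supports_xhigh=True` leaves `"xhigh"` untouched.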
Problem
Running a scan against an AWS Bedrock Claude Opus 4.8 inference profile fails immediately with a 400 BadRequestError:
"thinking.type.enabled" is not supported for this model. Use "thinking.type.adaptive" and "output_config.effort" to control thinking behavior.
Root cause
Strix detects that the model supports reasoning and passes reasoning_effort to the SDK, which forwards it to LiteLLM. LiteLLM converts reasoning_effort into Bedrock's legacy thinking.type=enabled payload. Newer Anthropic models (e.g. Claude Opus 4.8) no longer accept that shape and require thinking.type=adaptive plus output_config.effort.
LiteLLM's own model metadata already flags these models with supports_adaptive_thinking: true (alongside supports_output_config), but its request-conversion code still emits the old shape, so the request is rejected before any tokens are generated.
Fix
- Add model_supports_adaptive_thinking(), which reads LiteLLM's supports_adaptive_thinking flag (refactors the shared model-cost lookup into a helper).
- In make_model_settings, for reasoning-capable models that support adaptive thinking, send the adaptive-thinking API via extra_args (thinking.type=adaptive + output_config.effort) instead of the reasoning_effort scalar.
All other models keep their existing behavior.
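As a rough illustration of the flag lookup, the sketch below reads `supports_adaptive_thinking` from a metadata table through a shared helper. `MODEL_COST` is a hypothetical stand-in for LiteLLM's model metadata, the model names are placeholders, and the provider-prefix fallback is an assumption rather than the PR's confirmed behavior:

```python
from __future__ import annotations

from typing import Any, Mapping

# Hypothetical stand-in for LiteLLM's model metadata table.
MODEL_COST: dict[str, dict[str, Any]] = {
    "example-adaptive-model": {
        "supports_reasoning": True,
        "supports_adaptive_thinking": True,
        "supports_output_config": True,
    },
    "example-legacy-model": {"supports_reasoning": True},
}


def _model_info(model: str, table: Mapping[str, Mapping[str, Any]] = MODEL_COST) -> Mapping[str, Any]:
    """Shared lookup: try the full name, then the name without a provider prefix."""
    if model in table:
        return table[model]
    # Assumption: names may arrive as "provider/model"; fall back to the bare name.
    bare = model.rpartition("/")[2]
    return table.get(bare, {})


def model_supports_adaptive_thinking(
    model: str, table: Mapping[str, Mapping[str, Any]] = MODEL_COST
) -> bool:
    """True only when the metadata explicitly sets supports_adaptive_thinking."""
    return bool(_model_info(model, table).get("supports_adaptive_thinking", False))
```

Defaulting to `False` for unknown models keeps them on the existing `reasoning_effort` path, which matches the stated goal that all other models keep their current behavior.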
Testing
Verified against live Bedrock:
- converse-stream) calls. Before the change, the same call reproduced the thinking.type.enabled error.
- o3 still uses the reasoning path; gpt-4o and Claude 3.5 Sonnet are unaffected.
- ruff check and mypy pass on the changed files.