Skip to content

fix: re-validate interpolated output_file against untrusted inputs#6447

Open
theCyberTech wants to merge 3 commits into
mainfrom
fix/output-file-interpolation-traversal
Open

fix: re-validate interpolated output_file against untrusted inputs#6447
theCyberTech wants to merge 3 commits into
mainfrom
fix/output-file-interpolation-traversal

Conversation

@theCyberTech

@theCyberTech theCyberTech commented Jul 3, 2026

Copy link
Copy Markdown
Member

Summary

Task.output_file supports {var} templates that are filled from crew.kickoff(inputs=...). The field validator (output_file_validation) blocks .., absolute paths, ~/$, and shell metacharacters — but it returns templated values unchecked, and the concrete path produced later by interpolate_inputs_and_add_conversation_history is assigned to self.output_file without re-validation (Task.model_config does not set validate_assignment=True).

An untrusted inputs value can therefore inject traversal / an absolute path / expansion into a templated output_file and write the task output outside the working directory:

Task(description=..., expected_output=..., output_file="reports/{name}.md")
crew.kickoff(inputs={"name": "../../../../home/victim/.bashrc"})   # escapes cwd

This is a live variant of the earlier output_file path-traversal fix (73f3288): that fix added the validator, but the template-interpolation path bypasses it.

Fix

Validate the interpolated variable values (only those appearing in the output_file template) before interpolation, rejecting .., absolute paths (POSIX or Windows), ~/$ expansion, and shell metacharacters. The developer-authored template — including a legitimate absolute base directory — stays trusted, so existing templated-path behavior is unaffected.

  • New Task._validate_output_file_input_values(inputs) runs before output-path interpolation.
  • The field validator's shared checks were extracted into Task._sanitize_output_file_path (definition-time behavior preserved).

Tests

  • test_interpolate_output_file_rejects_unsafe_inputs — 5 malicious-input cases (traversal, absolute path, ~ expansion, shell char, Windows absolute) all raise ValueError.
  • test_interpolate_output_file_allows_safe_inputs — safe relative templates and developer-chosen absolute base templates still work.
  • Full tests/test_task.py green (73 passed), including the pre-existing test_interpolate_inputs; ruff clean.

@github-actions github-actions Bot added the size/M label Jul 3, 2026
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds path sanitization and a runtime input validator for Task.output_file templates to reject path traversal, shell expansion, and absolute paths during interpolation, plus accompanying tests. Separately updates the vulnerability scan workflow to ignore advisory PYSEC-2026-597.

Changes

Task output_file path validation

Layer / File(s) Summary
Shared sanitization helper
lib/crewai/src/crewai/task.py
Imports PurePosixPath/PureWindowsPath and refactors literal/template output_file validation to use a shared _sanitize_output_file_path helper.
Runtime validation of interpolated values
lib/crewai/src/crewai/task.py
Adds _validate_output_file_input_values to reject substituted values with traversal, shell expansion/metacharacters, or absolute paths, invoked before output_file interpolation.
Test coverage
lib/crewai/tests/test_task.py
Adds parameterized tests for rejecting unsafe interpolation inputs and a test confirming safe relative/absolute output_file interpolation.

Vulnerability Scan Advisory Update

Layer / File(s) Summary
Ignore PYSEC-2026-597
.github/workflows/vulnerability-scan.yml
Adds --ignore-vuln PYSEC-2026-597 to the pip-audit command and documents the nltk path traversal advisory in the Ignored CVEs comments.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Task
  participant Validator as _validate_output_file_input_values
  participant Interpolator as interpolate_only

  Caller->>Task: interpolate_inputs_and_add_conversation_history(inputs)
  Task->>Validator: validate output_file template variables
  alt unsafe value detected
    Validator-->>Task: raise ValueError
  else values safe
    Validator-->>Task: pass
    Task->>Interpolator: interpolate_only(output_file, inputs)
    Interpolator-->>Task: sanitized output_file
  end
Loading

Suggested reviewers: lorenzejay, vinibrsl

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: re-validating interpolated output_file values from untrusted inputs.
Description check ✅ Passed The description directly explains the templated output_file validation fix, implementation, and tests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/output-file-interpolation-traversal

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

The output_file field validator accepts {var} templates unchecked, and the
concrete path produced by interpolate_inputs_and_add_conversation_history was
assigned without re-validation. An untrusted crew.kickoff(inputs=...) value
could inject '..', an absolute path, or ~/$ expansion into a templated
output_file and write outside the working directory.

Validate the interpolated variable values (only those appearing in the
output_file template) for traversal, absolute paths, shell expansion, and
shell metacharacters before interpolation. The developer-authored template
(including an absolute base directory) stays trusted, so legitimate templated
paths are unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@theCyberTech theCyberTech force-pushed the fix/output-file-interpolation-traversal branch from 246a8b3 to 541bb6c Compare July 3, 2026 06:33
theCyberTech and others added 2 commits July 3, 2026 15:22
nltk 3.9.4 path-traversal advisory (percent-encoded sequences bypassing
the data.load()/find() check; incomplete fix for nltk#3504). 3.9.4 is the
latest release and the advisory lists no fixed version, so the pin cannot be
bumped to clear it. nltk is transitive via unstructured and crewai never
calls nltk.data.load()/find() with untrusted input. Mirrors the existing
PYSEC-2026-97 nltk ignore.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@theCyberTech theCyberTech marked this pull request as ready for review July 3, 2026 09:03
Copilot AI review requested due to automatic review settings July 3, 2026 09:03

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/vulnerability-scan.yml:
- Line 54: The vulnerability ignore list is out of sync between the workflow and
the pre-commit hook, so local runs can still block on PYSEC-2026-597. Update the
`--ignore-vuln` list in the `vulnerability-scan` workflow and the matching
pre-commit configuration so they both include the same advisory set, using the
existing ignore-list block comments as the place to keep them aligned.

In `@lib/crewai/src/crewai/task.py`:
- Around line 549-594: The current _validate_output_file_input_values method
only checks each placeholder value independently, so concatenated inputs in the
output_file template can still form path traversal or absolute paths. Update the
validation in task.py to also evaluate the fully interpolated output_file path
after substitution and ensure it remains contained within the trusted template
base, alongside the existing per-value checks. Use the _original_output_file
template and the interpolation logic around _validate_output_file_input_values
to locate the fix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e94f5828-a682-404c-a533-475716376fa7

📥 Commits

Reviewing files that changed from the base of the PR and between 2b90117 and 0138c14.

📒 Files selected for processing (3)
  • .github/workflows/vulnerability-scan.yml
  • lib/crewai/src/crewai/task.py
  • lib/crewai/tests/test_task.py

--ignore-vuln PYSEC-2024-277 \
--ignore-vuln PYSEC-2026-89 \
--ignore-vuln PYSEC-2026-97 \
--ignore-vuln PYSEC-2026-597 \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Sync .pre-commit-config.yaml ignore list.

The pre-commit hook's own comment states "Keep this ignore list in sync with .github/workflows/vulnerability-scan.yml." but its --ignore-vuln list does not yet include PYSEC-2026-597. Without updating it there too, local pre-commit runs will still fail/block on this advisory even though CI now ignores it.

🔧 Suggested fix for .pre-commit-config.yaml
           --ignore-vuln PYSEC-2026-89
           --ignore-vuln PYSEC-2026-97
+          --ignore-vuln PYSEC-2026-597
           --ignore-vuln PYSEC-2025-148

Also applies to: 82-82

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/vulnerability-scan.yml at line 54, The vulnerability
ignore list is out of sync between the workflow and the pre-commit hook, so
local runs can still block on PYSEC-2026-597. Update the `--ignore-vuln` list in
the `vulnerability-scan` workflow and the matching pre-commit configuration so
they both include the same advisory set, using the existing ignore-list block
comments as the place to keep them aligned.

Source: Linked repositories

Comment on lines +549 to +594
def _validate_output_file_input_values(
self, inputs: dict[str, str | int | float | dict[str, Any] | list[Any]]
) -> None:
"""Reject untrusted input values that would escape the output path.

Only the variables that actually appear in the ``output_file`` template
are checked. The developer-authored template is trusted (it may contain
an absolute base directory), but a value substituted into it must not
introduce path traversal (``..``), an absolute path, a home/variable
expansion (``~``/``$``), or shell metacharacters that would redirect the
write outside the intended location.
"""
if not self._original_output_file:
return

template_vars = [
part.split("}")[0] for part in self._original_output_file.split("{")[1:]
]
for var in template_vars:
if var not in inputs:
continue
value = str(inputs[var])
if ".." in value:
raise ValueError(
f"Invalid value for output_file variable '{var}': path "
"traversal sequences ('..') are not allowed"
)
if value.startswith(("~", "$")):
raise ValueError(
f"Invalid value for output_file variable '{var}': shell "
"expansion characters are not allowed"
)
if any(char in value for char in ["|", ">", "<", "&", ";"]):
raise ValueError(
f"Invalid value for output_file variable '{var}': shell "
"special characters are not allowed"
)
if (
PurePosixPath(value).is_absolute()
or PureWindowsPath(value).is_absolute()
):
raise ValueError(
f"Invalid value for output_file variable '{var}': absolute "
"paths are not allowed"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the target file and inspect the relevant range with line numbers.
git ls-files 'lib/crewai/src/crewai/task.py'
wc -l lib/crewai/src/crewai/task.py
sed -n '480,620p' lib/crewai/src/crewai/task.py | cat -n

# Also inspect any nearby references to output_file validation/commentary.
rg -n "output_file|validate_output_file|fully interpolated|re-validated" lib/crewai/src/crewai/task.py

Repository: crewAIInc/crewAI

Length of output: 9365


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1084,1110p' lib/crewai/src/crewai/task.py | cat -n
printf '\n---\n'
sed -n '1258,1285p' lib/crewai/src/crewai/task.py | cat -n

Repository: crewAIInc/crewAI

Length of output: 2947


Validate the interpolated output_file path as a whole
Per-value checks can be bypassed when adjacent placeholders concatenate into .. or an absolute path, so the final interpolated path still needs a whole-path containment check against the trusted template base.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/task.py` around lines 549 - 594, The current
_validate_output_file_input_values method only checks each placeholder value
independently, so concatenated inputs in the output_file template can still form
path traversal or absolute paths. Update the validation in task.py to also
evaluate the fully interpolated output_file path after substitution and ensure
it remains contained within the trusted template base, alongside the existing
per-value checks. Use the _original_output_file template and the interpolation
logic around _validate_output_file_input_values to locate the fix.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Task.output_file templating so untrusted crew.kickoff(inputs=...) values can’t inject path traversal or absolute paths when interpolated into an output_file template.

Changes:

  • Adds runtime validation for values interpolated into Task.output_file templates to block traversal, absolute paths, shell expansion, and shell metacharacters.
  • Refactors shared literal-path checks into Task._sanitize_output_file_path.
  • Adds tests covering unsafe/safe output_file interpolation cases and updates vulnerability-scan ignore list with justification.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
lib/crewai/src/crewai/task.py Adds input-value validation for output_file interpolation and extracts shared sanitization logic.
lib/crewai/tests/test_task.py Adds regression tests to ensure interpolated output_file rejects malicious inputs and allows safe ones.
.github/workflows/vulnerability-scan.yml Ignores a newly reported transitive advisory with rationale in comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +586 to +589
if (
PurePosixPath(value).is_absolute()
or PureWindowsPath(value).is_absolute()
):
Comment on lines +939 to +942
("{p}", {"p": "~/.bashrc"}, "shell expansion"),
("{p}", {"p": "x;rm -rf /"}, "shell special characters"),
("{p}", {"p": r"C:\Windows\evil"}, "absolute paths"),
],
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants