feat: rule-validate + rule-test commands — closes #51#72
Conversation
Add two new CLI commands for rule YAML quality assurance: 1. — validates rule YAML files for typos, unknown keys, schema violations, and unparseable patterns. Exit codes: 0 (valid) / 1 (error) / 2 (warning + --strict). 2. — snapshot testing for rule YAML with inline / markers. Supports single file or directory input, output, flag. Also renames existing command to (the old name still works as a deprecated alias with stderr warning). Files changed: - scripts/commands/validate.py (deprecated alias, prints warning) - scripts/commands/registry_validate.py (new canonical name) - scripts/commands/rule_validate.py (new command) - scripts/commands/rule_test.py (new command) - scripts/rule_validator.py (validation logic, reusable) - scripts/rule_test_runner.py (test runner logic, reusable) - references/rule-schema.json (JSON Schema for rule YAML) - tests/rule_fixtures/ (10 migrated rules + 6 invalid fixtures + tests) - tests/test_rule_validate.py (23 tests) - tests/test_rule_test.py (27 tests) Closes #51
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Review - APPROVEReviewed full diff (35 files, +3023/-4). Excellent, comprehensive implementation that exceeds the issue spec. Strengths
Minor observations (non-blocking)
CI noteThe VerdictMerging. Model PR - comprehensive, well-tested, well-documented, forward-compatible with #46. Reviewed by BOS (orchestrate-workers skill) - diff read in full before approval. |
|



What changed
Adds two new CLI commands for rule YAML quality assurance, closing #51.
1.
codelens rule-validate <rule.yaml>Validates rule YAML files for typos, unknown keys, schema violations, and unparseable patterns. 4-stage validation pipeline:
id,message,severity,language), severity enum (critical|high|medium|low|info), unknown-key detection with typo suggestions (e.g.,pattern-eiter→pattern-either)pattern:string with tree-sitter to catch syntax errors (graceful skip if tree-sitter unavailable)pattern+patternsmutually exclusive;fixrequirespatternorpatternsExit codes:
0(valid) /1(error) /2(warning +--strict)2.
codelens rule-test <rule-path>Snapshot testing for rule YAML with inline
# ruleid: <id>/# okmarkers. Supports:--jsonoutput for CI--test-ignore-todoflag (skip# todoruleid:markers)python_security.yamlandjavascript_security.yaml3. Rename:
validate→registry-validateExisting
validatecommand renamed toregistry-validateto make room forrule-validate. The oldvalidatename still works as a deprecated alias — prints a one-line stderr warning and delegates to the same logic. Will be removed in a future release.Why
CodeLens currently silently skips malformed rules — there's no validation or test framework. Contributors can't catch typos (
pattern-eitervspattern-either), unknown keys, missing required fields, or unparseable patterns before a rule is loaded by the engine. This creates a false sense of security (the rule file exists, so the author thinks it works) and blocks rule ecosystem growth.Two worker reports independently proposed this (Opengrep #41, Semgrep CL-005/006 — see issue #51 for the worker consensus table).
Files touched
New files (8):
scripts/commands/registry_validate.py— renamed validate command (canonical name)scripts/commands/rule_validate.py— new command handlerscripts/commands/rule_test.py— new command handlerscripts/rule_validator.py— validation logic (reusable, 4-stage pipeline)scripts/rule_test_runner.py— test runner logic (reusable, marker parser)references/rule-schema.json— JSON Schema for rule YAMLtests/test_rule_validate.py— 23 teststests/test_rule_test.py— 27 testsModified files (1):
scripts/commands/validate.py— now a deprecated alias (prints stderr warning, delegates tovalidate_registry)New fixtures (22 files in
tests/rule_fixtures/):.yamlfiles (5 Python: sql-injection, command-injection, path-traversal, ssrf, xss-template; 5 JavaScript: xss-dom, sql-injection, command-injection, path-traversal, prototype-pollution).test.yamlfiles (4 samples each: 2 positive + 2 negative)Verification
Test results
Full test suite (excluding slow integration tests): 604 passed, 76 skipped — no regressions.
Manual test demos
rule-validate on valid rule (exit 0):
rule-validate on malformed YAML (exit 1):
rule-validate on typo (exit 0 with warnings, exit 2 with --strict):
rule-test on fixture directory (10/10 rules pass, 40/40 samples):
Deprecated
validatealias (prints warning, still works):Architecture decisions
Logic separated from command handlers —
rule_validator.pyandrule_test_runner.pycontain all reusable logic (dataclasses, validation pipeline, marker parser). The command files (rule_validate.py,rule_test.py) are thin CLI wrappers. This enables reuse by pre-commit hooks, CI pipelines, and programmatic callers.Unknown keys are warnings, not errors — CodeLens may add new rule fields in the future. Flagging unknown keys as warnings (rather than errors) avoids breaking existing rule files when the allowlist hasn't been updated yet. Use
--strictto treat them as errors.Pattern parseability is a warning — tree-sitter pattern syntax is close to (but not identical to) the target language. Metavariables like
$Xcan trigger false-positive parse errors. The check is a warning, not a hard error, and is gracefully skipped when tree-sitter is unavailable.__test__ = Falseon dataclasses —TestFailureandTestResultclass names start withTest, which would cause pytest to try collecting them as test classes. The__test__ = Falseattribute suppresses this.Test fixtures use table name
accountsinstead ofusers— the regex-based semantic engine does naive substring matching (var_name in stripped_line). The variable nameuseris a substring ofusers, causing false positives on negative test samples. Usingaccountsavoids this without changing the test semantics.Pre-commit hook integration
To add
rule-validateas a pre-commit hook, add this to.pre-commit-config.yaml(not included in this PR since no such file exists yet in the repo):Closes
Closes #51