fix(Security/AnnualReports): repair stale UpdateSources parser (0 → 390 reports)#1387

Open
christauff wants to merge 1 commit into
danielmiessler:mainfrom
christauff:fix/annual-reports-stale-parser

@christauff

Contributor

Summary

The AnnualReports skill (Security pack) is non-functional as shipped: its UpdateSources tool parses 0 reports from the upstream data source and then crashes on write, so the companion ListSources and FetchReport tools have no sources.json to read. This PR repairs UpdateSources.ts — verified end-to-end at 0 → 390 reports.

What a user hits today

$ bun run Packs/Security/src/AnnualReports/Tools/UpdateSources.ts
📥 Fetching upstream README...
✅ Fetched 253071 bytes
  Total parsed: 0            # ← every category empty
❌ Error: ENOENT ... open '.../AnnualReports/Data/sources.json'

The skill ships no seed Data/sources.json, so ListSources/FetchReport also ENOENT until an update succeeds — which it never does. Net effect: the skill is dead on install.

Root cause — three independent defects

  1. Obsolete parser. parseMarkdownReports expected an old README layout with - Vendor: / - URL: sub-bullets. The upstream source (jacobdjwilson/awesome-annual-security-reports) now uses a single line per report:
    - [Vendor](vendor-url) - [Report Name](report-path) (year) - description
    The old pattern matches nothing → 0 reports.
  2. main() never merged parsed data. It only rewrote an existing file's timestamp (current.metadata.lastUpdated = ...) and never wrote the parsed reports — so even a working parser produced no data.
  3. Missing directory. writeFileSync(SOURCES_PATH, ...) raised ENOENT whenever Data/ did not exist (i.e. every fresh install).
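As a rough illustration of defect 1, the one-line entry format quoted above can be matched with a single regular expression. This is only a sketch, not the code in this PR: `ReportEntry`, `ENTRY_RE`, and `parseEntryLine` are hypothetical names, and the year and description are treated as optional as an assumption.

```typescript
// Hypothetical shape for one parsed entry; field names are illustrative only.
interface ReportEntry {
  vendor: string;
  vendorUrl: string;
  name: string;
  path: string;
  year: number | null;
  description: string;
}

// Targets: - [Vendor](vendor-url) - [Report Name](report-path) (year) - description
// Assumption: "(year)" and "- description" may be missing on some lines.
const ENTRY_RE =
  /^\s*-\s+\[([^\]]+)\]\(([^)]+)\)\s+-\s+\[([^\]]+)\]\(([^)]+)\)(?:\s+\((\d{4})\))?(?:\s+-\s+(.*))?$/;

function parseEntryLine(line: string): ReportEntry | null {
  const m = ENTRY_RE.exec(line);
  if (!m) return null; // old "- Vendor:" / "- URL:" sub-bullets land here
  const [, vendor, vendorUrl, name, path, year, description] = m;
  return {
    vendor: vendor.trim(),
    vendorUrl: vendorUrl.trim(),
    name: name.trim(),
    path: path.trim(),
    year: year ? Number(year) : null,
    description: (description ?? "").trim(),
  };
}
```

A line in the old sub-bullet layout, such as `- Vendor: Verizon`, returns `null` here, which mirrors why a parser written for one layout matches nothing in the other.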

The fix (single file: UpdateSources.ts)

  • Rewrote parseMarkdownReports for the current one-line entry format. Section tracking now also resets on any non-Analysis/Survey ## header, so links in the ## Contents TOC and ## Resources sections can never be mistaken for report entries.
  • main() now rebuilds sources.json from the parsed data (metadata + analysis/survey categories).
  • Added mkdirSync(dirname(SOURCES_PATH), { recursive: true }) before write.
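The section-tracking idea in the first bullet can be sketched as a small state machine over the README lines. This is a minimal sketch under stated assumptions, not the PR's implementation: it assumes the report sections are `##` headers whose titles start with "Analysis" or "Survey" and that categories are `###` headers beneath them; `collectEntryLines` and `Categorized` are hypothetical names.

```typescript
type Section = "analysis" | "survey" | null;

// Hypothetical container: category name -> raw entry lines, per section.
interface Categorized {
  analysis: Record<string, string[]>;
  survey: Record<string, string[]>;
}

function collectEntryLines(markdown: string): Categorized {
  const out: Categorized = { analysis: {}, survey: {} };
  let section: Section = null;
  let category: string | null = null;

  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();

    // Any "## " header switches section; unrelated ones (Contents, Resources) reset it to null.
    const h2 = /^##\s+(.+)$/.exec(line);
    if (h2) {
      const title = h2[1].toLowerCase();
      section = title.startsWith("analysis")
        ? "analysis"
        : title.startsWith("survey")
          ? "survey"
          : null;
      category = null;
      continue;
    }

    // Assumed: "### " headers name the category inside a section.
    const h3 = /^###\s+(.+)$/.exec(line);
    if (h3) {
      category = h3[1].trim();
      continue;
    }

    // Only collect list links while inside a tracked section and category.
    if (section && category && line.startsWith("- [")) {
      const bucket = out[section][category] ?? (out[section][category] = []);
      bucket.push(line);
    }
  }
  return out;
}
```

Resetting `section` on every unrelated `##` header is what keeps list links outside the report sections from being collected, even when those sections have their own sub-headers.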

Imports extended with mkdirSync (fs) and dirname (path). No changes were needed to ListSources.ts or FetchReport.ts — they were correct and simply had no data to consume.
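For readers unfamiliar with the write-side fix, the pattern is to create the parent directory before writing. The sketch below is illustrative only: `writeSources`, the shape of the JSON payload, and the temp-directory demo are assumptions, while `mkdirSync` with `{ recursive: true }` and `dirname` are the calls named in this PR.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: ensure the parent directory exists, then write the JSON.
function writeSources(sourcesPath: string, data: unknown): void {
  // recursive: true creates missing parents and does not throw if the directory already exists
  mkdirSync(dirname(sourcesPath), { recursive: true });
  writeFileSync(sourcesPath, JSON.stringify(data, null, 2) + "\n");
}

// Demo against a fresh temp directory where Data/ does not exist yet,
// which is the situation a fresh install is in.
const demoPath = join(
  mkdtempSync(join(tmpdir(), "annual-reports-")),
  "Data",
  "sources.json",
);

// Assumed payload shape (metadata plus analysis/survey categories); the real schema may differ.
writeSources(demoPath, {
  metadata: { lastUpdated: new Date().toISOString(), totalReports: 1 },
  categories: {
    analysis: { "Example Category": [{ vendor: "Example Vendor", name: "Example Report" }] },
    survey: {},
  },
});

const roundTrip = JSON.parse(readFileSync(demoPath, "utf8"));
```

Without the `mkdirSync` call, the `writeFileSync` in this demo would throw ENOENT, because the `Data/` directory is absent.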

Before / after

| | Before | After |
|---|---|---|
| Reports parsed | 0 | 390 |
| UpdateSources write | ENOENT crash | writes Data/sources.json |
| ListSources | ENOENT | 21 categories, counts render |
| FetchReport (live) | ENOENT | fetches + caches summary |

Testing

Run against the live upstream README, from a clean checkout of this branch:

$ bun run Packs/Security/src/AnnualReports/Tools/UpdateSources.ts
✅ Fetched 253071 bytes
  Total parsed: 390
✅ Updated sources.json
   Total reports: 390

$ bun run Packs/Security/src/AnnualReports/Tools/ListSources.ts
Total Reports: 390
  Global Threat Intelligence: 72 reports
  Industry Trends: 81 reports
  ...

$ bun run Packs/Security/src/AnnualReports/Tools/FetchReport.ts verizon "data breach"
📄 Found: Verizon - Data Breach Investigations Report
✅ Summary saved: .../Reports/verizon/data-breach-investigations-report-summary.md

Notes for reviewers

  • Scope: only Packs/Security/src/AnnualReports/Tools/UpdateSources.ts. The identical stale copies under Releases/v2.3–v2.5/ are frozen snapshots and are assumed to regenerate from Packs/src; left untouched.
  • Data/sources.json is intentionally not committed — it is generated at runtime from the live upstream source. If you'd prefer to ship a seed file so the skill works before the first UPDATE, I'm happy to add one.
  • The upstream data occasionally carries vendor-name typos (e.g. "Artic Wolf Labs", "Crowd Strike"); these are mirrored verbatim and are out of scope here.

🤖 Generated with Claude Code

The AnnualReports UPDATE workflow parses 0 reports and then crashes on
write against the current awesome-annual-security-reports README format,
leaving ListSources and FetchReport with no data to read.

- Rewrite parseMarkdownReports for the current one-line entry format
  `- [Vendor](url) - [Name](path) (year) - description`. The old parser
  expected removed `- Vendor:` / `- URL:` sub-bullets, so it matched 0
  entries. Section tracking now also resets on non-Analysis/Survey `##`
  headers so TOC and Resources links are never mistaken for reports.
- Actually rebuild sources.json from the parsed data. main() previously
  only bumped a timestamp on an (often nonexistent) file and never merged
  parsed reports.
- Create the Data/ directory before writing (writeFileSync raised ENOENT
  when Data/ was absent).

Verified end-to-end against the live upstream README: 0 -> 390 reports
parsed and written; ListSources and FetchReport (live fetch) functional.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@christauff

Contributor Author

Heads-up for triage: the failing claude-review check here is not related to this change. Its log shows "Could not fetch an OIDC token … Did you remember to add id-token: write" alongside an empty ANTHROPIC_API_KEY. That is the expected result for a PR opened from a fork, since GitHub withholds repo secrets and OIDC from pull_request events originating outside the base repo. It will fail identically for any external contributor's PR.

The change itself was verified end-to-end against the live upstream README from a clean checkout of this branch:

  • UpdateSources.ts: 0 → 390 reports parsed and written (previously it parsed 0 reports and then crashed with ENOENT on write)
  • ListSources.ts and FetchReport.ts (live fetch) both functional afterward

Happy to adjust anything — including adding a seed Data/sources.json if you'd prefer the skill to work before the first UPDATE run.
