feat(seo): OG/Twitter tags, sitemap.xml, and Dataset structured data enhancements #5

Open
solrevdev wants to merge 3 commits into master from feat/seo-meta-sitemap-dataset
@solrevdev

Owner

Context

This PR continues a series of SEO improvements to the winget-search static GitHub Pages site at https://solrevdev.com/winget-search/.

Google Search Console status at time of PR

Both missing-field issues are now in the "validation started" state in Search Console, meaning Google is re-crawling to confirm the fixes from the previous commits:

  • Missing field 'license' — validation started (fix landed in 323a3b6)
  • Missing field 'creator' — validation started (fix landed in 323a3b6)

The screenshot below shows the Data sets report in Google Search Console confirming validation has started for both issues:

Google Search Console → solrevdev.com → Enhancements → Data sets

  • 0 Invalid, No critical issues
  • 1 Valid
  • "Improve item appearance": Missing field 'license' (Started) + Missing field 'creator' (Started)

These were non-critical warnings. The fixes have been live since 323a3b6 and the CI daily run at 02:00 UTC auto-stamps dateModified so Google always sees a fresh signal.


What this PR adds

1. Open Graph meta tags (index.html)

Adds og:type, og:site_name, og:title, og:description, and og:url. These improve how the page renders when shared on social platforms and may give crawlers some additional page metadata.

2. Twitter Card meta tags (index.html)

Adds twitter:card (summary), twitter:site / twitter:creator (@solrevdev), title, description.

3. Canonical link + sitemap link (index.html)

  • <link rel="canonical" href="https://solrevdev.com/winget-search/"> — prevents duplicate content issues.
  • <link rel="sitemap" type="application/xml" href="sitemap.xml"> — advertises the sitemap in-page.

4. sitemap.xml (new file)

Covers both pages:

  • https://solrevdev.com/winget-search/ (changefreq: daily, priority: 1.0)
  • https://solrevdev.com/winget-search/agent-access.html (changefreq: weekly, priority: 0.5)

CI stamps <lastmod> on every build so it stays current.
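
For reference, a minimal sketch of what a two-URL sitemap of this shape could look like, written from a shell heredoc so it can be tried locally. The URLs, changefreq, and priority values come from the list above; the function name write_sitemap, the temporary output directory, and the 2000-01-01 placeholder date are assumptions for illustration, not the contents of the committed file.

```shell
#!/bin/sh
# Sketch only: write a sitemap with the two pages described in this PR.
# Assumption: 2000-01-01 is a placeholder <lastmod> that CI would overwrite.
write_sitemap() {
  # $1 = output path (hypothetical; the real file sits next to index.html)
  cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://solrevdev.com/winget-search/</loc>
    <lastmod>2000-01-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://solrevdev.com/winget-search/agent-access.html</loc>
    <lastmod>2000-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
EOF
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
write_sitemap "$demo_dir/sitemap.xml"
grep -c '<loc>' "$demo_dir/sitemap.xml"   # prints 2
```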

Next step after merge: submit https://solrevdev.com/winget-search/sitemap.xml in Google Search Console → Sitemaps.

5. Dataset JSON-LD enhancements (index.html)

Three new fields added to the existing @type: Dataset block:

| Field | Value | Why |
| --- | --- | --- |
| sameAs | https://github.com/solrevdev/winget-search | Links the canonical GitHub source for the project |
| numberOfItems | 0 (placeholder; CI stamps actual count) | Concrete size signal for Google Dataset Search |
| variableMeasured | ["packageId", "name", "version", "publisher"] | Describes the fields in packages.json for richer Dataset results |

6. CI workflow: numberOfItems + sitemap lastmod stamping

The build step now:

# Stamp numberOfItems with actual package count
PACKAGE_COUNT=$(jq '.metadata.total // length' packages.json)
sed -i "s/\"numberOfItems\": [0-9]*/\"numberOfItems\": ${PACKAGE_COUNT}/" deploy/index.html

# Stamp sitemap lastmod
sed -i "s/<lastmod>[^<]*<\/lastmod>/<lastmod>$(date -u +%Y-%m-%d)<\/lastmod>/g" deploy/sitemap.xml
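
One caveat with the stamps above: if the count ever comes back empty, the first sed would write `"numberOfItems": ` with no value and leave the JSON-LD invalid. A guarded variant might look like the sketch below. The function names stamp_count and stamp_lastmod are hypothetical, and the sketch writes through a temporary file instead of `sed -i` purely so it behaves the same on GNU and BSD sed; it is not what the workflow currently runs.

```shell
#!/bin/sh
# Sketch of a guarded stamping step. Function names are hypothetical.

# stamp_count FILE COUNT: replace the numeric numberOfItems value in FILE.
# Refuses to run unless COUNT is a non-empty string of digits.
stamp_count() {
  case "$2" in
    ''|*[!0-9]*)
      echo "refusing to stamp non-numeric count: '$2'" >&2
      return 1
      ;;
  esac
  sed "s/\"numberOfItems\": [0-9]*/\"numberOfItems\": $2/" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# stamp_lastmod FILE DATE: set every <lastmod> in FILE to DATE (YYYY-MM-DD).
stamp_lastmod() {
  sed "s|<lastmod>[^<]*</lastmod>|<lastmod>$2</lastmod>|g" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Demo on throwaway files rather than the real deploy/ directory.
work=$(mktemp -d)
printf '%s\n' '  "numberOfItems": 0,' > "$work/index.html"
printf '%s\n' '<lastmod>2000-01-01</lastmod>' > "$work/sitemap.xml"

stamp_count "$work/index.html" 1234
stamp_lastmod "$work/sitemap.xml" "$(date -u +%Y-%m-%d)"
cat "$work/index.html"    # prints:   "numberOfItems": 1234,
```

In the workflow this would be called as `stamp_count deploy/index.html "$PACKAGE_COUNT"`, so a failed jq lookup stops the build instead of publishing a broken Dataset block.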

Full Dataset JSON-LD state after merge

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Winget Package Search catalog",
  "description": "Current package metadata extracted from microsoft/winget-pkgs.",
  "url": "https://solrevdev.com/winget-search/",
  "isBasedOn": "https://github.com/microsoft/winget-pkgs",
  "keywords": ["winget", "Windows Package Manager", "package search", "software catalog", "Windows software", "package manager"],
  "inLanguage": "en",
  "temporalCoverage": "2025-05-29/..",
  "datePublished": "2025-05-29",
  "dateModified": "<stamped-by-CI>",
  "creator": { "@type": "Person", "name": "John Smith", "url": "https://solrevdev.com/about/" },
  "license": "https://spdx.org/licenses/MIT.html",
  "sameAs": "https://github.com/solrevdev/winget-search",
  "numberOfItems": "<stamped-by-CI>",
  "variableMeasured": ["packageId", "name", "version", "publisher"],
  "distribution": {
    "@type": "DataDownload",
    "encodingFormat": "application/json",
    "contentUrl": "https://solrevdev.com/winget-search/packages.json"
  }
}

Remaining SEO opportunities (future work)

  • og:image — a social preview image would complete the OG card
  • Submit sitemap.xml to Google Search Console manually (Sitemaps tab)
  • robots.txt with Sitemap: directive (none exists yet)
  • Consider a citation field in the Dataset block if the data is referenced externally
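
For the robots.txt item, a minimal sketch of what the file could contain, again written from a heredoc. Note that crawlers only read robots.txt from the host root, so this would have to be served at https://solrevdev.com/robots.txt rather than under /winget-search/; whether that root is published from this repository is an open assumption. The function name write_robots and the permissive `Allow: /` rule are illustrative choices.

```shell
#!/bin/sh
# write_robots FILE: write a permissive robots.txt that advertises the sitemap.
# Hypothetical helper; the rules shown are an assumption, not an existing file.
write_robots() {
  cat > "$1" <<'EOF'
User-agent: *
Allow: /

Sitemap: https://solrevdev.com/winget-search/sitemap.xml
EOF
}

# Demo in a throwaway directory.
out=$(mktemp -d)
write_robots "$out/robots.txt"
grep '^Sitemap:' "$out/robots.txt"
```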

Test plan

  • Merge and confirm CI build passes + pages-build-deployment succeeds
  • Verify live structured data: curl -s https://solrevdev.com/winget-search/ | grep -A5 'numberOfItems'
  • Verify numberOfItems is non-zero (CI-stamped from packages.json count)
  • Verify sitemap at https://solrevdev.com/winget-search/sitemap.xml has today's lastmod
  • Check OG tags render correctly with https://developers.facebook.com/tools/debug/
  • Check Dataset structured data with https://search.google.com/test/rich-results
  • Submit sitemap.xml in Google Search Console → Sitemaps
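
The numberOfItems checks in this plan can be scripted. A sketch, assuming the stamped value appears in the served HTML as `"numberOfItems": <digits>` (the same shape the CI sed expects); the function names number_of_items and check_stamped are hypothetical, and the demo uses inline fixtures so it runs offline.

```shell
#!/bin/sh
# number_of_items: read HTML on stdin, print the first stamped numberOfItems value.
number_of_items() {
  grep -o '"numberOfItems": *[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# check_stamped: succeed only if a value is present and greater than zero.
check_stamped() {
  n=$(number_of_items) || return 1
  [ -n "$n" ] && [ "$n" -gt 0 ]
}

# Offline demo. Against the live site this would be:
#   curl -s https://solrevdev.com/winget-search/ | check_stamped
printf '%s\n' '"numberOfItems": 8123,' | check_stamped && echo "stamped"
printf '%s\n' '"numberOfItems": 0,' | check_stamped || echo "still placeholder"
```

The demo prints "stamped" for the first fixture and "still placeholder" for the second, which is the distinction the third test-plan item asks for.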

🤖 Generated with Claude Code

solrevdev and others added 2 commits June 24, 2026 10:12
- Add canonical link and sitemap link tags to <head>
- Add Open Graph meta tags (type, site_name, title, description, url)
- Add Twitter Card meta tags (summary card, @solrevdev)
- Add sitemap.xml covering index and agent-access pages (daily/weekly)
- Add sameAs, numberOfItems, variableMeasured to Dataset JSON-LD
- CI now stamps numberOfItems from jq package count and sitemap lastmod

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@solrevdev

Owner Author

Implemented the og:image work in commit 50931e2 (feat(seo): add social preview image).

Preview from this PR branch:

[Image: Winget Package Search social preview]

What changed:

  • Added og-image.svg as the editable 1200x630 source asset.
  • Generated og-image.png as the social-scraper target. PNG is used for og:image because it is broadly supported by social platforms.
  • Updated index.html with og:image, dimensions, image type, alt text, and Twitter summary_large_image metadata.
  • Updated the GitHub Pages build workflow to publish both og-image.png and og-image.svg.

Validation performed locally:

  • xmllint --noout og-image.svg passes.
  • rsvg-convert --width 1200 --height 630 --output og-image.png og-image.svg generated the PNG.
  • file og-image.png reports PNG image data, 1200 x 630.
  • Local static serving returned HTTP 200 with Content-type: image/png for /og-image.png.
  • Playwright CLI opened the image successfully as og-image.png (1200×630).

After merge and Pages deployment, the expected live image URL is:
https://solrevdev.com/winget-search/og-image.png

Suggested post-deploy checks:

  • Open https://solrevdev.com/winget-search/og-image.png directly.
  • Re-scrape https://solrevdev.com/winget-search/ in the Facebook Sharing Debugger and any Twitter/X card validator available to confirm the card refreshes.

@solrevdev

Owner Author

Follow-up fix: the first preview URL rendered as a Git LFS pointer because this repo tracks *.png through LFS by default.

Fixed in commit 949eec2 (fix(seo): store og image outside lfs):

  • Added a narrow .gitattributes exception for /og-image.png.
  • Re-added og-image.png as a normal Git blob, not an LFS pointer.

Verified the same raw URL now serves the image correctly:

  • content-type: image/png
  • content-length: 94651
  • first bytes are the PNG header (89504e47...)
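
The byte-level check can be reproduced offline before pushing. A sketch, relying on two known formats: a PNG file starts with the eight-byte signature 89 50 4e 47 0d 0a 1a 0a, and a Git LFS pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The function name classify_image and the fixture files are hypothetical.

```shell
#!/bin/sh
# classify_image FILE: print "png", "lfs-pointer", or "unknown"
# based on the leading bytes of FILE.
classify_image() {
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$sig" = "89504e470d0a1a0a" ]; then
    echo "png"
  elif head -n 1 "$1" | grep -q '^version https://git-lfs.github.com/spec/v1'; then
    echo "lfs-pointer"
  else
    echo "unknown"
  fi
}

# Demo with tiny fixtures: a bare PNG signature and an LFS pointer header.
tmp=$(mktemp -d)
printf '\211PNG\r\n\032\n' > "$tmp/real.png"
printf 'version https://git-lfs.github.com/spec/v1\n' > "$tmp/pointer.png"
classify_image "$tmp/real.png"      # prints png
classify_image "$tmp/pointer.png"   # prints lfs-pointer
```

Running `classify_image og-image.png` in the checkout, or on a file fetched from the raw URL, would have caught the pointer being served in place of the image.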

Preview should now render here:

[Image: Winget Package Search social preview]
