Add one-shot Elasticsearch bulk export with ECS field mapping #7

Open
nikfot wants to merge 3 commits into main from feature/elasticsearch-export

nikfot commented May 16, 2026

Contributor

Summary

  • Adds export-es subcommand for one-shot bulk indexing of parsed vmstat data into Elasticsearch
  • Maps vmstat fields to ECS-compatible names for correlation with other observability data
  • Supports Elastic Cloud (--cloud-id) and self-managed clusters (--es-url)
  • --dry-run mode prints bulk NDJSON to stdout without needing elasticsearch-py
  • elasticsearch-py is a lazy optional dependency
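
A lazy optional dependency of this kind is often handled by importing the package only when the export actually runs. The sketch below is an illustration, not the code in this PR: the helper name `load_optional` and the wording of the error message are assumptions.

```python
import importlib


def load_optional(module_name: str = "elasticsearch"):
    """Import an optional dependency on first use (hypothetical helper).

    Nothing is imported at program start, so code paths that never call
    this function (for example a dry run) work without the package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Exit with a readable message instead of a traceback.
        raise SystemExit(
            f"export-es needs the optional '{module_name}' package. "
            f"Install it with: pip install {module_name} (or use --dry-run)."
        ) from None
```

Calling `load_optional()` only in the non-dry-run branch keeps `--dry-run` free of any Elasticsearch dependency.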

ECS Field Mapping

| vmstat | ECS field | Conversion |
| --- | --- | --- |
| time | @timestamp | ISO 8601 |
| us/sy/id/wa/st | system.cpu.*.pct | 0-100 -> 0.0-1.0 |
| free/swpd/inact/active | system.memory.*.bytes | KB -> bytes |
| bi/bo | system.diskio.*.bytes | KB -> bytes |
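
To make the conversions above concrete, here is a minimal sketch of a row-to-document mapper. The function name `to_ecs`, the input shape (a dict with a timezone-aware `time` value), and the sub-field names standing in for `*` (`user`, `iowait`, `read`, and so on) are assumptions for illustration; the PR only states the `system.cpu.*.pct`, `system.memory.*.bytes`, and `system.diskio.*.bytes` patterns.

```python
from datetime import datetime, timezone

# Assumed expansions of the "*" in the mapping; not confirmed by the PR.
CPU_FIELDS = {"us": "user", "sy": "system", "id": "idle", "wa": "iowait", "st": "steal"}
MEMORY_FIELDS = {"free": "free", "swpd": "swap_used", "inact": "inactive", "active": "active"}
DISKIO_FIELDS = {"bi": "read", "bo": "write"}

KB = 1024  # vmstat reports these columns in KB


def to_ecs(row: dict) -> dict:
    """Convert one parsed vmstat row into a flat, ECS-style document."""
    doc = {
        "@timestamp": row["time"].isoformat(),  # ISO 8601
        "event.module": "vmstat",
        "event.dataset": "system.vmstat",
    }
    for src, name in CPU_FIELDS.items():
        if src in row:
            doc[f"system.cpu.{name}.pct"] = row[src] / 100.0  # 0-100 -> 0.0-1.0
    for src, name in MEMORY_FIELDS.items():
        if src in row:
            doc[f"system.memory.{name}.bytes"] = row[src] * KB  # KB -> bytes
    for src, name in DISKIO_FIELDS.items():
        if src in row:
            doc[f"system.diskio.{name}.bytes"] = row[src] * KB  # KB -> bytes
    return doc


sample = {"time": datetime(2025, 8, 1, 12, 0, tzinfo=timezone.utc), "us": 25, "free": 2048, "bi": 4}
print(to_ecs(sample))
```

Columns missing from a row are skipped, so the same mapper would tolerate vmstat output that lacks, for example, the `inact`/`active` columns.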

Usage

# Dry run (no ES needed)
vmstat-visualizer export-es vmstat.log --dry-run --index vmstat-investigation

# Send to Elastic Cloud
vmstat-visualizer export-es vmstat.log --cloud-id CLOUD_ID --es-api-key KEY --index vmstat-2025.08

# Send to self-managed
vmstat-visualizer export-es vmstat.log --es-url https://localhost:9200 --es-api-key KEY
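
For readers wondering how the two connection styles might translate into client arguments, the sketch below builds keyword arguments for elasticsearch-py's `Elasticsearch` constructor, which accepts `cloud_id`, `hosts`, and `api_key`. The helper name `client_kwargs` and the rule that exactly one of `--cloud-id` or `--es-url` must be given are assumptions, not behavior confirmed by this PR.

```python
from typing import Optional


def client_kwargs(api_key: str, cloud_id: Optional[str] = None,
                  es_url: Optional[str] = None) -> dict:
    """Build constructor kwargs for either Elastic Cloud or a self-managed cluster."""
    # Assumed rule: the two targets are mutually exclusive.
    if bool(cloud_id) == bool(es_url):
        raise ValueError("pass exactly one of --cloud-id or --es-url")
    kwargs = {"api_key": api_key}
    if cloud_id:
        kwargs["cloud_id"] = cloud_id  # Elastic Cloud deployment
    else:
        kwargs["hosts"] = [es_url]  # self-managed cluster URL
    return kwargs
```

With elasticsearch-py installed, the result could be passed as `Elasticsearch(**client_kwargs("KEY", es_url="https://localhost:9200"))`.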

Design Decision

This is a one-shot, post-hoc exporter, not a live shipper. See the plan discussion for why a live vmstat wrapper was rejected: it would duplicate Metricbeat, contradict the tool's identity, and carry a high maintenance burden.

Test plan

  • Verified ECS doc structure with dry-run
  • KB-to-bytes and pct-to-decimal conversions verified
  • Clear error message when elasticsearch-py is missing
  • --dry-run works without any ES dependency
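
The dry-run check in this plan could be automated along the following lines. The Elasticsearch Bulk API expects newline-delimited JSON in which each document is preceded by an action line and the body ends with a newline. The function name `to_bulk_ndjson` is hypothetical, and this is a sketch of the format rather than the exporter's actual code.

```python
import json
from typing import Iterable


def to_bulk_ndjson(docs: Iterable[dict], index: str) -> str:
    """Render documents as a Bulk API body: one action line, then one source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))  # document source
    # The Bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""


body = to_bulk_ndjson([{"system.cpu.user.pct": 0.25}], "vmstat-investigation")
print(body, end="")
```

Only the standard library is needed here, which matches the goal of a dry run that works without elasticsearch-py or a running cluster.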

nikfot added 3 commits May 16, 2026 09:42
New 'export-es' subcommand parses a vmstat log file and bulk-indexes
it into Elasticsearch using ECS-compatible field names. Supports
Elastic Cloud (--cloud-id) and self-managed clusters (--es-url).

Includes --dry-run mode that prints the bulk NDJSON body to stdout
without requiring elasticsearch-py or a running cluster.

Fields are mapped to ECS conventions:
- KB values converted to bytes
- CPU percentages converted to 0.0-1.0 range
- Timestamps formatted as ISO 8601
- event.module=vmstat, event.dataset=system.vmstat