diff --git a/docs/PLUGIN_DOC.md b/docs/PLUGIN_DOC.md
index 6e9f8a99..0ca1366f 100644
--- a/docs/PLUGIN_DOC.md
+++ b/docs/PLUGIN_DOC.md
@@ -10,7 +10,7 @@
| DeviceEnumerationPlugin | powershell -Command "(Get-WmiObject -Class Win32_Processor | Measure-Object).Count"
lspci -d {vendorid_ep}: | grep -iE 'VGA|Display|3D|Processing accelerators|Co-processor|Accelerator' | grep -vi 'Virtual Function' | wc -l
powershell -Command "(wmic path win32_VideoController get name | findstr AMD | Measure-Object).Count"
lscpu
lshw
lspci -d {vendorid_ep}: | grep -i 'Virtual Function' | wc -l
powershell -Command "(Get-VMHostPartitionableGpu | Measure-Object).Count" | **Analyzer Args:**
- `cpu_count`: Optional[list[int]] — Expected CPU count(s); pass as int or list of ints. Analysis passes if actual is in list.
- `gpu_count`: Optional[list[int]] — Expected GPU count(s); pass as int or list of ints. Analysis passes if actual is in list.
- `vf_count`: Optional[list[int]] — Expected virtual function count(s); pass as int or list of ints. Analysis passes if actual is in list. | - | [DeviceEnumerationDataModel](#DeviceEnumerationDataModel-Model) | [DeviceEnumerationCollector](#Collector-Class-DeviceEnumerationCollector) | [DeviceEnumerationAnalyzer](#Data-Analyzer-Class-DeviceEnumerationAnalyzer) |
| DimmPlugin | sh -c 'dmidecode -t 17 | tr -s " " | grep -v "Volatile\|None\|Module" | grep Size' 2>/dev/null
dmidecode
wmic memorychip get Capacity | - | **Collection Args:**
- `skip_sudo`: bool — If True, do not use sudo when running dmidecode or wmic for memory info. | [DimmDataModel](#DimmDataModel-Model) | [DimmCollector](#Collector-Class-DimmCollector) | - |
| DkmsPlugin | dkms status
dkms --version | **Analyzer Args:**
- `dkms_status`: Union[str, list] — Expected dkms status string(s) to match (e.g. 'amd/1.0.0'). At least one of dkms_status or dkms_version required.
- `dkms_version`: Union[str, list] — Expected dkms version string(s) to match. At least one of dkms_status or dkms_version required.
- `regex_match`: bool — If True, match dkms_status and dkms_version as regex; otherwise exact match. | - | [DkmsDataModel](#DkmsDataModel-Model) | [DkmsCollector](#Collector-Class-DkmsCollector) | [DkmsAnalyzer](#Data-Analyzer-Class-DkmsAnalyzer) |
-| DmesgPlugin | dmesg --time-format iso -x
ls -1 /var/log/dmesg* 2>/dev/null | grep -E '^/var/log/dmesg(\.[0-9]+(\.gz)?)?$' || true | **Built-in Regexes:**
- Out of memory error: `(?:oom_kill_process.*)|(?:Out of memory.*)`
- I/O Page Fault: `IO_PAGE_FAULT`
- Kernel Panic: `\bkernel panic\b.*`
- SQ Interrupt: `sq_intr`
- SRAM ECC: `sram_ecc.*`
- Failed to load driver. IP hardware init error.: `\[amdgpu\]\] \*ERROR\* hw_init of IP block.*`
- Failed to load driver. IP software init error.: `\[amdgpu\]\] \*ERROR\* sw_init of IP block.*`
- Real Time throttling activated: `sched: RT throttling activated.*`
- RCU preempt detected stalls: `rcu_preempt detected stalls.*`
- RCU preempt self-detected stall: `rcu_preempt self-detected stall.*`
- QCM fence timeout: `qcm fence wait loop timeout.*`
- General protection fault: `(?:[\w-]+(?:\[[0-9.]+\])?\s+)?general protectio...`
- Segmentation fault: `(?:segfault.*in .*\[)|(?:[Ss]egmentation [Ff]au...`
- Failed to disallow cf state: `amdgpu: Failed to disallow cf state.*`
- Failed to terminate tmr: `\*ERROR\* Failed to terminate tmr.*`
- Suspend of IP block failed: `\*ERROR\* suspend of IP block <\w+> failed.*`
- amdgpu Page Fault: `(amdgpu \w{4}:\w{2}:\w{2}\.\w:\s+amdgpu:\s+\[\S...`
- Page Fault: `page fault for address.*`
- Fatal error during GPU init: `(?:amdgpu)(.*Fatal error during GPU init)|(Fata...`
- PCIe AER Error Status: `(pcieport [\w:.]+: AER: aer_status:[^\n]*(?:\n[...`
- PCIe AER Correctable Error Status: `(.*aer_cor_status: 0x[0-9a-fA-F]+, aer_cor_mask...`
- PCIe AER Uncorrectable Error Status: `(.*aer_uncor_status: 0x[0-9a-fA-F]+, aer_uncor_...`
- PCIe AER Uncorrectable Error Severity with TLP Header: `(.*aer_uncor_severity: 0x[0-9a-fA-F]+.*)(\n.*TL...`
- Failed to read journal file: `Failed to read journal file.*`
- Journal file corrupted or uncleanly shut down: `journal corrupted or uncleanly shut down.*`
- ACPI BIOS Error: `ACPI BIOS Error`
- ACPI Error: `ACPI Error`
- Filesystem corrupted!: `EXT4-fs error \(device .*\):`
- Error in buffered IO, check filesystem integrity: `(Buffer I\/O error on dev)(?:ice)? (\w+)`
- PCIe card no longer present: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(Slot\(...`
- PCIe Link Down: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(Slot\(...`
- Mismatched clock configuration between PCIe device and host: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(curren...`
- RAS Correctable Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Uncorrectable Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Deferred Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Corrected PCIe Error: `((?:\[Hardware Error\]:\s+)?event severity: cor...`
- GPU Reset: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- GPU reset failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
- ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
- MCE Error: `\[Hardware Error\]:.+MC\d+_STATUS.*(?:\n.*){0,5}`
- Mode 2 Reset Failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)? (...`
- RAS Corrected Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- SGX Error: `x86/cpu: SGX disabled by BIOS`
- MMP Error: `Failed to load MMP firmware qat_4xxx_mmp.bin`
- GPU Throttled: `amdgpu \w{4}:\w{2}:\w{2}.\w: amdgpu: WARN: GPU ...`
- RAS Poison Consumed: `amdgpu[ 0-9a-fA-F:.]+:(?:\s*amdgpu:)?\s+(?:{\d+...`
- RAS Poison created: `amdgpu[ 0-9a-fA-F:.]+:(?:\s*amdgpu:)?\s+(?:{\d+...`
- Bad page threshold exceeded: `(amdgpu: Saved bad pages (\d+) reaches threshol...`
- RAS Hardware Error: `Hardware error from APEI Generic Hardware Error...`
- Error Address: `Error Address.*(?:\s.*)`
- RAS EDR Event: `EDR: EDR event received`
- DPC Event: `DPC: .*`
- LNet: ko2iblnd has no matching interfaces: `(?:\[[^\]]+\]\s*)?LNetError:.*ko2iblnd:\s*No ma...`
- LNet: Error starting up LNI: `(?:\[[^\]]+\]\s*)?LNetError:\s*.*Error\s*-?\d+\...`
- Lustre: network initialisation failed: `LustreError:.*ptlrpc_init_portals\(\).*network ...` | **Collection Args:**
- `collect_rotated_logs`: bool — If True, also collect rotated dmesg log files from /var/log/dmesg*.
- `skip_sudo`: bool — If True, do not use sudo when running dmesg or listing log files.
- `log_dmesg_data`: bool — If True, log the collected dmesg output in artifacts. | [DmesgData](#DmesgData-Model) | [DmesgCollector](#Collector-Class-DmesgCollector) | [DmesgAnalyzer](#Data-Analyzer-Class-DmesgAnalyzer) |
+| DmesgPlugin | dmesg --time-format iso -x
ls -1 /var/log/dmesg* 2>/dev/null | grep -E '^/var/log/dmesg(\.[0-9]+(\.gz)?)?$' || true | **Built-in Regexes:**
- Out of memory error: `(?:oom_kill_process.*)|(?:Out of memory.*)`
- I/O Page Fault: `IO_PAGE_FAULT`
- Kernel Panic: `\bkernel panic\b.*`
- SQ Interrupt: `sq_intr`
- SRAM ECC: `sram_ecc.*`
- Failed to load driver. IP hardware init error.: `\[amdgpu\]\] \*ERROR\* hw_init of IP block.*`
- Failed to load driver. IP software init error.: `\[amdgpu\]\] \*ERROR\* sw_init of IP block.*`
- Real Time throttling activated: `sched: RT throttling activated.*`
- RCU preempt detected stalls: `rcu_preempt detected stalls.*`
- RCU preempt self-detected stall: `rcu_preempt self-detected stall.*`
- QCM fence timeout: `qcm fence wait loop timeout.*`
- General protection fault: `(?:[\w-]+(?:\[[0-9.]+\])?\s+)?general protectio...`
- Segmentation fault: `(?:segfault.*in .*\[)|(?:[Ss]egmentation [Ff]au...`
- Failed to disallow cf state: `amdgpu: Failed to disallow cf state.*`
- Failed to terminate tmr: `\*ERROR\* Failed to terminate tmr.*`
- Suspend of IP block failed: `\*ERROR\* suspend of IP block <\w+> failed.*`
- amdgpu Page Fault: `(amdgpu \w{4}:\w{2}:\w{2}\.\w:\s+amdgpu:\s+\[\S...`
- Page Fault: `page fault for address.*`
- Fatal error during GPU init: `(?:amdgpu)(.*Fatal error during GPU init)|(Fata...`
- PCIe AER Error Status: `(pcieport [\w:.]+: AER: aer_status:[^\n]*(?:\n[...`
- PCIe AER Correctable Error Status: `(.*aer_cor_status: 0x[0-9a-fA-F]+, aer_cor_mask...`
- PCIe AER Uncorrectable Error Status: `(.*aer_uncor_status: 0x[0-9a-fA-F]+, aer_uncor_...`
- PCIe AER Uncorrectable Error Severity with TLP Header: `(.*aer_uncor_severity: 0x[0-9a-fA-F]+.*)(\n.*TL...`
- Failed to read journal file: `Failed to read journal file.*`
- Journal file corrupted or uncleanly shut down: `journal corrupted or uncleanly shut down.*`
- ACPI BIOS Error: `ACPI BIOS Error`
- ACPI Error: `ACPI Error`
- Filesystem corrupted!: `EXT4-fs error \(device .*\):`
- Error in buffered IO, check filesystem integrity: `(Buffer I\/O error on dev)(?:ice)? (\w+)`
- PCIe card no longer present: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(Slot\(...`
- PCIe Link Down: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(Slot\(...`
- Mismatched clock configuration between PCIe device and host: `pcieport (\w+:\w+:\w+\.\w+):\s+(\w+):\s+(curren...`
- RAS Correctable Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Uncorrectable Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Deferred Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- RAS Corrected PCIe Error: `((?:\[Hardware Error\]:\s+)?event severity: cor...`
- GPU Reset: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- GPU reset failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
- ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
- MCE Corrected Error: `\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|CE\|...`
- MCE Uncorrected Error: `\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|UC\|...`
- Mode 2 Reset Failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)? (...`
- RAS Corrected Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- SGX Error: `x86/cpu: SGX disabled by BIOS`
- MMP Error: `Failed to load MMP firmware qat_4xxx_mmp.bin`
- GPU Throttled: `amdgpu \w{4}:\w{2}:\w{2}.\w: amdgpu: WARN: GPU ...`
- RAS Poison Consumed: `amdgpu[ 0-9a-fA-F:.]+:(?:\s*amdgpu:)?\s+(?:{\d+...`
- RAS Poison created: `amdgpu[ 0-9a-fA-F:.]+:(?:\s*amdgpu:)?\s+(?:{\d+...`
- Bad page threshold exceeded: `(amdgpu: Saved bad pages (\d+) reaches threshol...`
- RAS Hardware Error: `Hardware error from APEI Generic Hardware Error...`
- Error Address: `Error Address.*(?:\s.*)`
- RAS EDR Event: `EDR: EDR event received`
- DPC Event: `DPC: .*`
- LNet: ko2iblnd has no matching interfaces: `(?:\[[^\]]+\]\s*)?LNetError:.*ko2iblnd:\s*No ma...`
- LNet: Error starting up LNI: `(?:\[[^\]]+\]\s*)?LNetError:\s*.*Error\s*-?\d+\...`
- Lustre: network initialisation failed: `LustreError:.*ptlrpc_init_portals\(\).*network ...` | **Collection Args:**
- `collect_rotated_logs`: bool — If True, also collect rotated dmesg log files from /var/log/dmesg*.
- `skip_sudo`: bool — If True, do not use sudo when running dmesg or listing log files.
- `log_dmesg_data`: bool — If True, log the collected dmesg output in artifacts. | [DmesgData](#DmesgData-Model) | [DmesgCollector](#Collector-Class-DmesgCollector) | [DmesgAnalyzer](#Data-Analyzer-Class-DmesgAnalyzer) |
| FabricsPlugin | lspci | grep -i cassini
lsmod | grep cxi
cxi_stat
ibstat
ibv_devinfo
ls -l /sys/class/infiniband/*/device/net
fi_info -p cxi
mst start
mst status -v
ip link show
ofed_info -s | - | - | [FabricsDataModel](#FabricsDataModel-Model) | [FabricsCollector](#Collector-Class-FabricsCollector) | - |
| JournalPlugin | journalctl --no-pager --system --output=short-iso
journalctl --no-pager --system --output=json | **Analyzer Args:**
- `analysis_range_start`: Optional[datetime.datetime] — Start of time range for analysis (ISO format). Only events on or after this time are analyzed.
- `analysis_range_end`: Optional[datetime.datetime] — End of time range for analysis (ISO format). Only events before this time are analyzed.
- `check_priority`: Optional[int] — Check against journal log priority (0=emergency..7=debug). If an entry has priority <= check_priority, an ERROR event...
- `group`: bool — If True, group entries that have the same priority and message. | **Collection Args:**
- `boot`: Optional[int] — Optional boot ID to limit journal collection to a specific boot. | [JournalData](#JournalData-Model) | [JournalCollector](#Collector-Class-JournalCollector) | [JournalAnalyzer](#Data-Analyzer-Class-JournalAnalyzer) |
| KernelPlugin | sh -c 'uname -a'
sh -c 'cat /proc/sys/kernel/numa_balancing'
wmic os get Version /Value | **Analyzer Args:**
- `exp_kernel`: Union[str, list] — Expected kernel version string(s) to match (e.g. from uname -a).
- `exp_numa`: Optional[int] — Expected value for kernel.numa_balancing (e.g. 0 or 1).
- `regex_match`: bool — If True, match exp_kernel as regex; otherwise exact match. | - | [KernelDataModel](#KernelDataModel-Model) | [KernelCollector](#Collector-Class-KernelCollector) | [KernelAnalyzer](#Data-Analyzer-Class-KernelAnalyzer) |
@@ -21,14 +21,14 @@
| NvmePlugin | nvme smart-log {dev}
nvme error-log {dev} --log-entries=256
nvme id-ctrl {dev}
nvme id-ns {dev}{ns}
nvme fw-log {dev}
nvme self-test-log {dev}
nvme get-log {dev} --log-id=6 --log-len=512
nvme telemetry-log {dev} --output-file={dev}_{f_name}
nvme list -o json | - | - | [NvmeDataModel](#NvmeDataModel-Model) | [NvmeCollector](#Collector-Class-NvmeCollector) | - |
| OsPlugin | sh -c '( lsb_release -ds || (cat /etc/*release | grep PRETTY_NAME) || uname -om ) 2>/dev/null | head -n1'
cat /etc/*release | grep VERSION_ID
wmic os get Version /value
wmic os get Caption /Value | **Analyzer Args:**
- `exp_os`: Union[str, list] — Expected OS name/version string(s) to match (e.g. from lsb_release or /etc/os-release).
- `exact_match`: bool — If True, require exact match for exp_os; otherwise substring match. | - | [OsDataModel](#OsDataModel-Model) | [OsCollector](#Collector-Class-OsCollector) | [OsAnalyzer](#Data-Analyzer-Class-OsAnalyzer) |
| PackagePlugin | dnf list --installed
dpkg-query -W
pacman -Q
cat /etc/*release
wmic product get name,version | **Analyzer Args:**
- `exp_package_ver`: Dict[str, Optional[str]] — Map package name -> expected version (None = any version). Checked against installed packages.
- `regex_match`: bool — If True, match package versions with regex; otherwise exact or prefix match.
- `rocm_regex`: Optional[str] — Optional regex to identify ROCm package version (used when enable_rocm_regex is True).
- `enable_rocm_regex`: bool — If True, use rocm_regex (or default pattern) to extract ROCm version for checks. | - | [PackageDataModel](#PackageDataModel-Model) | [PackageCollector](#Collector-Class-PackageCollector) | [PackageAnalyzer](#Data-Analyzer-Class-PackageAnalyzer) |
-| PciePlugin | lspci -d {vendor_id}: -nn
lspci -x
lspci -xxxx
lspci -PP
lspci -PP -d {vendor_id}:{dev_id}
lspci -vvv
lspci -vvvt | **Analyzer Args:**
- `exp_speed`: int — Expected PCIe link speed (generation 1–5).
- `exp_width`: int — Expected PCIe link width in lanes (1–16).
- `exp_sriov_count`: int — Expected SR-IOV virtual function count.
- `exp_gpu_count_override`: Optional[int] — Override expected GPU count for validation.
- `exp_max_payload_size`: Union[Dict[int, int], int, NoneType] — Expected max payload size: int for all devices, or dict keyed by device ID.
- `exp_max_rd_req_size`: Union[Dict[int, int], int, NoneType] — Expected max read request size: int for all devices, or dict keyed by device ID.
- `exp_ten_bit_tag_req_en`: Union[Dict[int, int], int, NoneType] — Expected 10-bit tag request enable: int for all devices, or dict keyed by device ID. | - | [PcieDataModel](#PcieDataModel-Model) | [PcieCollector](#Collector-Class-PcieCollector) | [PcieAnalyzer](#Data-Analyzer-Class-PcieAnalyzer) |
+| PciePlugin | lspci -d {vendor_id}: -nn
lspci -x
lspci -xxxx
lspci -PP
lspci -PP -d {vendor_id}:{dev_id}
lspci -PP -D -d {vendor_id}:{dev_id}
lspci -PP -D
lspci -vvv
lspci -vvvt | **Analyzer Args:**
- `exp_speed`: int — Expected PCIe link speed (generation 1–5).
- `exp_width`: int — Expected PCIe link width in lanes (1–16).
- `exp_sriov_count`: int — Expected SR-IOV virtual function count.
- `exp_gpu_count_override`: Optional[int] — Override expected GPU count for validation.
- `exp_max_payload_size`: Union[Dict[int, int], int, NoneType] — Expected max payload size: int for all devices, or dict keyed by device ID.
- `exp_max_rd_req_size`: Union[Dict[int, int], int, NoneType] — Expected max read request size: int for all devices, or dict keyed by device ID.
- `exp_ten_bit_tag_req_en`: Union[Dict[int, int], int, NoneType] — Expected 10-bit tag request enable: int for all devices, or dict keyed by device ID. | - | [PcieDataModel](#PcieDataModel-Model) | [PcieCollector](#Collector-Class-PcieCollector) | [PcieAnalyzer](#Data-Analyzer-Class-PcieAnalyzer) |
| ProcessPlugin | top -b -n 1
rocm-smi --showpids
top -b -n 1 -o %CPU | **Analyzer Args:**
- `max_kfd_processes`: int — Maximum allowed number of KFD (Kernel Fusion Driver) processes; 0 disables the check.
- `max_cpu_usage`: float — Maximum allowed CPU usage (percent) for process checks. | **Collection Args:**
- `top_n_process`: int — Number of top processes by CPU usage to collect (e.g. for top -b -n 1 -o %%CPU). | [ProcessDataModel](#ProcessDataModel-Model) | [ProcessCollector](#Collector-Class-ProcessCollector) | [ProcessAnalyzer](#Data-Analyzer-Class-ProcessAnalyzer) |
| RdmaPlugin | rdma link -j
rdma dev
rdma link
rdma statistic -j | - | - | [RdmaDataModel](#RdmaDataModel-Model) | [RdmaCollector](#Collector-Class-RdmaCollector) | [RdmaAnalyzer](#Data-Analyzer-Class-RdmaAnalyzer) |
| RocmPlugin | {rocm_path}/opencl/bin/*/clinfo
env | grep -Ei 'rocm|hsa|hip|mpi|openmp|ucx|miopen'
ls /sys/class/kfd/kfd/proc/
grep -i -E 'rocm' /etc/ld.so.conf.d/*
{rocm_path}/bin/rocminfo
ls -v -d {rocm_path}*
ls -v -d {rocm_path}-[3-7]* | tail -1
ldconfig -p | grep -i -E 'rocm'
grep . -H -r -i {rocm_path}/.info/* | **Analyzer Args:**
- `exp_rocm`: Union[str, list] — Expected ROCm version string(s) to match (e.g. from rocminfo).
- `exp_rocm_latest`: str — Expected 'latest' ROCm path or version string for versioned installs.
- `exp_rocm_sub_versions`: dict[str, Union[str, list]] — Map sub-version name (e.g. version_rocm) to expected string or list of allowed strings. | **Collection Args:**
- `rocm_path`: str — Base path to ROCm installation (e.g. /opt/rocm). Used for rocminfo, clinfo, and version discovery. | [RocmDataModel](#RocmDataModel-Model) | [RocmCollector](#Collector-Class-RocmCollector) | [RocmAnalyzer](#Data-Analyzer-Class-RocmAnalyzer) |
| StoragePlugin | sh -c 'df -lH -B1 | grep -v 'boot''
wmic LogicalDisk Where DriveType="3" Get DeviceId,Size,FreeSpace | - | **Collection Args:**
- `skip_sudo`: bool — If True, do not use sudo when running df and related storage commands. | [StorageDataModel](#StorageDataModel-Model) | [StorageCollector](#Collector-Class-StorageCollector) | [StorageAnalyzer](#Data-Analyzer-Class-StorageAnalyzer) |
| SysSettingsPlugin | cat /sys/{}
ls -1 /sys/{}
ls -l /sys/{} | **Analyzer Args:**
- `checks`: Optional[list[nodescraper.plugins.inband.sys_settings.analyzer_args.SysfsCheck]] — List of sysfs checks (path, expected values or pattern, display name). | **Collection Args:**
- `paths`: list[str] — Sysfs paths to read (cat). Paths with '*' are collected with ls -l (e.g. class/net/*/device).
- `directory_paths`: list[str] — Sysfs paths to list (ls -1); used for checks that match entry names by regex. | [SysSettingsDataModel](#SysSettingsDataModel-Model) | [SysSettingsCollector](#Collector-Class-SysSettingsCollector) | [SysSettingsAnalyzer](#Data-Analyzer-Class-SysSettingsAnalyzer) |
| SysctlPlugin | sysctl -n | **Analyzer Args:**
- `exp_vm_swappiness`: Optional[int] — Expected vm.swappiness value.
- `exp_vm_numa_balancing`: Optional[int] — Expected vm.numa_balancing value.
- `exp_vm_oom_kill_allocating_task`: Optional[int] — Expected vm.oom_kill_allocating_task value.
- `exp_vm_compaction_proactiveness`: Optional[int] — Expected vm.compaction_proactiveness value.
- `exp_vm_compact_unevictable_allowed`: Optional[int] — Expected vm.compact_unevictable_allowed value.
- `exp_vm_extfrag_threshold`: Optional[int] — Expected vm.extfrag_threshold value.
- `exp_vm_zone_reclaim_mode`: Optional[int] — Expected vm.zone_reclaim_mode value.
- `exp_vm_dirty_background_ratio`: Optional[int] — Expected vm.dirty_background_ratio value.
- `exp_vm_dirty_ratio`: Optional[int] — Expected vm.dirty_ratio value.
- `exp_vm_dirty_writeback_centisecs`: Optional[int] — Expected vm.dirty_writeback_centisecs value.
- `exp_kernel_numa_balancing`: Optional[int] — Expected kernel.numa_balancing value. | - | [SysctlDataModel](#SysctlDataModel-Model) | [SysctlCollector](#Collector-Class-SysctlCollector) | [SysctlAnalyzer](#Data-Analyzer-Class-SysctlAnalyzer) |
-| SyslogPlugin | ls -1 /var/log/syslog* 2>/dev/null | grep -E '^/var/log/syslog(\.[0-9]+(\.gz)?)?$' || true | - | - | [SyslogData](#SyslogData-Model) | [SyslogCollector](#Collector-Class-SyslogCollector) | - |
+| SyslogPlugin | ls -1 /var/log/syslog* 2>/dev/null | grep -E '^/var/log/syslog(\.[0-9]+(\.gz)?)?$' || true
ls -1 /var/log/messages* 2>/dev/null | grep -E '^/var/log/messages(\.[0-9]+(\.gz)?)?$' || true | - | - | [SyslogData](#SyslogData-Model) | [SyslogCollector](#Collector-Class-SyslogCollector) | - |
| UptimePlugin | uptime | - | - | [UptimeDataModel](#UptimeDataModel-Model) | [UptimeCollector](#Collector-Class-UptimeCollector) | - |
# Collectors
@@ -686,11 +686,13 @@ class for collection of PCIe data only supports Linux OS type.
- `lspci -vvv` : Verbose collection of PCIe data
- `lspci -vvvt`: Verbose tree view of PCIe data
- `lspci -PP`: Path view of PCIe data for the GPUs
+ - `lspci -PP -D`: Path view of PCIe data for the GPUs (with domain prefix)
- If system interaction level is set to STANDARD or higher, the following commands will be run with sudo:
- `lspci -xxxx`: Hex view of PCIe data for the GPUs
- otherwise the following commands will be run without sudo:
- `lspci -x`: Hex view of PCIe data for the GPUs
- - `lspci -d :` : Count the number of GPUs in the system with this command
+ - `lspci -d :` : Detect AMD GPU device IDs in the system
+ - `lspci -PP -D -d :` : Upstream BDF path for GPUs (with domain prefix)
- If system interaction level is set to STANDARD or higher, the following commands will be run with sudo:
- The sudo lspci -xxxx command is used to collect the PCIe configuration space for the GPUs in the system
- otherwise the following commands will be run without sudo:
@@ -706,10 +708,12 @@ class for collection of PCIe data only supports Linux OS type.
- **CMD_LSPCI_VERBOSE**: `lspci -vvv`
- **CMD_LSPCI_VERBOSE_TREE**: `lspci -vvvt`
- **CMD_LSPCI_PATH**: `lspci -PP`
+- **CMD_LSPCI_PATH_DOMAIN**: `lspci -PP -D`
- **CMD_LSPCI_HEX_SUDO**: `lspci -xxxx`
- **CMD_LSPCI_HEX**: `lspci -x`
- **CMD_LSPCI_AMD_DEVICES**: `lspci -d {vendor_id}: -nn`
- **CMD_LSPCI_PATH_DEVICE**: `lspci -PP -d {vendor_id}:{dev_id}`
+- **CMD_LSPCI_PATH_DEVICE_DOMAIN**: `lspci -PP -D -d {vendor_id}:{dev_id}`
### Provides Data
@@ -722,6 +726,8 @@ PcieDataModel
- lspci -xxxx
- lspci -PP
- lspci -PP -d {vendor_id}:{dev_id}
+- lspci -PP -D -d {vendor_id}:{dev_id}
+- lspci -PP -D
- lspci -vvv
- lspci -vvvt
@@ -907,6 +913,7 @@ Read syslog log
- **SUPPORTED_OS_FAMILY**: `{}`
- **CMD**: `ls -1 /var/log/syslog* 2>/dev/null | grep -E '^/var/log/syslog(\.[0-9]+(\.gz)?)?$' || true`
+- **CMD_MESSAGES**: `ls -1 /var/log/messages* 2>/dev/null | grep -E '^/var/log/messages(\.[0-9]+(\.gz)?)?$' || true`
### Provides Data
@@ -915,6 +922,7 @@ SyslogData
### Commands
- ls -1 /var/log/syslog* 2>/dev/null | grep -E '^/var/log/syslog(\.[0-9]+(\.gz)?)?$' || true
+- ls -1 /var/log/messages* 2>/dev/null | grep -E '^/var/log/messages(\.[0-9]+(\.gz)?)?$' || true
## Collector Class UptimeCollector
@@ -1237,6 +1245,7 @@ class for collection of PCIe data.
- lspci_verbose: Verbose collection of PCIe data
- lspci_verbose_tree: Tree view of PCIe data
- lspci_path: Path view of PCIe data for the GPUs
+ - lspci_path_domain: Path view of PCIe data for the GPUs (with domain prefix)
- lspci_hex: Hex view of PCIe data for the GPUs
**Link to code**: [pcie_data.py](https://github.com/amd/node-scraper/blob/HEAD/nodescraper/plugins/inband/pcie/pcie_data.py)
@@ -1467,17 +1476,18 @@ Check dmesg for errors
regex=re.compile('pcieport (\\w+:\\w+:\\w+\\.\\w+):\\s+(\\w+):\\s+(Slot\\(\\d+\\)):\\s+(Card not present)') message='PCIe card no longer present' event_category= event_priority=,
regex=re.compile('pcieport (\\w+:\\w+:\\w+\\.\\w+):\\s+(\\w+):\\s+(Slot\\(\\d+\\)):\\s+(Link Down)') message='PCIe Link Down' event_category= event_priority=,
regex=re.compile('pcieport (\\w+:\\w+:\\w+\\.\\w+):\\s+(\\w+):\\s+(current common clock configuration is inconsistent, reconfiguring)') message='Mismatched clock configuration between PCIe device and host' event_category= event_priority=,
- regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.* correctable hardware errors detected in total in \\w+ block.*)') message='RAS Correctable Error' event_category= event_priority=,
+ regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.* correctable hardware errors detected in total in \\w+ block.*)') message='RAS Correctable Error' event_category= event_priority=,
regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.* uncorrectable hardware errors detected in \\w+ block.*)') message='RAS Uncorrectable Error' event_category= event_priority=,
regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.* deferred hardware errors detected in \\w+ block.*)') message='RAS Deferred Error' event_category= event_priority=,
- regex=re.compile('((?:\\[Hardware Error\\]:\\s+)?event severity: corrected.*)\\n.*(\\[Hardware Error\\]:\\s+Error \\d+, type: corrected.*)\\n.*(\\[Hardware Error\\]:\\s+section_type: PCIe error.*)') message='RAS Corrected PCIe Error' event_category= event_priority=,
+ regex=re.compile('((?:\\[Hardware Error\\]:\\s+)?event severity: corrected.*)\\n.*(\\[Hardware Error\\]:\\s+Error \\d+, type: corrected.*)\\n.*(\\[Hardware Error\\]:\\s+section_type: PCIe error.*)') message='RAS Corrected PCIe Error' event_category= event_priority=,
regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.*GPU reset begin.*)') message='GPU Reset' event_category= event_priority=,
regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.*GPU reset(?:\\(\\d+\\))? failed.*)') message='GPU reset failed' event_category= event_priority=,
regex=re.compile('(Accelerator Check Architecture[^\\n]*)(?:\\n[^\\n]*){0,10}?(amdgpu[ 0-9a-fA-F:.]+:? [^\\n]*entry\\[\\d+\\]\\.STATUS=0x[0-9a-fA-F]+)(?:\\n[^\\n]*){0,5}?(amdgpu[ 0-9a-fA-F:.]+:? [^\\n]*entry\\[\\d+\\], re.MULTILINE) message='ACA Error' event_category= event_priority=,
regex=re.compile('(Accelerator Check Architecture[^\\n]*)(?:\\n[^\\n]*){0,10}?(amdgpu[ 0-9a-fA-F:.]+:? [^\\n]*CONTROL=0x[0-9a-fA-F]+)(?:\\n[^\\n]*){0,5}?(amdgpu[ 0-9a-fA-F:.]+:? [^\\n]*STATUS=0x[0-9a-fA-F]+)(?:\\n[^\\, re.MULTILINE) message='ACA Error' event_category= event_priority=,
- regex=re.compile('\\[Hardware Error\\]:.+MC\\d+_STATUS.*(?:\\n.*){0,5}') message='MCE Error' event_category= event_priority=,
+ regex=re.compile('\\[Hardware Error\\]:.+MC\\d+_STATUS\\[[^\\]]*\\|CE\\|[^\\]]*\\].*(?:\\n.*){0,5}') message='MCE Corrected Error' event_category= event_priority=,
+ regex=re.compile('\\[Hardware Error\\]:.+MC\\d+_STATUS\\[[^\\]]*\\|UC\\|[^\\]]*\\].*(?:\\n.*){0,5}') message='MCE Uncorrected Error' event_category= event_priority=,
regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)? (.*Mode2 reset failed.*)') message='Mode 2 Reset Failed' event_category= event_priority=,
- regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.*\\[Hardware Error\\]: Corrected error.*)') message='RAS Corrected Error' event_category= event_priority=,
+ regex=re.compile('(?:\\d{4}-\\d+-\\d+T\\d+:\\d+:\\d+,\\d+[+-]\\d+:\\d+)?(.*\\[Hardware Error\\]: Corrected error.*)') message='RAS Corrected Error' event_category= event_priority=,
regex=re.compile('x86/cpu: SGX disabled by BIOS') message='SGX Error' event_category= event_priority=,
regex=re.compile('Failed to load MMP firmware qat_4xxx_mmp.bin') message='MMP Error' event_category= event_priority=,
regex=re.compile('amdgpu \\w{4}:\\w{2}:\\w{2}.\\w: amdgpu: WARN: GPU is throttled.*') message='GPU Throttled' event_category= event_priority=,
@@ -1495,7 +1505,7 @@ Check dmesg for errors
### Regex Patterns
-*57 items defined*
+*58 items defined*
- **Built-in Regexes:**
- - Out of memory error: `(?:oom_kill_process.*)|(?:Out of memory.*)`
@@ -1538,7 +1548,8 @@ Check dmesg for errors
- - GPU reset failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- - ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
- - ACA Error: `(Accelerator Check Architecture[^\n]*)(?:\n[^\n...`
-- - MCE Error: `\[Hardware Error\]:.+MC\d+_STATUS.*(?:\n.*){0,5}`
+- - MCE Corrected Error: `\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|CE\|...`
+- - MCE Uncorrected Error: `\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|UC\|...`
- - Mode 2 Reset Failed: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)? (...`
- - RAS Corrected Error: `(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(....`
- - SGX Error: `x86/cpu: SGX disabled by BIOS`
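The docs hunks above replace the single `MCE Error` pattern with separate `MCE Corrected Error` and `MCE Uncorrected Error` patterns keyed on the `|CE|` and `|UC|` flags inside the `MCn_STATUS[...]` bracket. The sketch below is a rough illustration of how the two patterns partition log lines. The regex strings are copied from the analyzer listing in this diff. The sample log lines and the `classify` helper are assumptions made up for this example and are not part of the repository.

```python
import re

# Patterns as listed in the diff for the dmesg analyzer.
MCE_CORRECTED = re.compile(
    r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|CE\|[^\]]*\].*(?:\n.*){0,5}"
)
MCE_UNCORRECTED = re.compile(
    r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|UC\|[^\]]*\].*(?:\n.*){0,5}"
)

# Illustrative log lines (hypothetical, shaped like kernel MCE decode output).
corrected_line = (
    "[Hardware Error]: CPU:0 (19:1:1) "
    "MC18_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b"
)
uncorrected_line = (
    "[Hardware Error]: CPU:2 (19:1:1) "
    "MC0_STATUS[-|UC|AddrV|-|-|-|UECC|-|-|-|-]: 0xb400000000000000"
)
no_flags_line = "[Hardware Error]: MC0_STATUS: 0x0"


def classify(line: str) -> list[str]:
    """Return the messages whose pattern matches the given text (hypothetical helper)."""
    labels = []
    if MCE_CORRECTED.search(line):
        labels.append("MCE Corrected Error")
    if MCE_UNCORRECTED.search(line):
        labels.append("MCE Uncorrected Error")
    return labels


for sample in (corrected_line, uncorrected_line, no_flags_line):
    print(classify(sample))
```

A line that carries `MCn_STATUS` without a bracketed flag list matches neither new pattern, although the removed `MCE Error` pattern would have matched it.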
diff --git a/nodescraper/base/__init__.py b/nodescraper/base/__init__.py
index 8428df4d..06d0a2f0 100644
--- a/nodescraper/base/__init__.py
+++ b/nodescraper/base/__init__.py
@@ -26,6 +26,7 @@
from .inbandcollectortask import InBandDataCollector
from .inbanddataplugin import InBandDataPlugin
from .oobanddataplugin import OOBandDataPlugin
+from .oobsshdataplugin import OOBSSHDataPlugin
from .redfishcollectortask import RedfishDataCollector
from .regexanalyzer import RegexAnalyzer
@@ -33,6 +34,7 @@
"InBandDataCollector",
"InBandDataPlugin",
"OOBandDataPlugin",
+ "OOBSSHDataPlugin",
"RedfishDataCollector",
"RegexAnalyzer",
]
diff --git a/nodescraper/base/oobsshdataplugin.py b/nodescraper/base/oobsshdataplugin.py
new file mode 100644
index 00000000..b383d7aa
--- /dev/null
+++ b/nodescraper/base/oobsshdataplugin.py
@@ -0,0 +1,54 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from __future__ import annotations
+
+from typing import Generic
+
+from nodescraper.connection.redfish import (
+ RedfishConnectionManager,
+ RedfishConnectionParams,
+)
+from nodescraper.generictypes import TAnalyzeArg, TCollectArg, TDataModel
+from nodescraper.interfaces import DataPlugin
+
+
+class OOBSSHDataPlugin(
+ DataPlugin[
+ RedfishConnectionManager,
+ RedfishConnectionParams,
+ TDataModel,
+ TCollectArg,
+ TAnalyzeArg,
+ ],
+ Generic[TDataModel, TCollectArg, TAnalyzeArg],
+):
+ """Base class for out-of-band (OOB) plugins that run shell commands on the BMC.
+
+ Set the BMC host and credentials under ``RedfishConnectionManager`` in the connection config.
+ Commands are executed over SSH (port 22) using that same host, username, and password.
+ """
+
+ CONNECTION_TYPE = RedfishConnectionManager
diff --git a/nodescraper/configbuilder.py b/nodescraper/configbuilder.py
index 354ebc43..7823b95a 100644
--- a/nodescraper/configbuilder.py
+++ b/nodescraper/configbuilder.py
@@ -109,6 +109,13 @@ def _process_value(cls, value: Any) -> Optional[Union[dict, str, int, float, lis
return_dict = {}
for key, val in value.items():
return_dict[key] = cls._process_value(val)
+ return return_dict
+
+ if isinstance(value, list):
+ return_list = []
+ for item in value:
+ return_list.append(cls._process_value(item))
+ return return_list
elif not isinstance(
value,
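Review note: the configbuilder hunk above makes `_process_value` recurse into lists as well as dicts, so nested structures such as a list of command specs are processed element by element. A minimal standalone sketch of that recursion follows. The name `process_value` and the string fallback for unknown objects are assumptions for illustration, since the real fallback branch is cut off in this diff.

```python
from typing import Any


def process_value(value: Any) -> Any:
    """Hypothetical stand-in for ConfigBuilder._process_value."""
    if isinstance(value, dict):
        # Recurse into every dict value and return immediately.
        return {key: process_value(val) for key, val in value.items()}
    if isinstance(value, list):
        # Mirrors the list branch added in the hunk: process each item.
        return [process_value(item) for item in value]
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    # Assumption: anything else is rendered as a string.
    return str(value)


nested = {"commands": [{"name": "uptime", "tags": ("a", "b")}, 3]}
processed = process_value(nested)
```

Without the list branch, the tuple inside the list would never reach the fallback, because the list itself would be handled as one opaque value.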
diff --git a/nodescraper/connection/oob_ssh/__init__.py b/nodescraper/connection/oob_ssh/__init__.py
new file mode 100644
index 00000000..15e619da
--- /dev/null
+++ b/nodescraper/connection/oob_ssh/__init__.py
@@ -0,0 +1,28 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from .oob_ssh_connection_manager import OobSshConnectionManager
+
+__all__ = ["OobSshConnectionManager"]
diff --git a/nodescraper/connection/oob_ssh/oob_ssh_connection_manager.py b/nodescraper/connection/oob_ssh/oob_ssh_connection_manager.py
new file mode 100644
index 00000000..823c00cb
--- /dev/null
+++ b/nodescraper/connection/oob_ssh/oob_ssh_connection_manager.py
@@ -0,0 +1,124 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from __future__ import annotations
+
+from logging import Logger
+from typing import Optional, Union
+
+from nodescraper.enums import EventCategory, EventPriority, ExecutionStatus
+from nodescraper.interfaces.connectionmanager import ConnectionManager
+from nodescraper.interfaces.taskresulthook import TaskResultHook
+from nodescraper.models import SystemInfo, TaskResult
+from nodescraper.utils import get_exception_traceback
+
+from ..inband.inband import InBandConnection
+from ..inband.inbandremote import RemoteShell, SSHConnectionError
+from ..redfish.redfish_params import RedfishConnectionParams, redfish_params_to_ssh
+
+
+class OobSshConnectionManager(ConnectionManager[InBandConnection, RedfishConnectionParams]):
+ """SSH to the BMC using the same host and credentials as Redfish (OOB shell)."""
+
+ def __init__(
+ self,
+ system_info: SystemInfo,
+ logger: Optional[Logger] = None,
+ max_event_priority_level: Union[EventPriority, str] = EventPriority.CRITICAL,
+ parent: Optional[str] = None,
+ task_result_hooks: Optional[list[TaskResultHook]] = None,
+ connection_args: Optional[RedfishConnectionParams] = None,
+ **kwargs,
+ ):
+ super().__init__(
+ system_info,
+ logger,
+ max_event_priority_level,
+ parent,
+ task_result_hooks,
+ connection_args,
+ **kwargs,
+ )
+
+ def connect(self) -> TaskResult:
+ if not self.connection_args:
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description="No Redfish connection parameters provided for OOB SSH",
+ priority=EventPriority.CRITICAL,
+ console_log=True,
+ )
+ self.result.status = ExecutionStatus.EXECUTION_FAILURE
+ return self.result
+
+ raw = self.connection_args
+ if isinstance(raw, dict):
+ params = RedfishConnectionParams.model_validate(raw)
+ elif isinstance(raw, RedfishConnectionParams):
+ params = raw
+ else:
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description="Redfish connection_args must be dict or RedfishConnectionParams",
+ priority=EventPriority.CRITICAL,
+ console_log=True,
+ )
+ self.result.status = ExecutionStatus.EXECUTION_FAILURE
+ return self.result
+
+ try:
+ ssh_params = redfish_params_to_ssh(params)
+ self.logger.info("Initializing OOB SSH to BMC host %s", ssh_params.hostname)
+ self.connection = RemoteShell(ssh_params)
+ self.connection.connect_ssh()
+ except SSHConnectionError as exception:
+ self._log_event(
+ category=EventCategory.SSH,
+ description=str(exception),
+ priority=EventPriority.CRITICAL,
+ console_log=True,
+ )
+ self.result.status = ExecutionStatus.EXECUTION_FAILURE
+ self.connection = None
+ except Exception as exception:
+ self._log_event(
+ category=EventCategory.SSH,
+ description=f"Exception during OOB SSH: {exception!s}",
+ data=get_exception_traceback(exception),
+ priority=EventPriority.CRITICAL,
+ console_log=True,
+ )
+ self.result.status = ExecutionStatus.EXECUTION_FAILURE
+ self.connection = None
+ return self.result
+
+ def disconnect(self) -> None:
+ conn = self.connection
+ super().disconnect()
+ if isinstance(conn, RemoteShell):
+ try:
+ conn.client.close()
+ except Exception:
+ pass
diff --git a/nodescraper/connection/redfish/__init__.py b/nodescraper/connection/redfish/__init__.py
index f98faaac..12b5af16 100644
--- a/nodescraper/connection/redfish/__init__.py
+++ b/nodescraper/connection/redfish/__init__.py
@@ -39,7 +39,7 @@
collect_oem_diagnostic_data,
get_oem_diagnostic_allowable_values,
)
-from .redfish_params import RedfishConnectionParams
+from .redfish_params import RedfishConnectionParams, redfish_params_to_ssh
from .redfish_path import RedfishPath
__all__ = [
@@ -48,6 +48,7 @@
"RedfishGetResult",
"RedfishConnectionManager",
"RedfishConnectionParams",
+ "redfish_params_to_ssh",
"RedfishPath",
"collect_oem_diagnostic_data",
"get_oem_diagnostic_allowable_values",
diff --git a/nodescraper/connection/redfish/redfish_params.py b/nodescraper/connection/redfish/redfish_params.py
index 7d9b5d5f..4eb70a96 100644
--- a/nodescraper/connection/redfish/redfish_params.py
+++ b/nodescraper/connection/redfish/redfish_params.py
@@ -23,11 +23,15 @@
# SOFTWARE.
#
###############################################################################
+from __future__ import annotations
+
from typing import Optional, Union
from pydantic import BaseModel, ConfigDict, Field, SecretStr
from pydantic.networks import IPvAnyAddress
+from nodescraper.connection.inband.sshparams import SSHConnectionParams
+
from .redfish_connection import DEFAULT_REDFISH_API_ROOT
@@ -51,3 +55,28 @@ class RedfishConnectionParams(BaseModel):
default=DEFAULT_REDFISH_API_ROOT,
description="Redfish API path (e.g. 'redfish/v1'). Override for a different API version.",
)
+
+
+def redfish_params_to_ssh(
+ params: Union[RedfishConnectionParams, dict],
+ *,
+ ssh_port: Optional[int] = None,
+ key_filename: Optional[str] = None,
+) -> SSHConnectionParams:
+ """Map Redfish BMC credentials to SSH connection params for shell access."""
+ if isinstance(params, dict):
+ data = dict(params)
+ ssh_port = data.pop("ssh_port", ssh_port if ssh_port is not None else 22)
+ key_filename = data.pop("key_filename", key_filename)
+ params = RedfishConnectionParams.model_validate(data)
+ else:
+ if ssh_port is None:
+ ssh_port = 22
+
+ return SSHConnectionParams(
+ hostname=str(params.host),
+ username=params.username,
+ password=params.password,
+ port=ssh_port,
+ key_filename=key_filename,
+ )
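Review note: `redfish_params_to_ssh` accepts either a params model or a plain dict, and in the dict case it pops `ssh_port` and `key_filename` before validating the rest. The sketch below reproduces that control flow with dataclasses so it runs without nodescraper or pydantic. `RedfishParams`, `SshParams`, and `to_ssh` are hypothetical stand-ins, not the project's classes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class RedfishParams:  # hypothetical stand-in for RedfishConnectionParams
    host: str
    username: str
    password: str


@dataclass
class SshParams:  # hypothetical stand-in for SSHConnectionParams
    hostname: str
    username: str
    password: str
    port: int = 22
    key_filename: Optional[str] = None


def to_ssh(
    params: Union[RedfishParams, dict],
    *,
    ssh_port: Optional[int] = None,
    key_filename: Optional[str] = None,
) -> SshParams:
    if isinstance(params, dict):
        data = dict(params)  # copy so the caller's dict is not mutated
        # A key in the dict takes precedence over the keyword argument.
        ssh_port = data.pop("ssh_port", ssh_port if ssh_port is not None else 22)
        key_filename = data.pop("key_filename", key_filename)
        params = RedfishParams(**data)
    elif ssh_port is None:
        ssh_port = 22
    return SshParams(params.host, params.username, params.password, ssh_port, key_filename)


config = {"host": "10.0.0.5", "username": "root", "password": "pw", "ssh_port": 2222}
from_dict = to_ssh(config, ssh_port=22)
from_model = to_ssh(RedfishParams("10.0.0.5", "root", "pw"))
```

One consequence worth checking in review: when the dict carries `ssh_port`, an explicit `ssh_port=` keyword is silently ignored.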
diff --git a/nodescraper/interfaces/connectionmanager.py b/nodescraper/interfaces/connectionmanager.py
index 6b468f06..d413ffaf 100644
--- a/nodescraper/interfaces/connectionmanager.py
+++ b/nodescraper/interfaces/connectionmanager.py
@@ -126,7 +126,7 @@ def __init__(
def __init_subclass__(cls, **kwargs) -> None:
super().__init_subclass__(**kwargs)
- if hasattr(cls, "connect"):
+ if "connect" in cls.__dict__:
cls.connect = connect_decorator(cls.connect)
def __enter__(self):
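Review note: switching from `hasattr(cls, "connect")` to `"connect" in cls.__dict__` means only classes that define `connect` themselves get wrapped. With `hasattr`, a subclass that merely inherits `connect` would wrap the already-wrapped method again. The sketch below shows the effect with a stand-in `connect_decorator` that only records calls; the real decorator's behaviour is not part of this diff.

```python
import functools

calls = []


def connect_decorator(func):
    """Stand-in decorator: records each time the wrapper layer runs."""

    @functools.wraps(func)
    def wrapper(self):
        calls.append("wrapped")
        return func(self)

    return wrapper


class Manager:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap only when this class defines connect in its own namespace.
        if "connect" in cls.__dict__:
            cls.connect = connect_decorator(cls.connect)


class Base(Manager):
    def connect(self):
        return "ok"


class Child(Base):  # inherits connect, so it is not wrapped a second time
    pass


Child().connect()
```

After the single call, `calls` holds one entry; with the `hasattr` check it would hold two, because `Child` would add a second wrapper layer.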
diff --git a/nodescraper/interfaces/dataplugin.py b/nodescraper/interfaces/dataplugin.py
index 8448dff3..19820b31 100644
--- a/nodescraper/interfaces/dataplugin.py
+++ b/nodescraper/interfaces/dataplugin.py
@@ -39,6 +39,7 @@
from nodescraper.interfaces.plugin import PluginInterface
from nodescraper.models import (
AnalyzerArgs,
+ CollectorArgs,
DataModel,
DataPluginResult,
PluginResult,
@@ -51,6 +52,17 @@
from .task import SystemCompatibilityError
from .taskresulthook import TaskResultHook
+CollectorClasses = Union[
+ Type[DataCollector],
+ tuple[Type[DataCollector], ...],
+ list[Type[DataCollector]],
+]
+
+CollectorArgsClasses = Union[
+ Type[CollectorArgs],
+ dict[str, Type[CollectorArgs]],
+]
+
class DataPlugin(
PluginInterface, Generic[TConnectionManager, TConnectArg, TDataModel, TCollectArg, TAnalyzeArg]
@@ -61,7 +73,9 @@ class DataPlugin(
CONNECTION_TYPE: Optional[Type[TConnectionManager]]
- COLLECTOR: Optional[Type[DataCollector]] = None
+ COLLECTOR: Optional[CollectorClasses] = None
+
+ COLLECTOR_ARGS: Optional[CollectorArgsClasses] = None
ANALYZER: Optional[Type[DataAnalyzer]] = None
@@ -101,6 +115,43 @@ def __init__(
)
self._data: Optional[TDataModel] = None
+ @classmethod
+ def get_collector_classes(cls) -> tuple[Type[DataCollector], ...]:
+ """Return all collector classes configured on this plugin."""
+ collector = cls.COLLECTOR
+ if collector is None:
+ return ()
+ if isinstance(collector, (tuple, list)):
+ return tuple(collector)
+ return (collector,)
+
+ @classmethod
+ def _collector_args_class(
+ cls, collector_cls: Type[DataCollector]
+ ) -> Optional[Type[CollectorArgs]]:
+ collector_args = cls.COLLECTOR_ARGS
+ if isinstance(collector_args, dict):
+ return collector_args.get(collector_cls.__name__)
+ return collector_args
+
+ @classmethod
+ def _validate_collector_args(cls) -> None:
+ collector_args = cls.COLLECTOR_ARGS
+ if collector_args is None:
+ return
+ if isinstance(collector_args, dict):
+ for collector_name, args_cls in collector_args.items():
+ if not isinstance(args_cls, type) or not issubclass(args_cls, CollectorArgs):
+ raise TypeError(
+ f"COLLECTOR_ARGS[{collector_name!r}] must be a CollectorArgs subclass, "
+ f"got {args_cls!r}"
+ )
+ return
+ if not isinstance(collector_args, type) or not issubclass(collector_args, CollectorArgs):
+ raise TypeError(
+ f"COLLECTOR_ARGS must be a CollectorArgs subclass or dict, got {collector_args!r}"
+ )
+
@classmethod
def _validate_class_var(cls):
if not hasattr(cls, "DATA_MODEL"):
@@ -109,12 +160,96 @@ def _validate_class_var(cls):
if cls.DATA_MODEL is None:
raise TypeError("DATA_MODEL class variable not defined")
- if not cls.COLLECTOR and not cls.ANALYZER:
+ if not cls.get_collector_classes() and not cls.ANALYZER:
raise TypeError("No collector or analyzer task defined")
- if cls.COLLECTOR and not cls.CONNECTION_TYPE:
+ if cls.get_collector_classes() and not cls.CONNECTION_TYPE:
raise TypeError("CONNECTION_TYPE must be defined for collector")
+ for collector_cls in cls.get_collector_classes():
+ if not isinstance(collector_cls, type) or not issubclass(collector_cls, DataCollector):
+ raise TypeError(
+ f"COLLECTOR entries must be DataCollector subclasses, got {collector_cls!r}"
+ )
+
+ cls._validate_collector_args()
+
+ @classmethod
+ def _merge_collected_data(
+ cls,
+ existing: Optional[TDataModel],
+ new_data: Optional[TDataModel],
+ ) -> Optional[TDataModel]:
+ if new_data is None:
+ return existing
+ if existing is None:
+ return new_data
+ if not isinstance(new_data, existing.__class__):
+ raise TypeError(
+ f"Collector returned {new_data.__class__.__name__}, "
+ f"expected {existing.__class__.__name__}"
+ )
+ merged = {
+ **existing.model_dump(exclude_unset=True),
+ **new_data.model_dump(exclude_unset=True),
+ }
+ return existing.__class__.model_validate(merged)
+
+ @classmethod
+ def _aggregate_collection_results(
+ cls,
+ plugin_name: str,
+ results: list[TaskResult],
+ ) -> TaskResult:
+ if not results:
+ return TaskResult(
+ parent=plugin_name,
+ status=ExecutionStatus.NOT_RAN,
+ message=f"Data collection not ran for {plugin_name}",
+ )
+ if len(results) == 1:
+ return results[0]
+
+ aggregated = TaskResult(
+ parent=plugin_name,
+ status=max(result.status for result in results),
+ task=",".join(result.task for result in results if result.task),
+ )
+ messages = [result.message for result in results if result.message]
+ if messages:
+ aggregated.message = "; ".join(messages)
+ for result in results:
+ aggregated.artifacts.extend(result.artifacts)
+ aggregated.events.extend(result.events)
+ aggregated.details["collector_results"] = [
+ result.model_dump(exclude={"artifacts", "events"}) for result in results
+ ]
+ return aggregated
+
+ def _resolve_collector_args(
+ self,
+ collector_cls: Type[DataCollector],
+ collection_args: Optional[Union[TCollectArg, dict]],
+ ) -> Optional[Union[TCollectArg, dict]]:
+ if collection_args is None:
+ return None
+
+ collector_name = collector_cls.__name__
+ collector_names = {cls.__name__ for cls in self.get_collector_classes()}
+ raw_args: Optional[Union[TCollectArg, dict]] = collection_args
+
+ if isinstance(collection_args, dict) and collector_names.intersection(
+ collection_args.keys()
+ ):
+ raw_args = collection_args.get(collector_name)
+ if raw_args is None:
+ return None
+
+ args_cls = self._collector_args_class(collector_cls)
+ if args_cls is not None and isinstance(raw_args, dict):
+ return args_cls.model_validate(raw_args)
+ return raw_args
+
@classmethod
def is_valid(cls) -> bool:
"""Check that all required class variables are set
@@ -167,7 +302,8 @@ def collect(
Returns:
TaskResult: task result for data collection
"""
- if not self.COLLECTOR:
+ collector_classes = self.get_collector_classes()
+ if not collector_classes:
self.collection_result = TaskResult(
parent=self.__class__.__name__,
status=ExecutionStatus.NOT_RAN,
@@ -175,11 +311,13 @@ def collect(
)
return self.collection_result
+ primary_collector = collector_classes[0]
+
try:
if not self.connection_manager:
if not self.CONNECTION_TYPE:
self.collection_result = TaskResult(
- task=self.COLLECTOR.__name__,
+ task=primary_collector.__name__,
parent=self.__class__.__name__,
status=ExecutionStatus.NOT_RAN,
message=f"No connection manager type provided for {self.__class__.__name__}",
@@ -203,49 +341,53 @@ def collect(
if self.connection_manager.result.status != ExecutionStatus.OK:
self.collection_result = TaskResult(
- task=self.COLLECTOR.__name__,
+ task=primary_collector.__name__,
parent=self.__class__.__name__,
status=ExecutionStatus.NOT_RAN,
message="Connection not available, data collection skipped",
)
else:
- if (
- collection_args is not None
- and isinstance(collection_args, dict)
- and hasattr(self, "COLLECTOR_ARGS")
- and self.COLLECTOR_ARGS is not None
- ):
- collection_args = self.COLLECTOR_ARGS.model_validate(collection_args)
+ collector_results: list[TaskResult] = []
+ merged_data: Optional[TDataModel] = None
+
+ for collector_cls in collector_classes:
+ collector_args = self._resolve_collector_args(collector_cls, collection_args)
+ collection_task = collector_cls(
+ system_info=self.system_info,
+ logger=self.logger,
+ system_interaction_level=system_interaction_level,
+ connection=self.connection_manager.connection,
+ max_event_priority_level=max_event_priority_level,
+ parent=self.__class__.__name__,
+ task_result_hooks=self.task_result_hooks,
+ log_path=self.log_path,
+ event_reporter=self.event_reporter,
+ session_id=self.session_id,
+ )
+ result, data = collection_task.collect_data(collector_args)
+ collector_results.append(result)
+ merged_data = self._merge_collected_data(merged_data, data)
- collection_task = self.COLLECTOR(
- system_info=self.system_info,
- logger=self.logger,
- system_interaction_level=system_interaction_level,
- connection=self.connection_manager.connection,
- max_event_priority_level=max_event_priority_level,
- parent=self.__class__.__name__,
- task_result_hooks=self.task_result_hooks,
- log_path=self.log_path,
- event_reporter=self.event_reporter,
- session_id=self.session_id,
+ self.collection_result = self._aggregate_collection_results(
+ self.__class__.__name__,
+ collector_results,
)
- self.collection_result, self._data = collection_task.collect_data(collection_args)
+ self._data = merged_data
except SystemCompatibilityError as e:
self.collection_result = TaskResult(
- task=self.COLLECTOR.__name__,
+ task=primary_collector.__name__,
parent=self.__class__.__name__,
status=ExecutionStatus.NOT_RAN,
message=str(e),
)
except Exception as e:
self.logger.exception(
- "Unhandled exception running collector %s for plugin %s",
- self.COLLECTOR.__name__,
+ "Unhandled exception running collectors for plugin %s",
self.__class__.__name__,
)
self.collection_result = TaskResult(
- task=self.COLLECTOR.__name__,
+ task=primary_collector.__name__,
parent=self.__class__.__name__,
status=ExecutionStatus.EXECUTION_FAILURE,
message=f"Unhandled exception running data collector: {str(e)}",
@@ -382,19 +524,19 @@ def run(
ExecutionStatus.EXECUTION_FAILURE,
ExecutionStatus.WARNING,
]:
- if self.analysis_result.status > self.collection_result.status:
- message = (
- f"Analysis warning: {self.analysis_result.message}"
- if self.analysis_result.status == ExecutionStatus.WARNING
- else f"Analysis error: {self.analysis_result.message}"
- )
- else:
-
- message = (
- f"Collection warning: {self.collection_result.message}"
- if self.collection_result.status == ExecutionStatus.WARNING
- else f"Collection error: {self.collection_result.message}"
- )
+ failure_parts: list[str] = []
+ for label, task_result in (
+ ("Collection", self.collection_result),
+ ("Analysis", self.analysis_result),
+ ):
+ if task_result.status == ExecutionStatus.WARNING:
+ failure_parts.append(f"{label} warning: {task_result.message}")
+ elif task_result.status in (
+ ExecutionStatus.ERROR,
+ ExecutionStatus.EXECUTION_FAILURE,
+ ):
+ failure_parts.append(f"{label} error: {task_result.message}")
+ message = "; ".join(failure_parts)
else:
message = "Plugin tasks completed successfully"
@@ -422,33 +564,33 @@ def find_datamodel_path_in_run(cls, run_path: str) -> Optional[str]:
run_path = os.path.abspath(run_path)
if not os.path.isdir(run_path):
return None
- collector_cls = getattr(cls, "COLLECTOR", None)
data_model_cls = getattr(cls, "DATA_MODEL", None)
- if not collector_cls or not data_model_cls:
- return None
- collector_dir = os.path.join(
- run_path,
- pascal_to_snake(cls.__name__),
- pascal_to_snake(collector_cls.__name__),
- )
- if not os.path.isdir(collector_dir):
- return None
- result_path = os.path.join(collector_dir, "result.json")
- if not os.path.isfile(result_path):
- return None
- try:
- res_payload = json.loads(Path(result_path).read_text(encoding="utf-8"))
- if res_payload.get("parent") != cls.__name__:
- return None
- except (json.JSONDecodeError, OSError):
+ if not data_model_cls:
return None
- want_json = data_model_cls.__name__.lower() + ".json"
- for fname in os.listdir(collector_dir):
- low = fname.lower()
- if low.endswith("datamodel.json") or low == want_json:
- return os.path.join(collector_dir, fname)
- if low.endswith(".log"):
- return os.path.join(collector_dir, fname)
+ for collector_cls in cls.get_collector_classes():
+ collector_dir = os.path.join(
+ run_path,
+ pascal_to_snake(cls.__name__),
+ pascal_to_snake(collector_cls.__name__),
+ )
+ if not os.path.isdir(collector_dir):
+ continue
+ result_path = os.path.join(collector_dir, "result.json")
+ if not os.path.isfile(result_path):
+ continue
+ try:
+ res_payload = json.loads(Path(result_path).read_text(encoding="utf-8"))
+ if res_payload.get("parent") != cls.__name__:
+ continue
+ except (json.JSONDecodeError, OSError):
+ continue
+ want_json = data_model_cls.__name__.lower() + ".json"
+ for fname in os.listdir(collector_dir):
+ low = fname.lower()
+ if low.endswith("datamodel.json") or low == want_json:
+ return os.path.join(collector_dir, fname)
+ if low.endswith(".log"):
+ return os.path.join(collector_dir, fname)
return None
@classmethod
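Review note: the dataplugin changes introduce three small rules: `COLLECTOR` may be a single class or a sequence, `collection_args` may be flat or keyed by collector class name, and data from later collectors overrides earlier data field by field. The sketch below restates those rules on plain dicts so they can be checked in isolation. The function names are hypothetical, and dicts stand in for the pydantic models that the real code dumps with `exclude_unset=True`.

```python
from typing import Optional


def normalize(collector) -> tuple:
    """Accept None, a single class, or a tuple/list of classes."""
    if collector is None:
        return ()
    if isinstance(collector, (tuple, list)):
        return tuple(collector)
    return (collector,)


def resolve_args(name: str, all_names: set, collection_args: Optional[dict]) -> Optional[dict]:
    """Pick the args for one collector from flat or per-collector input."""
    if collection_args is None:
        return None
    if all_names.intersection(collection_args.keys()):
        # Keyed by collector class name: use only this collector's entry.
        return collection_args.get(name)
    # Flat dict: every collector receives the same args.
    return collection_args


def merge(existing: Optional[dict], new: Optional[dict]) -> Optional[dict]:
    """Later results override earlier ones field by field."""
    if new is None:
        return existing
    if existing is None:
        return new
    return {**existing, **new}


names = {"CollectorA", "CollectorB"}
keyed = {"CollectorA": {"timeout": 5}}
```

Note that a flat args dict whose field name happens to equal a collector class name would be interpreted as the keyed form, which may be worth a comment in the code.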
diff --git a/nodescraper/pluginexecutor.py b/nodescraper/pluginexecutor.py
index 0821ff20..4f3febed 100644
--- a/nodescraper/pluginexecutor.py
+++ b/nodescraper/pluginexecutor.py
@@ -34,6 +34,8 @@
from pydantic import BaseModel
+from nodescraper.base.oobsshdataplugin import OOBSSHDataPlugin
+from nodescraper.connection.oob_ssh import OobSshConnectionManager
from nodescraper.constants import DEFAULT_LOGGER
from nodescraper.interfaces import ConnectionManager, DataPlugin, PluginInterface
from nodescraper.models import PluginConfig, SystemInfo
@@ -81,6 +83,9 @@ def __init__(
self.plugin_config = self.merge_configs(plugin_configs)
self.connection_library: dict[type[ConnectionManager], ConnectionManager] = {}
+ self.connection_configs: dict[str, Union[dict, BaseModel]] = (
+ dict(connections) if connections else {}
+ )
self.log_path = log_path
@@ -175,40 +180,56 @@ def run_queue(self) -> list[PluginResult]:
}
if plugin_class.CONNECTION_TYPE:
- connection_manager_class: Type[ConnectionManager] = plugin_class.CONNECTION_TYPE
- if (
- connection_manager_class.__name__
- in self.plugin_registry.connection_managers
- ):
- mgr_impl = self.plugin_registry.connection_managers[
- connection_manager_class.__name__
- ]
- elif (
- inspect.isclass(connection_manager_class)
- and issubclass(connection_manager_class, ConnectionManager)
- and not inspect.isabstract(connection_manager_class)
- ):
- # External packages set CONNECTION_TYPE on the plugin;
- # use it when not listed under nodescraper.connection_managers entry points.
- mgr_impl = connection_manager_class
+ if issubclass(plugin_class, OOBSSHDataPlugin):
+ mgr_impl = OobSshConnectionManager
+ connection_args = self.connection_configs.get("RedfishConnectionManager")
+ if connection_args is None:
+ self.logger.error(
+ "%s requires RedfishConnectionManager in the connection config",
+ plugin_name,
+ )
+ continue
else:
- self.logger.error(
- "Unable to find registered connection manager class for %s that is required by",
- connection_manager_class.__name__,
+ connection_manager_class: Type[ConnectionManager] = (
+ plugin_class.CONNECTION_TYPE
)
- continue
+ if (
+ connection_manager_class.__name__
+ in self.plugin_registry.connection_managers
+ ):
+ mgr_impl = self.plugin_registry.connection_managers[
+ connection_manager_class.__name__
+ ]
+ elif (
+ inspect.isclass(connection_manager_class)
+ and issubclass(connection_manager_class, ConnectionManager)
+ and not inspect.isabstract(connection_manager_class)
+ ):
+ # External packages set CONNECTION_TYPE on the plugin;
+ # use it when not listed under nodescraper.connection_managers entry points.
+ mgr_impl = connection_manager_class
+ else:
+ self.logger.error(
+ "Unable to find registered connection manager class %s",
+ connection_manager_class.__name__,
+ )
+ continue
+ connection_args = None
if mgr_impl not in self.connection_library:
self.logger.info(
- "Initializing connection manager for %s with default args",
+ "Initializing connection manager for %s",
mgr_impl.__name__,
)
- self.connection_library[mgr_impl] = mgr_impl(
- system_info=self.system_info,
- logger=self.logger,
- task_result_hooks=self.connection_result_hooks,
- session_id=self.session_id,
- )
+ init_kwargs = {
+ "system_info": self.system_info,
+ "logger": self.logger,
+ "task_result_hooks": self.connection_result_hooks,
+ "session_id": self.session_id,
+ }
+ if connection_args is not None:
+ init_kwargs["connection_args"] = connection_args
+ self.connection_library[mgr_impl] = mgr_impl(**init_kwargs)
init_payload["connection_manager"] = self.connection_library[mgr_impl]
@@ -295,31 +316,51 @@ def apply_global_args_to_plugin(
run_args = {}
for key in global_args:
- if key in ["collection_args", "analysis_args"] and isinstance(plugin_inst, DataPlugin):
+ if key in ("collection_args", "analysis_args"):
continue
- else:
- run_args[key] = global_args[key]
-
- if (
- "collection_args" in global_args
- and hasattr(plugin_class, "COLLECTOR_ARGS")
- and plugin_class.COLLECTOR_ARGS is not None
- ):
-
- plugin_fields = set(plugin_class.COLLECTOR_ARGS.model_fields.keys())
- filtered = {
- k: v for k, v in global_args["collection_args"].items() if k in plugin_fields
- }
- if filtered:
- run_args["collection_args"] = filtered
+ run_args[key] = global_args[key]
+
+ if "collection_args" in global_args and hasattr(plugin_class, "COLLECTOR_ARGS"):
+ collector_args = plugin_class.COLLECTOR_ARGS
+ if (
+ isinstance(plugin_inst, DataPlugin)
+ and plugin_class.get_collector_classes()
+ and isinstance(collector_args, dict)
+ ):
+ per_collector_args: dict[str, dict] = {}
+ for collector_cls in plugin_class.get_collector_classes():
+ args_cls = plugin_class._collector_args_class(collector_cls)
+ if args_cls is None:
+ continue
+ plugin_fields = set(args_cls.model_fields.keys())
+ filtered = {
+ k: v
+ for k, v in global_args["collection_args"].items()
+ if k in plugin_fields
+ }
+ if filtered:
+ per_collector_args[collector_cls.__name__] = filtered
+ if per_collector_args:
+ run_args["collection_args"] = per_collector_args
+ elif collector_args is not None and not isinstance(collector_args, dict):
+ args_cls = (
+ collector_args if isinstance(collector_args, type) else type(collector_args)
+ )
+ plugin_fields = set(args_cls.model_fields.keys())
+ filtered = {
+ k: v for k, v in global_args["collection_args"].items() if k in plugin_fields
+ }
+ if filtered:
+ run_args["collection_args"] = filtered
if (
"analysis_args" in global_args
and hasattr(plugin_class, "ANALYZER_ARGS")
and plugin_class.ANALYZER_ARGS is not None
):
-
- plugin_fields = set(plugin_class.ANALYZER_ARGS.model_fields.keys())
+ analyzer_args = plugin_class.ANALYZER_ARGS
+ args_cls = analyzer_args if isinstance(analyzer_args, type) else type(analyzer_args)
+ plugin_fields = set(args_cls.model_fields.keys())
filtered = {k: v for k, v in global_args["analysis_args"].items() if k in plugin_fields}
if filtered:
run_args["analysis_args"] = filtered
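Review note: when `COLLECTOR_ARGS` is a dict, `apply_global_args_to_plugin` now fans the global `collection_args` out per collector, keeping only the keys each collector's args model declares. A minimal sketch of that filtering follows, assuming a hypothetical `split_global_args` helper and using sets of field names in place of `model_fields`.

```python
def split_global_args(global_collection_args: dict, fields_by_collector: dict) -> dict:
    """Return {collector_name: filtered_args}, omitting collectors with no match."""
    per_collector = {}
    for name, fields in fields_by_collector.items():
        filtered = {k: v for k, v in global_collection_args.items() if k in fields}
        if filtered:
            per_collector[name] = filtered
    return per_collector


global_args = {"timeout": 5, "skip_sudo": True, "unrelated": 1}
fields = {
    "CollectorA": {"timeout"},
    "CollectorB": {"timeout", "skip_sudo"},
    "CollectorC": {"verbose"},
}
split = split_global_args(global_args, fields)
```

The result is keyed by collector class name, which is the same shape the per-collector branch of `_resolve_collector_args` expects.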
diff --git a/nodescraper/pluginrecipe/discovery.py b/nodescraper/pluginrecipe/discovery.py
index aebeea3c..be0eaa03 100644
--- a/nodescraper/pluginrecipe/discovery.py
+++ b/nodescraper/pluginrecipe/discovery.py
@@ -58,7 +58,17 @@ def plugin_has_collector(plugin_name: str) -> bool:
bool: ``True`` when the plugin class defines ``COLLECTOR``.
"""
plugin_class = load_plugin_class(plugin_name)
- return plugin_class is not None and getattr(plugin_class, "COLLECTOR", None) is not None
+ if plugin_class is None:
+ return False
+ collectors = getattr(plugin_class, "get_collector_classes", None)
+ if callable(collectors):
+ return bool(collectors())
+ collector = getattr(plugin_class, "COLLECTOR", None)
+ if collector is None:
+ return False
+ if isinstance(collector, (tuple, list)):
+ return len(collector) > 0
+ return True
def plugin_has_analyzer(plugin_name: str) -> bool:
diff --git a/nodescraper/pluginregistry.py b/nodescraper/pluginregistry.py
index 559d96f6..5abc2f84 100644
--- a/nodescraper/pluginregistry.py
+++ b/nodescraper/pluginregistry.py
@@ -28,7 +28,7 @@
import inspect
import pkgutil
import types
-from typing import Optional
+from typing import Iterable, Optional
import nodescraper.connection as internal_connections
import nodescraper.plugins as internal_plugins
@@ -135,6 +135,7 @@ def load_connection_managers_from_entry_points() -> dict[str, type]:
managers: dict[str, type] = {}
try:
+ eps: Iterable
try:
eps = importlib.metadata.entry_points( # type: ignore[call-arg]
group="nodescraper.connection_managers"
@@ -145,7 +146,7 @@ def load_connection_managers_from_entry_points() -> dict[str, type]:
for entry_point in eps:
try:
- loaded = entry_point.load() # type: ignore[attr-defined]
+ loaded = entry_point.load() # type: ignore[attr-defined, union-attr]
if not (
inspect.isclass(loaded)
and issubclass(loaded, ConnectionManager)
@@ -177,6 +178,7 @@ def load_plugins_from_entry_points() -> dict[str, type]:
plugins = {}
try:
+ eps: Iterable
# Python 3.10+ supports group parameter
try:
eps = importlib.metadata.entry_points(group="nodescraper.plugins") # type: ignore[call-arg]
@@ -187,7 +189,7 @@ def load_plugins_from_entry_points() -> dict[str, type]:
for entry_point in eps:
try:
- plugin_class = entry_point.load() # type: ignore[attr-defined]
+ plugin_class = entry_point.load() # type: ignore[attr-defined, union-attr]
if (
inspect.isclass(plugin_class)
diff --git a/nodescraper/plugins/generic_collection/__init__.py b/nodescraper/plugins/generic_collection/__init__.py
new file mode 100644
index 00000000..06dbda9f
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/__init__.py
@@ -0,0 +1,49 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+"""Generic command collection plugins (in-band and OOB SSH)."""
+
+from .analyzer_args import CommandCheck, GenericAnalyzerArgs
+from .collector_args import CommandSpec, GenericCollectionCollectorArgs
+from .generic_analyzer import GenericAnalyzer
+from .generic_collection_collector import GenericCollectionCollector
+from .generic_collection_data import CommandCollectionResult, GenericCollectionDataModel
+from .generic_collection_plugin_mixin import GenericCollectionPluginMixin
+from .inband_plugin import GenericCollectionPlugin
+from .oob_plugin import OobGenericCollectionPlugin
+
+__all__ = [
+    "CommandCheck",
+    "CommandCollectionResult",
+    "CommandSpec",
+    "GenericAnalyzer",
+    "GenericAnalyzerArgs",
+    "GenericCollectionCollector",
+    "GenericCollectionCollectorArgs",
+    "GenericCollectionDataModel",
+    "GenericCollectionPlugin",
+    "GenericCollectionPluginMixin",
+    "OobGenericCollectionPlugin",
+]
diff --git a/nodescraper/plugins/generic_collection/analyzer_args.py b/nodescraper/plugins/generic_collection/analyzer_args.py
new file mode 100644
index 00000000..a68e0fb2
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/analyzer_args.py
@@ -0,0 +1,126 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Literal, Optional, Union
+
+from pydantic import Field, model_validator
+
+from nodescraper.models import AnalyzerArgs
+
+CompareOp = Literal["==", "!=", ">", ">=", "<", "<="]
+MatchMode = Literal["full", "any_line", "all_lines"]
+ValueType = Literal["int", "float", "str"]
+
+
+class CommandCheck(AnalyzerArgs):
+    """Validation rule for one collected command result, matched by collector command name."""
+
+    name: str = Field(
+        description="Name of the collected command to validate (must match collection_args.commands[].name).",
+    )
+    allow_failure: bool = Field(
+        default=False,
+        description="When True, a collection failure for this command does not fail the check.",
+    )
+    expected_exit_code: Optional[int] = Field(
+        default=None,
+        description="Expected exit code from collection. Defaults to 0 when other content checks are set.",
+    )
+    must_contain: Optional[Union[str, list[str]]] = Field(
+        default=None,
+        description="Stdout must contain this text or all texts in the list.",
+    )
+    must_not_contain: Optional[Union[str, list[str]]] = Field(
+        default=None,
+        description="Stdout must not contain this text or any texts in the list.",
+    )
+    expected: Optional[str] = Field(
+        default=None,
+        description="Exact stdout match after strip.",
+    )
+    expected_in: Optional[list[str]] = Field(
+        default=None,
+        description="Stripped stdout must be one of these values.",
+    )
+    expected_regex: Optional[str] = Field(
+        default=None,
+        description="Stdout must match this regex.",
+    )
+    forbidden_regex: Optional[str] = Field(
+        default=None,
+        description="Stdout must not match this regex.",
+    )
+    ignore_case: bool = Field(
+        default=False,
+        description="Case-insensitive matching for substring and regex checks.",
+    )
+    match_mode: MatchMode = Field(
+        default="full",
+        description="How to apply regex checks: full output, any line, or all non-empty lines.",
+    )
+    min_lines: Optional[int] = Field(default=None, ge=0)
+    max_lines: Optional[int] = Field(default=None, ge=0)
+    exact_lines: Optional[int] = Field(default=None, ge=0)
+    value_type: ValueType = Field(
+        default="int",
+        description="Type used when parsing stdout for expected_value checks.",
+    )
+    compare_op: CompareOp = Field(
+        default="==",
+        description="Comparison operator for expected_value checks.",
+    )
+    expected_value: Optional[Union[int, float, str]] = Field(
+        default=None,
+        description="Value to compare against parsed stdout.",
+    )
+    capture_regex: Optional[str] = Field(
+        default=None,
+        description="Optional regex with a capture group used before expected_value comparison.",
+    )
+
+    @model_validator(mode="after")
+    def _validate_name(self) -> "CommandCheck":
+        if not self.name:
+            raise ValueError("name must not be empty")
+        return self
+
+
+class GenericAnalyzerArgs(AnalyzerArgs):
+    checks: list[CommandCheck] = Field(
+        default_factory=list,
+        description="Per-command validation rules keyed by collected command name.",
+    )
+
+    @model_validator(mode="after")
+    def _validate_unique_check_names(self) -> "GenericAnalyzerArgs":
+        seen: set[str] = set()
+        duplicates: set[str] = set()
+        for check in self.checks:
+            if check.name in seen:
+                duplicates.add(check.name)
+            seen.add(check.name)
+        if duplicates:
+            raise ValueError(f"Duplicate check name(s): {sorted(duplicates)}")
+        return self
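Review note: the `value_type`, `compare_op`, `expected_value`, and `capture_regex` fields above only make sense together, so a standalone sketch of how they might combine is useful. The code below is an illustration that uses only the standard library; `parse_value` and `value_check` are hypothetical names chosen for this note, not part of the patch, and the behavior is inferred from the field descriptions rather than taken from the analyzer itself.

```python
import operator
import re
from typing import Optional, Union

Value = Union[int, float, str]

# Operator table keyed by the same strings as the CompareOp literal.
COMPARE_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def parse_value(raw: str, value_type: str = "int", capture_regex: Optional[str] = None) -> Value:
    """Strip stdout, optionally narrow it with a regex, then coerce to value_type."""
    value = raw.strip()
    if capture_regex:
        match = re.search(capture_regex, value)
        if match is None:
            raise ValueError(f"capture_regex {capture_regex!r} did not match stdout")
        # Prefer the first capture group; fall back to the whole match.
        value = (match.group(1) if match.groups() else match.group(0)).strip()
    if value_type == "int":
        return int(value)
    if value_type == "float":
        return float(value)
    return value


def value_check(
    stdout: str,
    expected_value: Value,
    compare_op: str = "==",
    value_type: str = "int",
    capture_regex: Optional[str] = None,
) -> bool:
    """Hypothetical helper: True when the parsed stdout satisfies the comparison."""
    parsed = parse_value(stdout, value_type, capture_regex)
    return COMPARE_OPS[compare_op](parsed, expected_value)
```

With this sketch, `value_check("MemTotal: 65536 kB\n", 32768, ">=", capture_regex=r"MemTotal:\s+(\d+)")` holds. Note that `value_type="str"` only compares sensibly against a string `expected_value`: equality against an int is simply false, and ordering operators on mixed types raise `TypeError`.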
diff --git a/nodescraper/plugins/generic_collection/collector_args.py b/nodescraper/plugins/generic_collection/collector_args.py
new file mode 100644
index 00000000..fae58edf
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/collector_args.py
@@ -0,0 +1,136 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Optional
+
+from pydantic import Field, field_validator, model_validator
+
+from nodescraper.models import CollectorArgs
+
+
+class CommandSpec(CollectorArgs):
+    """One named shell command and optional per-command overrides."""
+
+    name: str = Field(description="Stable name for this command, used by analysis checks.")
+    command: str = Field(
+        description=(
+            "Shell command to run on the target (non-interactive SSH exec). "
+            "Intended for small, text-oriented output. Avoid commands that stream large or "
+            "binary data to stdout (e.g. ``tar czf - ``): stdout is read fully into "
+            "memory and decoded as UTF-8 when captured. For BMC directory archives over SSH, "
+            "use ``OobBmcArchivePlugin`` instead."
+        ),
+    )
+    sudo: Optional[bool] = Field(
+        default=None,
+        description="Run with sudo. When omitted, uses collection_args.sudo.",
+    )
+    timeout: Optional[int] = Field(
+        default=None,
+        ge=1,
+        description="Command timeout in seconds. When omitted, uses collection_args.timeout.",
+    )
+    include_stdout: Optional[bool] = Field(
+        default=None,
+        description=(
+            "Store stdout in the data model. When omitted, uses collection_args.include_stdout. "
+            "When false, stdout is omitted from stored results (the SSH layer may still read it "
+            "for that command)."
+        ),
+    )
+
+    @field_validator("name", "command", mode="before")
+    @classmethod
+    def _strip_required_text(cls, value: object) -> object:
+        if isinstance(value, str):
+            return value.strip()
+        return value
+
+    @model_validator(mode="after")
+    def _validate_required_fields(self) -> "CommandSpec":
+        if not self.name:
+            raise ValueError("name must not be empty")
+        if not self.command:
+            raise ValueError("command must not be empty")
+        return self
+
+
+class GenericCollectionCollectorArgs(CollectorArgs):
+    """Arguments for :class:`GenericCollectionCollector` / ``GenericCollectionPlugin``.
+
+    Commands are run over SSH; **stdout and stderr are read fully into memory** on the
+    controller and decoded as UTF-8 (with replacement) when stdout is captured. This plugin
+    is meant for **short, text-oriented diagnostics** (logs, version strings, small file
+    snippets). It is **not** appropriate for large or binary streams, for example
+    ``tar czf - ``, which writes a tarball to stdout: you risk huge memory use, timeouts,
+    and corrupt binary data. For archiving BMC paths over SSH, use **``OobBmcArchivePlugin``**.
+    """
+
+    commands: list[CommandSpec] = Field(
+        default_factory=list,
+        description=(
+            "Named commands to run. Each entry must include 'name' and 'command'. "
+            "Prefer small textual stdout; see class docstring for streaming/binary limitations."
+        ),
+    )
+    sudo: bool = Field(
+        default=False,
+        description="Default sudo setting for commands that do not specify sudo.",
+    )
+    timeout: int = Field(
+        default=300,
+        ge=1,
+        description="Default per-command timeout in seconds.",
+    )
+    include_stdout: bool = Field(
+        default=True,
+        description=(
+            "Default: include each command's stdout in collected results for analysis. "
+            "When false, stdout is omitted from stored results (not a substitute for avoiding "
+            "large/binary streams; see class docstring)."
+        ),
+    )
+
+    @field_validator("commands", mode="before")
+    @classmethod
+    def _reject_plain_string_commands(cls, value: Optional[list[object]]) -> object:
+        if not value:
+            return []
+        for item in value:
+            if isinstance(item, str):
+                raise ValueError("Each command must be an object with 'name' and 'command' fields")
+        return value
+
+    @model_validator(mode="after")
+    def _validate_unique_command_names(self) -> "GenericCollectionCollectorArgs":
+        seen: set[str] = set()
+        duplicates: set[str] = set()
+        for cmd in self.commands:
+            if cmd.name in seen:
+                duplicates.add(cmd.name)
+            seen.add(cmd.name)
+        if duplicates:
+            raise ValueError(f"Duplicate command name(s): {sorted(duplicates)}")
+        return self
diff --git a/nodescraper/plugins/generic_collection/generic_analyzer.py b/nodescraper/plugins/generic_collection/generic_analyzer.py
new file mode 100644
index 00000000..ce911276
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/generic_analyzer.py
@@ -0,0 +1,266 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+import operator
+import re
+from typing import Callable, Optional, Union
+
+from nodescraper.enums import EventCategory, EventPriority, ExecutionStatus
+from nodescraper.interfaces import DataAnalyzer
+from nodescraper.models import TaskResult
+
+from .analyzer_args import CommandCheck, CompareOp, GenericAnalyzerArgs
+from .generic_collection_data import CommandCollectionResult, GenericCollectionDataModel
+
+_COMPARE_OPS: dict[CompareOp, Callable[[Union[int, float, str], Union[int, float, str]], bool]] = {
+    "==": operator.eq,
+    "!=": operator.ne,
+    ">": operator.gt,
+    ">=": operator.ge,
+    "<": operator.lt,
+    "<=": operator.le,
+}
+
+
+def _as_list(value: Union[str, list[str]]) -> list[str]:
+    if isinstance(value, str):
+        return [value]
+    return list(value)
+
+
+def _non_empty_lines(stdout: str) -> list[str]:
+    return [line for line in stdout.splitlines() if line.strip()]
+
+
+def _needs_stdout(check: CommandCheck) -> bool:
+    return any(
+        [
+            check.must_contain is not None,
+            check.must_not_contain is not None,
+            check.expected is not None,
+            check.expected_in is not None,
+            check.expected_regex is not None,
+            check.forbidden_regex is not None,
+            check.min_lines is not None,
+            check.max_lines is not None,
+            check.exact_lines is not None,
+            check.expected_value is not None,
+        ]
+    )
+
+
+def _check_label(check: CommandCheck) -> str:
+    return check.name
+
+
+def _find_result(
+    data: GenericCollectionDataModel, check: CommandCheck
+) -> Optional[CommandCollectionResult]:
+    for result in data.results:
+        if result.name == check.name:
+            return result
+    return None
+
+
+def _regex_flags(ignore_case: bool) -> int:
+    return re.IGNORECASE if ignore_case else 0
+
+
+def _regex_matches(pattern: str, text: str, mode: str, ignore_case: bool) -> bool:
+    flags = _regex_flags(ignore_case)
+    compiled = re.compile(pattern, flags)
+    if mode == "full":
+        return compiled.search(text) is not None
+    lines = _non_empty_lines(text)
+    if not lines:
+        return False
+    if mode == "any_line":
+        return any(compiled.search(line) for line in lines)
+    return all(compiled.search(line) for line in lines)
+
+
+def _contains(text: str, needle: str, ignore_case: bool) -> bool:
+    if ignore_case:
+        return needle.lower() in text.lower()
+    return needle in text
+
+
+def _parse_value(raw: str, value_type: str, capture_regex: Optional[str]) -> Union[int, float, str]:
+    value = raw.strip()
+    if capture_regex:
+        match = re.search(capture_regex, value)
+        if not match:
+            raise ValueError(f"capture_regex {capture_regex!r} did not match stdout")
+        value = match.group(1) if match.groups() else match.group(0)
+        value = value.strip()
+    if value_type == "int":
+        return int(value)
+    if value_type == "float":
+        return float(value)
+    return value
+
+
+class GenericAnalyzer(DataAnalyzer[GenericCollectionDataModel, GenericAnalyzerArgs]):
+    """Validate generic collection command results against analysis_args checks."""
+
+    DATA_MODEL = GenericCollectionDataModel
+
+    def analyze_data(
+        self,
+        data: GenericCollectionDataModel,
+        args: Optional[GenericAnalyzerArgs] = None,
+    ) -> TaskResult:
+        if args is None:
+            args = GenericAnalyzerArgs()
+
+        if not data.results:
+            self.result.message = "No command results to analyze"
+            self.result.status = ExecutionStatus.NOT_RAN
+            return self.result
+
+        if not args.checks:
+            success_count = sum(1 for result in data.results if result.success)
+            self.result.message = (
+                f"Generic analysis: {success_count}/{len(data.results)} commands collected"
+            )
+            self.result.status = ExecutionStatus.OK
+            return self.result
+
+        failures: list[str] = []
+        failed_check_count = 0
+        for check in args.checks:
+            label = _check_label(check)
+            result = _find_result(data, check)
+            if result is None:
+                failures.append(f"{label}: no matching collected command")
+                failed_check_count += 1
+                continue
+
+            check_failures = self._evaluate_check(check, result)
+            if check_failures:
+                failed_check_count += 1
+                failures.extend(f"{label}: {msg}" for msg in check_failures)
+                self._log_event(
+                    category=EventCategory.RUNTIME,
+                    description=f"Check failed: {label}",
+                    data={
+                        "failures": check_failures,
+                        "name": result.name,
+                        "command": result.command,
+                    },
+                    priority=EventPriority.ERROR,
+                    console_log=True,
+                )
+            else:
+                self._log_event(
+                    category=EventCategory.RUNTIME,
+                    description=f"Check passed: {label}",
+                    data={"name": result.name, "command": result.command},
+                    priority=EventPriority.INFO,
+                )
+
+        if failed_check_count:
+            passed = len(args.checks) - failed_check_count
+            self.result.message = f"Generic analysis: {passed}/{len(args.checks)} checks passed"
+            self.result.status = ExecutionStatus.ERROR
+            return self.result
+
+        self.result.message = (
+            f"Generic analysis: {len(args.checks)}/{len(args.checks)} checks passed"
+        )
+        self.result.status = ExecutionStatus.OK
+        return self.result
+
+    def _evaluate_check(self, check: CommandCheck, result: CommandCollectionResult) -> list[str]:
+        failures: list[str] = []
+
+        if not result.success and not check.allow_failure:
+            failures.append(f"command failed with exit code {result.exit_code}")
+            if not _needs_stdout(check) and check.expected_exit_code is None:
+                return failures
+
+        expected_exit_code = check.expected_exit_code
+        if expected_exit_code is None and _needs_stdout(check):
+            expected_exit_code = 0
+        if expected_exit_code is not None and result.exit_code != expected_exit_code:
+            failures.append(f"expected exit code {expected_exit_code}, got {result.exit_code}")
+
+        if not _needs_stdout(check):
+            return failures
+
+        if result.stdout is None:
+            failures.append("stdout not collected; set include_stdout on the command")
+            return failures
+
+        stdout = result.stdout
+        stripped = stdout.strip()
+
+        for needle in _as_list(check.must_contain or []):
+            if not _contains(stdout, needle, check.ignore_case):
+                failures.append(f"must_contain {needle!r} not found")
+
+        for needle in _as_list(check.must_not_contain or []):
+            if _contains(stdout, needle, check.ignore_case):
+                failures.append(f"must_not_contain {needle!r} found")
+
+        if check.expected is not None and stripped != check.expected:
+            failures.append(f"expected exact stdout {check.expected!r}, got {stripped!r}")
+
+        if check.expected_in is not None and stripped not in check.expected_in:
+            failures.append(f"stdout {stripped!r} not in expected_in")
+
+        if check.expected_regex is not None and not _regex_matches(
+            check.expected_regex, stdout, check.match_mode, check.ignore_case
+        ):
+            failures.append(f"expected_regex {check.expected_regex!r} did not match")
+
+        if check.forbidden_regex is not None and _regex_matches(
+            check.forbidden_regex, stdout, check.match_mode, check.ignore_case
+        ):
+            failures.append(f"forbidden_regex {check.forbidden_regex!r} matched")
+
+        line_count = len(_non_empty_lines(stdout))
+        if check.exact_lines is not None and line_count != check.exact_lines:
+            failures.append(f"expected {check.exact_lines} non-empty lines, got {line_count}")
+        if check.min_lines is not None and line_count < check.min_lines:
+            failures.append(
+                f"expected at least {check.min_lines} non-empty lines, got {line_count}"
+            )
+        if check.max_lines is not None and line_count > check.max_lines:
+            failures.append(f"expected at most {check.max_lines} non-empty lines, got {line_count}")
+
+        if check.expected_value is not None:
+            try:
+                parsed = _parse_value(stripped, check.value_type, check.capture_regex)
+                ok = _COMPARE_OPS[check.compare_op](parsed, check.expected_value)
+            except (TypeError, ValueError) as exc:
+                failures.append(str(exc))
+            else:
+                if not ok:
+                    failures.append(
+                        f"expected parsed value {parsed!r} {check.compare_op} {check.expected_value!r}"
+                    )
+
+        return failures
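Review note: the three `match_mode` values are easy to misread, so here is a standalone sketch of the semantics that `_regex_matches` appears to implement. It mirrors the helper using only the standard library; the names `regex_matches` and `non_empty_lines` are stand-ins for this note and the sample output is invented.

```python
import re
from typing import List


def non_empty_lines(stdout: str) -> List[str]:
    """Return the lines of stdout that contain non-whitespace characters."""
    return [line for line in stdout.splitlines() if line.strip()]


def regex_matches(pattern: str, text: str, mode: str = "full", ignore_case: bool = False) -> bool:
    """Apply a regex to the whole output, to any line, or to every non-empty line."""
    compiled = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    if mode == "full":
        # "full" means the whole output is searched as one string;
        # it is re.search, not re.fullmatch, and re.MULTILINE is not set.
        return compiled.search(text) is not None
    lines = non_empty_lines(text)
    if not lines:
        # Line-based modes never match empty output.
        return False
    if mode == "any_line":
        return any(compiled.search(line) for line in lines)
    return all(compiled.search(line) for line in lines)


SAMPLE = "gpu0: ok\ngpu1: FAIL\n\ngpu2: ok\n"  # invented output for illustration
```

Under these assumptions an anchored pattern such as `^gpu\d: ok$` fails in `full` mode on `SAMPLE` (the anchors refer to the whole string), succeeds in `any_line`, and fails in `all_lines`. Because line-based modes return false on empty output, an `expected_regex` check fails there while a `forbidden_regex` check passes.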
diff --git a/nodescraper/plugins/generic_collection/generic_collection_collector.py b/nodescraper/plugins/generic_collection/generic_collection_collector.py
new file mode 100644
index 00000000..873f572a
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/generic_collection_collector.py
@@ -0,0 +1,121 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Optional
+
+from nodescraper.base import InBandDataCollector
+from nodescraper.enums import EventCategory, EventPriority, ExecutionStatus, OSFamily
+from nodescraper.models import TaskResult
+
+from .collector_args import GenericCollectionCollectorArgs
+from .generic_collection_data import CommandCollectionResult, GenericCollectionDataModel
+
+
+class GenericCollectionCollector(
+    InBandDataCollector[GenericCollectionDataModel, GenericCollectionCollectorArgs]
+):
+    """Run user-configured shell commands and report per-command success."""
+
+    DATA_MODEL = GenericCollectionDataModel
+    SUPPORTED_OS_FAMILY: set[OSFamily] = {OSFamily.WINDOWS, OSFamily.LINUX, OSFamily.UNKNOWN}
+
+    def collect_data(
+        self, args: Optional[GenericCollectionCollectorArgs] = None
+    ) -> tuple[TaskResult, Optional[GenericCollectionDataModel]]:
+        if args is None:
+            args = GenericCollectionCollectorArgs()
+
+        if not args.commands:
+            self.result.message = "No commands configured"
+            self.result.status = ExecutionStatus.NOT_RAN
+            return self.result, None
+
+        results: list[CommandCollectionResult] = []
+        for cmd_spec in args.commands:
+            command = cmd_spec.command.strip()
+            if not command:
+                continue
+
+            sudo = cmd_spec.sudo if cmd_spec.sudo is not None else args.sudo
+            timeout = cmd_spec.timeout if cmd_spec.timeout is not None else args.timeout
+            include_stdout = (
+                cmd_spec.include_stdout
+                if cmd_spec.include_stdout is not None
+                else args.include_stdout
+            )
+            res = self._run_sut_cmd(
+                command,
+                sudo=sudo,
+                timeout=timeout,
+            )
+            success = res.exit_code == 0
+            cmd_result = CommandCollectionResult(
+                name=cmd_spec.name,
+                command=command,
+                success=success,
+                exit_code=res.exit_code,
+                sudo=sudo,
+                stdout=res.stdout if include_stdout else None,
+                stderr=res.stderr or None,
+            )
+            results.append(cmd_result)
+
+            if success:
+                self._log_event(
+                    category=EventCategory.RUNTIME,
+                    description=f"Command succeeded: {command!r}",
+                    data={
+                        "name": cmd_spec.name,
+                        "command": command,
+                        "exit_code": res.exit_code,
+                        "sudo": sudo,
+                    },
+                    priority=EventPriority.INFO,
+                )
+            else:
+                self._log_event(
+                    category=EventCategory.RUNTIME,
+                    description=f"Command failed: {command!r}",
+                    data={
+                        "name": cmd_spec.name,
+                        "command": command,
+                        "exit_code": res.exit_code,
+                        "sudo": sudo,
+                        "stderr": res.stderr,
+                    },
+                    priority=EventPriority.ERROR,
+                    console_log=True,
+                )
+
+        if not results:
+            self.result.message = "No commands configured"
+            self.result.status = ExecutionStatus.NOT_RAN
+            return self.result, None
+
+        success_count = sum(1 for result in results if result.success)
+        total = len(results)
+        self.result.message = f"Generic collection: {success_count}/{total} commands succeeded"
+        self.result.status = ExecutionStatus.OK if success_count == total else ExecutionStatus.ERROR
+        return self.result, GenericCollectionDataModel(results=results)
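Review note: the collector resolves `sudo`, `timeout`, and `include_stdout` per command with an explicit `is not None` test, which is what lets a command set `sudo=False` under a plugin-wide `sudo=True`. The sketch below isolates that fallback rule with plain dataclasses; `Defaults`, `Spec`, and `resolve` are hypothetical stand-ins for the pydantic models in this patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defaults:
    """Stand-in for the plugin-wide collection args (defaults copied from the patch)."""
    sudo: bool = False
    timeout: int = 300
    include_stdout: bool = True


@dataclass
class Spec:
    """Stand-in for one command entry; None means 'inherit the default'."""
    name: str
    command: str
    sudo: Optional[bool] = None
    timeout: Optional[int] = None
    include_stdout: Optional[bool] = None


def resolve(spec: Spec, defaults: Defaults) -> dict:
    """Pick the per-command value when it is set, otherwise the default."""
    return {
        # `spec.sudo or defaults.sudo` would be wrong here: an explicit
        # False would be silently replaced by a True default.
        "sudo": spec.sudo if spec.sudo is not None else defaults.sudo,
        "timeout": spec.timeout if spec.timeout is not None else defaults.timeout,
        "include_stdout": (
            spec.include_stdout
            if spec.include_stdout is not None
            else defaults.include_stdout
        ),
    }
```

For example, with `Defaults(sudo=True)`, a `Spec("kernel", "uname -r", sudo=False)` resolves to `sudo=False`, while a spec that leaves `sudo` unset inherits `True`.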
diff --git a/nodescraper/plugins/generic_collection/generic_collection_data.py b/nodescraper/plugins/generic_collection/generic_collection_data.py
new file mode 100644
index 00000000..3421df96
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/generic_collection_data.py
@@ -0,0 +1,48 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Optional
+
+from pydantic import Field
+
+from nodescraper.models import DataModel
+
+
+class CommandCollectionResult(DataModel):
+    """Outcome of running one configured shell command."""
+
+    command: str
+    name: str
+    success: bool
+    exit_code: int
+    sudo: bool = False
+    stdout: Optional[str] = None
+    stderr: Optional[str] = None
+
+
+class GenericCollectionDataModel(DataModel):
+    """Results for each command configured in collection_args."""
+
+    results: list[CommandCollectionResult] = Field(default_factory=list)
diff --git a/nodescraper/plugins/generic_collection/generic_collection_plugin_mixin.py b/nodescraper/plugins/generic_collection/generic_collection_plugin_mixin.py
new file mode 100644
index 00000000..2a91c644
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/generic_collection_plugin_mixin.py
@@ -0,0 +1,52 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+"""Shared plugin wiring for in-band and OOB generic command collection plugins."""
+
+from typing import Optional, Type
+
+from nodescraper.interfaces.dataanalyzertask import DataAnalyzer
+from nodescraper.interfaces.dataplugin import CollectorArgsClasses, CollectorClasses
+from nodescraper.models import AnalyzerArgs
+
+from .analyzer_args import GenericAnalyzerArgs
+from .collector_args import GenericCollectionCollectorArgs
+from .generic_analyzer import GenericAnalyzer
+from .generic_collection_collector import GenericCollectionCollector
+from .generic_collection_data import GenericCollectionDataModel
+
+
+class GenericCollectionPluginMixin:
+    """Collector, analyzer, and args shared by GenericCollectionPlugin variants."""
+
+    DATA_MODEL: Type[GenericCollectionDataModel] = GenericCollectionDataModel
+
+    COLLECTOR: Optional[CollectorClasses] = GenericCollectionCollector
+
+    COLLECTOR_ARGS: Optional[CollectorArgsClasses] = GenericCollectionCollectorArgs
+
+    ANALYZER: Optional[Type[DataAnalyzer]] = GenericAnalyzer
+
+    ANALYZER_ARGS: Optional[Type[AnalyzerArgs]] = GenericAnalyzerArgs
diff --git a/nodescraper/plugins/generic_collection/inband_plugin.py b/nodescraper/plugins/generic_collection/inband_plugin.py
new file mode 100644
index 00000000..f00ee138
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/inband_plugin.py
@@ -0,0 +1,42 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from nodescraper.base import InBandDataPlugin
+
+from .analyzer_args import GenericAnalyzerArgs
+from .collector_args import GenericCollectionCollectorArgs
+from .generic_collection_data import GenericCollectionDataModel
+from .generic_collection_plugin_mixin import GenericCollectionPluginMixin
+
+
+class GenericCollectionPlugin(
+    GenericCollectionPluginMixin,
+    InBandDataPlugin[
+        GenericCollectionDataModel,
+        GenericCollectionCollectorArgs,
+        GenericAnalyzerArgs,
+    ],
+):
+    """Run arbitrary shell commands on the host via in-band SSH and validate results."""
diff --git a/nodescraper/plugins/generic_collection/oob_plugin.py b/nodescraper/plugins/generic_collection/oob_plugin.py
new file mode 100644
index 00000000..e2451600
--- /dev/null
+++ b/nodescraper/plugins/generic_collection/oob_plugin.py
@@ -0,0 +1,42 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from nodescraper.base import OOBSSHDataPlugin
+
+from .analyzer_args import GenericAnalyzerArgs
+from .collector_args import GenericCollectionCollectorArgs
+from .generic_collection_data import GenericCollectionDataModel
+from .generic_collection_plugin_mixin import GenericCollectionPluginMixin
+
+
+class OobGenericCollectionPlugin(
+ GenericCollectionPluginMixin,
+ OOBSSHDataPlugin[
+ GenericCollectionDataModel,
+ GenericCollectionCollectorArgs,
+ GenericAnalyzerArgs,
+ ],
+):
+ """Run arbitrary shell commands on the BMC via OOB SSH and validate results."""
diff --git a/nodescraper/plugins/inband/dmesg/dmesg_analyzer.py b/nodescraper/plugins/inband/dmesg/dmesg_analyzer.py
index cbc14a81..68e77702 100644
--- a/nodescraper/plugins/inband/dmesg/dmesg_analyzer.py
+++ b/nodescraper/plugins/inband/dmesg/dmesg_analyzer.py
@@ -247,6 +247,7 @@ class DmesgAnalyzer(RegexAnalyzer[DmesgData, DmesgAnalyzerArgs]):
),
message="RAS Correctable Error",
event_category=EventCategory.RAS,
+ event_priority=EventPriority.WARNING,
),
ErrorRegex(
regex=re.compile(
@@ -270,6 +271,7 @@ class DmesgAnalyzer(RegexAnalyzer[DmesgData, DmesgAnalyzerArgs]):
),
message="RAS Corrected PCIe Error",
event_category=EventCategory.RAS,
+ event_priority=EventPriority.WARNING,
),
ErrorRegex(
regex=re.compile(r"(?:\d{4}-\d+-\d+T\d+:\d+:\d+,\d+[+-]\d+:\d+)?(.*GPU reset begin.*)"),
@@ -334,8 +336,18 @@ class DmesgAnalyzer(RegexAnalyzer[DmesgData, DmesgAnalyzerArgs]):
event_category=EventCategory.RAS,
),
ErrorRegex(
- regex=re.compile(r"\[Hardware Error\]:.+MC\d+_STATUS.*(?:\n.*){0,5}"),
- message="MCE Error",
+ regex=re.compile(
+ r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|CE\|[^\]]*\].*(?:\n.*){0,5}"
+ ),
+ message="MCE Corrected Error",
+ event_category=EventCategory.RAS,
+ event_priority=EventPriority.WARNING,
+ ),
+ ErrorRegex(
+ regex=re.compile(
+ r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|UC\|[^\]]*\].*(?:\n.*){0,5}"
+ ),
+ message="MCE Uncorrected Error",
event_category=EventCategory.RAS,
),
ErrorRegex(
@@ -351,6 +363,7 @@ class DmesgAnalyzer(RegexAnalyzer[DmesgData, DmesgAnalyzerArgs]):
),
message="RAS Corrected Error",
event_category=EventCategory.RAS,
+ event_priority=EventPriority.WARNING,
),
ErrorRegex(
regex=re.compile(r"x86/cpu: SGX disabled by BIOS"),
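Reviewer note: the split of the former "MCE Error" pattern into corrected and uncorrected variants can be checked in isolation. The sketch below copies the two patterns from the hunk above; the sample log lines and the `classify` helper are illustrative assumptions, not part of the change. It also suggests that an `MCn_STATUS` line without the bracketed flag list no longer matches either pattern, which the old catch-all regex would have matched.

```python
import re

# Patterns copied from the DmesgAnalyzer hunk above.
MCE_CE = re.compile(
    r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|CE\|[^\]]*\].*(?:\n.*){0,5}"
)
MCE_UC = re.compile(
    r"\[Hardware Error\]:.+MC\d+_STATUS\[[^\]]*\|UC\|[^\]]*\].*(?:\n.*){0,5}"
)

# Illustrative sample lines (assumed, modelled on decoded MCE output).
CORRECTED = (
    "[Hardware Error]: CPU:0 (19:1:1) "
    "MC17_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b"
)
UNCORRECTED = (
    "[Hardware Error]: CPU:2 (19:1:1) "
    "MC0_STATUS[-|UC|AddrV|PCC|-|-|-|-|-|-|-]: 0xb600000000010135"
)
UNDECODED = "[Hardware Error]: CPU:1 MC4_STATUS: 0x0"


def classify(line):
    """Hypothetical helper: return the message a matching pattern would carry."""
    if MCE_CE.search(line):
        return "MCE Corrected Error"
    if MCE_UC.search(line):
        return "MCE Uncorrected Error"
    return None
```

Because both patterns require a `|` on each side of the flag, they assume `CE` or `UC` is never the first or last field inside the brackets.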
diff --git a/nodescraper/plugins/inband/pcie/pcie_collector.py b/nodescraper/plugins/inband/pcie/pcie_collector.py
index d4c2108a..eb3bb5f7 100755
--- a/nodescraper/plugins/inband/pcie/pcie_collector.py
+++ b/nodescraper/plugins/inband/pcie/pcie_collector.py
@@ -66,11 +66,13 @@ class PcieCollector(InBandDataCollector[PcieDataModel, None]):
- `lspci -vvv` : Verbose collection of PCIe data
- `lspci -vvvt`: Verbose tree view of PCIe data
- `lspci -PP`: Path view of PCIe data for the GPUs
+ - `lspci -PP -D`: Path view of PCIe data for the GPUs (with domain prefix)
- If system interaction level is set to STANDARD or higher, the following commands will be run with sudo:
- `lspci -xxxx`: Hex view of PCIe data for the GPUs
- otherwise the following commands will be run without sudo:
- `lspci -x`: Hex view of PCIe data for the GPUs
- - `lspci -d :` : Count the number of GPUs in the system with this command
+ - `lspci -d :` : Detect AMD GPU device IDs in the system
+ - `lspci -PP -D -d :` : Upstream BDF path for GPUs (with domain prefix)
- If system interaction level is set to STANDARD or higher, the following commands will be run with sudo:
- The sudo lspci -xxxx command is used to collect the PCIe configuration space for the GPUs in the system
- otherwise the following commands will be run without sudo:
@@ -85,10 +87,12 @@ class PcieCollector(InBandDataCollector[PcieDataModel, None]):
CMD_LSPCI_VERBOSE = "lspci -vvv"
CMD_LSPCI_VERBOSE_TREE = "lspci -vvvt"
CMD_LSPCI_PATH = "lspci -PP"
+ CMD_LSPCI_PATH_DOMAIN = "lspci -PP -D"
CMD_LSPCI_HEX_SUDO = "lspci -xxxx"
CMD_LSPCI_HEX = "lspci -x"
CMD_LSPCI_AMD_DEVICES = "lspci -d {vendor_id}: -nn"
CMD_LSPCI_PATH_DEVICE = "lspci -PP -d {vendor_id}:{dev_id}"
+ CMD_LSPCI_PATH_DEVICE_DOMAIN = "lspci -PP -D -d {vendor_id}:{dev_id}"
def _detect_amd_device_ids(self) -> dict[str, list[str]]:
"""Detect AMD GPU device IDs from the system using lspci.
@@ -149,6 +153,10 @@ def show_lspci_path(self, sudo=True) -> Optional[str]:
"""Show lspci with -PP."""
return self._run_os_cmd(self.CMD_LSPCI_PATH, sudo=sudo)
+ def show_lspci_path_domain(self, sudo=True) -> Optional[str]:
+ """Show lspci with -PP -D (path view with domain prefix)."""
+ return self._run_os_cmd(self.CMD_LSPCI_PATH_DOMAIN, sudo=sudo)
+
def show_lspci_hex(self, bdf: Optional[str] = None, sudo=True) -> Optional[str]:
"""Show lspci with -xxxx."""
if sudo:
@@ -208,7 +216,10 @@ def _get_upstream_bdf_from_buspath(
"""
split_bdf_pos = 0
- bus_path_all_gpus = self._run_os_cmd(f"lspci -PP -d {vendor_id}:{dev_id}", sudo=sudo)
+ bus_path_all_gpus = self._run_os_cmd(
+ self.CMD_LSPCI_PATH_DEVICE_DOMAIN.format(vendor_id=vendor_id, dev_id=dev_id),
+ sudo=sudo,
+ )
if bus_path_all_gpus is None or bus_path_all_gpus == "":
self._log_event(
category=EventCategory.IO,
@@ -220,6 +231,16 @@ def _get_upstream_bdf_from_buspath(
upstream_bdfs: Dict[str, List[str]] = {}
for bus_path in bus_path_all_gpus.splitlines():
bus_path_list = (bus_path.split(" ")[split_bdf_pos]).split("/")
+ # With -D, only the first path component carries the domain prefix (e.g. 0001:00:01.1).
+ # Propagate it to all downstream bare BDFs so config space reads hit the correct domain.
+ domain = "0000"
+ for component in bus_path_list:
+ if component.count(":") == 2:
+ domain = component.split(":")[0]
+ break
+ bus_path_list = [
+ f"{domain}:{bdf}" if bdf.count(":") == 1 else bdf for bdf in bus_path_list
+ ]
if upstream_steps_limit is not None and len(bus_path_list) < upstream_steps_limit + 1:
# We don't have enough upstream devices to collect
self._log_event(
@@ -547,6 +568,7 @@ def get_pcie_cfg(
def _log_pcie_artifacts(
self,
lspci_pp: Optional[str],
+ lspci_pp_d: Optional[str],
lspci_hex: Optional[str],
lspci_verbose_tree: Optional[str],
lspci_verbose: Optional[str],
@@ -557,6 +579,7 @@ def _log_pcie_artifacts(
"lspci_verbose_tree.txt": lspci_verbose_tree,
"lspci_verbose.txt": lspci_verbose,
"lspci_pp.txt": lspci_pp,
+ "lspci_pp_d.txt": lspci_pp_d,
}
for name, data in name_log_map.items():
if data is not None:
@@ -626,8 +649,10 @@ def _get_pcie_data(
lspci_verbose = self.show_lspci_verbose(sudo=use_sudo)
lspci_verbose_tree = self.show_lspci_verbose_tree(sudo=use_sudo)
lspci_path = self.show_lspci_path(sudo=use_sudo)
+ lspci_path_domain = self.show_lspci_path_domain(sudo=use_sudo)
self._log_pcie_artifacts(
lspci_pp=lspci_path,
+ lspci_pp_d=lspci_path_domain,
lspci_hex=lspci_hex,
lspci_verbose_tree=lspci_verbose_tree,
lspci_verbose=lspci_verbose,
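Reviewer note: the domain propagation added to `_get_upstream_bdf_from_buspath` can be exercised on its own. The sketch below lifts that loop into a standalone function; the name `propagate_domain` and the sample bus paths are assumptions made for illustration, and the logic mirrors the hunk rather than the full collector.

```python
def propagate_domain(bus_path: str) -> list[str]:
    """Prefix every bare bus:dev.fn component with the domain found in the path.

    Mirrors the loop in the hunk above: the first component with two colons
    supplies the domain, and "0000" is used when no component carries one.
    """
    components = bus_path.split("/")
    domain = "0000"
    for component in components:
        if component.count(":") == 2:
            domain = component.split(":")[0]
            break
    return [
        f"{domain}:{bdf}" if bdf.count(":") == 1 else bdf for bdf in components
    ]
```

For example, `propagate_domain("0001:00:01.1/01:00.0/02:00.0")` yields fully qualified BDFs in domain `0001`, while a path with no domain prefix falls back to `0000`.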
diff --git a/nodescraper/plugins/inband/pcie/pcie_data.py b/nodescraper/plugins/inband/pcie/pcie_data.py
index 77ea0e1c..83a03403 100644
--- a/nodescraper/plugins/inband/pcie/pcie_data.py
+++ b/nodescraper/plugins/inband/pcie/pcie_data.py
@@ -2009,6 +2009,7 @@ class PcieDataModel(DataModel):
- lspci_verbose: Verbose collection of PCIe data
- lspci_verbose_tree: Tree view of PCIe data
- lspci_path: Path view of PCIe data for the GPUs
+ - lspci_path_domain: Path view of PCIe data for the GPUs (with domain prefix)
- lspci_hex: Hex view of PCIe data for the GPUs
"""
diff --git a/nodescraper/plugins/inband/sys_settings/collector_args.py b/nodescraper/plugins/inband/sys_settings/collector_args.py
index 207c46b3..7e49f228 100644
--- a/nodescraper/plugins/inband/sys_settings/collector_args.py
+++ b/nodescraper/plugins/inband/sys_settings/collector_args.py
@@ -23,10 +23,12 @@
# SOFTWARE.
#
###############################################################################
-from pydantic import BaseModel, Field
+from pydantic import Field
+from nodescraper.models import CollectorArgs
-class SysSettingsCollectorArgs(BaseModel):
+
+class SysSettingsCollectorArgs(CollectorArgs):
"""Collection args for SysSettingsCollector.
paths: sysfs paths to read (cat). If a path contains '*', collect with ls -l instead (e.g. class/net/*/device).
diff --git a/nodescraper/plugins/inband/syslog/syslog_collector.py b/nodescraper/plugins/inband/syslog/syslog_collector.py
index dc3f7838..18584949 100644
--- a/nodescraper/plugins/inband/syslog/syslog_collector.py
+++ b/nodescraper/plugins/inband/syslog/syslog_collector.py
@@ -42,16 +42,28 @@ class SyslogCollector(InBandDataCollector[SyslogData, None]):
DATA_MODEL = SyslogData
CMD = r"ls -1 /var/log/syslog* 2>/dev/null | grep -E '^/var/log/syslog(\.[0-9]+(\.gz)?)?$' || true"
+ CMD_MESSAGES = r"ls -1 /var/log/messages* 2>/dev/null | grep -E '^/var/log/messages(\.[0-9]+(\.gz)?)?$' || true"
+
+ def _list_log_paths(self, list_cmd: str) -> list[str]:
+ list_res = self._run_sut_cmd(list_cmd, sudo=True)
+ return [p.strip() for p in (list_res.stdout or "").splitlines() if p.strip()]
+
+ @staticmethod
+ def _log_stem(path: str) -> str:
+ if path.startswith("/var/log/messages"):
+ return "messages"
+ return "syslog"
def _collect_syslog_rotations(self) -> list[TextFileArtifact]:
ret = []
- list_res = self._run_sut_cmd(self.CMD, sudo=True)
- paths = [p.strip() for p in (list_res.stdout or "").splitlines() if p.strip()]
+ paths: list[str] = []
+ for list_cmd in (self.CMD, self.CMD_MESSAGES):
+ paths.extend(self._list_log_paths(list_cmd))
if not paths:
self._log_event(
category=EventCategory.OS,
- description="No /var/log/syslog files found (including rotations).",
- data={"list_exit_code": list_res.exit_code},
+ description="No /var/log/syslog or /var/log/messages files found (including rotations).",
+ data={},
priority=EventPriority.WARNING,
)
return []
@@ -60,11 +72,12 @@ def _collect_syslog_rotations(self) -> list[TextFileArtifact]:
collected = []
for p in paths:
qp = shell_quote(p)
+ stem = self._log_stem(p)
if p.endswith(".gz"):
cmd = f"gzip -dc {qp} 2>/dev/null || zcat {qp} 2>/dev/null"
res = self._run_sut_cmd(cmd, sudo=True, log_artifact=False)
if res.exit_code == 0 and res.stdout is not None:
- fname = nice_rotated_name(p, "syslog")
+ fname = nice_rotated_name(p, stem)
self.logger.info("Collected syslog log: %s", fname)
collected.append(TextFileArtifact(filename=fname, contents=res.stdout))
collected_logs.append(fname)
@@ -74,7 +87,7 @@ def _collect_syslog_rotations(self) -> list[TextFileArtifact]:
cmd = f"cat {qp}"
res = self._run_sut_cmd(cmd, sudo=True, log_artifact=False)
if res.exit_code == 0 and res.stdout is not None:
- fname = nice_rotated_name(p, "syslog")
+ fname = nice_rotated_name(p, stem)
self.logger.info("Collected syslog log: %s", fname)
collected_logs.append(fname)
collected.append(TextFileArtifact(filename=fname, contents=res.stdout))
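Reviewer note: the two `ls | grep -E` listings and the `_log_stem` helper decide which files are collected and how they are named. The sketch below approximates that filtering in Python for quick local checks. Merging both grep patterns into one `ROTATION_RE` and the `keep_rotations` helper are assumptions for illustration; the collector itself runs the shell commands on the target.

```python
import re

# Assumed merge of the CMD and CMD_MESSAGES grep patterns into one regex.
ROTATION_RE = re.compile(r"^/var/log/(syslog|messages)(\.[0-9]+(\.gz)?)?$")


def log_stem(path: str) -> str:
    """Same rule as SyslogCollector._log_stem in the hunk above."""
    if path.startswith("/var/log/messages"):
        return "messages"
    return "syslog"


def keep_rotations(paths: list[str]) -> list[str]:
    """Hypothetical helper: keep only the base log and numbered rotations."""
    return [p for p in paths if ROTATION_RE.match(p)]
```

Date-suffixed rotations such as `/var/log/messages-20240101` are excluded by this pattern, which may matter on distributions that rotate with `dateext`.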
diff --git a/nodescraper/plugins/ooband/bmc_archive/__init__.py b/nodescraper/plugins/ooband/bmc_archive/__init__.py
new file mode 100644
index 00000000..2861fb9b
--- /dev/null
+++ b/nodescraper/plugins/ooband/bmc_archive/__init__.py
@@ -0,0 +1,40 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+"""OOB BMC archive collection over SSH."""
+
+from .bmc_archive_collector import BmcArchiveCollector
+from .bmc_archive_data import ArchiveCollectionResult, BmcArchiveDataModel
+from .bmc_archive_plugin import OobBmcArchivePlugin
+from .collector_args import BmcArchiveCollectorArgs, PathSpec
+
+__all__ = [
+ "ArchiveCollectionResult",
+ "BmcArchiveCollector",
+ "BmcArchiveCollectorArgs",
+ "BmcArchiveDataModel",
+ "OobBmcArchivePlugin",
+ "PathSpec",
+]
diff --git a/nodescraper/plugins/ooband/bmc_archive/bmc_archive_collector.py b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_collector.py
new file mode 100644
index 00000000..547ba80d
--- /dev/null
+++ b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_collector.py
@@ -0,0 +1,351 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Optional
+
+from nodescraper.base import InBandDataCollector
+from nodescraper.connection.inband.inband import BaseFileArtifact, BinaryFileArtifact
+from nodescraper.enums import EventCategory, EventPriority, ExecutionStatus, OSFamily
+from nodescraper.models import TaskResult
+from nodescraper.utils import shell_quote
+
+from .bmc_archive_data import ArchiveCollectionResult, BmcArchiveDataModel
+from .collector_args import BmcArchiveCollectorArgs, PathSpec
+
+
+class BmcArchiveCollector(InBandDataCollector[BmcArchiveDataModel, BmcArchiveCollectorArgs]):
+ """Archive BMC directories over SSH by running tar czf - into a temporary remote file."""
+
+ DATA_MODEL = BmcArchiveDataModel
+ SUPPORTED_OS_FAMILY = {OSFamily.LINUX, OSFamily.UNKNOWN}
+
+ REMOTE_ARCHIVE_TEMPLATE = "/tmp/node_scraper_{name}.tar.gz"
+ # None until first probe in a run; collect_data resets so each collection re-probes.
+ _tar_ignore_failed_read_supported: Optional[bool] = None
+
+ def _remote_archive_path(self, name: str) -> str:
+ safe_name = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in name)
+ return self.REMOTE_ARCHIVE_TEMPLATE.format(name=safe_name)
+
+ def _remote_tar_supports_ignore_failed_read(self, *, sudo: bool, timeout: int) -> bool:
+ """Return True only if remote tar accepts GNU's --ignore-failed-read."""
+ cached = getattr(self, "_tar_ignore_failed_read_supported", None)
+ if cached is not None:
+ return cached
+ probe = self._run_sut_cmd(
+ "tar cf - --ignore-failed-read /dev/null",
+ sudo=sudo,
+ timeout=min(timeout, 60),
+ log_artifact=False,
+ )
+ stderr = (probe.stderr or "").lower()
+ if probe.exit_code == 0:
+ self._tar_ignore_failed_read_supported = True
+ return True
+ if any(
+ phrase in stderr
+ for phrase in (
+ "unrecognized option",
+ "invalid option",
+ "unknown option",
+ "illegal option",
+ )
+ ):
+ self._tar_ignore_failed_read_supported = False
+ return False
+ # Unrecognized failure: omit the flag so archiving still runs.
+ self._tar_ignore_failed_read_supported = False
+ return False
+
+ def _tar_command(
+ self,
+ path: str,
+ remote_archive: str,
+ *,
+ ignore_failed_read: bool,
+ ) -> str:
+ tar_flags = "czf -"
+ if ignore_failed_read:
+ tar_flags = "czf - --ignore-failed-read"
+ return f"tar {tar_flags} {shell_quote(path)} > {shell_quote(remote_archive)}"
+
+ def _path_exists(self, path: str, *, sudo: bool, timeout: int) -> bool:
+ res = self._run_sut_cmd(
+ f"test -e {shell_quote(path)}",
+ sudo=sudo,
+ timeout=timeout,
+ log_artifact=False,
+ )
+ return res.exit_code == 0
+
+ def _remote_archive_has_content(self, remote_archive: str, *, sudo: bool, timeout: int) -> bool:
+ res = self._run_sut_cmd(
+ f"test -s {shell_quote(remote_archive)}",
+ sudo=sudo,
+ timeout=timeout,
+ log_artifact=False,
+ )
+ return res.exit_code == 0
+
+ def _read_remote_archive(
+ self,
+ path_spec: PathSpec,
+ *,
+ remote_archive: str,
+ archive_filename: str,
+ sudo: bool,
+ timeout: int,
+ result: ArchiveCollectionResult,
+ ) -> tuple[ArchiveCollectionResult, Optional[BinaryFileArtifact]]:
+ read_artifact: Optional[BaseFileArtifact] = None
+ try:
+ read_artifact = self._read_sut_file(
+ remote_archive,
+ encoding=None,
+ strip=False,
+ log_artifact=True,
+ )
+ except Exception as exc:
+ result.stderr = str(exc)
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive read failed: {path_spec.name}",
+ data={"name": path_spec.name, "path": path_spec.path, "error": str(exc)},
+ priority=EventPriority.ERROR,
+ console_log=True,
+ )
+ return result, None
+ finally:
+ self._run_sut_cmd(
+ f"rm -f {shell_quote(remote_archive)}",
+ sudo=sudo,
+ timeout=timeout,
+ log_artifact=False,
+ )
+
+ if not isinstance(read_artifact, BinaryFileArtifact) or not read_artifact.contents:
+ result.stderr = "Archive file was empty or unreadable"
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive empty: {path_spec.name}",
+ data={"name": path_spec.name, "path": path_spec.path},
+ priority=EventPriority.ERROR,
+ console_log=True,
+ )
+ return result, None
+
+ read_artifact.filename = archive_filename
+ result.success = True
+ result.size_bytes = len(read_artifact.contents)
+ return result, read_artifact
+
+ def _collect_path(
+ self,
+ path_spec: PathSpec,
+ *,
+ default_sudo: bool,
+ default_timeout: int,
+ default_skip_if_missing: bool,
+ default_ignore_failed_read: bool,
+ ) -> tuple[ArchiveCollectionResult, Optional[BinaryFileArtifact]]:
+ sudo = default_sudo if path_spec.sudo is None else path_spec.sudo
+ timeout = default_timeout if path_spec.timeout is None else path_spec.timeout
+ skip_if_missing = (
+ default_skip_if_missing
+ if path_spec.skip_if_missing is None
+ else path_spec.skip_if_missing
+ )
+ ignore_failed_read = (
+ default_ignore_failed_read
+ if path_spec.ignore_failed_read is None
+ else path_spec.ignore_failed_read
+ )
+ remote_archive = self._remote_archive_path(path_spec.name)
+ archive_filename = f"{path_spec.name}.tar.gz"
+
+ result = ArchiveCollectionResult(
+ name=path_spec.name,
+ path=path_spec.path,
+ archive_filename=archive_filename,
+ )
+
+ if not self._path_exists(path_spec.path, sudo=sudo, timeout=timeout):
+ result.stderr = f"Path does not exist: {path_spec.path}"
+ if skip_if_missing:
+ result.skipped = True
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive skipped: {path_spec.name}",
+ data={"name": path_spec.name, "path": path_spec.path, "reason": "missing"},
+ priority=EventPriority.WARNING,
+ )
+ return result, None
+
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive failed: {path_spec.name}",
+ data={
+ "name": path_spec.name,
+ "path": path_spec.path,
+ "exit_code": 2,
+ "stderr": result.stderr,
+ },
+ priority=EventPriority.ERROR,
+ console_log=True,
+ )
+ result.exit_code = 2
+ return result, None
+
+ use_ignore_failed_read = (
+ ignore_failed_read
+ and self._remote_tar_supports_ignore_failed_read(sudo=sudo, timeout=timeout)
+ )
+
+ tar_res = self._run_sut_cmd(
+ self._tar_command(
+ path_spec.path,
+ remote_archive,
+ ignore_failed_read=use_ignore_failed_read,
+ ),
+ sudo=sudo,
+ timeout=timeout,
+ log_artifact=True,
+ )
+ result.exit_code = tar_res.exit_code
+ result.stderr = tar_res.stderr or ""
+
+ if tar_res.exit_code != 0:
+ if not self._remote_archive_has_content(
+ remote_archive,
+ sudo=sudo,
+ timeout=timeout,
+ ):
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive failed: {path_spec.name}",
+ data={
+ "name": path_spec.name,
+ "path": path_spec.path,
+ "exit_code": tar_res.exit_code,
+ "stderr": tar_res.stderr,
+ },
+ priority=EventPriority.ERROR,
+ console_log=True,
+ )
+ self._run_sut_cmd(
+ f"rm -f {shell_quote(remote_archive)}",
+ sudo=sudo,
+ timeout=timeout,
+ log_artifact=False,
+ )
+ return result, None
+
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive partial: {path_spec.name}",
+ data={
+ "name": path_spec.name,
+ "path": path_spec.path,
+ "exit_code": tar_res.exit_code,
+ "stderr": tar_res.stderr,
+ },
+ priority=EventPriority.WARNING,
+ )
+
+ result, archive_artifact = self._read_remote_archive(
+ path_spec,
+ remote_archive=remote_archive,
+ archive_filename=archive_filename,
+ sudo=sudo,
+ timeout=timeout,
+ result=result,
+ )
+ if result.success:
+ priority = EventPriority.WARNING if tar_res.exit_code != 0 else EventPriority.INFO
+ self._log_event(
+ category=EventCategory.RUNTIME,
+ description=f"BMC archive collected: {path_spec.name}",
+ data={
+ "name": path_spec.name,
+ "path": path_spec.path,
+ "size_bytes": result.size_bytes,
+ "archive_filename": archive_filename,
+ "partial": tar_res.exit_code != 0,
+ },
+ priority=priority,
+ )
+ return result, archive_artifact
+
+ def collect_data(
+ self,
+ args: Optional[BmcArchiveCollectorArgs] = None,
+ ) -> tuple[TaskResult, Optional[BmcArchiveDataModel]]:
+ if args is None:
+ args = BmcArchiveCollectorArgs()
+
+ if not args.paths:
+ self.result.message = "No paths configured in collection_args.paths"
+ self.result.status = ExecutionStatus.NOT_RAN
+ return self.result, None
+
+ self._tar_ignore_failed_read_supported = None
+
+ results: list[ArchiveCollectionResult] = []
+ archives: list[BinaryFileArtifact] = []
+ failures: list[str] = []
+
+ for path_spec in args.paths:
+ result, archive_artifact = self._collect_path(
+ path_spec,
+ default_sudo=args.sudo,
+ default_timeout=args.timeout,
+ default_skip_if_missing=args.skip_if_missing,
+ default_ignore_failed_read=args.ignore_failed_read,
+ )
+ results.append(result)
+ if archive_artifact is not None:
+ archives.append(archive_artifact)
+ if not result.success and not result.skipped:
+ failures.append(path_spec.name)
+
+ success_count = sum(1 for result in results if result.success)
+ skipped_count = sum(1 for result in results if result.skipped)
+ total = len(results)
+
+ if failures:
+ self.result.message = (
+ f"BMC archive collection: {success_count}/{total} paths archived "
+ f"({len(failures)} errors: {', '.join(failures)}"
+ f"{f'; {skipped_count} skipped' if skipped_count else ''})"
+ )
+ self.result.status = ExecutionStatus.ERROR
+ else:
+ suffix = f", {skipped_count} skipped" if skipped_count else ""
+ self.result.message = (
+ f"BMC archive collection: {success_count}/{total} paths archived{suffix}"
+ )
+ self.result.status = ExecutionStatus.OK
+
+ return self.result, BmcArchiveDataModel(results=results, archives=archives)
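Reviewer note: three small pieces of `BmcArchiveCollector` are easy to reason about outside the collector: the archive name sanitisation, the tar command string, and the per-path override of collection-wide defaults. The sketch below reproduces them as free functions. It uses `shlex.quote` as a stand-in for `nodescraper.utils.shell_quote`, whose exact quoting may differ, and the `resolve` helper name is an assumption.

```python
import shlex
from typing import Optional

REMOTE_ARCHIVE_TEMPLATE = "/tmp/node_scraper_{name}.tar.gz"


def remote_archive_path(name: str) -> str:
    """Replace anything other than alphanumerics, '-' and '_' with '_'."""
    safe_name = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in name)
    return REMOTE_ARCHIVE_TEMPLATE.format(name=safe_name)


def tar_command(path: str, remote_archive: str, *, ignore_failed_read: bool) -> str:
    """Build the tar invocation; shlex.quote stands in for shell_quote."""
    tar_flags = "czf - --ignore-failed-read" if ignore_failed_read else "czf -"
    return f"tar {tar_flags} {shlex.quote(path)} > {shlex.quote(remote_archive)}"


def resolve(override: Optional[bool], default: bool) -> bool:
    """Hypothetical helper: a per-path value wins unless it is None."""
    return default if override is None else override
```

Since distinct names such as `a/b` and `a_b` sanitise to the same remote filename, the uniqueness check on path names in `BmcArchiveCollectorArgs` may be worth applying to the sanitised form as well.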
diff --git a/nodescraper/plugins/ooband/bmc_archive/bmc_archive_data.py b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_data.py
new file mode 100644
index 00000000..9393acf2
--- /dev/null
+++ b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_data.py
@@ -0,0 +1,65 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+import os
+from typing import Optional
+
+from pydantic import Field
+
+from nodescraper.connection.inband.inband import BinaryFileArtifact
+from nodescraper.models import DataModel
+
+
+class ArchiveCollectionResult(DataModel):
+ """Result of archiving one BMC path."""
+
+ name: str
+ path: str
+ success: bool = False
+ skipped: bool = False
+ exit_code: int = 0
+ stderr: str = ""
+ size_bytes: int = 0
+ archive_filename: Optional[str] = None
+
+
+class BmcArchiveDataModel(DataModel):
+ """Collected BMC directory archives."""
+
+ results: list[ArchiveCollectionResult] = Field(default_factory=list)
+ archives: list[BinaryFileArtifact] = Field(default_factory=list)
+
+ def log_model(self, log_path: str) -> None:
+ for archive in self.archives:
+ archive.log_model(log_path)
+
+ log_name = os.path.join(log_path, "oob_bmc_archive_results.json")
+ with open(log_name, "w", encoding="utf-8") as log_file:
+ log_file.write(
+ self.model_dump_json(
+ indent=2,
+ exclude={"archives"},
+ )
+ )
diff --git a/nodescraper/plugins/ooband/bmc_archive/bmc_archive_plugin.py b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_plugin.py
new file mode 100644
index 00000000..dca6fc5e
--- /dev/null
+++ b/nodescraper/plugins/ooband/bmc_archive/bmc_archive_plugin.py
@@ -0,0 +1,44 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from nodescraper.base import OOBSSHDataPlugin
+
+from .bmc_archive_collector import BmcArchiveCollector
+from .bmc_archive_data import BmcArchiveDataModel
+from .collector_args import BmcArchiveCollectorArgs
+
+
+class OobBmcArchivePlugin(
+ OOBSSHDataPlugin[
+ BmcArchiveDataModel,
+ BmcArchiveCollectorArgs,
+ None,
+ ]
+):
+ """Archive remote directories over BMC SSH by running tar czf - into a temporary remote file."""
+
+ DATA_MODEL = BmcArchiveDataModel
+ COLLECTOR = BmcArchiveCollector
+ COLLECTOR_ARGS = BmcArchiveCollectorArgs
diff --git a/nodescraper/plugins/ooband/bmc_archive/collector_args.py b/nodescraper/plugins/ooband/bmc_archive/collector_args.py
new file mode 100644
index 00000000..bea4d82c
--- /dev/null
+++ b/nodescraper/plugins/ooband/bmc_archive/collector_args.py
@@ -0,0 +1,115 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from typing import Optional
+
+from pydantic import Field, field_validator, model_validator
+
+from nodescraper.models import CollectorArgs
+
+
+class PathSpec(CollectorArgs):
+    """One named BMC directory path to archive."""
+
+    name: str = Field(description="Stable name for this archive, used in output filenames.")
+    path: str = Field(description="Absolute BMC path to tar.")
+    sudo: Optional[bool] = Field(
+        default=None,
+        description="Run tar with sudo. When omitted, uses collection_args.sudo.",
+    )
+    timeout: Optional[int] = Field(
+        default=None,
+        ge=1,
+        description="Tar command timeout in seconds. When omitted, uses collection_args.timeout.",
+    )
+    skip_if_missing: Optional[bool] = Field(
+        default=None,
+        description="Skip this path when it does not exist on the BMC. When omitted, uses collection_args.skip_if_missing.",
+    )
+    ignore_failed_read: Optional[bool] = Field(
+        default=None,
+        description=(
+            "Pass --ignore-failed-read to tar so unreadable files do not abort the archive. "
+            "When omitted, uses collection_args.ignore_failed_read."
+        ),
+    )
+
+    @field_validator("name", "path", mode="before")
+    @classmethod
+    def _strip_required_text(cls, value: object) -> object:
+        if isinstance(value, str):
+            return value.strip()
+        return value
+
+    @model_validator(mode="after")
+    def _validate_required_fields(self) -> "PathSpec":
+        if not self.name:
+            raise ValueError("name must not be empty")
+        if not self.path:
+            raise ValueError("path must not be empty")
+        if not self.path.startswith("/"):
+            raise ValueError("path must be an absolute BMC path")
+        return self
+
+
+class BmcArchiveCollectorArgs(CollectorArgs):
+    paths: list[PathSpec] = Field(
+        default_factory=list,
+        description=(
+            "Named BMC paths to archive with tar czf -. "
+            "Configure in plugin config under plugins.OobBmcArchivePlugin.collection_args.paths."
+        ),
+    )
+    sudo: bool = Field(
+        default=False,
+        description="Default sudo setting for paths that do not specify sudo.",
+    )
+    timeout: int = Field(
+        default=600,
+        ge=1,
+        description="Default per-path tar timeout in seconds.",
+    )
+    skip_if_missing: bool = Field(
+        default=False,
+        description="Skip paths that do not exist on the BMC instead of failing collection.",
+    )
+    ignore_failed_read: bool = Field(
+        default=True,
+        description=(
+            "When true, pass GNU tar's --ignore-failed-read when the remote tar supports it."
+        ),
+    )
+
+    @model_validator(mode="after")
+    def _validate_unique_path_names(self) -> "BmcArchiveCollectorArgs":
+        seen: set[str] = set()
+        duplicates: set[str] = set()
+        for path_spec in self.paths:
+            if path_spec.name in seen:
+                duplicates.add(path_spec.name)
+            seen.add(path_spec.name)
+        if duplicates:
+            raise ValueError(f"Duplicate path name(s): {sorted(duplicates)}")
+        return self
diff --git a/nodescraper/plugins/ooband/redfish_endpoint/collector_args.py b/nodescraper/plugins/ooband/redfish_endpoint/collector_args.py
index 55bb4269..189c5edf 100644
--- a/nodescraper/plugins/ooband/redfish_endpoint/collector_args.py
+++ b/nodescraper/plugins/ooband/redfish_endpoint/collector_args.py
@@ -23,10 +23,12 @@
# SOFTWARE.
#
###############################################################################
-from pydantic import BaseModel, Field, field_validator
+from pydantic import Field, field_validator
+from nodescraper.models import CollectorArgs
-class RedfishEndpointCollectorArgs(BaseModel):
+
+class RedfishEndpointCollectorArgs(CollectorArgs):
"""Collection args: uris to GET (or discover from tree), optional concurrency and tree discovery."""
uris: list[str] = Field(
diff --git a/nodescraper/plugins/ooband/redfish_oem_diag/collector_args.py b/nodescraper/plugins/ooband/redfish_oem_diag/collector_args.py
index da5bd50c..7eb6bac7 100644
--- a/nodescraper/plugins/ooband/redfish_oem_diag/collector_args.py
+++ b/nodescraper/plugins/ooband/redfish_oem_diag/collector_args.py
@@ -27,12 +27,14 @@
from typing import Optional
-from pydantic import BaseModel, Field, model_validator
+from pydantic import Field, model_validator
+
+from nodescraper.models import CollectorArgs
DEFAULT_TASK_TIMEOUT_S = 1800
-class RedfishOemDiagCollectorArgs(BaseModel):
+class RedfishOemDiagCollectorArgs(CollectorArgs):
"""Collector/analyzer args for Redfish OEM diagnostic log collection."""
log_service_path: str = Field(
diff --git a/nodescraper/typeutils.py b/nodescraper/typeutils.py
index cd7a4650..bc4ce244 100644
--- a/nodescraper/typeutils.py
+++ b/nodescraper/typeutils.py
@@ -57,13 +57,14 @@ def get_generic_map(cls, class_type: Type[Any]) -> dict:
         Returns:
             dict: map of generic type parameters to their actual types
         """
-        if class_type.__orig_bases__ and len(class_type.__orig_bases__) > 0:
-            gen_base = class_type.__orig_bases__[0]
+        generic_map: dict = {}
+        for gen_base in getattr(class_type, "__orig_bases__", ()) or ():
             class_org = get_origin(gen_base)
+            if class_org is None:
+                continue
             args = get_args(gen_base)
             generic_map = dict(zip(class_org.__parameters__, args))
-        else:
-            generic_map = {}
+            break
         return generic_map
@@ -168,7 +169,9 @@ def get_model_types(cls, model: type[BaseModel]) -> dict[str, TypeData]:
             type_map[name] = TypeData(
                 type_classes=cls.process_type(field.annotation),
                 required=field.is_required(),
-                default=field.default,
+                default=(
+                    field.default_factory() if callable(field.default_factory) else field.default
+                ),
             )
         return type_map
diff --git a/pyproject.toml b/pyproject.toml
index c5495ffc..1d40c1a8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -29,7 +29,9 @@ dev = [
"pre-commit",
"pytest",
"pytest-cov",
- "mypy"
+ "mypy",
+ "types-paramiko",
+ "types-setuptools",
]
[project.urls]
diff --git a/test/functional/fixtures/generic_collection_plugin_config.json b/test/functional/fixtures/generic_collection_plugin_config.json
new file mode 100644
index 00000000..9941393d
--- /dev/null
+++ b/test/functional/fixtures/generic_collection_plugin_config.json
@@ -0,0 +1,150 @@
+{
+ "global_args": {},
+ "plugins": {
+ "GenericCollectionPlugin": {
+ "collection_args": {
+ "sudo": false,
+ "include_stdout": true,
+ "commands": [
+ {
+ "name": "kernel_os",
+ "command": "uname -s"
+ },
+ {
+ "name": "messages",
+ "command": "cat /var/log/messages",
+ "sudo": true
+ },
+ {
+ "name": "uid",
+ "command": "id -u",
+ "sudo": false
+ },
+ {
+ "name": "count_eq",
+ "command": "echo 8"
+ },
+ {
+ "name": "count_ne",
+ "command": "echo 7"
+ },
+ {
+ "name": "count_gt",
+ "command": "echo 10"
+ },
+ {
+ "name": "count_gte",
+ "command": "echo 5"
+ },
+ {
+ "name": "count_lt",
+ "command": "echo 3"
+ },
+ {
+ "name": "count_lte",
+ "command": "echo 10"
+ },
+ {
+ "name": "count_capture",
+ "command": "printf 'count: 42\\n'"
+ },
+ {
+ "name": "exact_match",
+ "command": "echo exact-value"
+ },
+ {
+ "name": "expected_in_check",
+ "command": "echo beta"
+ },
+ {
+ "name": "line_count",
+ "command": "printf 'one\\ntwo\\n'"
+ },
+ {
+ "name": "optional_fail",
+ "command": "false"
+ }
+ ]
+ },
+ "analysis_args": {
+ "checks": [
+ {
+ "name": "kernel_os",
+ "must_contain": "TEST"
+ },
+ {
+ "name": "messages",
+ "must_not_contain": "error"
+ },
+ {
+ "name": "uid",
+ "expected_regex": "^\\d+$"
+ },
+ {
+ "name": "count_eq",
+ "value_type": "int",
+ "compare_op": "==",
+ "expected_value": 8
+ },
+ {
+ "name": "count_ne",
+ "value_type": "int",
+ "compare_op": "!=",
+ "expected_value": 0
+ },
+ {
+ "name": "count_gt",
+ "value_type": "int",
+ "compare_op": ">",
+ "expected_value": 5
+ },
+ {
+ "name": "count_gte",
+ "value_type": "int",
+ "compare_op": ">=",
+ "expected_value": 5
+ },
+ {
+ "name": "count_lt",
+ "value_type": "int",
+ "compare_op": "<",
+ "expected_value": 10
+ },
+ {
+ "name": "count_lte",
+ "value_type": "int",
+ "compare_op": "<=",
+ "expected_value": 10
+ },
+ {
+ "name": "count_capture",
+ "value_type": "int",
+ "compare_op": "==",
+ "expected_value": 42,
+ "capture_regex": "count:\\s+(\\d+)"
+ },
+ {
+ "name": "exact_match",
+ "expected": "exact-value"
+ },
+ {
+ "name": "expected_in_check",
+ "expected_in": ["alpha", "beta", "gamma"]
+ },
+ {
+ "name": "line_count",
+ "exact_lines": 2
+ },
+ {
+ "name": "optional_fail",
+ "allow_failure": true,
+ "expected_exit_code": 1
+ }
+ ]
+ }
+ }
+ },
+ "result_collators": {},
+ "name": "GenericCollectionPlugin config",
+ "desc": "Demo config: per-command sudo, text checks, all compare_op values, and optional allow_failure"
+}
diff --git a/test/functional/test_generic_collection_plugin.py b/test/functional/test_generic_collection_plugin.py
new file mode 100644
index 00000000..3f37bc1b
--- /dev/null
+++ b/test/functional/test_generic_collection_plugin.py
@@ -0,0 +1,141 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+"""Functional tests for GenericCollectionPlugin with --plugin-configs."""
+
+import csv
+from pathlib import Path
+
+import pytest
+
+_FIXTURES = Path(__file__).resolve().parent / "fixtures"
+
+
+@pytest.fixture
+def generic_collection_config_file():
+ """Return path to GenericCollectionPlugin demo config (functional fixtures)."""
+ return _FIXTURES / "generic_collection_plugin_config.json"
+
+
+def test_generic_collection_plugin_with_config_file(
+ run_cli_command, generic_collection_config_file, tmp_path
+):
+ """Run GenericCollectionPlugin using collection_args.commands from config file."""
+ assert (
+ generic_collection_config_file.exists()
+ ), f"Config file not found: {generic_collection_config_file}"
+
+ log_path = str(tmp_path / "logs_generic_collection")
+ result = run_cli_command(
+ [
+ "--log-path",
+ log_path,
+ f"--plugin-configs={generic_collection_config_file}",
+ "run-plugins",
+ "GenericCollectionPlugin",
+ ],
+ check=False,
+ )
+
+ output = result.stdout + result.stderr
+ assert result.returncode in [0, 1, 2]
+ assert "GenericCollectionPlugin" in output
+ assert "Generic collection" in output or "genericcollection" in output.lower()
+
+
+def test_generic_collection_plugin_demo_config_runs_end_to_end(
+ run_cli_command, generic_collection_config_file, tmp_path
+):
+ """Demo fixture config runs collection and analysis with expected partial failures."""
+ log_path = str(tmp_path / "logs_generic_collection_csv")
+ result = run_cli_command(
+ [
+ "--log-path",
+ log_path,
+ f"--plugin-configs={generic_collection_config_file}",
+ "run-plugins",
+ "GenericCollectionPlugin",
+ ],
+ check=False,
+ )
+
+ output = result.stdout + result.stderr
+ assert "GenericCollectionPlugin" in output
+ assert "14/14 commands succeeded" not in output
+ assert "14/14 checks passed" not in output
+ assert "/14 commands succeeded" in output
+ assert "/14 checks passed" in output
+ assert "Check failed: kernel_os" in output
+
+ csv_files = list(Path(log_path).glob("**/nodescraper.csv"))
+ if not csv_files:
+ pytest.skip("CSV output not written; cannot verify collection status")
+
+ with open(csv_files[0], "r", encoding="utf-8") as csv_file:
+ rows = [
+ row
+ for row in csv.DictReader(csv_file)
+ if row.get("plugin") == "GenericCollectionPlugin"
+ ]
+
+ assert len(rows) >= 1, "GenericCollectionPlugin should appear in CSV results"
+ assert rows[0].get("status") == "ERROR", rows[0].get("message")
+
+
+def test_generic_collection_plugin_runs_analyzer(
+ run_cli_command, generic_collection_config_file, tmp_path
+):
+ """Analysis checks from the demo fixture config should run after collection."""
+ log_path = str(tmp_path / "logs_generic_collection_analysis")
+ result = run_cli_command(
+ [
+ "--log-path",
+ log_path,
+ f"--plugin-configs={generic_collection_config_file}",
+ "run-plugins",
+ "GenericCollectionPlugin",
+ ],
+ check=False,
+ )
+
+ output = result.stdout + result.stderr
+ assert "Running data analyzer: GenericAnalyzer" in output
+ assert "Generic analysis:" in output
+ assert "14/14 checks passed" not in output
+ assert "Check failed: kernel_os" in output
+
+
+def test_generic_collection_plugin_without_commands_not_ran(run_cli_command, tmp_path):
+ """Running without collection_args.commands should report no commands configured."""
+ log_path = str(tmp_path / "logs_generic_collection_empty")
+ result = run_cli_command(
+ ["--log-path", log_path, "run-plugins", "GenericCollectionPlugin"],
+ check=False,
+ )
+
+ output = result.stdout + result.stderr
+ assert result.returncode in [0, 1, 2]
+ assert "GenericCollectionPlugin" in output
+ assert "No commands configured" in output or "NOT_RAN" in output
diff --git a/test/functional/test_plugin_configs.py b/test/functional/test_plugin_configs.py
index 6ee1a004..ce06b057 100644
--- a/test/functional/test_plugin_configs.py
+++ b/test/functional/test_plugin_configs.py
@@ -48,6 +48,7 @@ def plugin_config_files(fixtures_dir):
"DimmPlugin": fixtures_dir / "dimm_plugin_config.json",
"DkmsPlugin": fixtures_dir / "dkms_plugin_config.json",
"DmesgPlugin": fixtures_dir / "dmesg_plugin_config.json",
+ "GenericCollectionPlugin": fixtures_dir / "generic_collection_plugin_config.json",
"JournalPlugin": fixtures_dir / "journal_plugin_config.json",
"KernelPlugin": fixtures_dir / "kernel_plugin_config.json",
"KernelModulePlugin": fixtures_dir / "kernel_module_plugin_config.json",
@@ -111,6 +112,7 @@ def test_plugin_config_with_builtin_config(run_cli_command, tmp_path):
"DimmPlugin",
"DkmsPlugin",
"DmesgPlugin",
+ "GenericCollectionPlugin",
"JournalPlugin",
"KernelPlugin",
"KernelModulePlugin",
diff --git a/test/unit/framework/common/shared_utils.py b/test/unit/framework/common/shared_utils.py
index 11e6f541..5b882549 100644
--- a/test/unit/framework/common/shared_utils.py
+++ b/test/unit/framework/common/shared_utils.py
@@ -95,6 +95,8 @@ def run(
self,
test_bool_arg: bool = True,
test_str_arg: str = "test",
+ test_list_arg: list[int] = [1], # noqa: B006
+ test_dict_arg: dict = {}, # noqa: B006
test_model_arg: Optional[TestModelArg] = None,
):
return PluginResult(
diff --git a/test/unit/framework/test_cli_helper.py b/test/unit/framework/test_cli_helper.py
index 3e09c108..7a29d888 100644
--- a/test/unit/framework/test_cli_helper.py
+++ b/test/unit/framework/test_cli_helper.py
@@ -140,6 +140,8 @@ def test_config_builder(plugin_registry):
"TestPluginA": {
"test_bool_arg": True,
"test_str_arg": "test",
+ "test_list_arg": [1],
+ "test_dict_arg": {},
"test_model_arg": {"model_attr": 123},
},
"ExamplePlugin": {},
diff --git a/test/unit/framework/test_config_builder.py b/test/unit/framework/test_config_builder.py
index 862d4011..ede3fac6 100644
--- a/test/unit/framework/test_config_builder.py
+++ b/test/unit/framework/test_config_builder.py
@@ -36,6 +36,8 @@ def test_config_builder(plugin_registry):
"TestPluginA": {
"test_bool_arg": True,
"test_str_arg": "test",
+ "test_list_arg": [1],
+ "test_dict_arg": {},
"test_model_arg": {"model_attr": 123},
}
}
diff --git a/test/unit/framework/test_dataplugin.py b/test/unit/framework/test_dataplugin.py
index e88f8cc5..67b92fb9 100644
--- a/test/unit/framework/test_dataplugin.py
+++ b/test/unit/framework/test_dataplugin.py
@@ -25,6 +25,7 @@
###############################################################################
import json
from pathlib import Path
+from typing import Optional
from unittest.mock import MagicMock, patch
import pytest
@@ -34,7 +35,7 @@
from nodescraper.interfaces.dataanalyzertask import DataAnalyzer
from nodescraper.interfaces.datacollectortask import DataCollector
from nodescraper.interfaces.dataplugin import DataPlugin
-from nodescraper.models import DataModel, TaskResult
+from nodescraper.models import CollectorArgs, DataModel, TaskResult
class StandardDataModel(DataModel):
@@ -261,6 +262,40 @@ def test_run_execution_modes(self, plugin_with_conn, collection, analysis, expec
assert mock_collect.call_count == expected_calls[0]
assert mock_analyze.call_count == expected_calls[1]
+ def test_run_reports_collection_and_analysis_errors(self, plugin_with_conn):
+ plugin_with_conn.data = StandardDataModel()
+
+ collection_error = TaskResult(
+ status=ExecutionStatus.ERROR,
+ message="Generic collection: 2/3 commands succeeded",
+ )
+ analysis_error = TaskResult(
+ status=ExecutionStatus.ERROR,
+ message="Generic analysis: 1/3 checks passed",
+ )
+
+ with (
+ patch.object(CoreDataPlugin, "collect") as mock_collect,
+ patch.object(CoreDataPlugin, "analyze") as mock_analyze,
+ ):
+
+ def collect_side_effect(*args, **kwargs):
+ plugin_with_conn.collection_result = collection_error
+ return collection_error
+
+ def analyze_side_effect(*args, **kwargs):
+ plugin_with_conn.analysis_result = analysis_error
+ return analysis_error
+
+ mock_collect.side_effect = collect_side_effect
+ mock_analyze.side_effect = analyze_side_effect
+
+ result = plugin_with_conn.run(collection=True, analysis=True)
+
+ assert result.status == ExecutionStatus.ERROR
+ assert "Collection error: Generic collection: 2/3 commands succeeded" in result.message
+ assert "Analysis error: Generic analysis: 1/3 checks passed" in result.message
+
def test_run_with_parameters(self, plugin_with_conn):
collection_args = {"param": "value"}
analysis_args = {"threshold": 0.5}
@@ -543,3 +578,107 @@ def test_load_run_data_direct_file(self, tmp_path: Path) -> None:
loaded = ExtractPlugin.load_run_data(str(p))
assert loaded is not None
assert loaded["value"] == "direct"
+
+
+class MultiPartDataModel(DataModel):
+ alpha: Optional[str] = None
+ beta: Optional[str] = None
+
+
+class AlphaCollector(DataCollector):
+ DATA_MODEL = MultiPartDataModel
+
+ def collect_data(self, args=None):
+ return TaskResult(status=ExecutionStatus.OK, task="AlphaCollector"), MultiPartDataModel(
+ alpha="alpha-value"
+ )
+
+
+class BetaCollector(DataCollector):
+ DATA_MODEL = MultiPartDataModel
+
+ def collect_data(self, args=None):
+ return TaskResult(status=ExecutionStatus.OK, task="BetaCollector"), MultiPartDataModel(
+ beta="beta-value"
+ )
+
+
+class AlphaCollectorArgs(CollectorArgs):
+ alpha_path: str = "/alpha"
+
+
+class BetaCollectorArgs(CollectorArgs):
+ beta_path: str = "/beta"
+
+
+class MultiCollectorPlugin(DataPlugin):
+ DATA_MODEL = MultiPartDataModel
+ CONNECTION_TYPE = MockConnectionManager
+ COLLECTOR = (AlphaCollector, BetaCollector)
+ ANALYZER = StandardAnalyzer
+
+
+class TestMultiCollectorDataPlugin:
+ def test_get_collector_classes_accepts_collector_tuple(self):
+ assert MultiCollectorPlugin.get_collector_classes() == (AlphaCollector, BetaCollector)
+
+ def test_get_collector_classes_accepts_single_collector(self):
+ assert CoreDataPlugin.get_collector_classes() == (BaseDataCollector,)
+
+ def test_collector_args_class_accepts_args_map(self):
+ class MappedArgsPlugin(DataPlugin):
+ DATA_MODEL = MultiPartDataModel
+ CONNECTION_TYPE = MockConnectionManager
+ COLLECTOR = (AlphaCollector, BetaCollector)
+ COLLECTOR_ARGS = {
+ "AlphaCollector": AlphaCollectorArgs,
+ "BetaCollector": BetaCollectorArgs,
+ }
+ ANALYZER = StandardAnalyzer
+
+ assert MappedArgsPlugin._collector_args_class(AlphaCollector) is AlphaCollectorArgs
+ assert MappedArgsPlugin._collector_args_class(BetaCollector) is BetaCollectorArgs
+
+ def test_collect_runs_all_collectors_and_merges_data(self, plugin_with_conn):
+ multi_plugin = MultiCollectorPlugin(
+ system_info=plugin_with_conn.system_info,
+ logger=plugin_with_conn.logger,
+ connection_manager=plugin_with_conn.connection_manager,
+ )
+
+ with (
+ patch.object(AlphaCollector, "collect_data") as alpha_collect,
+ patch.object(BetaCollector, "collect_data") as beta_collect,
+ ):
+ alpha_collect.return_value = (
+ TaskResult(status=ExecutionStatus.OK, task="AlphaCollector"),
+ MultiPartDataModel(alpha="alpha-value"),
+ )
+ beta_collect.return_value = (
+ TaskResult(status=ExecutionStatus.OK, task="BetaCollector"),
+ MultiPartDataModel(beta="beta-value"),
+ )
+
+ result = multi_plugin.collect()
+
+ alpha_collect.assert_called_once()
+ beta_collect.assert_called_once()
+ assert result.status == ExecutionStatus.OK
+ assert multi_plugin.data.alpha == "alpha-value"
+ assert multi_plugin.data.beta == "beta-value"
+ assert "AlphaCollector" in result.task
+ assert "BetaCollector" in result.task
+
+ def test_find_datamodel_path_in_run_checks_all_collectors(self, tmp_path: Path) -> None:
+ beta_dir = tmp_path / "multi_collector_plugin" / "beta_collector"
+ beta_dir.mkdir(parents=True)
+ (beta_dir / "result.json").write_text(
+ json.dumps({"parent": "MultiCollectorPlugin"}), encoding="utf-8"
+ )
+ (beta_dir / "multipartdatamodel.json").write_text(
+ json.dumps({"alpha": "a", "beta": "b"}), encoding="utf-8"
+ )
+
+ found = MultiCollectorPlugin.find_datamodel_path_in_run(str(tmp_path))
+ assert found is not None
+ assert found.endswith("multipartdatamodel.json")
diff --git a/test/unit/framework/test_pluginconfig.py b/test/unit/framework/test_pluginconfig.py
new file mode 100644
index 00000000..6fd9d0fb
--- /dev/null
+++ b/test/unit/framework/test_pluginconfig.py
@@ -0,0 +1,47 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (C) 2026 Advanced Micro Devices, Inc.
+#
+###############################################################################
+
+from __future__ import annotations
+
+from nodescraper.models import PluginConfig
+
+
+def test_plugin_config_merge_combines_plugins() -> None:
+ merged = PluginConfig.merge(
+ PluginConfig(
+ name="A",
+ desc="a",
+ plugins={"FooPlugin": {"collection": True, "analysis": False}},
+ ),
+ PluginConfig(
+ name="B",
+ desc="b",
+ plugins={"BarPlugin": {"collection": False, "analysis": True}},
+ ),
+ )
+ assert merged.name == "A"
+ assert merged.desc == "a"
+ assert merged.plugins["FooPlugin"] == {"collection": True, "analysis": False}
+ assert merged.plugins["BarPlugin"] == {"collection": False, "analysis": True}
+
+
+def test_plugin_config_merge_accepts_mappings() -> None:
+ merged = PluginConfig.merge(
+ {
+ "name": "A",
+ "plugins": {"FooPlugin": {}},
+ },
+ PluginConfig(plugins={"BarPlugin": {}}),
+ )
+ assert set(merged.plugins) == {"FooPlugin", "BarPlugin"}
+
+
+def test_plugin_config_coerce() -> None:
+ config = PluginConfig.coerce({"name": "Example", "plugins": {"ExamplePlugin": {}}})
+ assert isinstance(config, PluginConfig)
+ assert config.name == "Example"
diff --git a/test/unit/framework/test_type_utils.py b/test/unit/framework/test_type_utils.py
index be14d7ee..a6bb1012 100644
--- a/test/unit/framework/test_type_utils.py
+++ b/test/unit/framework/test_type_utils.py
@@ -45,6 +45,14 @@ class TestGenericImpl(TestGenericBase[str]):
pass
+class WiringMixin:
+ pass
+
+
+class TestMixinFirstImpl(WiringMixin, TestGenericBase[str]):
+ pass
+
+
class TestModel(BaseModel):
str_attr: str
int_attr: int
@@ -57,6 +65,10 @@ def test_generic_map():
assert TypeUtils.get_generic_map(TestGenericImpl) == {T: str}
+def test_generic_map_skips_non_generic_mixin_base():
+ assert TypeUtils.get_generic_map(TestMixinFirstImpl) == {T: str}
+
+
def test_func_arg_types():
res = TypeUtils.get_func_arg_types(TestGenericImpl.test_func, TestGenericImpl)
assert list(res.keys()) == ["arg", "arg2", "arg3"]
diff --git a/test/unit/plugin/fixtures/pcie_plugin_advanced_config.json b/test/unit/plugin/fixtures/pcie_plugin_advanced_config.json
new file mode 100644
index 00000000..54812949
--- /dev/null
+++ b/test/unit/plugin/fixtures/pcie_plugin_advanced_config.json
@@ -0,0 +1,28 @@
+{
+ "global_args": {},
+ "plugins": {
+ "PciePlugin": {
+ "analysis_args": {
+ "exp_speed": 5,
+ "exp_width": 16,
+ "exp_sriov_count": 8,
+ "exp_gpu_count_override": 4,
+ "exp_max_payload_size": {
+ "29631": 256,
+ "29711": 512
+ },
+ "exp_max_rd_req_size": {
+ "29631": 512,
+ "29711": 1024
+ },
+ "exp_ten_bit_tag_req_en": {
+ "29631": 1,
+ "29711": 0
+ }
+ }
+ }
+ },
+ "result_collators": {},
+ "name": "PciePlugin advanced config",
+ "desc": "Advanced config for testing PciePlugin with device-specific settings"
+}
diff --git a/test/unit/plugin/fixtures/pcie_plugin_config.json b/test/unit/plugin/fixtures/pcie_plugin_config.json
new file mode 100644
index 00000000..cc78167e
--- /dev/null
+++ b/test/unit/plugin/fixtures/pcie_plugin_config.json
@@ -0,0 +1,19 @@
+{
+ "global_args": {},
+ "plugins": {
+ "PciePlugin": {
+ "analysis_args": {
+ "exp_speed": 5,
+ "exp_width": 16,
+ "exp_sriov_count": 8,
+ "exp_gpu_count_override": 4,
+ "exp_max_payload_size": 256,
+ "exp_max_rd_req_size": 512,
+ "exp_ten_bit_tag_req_en": 1
+ }
+ }
+ },
+ "result_collators": {},
+ "name": "PciePlugin config",
+ "desc": "Config for testing PciePlugin"
+}
diff --git a/test/unit/plugin/test_dmesg_analyzer.py b/test/unit/plugin/test_dmesg_analyzer.py
index 67faaf05..d24a311c 100644
--- a/test/unit/plugin/test_dmesg_analyzer.py
+++ b/test/unit/plugin/test_dmesg_analyzer.py
@@ -711,6 +711,53 @@ def test_custom_regex_empty_list(system_info):
assert res.events[0].description == "Out of memory error"
+def test_mce_ce_uc_and_ras_corrected_warning_priorities(system_info):
+ dmesg_content = (
+ # MCE corrected (|CE| inside MCn_STATUS[...])
+ "kern :err : 2038-01-19T00:00:00,000000+00:00 "
+ "[Hardware Error]: Machine Check: CPU0 MC0_STATUS[0xcafe|CE|Misc]: 0x0\n"
+ # MCE uncorrected (|UC|)
+ "kern :err : 2038-01-19T00:00:01,000000+00:00 "
+ "[Hardware Error]: Machine Check: CPU1 MC1_STATUS[0xfeed|UC|AddrV]: 0x0\n"
+ # RAS Corrected (single-line)
+ "kern :err : 2038-01-19T00:00:02,000000+00:00 "
+ "trace [Hardware Error]: Corrected error, DRAM threshold\n"
+ # RAS Correctable
+ "kern :err : 2038-01-19T00:00:03,000000+00:00 "
+ "amdgpu 0000:de:ad.0: amdgpu: socket: 0 7 correctable hardware errors detected in total in gfx block\n"
+ # RAS Corrected PCIe (multiline block)
+ "[Hardware Error]: event severity: corrected, generic\n"
+ "[Hardware Error]: Error 2, type: corrected, details\n"
+ "[Hardware Error]: section_type: PCIe error, device 1111:11:11.1\n"
+ )
+
+ analyzer = DmesgAnalyzer(system_info=system_info)
+ res = analyzer.analyze_data(
+ DmesgData(dmesg_content=dmesg_content),
+ args=DmesgAnalyzerArgs(check_unknown_dmesg_errors=False),
+ )
+
+ by_desc = {e.description: e for e in res.events}
+
+ assert "MCE Corrected Error" in by_desc
+ assert by_desc["MCE Corrected Error"].priority == EventPriority.WARNING
+
+ assert "MCE Uncorrected Error" in by_desc
+ assert by_desc["MCE Uncorrected Error"].priority == EventPriority.ERROR
+
+ assert "RAS Corrected Error" in by_desc
+ assert by_desc["RAS Corrected Error"].priority == EventPriority.WARNING
+
+ assert "RAS Correctable Error" in by_desc
+ assert by_desc["RAS Correctable Error"].priority == EventPriority.WARNING
+
+ assert "RAS Corrected PCIe Error" in by_desc
+ assert by_desc["RAS Corrected PCIe Error"].priority == EventPriority.WARNING
+
+ # UC is ERROR → overall analysis status remains ERROR
+ assert res.status == ExecutionStatus.ERROR
+
+
def test_resolve_priority_no_match(system_info):
"""No rule matches → returns the original priority unchanged."""
analyzer = DmesgAnalyzer(system_info=system_info)
@@ -870,11 +917,10 @@ def test_priority_override_rules_in_analyze_data(system_info):
"""priority_override_rules passed via DmesgAnalyzerArgs overrides matched regex priorities."""
dmesg_data = DmesgData(
dmesg_content=(
- # RAS event — default ERROR, should become WARNING
- "kern :err : 2024-10-07T10:17:15,145363-04:00 "
- "amdgpu 0000:0c:00.0: amdgpu: socket: 4 1 correctable hardware errors detected in total in gfx block\n"
+ "kern :err : 2038-01-19T00:00:00,000000+00:00 "
+ "amdgpu 0000:de:ad.0: amdgpu: socket: 0 9 correctable hardware errors detected in total in gfx block\n"
# SW_DRIVER event — default ERROR, should stay ERROR (no matching rule)
- "kern :err : 2024-10-07T10:17:15,145363-04:00 IO_PAGE_FAULT\n"
+ "kern :err : 2038-01-19T00:00:01,000000+00:00 IO_PAGE_FAULT\n"
)
)
@@ -905,8 +951,8 @@ def test_priority_override_no_change_keeps_original(system_info):
"""NO_CHANGE rule leaves the original event priority intact."""
dmesg_data = DmesgData(
dmesg_content=(
- "kern :err : 2024-10-07T10:17:15,145363-04:00 "
- "amdgpu 0000:0c:00.0: amdgpu: socket: 4 1 correctable hardware errors detected in total in gfx block\n"
+ "kern :err : 2038-01-19T00:00:00,000000+00:00 "
+ "amdgpu 0000:de:ad.0: amdgpu: socket: 0 9 correctable hardware errors detected in total in gfx block\n"
)
)
@@ -922,7 +968,7 @@ def test_priority_override_no_change_keeps_original(system_info):
)
assert len(res.events) == 1
- assert res.events[0].priority == EventPriority.ERROR
+ assert res.events[0].priority == EventPriority.WARNING
def test_custom_regex_with_multiline_pattern(system_info):
diff --git a/test/unit/plugin/test_generic_analyzer.py b/test/unit/plugin/test_generic_analyzer.py
new file mode 100644
index 00000000..63c507b2
--- /dev/null
+++ b/test/unit/plugin/test_generic_analyzer.py
@@ -0,0 +1,239 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+import pytest
+from pydantic import ValidationError
+
+from nodescraper.enums.executionstatus import ExecutionStatus
+from nodescraper.plugins.generic_collection import (
+ CommandCheck,
+ CommandCollectionResult,
+ GenericAnalyzer,
+ GenericAnalyzerArgs,
+ GenericCollectionDataModel,
+)
+
+
+@pytest.fixture
+def analyzer(system_info):
+ return GenericAnalyzer(system_info=system_info)
+
+
+def _data(*results: CommandCollectionResult) -> GenericCollectionDataModel:
+ return GenericCollectionDataModel(results=list(results))
+
+
+def test_evaluates_each_check_independently(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="kernel_os",
+ command="uname -s",
+ success=True,
+ exit_code=0,
+ stdout="Linux\n",
+ ),
+ CommandCollectionResult(
+ name="messages",
+ command="cat /var/log/messages",
+ success=False,
+ exit_code=1,
+ stdout="",
+ stderr="No such file",
+ ),
+ CommandCollectionResult(
+ name="uid",
+ command="id -u",
+ success=True,
+ exit_code=0,
+ stdout="1000\n",
+ ),
+ )
+ args = GenericAnalyzerArgs(
+ checks=[
+ CommandCheck(name="kernel_os", must_contain="TEST"),
+ CommandCheck(name="messages", must_not_contain="error"),
+ CommandCheck(name="uid", expected_regex=r"^\d+$"),
+ ],
+ )
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.ERROR
+ assert "1/3 checks passed" in result.message
+
+
+def test_must_contain_passes(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="kernel_os",
+ command="uname -s",
+ success=True,
+ exit_code=0,
+ stdout="Linux\n",
+ )
+ )
+ args = GenericAnalyzerArgs(checks=[CommandCheck(name="kernel_os", must_contain="Linux")])
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.OK
+ assert "1/1 checks passed" in result.message
+
+
+def test_expected_value_numeric_compare(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="gpu_count",
+ command="echo 8",
+ success=True,
+ exit_code=0,
+ stdout="8\n",
+ )
+ )
+ args = GenericAnalyzerArgs(
+ checks=[
+ CommandCheck(
+ name="gpu_count",
+ expected_value=8,
+ compare_op="==",
+ value_type="int",
+ )
+ ],
+ )
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.OK
+
+
+def test_expected_value_numeric_compare_fails(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="gpu_count",
+ command="echo 4",
+ success=True,
+ exit_code=0,
+ stdout="4\n",
+ )
+ )
+ args = GenericAnalyzerArgs(
+ checks=[
+ CommandCheck(
+ name="gpu_count",
+ expected_value=8,
+ compare_op="==",
+ value_type="int",
+ )
+ ],
+ )
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.ERROR
+
+
+def test_line_count_checks(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="devices",
+ command="lspci",
+ success=True,
+ exit_code=0,
+ stdout="dev1\n\ndev2\n",  # the blank line is expected to be ignored, leaving 2 lines
+ )
+ )
+ args = GenericAnalyzerArgs(checks=[CommandCheck(name="devices", min_lines=2, max_lines=2)])
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.OK
+
+
+def test_stdout_required_for_content_check(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="kernel_os",
+ command="uname -s",
+ success=True,
+ exit_code=0,
+ stdout=None,
+ )
+ )
+ args = GenericAnalyzerArgs(checks=[CommandCheck(name="kernel_os", must_contain="Linux")])
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.ERROR
+
+
+def test_allow_failure_passes_failed_command_check(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="optional",
+ command="false",
+ success=False,
+ exit_code=1,
+ stdout="",
+ )
+ )
+ args = GenericAnalyzerArgs(
+ checks=[CommandCheck(name="optional", allow_failure=True, expected_exit_code=1)],
+ )
+
+ result = analyzer.analyze_data(data, args)
+
+ assert result.status == ExecutionStatus.OK
+
+
+def test_no_checks_reports_collection_summary(analyzer):
+ data = _data(
+ CommandCollectionResult(
+ name="false_cmd",
+ command="false",
+ success=False,
+ exit_code=1,
+ stdout="",
+ )
+ )
+
+ result = analyzer.analyze_data(data, GenericAnalyzerArgs(checks=[]))
+
+ assert result.status == ExecutionStatus.OK
+ assert "0/1 commands collected" in result.message
+
+
+def test_analyzer_args_require_unique_check_names():
+ with pytest.raises(ValidationError, match="Duplicate check name"):
+ GenericAnalyzerArgs(
+ checks=[
+ CommandCheck(name="kernel_os", must_contain="Linux"),
+ CommandCheck(name="kernel_os", expected="Linux"),
+ ]
+ )
+
+
+def test_analyzer_args_require_check_name():
+ with pytest.raises(ValidationError):
+ GenericAnalyzerArgs(checks=[CommandCheck(name="", must_contain="Linux")])
diff --git a/test/unit/plugin/test_generic_collection_collector.py b/test/unit/plugin/test_generic_collection_collector.py
new file mode 100644
index 00000000..c7b3c802
--- /dev/null
+++ b/test/unit/plugin/test_generic_collection_collector.py
@@ -0,0 +1,223 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from unittest.mock import MagicMock
+
+import pytest
+from pydantic import ValidationError
+
+from nodescraper.connection.inband.inband import CommandArtifact
+from nodescraper.enums.executionstatus import ExecutionStatus
+from nodescraper.enums.systeminteraction import SystemInteractionLevel
+from nodescraper.plugins.generic_collection import (
+ CommandCollectionResult,
+ CommandSpec,
+ GenericCollectionCollector,
+ GenericCollectionCollectorArgs,
+ GenericCollectionDataModel,
+ GenericCollectionPlugin,
+)
+
+
+@pytest.fixture
+def collector(system_info, conn_mock):
+ return GenericCollectionCollector(
+ system_info=system_info,
+ system_interaction_level=SystemInteractionLevel.PASSIVE,
+ connection=conn_mock,
+ )
+
+
+def test_collect_all_commands_success(collector):
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[
+ CommandArtifact(exit_code=0, stdout="linux\n", stderr="", command="uname -s"),
+ CommandArtifact(exit_code=0, stdout="ok\n", stderr="", command="echo ok"),
+ ]
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[
+ CommandSpec(name="kernel_os", command="uname -s"),
+ CommandSpec(name="echo_ok", command="echo ok"),
+ ]
+ )
+
+ result, data = collector.collect_data(args)
+
+ assert result.status == ExecutionStatus.OK
+ assert data == GenericCollectionDataModel(
+ results=[
+ CommandCollectionResult(
+ name="kernel_os",
+ command="uname -s",
+ success=True,
+ exit_code=0,
+ sudo=False,
+ stdout="linux\n",
+ ),
+ CommandCollectionResult(
+ name="echo_ok",
+ command="echo ok",
+ success=True,
+ exit_code=0,
+ sudo=False,
+ stdout="ok\n",
+ ),
+ ]
+ )
+ assert collector._run_sut_cmd.call_count == 2
+
+
+def test_collect_reports_partial_failure(collector):
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[
+ CommandArtifact(exit_code=0, stdout="linux\n", stderr="", command="uname -s"),
+ CommandArtifact(exit_code=1, stdout="", stderr="failed", command="false"),
+ ]
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[
+ CommandSpec(name="kernel_os", command="uname -s"),
+ CommandSpec(name="false_cmd", command="false"),
+ ]
+ )
+
+ result, data = collector.collect_data(args)
+
+ assert result.status == ExecutionStatus.ERROR
+ assert data.results[0].success is True
+ assert data.results[1].success is False
+ assert data.results[1].exit_code == 1
+ assert data.results[1].stderr == "failed"
+
+
+def test_collect_no_commands(collector):
+ result, data = collector.collect_data(GenericCollectionCollectorArgs())
+
+ assert result.status == ExecutionStatus.NOT_RAN
+ assert data is None
+
+
+def test_collect_passes_global_sudo_and_timeout(collector):
+ collector._run_sut_cmd = MagicMock(
+ return_value=CommandArtifact(exit_code=0, stdout="", stderr="", command="id")
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[CommandSpec(name="user_id", command="id")],
+ sudo=True,
+ timeout=60,
+ )
+
+ collector.collect_data(args)
+
+ collector._run_sut_cmd.assert_called_once_with("id", sudo=True, timeout=60)
+
+
+def test_collect_per_command_sudo_overrides(collector):
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[
+ CommandArtifact(exit_code=0, stdout="", stderr="", command="id"),
+ CommandArtifact(exit_code=0, stdout="", stderr="", command="cat /var/log/messages"),
+ ]
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[
+ CommandSpec(name="user_id", command="id"),
+ CommandSpec(name="messages", command="cat /var/log/messages", sudo=True),
+ ],
+ sudo=False,
+ timeout=300,
+ )
+
+ result, data = collector.collect_data(args)
+
+ assert result.status == ExecutionStatus.OK
+ assert collector._run_sut_cmd.call_args_list[0].kwargs == {"sudo": False, "timeout": 300}
+ assert collector._run_sut_cmd.call_args_list[1].kwargs == {"sudo": True, "timeout": 300}
+ assert data.results[0].sudo is False
+ assert data.results[1].sudo is True
+
+
+def test_collect_per_command_timeout_override(collector):
+ collector._run_sut_cmd = MagicMock(
+ return_value=CommandArtifact(exit_code=0, stdout="", stderr="", command="sleep 1")
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[CommandSpec(name="sleep_one", command="sleep 1", timeout=10)],
+ timeout=300,
+ )
+
+ collector.collect_data(args)
+
+ collector._run_sut_cmd.assert_called_once_with("sleep 1", sudo=False, timeout=10)
+
+
+def test_collect_omits_stdout_when_disabled(collector):
+ collector._run_sut_cmd = MagicMock(
+ return_value=CommandArtifact(
+ exit_code=0, stdout="secret\n", stderr="", command="echo secret"
+ )
+ )
+ args = GenericCollectionCollectorArgs(
+ commands=[CommandSpec(name="secret", command="echo secret", include_stdout=False)],
+ include_stdout=True,
+ )
+
+ _, data = collector.collect_data(args)
+
+ assert data.results[0].stdout is None
+
+
+def test_collector_args_reject_plain_string_commands():
+ with pytest.raises(ValidationError, match="name' and 'command'"):
+ GenericCollectionCollectorArgs(commands=["uname -s"])
+
+
+def test_collector_args_require_name():
+ with pytest.raises(ValidationError):
+ GenericCollectionCollectorArgs(commands=[CommandSpec(name="", command="uname -s")])
+
+
+def test_collector_args_require_unique_names():
+ with pytest.raises(ValidationError, match="Duplicate command name"):
+ GenericCollectionCollectorArgs(
+ commands=[
+ CommandSpec(name="dup", command="uname -s"),
+ CommandSpec(name="dup", command="uname -m"),
+ ]
+ )
+
+
+def test_generic_collection_plugin_wiring():
+ from nodescraper.plugins.generic_collection import (
+ GenericAnalyzer,
+ GenericAnalyzerArgs,
+ )
+
+ assert GenericCollectionPlugin.DATA_MODEL is GenericCollectionDataModel
+ assert GenericCollectionPlugin.get_collector_classes() == (GenericCollectionCollector,)
+ assert GenericCollectionPlugin.COLLECTOR_ARGS is GenericCollectionCollectorArgs
+ assert GenericCollectionPlugin.ANALYZER is GenericAnalyzer
+ assert GenericCollectionPlugin.ANALYZER_ARGS is GenericAnalyzerArgs
diff --git a/test/unit/plugin/test_oob_bmc_archive_plugin.py b/test/unit/plugin/test_oob_bmc_archive_plugin.py
new file mode 100644
index 00000000..9040d330
--- /dev/null
+++ b/test/unit/plugin/test_oob_bmc_archive_plugin.py
@@ -0,0 +1,272 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from unittest.mock import MagicMock
+
+import pytest
+
+from nodescraper.base import OOBSSHDataPlugin
+from nodescraper.connection.inband.inband import BinaryFileArtifact, CommandArtifact
+from nodescraper.connection.redfish import RedfishConnectionManager
+from nodescraper.enums import ExecutionStatus, OSFamily, SystemLocation
+from nodescraper.models import SystemInfo, TaskResult
+from nodescraper.pluginregistry import PluginRegistry
+from nodescraper.plugins.ooband.bmc_archive import (
+ BmcArchiveCollector,
+ BmcArchiveCollectorArgs,
+ OobBmcArchivePlugin,
+ PathSpec,
+)
+
+
+@pytest.fixture
+def collector(monkeypatch):
+ monkeypatch.setattr(
+ "nodescraper.base.inbandcollectortask.InBandDataCollector.__init__",
+ lambda self, *args, **kwargs: None,
+ )
+ collector = BmcArchiveCollector(
+ system_info=SystemInfo(
+ hostname="bmc",
+ location=SystemLocation.REMOTE,
+ os_family=OSFamily.LINUX,
+ ),
+ connection=MagicMock(),
+ )
+ # InBandDataCollector.__init__ is stubbed, so Task/DataCollector init never runs; set the attributes the tests rely on by hand.
+ collector.parent = None
+ collector.task_result_hooks = []
+ collector.result = TaskResult(task=BmcArchiveCollector.__name__, parent=None)
+ collector.result.status = ExecutionStatus.OK
+ collector.result.message = ""
+ collector.logger = MagicMock()
+ return collector
+
+
+def test_oob_bmc_archive_plugin_registers():
+ assert OobBmcArchivePlugin.is_valid()
+ assert OobBmcArchivePlugin.ANALYZER is None
+ assert "OobBmcArchivePlugin" in PluginRegistry().plugins
+
+
+def test_oob_bmc_archive_plugin_uses_redfish_connection_manager_like_oob_generic_collection():
+ assert issubclass(OobBmcArchivePlugin, OOBSSHDataPlugin)
+ assert OobBmcArchivePlugin.CONNECTION_TYPE is RedfishConnectionManager
+
+
+def test_plugin_log_directory_name_uses_oob_prefix():
+ from nodescraper.utils import pascal_to_snake
+
+ assert pascal_to_snake("OobBmcArchivePlugin") == "oob_bmc_archive_plugin"
+
+
+def test_tar_command_uses_streaming_tar_and_redirect(collector):
+ cmd = collector._tar_command(
+ "/data/example_a",
+ "/tmp/node_scraper_archive_alpha.tar.gz",
+ ignore_failed_read=True,
+ )
+ assert (
+ cmd
+ == "tar czf - --ignore-failed-read '/data/example_a' > '/tmp/node_scraper_archive_alpha.tar.gz'"
+ )
+
+
+def test_collect_path_omits_ignore_failed_read_when_tar_lacks_option(collector):
+ """If ``--ignore-failed-read`` is not supported, fall back to plain tar."""
+ exists_result = CommandArtifact(
+ command="test -e '/data/example_a'", stdout="", stderr="", exit_code=0
+ )
+ probe_unsupported = CommandArtifact(
+ command="tar cf - --ignore-failed-read /dev/null",
+ stdout="",
+ stderr="tar: unrecognized option '--ignore-failed-read'\n",
+ exit_code=1,
+ )
+ tar_plain = CommandArtifact(
+ command="tar czf - '/data/example_a' > '/tmp/node_scraper_archive_alpha.tar.gz'",
+ stdout="",
+ stderr="",
+ exit_code=0,
+ )
+ read_result = BinaryFileArtifact(filename="archive_alpha.tar.gz", contents=b"x")
+ rm_result = CommandArtifact(command="rm -f", stdout="", stderr="", exit_code=0)
+
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[exists_result, probe_unsupported, tar_plain, rm_result]
+ )
+ collector._read_sut_file = MagicMock(return_value=read_result)
+ collector._log_event = MagicMock()
+
+ path_spec = PathSpec(name="archive_alpha", path="/data/example_a")
+ result, archive = collector._collect_path(
+ path_spec,
+ default_sudo=False,
+ default_timeout=600,
+ default_skip_if_missing=False,
+ default_ignore_failed_read=True,
+ )
+
+ assert result.success is True
+ assert archive is not None
+ collector._run_sut_cmd.assert_any_call(
+ "tar czf - '/data/example_a' > '/tmp/node_scraper_archive_alpha.tar.gz'",
+ sudo=False,
+ timeout=600,
+ log_artifact=True,
+ )
+
+
+def test_collect_path_reads_archive_after_tar(collector):
+ exists_result = CommandArtifact(
+ command="test -e '/data/example_a'", stdout="", stderr="", exit_code=0
+ )
+ probe_result = CommandArtifact(
+ command="tar cf - --ignore-failed-read /dev/null",
+ stdout="",
+ stderr="",
+ exit_code=0,
+ )
+ tar_result = CommandArtifact(
+ command="tar czf - --ignore-failed-read '/data/example_a' > '/tmp/node_scraper_archive_alpha.tar.gz'",
+ stdout="",
+ stderr="",
+ exit_code=0,
+ )
+ archive_bytes = b"fake-gzip-data"
+ read_result = BinaryFileArtifact(filename="archive_alpha.tar.gz", contents=archive_bytes)
+ rm_result = CommandArtifact(
+ command="rm -f '/tmp/node_scraper_archive_alpha.tar.gz'",
+ stdout="",
+ stderr="",
+ exit_code=0,
+ )
+
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[exists_result, probe_result, tar_result, rm_result]
+ )
+ collector._read_sut_file = MagicMock(return_value=read_result)
+ collector._log_event = MagicMock()
+
+ path_spec = PathSpec(name="archive_alpha", path="/data/example_a")
+ result, archive = collector._collect_path(
+ path_spec,
+ default_sudo=False,
+ default_timeout=600,
+ default_skip_if_missing=False,
+ default_ignore_failed_read=True,
+ )
+
+ assert result.success is True
+ assert result.size_bytes == len(archive_bytes)
+ assert archive is not None
+ assert archive.filename == "archive_alpha.tar.gz"
+ collector._run_sut_cmd.assert_any_call(
+ "tar czf - --ignore-failed-read '/data/example_a' > '/tmp/node_scraper_archive_alpha.tar.gz'",
+ sudo=False,
+ timeout=600,
+ log_artifact=True,
+ )
+ collector._read_sut_file.assert_called_once_with(
+ "/tmp/node_scraper_archive_alpha.tar.gz",
+ encoding=None,
+ strip=False,
+ log_artifact=True,
+ )
+
+
+def test_collect_path_skips_missing_path_when_configured(collector):
+ missing_result = CommandArtifact(
+ command="test -e '/data/missing'",
+ stdout="",
+ stderr="",
+ exit_code=1,
+ )
+ collector._run_sut_cmd = MagicMock(return_value=missing_result)
+ collector._log_event = MagicMock()
+
+ path_spec = PathSpec(name="archive_missing", path="/data/missing")
+ result, archive = collector._collect_path(
+ path_spec,
+ default_sudo=False,
+ default_timeout=600,
+ default_skip_if_missing=True,
+ default_ignore_failed_read=True,
+ )
+
+ assert result.skipped is True
+ assert result.success is False
+ assert archive is None
+ collector._run_sut_cmd.assert_called_once()
+
+
+def test_collect_data_not_ran_without_paths(collector):
+ collector._log_event = MagicMock()
+ task_result, data = collector.collect_data(BmcArchiveCollectorArgs(paths=[]))
+
+ assert task_result.status == ExecutionStatus.NOT_RAN
+ assert data is None
+ assert "collection_args.paths" in task_result.message
+
+
+def test_collect_data_reports_partial_failures(collector):
+ exists_ok = CommandArtifact(command="test -e", stdout="", stderr="", exit_code=0)
+ probe_ok = CommandArtifact(
+ command="tar cf - --ignore-failed-read /dev/null", stdout="", stderr="", exit_code=0
+ )
+ ok_tar = CommandArtifact(command="tar", stdout="", stderr="", exit_code=0)
+ fail_tar = CommandArtifact(command="tar", stdout="", stderr="missing", exit_code=2)
+ no_archive = CommandArtifact(command="test -s", stdout="", stderr="", exit_code=1)
+ rm = CommandArtifact(command="rm", stdout="", stderr="", exit_code=0)
+ archive = BinaryFileArtifact(filename="archive_alpha.tar.gz", contents=b"data")
+
+ collector._run_sut_cmd = MagicMock(
+ side_effect=[
+ exists_ok,
+ probe_ok,
+ ok_tar,
+ rm,
+ exists_ok,
+ fail_tar,
+ no_archive,
+ rm,
+ ]
+ )
+ collector._read_sut_file = MagicMock(return_value=archive)
+ collector._log_event = MagicMock()
+
+ args = BmcArchiveCollectorArgs(
+ paths=[
+ PathSpec(name="archive_alpha", path="/data/example_a"),
+ PathSpec(name="archive_beta", path="/data/example_b"),
+ ]
+ )
+ task_result, data = collector.collect_data(args)
+
+ assert task_result.status == ExecutionStatus.ERROR
+ assert data is not None
+ assert len(data.archives) == 1
+ assert data.results[0].success is True
+ assert data.results[1].success is False
diff --git a/test/unit/plugin/test_oob_generic_collection_plugin.py b/test/unit/plugin/test_oob_generic_collection_plugin.py
new file mode 100644
index 00000000..8f7698cb
--- /dev/null
+++ b/test/unit/plugin/test_oob_generic_collection_plugin.py
@@ -0,0 +1,73 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from nodescraper.base import InBandDataPlugin, OOBSSHDataPlugin
+from nodescraper.connection.inband.inbandmanager import InBandConnectionManager
+from nodescraper.connection.redfish import RedfishConnectionManager
+from nodescraper.pluginregistry import PluginRegistry
+from nodescraper.plugins.generic_collection import (
+ GenericAnalyzer,
+ GenericAnalyzerArgs,
+ GenericCollectionCollector,
+ GenericCollectionCollectorArgs,
+ GenericCollectionDataModel,
+ GenericCollectionPlugin,
+ GenericCollectionPluginMixin,
+ OobGenericCollectionPlugin,
+)
+
+
+def test_generic_collection_plugins_are_valid():
+ assert GenericCollectionPlugin.is_valid()
+ assert OobGenericCollectionPlugin.is_valid()
+
+
+def test_generic_collection_plugins_register():
+ registry = PluginRegistry()
+ assert "GenericCollectionPlugin" in registry.plugins
+ assert "OobGenericCollectionPlugin" in registry.plugins
+
+
+def test_generic_collection_plugin_mixin_wiring():
+ for plugin_cls in (GenericCollectionPlugin, OobGenericCollectionPlugin):
+ assert plugin_cls.DATA_MODEL is GenericCollectionDataModel
+ assert plugin_cls.get_collector_classes() == (GenericCollectionCollector,)
+ assert plugin_cls.COLLECTOR_ARGS is GenericCollectionCollectorArgs
+ assert plugin_cls.ANALYZER is GenericAnalyzer
+ assert plugin_cls.ANALYZER_ARGS is GenericAnalyzerArgs
+
+
+def test_generic_collection_plugin_uses_inband_base():
+ assert issubclass(GenericCollectionPlugin, InBandDataPlugin)
+ assert issubclass(GenericCollectionPlugin, GenericCollectionPluginMixin)
+ assert GenericCollectionPlugin.CONNECTION_TYPE is InBandConnectionManager
+
+
+def test_oob_generic_collection_plugin_uses_oob_ssh_base():
+ assert issubclass(OobGenericCollectionPlugin, OOBSSHDataPlugin)
+ assert issubclass(OobGenericCollectionPlugin, GenericCollectionPluginMixin)
+ assert OobGenericCollectionPlugin.CONNECTION_TYPE is RedfishConnectionManager
+ assert GenericCollectionPlugin.COLLECTOR is OobGenericCollectionPlugin.COLLECTOR
+ assert GenericCollectionPlugin.ANALYZER is OobGenericCollectionPlugin.ANALYZER
diff --git a/test/unit/plugin/test_pcie_collector.py b/test/unit/plugin/test_pcie_collector.py
new file mode 100644
index 00000000..6aabc5c0
--- /dev/null
+++ b/test/unit/plugin/test_pcie_collector.py
@@ -0,0 +1,111 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+from unittest.mock import MagicMock
+
+import pytest
+
+from nodescraper.enums.systeminteraction import SystemInteractionLevel
+from nodescraper.plugins.inband.pcie.pcie_collector import PcieCollector
+
+
+@pytest.fixture
+def collector(system_info, conn_mock):
+ return PcieCollector(
+ system_info=system_info,
+ system_interaction_level=SystemInteractionLevel.PASSIVE,
+ connection=conn_mock,
+ )
+
+
+LSPCI_PP_D_MULTI_DOMAIN_OUTPUT = (
+ "0001:00:01.1/00:02.0/00:03.0 Processing accelerators: Advanced Micro Devices, Inc."
+)
+
+LSPCI_PP_D_SINGLE_DOMAIN_OUTPUT = (
+ "00:01.1/00:02.0/00:03.0 Processing accelerators: Advanced Micro Devices, Inc."
+)
+
+
+def test_get_upstream_bdf_uses_lspci_pp_d_command(collector):
+ """Upstream BDF lookup must use lspci -PP -D -d for multi-domain path output."""
+ collector._run_os_cmd = MagicMock(return_value=LSPCI_PP_D_MULTI_DOMAIN_OUTPUT)
+
+ collector._get_upstream_bdf_from_buspath("1002", "74a1")
+
+ collector._run_os_cmd.assert_called_once_with(
+ collector.CMD_LSPCI_PATH_DEVICE_DOMAIN.format(vendor_id="1002", dev_id="74a1"),
+ sudo=True,
+ )
+
+
+def test_get_upstream_bdf_propagates_domain_prefix(collector):
+ """Bare downstream BDFs inherit the domain prefix from the root path component."""
+ collector._run_os_cmd = MagicMock(return_value=LSPCI_PP_D_MULTI_DOMAIN_OUTPUT)
+
+ upstream_bdfs = collector._get_upstream_bdf_from_buspath("1002", "74a1", upstream_steps_limit=2)
+
+ assert upstream_bdfs == {
+ "0001:00:03.0": ["0001:00:03.0", "0001:00:02.0", "0001:00:01.1"],
+ }
+
+
+def test_get_upstream_bdf_defaults_domain_to_0000(collector):
+ """When no domain prefix is present, bare BDFs default to domain 0000."""
+ collector._run_os_cmd = MagicMock(return_value=LSPCI_PP_D_SINGLE_DOMAIN_OUTPUT)
+
+ upstream_bdfs = collector._get_upstream_bdf_from_buspath("1002", "74a1", upstream_steps_limit=1)
+
+ assert upstream_bdfs == {
+ "0000:00:03.0": ["0000:00:03.0", "0000:00:02.0"],
+ }
+
+
+def test_show_lspci_path_domain_uses_correct_command(collector):
+ """Artifact collection runs lspci -PP -D."""
+ collector._run_os_cmd = MagicMock(return_value="0001:00:01.1/00:02.0")
+
+ result = collector.show_lspci_path_domain(sudo=False)
+
+ collector._run_os_cmd.assert_called_once_with(collector.CMD_LSPCI_PATH_DOMAIN, sudo=False)
+ assert result == "0001:00:01.1/00:02.0"
+
+
+def test_log_pcie_artifacts_includes_lspci_pp_d(collector):
+ """Domain-prefixed path view is saved as lspci_pp_d.txt."""
+ collector._log_pcie_artifacts(
+ lspci_pp="00:03.0/00:02.0",
+ lspci_pp_d="0001:00:01.1/0001:00:02.0/0001:00:03.0",
+ lspci_hex="00:",
+ lspci_verbose_tree="tree",
+ lspci_verbose="verbose",
+ )
+
+ artifact_names = {artifact.filename for artifact in collector.result.artifacts}
+ assert "lspci_pp_d.txt" in artifact_names
+ lspci_pp_d = next(
+ artifact for artifact in collector.result.artifacts if artifact.filename == "lspci_pp_d.txt"
+ )
+ assert lspci_pp_d.contents == "0001:00:01.1/0001:00:02.0/0001:00:03.0"
diff --git a/test/unit/plugin/test_pcie_plugin.py b/test/unit/plugin/test_pcie_plugin.py
new file mode 100644
index 00000000..ccc61c91
--- /dev/null
+++ b/test/unit/plugin/test_pcie_plugin.py
@@ -0,0 +1,140 @@
+###############################################################################
+#
+# MIT License
+#
+# Copyright (c) 2026 Advanced Micro Devices, Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+###############################################################################
+
+import json
+from pathlib import Path
+
+import pytest
+
+from nodescraper.models import PluginConfig
+from nodescraper.plugins.inband.pcie.analyzer_args import PcieAnalyzerArgs
+from nodescraper.plugins.inband.pcie.pcie_analyzer import PcieAnalyzer
+from nodescraper.plugins.inband.pcie.pcie_collector import PcieCollector
+from nodescraper.plugins.inband.pcie.pcie_data import PcieDataModel
+from nodescraper.plugins.inband.pcie.pcie_plugin import PciePlugin
+
+
+@pytest.fixture
+def fixtures_dir():
+ return Path(__file__).parent / "fixtures"
+
+
+@pytest.fixture
+def pcie_config_file(fixtures_dir):
+ return fixtures_dir / "pcie_plugin_config.json"
+
+
+@pytest.fixture
+def pcie_advanced_config_file(fixtures_dir):
+ return fixtures_dir / "pcie_plugin_advanced_config.json"
+
+
+def _load_plugin_config(path: Path) -> PluginConfig:
+ return PluginConfig.model_validate(json.loads(path.read_text()))
+
+
+def _pcie_analysis_args(config: PluginConfig) -> PcieAnalyzerArgs:
+ return PcieAnalyzerArgs.model_validate(config.plugins["PciePlugin"]["analysis_args"])
+
+
+def test_pcie_plugin_class_attributes():
+ assert PciePlugin.DATA_MODEL is PcieDataModel
+ assert PciePlugin.COLLECTOR is PcieCollector
+ assert PciePlugin.ANALYZER is PcieAnalyzer
+ assert PciePlugin.ANALYZER_ARGS is PcieAnalyzerArgs
+
+
+def test_pcie_plugin_basic_config_fixture_exists(pcie_config_file):
+ assert pcie_config_file.exists(), f"Config file not found: {pcie_config_file}"
+
+
+def test_pcie_plugin_advanced_config_fixture_exists(pcie_advanced_config_file):
+ assert pcie_advanced_config_file.exists(), f"Config file not found: {pcie_advanced_config_file}"
+
+
+def test_pcie_plugin_with_basic_config(pcie_config_file):
+ """The basic config's analysis_args validate into PcieAnalyzerArgs with integer values."""
+ config = _load_plugin_config(pcie_config_file)
+ args = _pcie_analysis_args(config)
+
+ assert config.name == "PciePlugin config"
+ assert args.exp_speed == 5
+ assert args.exp_width == 16
+ assert args.exp_sriov_count == 8
+ assert args.exp_gpu_count_override == 4
+ assert args.exp_max_payload_size == 256
+ assert args.exp_max_rd_req_size == 512
+ assert args.exp_ten_bit_tag_req_en == 1
+
+
+def test_pcie_plugin_with_advanced_config(pcie_advanced_config_file):
+ """Advanced config file supports device-specific analyzer args."""
+ config = _load_plugin_config(pcie_advanced_config_file)
+ args = _pcie_analysis_args(config)
+
+ assert config.name == "PciePlugin advanced config"
+ assert args.exp_max_payload_size == {29631: 256, 29711: 512}
+ assert args.exp_max_rd_req_size == {29631: 512, 29711: 1024}
+ assert args.exp_ten_bit_tag_req_en == {29631: 1, 29711: 0}
+
+
+def test_pcie_plugin_combined_configs(pcie_config_file, pcie_advanced_config_file):
+    """Multiple plugin configs merge with later PciePlugin settings taking precedence."""
+    basic = _load_plugin_config(pcie_config_file)
+    advanced = _load_plugin_config(pcie_advanced_config_file)
+
+    merged = PluginConfig.merge(basic, advanced)
+    args = _pcie_analysis_args(merged)
+
+    assert merged.name == "PciePlugin config"
+    assert isinstance(args.exp_max_payload_size, dict)
+    assert args.exp_max_payload_size[29631] == 256
+    assert args.exp_max_payload_size[29711] == 512
+    assert args.exp_max_rd_req_size[29711] == 1024
+
+
+def test_pcie_plugin_run_plugins_entry_present(pcie_config_file):
+    """PciePlugin is configured for collection and analysis via plugin config."""
+    config = _load_plugin_config(pcie_config_file)
+
+    assert "PciePlugin" in config.plugins
+    assert "analysis_args" in config.plugins["PciePlugin"]
+
+
+def test_pcie_plugin_passive_interaction_config(pcie_config_file):
+    """PASSIVE runs use the same analysis args shape as the basic plugin config."""
+    config = _load_plugin_config(pcie_config_file)
+    args = PcieAnalyzerArgs.model_validate(config.plugins["PciePlugin"]["analysis_args"])
+
+    assert args.exp_speed == 5
+    assert args.exp_width == 16
+
+
+def test_pcie_plugin_skip_sudo_config(pcie_config_file):
+    """Skip-sudo scenarios still load the same PciePlugin analysis args from config."""
+    config = _load_plugin_config(pcie_config_file)
+
+    assert config.plugins["PciePlugin"]["analysis_args"]["exp_gpu_count_override"] == 4
diff --git a/test/unit/plugin/test_syslog_collector.py b/test/unit/plugin/test_syslog_collector.py
index 6d322033..14636ae4 100644
--- a/test/unit/plugin/test_syslog_collector.py
+++ b/test/unit/plugin/test_syslog_collector.py
@@ -74,6 +74,8 @@ def test_collect_rotations_good_path(monkeypatch, system_info, conn_mock):
     def run_map(cmd, **kwargs):
         if cmd.startswith("ls -1 /var/log/syslog"):
             return DummyRes(command=cmd, stdout=ls_out, exit_code=0)
+        if cmd.startswith("ls -1 /var/log/messages"):
+            return DummyRes(command=cmd, stdout="", exit_code=0)
         if cmd.startswith("cat "):
             if "/var/log/syslog.1" in cmd:
@@ -102,7 +104,7 @@ def run_map(cmd, **kwargs):
 def test_collect_rotations_no_files(monkeypatch, system_info, conn_mock):
     def run_map(cmd, **kwargs):
-        if cmd.startswith("ls -1 /var/log/syslog"):
+        if cmd.startswith("ls -1 /var/log/syslog") or cmd.startswith("ls -1 /var/log/messages"):
             return DummyRes(command=cmd, stdout="", exit_code=0)
         return DummyRes(command=cmd, stdout="", exit_code=1)
@@ -113,7 +115,7 @@ def run_map(cmd, **kwargs):
     assert c.result.artifacts == []
     assert any(
-        e["description"].startswith("No /var/log/syslog files found")
+        e["description"].startswith("No /var/log/syslog or /var/log/messages files found")
         and getattr(e["priority"], "name", str(e["priority"])) == "WARNING"
         for e in c._events
     )
@@ -125,6 +127,8 @@ def test_collect_rotations_gz_failure(monkeypatch, system_info, conn_mock):
     def run_map(cmd, **kwargs):
         if cmd.startswith("ls -1 /var/log/syslog"):
             return DummyRes(command=cmd, stdout=ls_out, exit_code=0)
+        if cmd.startswith("ls -1 /var/log/messages"):
+            return DummyRes(command=cmd, stdout="", exit_code=0)
         if "gzip -dc" in cmd and "/var/log/syslog.2.gz" in cmd:
             return DummyRes(command=cmd, stdout="", exit_code=1, stderr="gzip: not found")
         return DummyRes(command=cmd, stdout="", exit_code=1)
@@ -149,6 +153,8 @@ def test_collect_data_integration(monkeypatch, system_info, conn_mock):
     def run_map(cmd, **kwargs):
         if cmd.startswith("ls -1 /var/log/syslog"):
             return DummyRes(command=cmd, stdout=ls_out, exit_code=0)
+        if cmd.startswith("ls -1 /var/log/messages"):
+            return DummyRes(command=cmd, stdout="", exit_code=0)
         if cmd.startswith("cat ") and "/var/log/syslog" in cmd:
             return DummyRes(command=cmd, stdout="syslog file content\n", exit_code=0)
         return DummyRes(command=cmd, stdout="", exit_code=1)
@@ -159,3 +165,40 @@ def run_map(cmd, **kwargs):
     assert isinstance(data, SyslogData)
     assert data.syslog_logs[0].filename == "rotated_syslog.log"
     assert c.result.message == "Syslog data collected"
+
+
+def test_collect_rotations_messages_good_path(monkeypatch, system_info, conn_mock):
+    ls_out = (
+        "\n".join(
+            [
+                "/var/log/messages",
+                "/var/log/messages.1",
+                "/var/log/messages.2.gz",
+            ]
+        )
+        + "\n"
+    )
+
+    def run_map(cmd, **kwargs):
+        if cmd.startswith("ls -1 /var/log/syslog"):
+            return DummyRes(command=cmd, stdout="", exit_code=0)
+        if cmd.startswith("ls -1 /var/log/messages"):
+            return DummyRes(command=cmd, stdout=ls_out, exit_code=0)
+
+        if cmd.startswith("cat "):
+            if "/var/log/messages.1" in cmd:
+                return DummyRes(command=cmd, stdout="messages.1 content\n", exit_code=0)
+            if "/var/log/messages" in cmd:
+                return DummyRes(command=cmd, stdout="messages content\n", exit_code=0)
+
+        if "gzip -dc" in cmd and "/var/log/messages.2.gz" in cmd:
+            return DummyRes(command=cmd, stdout="messages gz content\n", exit_code=0)
+
+        return DummyRes(command=cmd, stdout="", exit_code=1, stderr="unexpected")
+
+    c = get_collector(monkeypatch, run_map, system_info, conn_mock)
+
+    n = c._collect_syslog_rotations()
+    assert n[0].filename == "rotated_messages.log"
+    assert n[1].filename == "rotated_messages.1.log"
+    assert n[2].filename == "rotated_messages.2.gz.log"