KVM HA: Fix CheckOnHostAnswer success flag when there is no heartbeat #13373

Open
sureshanaparti wants to merge 4 commits into apache:4.22 from shapeblue:ha-checkonhostanswer-fix

@sureshanaparti
Contributor

@sureshanaparti sureshanaparti commented Jun 8, 2026

Description

This PR fixes the CheckOnHostAnswer success flag when there is no heartbeat, for KVM HA.

Fixes #13371

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@sureshanaparti
Contributor Author

@blueorangutan package

@sureshanaparti sureshanaparti linked an issue Jun 8, 2026 that may be closed by this pull request
@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov Bot commented Jun 8, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.67%. Comparing base (21b2025) to head (4d298b4).

Files with missing lines                                | Patch % | Lines
...ache/cloudstack/kvm/ha/KVMHostActivityChecker.java   | 0.00%   | 1 Missing ⚠️
...oud/hypervisor/vmware/resource/VmwareResource.java   | 0.00%   | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #13373      +/-   ##
============================================
- Coverage     17.67%   17.67%   -0.01%     
+ Complexity    15792    15791       -1     
============================================
  Files          5922     5922              
  Lines        533165   533166       +1     
  Branches      65208    65208              
============================================
  Hits          94242    94242              
- Misses       428276   428277       +1     
  Partials      10647    10647              
Flag      | Coverage Δ
uitests   | 3.69% <ø> (ø)
unittests | 18.75% <50.00%> (-0.01%) ⬇️


@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@sureshanaparti sureshanaparti requested a review from Copilot June 8, 2026 12:09
@sureshanaparti sureshanaparti changed the base branch from main to 4.22 June 8, 2026 12:12
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 18187

@weizhouapache
Member

@sureshanaparti
There is a test failure:

14:34:05 [ERROR] testCheckOnHostCommand(com.cloud.hypervisor.kvm.resource.LibvirtComputingResourceTest)  Time elapsed: 0.279 s  <<< FAILURE!
14:34:05 java.lang.AssertionError
14:34:05 	at com.cloud.hypervisor.kvm.resource.LibvirtComputingResourceTest.testCheckOnHostCommand(LibvirtComputingResourceTest.java:3133)

@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@sureshanaparti sureshanaparti marked this pull request as ready for review June 8, 2026 13:20
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18189

@weizhouapache
Member

@sureshanaparti
It looks like the change has no effect:

public CheckOnHostAnswer(CheckOnHostCommand cmd, String details) {
    super(cmd, false, details);
    determined = false;
    alive = false;
}

These two lines actually behave the same:

                return new CheckOnHostAnswer(command, "Heart is not beating");
                return new CheckOnHostAnswer(command, false, "Heart is not beating");

@sureshanaparti
Contributor Author

sureshanaparti commented Jun 8, 2026

@weizhouapache The answer's success flag is set to true in the second case. The result should be true, which indicates that the command was processed without any errors; the caller then reads the other details (isAlive, etc.). In that case answer.getResult() is true here:

if (answer != null) {
    if (answer.getResult()) {
        hostStatusFromNeighbour = ((CheckOnHostAnswer)answer).isAlive() ? Status.Up : Status.Down;
        logger.debug("Neighboring {} returned status [{}] for the investigated {}.", neighbor.toString(), hostStatusFromNeighbour, host.toString());
        if (hostStatusFromNeighbour == Status.Up) {
            return hostStatusFromNeighbour;
        }
    } else {
        logger.debug("{} is not active according to neighbor {}, details: {}.", host.toString(), neighbor.toString(), answer.getDetails());
    }
} else {
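
To make the distinction concrete, here is a minimal, self-contained sketch of how the two constructor forms would lead to different outcomes in a caller shaped like the snippet above. SketchAnswer, statusFromNeighbour and HeartbeatAnswerSketch are hypothetical stand-ins, not CloudStack classes, and the behaviour of the three-argument form (success flag true, liveness carried separately) is an assumption based on the explanation in this comment.

```java
// Simplified stand-ins for the classes discussed in this thread; not the real CloudStack API.
class HeartbeatAnswerSketch {

    enum Status { Up, Down }

    /** Hypothetical, reduced model of an answer with a success flag and a liveness flag. */
    static final class SketchAnswer {
        private final boolean result; // did the command itself complete without errors?
        private final boolean alive;  // what the neighbour observed about the host
        private final String details;

        // Mirrors the two-argument constructor quoted earlier: super(cmd, false, details).
        SketchAnswer(String details) {
            this.result = false;
            this.alive = false;
            this.details = details;
        }

        // Assumed behaviour of the three-argument form: the command succeeded,
        // and liveness is reported separately.
        SketchAnswer(boolean alive, String details) {
            this.result = true;
            this.alive = alive;
            this.details = details;
        }

        boolean getResult() { return result; }
        boolean isAlive() { return alive; }
        String getDetails() { return details; }
    }

    /** Returns the status reported by a neighbour, or null when nothing can be concluded. */
    static Status statusFromNeighbour(SketchAnswer answer) {
        if (answer != null && answer.getResult()) {
            return answer.isAlive() ? Status.Up : Status.Down;
        }
        return null; // missing or unsuccessful answer: status stays undetermined
    }

    public static void main(String[] args) {
        // Two-argument form: the caller never reaches the isAlive() branch.
        System.out.println(statusFromNeighbour(new SketchAnswer("Heart is not beating")));
        // Three-argument form: the caller can conclude that the host is down.
        System.out.println(statusFromNeighbour(new SketchAnswer(false, "Heart is not beating")));
    }
}
```

Under these assumptions, the first call leaves the status undetermined, while the second lets the caller report Status.Down for the investigated host.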

@weizhouapache
Member

weizhouapache commented Jun 8, 2026

OK, I will re-test.

Member

@weizhouapache weizhouapache left a comment


Verified OK.

Development

Successfully merging this pull request may close these issues.

Host HA is broken in CloudStack 4.22.1.0

4 participants