Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KVM HA Enhancements for Ceph RBD Primary Storage #5862

Open
wants to merge 51 commits into
base: main
Choose a base branch
from

Conversation

dhslove
Copy link
Contributor

@dhslove dhslove commented Jan 14, 2022

Description

This PR includes the features to provide high availability using Ceph RBD-based Primary Storage.

The KVM Host HA feature uses NFS Primary Storage. However, an additional storage consideration in KVM is Ceph RBD. This PR considers the following two to provide HA using RBD.

  1. Must use the multi-monitor server feature of Ceph RBD.
  2. The RBD-based HA function should be provided in the same way as the NFS-based HA.

This PR includes the feature to enter multiple monitor addresses separated by commas in the RADOS Monitor item when registering RBD Primary Storage.

And it includes adding code to provide HA based on RBD Primary Storage.

To support RBD-based HA, add RBDStoragePool class to KVMHABase and use it to perform heartbeat check and activity check in the same way as NFS. Similar to how NFS writes hb and ac files to a KVMHA directory, RADOS commands are used to put hb and ac files into the RBD pool registered as primary storage.

To perform this heartbeat and activity check, kvmheartbeat_rbd.sh and kvmvmactivity_rbd.sh files were added. This is to clearly distinguish roles from the same shell file for nfs.

This PR is necessary to support agent-free HA based on Ceph RBD like NFS, even if the related PR #4978 is merged.

The execution result of the function can be checked in the attached image.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

RBD Multiple Monitor Support - Add RBD Primary Storage
rbd-primary-add

RBD Multiple Monitor Support - RBD Primary Storage List
primarystorage-list

RBD Multiple Monitor Support - KVM Host Secret List
secret-list

RBD Multiple Monitor Support - VM Disk XML(Multi RBD Monitor)
disk-xml

RBD Based KVM Host HA - One of Hosts Off
host-list

RBD Based KVM Host HA - Host Suspect State
host-status

RBD Based KVM Host HA - Host Down State
host-down1

ycyun and others added 13 commits October 20, 2021 17:05
RBD primary storage 연결시 + 및 / 기호를 -와 _ 로 변경하는 로직 개선
rbd multi monitor 지원
comma로 넣을 경우 userInfo 도 null 로 출력되는 현상 수정
rbd일 때 uuid 에러로 생성로직 변경
multi monitor인 경우 로직 변경
코드 정리
Primary Storage가 RBD인 경우에  Host HA가 동작하도록 기능개선
storagepool > pooltype 제거 및 메서드명변경(getRbdMonIpAddress)
rbd pool auth 변경가능하도록 수정
nfs, rbd Storage hb모니터링 함께 되도록 수정
@sureshanaparti
Copy link
Contributor

@blueorangutan package

@blueorangutan
Copy link

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2208

@sureshanaparti
Copy link
Contributor

@blueorangutan test

@blueorangutan
Copy link

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@wido
Copy link
Contributor

wido commented Jan 14, 2022

This looks very interesting. I'll review this and get back. Give me a few days

@blueorangutan
Copy link

Trillian test result (tid-2879)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33668 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5862-t2879-kvm-centos7.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@wido
Copy link
Contributor

wido commented Jan 17, 2022

May I ask why you are not just looking up if there is a Watcher on RBD images?

@dhslove
Copy link
Contributor Author

dhslove commented Jan 18, 2022

Thanks your review. @wido

The reason is

  1. To use the same method as NFS (modified time check),
  2. RBD Watcher continues to exist when the KVM host hangs (Watcher's inaccuracy in some cases).

sjyu1 and others added 4 commits January 28, 2022 14:23
코딩규칙 및 요청사항 수정
runHeartbeatToPool 메서드명 수정
HA Cloudstack 요청사항 수정
Copy link
Contributor

@wido wido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced that this is the best route forward.

RBD already supports watchers and exclusive locking on images by default. If a host dies you should see within 30 sec or so if it is no longer watching a RBD image and also has not refreshed it's lock anymore.

This external script we are calling also does not support IPv6 (InetAddress) and would rely on external tooling.

Imho we should have each Host maybe just 'watch' a RBD image. Other hosts can check if there still is a watcher on that image. If not we can assume the Host has died.

Overall I think we should not merge this and rely on the things Ceph already has like exclusive locking and watchers on RBD images.

1. 각 호스트용 rbd 이미지(hb-hostip) 생성하여 watcher 감시
2. /etc/ceph 파일생성 대신 -c, -k 옵션을 사용하여 rbd 실행
3. keyring파일은 /etc/cloudstack/agent/ 경로에 생성
py파일에 파이썬 버전 설정
Copy link
Contributor

@wido wido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good

@DaanHoogland
Copy link
Contributor

@GutoVeronezi @ravening I resolved all outdated comments. cc @weizhouapache. Do you giys approve?

@weizhouapache
Copy link
Member

@blueorangutan package

@blueorangutan
Copy link

@weizhouapache a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✖️ suse15. SL-JID 4814

@weizhouapache
Copy link
Member

@dhslove
can you solve the compilation error ?

[ERROR] COMPILATION ERROR : 
[ERROR] /home/travis/build/apache/cloudstack/plugins/storage/volume/default/src/main/java/org/apache/cloudstack/storage/datastore/lifecycle/CloudStackPrimaryDataStoreLifeCycleImpl.java:[241,16] error: cannot find symbol
  symbol:   variable multi
  location: class CloudStackPrimaryDataStoreLifeCycleImpl
[ERROR] /home/travis/build/apache/cloudstack/plugins/storage/volume/default/src/main/java/org/apache/cloudstack/storage/datastore/lifecycle/CloudStackPrimaryDataStoreLifeCycleImpl.java:[241,15] error: illegal parenthesized expression
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project cloud-plugin-storage-volume-default: Compilation failure: Compilation failure: 
[ERROR] /home/travis/build/apache/cloudstack/plugins/storage/volume/default/src/main/java/org/apache/cloudstack/storage/datastore/lifecycle/CloudStackPrimaryDataStoreLifeCycleImpl.java:[241,16] error: cannot find symbol
[ERROR]   symbol:   variable multi
[ERROR]   location: class CloudStackPrimaryDataStoreLifeCycleImpl
[ERROR] /home/travis/build/apache/cloudstack/plugins/storage/volume/default/src/main/java/org/apache/cloudstack/storage/datastore/lifecycle/CloudStackPrimaryDataStoreLifeCycleImpl.java:[241,15] error: illegal parenthesized expression
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] https://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :cloud-plugin-storage-volume-default
The command "./tools/travis/install.sh" failed and exited with 1 during .

Copy link
Contributor

@GutoVeronezi GutoVeronezi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhslove, ACS already uses java bindings for librados in com.ceph.rados.Rados; could not the heartbeat be implemented through Java instead of creating Python scripts? cc: @wido

for (StoragePoolVO pool : clusterPools) {
if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
hasNfs = true;
if (pool.getPoolType() == StoragePoolType.NetworkFilesystem || pool.getPoolType() == StoragePoolType.RBD) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhslove as these conditions are used more than once, we could create a method in StoragePoolType to do the validation. Something like this:

        public boolean supportsHa() {
            return this.equals(NetworkFilesystem) || this.equals(RBD);
        }

And then:

Suggested change
if (pool.getPoolType() == StoragePoolType.NetworkFilesystem || pool.getPoolType() == StoragePoolType.RBD) {
if (pool.getPoolType().supportsHa()) {

Comment on lines 62 to 69
String _poolUUID;
String _monHost;
String _poolMountSourcePath;
String _mountDestPath;
PoolType _type;
String _poolAuthUserName;
String _poolAuthSecret;
String _poolSourceHost;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These _ in the begin of the name is a legacy behavior, it should not be used anymore.

@@ -57,6 +58,28 @@ public NfsStoragePool(String poolUUID, String poolIp, String poolSourcePath, Str
}
}

public static class RbdStoragePool {
String _poolUUID;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, using strict camel case (non consecutive capitals/treat acronyms as normal words) would improve readability; with this approach, the variable name would be poolUuid. Also, this approach is already being applied in the class name: it is RbdStoragePool instead of RBDStoragePool.

String _poolAuthSecret;
String _poolSourceHost;

public RbdStoragePool(String poolUUID, String monHost, String poolSourcePath, String mountDestPath, PoolType type, String poolAuthUserName, String poolAuthSecret, String poolSourceHost) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same remark about applying camel case to acronyms.

Comment on lines +81 to +120
for (RbdStoragePool rbdpools : rbdStoragePools) {
hostAndPools = String.format("host IP [%s] in pools [%s]", hostIp, rbdStoragePools.stream().map(pool -> pool._monHost).collect(Collectors.joining(", ")));
s_logger.debug(String.format("Checking heart beat with KVMHAChecker for %s", hostAndPools));

ProcessBuilder processBuilder = new ProcessBuilder();
processBuilder.command().add("python3");
processBuilder.command().add(s_heartBeatPathRbd);
processBuilder.command().add("-i");
processBuilder.command().add(rbdpools._poolSourceHost);
processBuilder.command().add("-p");
processBuilder.command().add(rbdpools._poolMountSourcePath);
processBuilder.command().add("-n");
processBuilder.command().add(rbdpools._poolAuthUserName);
processBuilder.command().add("-s");
processBuilder.command().add(rbdpools._poolAuthSecret);
processBuilder.command().add("-v");
processBuilder.command().add(hostIp);
processBuilder.command().add("-r");
processBuilder.command().add("r");
Process process = null;
String parsedLine = "";
try {
process = processBuilder.start();
BufferedReader bfr = new BufferedReader(new InputStreamReader(process.getInputStream()));
parsedLine = bfr.readLine();
} catch (IOException e) {
e.printStackTrace();
}

s_logger.debug(String.format("Checking heart beat with KVMHAChecker [{command=\"%s\", log: \"%s\", pool: \"%s\"}].", processBuilder.command().toString(), parsedLine,
rbdpools._monHost));

if (process != null && parsedLine.contains("DEAD")) {
s_logger.warn(String.format("Checking heart beat with KVMHAChecker command [%s] returned. [%s]. It may cause a shutdown of host IP [%s].", processBuilder.command().toString(),
parsedLine, hostIp));
} else {
validResult = true;
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block could be extracted to a method, to better organize the code.

Comment on lines +75 to +99
ProcessBuilder processBuilder = new ProcessBuilder();
processBuilder.command().add("python3");
processBuilder.command().add(vmActivityCheckPath);
processBuilder.command().add("-i");
processBuilder.command().add(rbdStoragePool._poolSourceHost);
processBuilder.command().add("-p");
processBuilder.command().add(rbdStoragePool._poolMountSourcePath);
processBuilder.command().add("-n");
processBuilder.command().add(rbdStoragePool._poolAuthUserName);
processBuilder.command().add("-s");
processBuilder.command().add(rbdStoragePool._poolAuthSecret);
processBuilder.command().add("-v");
processBuilder.command().add(hostIp);
processBuilder.command().add("-u");
processBuilder.command().add(volumeUuidList);
command = processBuilder.command().toString();
Process process = null;

try {
process = processBuilder.start();
BufferedReader bfr = new BufferedReader(new InputStreamReader(process.getInputStream()));
parsedLine = bfr.readLine();
} catch (IOException e) {
e.printStackTrace();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about using com.cloud.utils.script.Script.

@@ -43,9 +44,16 @@ public Answer execute(final CheckVMActivityOnStoragePoolCommand command, final L
final ExecutorService executors = Executors.newSingleThreadExecutor();
final KVMHAMonitor monitor = libvirtComputingResource.getMonitor();
final StorageFilerTO pool = command.getPool();
if (Storage.StoragePoolType.NetworkFilesystem == pool.getType()){
if (Storage.StoragePoolType.NetworkFilesystem == pool.getType() || Storage.StoragePoolType.RBD == pool.getType()){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If created, the method supportsHa could be used here too.

final OutOfBandManagementResponse resp = outOfBandManagementService.executePowerOperation(r, PowerOperation.RESET, null);
return resp.getSuccess();
final OutOfBandManagement oobm = outOfBandManagementDao.findByHost(r.getId());
if(oobm.getPowerState() == PowerState.Off){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if(oobm.getPowerState() == PowerState.Off){
if (oobm.getPowerState() == PowerState.Off) {

if(oobm.getPowerState() == PowerState.Off){
LOG.warn("OOBM recover operation failed for the host " + r.getName() + " already OFF");
return false;
}else{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}else{
} else {

final OutOfBandManagementResponse resp = outOfBandManagementService.executePowerOperation(r, PowerOperation.OFF, null);
return resp.getSuccess();
final OutOfBandManagement oobm = outOfBandManagementDao.findByHost(r.getId());
if (oobm.getPowerState() == PowerState.Unknown){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (oobm.getPowerState() == PowerState.Unknown){
if (oobm.getPowerState() == PowerState.Unknown) {

@github-actions
Copy link

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@DaanHoogland DaanHoogland added this to the 4.19.0.0 milestone Jun 22, 2023
@shwstppr
Copy link
Contributor

shwstppr commented Oct 3, 2023

@dhslove can you please address the merge conflicts? Should we consider this for 4.19.0.0?

@weizhouapache
Copy link
Member

moved to 4.21.0.0 milestone

@DaanHoogland DaanHoogland linked an issue Sep 25, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Status] KVM Host-HA using CEPH RBD