LVMLOCK\SanLock + Kernel Panic #156

Closed
dylanetaft opened this issue Sep 11, 2024 · 2 comments

Comments

@dylanetaft

So...I found a way to trigger a kernel panic. It wasn't intentional.

The process to reproduce is as follows.
Set up two iSCSI targets with LIO\targetcli on remote servers, with 4 LUNs in total: two for a volume group that holds the sanlock global lock, and two for a VG that holds other things (in my case, KVM VMs).

I was thinking of mirroring what IBM has done with Power systems and Virtual I/O Servers: taking two iSCSI targets from two remote systems and using LVM to mirror them into a single filesystem.

Upon reviewing lvmlockd's man page and ultimately some of the code, I found that sanlock and lvmlock will only use the first PV to store the global lock. It creates a hidden LV that cannot be mirrored.

So I used mdadm to RAID-1 the two iSCSI LUNs together and then put my global lock VG on top of that.

I can reboot either remote iSCSI target server and I do not lose the global lock; it appears stable and recoverable.

It is not clear if exclusive locks on LVs work the same way, so for that I simply mirrored my other two LUNs together, did vgchange --lockstart on the VG, and vgchange -ae on the LV.
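For readers who want to retrace the layout described up to this point, here is a rough dry-run sketch that only prints the commands and executes nothing. It is one plausible reading of the steps, not the exact commands used: the device paths, the names `vg_lock`, `vg_vms` and `vm1`, the 20G size, the `lvcreate --type raid1` mirror, and the use of `lvchange -aey` for exclusive activation are all assumptions made for illustration.

```python
import shlex

# Hypothetical device names; substitute the real iSCSI LUNs.
LOCK_LUNS = ["/dev/sdb", "/dev/sdc"]
DATA_LUNS = ["/dev/sdd", "/dev/sde"]


def reproduction_plan(lock_luns, data_luns):
    """Return the commands as argument lists. Nothing is executed."""
    return [
        # md RAID-1 underneath the VG that holds the sanlock global lock
        ["mdadm", "--create", "/dev/md0", "--level=1",
         "--raid-devices=2", *lock_luns],
        ["vgcreate", "--shared", "vg_lock", "/dev/md0"],
        # second shared VG placed directly on the two remaining LUNs
        ["vgcreate", "--shared", "vg_vms", *data_luns],
        ["vgchange", "--lockstart", "vg_vms"],
        # LVM-level mirror across both PVs (assumed; the post gives no command)
        ["lvcreate", "--type", "raid1", "-m", "1", "-L", "20G",
         "-n", "vm1", "vg_vms"],
        # exclusive activation of the mirrored LV (assumed spelling)
        ["lvchange", "-aey", "vg_vms/vm1"],
    ]


for argv in reproduction_plan(LOCK_LUNS, DATA_LUNS):
    print(shlex.join(argv))
```

Printing the plan instead of running it keeps the sketch safe to try on any machine; the commands themselves need root and real shared devices.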

I booted up some VMs in KVM on the host.

Then I proceeded to reboot one of the iSCSI target servers.

As before, I did not lose the global lock; mdadm protected it.

The system kernel panicked while I was running some vgdisplay commands. Maybe shared locks on LVs work in a similar way to the global lock?

So maybe this is more of a documentation issue: the lvmlockd man page should probably state that, for both the global lock and the locks on other LVs, mirroring just doesn't work.

Everything seems fine if I put mdadm underneath both volume groups. It absorbs the loss of a disk fine: no locks get lost, no VMs go down, and there is no kernel panic.

I get that sanlock is probably for a true SAN, which has proper MPIO and dual storage controllers.

Someone could be tempted, however, to chain this stuff together in production as a kind of software-defined storage solution, with newer devices like NVMe drives, which don't work like trays of disks on dual SAS controllers. And it APPEARS to work until you really start testing failure situations. It DOES work if you put mdadm under the whole thing so lvmlock never sees PVs disappearing.

Are kernel panics and subsequent system reboots a bug? Or is this a documentation issue, i.e. sanlock and lvmlock are for a true SAN, so don't try to LVM-mirror devices and expect redundancy for locks or system stability?

@teigland
Contributor

The sanlock disk paxos algorithm requires all machines to be reading/writing the same disk blocks at once. If those are mirrored underneath, we can't guarantee that machines all see the same thing when racing to do i/o on a single sector. If you lose the disk holding the hidden lvmlock LV (holding the sanlock leases), then you need to take the VG offline, rebuild the lvmlock leases, and bring the VG back online (details in lvmlockd man page under "Recover from lost PV holding sanlock locks".) With this approach, you can still add multiple PVs to the VG and create raid LVs for normal use.

The sanlock man page briefly mentions the issue of host based mirroring: "Using sanlock on shared block devices that do host based mirroring or replication is not likely to work correctly."
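To make the mirroring problem concrete, here is a toy model of two hosts racing to write the same sector. It is only an illustration, not sanlock's actual disk paxos code, and it assumes that with host-based mirroring each host writes every mirror leg itself and that those writes can land in any order. On a single shared disk the last writer wins and every reader sees the same owner; with two legs, some orderings leave the legs holding different data.

```python
from itertools import permutations


def final_contents(order, legs):
    """Replay (host, leg) writes in order; return what each leg ends up holding."""
    disk = {leg: None for leg in legs}
    for host, leg in order:
        disk[leg] = host  # the last write to a leg wins
    return disk


def divergent_outcomes(hosts, legs):
    """List the write orderings that leave the legs holding different data."""
    writes = [(host, leg) for host in hosts for leg in legs]
    results = []
    for order in permutations(writes):
        disk = final_contents(order, legs)
        if len(set(disk.values())) > 1:
            results.append((order, disk))
    return results


hosts = ["host1", "host2"]
single = divergent_outcomes(hosts, ["shared_disk"])
mirrored = divergent_outcomes(hosts, ["leg_a", "leg_b"])
print(f"single shared disk: {len(single)} divergent orderings")
print(f"host-based mirror:  {len(mirrored)} divergent orderings")
```

In the mirrored case, a host that happens to read `leg_a` and a host that reads `leg_b` can each believe a different owner won the race, which is the situation a lease on shared storage cannot tolerate.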

As for md raid, that does not work correctly from multiple hosts concurrently, apart from md-cluster which uses the dlm. So, an alternative approach to using shared VGs is to set up a corosync and dlm cluster.

@dylanetaft
Author

That'll work. I saw a few other folks try to do this online so at least there's a good answer. Thanks for your time!
