Document limitations when using libdevmapper in a multi-threaded program #69

Open
DemiMarie opened this issue Feb 6, 2022 · 7 comments

@DemiMarie

Many modern programs and programming environments are multi-threaded, but libdevmapper is not thread-safe. Would it be possible to document the rules that users must follow when using libdevmapper in a multi-threaded program?

@zkabelac

zkabelac commented Feb 6, 2022

Might be disappointing, but we still would prefer libdm users stay with non-threaded usage.

That is, questions about failures caused by threaded usage are something we do not want to receive or solve at all.
It is very easy to lose big portions of users' data through a single badly scheduled 'ioctl()' on the wrong device. So even lvm2 does not use threaded access for anything other than parallel 'dmsetup status'.

We could 'list' the set of libdm functions that are thread-safe, but those are typically functions for e.g. dm_list/hash manipulation. But functions prefixed with dm_tree/dm_task usually rely on some internal variables and shall not be used in a threaded environment.

So far I'm not convinced that manipulating the 'DM' table is a time-critical operation that needs to be 'hyper-optimized'. Users of libdm should stay with a serialized list of operations, not least because the last thing you want to end up with is hundreds of 'parallel' operations resulting in many corrupted devices at once, while in the serialized case the first error typically stops all the other errors from happening...
Devices are more delicate and tend to have various kinds of errors compared with e.g. a threaded alpha-sort...

Surely the user may parallelize work on independent sets of devices using the 'linear/striped' target types, but even in this very specific use case the actual benefit of using threads, compared with doing the work in serial order, is pretty minimal...

@DemiMarie
Author

> Might be disappointing, but we still would prefer libdm users stay with non-threaded usage.

There are many programming environments (Go, glib, Java, Haskell, many Rust programs) where this isn’t an option. One can put a global lock around calls to libdevmapper, or even ensure that all libdevmapper usage happens on a single thread, but one cannot ensure no other threads are running.

> That is, questions about failures caused by threaded usage are something we do not want to receive or solve at all. It is very easy to lose big portions of users' data through a single badly scheduled 'ioctl()' on the wrong device. So even lvm2 does not use threaded access for anything other than parallel 'dmsetup status'.

> We could 'list' the set of libdm functions that are thread-safe, but those are typically functions for e.g. dm_list/hash manipulation.

That would still be useful.

> But functions prefixed with dm_tree/dm_task usually rely on some internal variables and shall not be used in a threaded environment.

Is it safe for other threads to be running, so long as they do not call into libdm?

> So far I'm not convinced that manipulating the 'DM' table is a time-critical operation that needs to be 'hyper-optimized'. Users of libdm should stay with a serialized list of operations, not least because the last thing you want to end up with is hundreds of 'parallel' operations resulting in many corrupted devices at once, while in the serialized case the first error typically stops all the other errors from happening... Devices are more delicate and tend to have various kinds of errors compared with e.g. a threaded alpha-sort...

My motivation is not to perform device-mapper operations in parallel, but rather to allow device-mapper to be used from a program environment where the presence of other threads is unavoidable. If it is genuinely necessary for no other threads to be running at all, the only safe way to use libdevmapper in these environments is via a dedicated subprocess.

@zkabelac

zkabelac commented Feb 7, 2022

> > Might be disappointing, but we still would prefer libdm users stay with non-threaded usage.

> There are many programming environments (Go, glib, Java, Haskell, many Rust programs) where this isn’t an option. One can put a global lock around calls to libdevmapper, or even ensure that all libdevmapper usage happens on a single thread, but one cannot ensure no other threads are running.

That's precisely the reason why there is no API library - these languages should exec the lvm command with the right set of arguments - we are simply not able to maintain a better API ATM... (and there is some toy D-Bus API if that helps)
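
For readers following along, the "exec the lvm command" approach can be sketched roughly as below. This is only an illustration, not something posted in the thread: `run_command` is a hypothetical helper name, and the argument vector (for example an `lvcreate` invocation) is left entirely to the caller. It uses `posix_spawnp` rather than a bare `fork`, on the assumption that the calling program is multi-threaded.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up in PATH) with the given
 * NULL-terminated argument vector and wait for it to finish.
 * Returns the child's exit status (0-255), or -1 if the child could not
 * be started or did not exit normally (e.g. was killed by a signal). */
static int run_command(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp returns an error number directly (not via errno). */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    /* Retry waitpid if it is interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because all device-mapper work then happens in the child process, the threading state of the parent no longer matters; the cost is one process spawn per operation.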

> > That is, questions about failures caused by threaded usage are something we do not want to receive or solve at all. It is very easy to lose big portions of users' data through a single badly scheduled 'ioctl()' on the wrong device. So even lvm2 does not use threaded access for anything other than parallel 'dmsetup status'.
> > We could 'list' the set of libdm functions that are thread-safe, but those are typically functions for e.g. dm_list/hash manipulation.

> That would still be useful.

Those auxiliary functions are typically built into other projects as well, so there is no real reason to use them through libdm in any multithreaded program :) unless, of course, you like calling functions with a 'dm_' prefix ;).

> > But functions prefixed with dm_tree/dm_task usually rely on some internal variables and shall not be used in a threaded environment.

> Is it safe for other threads to be running, so long as they do not call into libdm?

As long as you have one dedicated thread calling the 'dm_tree/task' functions, you are fine to use libdm within a multi-threaded application.

> My motivation is not to perform device-mapper operations in parallel, but rather to allow device-mapper to be used from a program environment where the presence of other threads is unavoidable. If it is genuinely necessary for no other threads to be running at all, the only safe way to use libdevmapper in these environments is via a dedicated subprocess.

That is possible (as with almost any other non-thread-safe code): simply add global locking around the function calls. libdm is no exception to this common practice...
Typically, you dedicate one thread to libdm manipulation and have the other threads prepare jobs for it (just as e.g. dmeventd does).
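
A minimal sketch of the "one dedicated thread, other threads prepare jobs" pattern, using plain pthreads. Everything here is an assumption made for illustration: the `dmq_*` names are hypothetical and not part of libdm, this is not how dmeventd is actually implemented, and `example_job` is a placeholder standing in for a function that would make the real dm_task/dm_tree calls.

```c
#include <pthread.h>
#include <stddef.h>

/* One unit of work to be run on the dedicated thread. */
struct dmq_job {
    int (*fn)(void *arg);   /* would wrap the actual libdm calls */
    void *arg;
    int result;
    int done;
    struct dmq_job *next;
};

static pthread_mutex_t dmq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t dmq_cond = PTHREAD_COND_INITIALIZER;
static struct dmq_job *dmq_head, *dmq_tail;
static int dmq_stopping;

/* Body of the single thread that is allowed to call into libdm. */
static void *dmq_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&dmq_lock);
    for (;;) {
        while (!dmq_head && !dmq_stopping)
            pthread_cond_wait(&dmq_cond, &dmq_lock);
        if (!dmq_head)
            break;                       /* stop requested, queue drained */

        struct dmq_job *job = dmq_head;  /* pop the oldest job */
        dmq_head = job->next;
        if (!dmq_head)
            dmq_tail = NULL;
        pthread_mutex_unlock(&dmq_lock);

        int r = job->fn(job->arg);       /* run without holding the queue lock */

        pthread_mutex_lock(&dmq_lock);
        job->result = r;
        job->done = 1;                   /* job is not touched after this */
        pthread_cond_broadcast(&dmq_cond);
    }
    pthread_mutex_unlock(&dmq_lock);
    return NULL;
}

/* Callable from any thread: queue fn(arg), block until the worker ran it. */
static int dmq_submit(int (*fn)(void *), void *arg)
{
    struct dmq_job job = { fn, arg, 0, 0, NULL };

    pthread_mutex_lock(&dmq_lock);
    if (dmq_tail)
        dmq_tail->next = &job;
    else
        dmq_head = &job;
    dmq_tail = &job;
    pthread_cond_broadcast(&dmq_cond);
    while (!job.done)
        pthread_cond_wait(&dmq_cond, &dmq_lock);
    pthread_mutex_unlock(&dmq_lock);
    return job.result;
}

/* Ask the worker to exit once the queue is empty. */
static void dmq_stop(void)
{
    pthread_mutex_lock(&dmq_lock);
    dmq_stopping = 1;
    pthread_cond_broadcast(&dmq_cond);
    pthread_mutex_unlock(&dmq_lock);
}

/* Placeholder job: a real one would perform libdm calls here. */
static int example_job(void *arg)
{
    return *(int *)arg + 1;
}
```

A caller would start `dmq_worker` once with `pthread_create`, route every libdm operation through `dmq_submit`, and call `dmq_stop` followed by `pthread_join` at shutdown. Jobs run strictly one at a time and in submission order, which also preserves the "first error stops the rest" behaviour if the submitter checks each result before queuing the next job.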

@DemiMarie
Author

> > > Might be disappointing, but we still would prefer libdm users stay with non-threaded usage.

> > There are many programming environments (Go, glib, Java, Haskell, many Rust programs) where this isn’t an option. One can put a global lock around calls to libdevmapper, or even ensure that all libdevmapper usage happens on a single thread, but one cannot ensure no other threads are running.

> That's precisely the reason why there is no API library - these languages should exec the lvm command with the right set of arguments - we are simply not able to maintain a better API ATM... (and there is some toy D-Bus API if that helps)

The tools I am thinking of are a bit less general than lvm. For instance, both Docker and a hypothetical Qubes volume manager use dedicated thin pools that are not used for their own code or for swap. Therefore, they do not need to worry about deadlocks if code or data needs to be paged in while a device is suspended.

> > > But functions prefixed with dm_tree/dm_task usually rely on some internal variables and shall not be used in a threaded environment.

> > Is it safe for other threads to be running, so long as they do not call into libdm?

> As long as you have one dedicated thread calling the 'dm_tree/task' functions, you are fine to use libdm within a multi-threaded application.

Would it be possible to officially document this? Something like, “libdm is not thread-safe. You can freely use libdm functions in a multi-threaded program, so long as you only call them from a single thread at a time. It is recommended, but not required, that you call libdm functions from a dedicated thread. dm_list and dm_hash functions can be called concurrently from multiple threads, provided no two threads try to modify the same object at the same time.”
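
The "only call them from a single thread at a time" rule in the proposed wording amounts to a process-wide lock around every libdm call. A minimal sketch, assuming pthreads: `with_libdm_lock` is a hypothetical wrapper name, and `shared_counter` and `bump_counter` are placeholders standing in for libdm's internal global state and for a function that would make real libdm calls.

```c
#include <pthread.h>

/* One process-wide lock guarding every call into the library. */
static pthread_mutex_t libdm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Run fn(arg) while holding the global lock. The lock is not recursive,
 * so fn must not call with_libdm_lock() again. */
static int with_libdm_lock(int (*fn)(void *), void *arg)
{
    pthread_mutex_lock(&libdm_lock);
    int r = fn(arg);
    pthread_mutex_unlock(&libdm_lock);
    return r;
}

/* Placeholder for unsynchronized global state inside the library. */
static long shared_counter;

/* Placeholder operation: a real one would call dm_task/dm_tree functions. */
static int bump_counter(void *arg)
{
    shared_counter += *(int *)arg;
    return 0;
}

/* Example caller: many threads may run this concurrently. */
static void *caller_thread(void *arg)
{
    for (int i = 0; i < 10000; i++)
        with_libdm_lock(bump_counter, arg);
    return NULL;
}
```

This only serializes callers that go through the wrapper; any code path that reaches the library without taking the lock defeats it, which is one reason a dedicated thread is easier to audit.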

FYI, if you ever get around to writing an (incompatible) libdm2, my suggestion (and it is only a suggestion!) is to replace the global variables with some sort of opaque dm_context structure. Using different dm_context * in different threads would be safe, even if not recommended. That would make the threading rule “you must not modify the same object in multiple threads at once”, which is a very common requirement that every programming environment knows how to deal with. If this is not feasible, an alternative would be for libdm2 to expose a global reentrant lock, which must be held when calling into libdm2.
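
To make the dm_context suggestion concrete, here is a rough sketch of what moving global state into a per-caller object could look like. It is purely hypothetical: no such API exists in libdm, the `dmctx_*` names and the two fields are invented for illustration, and in a real library the struct definition would live in the implementation file so that it stays opaque to callers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: holds state that a global-variable design would
 * keep in file-scope statics. Callers would only ever see a pointer. */
struct dmctx {
    char dev_dir[256];
    int log_level;
};

/* Allocate a context with default settings; returns NULL on failure. */
static struct dmctx *dmctx_create(void)
{
    struct dmctx *ctx = calloc(1, sizeof(*ctx));
    if (ctx)
        strcpy(ctx->dev_dir, "/dev/mapper");
    return ctx;
}

/* Change the device directory for this context only.
 * Returns 0 on success, -1 if the path does not fit. */
static int dmctx_set_dev_dir(struct dmctx *ctx, const char *dir)
{
    if (strlen(dir) >= sizeof(ctx->dev_dir))
        return -1;
    strcpy(ctx->dev_dir, dir);
    return 0;
}

static const char *dmctx_dev_dir(const struct dmctx *ctx)
{
    return ctx->dev_dir;
}

static void dmctx_destroy(struct dmctx *ctx)
{
    free(ctx);
}
```

With this shape, two threads that each own a separate context never touch the same memory, so the threading rule reduces to "do not use one context from two threads at once".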

@zkabelac

zkabelac commented Feb 8, 2022

> > Might be disappointing, but we still would prefer libdm users stay with non-threaded usage.

> There are many programming environments (Go, glib, Java, Haskell, many Rust programs) where this isn’t an option. One can put a global lock around calls to libdevmapper, or even ensure that all libdevmapper usage happens on a single thread, but one cannot ensure no other threads are running.

We still believe all these users should be using a higher-level API.
The libdm level requires a rather deep understanding (unless you only want to use 'linear' targets).

As soon as these 'short-cutting' APIs start to face real-world trouble with failing hardware and massive data loss, those developers are usually helpless, and it all ends with 'do you have a full backup?'...

> > That's precisely the reason why there is no API library - these languages should exec the lvm command with the right set of arguments - we are simply not able to maintain a better API ATM... (and there is some toy D-Bus API if that helps)

> The tools I am thinking of are a bit less general than lvm. For instance, both Docker and a hypothetical Qubes volume manager use dedicated thin pools that are not used for their own code or for swap. Therefore, they do not need to worry about deadlocks if code or data needs to be paged in while a device is suspended.

A design based on quick device creation and destruction is IMHO not right.
Device creation is not a 'free' operation - the real time spent in lvm2 code is actually not that big (just see the 'time' command).

Not that I want to discourage you from all this, but there are many things to solve if you want to stay on the 'safe side'. And sooner or later you will end up with logic similar to what lvm2 is using.

> > > Is it safe for other threads to be running, so long as they do not call into libdm?

> > As long as you have one dedicated thread calling the 'dm_tree/task' functions, you are fine to use libdm within a multi-threaded application.

> Would it be possible to officially document this? Something like, “libdm is not thread-safe. You can freely use libdm functions in a multi-threaded program, so long as you only call them from a single thread at a time. It is recommended, but not required, that you call libdm functions from a dedicated thread. dm_list and dm_hash functions can be called concurrently from multiple threads, provided no two threads try to modify the same object at the same time.”

Documenting this means encouraging users to use it :) and we will get reports from those users (and usually these reports hide the origins of the problems well).
Given the existing team size, it's better to persuade users to go with lvm2 ;)

> FYI, if you ever get around to writing an (incompatible) libdm2, my suggestion (and it is only a suggestion!) is to replace the global variables with some sort of opaque dm_context structure.

We surely know how to write libraries, but the library concept is not the way forward in this case.
Such a low-level library would still require a large amount of knowledge of the whole DM subsystem.

@DemiMarie
Author

> > > That's precisely the reason why there is no API library - these languages should exec the lvm command with the right set of arguments - we are simply not able to maintain a better API ATM... (and there is some toy D-Bus API if that helps)

> > The tools I am thinking of are a bit less general than lvm. For instance, both Docker and a hypothetical Qubes volume manager use dedicated thin pools that are not used for their own code or for swap. Therefore, they do not need to worry about deadlocks if code or data needs to be paged in while a device is suspended.

> A design based on quick device creation and destruction is IMHO not right. Device creation is not a 'free' operation - the real time spent in lvm2 code is actually not that big (just see the 'time' command).

Output of time for an lvcreate --type=thin command on my system:

0.10user 0.08system 0:00.24elapsed 76%CPU (0avgtext+0avgdata 37028maxresident)k
2048inputs+1828outputs (0major+6330minor)pagefaults 0swaps

For a snapshot:

0.09user 0.09system 0:00.26elapsed 72%CPU (0avgtext+0avgdata 37012maxresident)k
1536inputs+1572outputs (0major+6335minor)pagefaults 0swaps

In both cases, it looks like roughly a third of the time is spent in userspace, a third in the kernel, and a third waiting (presumably on udev).
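
As a rough check of that split, the three shares can be computed from the `time` figures above. This is a back-of-the-envelope sketch, not part of the original measurements: `split_elapsed` is a hypothetical helper, and it assumes the command is effectively single-threaded, so that whatever part of the elapsed time is neither user nor system CPU time counts as waiting.

```c
/* Percentage of wall-clock time spent in user code, in the kernel,
 * and neither (treated here as waiting). */
struct time_split {
    double user_pct;
    double sys_pct;
    double wait_pct;
};

/* user, sys and elapsed are in seconds, as printed by time(1). */
static struct time_split split_elapsed(double user, double sys, double elapsed)
{
    struct time_split s;

    s.user_pct = 100.0 * user / elapsed;
    s.sys_pct = 100.0 * sys / elapsed;
    s.wait_pct = 100.0 - s.user_pct - s.sys_pct;
    return s;
}
```

For the thin volume (0.10 user, 0.08 system, 0.24 elapsed) this gives about 42% user, 33% system and 25% waiting; for the snapshot (0.09, 0.09, 0.26) about 35%, 35% and 31%. The inputs are rounded to hundredths of a second, so these shares are only approximate.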

> > FYI, if you ever get around to writing an (incompatible) libdm2, my suggestion (and it is only a suggestion!) is to replace the global variables with some sort of opaque dm_context structure.

> We surely know how to write libraries, but the library concept is not the way forward in this case. Such a low-level library would still require a large amount of knowledge of the whole DM subsystem.

I know of at least one higher-level library implemented on top of libdevmapper: libcryptsetup. What should libcryptsetup be using?

> We still believe all these users should be using a higher-level API.

Does that include tools such as Stratis and libraries such as libcryptsetup?

@zkabelac

zkabelac commented Feb 8, 2022

> In both cases, it looks like roughly a third of the time is spent in userspace, a third in the kernel, and a third waiting (presumably on udev).

So far these timings look quite fine to me.
If you want faster times, prepare a filter/device list to minimize device scanning.
User-space code tends to be pretty minimal (unless you have thousands of volumes in your VG).
Check the 'lvcreate -vvvv' trace to see whether scanning is minimal.

The majority of the delay is synchronization time, where you have to wait for the disk to acknowledge before proceeding to the next step... If you 'skip' those steps you can make it faster, but not resistant to various failures.

> I know of at least one higher-level library implemented on top of libdevmapper: libcryptsetup. What should libcryptsetup be using?

libcryptsetup is in very close touch with libdm development :) so yes, they have the deep knowledge...

> > We still believe all these users should be using a higher-level API.

> Does that include tools such as Stratis and libraries such as libcryptsetup?

I'm not going to comment on Stratis.

2 participants