-
Notifications
You must be signed in to change notification settings - Fork 566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/snap-confine: release: Basic Ubuntu Touch & Halium enablement #13661
base: master
Are you sure you want to change the base?
Conversation
Everyone contributing to this PR have now signed the CLA. Thanks! |
459cc75
to
118a947
Compare
release/release.go
Outdated
// OnTouch states whether the process is running inside a Touch image | ||
// with a classic (but mostly read-only) filesystem layout. | ||
// It is a narrowed down variant of OnClassic | ||
var OnTouch bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd split the part specific to Touch into a separate PR. There may be questions about how we want to approach this specific variant that are unrelated to supporting libhybris
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbf I've waited pretty long on the previous PR that has been sitting for a while, which used to be the libhybris-only PR. But if more people want to have it split up, then sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would split it. The s-c bits should be fairly easy to land, but folks may be asking questions about this one and there's no point in holding back the rest of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Please have a look at the comments.
static void sc_hybris_mount_android_rootfs(const char *rootfs_dir) | ||
{ | ||
// Bind mount a tmpfs on $rootfs_dir/$tgt_dir (i.e. /var/lib/snapd/lib/gl) | ||
char rfsbuf[512] = { 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can reuse the buffer since snprintf always adds terminating \0 and simply interleave snprintf/fs operation sequence
Also, please use PATH_MAX where appropriate:
char rfsbuf[512] = { 0 }; | |
char rfsbuf[PATH_MAX] = { 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
// | ||
// The glob list passed to us is produced with paths relative to source dir, | ||
// to simplify the various tie-in points with this function. | ||
static void sc_hybris_populate_libgl_with_hostfs_symlinks(const char *libgl_dir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a copy of sc_populate_libgl_with_hostfs_symlinks, could we make the former non-static and share the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care of.
} | ||
} | ||
|
||
static void sc_hybris_mkdir_and_mount_and_glob_files(const char *rootfs_dir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, can we share sc_mkdir_and_mount_and_glob_files from nvidia support code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, noted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cmd/snap-confine/snap-confine.c
Outdated
setenv("HYBRIS_LINKER_DIR", "/var/lib/snapd/lib/gl/libhybris/linker", 1); | ||
setenv("HYBRIS_EGLPLATFORM_DIR", "/var/lib/snapd/lib/gl/libhybris", 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels misplaced, neither snap-confine nor any part of the snap execution chain delivered by snapd sets any of the environment variables related to mesa/vulkan/egl. Most likely this should be part of to https://github.com/snapcore/snapcraft-desktop-integration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO having snap-confine set up the environment by itself allows for:
- Expected environment for any Snap
- Less ways of the Snap screwing up something in the setup (because it misses -desktop-integration or does things itself)
- Have existing Snaps working as-is.
I cannot expect upstream app developers to care or have the knowledge to setup libhybris appropriately, which is why I put it there.
EDIT: Also I would argue that the libhybris enablement is for more than just GL drivers, it actually allows any appropriately wrapped bionic-library to function inside of a Snap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care of.
Since security is supposed to be involved in the discussion, here is a little hint about something that might come up: For Halium/Android 9-only devices we had to ship a TLS-padding hack in the form of a shared library that needs to be LD_PRELOADed into the process, otherwise bionic libraries and glibc libraries would have a mismatch in TLS sizes. The do-nothing library resides here: https://gitlab.com/ubports/development/core/hybris-support/tls-padding/-/blob/main/tls-padding.cpp?ref_type=heads Ubuntu Touch sets the LD_PRELOAD environment variable to include this hack into every user-app process, so ideally we can keep it to keep having 9 devices chugging along. It explains the necessity to include the tls-padding .so into the AppArmor profile so that it can be mapped into the process image. |
d41963d
to
ba1b38d
Compare
Thanks for the additional comments, will take care of them. I just noticed I wasn't specific enough on the reasoning regarding commit d779cae where I stated "on a separate filesystem". This means udev isn't able to manage it using devtmpfs and udevadm being unable to query properties of those binderfs endpoints. EDIT: CLA signed with the email address in my commits. |
d779cae
to
4004cfb
Compare
I will happily review this after 2.62 is released. Apologies for the delay, we're really busy trying to get the release in good shape. |
interfaces/builtin/opengl.go
Outdated
/{,var/}run/shm/hybris_shm_data rw, # FIXME: LP: #1226569 (make app-specific) | ||
# This is a LD_PRELOADed TLS padding library allowing use of both | ||
# Android 9 and glibc TLS slots at the same time. Only used on Halium 9. | ||
/usr/lib/@{multiarch}/libtls-padding.so mr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the library resolved at runtime, the directory will be from the perspective of a pivot_root base snap which (I suspect?) does not carry it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This particular library resides on the host, which then gets picked up by snap-confine AFAICT.
In general thread-locals in a shared library are taken care of after initialization by ld
, before the actual executable is run. Think calling constructors of a static variable, it's similar in style to how ld
treats thread-locals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for the record, host files do not appear at that location. What you see here comes from the base snap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised then that snap-confine correctly stops complaining about not being able to map that library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're setting LD_PRELOAD in your environment, than the denial from s-c should be raised regardless of this interface, as s-c is running under a different profile. Now, setting LD_PRELOAD in the environment will not have an effect of the snap as IIRC LD_PRELOAD isn't carried over to the snap's environment. Now, if OTOH is being set within the snap, then yes, if you allow it explicitly in the file like here, the denial will go away even if the file doesn't exist. Can you paste the denial you saw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After giving it some time, I've come to the conclusion that we might still want to pass through the preload hack to the confined environment in case it's running on Touch specifically. Would that be feasible from your perspective?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ld.so will strip out LD_PRELOAD before it switches execution to snap-confine, i.e. snap-confine will not even see that environment variable, so there's no chance of passing it to the snap. Perhaps you can load it from /etc/environment when executing in the snap, but it would not be automatic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more thinking of measuring whether it's running on Touch via other means (checking for existance of a Halium-only file) and recreating the environment variable. It might sound hacky at first but there really seems to be no easy other way.
EDIT: I suppose that needs to happen in the desktop-helpers then? In that case I would reduce this MR by just allowing mapping the file from the hostfs and do the rest in the launchers. Or how about a SNAP_LD_PRELOAD
which allows snapd to dictate the environment variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume you've ran some apps with this change built locally. Were they able to start and function properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I couldn't as my assumption about Halium 9 vs 10 was wrong. I thought one of my test devices was susceptible to the issue, whereas it was just Halium 10 MediaTek devices specifically (which seem to have TLS slots hardcoded in their bionic-based GL libs).
interfaces/builtin/opengl.go
Outdated
/{,android/}apex/com.android.i18n/lib{,64}/**.so m, | ||
/{,dev/}socket/property_service rw, # attach_disconnected path | ||
/{,dev/}socket/logdw rw, # attach_disconnected path | ||
/{,dev/}__properties__/** rw, # attach_disconnected path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, where is properties coming from? Is this some Android-specific interface? What's inside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __properties__
part contains SELinux info of process labels allowed to access Android properties through getprop
and setprop
. It's more of a void on UT though since functionality tied to SELinux is stubbed out to not conflict with AppArmor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need rw
access or just read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially only needs read-access, since that is set up in the Halium container. But I'll have to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
Ping! I've noticed 2.62 landing, and I don't want to be left behind. :) |
a95e4ba
to
d866b01
Compare
For the record and completeness sake, here is the list of devices tested with these changes:
|
Sooo... anything new? |
@alexmurray can you have a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than the LD_PRELOAD hack which from the existing comments seems unnecessary / unworkable via snap-confine and should be removed.
sc_identity old = sc_set_effective_identity(sc_root_group_identity()); | ||
int res = mkdir(android_rootfs_dir, 0755); | ||
if (res != 0 && errno != EEXIST) { | ||
die("cannot create tmpfs target %s", android_rootfs_dir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: this does't appear to be a tmpfs so this error message is misleading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled in d5a85d6
@@ -42,6 +42,8 @@ | |||
/{,usr/}lib{,32,64,x32}/{,@{multiarch}/}libudev.so* mr, | |||
/{,usr/}lib{,32,64,x32}/{,@{multiarch}/}libseccomp.so* mr, | |||
/{,usr/}lib{,32,64,x32}/{,@{multiarch}/}libcap.so* mr, | |||
# TLS padding hack for Android 9 | |||
/{,usr/}lib{,32,64,x32}/{,@{multiarch}/}libtls-padding.so mr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this should be removed given the prior discussions that it appears unneeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not needed for most devices, but rather for a specific series of devices from a specific Android generation: Android 10 Mediatek, which I don't own, so I cannot figure out a solution by myself.
I am fine with the other changes landing but realistically we will need a solution for Mediatek devices with Android 10 too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Handled in d5a85d6
cmd/snap-confine/udev-support.c
Outdated
static void sc_udev_allow_hybris(sc_device_cgroup *cgroup) | ||
{ | ||
struct stat sbuf; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: would this be easier to read if it was written as something like the following (note untested):
const char *paths[] = {"/dev/binderfs/binder", "/dev/binderfs/hwbinder", "/dev/binder", "/dev/hwbinder"};
for (int i = 0; i < sizeof(paths)/sizeof(paths[0]); i++) {
struct stat sbuf;
if (stat(paths[i], &sbuf) == 0) {
sc_device_cgroup_allow(cgroup, S_IFCHR, major(sbuf.st_rdev),
minor(sbuf.st_rdev));
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, based it on your template. Handled in d5a85d6
interfaces/builtin/opengl.go
Outdated
/{,var/}run/shm/hybris_shm_data rw, # FIXME: LP: #1226569 (make app-specific) | ||
# This is a LD_PRELOADed TLS padding library allowing use of both | ||
# Android 9 and glibc TLS slots at the same time. Only used on Halium 9. | ||
/var/lib/snapd/hostfs/usr/lib/@{multiarch}/libtls-padding.so mr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this should be removed since it appears unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check and address my previous comment above, we still need a solution for Mediatek Android 10 devices and I doubt waiting for much longer is a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, short-term I am fine without supporting MTK-10 devices for now, but I would really like to have this handled later.
Handled in d5a85d6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look from top to bottom. Some comments.
I would like to land the move to nvidia/mount logic separately to make the diff shorter. I can propose that separately.
} | ||
(void)sc_set_effective_identity(old); | ||
|
||
(void)mount(SC_HYBRIS_ROOTFS, android_rootfs_dir, NULL, MS_BIND | MS_REC | MS_RDONLY, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be checked and should s-c die if it does not work? If it is legitimately allowed to fail, as indicated by the (void)
up front, can you add a comment just above
to explain why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can take a deeper look later after $DAYJOB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled in d5a85d6
cmd/snap-confine/udev-support.c
Outdated
sc_device_cgroup_allow(cgroup, S_IFCHR, major(sbuf.st_rdev), | ||
minor(sbuf.st_rdev)); | ||
} | ||
if (stat("/dev/binder", &sbuf) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this runs everywhere, what happens if someone installs, say, anbox and gets /dev/binder. Now all snaps will allow access to the binder. I don't have an immediate idea on how to make this better without some host-level hint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host level hint can again be a check for existance of /system/build.prop, which under Halium distributions always has to exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled in d5a85d6
snap-confine: Additional AppArmor rules for proper libhybris functionality in snap-confine interfaces: Add libhybris AppArmor paths & udev rules to opengl Tested working on the Pixel 3a (Android 9, Qualcomm). snap-confine: Set up hybris environment variables if appropriate Hybris needs the linker and EGL platform paths properly set up to allow running graphical applications using Android drivers.
Required to grant access to the respective devices in /dev.
…t for strict libhybris can accept environment variables allowing relocating the library loader and having full functionality by providing the userspace drivers in /var/lib/snapd/lib/gl. libhybris and all Halium/Android driver bits need to be set up for strict only, classic can continue consuming drivers from the standard distribution paths.
- Use PATH_MAX - Commonalize shared functions between NVIDIA & libhybris - Unify and use SC_HYBRIS_PROPERTY_FILE appropriately
…pu model Newer Qualcomm devices (like the Fairphone 4) seem to have their userspace GL libraries query the kernel-mode driver for the respective GPU model.
binder and hwbinder are devices residing in their own filesystem. In order to allow access to them they need to be unconditionally allowed in the cgroup for libhybris drivers to work accordingly. Note that binderfs is checked before binder in order to distinguish between an Android 11+ and lower kernels since newer ones place a compatibility symlink at the old and familiar path.
Do the environment variable setup for libhybris inside of snap-exec.
As required by some devices, this allows loading i18n APEX into the process in order to allow certain bionic libraries to work properly. Additionally explain the use and necessity for libtls-padding inclusion.
…ead-only The property area is set up by Halium, confined snaps might merely read it.
The confined snap can only load libs from directories prefixed with /var/lib/snapd/hostfs, so ensure libtls-padding that resides there is mappable into the process.
cff6dec
to
c4c706e
Compare
- Cleanup and remove TLS-padding permit from AppArmor rules - Only allow binder access on Halium/libhybris systems - Die in case the Halium environment cannot be mounted into the target - Restructure binder device mount operation to iterate through a list of allowed device paths
Everyone contributing to this PR have now signed the CLA. Thanks! |
c4c706e
to
d5a85d6
Compare
I've addressed all mentioned requested changes, but I came up with another one which I'll happily add in case I get a "makes sense" from you: How about only adding those libhybris-related AppArmor changes to the template in case it detects Alternatively, I could also reintegrate this one here in order to have a more narrowed idea of "Am I running on an Ubuntu Touch with libhybris device?" in addition to the property file existing: https://github.com/snapcore/snapd/pull/13674/files#diff-d04e67128a47f4820ffbef091b0bdc701ce59b4de9671b109405f19e91eb629dR193 |
Set the environment variables required to handle libhybris in a confined environment, with libhybris bionic linker and EGL_PLATFORM implementations available through environment variables. Defaults to wayland for libhybris' libEGL. libgbm support should stay untouched with these variables setting up the linker paths). Tested on Halium-enabled Pixel 3a using Halium (Ubuntu Touch 20.04). Tested Snaps: - Tide IDE (classic Snap running with libhybris on the host) - Gnome Mahjongg (with GDK_DEBUG=gl-disable) - QML Creator (with Wayland QPA) - Yamagi Quake 2 (with SDL & GLES Renderer)
Telegram Desktop works better with QT_QPA_PLATFORM=xcb, which allows dropping baggage of HYBRIS_EGLPLATFORM from the environment. Tested on Halium-enabled Pixel 3a using Halium (Ubuntu Touch 20.04). Tested Snaps: - Telegram Desktop (with QT_QPA_PLATFORM=xcb and EGL_PLATFORM=null or empty) - Tide IDE (classic Snap running with libhybris on the host) - Gnome Mahjongg (with GDK_DEBUG=gl-disable) - QML Creator (with Wayland QPA) - Yamagi Quake 2 (with SDL & GLES Renderer)
As detected by static-checks.
These changes allow for preliminary support of Halium/libhybris drivers and enabling some more sophisticated Snaps to work on Ubuntu Touch (like CUPS).
For more information about Halium: https://halium.org
Largely based on the NVIDIA support and provides a working environment for confined Qt5, Qt6, SDL, GTK and Flutter Snaps (as long as a GLES renderer is available) like QML Creator (
sudo snap install qmlcreator
) and running it with thewayland-egl
platform.