consolidated s390 device configuration #5250

Draft
wants to merge 4 commits into
base: master

steffen-maier
Contributor

@steffen-maier steffen-maier commented Oct 13, 2023

Consolidate the persistent and dynamic configuration of s390-specific devices in Linux distributions by delegating the configuration to the existing framework zdev from s390-tools.

This pull request completes consolidated s390 device configuration in anaconda.
This also fixes a newly discovered bug where the persistent device configuration is missing if a device is enabled outside of anaconda, e.g. using rd.zfcp from dracut.
Some of the commits [see their descriptions] depend on certain commits from:

This PR works together with storaged-project/blivet#1162.

Zdev's job is to perform low-level configuration, after which the user gets architecture-independent objects such as block devices, SCSI devices, or network interfaces. Those can and should in turn be configured with existing common code mechanisms. So there is a clearly separated layering of configuration duties.

In particular, the s390-specific devices currently are: DASD (traditional disk), ZFCP (scsi), and ZNET representing channel-attached network (QETH incl. OSA and HiperSockets, LCS, CTC). Zdev has a stable command line user interface and abstracts from sysfs and from a persistent configuration representation. Zdev encapsulates configuration details. Systems management code can simply delegate configuration to zdev and thus reduce architecture-specific code.
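
As an illustration of what delegating to zdev can look like from systems management code, here is a minimal sketch that only assembles a chzdev command line without running it. The helper name `zdev_enable_argv` and the example bus-IDs are hypothetical; `--enable`, `--persistent` and `--active` are chzdev options as I understand them from the s390-tools documentation, so please check chzdev(8) before relying on them.

```python
def zdev_enable_argv(device_type, device_id, persistent=True, active=True):
    """Build (but do not run) a chzdev command line that enables one device.

    device_type is a zdev device type such as "dasd", "zfcp-host" or "qeth";
    device_id is the device bus-ID, e.g. "0.0.a100".
    """
    argv = ["chzdev", "--enable"]
    if persistent:
        # Also record the device in the persistent configuration.
        argv.append("--persistent")
    if active:
        # Also apply the change to the running system.
        argv.append("--active")
    argv += [device_type, device_id]
    return argv


if __name__ == "__main__":
    print(" ".join(zdev_enable_argv("dasd", "0.0.a100")))
```

The point of the sketch is that the caller passes only a device type and a bus-ID; everything about sysfs attributes and the persistent representation stays inside zdev.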

This improves user experience, serviceability, maintainability, and reduces test effort.

@jstodola @poncovka @sharkcz

Even though this is a draft pull request, I would appreciate review comments.
It's only a draft until we have sorted out the dependencies and merge order of the pull requests for the different related projects. Some thoughts on the integration process are in #5250 (comment).

I developed and tested this locally with updates.img based on code (anaconda+blivet) branched closely enough for the updates to work with RHEL9.1 products.img (current when I started development).
This is a forward port, which applied pretty much automatically to the new base.

@pep8speaks

pep8speaks commented Oct 13, 2023

Hello @steffen-maier! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 38:1: E402 module level import not at top of file

Comment last updated at 2024-02-29 13:32:02 UTC

@github-actions github-actions bot added the f39 label Oct 13, 2023
@jkonecny12
Member

Hello @steffen-maier, you are targeting this change at Fedora 39. However, Fedora 39 is in final freeze, so we can't get it in unless there is a bug with an approved freeze exception. I suggest you switch the base branch to master instead and focus on Fedora 40.

Contributor

@VladimirSlavik VladimirSlavik left a comment

Thank you! I'll defer to @poncovka, @rvykydal and @vojtechtrefny for more details. I am not sure about the high level matters, such as if this should be all somewhere in blivet or not.

Apart from the failing tests, here's what I found...

pyanaconda/modules/storage/installation.py (outdated)
pyanaconda/modules/network/nm_client.py
pyanaconda/modules/storage/dasd/discover.py
Contributor

@jstodola jstodola left a comment

Since chzdev and /lib/s390-tools/zdev-to-rd.znet are used, anaconda.spec should be updated so that anaconda-core requires the s390utils-core package (assuming the zdev-to-rd.znet helper script will be placed there) and also the minimal version of s390utils-core should be specified once it is known.

@steffen-maier
Contributor Author

> Since chzdev and /lib/s390-tools/zdev-to-rd.znet are used, anaconda.spec should be updated so that anaconda-core requires the s390utils-core package (assuming the zdev-to-rd.znet helper script will be placed there) and also the minimal version of s390utils-core should be specified once it is known.

I have already prepared such a commit but will defer pushing it until I have looked at the unit test failures, so that I don't trigger the git workflows unnecessarily often.
Until then, I would like to understand the background. IIRC, the "test" mode of anaconda, where you could run it as a process in a regular running Linux instance (as opposed to the dedicated anaconda initrd&stage2 environment), was removed long ago. For such a use case, I would have immediately understood the reason for rpm package dependencies. Setting aside anaconda development using -devel sub packages, I wonder if any user would install an anaconda "binary" rpm and thus need the package dependencies.
Do we add the requirements because this is basic input for lorax (if that is still used)? AFAIK, lorax introduced the approach of packaging full rpm content into an installer image and then having lorax config prune certain files from the image, i.e. an exclude-list approach instead of the allow-list approach used before lorax (anaconda's former build scripts incl. scripts/mk-images used for composes?).

@steffen-maier
Contributor Author

> Hello @steffen-maier! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
>
> Line 38:1: E402 module level import not at top of file
>
> Comment last updated at 2023-10-18 17:30:34 UTC

Unfortunately, I'm not sure why that is. I fail to see how my code change introduced that. The only preceding non-import statement I can see is:

gi.require_version("NM", "1.0")

But that was already there before my changeset.
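
To illustrate why a checker reports E402 here, the following sketch flags module-level imports that come after any other statement. It is a simplified approximation written for this note, not pycodestyle's actual implementation (which has more exceptions), and the function name `late_imports` is hypothetical. The GI snippet is only parsed, never executed, so PyGObject is not needed to run it.

```python
import ast

# Only parsed, not executed: the usual PyGObject import pattern.
SOURCE = '''\
import gi
gi.require_version("NM", "1.0")
from gi.repository import NM
'''


def late_imports(source):
    """Return line numbers of module-level imports that follow other code."""
    seen_code = False
    flagged = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            if seen_code:
                flagged.append(node.lineno)
            continue
        is_docstring = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if not is_docstring:
            # Any other statement, such as the require_version() call.
            seen_code = True
    return flagged


if __name__ == "__main__":
    print(late_imports(SOURCE))
```

Because `gi.require_version()` has to run before `from gi.repository import ...`, the pattern cannot be reordered; the common way to silence the report is a `# noqa: E402` comment on the late import.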

@steffen-maier
Contributor Author

> Since chzdev and /lib/s390-tools/zdev-to-rd.znet are used, anaconda.spec should be updated so that anaconda-core requires the s390utils-core package (assuming the zdev-to-rd.znet helper script will be placed there) and also the minimal version of s390utils-core should be specified once it is known.

Except for the currently inevitable "FIXME", I hope this is what you were looking for: 48f6232

@VladimirSlavik
Contributor

> Line 38:1: E402 module level import not at top of file

Feel free to ignore these if not applicable. The GI imports always trip it up.

@VladimirSlavik
Contributor

VladimirSlavik commented Oct 19, 2023

> lorax

Yes, that is still used. However, for the long term we are trying to move towards having the dependencies in anaconda metapackages anaconda-install-{env,img}-deps and minimize the lorax part. If there is no technical reason to have the new dependency in lorax templates, such as removing large amounts of unused data from the installed package, then please add it as a dependency here.

By the way, I can't find any package with these tools?

@steffen-maier
Contributor Author

> By the way, I can't find any package with these tools?

You mean the helper scripts /lib/s390-tools/zdev-... being called by anaconda and blivet in this PR? If so:

> Some of the commits [see their descriptions] depend on certain commits from ibm-s390-linux/s390-tools#158.

As Jan stated earlier, they will end up in a future release of s390utils-core.

@VladimirSlavik
Contributor

Actually, I can't find even the package s390utils-core. Or is that supposed to not exist yet? The commits you reference don't change any spec file...

@steffen-maier
Contributor Author

> Actually, I can't find even the package s390utils-core. Or is that supposed to not exist yet? The commits you reference don't change any spec file...

Good point. I pointed to the upstream repo, which does not have any packaging definitions. Downstream packaging is done by https://src.fedoraproject.org/rpms/s390utils/blob/rawhide/f/s390utils.spec#_206, which already creates s390utils-core. Maybe I should add spec file changes for my upstream changes to my steffen-maier/s390utils#1, which currently only consists of pure downstream contributions.

@steffen-maier
Contributor Author

How can I easily (without containers or other non-trivial setup) run the unit tests locally? Pointers to doc might be sufficient.
I messed it up again and it took very long for the git workflow to run after I had pushed updates, so I'm looking for more efficient turnaround times.

@sharkcz
Contributor

sharkcz commented Oct 19, 2023

> Actually, I can't find even the package s390utils-core. Or is that supposed to not exist yet? The commits you reference don't change any spec file...

It is an s390x-specific package, not available on any other architecture (x86_64, ...).

@VladimirSlavik
Contributor

VladimirSlavik commented Oct 19, 2023

Some of the tests here run only after they're manually confirmed.

For tests, tl;dr is:

  • Generally, make -f Makefile.am container-ci should be sufficient locally - it should grab the missing container and run unit tests. I'd give it 90% confidence that it will "just work", unless you're not on x86_64.
  • If you're on another arch, ./autogen.sh && ./configure && make && make ci should give you the same.
  • RPM tests are mostly analogous to this.

That said - https://anaconda-installer.readthedocs.io/en/latest/testing.html

@VladimirSlavik
Contributor

As for storaged-project/blivet#1162 (comment), the docs have a section about patching the container used for unit tests, which might help.

@steffen-maier
Contributor Author

>   • Generally, make -f Makefile.am container-ci should be sufficient locally - it should grab the missing container and run unit tests. I'd give it 90% confidence that it will "just work", unless you're not on x86_64.

FYI: I had to manually load the ext4 kernel module on my local container host (sudo modprobe ext4). Otherwise, a number of additional unit tests failed that pass here in the git workflow, because ext4 was not available and the tests could not load the module themselves (non-root and inside a container?).
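
A quick pre-flight check for this situation could look like the sketch below. It assumes the usual Linux /proc/filesystems layout (one filesystem per line, optionally prefixed with "nodev"); the function names are hypothetical and not part of anaconda's test tooling.

```python
def filesystem_registered(name, proc_filesystems_text):
    """Return True if `name` is listed in /proc/filesystems-style text."""
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        # The filesystem name is the last field; "nodev" may precede it.
        if fields and fields[-1] == name:
            return True
    return False


def ext4_available(path="/proc/filesystems"):
    """Check the running kernel; returns False if the file is unreadable."""
    try:
        with open(path) as handle:
            return filesystem_registered("ext4", handle.read())
    except OSError:
        return False


if __name__ == "__main__":
    print("ext4 registered with the kernel:", ext4_available())
```

If this reports False on the container host, loading the module on the host before starting the container run avoids the extra failures described above.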

@steffen-maier
Contributor Author

I'm still working on the unit tests; I got them running locally but then got distracted. Maybe I will need help with the unit tests.

About the current re-run of the git workflow from https://github.com/rhinstaller/anaconda/actions/runs/6577747989/job/18665300996?pr=5250:
These are still caused by my code changes here:
=========================== short test summary info ============================
FAILED unit_tests/pyanaconda_tests/modules/network/test_module_network_nm_client.py::NMClientTestCase::test_get_dracut_arguments_from_connection
FAILED unit_tests/pyanaconda_tests/modules/storage/test_module_dasd.py::DASDTasksTestCase::test_discovery

The following ones are new, and I don't see how my unchanged code could have caused these. Maybe a new blivet version is being pulled in and the anaconda unit tests were not yet updated for those blivet changes unrelated to this PR?
FAILED unit_tests/pyanaconda_tests/modules/storage/partitioning/test_module_part_blivet.py::BlivetPartitioningInterfaceTestCase::test_request_handler
FAILED unit_tests/pyanaconda_tests/modules/storage/partitioning/test_module_part_blivet.py::BlivetPartitioningInterfaceTestCase::test_send_request
FAILED unit_tests/pyanaconda_tests/modules/storage/partitioning/test_module_part_blivet.py::BlivetPartitioningInterfaceTestCase::test_storage_handler

from blivet.tasks.fslabeling import Ext2FSLabeling, FATFSLabeling, JFSLabeling, ReiserFSLabeling, XFSLabeling, NTFSLabeling
E ImportError: cannot import name 'JFSLabeling' from 'blivet.tasks.fslabeling' (/usr/lib/python3.12/site-packages/blivet/tasks/fslabeling.py)

raise UnsupportedPartitioningError("Missing support for Blivet-GUI") from e
E pyanaconda.modules.common.errors.storage.UnsupportedPartitioningError: Missing support for Blivet-GUI

FAILED unit_tests/pyanaconda_tests/ui/test_simple_ui.py::SimpleUITestCase::test_gui

blivet-gui-partitioning vs. interactive-partitioning

====== 6 failed, 2031 passed, 1 xfailed, 21 warnings in 238.13s (0:03:58) ======

@steffen-maier
Contributor Author

Updated the commit descriptions with the actual upstream commits since ibm-s390-linux/s390-tools#158 got merged.
Clarified the explicit package dependency in the spec file to model the dependencies at a finer granularity: commit ("DASDDiscoverTask...") only depends on s390utils-core but not on a particular new version; only commit ("network...") depends on a (future) new version of s390utils-core (originated in #5250 (review)).
Added a similar version dependency on dracut-network in the anaconda spec file, because commit ("network: use consolidated s390 device configuration (#1802482,#1937049)") can only get the rd.znet boot persistent device config from initrd (and transfer it to the installed system so that'll boot with a network connection) with an updated dracut module "znet" from dracutdevs/dracut#2534.
Squashed the fixup commit moving the import of initrd zdev config from anaconda python code to a new anaconda systemd service (#5250 (comment)).
No code changes.

…802482,#1937049)

Implements the dasd part of referenced bugs.

This allows delegating s390 specifics to zdev from s390-tools and
subsequently removing s390-specific code here.

Additionally, the change makes it possible to subsequently rely solely on
the zdev persistent device configuration "database": create entries when
activating devices, and finally simply copy the persistent device
configuration to sysroot without having to deal with device properties.

The spec file update reflects the new dependency on `chzdev` from the
s390 architecture specific sub-package s390utils-core. Actually, this
commit here only depends on `chzdev` in older versions already packaged
and shipped, so no version comparison necessary here.

Regarding unit test:

As blockdev was already mocked it's not clear how the test would mock
that sanitize_dev_input would actually canonicalize a DASD device bus-ID.

Also, the argument to execWithRedirect() is much more involved than just
the device bus-ID previously. It does not make sense for the unit test
to hard code the full argument of how execWithRedirect() is currently
invoked in the code.

So just check that the new code calls execWithRedirect() exactly once and
ignore the arguments.

It would be a separate fix patch to improve a fake sanitize unit test. I
suppose it would need to start with a DASD device bus-ID that is not yet
canonical/sanitized, e.g. something along the lines of:

blockdev.s390.sanitize_dev_input.return_value = "0.0.A100"
DASDDiscoverTask("A100").run()
blockdev.s390.sanitize_dev_input.assert_called_once_with("A100")
sanitized_input = blockdev.s390.sanitize_dev_input.return_value
execWithRedirect.assert_called_once_with(...)

It's unclear how much value that would have, though.

Signed-off-by: Steffen Maier <[email protected]>
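
The test idea outlined in the commit message above can be made concrete roughly as follows. This is a self-contained sketch, not anaconda's test: `discover_dasd` is a hypothetical stand-in for DASDDiscoverTask.run(), the mocks replace blockdev and execWithRedirect, and the chzdev arguments are placeholders rather than the arguments the real code passes.

```python
from unittest.mock import MagicMock


def discover_dasd(device_number, blockdev, exec_with_redirect):
    """Hypothetical stand-in for DASDDiscoverTask.run()."""
    bus_id = blockdev.s390.sanitize_dev_input(device_number)
    # Placeholder arguments; the real invocation is more involved.
    exec_with_redirect("chzdev", ["--enable", "--persistent", "dasd", bus_id])


# Arrange: the mocked sanitizer must be primed before the task runs.
blockdev = MagicMock()
blockdev.s390.sanitize_dev_input.return_value = "0.0.A100"
exec_with_redirect = MagicMock(return_value=0)

# Act: start from a bus-ID that is not yet canonical.
discover_dasd("A100", blockdev, exec_with_redirect)

# Assert: the raw input was sanitized and the sanitized value was passed on.
blockdev.s390.sanitize_dev_input.assert_called_once_with("A100")
exec_with_redirect.assert_called_once()
command, arguments = exec_with_redirect.call_args.args
assert command == "chzdev"
assert "0.0.A100" in arguments
```

Checking only that the sanitized bus-ID appears somewhere in the argument list keeps the test independent of the exact chzdev options, which matches the concern about hard-coding the full argument.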
@steffen-maier
Contributor Author

Rebased to latest master and force pushed to resolve the merge conflict in pyanaconda/modules/storage/installation.py due to commit 9b82681 ("Add a simple NVMe module for NVMe Fabrics support").

Also fixed:

> Clarified the explicit package dependency in the spec file to model the dependencies at a finer granularity: commit ("DASDDiscoverTask...") only depends on s390utils-core but not on a particular new version; only commit ("network...") depends on a (future) new version of s390utils-core (originated in #5250 (review)).

The new s390utils-core version is already required one commit earlier by ("write persistent config..."), so I moved the version dependency there.

@steffen-maier
Contributor Author

> @steffen-maier is this update in Anaconda (and perhaps related python-blivet changes in storaged-project/blivet#1162) required to not block the whole effort (changes in the dracut PR dracutdevs/dracut#2534 and/or in steffen-maier/s390utils#1)

@rvykydal this anaconda PR and the connected storaged-project/blivet#1162 are vital core parts of the consolidation endeavor.
However, they do not block the dracut PR as that one is independent. Actually, this anaconda PR here has one commit that depends on the dracut PR (#5250 (comment)) .
As much as I would like to integrate anaconda&blivet soon, I also would like to avoid breaking installability of Fedora Rawhide s390x for an indeterminate amount of time (until the dracut changes hopefully land upstream and the Fedora dracut packaging rebases onto that).

> or is there a chance that the old approach will keep working for some time before applying the Anaconda (blivet) changes?

Yes, as long as we hold back steffen-maier/s390utils#1 the old mechanism is still there (@sharkcz please do not pull that one for the time being :-) ).
However, we should make sure that the s390utils downstream PR gets integrated tightly after anaconda&blivet(&dracut) to land together in a distro release. Otherwise, it could contradict the consolidation effort: Dracut(+s390-tools) currently have 3 different methods to configure s390 devices during early initrd, Fedora has its own custom downstream method to configure s390 devices after initrd switch root; if we add another new method without removing the old one it would seem the opposite of consolidation to me -- plus it might cause functional trouble due to concurrent double configuration of the same devices by different methods.

> I think the Anaconda and blivet PRs are ready and I'd prefer the switch to zdev solution, and I'll approve anaconda PR when we make sure test would pass and perhaps update the FIXME in the spec file,

Even though we would like to have all s390 device types use either the old mechanism or the zdev mechanism, I could factor the network commit that depends on the pending dracut PR into a new separate anaconda PR. This way, the existing anaconda&blivet PRs would no longer depend on dracut changes and we could integrate them to reduce backlog. However, I'm a bit nervous that the dracut changes could take longer and we could end up with a distro using a different old mechanism just for s390 network device configuration.
Any idea how to help speed the required dracut review? @dtardon was the only one providing review comments so far and they were good ones, thank you!

@rvykydal, please let me know if you would like to take the approach of factoring out the network part.

Note, however, that we still need a new version of s390utils(-core) including the content of ibm-s390-linux/s390-tools#158 as prereq for anaconda&blivet.
@sharkcz, that's 24 commits that are upstream, but there is no official upstream release on https://github.com/ibm-s390-linux/s390-tools/releases including them yet. Do you need such a release to bring the content into a new rawhide build of s390utils, or is there a different possibility (downstream patches for the time being)?

> but it is possible that we might face resource issues for applying the changes, so I'd like to clear up what would be the risks there.

It's not clear to me what efforts the integration would cause. I tried to provide a turn key solution. Maybe we can discuss and resolve offline how to minimize integration effort.

@dtardon

dtardon commented Jan 12, 2024

> Any idea how to help speed the required dracut review?

The changes look pretty good to me dracut-wise (just a few nitpicks so I can have a feeling I reviewed them). I don't know enough about the s390x tooling to evaluate their correctness, but I assume you know what you're doing. But I'm just a contributor to dracut; I don't have the rights to approve PRs. It might speed things up if someone like @sharkcz gave it a second look and approved it.

@sharkcz
Contributor

sharkcz commented Jan 12, 2024

yes, I am going to look/review ASAP

@rvykydal
Contributor

@steffen-maier thank you for the details on the process, I still believe that when the dracut part is ready we will be able to switch anaconda.

@KKoukiou KKoukiou added the blocked Don't merge this pull request! label Feb 28, 2024
…ot (#1802482,#1937049)

Implements a part of the referenced bugs.

Depends on
ibm-s390-linux/s390-tools@bc4f455
("zdev/dracut: retain early persistent config over switch root").
The spec file update reflects this new dependency on the new v2.31.0 of the
s390 architecture specific sub-package s390utils-core.

A new s390-tools zdev dracut module hook retain-zdev.sh copies the zdev
persistent configuration from the initrd into /run/zdev.initrd.config.
Import that persistent device configuration. It can be used:
* to transfer the configuration into the installed system,
* by python-blivet ZFCPDiskDevice.dracut_setup_args().

Any s390 device configuration in the installer user interface produces
persistent zdev config entries.

Instead of treating each s390 device type differently, simply transfer the
entire combined resulting zdev persistent config into the installed system.

The import above also fixes the problem that installations, which got zfcp
paths activated by means of rd.zfcp= dracut cmdline arguments, were missing
those paths in the installed system.
Since commit 87ab1ab ("Support cio_ignore functionality for zFCP
devices (#533492)"), /etc/zfcp.conf replaced /tmp/fcpconfig.
Since commit 011ea0a ("Remove linuxrc.s390"), /etc/zfcp.conf only
exists if the user specified dracut cmdline parameter rd.zfcp=.
https://github.com/ibm-s390-linux/s390-tools/tree/master/zdev/
handles parsing of rd.zfcp= without /etc/zfcp.conf as of commit
06a30ae529a5 ("zdev/dracut: add rd.zfcp cmdline option handling").
https://src.fedoraproject.org/rpms/s390utils.git
no longer writes /etc/zfcp.conf during deprecated parsing of rd.zfcp=
as of commit
("zfcp: migrate to consolidated persistent device config with zdev")
Hence, nothing populates /etc/zfcp.conf during installer boot anymore.
So python-blivet has no more initial import input to carry forward.

Signed-off-by: Steffen Maier <[email protected]>
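
The transfer described in the commit message above can be pictured with the following sketch, which only assembles command lines and runs nothing. It is not the code from this PR: the function name, the temporary export file and the exact option combination are assumptions, and the chzdev options used (`--import`, `--export`, `--all`, `--persistent`, `--yes`, `--base`) should be verified against chzdev(8). Only /run/zdev.initrd.config is taken from the commit message.

```python
def zdev_transfer_commands(sysroot,
                           initrd_config="/run/zdev.initrd.config",
                           export_file="/tmp/zdev.config"):
    """Return chzdev command lines for the three assumed transfer steps."""
    return [
        # 1. Import what the initrd configured (e.g. via rd.zfcp=).
        ["chzdev", "--import", initrd_config, "--persistent", "--yes"],
        # 2. Export the combined persistent config of all device types.
        ["chzdev", "--export", export_file, "--all", "--persistent"],
        # 3. Import it again, redirected into the installed system's /etc.
        ["chzdev", "--import", export_file, "--persistent", "--yes",
         "--base", "/etc={}/etc".format(sysroot)],
    ]


if __name__ == "__main__":
    for argv in zdev_transfer_commands("/mnt/sysroot"):
        print(" ".join(argv))
```

Note that no step names a device type or a device property, which is the simplification the commit message is after.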
Implements the znet part of referenced bugs.

Depends on
ibm-s390-linux/s390-tools@73f51e4
("zdev: add helper to convert from zdev config to rd.znet").
The spec file already reflects the new dependency on `zdev-to-rd.znet` in
the new v2.31.0 of the s390 architecture specific sub-package
s390utils-core.

Dracut commit ("feat(znet): use zdev for consolidated device
configuration"), replaces the distribution-specific persistent
configuration of s390 (channel-attached) network devices with a common
consolidated mechanism using chzdev from s390-tools. So there is no more
ccw.conf nor s390-specific low-level network config in NetworkManager
connections nor ifcfg files.
The spec file update reflects this new dependency on the updated dracut
module "znet" in a new version of the dracut-network sub-package.

Therefore, drop NETTYPE and OPTIONS.

Keep SUBCHANNELS nonetheless because it can still serve as a matching key
for NM connections.

Delegate the generation of rd.znet statements to a helper tool from
s390-tools, which gets its low-level config information from the
consolidated mechanism using chzdev.

There are two different code paths involved:

* Root-fs on something (such as iSCSI or NFS) that depends on znet
  =>_get_dracut_znet_argument_from_connection().
  Related earlier commits:
  commit fa174ab ("Write rd_CCW when root fs is on a network device on s390x (#577193)")
  and lately replacing former code:
  commit f85682f ("network module: add support for getting dracut arguments")
  commit 7cf4d64 ("network module: use network module to get dracut arguments")
  and finally replacing initscripts ifcfg by NetworkManager connection:
  commit 840c984 ("network: generate dracut arguments from connections (#1751189)")
  (Note that this generated rd.znet is independent of the (last) one just
  inherited from the boot parameters between commit 64fb106 ("Preserve
  network args on s390x.") and commit a4ba9ae ("Do not pass rd.znet on
  to installed system unconditionally").)

* Configure znet on boot with rd.znet= but without any corresponding ip=
  and instead use the kickstart command "network" to perform high-level
  configuration of the network interface created with rd.znet. This creates
  a non-initramfs NM connection.
  => pyanaconda.modules.network.initialization.ApplyKickstartTask.run
  => pyanaconda.modules.network.nm_client.add_connection_from_ksdata
  => pyanaconda.modules.network.nm_client.create_connections_from_ksdata
  => get_s390_settings() and _update_wired_connection_with_s390_settings()
  (In contrast, early initrd network setups get both the low-level s390
  config and high-level interface config via nm-initrd-generator,
  which parses rd.znet= as well as ip=.)

Signed-off-by: Steffen Maier <[email protected]>
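
For orientation, the rd.znet statements that the commit message above delegates to the s390-tools helper follow the dracut form `rd.znet=<nettype>,<subchannels>,<options>`. The sketch below parses such a statement; it is an illustration only, the function name and the example device numbers are hypothetical, and it does not reproduce what zdev-to-rd.znet or anaconda actually do.

```python
import re

# CCW bus-ID such as 0.0.0600: <css>.<ssid>.<devno>
BUS_ID = re.compile(r"^[0-9a-f]+\.[0-9a-f]\.[0-9a-f]{4}$", re.IGNORECASE)


def parse_rd_znet(argument):
    """Split an rd.znet= statement into nettype, subchannels and options."""
    prefix = "rd.znet="
    if not argument.startswith(prefix):
        raise ValueError("not an rd.znet statement: %r" % argument)
    fields = argument[len(prefix):].split(",")
    nettype, rest = fields[0], fields[1:]
    subchannels = [field for field in rest if BUS_ID.match(field)]
    options = dict(field.split("=", 1) for field in rest if "=" in field)
    return {"nettype": nettype, "subchannels": subchannels, "options": options}


if __name__ == "__main__":
    print(parse_rd_znet(
        "rd.znet=qeth,0.0.0600,0.0.0601,0.0.0602,layer2=1,portno=0"))
```

The subchannel list is the part that corresponds to SUBCHANNELS, which is why it remains usable as a matching key for NM connections even after NETTYPE and OPTIONS are dropped.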
storaged-project/blivet#1162 (comment)
removes DASDDevice.opts. Anticipating that blivet change, update
the anaconda unit tests making use of DASDDevice.

Signed-off-by: Steffen Maier <[email protected]>
@steffen-maier
Contributor Author

steffen-maier commented Feb 29, 2024

> Since chzdev and /lib/s390-tools/zdev-to-rd.znet are used, anaconda.spec should be updated so that anaconda-core requires the s390utils-core package (assuming the zdev-to-rd.znet helper script will be placed there) and also the minimal version of s390utils-core should be specified once it is known.

Now the s390-tools upstream version v2.31.0 is known and I updated the commit touching the spec file ("write persistent config of any (dasd,zfcp,znet) s390 devices to sysroot (#1802482,#1937049)").


This PR is stale because it has been open 60 days with no activity.
Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale label Apr 30, 2024
@steffen-maier
Contributor Author

This is not stale. It's just blocked by dracutdevs/dracut#2534.

@github-actions github-actions bot removed the stale label May 1, 2024

This PR is stale because it has been open 60 days with no activity.
Remove stale label or comment or this will be closed in 30 days.

@jstodola
Contributor

Tested these changes against a recent RHEL-10 build and they work fine there.

Labels
blocked Don't merge this pull request! f41 port to RHEL10