When your booting a Linux system using dracut, i.e. with any
redhat style distribution, dracut's internal code looks to validate
the kernel hmac signature in before proceeding to userspace.
It does this by looking at the /boot/ folder file for the kernel
hmac file.
And it normally does this with the root filesystem. Except if the
kernel is not on the root filesystem and is instead on a /boot
filesystem, this breaks horribly. This is compounded because
DIB enables the operator to restructure the OS image/layout
to fit their needs. In order for this to be navigated, as dracut
is written, we need to pass a "boot=" argument to the kernel.
So now we attempt to purge any prior boot entry in the disk image
content, which is good because any filesystem operations invalidate
it, and then we attempt to identify the boot filesystem, and save a
boot kernel command line parameter so the resulting image can
boot properly if FIPS was enabled in the prior image.
Regex developed with https://sed.js.org utilizing stdin:
VAR="quiet boot=UUID=173c759f-1302-48a3-9d51-a17784c21e03 text"
VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03"
VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03 reboot=meow"
VAR="quiet boot=UUID=/dev/sda1 text"
VAR="quiet boot=/dev/sda1"
VAR="quiet boot=/dev/sda1 reboot=meow"
VAR="quiet after_boot=1 reboot=meow boot=/dev/sda1"
VAR="quiet after_boot=1 reboot=meow"
Which resulted in stdout:
VAR="quiet text"
VAR="quiet"
VAR="quiet reboot=meow"
VAR="quiet text"
VAR="quiet"
VAR="quiet reboot=meow"
VAR="quiet after_boot=1 reboot=meow"
VAR="quiet after_boot=1 reboot=meow"
Change-Id: I9034c21e84deda2ba2c0ec0d1d6d6595ed10bed4
Sometimes umount doesn't have much time to finish and failed with
error 'target is busy', but this is not an actual error in some cases
and the operation should be repeated again with some timeout.
This solves the issue and raise actual exception only after several
tries with timeout.
Closes-Bug: #2004492
Change-Id: I069af85b52e20e9fd688f9ae07e66beb2179f3e1
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>
A recent change that didn't fail with hard-tabs made me realise we're
not running tox -e pep8 ... which means we're not running dib-lint
which should find this (and other things).
I couldn't pinpoint when this happened; maybe job config was never in
this repo.
Anyway, move the pylint and dib-lint/flake8 testing to the now
standard "linters" and update the linting job to
openstack-tox-linters.
It looks like pylint is very lightly used (came in with
I7e24d8348db3aef79e1395d12692199a1f80161a and we've never expanded any
testing). Leave this alone for now, but probably it is not important
any more.
This revealed some issues; updated flake8
(Iaa19c36f8cab8482a01f764c588375db8e7d8be3) found some spacing issues
with keywords and an update to elrepo to match our standard bash
flags.
Change-Id: I45bf108c467f7c8190ca252e6c48450c2622aaf8
This change extends the block device lvs attributes to allow creating
a volume which represents a thin pool, and to create volumes which are
allocated from this pool.
Change-Id: Ic58f55c36236cc8c6279fbcb708e27dc2982f2d5
Without this change, the final unmount will timeout after the
rollbacks are called when the partitioning fails due to a user error.
dmsetup remove is called both for partition and LVM volume devices.
Change-Id: I99679ea00338d4018a95d4da9b21685161cd5049
The block device lvm lvs `size` attribute was passed directly to
lvcreate, so using units M, G means base 2. All other block device
size values are parsed with accepted conventions of M, B being base 10
and MiB, GiB being base 2.
lvm lvs `size` attributes are now parsed the same as other size
attributes. This improves consistency and makes it practical to
calculate volume sizes to fill the partition size. This means existing
size values will now create slightly smaller volumes. Previous sizes
can be restored by changing the unit to MiB, GiB, or increasing the
value for a base 10 unit.
The impact on this change should be minimal, the only known uses of lvm
volumes (TripleO, and element block-device-efi-lvm) uses extents
percentage instead of size. The smaller sizes can always be increased
after deployment.
Requested sizes will also be rounded down to align with physical
extents (4MiB). Previously specifying a value which did not align on
4MiB would consume an extra extent which could unexpectedly consume
more than the partition size.
Change-Id: Ia109cc5105071d82cc895d8d9cb85bc47da20a7a
ABCs in collections should be imported from collections.abc and direct
import from collections is deprecated since Python 3.3.
Change-Id: Idacff95cbb276eda0bc55de771ce6c701363c2e1
With the removal of Python 2.x we can remove the unittest2 compat
wrappers and switch to assertCountEqual instead of assertItemsEqual
We have been able to use them since then, because
testtools required unittest2, which still included it. With testtools
removing Python 2.7 support [3][4], we will lose support for
assertItemsEqual, so we should switch to use assertCountEqual.
[1] - https://bugs.python.org/issue17866
[2] - https://hg.python.org/cpython/rev/d9921cb6e3cd
[3] - testing-cabal/testtools#286
[4] - testing-cabal/testtools#277
Change-Id: I870286a2557e41099597c22dc9747743e1077615
New versions of flake8 fail as "l" can be confused for "1" apparently
(E741) ... not sure I totally agree but since it's only one instance,
update it.
Change-Id: Ic5c47867facd56b53cc6534da4ae3a345c516202
This causes problems for other projects incorporating dib; we don't
have a specific need for a cap.
Fix a few issues, mostly spacing or regex matches. No functional
changes.
W503 and W504 relate to leaving artithmetic operators at the start or
end of lines, and are mutually exclusive and, due to "ignore"
overriding the defaults both get enabled. It seems everyone gets this
wrong (https://gitlab.com/pycqa/flake8/issues/466). Don't take a
position on this and ignore both.
Use double # around comments including YAML snippets using "# type: "
which now gets detected as PEP484/mypy type hints.
Change-Id: I8b7ce6dee02dcce31c82427a2441c931d136ef57
Currently there is a bug, that tries to detach the device from a
partition at the first try, without considering that there may be
other partitions and volumes on it. Ensure that the detach is done
properly, and add a test to ensure that this happens correctly.
Change-Id: I35c5a473509f17a70270a2cbf5bf579faaeb123a
Fixes-Bug: #1777861
A recap -- we run umount phase then cleanup phase.
Currently we register a object to do the final LVM cleanup based on
the parent PV. In light of I697bfbf042816c5ddf170bde9534cc4f0c7279ff,
I believe this should just be done in the cleanup phase. Note there
was probably additional confusion because the partition removal was
done in the cleanup phase until
I7af3c5cf66afd81a481f454b5207af552ad52a32, where is was moved into the
umount phase.
Thus it is moved into the cleanup() function and this should now run,
per the comment, after everything is unmounted in umount phase.
This also exposes that we didn't have the cleanup phase in the unit
tests (because it wasn't doing anything I guess). Add it.
Change-Id: I1c5f4ffc9619c774f78d21b918a81647b3dc28f5
This review squashes:
Iac9afc7766d3640815dc20cfd6de1245d36a09cc
Ie894b5801bd7b3815432882cd626941e89d9f9a1
We need to do this as we can't fix pylint without networkx as that
failes requirements-chak due to us having a cap on networkx and we can't
uncap networkx as part of tripleo-buildimage installs without
constratints which gets us 2.1 and DIB desn't support 2.x
This is the commit message Iac9afc7766d3640815dc20cfd6de1245d36a09cc
---
One of the pylint dependencies has updated to be python3 only; this
version of pylint correctly caps things so it still works with
python2.
This also exposes that we need to uncap networkx due to
I34045f87ca19c2f184b040f4d89347374cce518b. We should remain on
version 1 for now thanks to upper-constraints, but we need to maintain
the lower-constraint.
---
This is the commit message Ie894b5801bd7b3815432882cd626941e89d9f9a1
---
Support different versions of networkx
Since the entry of networkx 2.0 nodes has a different
behaviour. Checking if dg.nodes is iterable is enough to add
compatibility for new/older versions.
---
Change-Id: I82dc61fac6c156a4f0d574290c7632077aa53195
Similar to I697bfbf042816c5ddf170bde9534cc4f0c7279ff, the order of
things called is "dib-block-device umount" *then* "dib-block-device
cleanup".
Because we're doing the "kpartx -d" here in cleanup, it means that the
loop-device is removed in umount phase from level0/localloop.py, then
afterwards we try and remove the partitions.
Change-Id: I7af3c5cf66afd81a481f454b5207af552ad52a32
TODO: a test case to ensure the ordering
One call in localloop requires the output of the command, so modify
exec_sudo to buffer up output and return it. This is modelled on the
same thing in package-installs-v2 which seems to work. Rather than
return a subprocess exception, return a dib exception which everything
should have imported anyway.
The overall reason for this is to make our external calls more
consistent for mocking in unit testing.
Change-Id: I10d23b873dee9f775daef2a4c8be5671d02c386e
As described in blockdevice.py detachment and (most) resources
release must be done in the umount phase of a block device module.
Until now these jobs were done in the lvm cleanup() phase - which
is too late - especially when using nested LVMs.
This patch moves the functionality of the cleanup() phase to the
umount() phase for the lvm module.
It includes a test case that fails without applying the provided
source code changes.
Change-Id: I697bfbf042816c5ddf170bde9534cc4f0c7279ff
Signed-off-by: Andreas Florath <andreas@florath.net>
Our sgdisk calls are putting extra double-quotes around the names of
partitions. This confuses sfdisk, which confuses growpart, which
confuses growroot ... and you don't get your partition grown for EFI
boot.
Ensure we just bunch arguments into the list directly (for Popen)
rather than string split and have to worry about quoting. Add a check
for this to our GPT unit test, extending it to include a space in the
name of the root partition.
Change-Id: I0a8cb69bb4c9c0865fbaa63ba0d7210028da552e
When booting on UEFI, there was an issue mounting the vfat
filesystem. It was caused because the mount was defined in
/etc/fstab in lowercase, but the disk had it labeled in upper
case, and system could not find it. Conver the label to upper
case in case of fat/vfat.
Change-Id: Id3dee735e6f8fb221d199c4aba648f3e9a6e4206
There was a typo in I6b819a8071389e7e4eb4874ff7750bd192695ff2 that
modified this default partition type from "0x83" to just 83. We are
now seeing failures relating to this as sfdisk checks for a "disk
manager" when it see Id 0x53 (== 83)
Device Boot Start End Blocks Id System
/dev/vda1 * 2048 26664575 13331264 53 OnTrack DM6 Aux3
Restore to 0x83
Change-Id: Ib43038d2d740fbe01a21a13dd56367f7bc97f869
This adds support for a GPT label type to the partitioning code. This
is relatively straight-forward translation of the partition config
into a sgparted command-line and subsequent call.
A unit test is added based on a working GPT/EFI configuration and the
fedora-minimal functional test is updated to build a single-partition
GPT based using the new block-device-gpt override element. See notes
in the sample configuration files about partition requirements and
types.
Documentation has been updated.
Co-Authored-By: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Change-Id: I6b819a8071389e7e4eb4874ff7750bd192695ff2
As described, we want to set the default label for XFS disks to the
shorter value.
For example, you hit this when setting the old FS_TYPE environment
variable to 'xfs' (which sets the "root-fs-type" parameter, which gets
passed through to 'type'; but does not set a default label).
Change-Id: I41dce6e25766562db4366021309b8c2b74a8ab80
Closes-Bug: 1742170
This small change avoids running fstrim on vfat partitions.
The mount order test-case has been updated to also test the mkfs
creation components, and the input config modified to have a vfat
partition to cover this path.
Change-Id: I8952e748d4bdc12a5769706de9057c1e97d95e37
This patch removes the unneeded dd calls in the lvm block device
plugin.
After removing the underlying block device, there is the need to call
'pvscan --cache'. This is done by a dedicated LVM cleanup node which
is cleaned up after the the underlying block device.
Change-Id: Id8eaede77fbdc107d2ba1035cd6b8eb5c10160c3
Signed-off-by: Andreas Florath <andreas@florath.net>
The call to fstrim in disk-image-create is currently useless, because
at the time this is called, the file systems were already umounted by
the block device layer.
The current implementation of the block-device mount plugin does not
call fstrim at all: resulting in larger image sizes.
This patch removes the useless fstrim call from the disk-image-create
script and moves this into the block-device mount.py.
The resulting image might be much smaller. Example: Ubuntu Xenial
with some elements; once with and once without this patch:
-rw-r--r-- 1 dib dib 475661824 Sep 16 06:43 ubuntu-xenial-without-fstrim.qcow2
-rw-r--r-- 1 dib dib 364249088 Sep 16 09:30 ubuntu-xenial-with-fstrim.qcow2
Change-Id: I4e21ae50c5e6e26dc9f50f004ed6413132c81047
Signed-off-by: Andreas Florath <andreas@florath.net>
We intended to do an in-place sort of the mount-point list, but
sorted() returns a new list that wasn't captured. Move to the .sort()
function.
It seems the existing unit-test missed this. Add a new test taken
from the bug which does exhibit a sorting issue. Also added a
unit-test of just the comparitor for sanity.
Closes-Bug: 1699437
Change-Id: I8101e4a1804a4af7dbda20d48bf362c3f4ad2742
This provides a basic LVM support to dib-block-device.
Co-Authored-By: Ian Wienand <iwienand@redhat.com>
Change-Id: Ibd624d9f95ee68b20a15891f639ddd5b3188cdf9
The MBR Partition Table Entry (PTE) allows one to specify many
possible partition types and one of the benefits of this is being able
to specify the CHS variant or the LBA variant.
By default, LBA only creates partitions of type 0x83 (of course,
that's only because the documentation doesn't tell you how to make it
do anything else).
I will take up Ian's suggestion in patch set 2 for a more rigorous
test in an independent patch set.
Change-Id: If3068535980eac2e58d4025444c65147a8c7fedc
Closes-Bug:#1703352
The code in mkfs correctly extends the command line with a '-n' for
vfat but does not currently do it for fat. This means that mkfs for
fat ends up with a '-L' which is what you'd do for everything like
ext[234].
The change just treats fat like vfat in the one place where this check
is required.
Change-Id: If65dfd949acdadff33a564640fb42ea73026a786
Closes-Bug: #1703063
We introduced the "settle" in
I90103b59357edebbac7a641e8980cb282d37561b thinking that maybe kpartx
had not finished writing the partition. This probably wasn't a bad
first assumption, since we used to have this -- but is seems
insufficient.
The other failiure here seems to be if kpartx hasn't actually seen the
updated partition table in the image, so it has correctly (in it's
mind) not mounted the partition.
Looking at strace of fdisk run manually on a loopback, it will do a
fsync on the raw device after writing and then a global sync as it
exits.
This replicates this; we flush and fsync in mbr.py in the exit handler
after writing the partition, before closing the file (i've updated one
of the unit tests to double-check the call). In the partitioning.py
caller we execute a sync call too.
Since it does seem unlikely the "-s" option of kpartx is not working,
I've removed the udev settle work-around too.
Change-Id: Ia77a0ffe4c76854b326ed76490479d9c691b49aa
Partial-Bug: #1698337
There was a race in diskimage-builder where the mkfs call after a
kpartx -avs for the loop device would fail because the device was
not yet ready. This adds a udevadm settle call after the kpartx
to make sure the udev event queue has cleared.
Change-Id: I90103b59357edebbac7a641e8980cb282d37561b
Closes-Bug: #1698337
Currently we only export "image-block-device" which is the loopback
device (/dev/loopX) for the underlying image. This is the device we
install grub to (from inside the chroot ...)
This is ok for x86, but is insufficient for some platforms like PPC
which have a separate boot partition. They do not want to install to
the loop device, but do things like dd special ELF files into special
boot partitions.
The first problem seems to be that in level1/partitioning.py we have a
whole bunch of different paths that either call partprobe on the loop
device, or kpartx. We have _all_part_devices_exist() that gates the
kpartx for unknown reasons. We have detach_loopback() that does not
seem to remove losetup created devices. I don't think this does
cleanup if it uses kpartx correctly. It is extremley unclear what's
going to be mapped where.
This moves to us *only* using kpartx to map the partitions of the loop
device. We will *not* call partprobe and create the /dev/loopXpN
devices and will only have the devicemapper nodes kpartx creates.
This seems to be best. Cleanup happens inside partitioning.py.
practice. Deeper thinking about this, and more cleanup of the
variables will be welcome.
This adds "image-block-devices" (note the extra "s") which exports all
the block devices with name and path. This is in a string format that
can be eval'd to an array (you can't export arrays).
This is then used in a follow-on
(I0918e8df8797d6dbabf7af618989ab7f79ee9580) to pick the right
partition on PPC.
Change-Id: If8e33106b4104da2d56d7941ce96ffcb014907bc
Currently we pass a reference to a global "rollback" list to create()
to keep rollback functions. Other nodes don't need to know about
global rollback state, and by passing by reference we're giving them
the chance to mess it up for everyone else.
Add a "add_rollback()" function in NodeBase for create() calls to
register rollback calls within themselves. As they hit rollback
points they can add a new entry. lambda v arguments is much of a
muchness -- but this is similar to the standard atexit() call so with
go with that pattern. A new "rollback()" call is added that the
driver will invoke on each node as it works its way backwards in case
of failure.
On error, nodes will have rollback() called in reverse order (which
then calls registered rollbacks in reverse order).
A unit test is added to test rollback behaviour
Change-Id: I65214e72c7ef607dd08f750a6d32a0b10fe97ac3