Commit graph

111 commits

Author SHA1 Message Date
Zuul
9c1ee6dcd8 Merge "Correct boot path to cover FIPS usage cases" 2023-03-21 06:39:00 +00:00
Zuul
950ad3324d Merge "Add swap support" 2023-03-21 06:38:57 +00:00
Julia Kreger
4633da7750 Correct boot path to cover FIPS usage cases
When your booting a Linux system using dracut, i.e. with any
redhat style distribution, dracut's internal code looks to validate
the kernel hmac signature in before proceeding to userspace.

It does this by looking at the /boot/ folder file for the kernel
hmac file.

And it normally does this with the root filesystem. Except if the
kernel is not on the root filesystem and is instead on a /boot
filesystem, this breaks horribly. This is compounded because
DIB enables the operator to restructure the OS image/layout
to fit their needs. In order for this to be navigated, as dracut
is written, we need to pass a "boot=" argument to the kernel.

So now we attempt to purge any prior boot entry in the disk image
content, which is good because any filesystem operations invalidate
it, and then we attempt to identify the boot filesystem, and save a
boot kernel command line parameter so the resulting image can
boot properly if FIPS was enabled in the prior image.

Regex developed with https://sed.js.org utilizing stdin:

VAR="quiet boot=UUID=173c759f-1302-48a3-9d51-a17784c21e03 text"
VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03"
VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03 reboot=meow"
VAR="quiet boot=UUID=/dev/sda1 text"
VAR="quiet boot=/dev/sda1"
VAR="quiet boot=/dev/sda1 reboot=meow"
VAR="quiet after_boot=1 reboot=meow boot=/dev/sda1"
VAR="quiet after_boot=1 reboot=meow"

Which resulted in stdout:

VAR="quiet text"
VAR="quiet"
VAR="quiet reboot=meow"
VAR="quiet text"
VAR="quiet"
VAR="quiet reboot=meow"
VAR="quiet after_boot=1 reboot=meow"
VAR="quiet after_boot=1 reboot=meow"

Change-Id: I9034c21e84deda2ba2c0ec0d1d6d6595ed10bed4
2023-03-15 11:25:21 -07:00
Maksim Malchuk
601dc0387f Add swap support
Adds swap as a valid "filesystem"

Closes-Bug: #1816136
Change-Id: Ie50834a9834815b1dfacafd283f505f3323d35c8
Co-Authored-By: luke.odom <luke.odom@dreamhost.com>
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>
2023-03-06 14:54:35 +03:00
Maksim Malchuk
84d6af7de8 Repeat to umount filesystem when exception occurs
Sometimes umount doesn't have much time to finish and failed with
error 'target is busy', but this is not an actual error in some cases
and the operation should be repeated again with some timeout.

This solves the issue and raise actual exception only after several
tries with timeout.

Closes-Bug: #2004492
Change-Id: I069af85b52e20e9fd688f9ae07e66beb2179f3e1
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>
2023-02-01 20:09:47 +03:00
Ian Wienand
2a25db9ee6
Start running dib-lint again
A recent change that didn't fail with hard-tabs made me realise we're
not running tox -e pep8 ... which means we're not running dib-lint
which should find this (and other things).

I couldn't pinpoint when this happened; maybe job config was never in
this repo.

Anyway, move the pylint and dib-lint/flake8 testing to the now
standard "linters" and update the linting job to
openstack-tox-linters.

It looks like pylint is very lightly used (came in with
I7e24d8348db3aef79e1395d12692199a1f80161a and we've never expanded any
testing).  Leave this alone for now, but probably it is not important
any more.

This revealed some issues; updated flake8
(Iaa19c36f8cab8482a01f764c588375db8e7d8be3) found some spacing issues
with keywords and an update to elrepo to match our standard bash
flags.

Change-Id: I45bf108c467f7c8190ca252e6c48450c2622aaf8
2022-09-21 07:56:05 +10:00
Steve Baker
833c5b8ceb Support LVM thin provisioning
This change extends the block device lvs attributes to allow creating
a volume which represents a thin pool, and to create volumes which are
allocated from this pool.

Change-Id: Ic58f55c36236cc8c6279fbcb708e27dc2982f2d5
2022-08-24 10:34:42 +12:00
Steve Baker
1a4fb0b89b Do dmsetup remove device in rollback
Without this change, the final unmount will timeout after the
rollbacks are called when the partitioning fails due to a user error.

dmsetup remove is called both for partition and LVM volume devices.

Change-Id: I99679ea00338d4018a95d4da9b21685161cd5049
2022-08-18 10:23:41 +12:00
Steve Baker
d090126c66 Parse block device lvm lvs size attributes
The block device lvm lvs `size` attribute was passed directly to
lvcreate, so using units M, G means base 2. All other block device
size values are parsed with accepted conventions of M, B being base 10
and MiB, GiB being base 2.

lvm lvs `size` attributes are now parsed the same as other size
attributes. This improves consistency and makes it practical to
calculate volume sizes to fill the partition size. This means existing
size values will now create slightly smaller volumes. Previous sizes
can be restored by changing the unit to MiB, GiB, or increasing the
value for a base 10 unit.

The impact on this change should be minimal, the only known uses of lvm
volumes (TripleO, and element block-device-efi-lvm) uses extents
percentage instead of size. The smaller sizes can always be increased
after deployment.

Requested sizes will also be rounded down to align with physical
extents (4MiB). Previously specifying a value which did not align on
4MiB would consume an extra extent which could unexpectedly consume
more than the partition size.

Change-Id: Ia109cc5105071d82cc895d8d9cb85bc47da20a7a
2022-07-06 11:27:42 +12:00
Takashi Kajinami
b6254398e7 Replace deprecated import of ABCs from collections
ABCs in collections should be imported from collections.abc and direct
import from collections is deprecated since Python 3.3.

Change-Id: Idacff95cbb276eda0bc55de771ce6c701363c2e1
2021-07-17 01:02:19 +09:00
Zuul
ea6403388e Merge "Switch from unittest2 compat methods to Python 3.x methods" 2020-07-13 02:21:12 +00:00
Zuul
b6d4bef9eb Merge "Use kpartx option to update partition mappings" 2020-07-07 08:27:29 +00:00
melissaml
bf0dc265ae Switch from unittest2 compat methods to Python 3.x methods
With the removal of Python 2.x we can remove the unittest2 compat
wrappers and switch to assertCountEqual instead of assertItemsEqual

We have been able to use them since then, because
testtools required unittest2, which still included it. With testtools
removing Python 2.7 support [3][4], we will lose support for
assertItemsEqual, so we should switch to use assertCountEqual.

[1] - https://bugs.python.org/issue17866
[2] - https://hg.python.org/cpython/rev/d9921cb6e3cd
[3] - testing-cabal/testtools#286
[4] - testing-cabal/testtools#277

Change-Id: I870286a2557e41099597c22dc9747743e1077615
2020-07-07 11:11:28 +08:00
Simon Westphahl
4a424ecabb Use kpartx option to update partition mappings
Fix cases of 'mkfs' failing because the partitions never showed up. Partition
mappings will now be updated instead of just adding them with 'kpartx'. That
means that 'kpartx' will also remove devmappings for deleted partitions.

Traceback of failing mkfs call:

2020-05-11 22:03:25.523 | INFO diskimage_builder.block_device.utils [-] Calling [sudo sync]
2020-05-11 22:03:25.539 | INFO diskimage_builder.block_device.utils [-] Calling [sudo kpartx -avs /dev/loop0]
2020-05-11 22:03:25.581 | INFO diskimage_builder.block_device.utils [-] Calling [sudo mkfs -t ext4 -i 4096 -J size=64 -L cloudimg-rootfs -U 21c6f9eb-4d52-4e5c-b9b7-796735de8909 -q /dev/mapper/loop0p1]
2020-05-11 22:03:25.700 | ERROR diskimage_builder.block_device.blockdevice [-] Create failed; rollback initiated
2020-05-11 22:03:25.700 | Traceback (most recent call last):
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/blockdevice.py", line 406, in cmd_create
2020-05-11 22:03:25.700 |     node.create()
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/level2/mkfs.py", line 133, in create
2020-05-11 22:03:25.700 |     exec_sudo(cmd)
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/utils.py", line 143, in exec_sudo
2020-05-11 22:03:25.700 |     raise e
2020-05-11 22:03:25.700 | diskimage_builder.block_device.exception.BlockDeviceSetupException: exec_sudo failed
2020-05-11 22:03:25.700 | INFO diskimage_builder.block_device.level0.localloop [-] loopdev detach
2020-05-11 22:03:25.701 | INFO diskimage_builder.block_device.utils [-] Calling [sudo losetup -d /dev/loop0]
2020-05-11 22:03:25.732 | INFO diskimage_builder.block_device.level0.localloop [-] Remove image file [/tmp/dib_image.muyw7t1h/image0.raw]
2020-05-11 22:03:25.734 | ERROR diskimage_builder.block_device.blockdevice [-] Rollback complete, exiting
2020-05-11 22:03:25.740 | Traceback (most recent call last):
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/bin/dib-block-device", line 8, in <module>
2020-05-11 22:03:25.740 |     sys.exit(main())
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 120, in main
2020-05-11 22:03:25.740 |     return bdc.main()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 115, in main
2020-05-11 22:03:25.740 |     self.args.func()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 36, in cmd_create
2020-05-11 22:03:25.740 |     self.bd.cmd_create()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/blockdevice.py", line 406, in cmd_create
2020-05-11 22:03:25.740 |     node.create()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/level2/mkfs.py", line 133, in create
2020-05-11 22:03:25.740 |     exec_sudo(cmd)
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/utils.py", line 143, in exec_sudo
2020-05-11 22:03:25.740 |     raise e
2020-05-11 22:03:25.740 | diskimage_builder.block_device.exception.BlockDeviceSetupException: exec_sudo failed

Change-Id: I374f7f22f9e93ef35eb5813712ca59e75f0733e8
Related-Bug: #1698337
2020-06-09 09:07:55 +02:00
Andreas Jaeger
4493208048 Drop six usage
With python3, six is not needed anymore, drop it.

Change-Id: I70bb679270605ac32ca0cceb9414ea3a210e5842
2020-06-05 12:04:37 +02:00
Ian Wienand
9080d04923 block device: update variable name
New versions of flake8 fail as "l" can be confused for "1" apparently
(E741) ... not sure I totally agree but since it's only one instance,
update it.

Change-Id: Ic5c47867facd56b53cc6534da4ae3a345c516202
2020-05-13 06:22:02 +10:00
Ian Wienand
28ebd24844 Uncap hacking
This causes problems for other projects incorporating dib; we don't
have a specific need for a cap.

Fix a few issues, mostly spacing or regex matches.  No functional
changes.

W503 and W504 relate to leaving artithmetic operators at the start or
end of lines, and are mutually exclusive and, due to "ignore"
overriding the defaults both get enabled.  It seems everyone gets this
wrong (https://gitlab.com/pycqa/flake8/issues/466).  Don't take a
position on this and ignore both.

Use double # around comments including YAML snippets using "# type: "
which now gets detected as PEP484/mypy type hints.

Change-Id: I8b7ce6dee02dcce31c82427a2441c931d136ef57
2020-02-24 10:34:46 +11:00
zhufl
b5f247b04f Add missing ws separator between words
This is to add missing ws separator between words.

Change-Id: Ie192d296128fb785c344ac5d8a77cad59764080e
2018-11-21 16:07:59 +08:00
Zuul
3be4b0c1fd Merge "Only detach device if all partitions have been cleaned" 2018-07-31 08:21:27 +00:00
Zuul
3197a7ef1b Merge "Move LVM cleanup phase into cleanup" 2018-07-31 00:30:47 +00:00
Zuul
d50bd1deb3 Merge "Don't quote names with sgdisk" 2018-07-30 06:26:25 +00:00
Yolanda Robla
64bb87f7b5 Only detach device if all partitions have been cleaned
Currently there is a bug, that tries to detach the device from a
partition at the first try, without considering that there may be
other partitions and volumes on it. Ensure that the detach is done
properly, and add a test to ensure that this happens correctly.

Change-Id: I35c5a473509f17a70270a2cbf5bf579faaeb123a
Fixes-Bug: #1777861
2018-07-30 16:24:57 +10:00
Ian Wienand
7302f38f97 Move LVM cleanup phase into cleanup
A recap -- we run umount phase then cleanup phase.

Currently we register a object to do the final LVM cleanup based on
the parent PV.  In light of I697bfbf042816c5ddf170bde9534cc4f0c7279ff,
I believe this should just be done in the cleanup phase.  Note there
was probably additional confusion because the partition removal was
done in the cleanup phase until
I7af3c5cf66afd81a481f454b5207af552ad52a32, where is was moved into the
umount phase.

Thus it is moved into the cleanup() function and this should now run,
per the comment, after everything is unmounted in umount phase.

This also exposes that we didn't have the cleanup phase in the unit
tests (because it wasn't doing anything I guess).  Add it.

Change-Id: I1c5f4ffc9619c774f78d21b918a81647b3dc28f5
2018-07-30 14:35:16 +10:00
Zuul
48645abff6 Merge "Call kpartx remove in umount, not cleanup" 2018-07-24 23:05:16 +00:00
Zuul
0a40f45094 Merge "Move localloop to exec_sudo" 2018-07-24 23:05:15 +00:00
Zuul
961235854b Merge "block-device lvm: fix umount phase" 2018-07-24 11:26:11 +00:00
Ian Wienand
1107326723 Update pylint to 1.7.6, uncap networkx
This review squashes:
    Iac9afc7766d3640815dc20cfd6de1245d36a09cc
    Ie894b5801bd7b3815432882cd626941e89d9f9a1

We need to do this as we can't fix pylint without networkx as that
failes requirements-chak due to us having a cap on networkx and we can't
uncap networkx as part of tripleo-buildimage installs without
constratints which gets us 2.1 and DIB desn't support 2.x

This is the commit message Iac9afc7766d3640815dc20cfd6de1245d36a09cc
---
One of the pylint dependencies has updated to be python3 only; this
version of pylint correctly caps things so it still works with
python2.

This also exposes that we need to uncap networkx due to
I34045f87ca19c2f184b040f4d89347374cce518b.  We should remain on
version 1 for now thanks to upper-constraints, but we need to maintain
the lower-constraint.
---

This is the commit message Ie894b5801bd7b3815432882cd626941e89d9f9a1
---
Support different versions of networkx

Since the entry of networkx 2.0 nodes has a different
behaviour. Checking if dg.nodes is iterable is enough to add
compatibility for new/older versions.
---

Change-Id: I82dc61fac6c156a4f0d574290c7632077aa53195
2018-07-18 09:27:01 +10:00
Ian Wienand
f94943344f Call kpartx remove in umount, not cleanup
Similar to I697bfbf042816c5ddf170bde9534cc4f0c7279ff, the order of
things called is "dib-block-device umount" *then* "dib-block-device
cleanup".

Because we're doing the "kpartx -d" here in cleanup, it means that the
loop-device is removed in umount phase from level0/localloop.py, then
afterwards we try and remove the partitions.

Change-Id: I7af3c5cf66afd81a481f454b5207af552ad52a32
TODO: a test case to ensure the ordering
2018-06-29 11:22:33 +10:00
Ian Wienand
a1a549548a Move localloop to exec_sudo
One call in localloop requires the output of the command, so modify
exec_sudo to buffer up output and return it.  This is modelled on the
same thing in package-installs-v2 which seems to work.  Rather than
return a subprocess exception, return a dib exception which everything
should have imported anyway.

The overall reason for this is to make our external calls more
consistent for mocking in unit testing.

Change-Id: I10d23b873dee9f775daef2a4c8be5671d02c386e
2018-06-29 11:22:24 +10:00
Andreas Florath
f5736f3178 block-device lvm: fix umount phase
As described in blockdevice.py detachment and (most) resources
release must be done in the umount phase of a block device module.

Until now these jobs were done in the lvm cleanup() phase - which
is too late - especially when using nested LVMs.

This patch moves the functionality of the cleanup() phase to the
umount() phase for the lvm module.
It includes a test case that fails without applying the provided
source code changes.

Change-Id: I697bfbf042816c5ddf170bde9534cc4f0c7279ff
Signed-off-by: Andreas Florath <andreas@florath.net>
2018-06-28 15:21:59 +10:00
Ian Wienand
b0da703f46 Don't quote names with sgdisk
Our sgdisk calls are putting extra double-quotes around the names of
partitions.  This confuses sfdisk, which confuses growpart, which
confuses growroot ... and you don't get your partition grown for EFI
boot.

Ensure we just bunch arguments into the list directly (for Popen)
rather than string split and have to worry about quoting.  Add a check
for this to our GPT unit test, extending it to include a space in the
name of the root partition.

Change-Id: I0a8cb69bb4c9c0865fbaa63ba0d7210028da552e
2018-06-27 18:10:08 +10:00
Zuul
e39adcd65f Merge "Remove redundant word" 2018-06-22 14:23:47 +00:00
chenxiangui
c809160906 Remove redundant word
Remove the redundant word 'the' in config.py

Change-Id: I3e9cb6390ce196f0a9022aef10f6c7b1ace36c48
2018-06-19 18:00:11 +08:00
Yolanda Robla
9687a1efe1 Convert labels to upper case
When booting on UEFI, there was an issue mounting the vfat
filesystem. It was caused because the mount was defined in
/etc/fstab in lowercase, but the disk had it labeled in upper
case, and system could not find it. Conver the label to upper
case in case of fat/vfat.

Change-Id: Id3dee735e6f8fb221d199c4aba648f3e9a6e4206
2018-06-19 11:12:54 +02:00
Ian Wienand
f3f671cf10 Fix default partition type
There was a typo in I6b819a8071389e7e4eb4874ff7750bd192695ff2 that
modified this default partition type from "0x83" to just 83.  We are
now seeing failures relating to this as sfdisk checks for a "disk
manager" when it see Id 0x53 (== 83)

     Device Boot      Start         End      Blocks   Id  System
  /dev/vda1   *        2048    26664575    13331264   53  OnTrack DM6 Aux3

Restore to 0x83

Change-Id: Ib43038d2d740fbe01a21a13dd56367f7bc97f869
2018-03-22 10:10:47 +11:00
Ian Wienand
55b479b54f GPT partitioning support
This adds support for a GPT label type to the partitioning code.  This
is relatively straight-forward translation of the partition config
into a sgparted command-line and subsequent call.

A unit test is added based on a working GPT/EFI configuration and the
fedora-minimal functional test is updated to build a single-partition
GPT based using the new block-device-gpt override element.  See notes
in the sample configuration files about partition requirements and
types.

Documentation has been updated.

Co-Authored-By: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Change-Id: I6b819a8071389e7e4eb4874ff7750bd192695ff2
2018-02-23 10:04:26 +11:00
Mark Hamzy
c7da8bc90a Set default label for XFS disks
As described, we want to set the default label for XFS disks to the
shorter value.

For example, you hit this when setting the old FS_TYPE environment
variable to 'xfs' (which sets the "root-fs-type" parameter, which gets
passed through to 'type'; but does not set a default label).

Change-Id: I41dce6e25766562db4366021309b8c2b74a8ab80
Closes-Bug: 1742170
2018-01-29 15:10:08 +11:00
Ian Wienand
fd9a8acecd Don't fstrim vfat partitions
This small change avoids running fstrim on vfat partitions.

The mount order test-case has been updated to also test the mkfs
creation components, and the input config modified to have a vfat
partition to cover this path.

Change-Id: I8952e748d4bdc12a5769706de9057c1e97d95e37
2018-01-23 13:24:09 +11:00
Andreas Florath
bb6cf52d85 Remove dd from LVM element
This patch removes the unneeded dd calls in the lvm block device
plugin.

After removing the underlying block device, there is the need to call
'pvscan --cache'.  This is done by a dedicated LVM cleanup node which
is cleaned up after the the underlying block device.

Change-Id: Id8eaede77fbdc107d2ba1035cd6b8eb5c10160c3
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-10-08 17:21:21 +00:00
Andreas Florath
fa6c731132 Move fstrim to block device layer
The call to fstrim in disk-image-create is currently useless, because
at the time this is called, the file systems were already umounted by
the block device layer.

The current implementation of the block-device mount plugin does not
call fstrim at all: resulting in larger image sizes.

This patch removes the useless fstrim call from the disk-image-create
script and moves this into the block-device mount.py.

The resulting image might be much smaller.  Example: Ubuntu Xenial
with some elements; once with and once without this patch:

-rw-r--r-- 1 dib dib 475661824 Sep 16 06:43 ubuntu-xenial-without-fstrim.qcow2
-rw-r--r-- 1 dib dib 364249088 Sep 16 09:30 ubuntu-xenial-with-fstrim.qcow2

Change-Id: I4e21ae50c5e6e26dc9f50f004ed6413132c81047
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-09-28 17:48:59 +10:00
Ian Wienand
ed3c5d9711 Actually sort mount-point list
We intended to do an in-place sort of the mount-point list, but
sorted() returns a new list that wasn't captured.  Move to the .sort()
function.

It seems the existing unit-test missed this.  Add a new test taken
from the bug which does exhibit a sorting issue.  Also added a
unit-test of just the comparitor for sanity.

Closes-Bug: 1699437
Change-Id: I8101e4a1804a4af7dbda20d48bf362c3f4ad2742
2017-09-19 11:30:36 +10:00
Yolanda Robla
c2dc3dc78e LVM support for dib-block-device
This provides a basic LVM support to dib-block-device.

Co-Authored-By: Ian Wienand <iwienand@redhat.com>

Change-Id: Ibd624d9f95ee68b20a15891f639ddd5b3188cdf9
2017-08-24 16:22:56 +10:00
Jenkins
ced9b51f6e Merge "Allow users to specify partition type in the MBR PTE" 2017-08-04 05:19:02 +00:00
Amrith Kumar
52faa0e1d9 Allow users to specify partition type in the MBR PTE
The MBR Partition Table Entry (PTE) allows one to specify many
possible partition types and one of the benefits of this is being able
to specify the CHS variant or the LBA variant.

By default, LBA only creates partitions of type 0x83 (of course,
that's only because the documentation doesn't tell you how to make it
do anything else).

I will take up Ian's suggestion in patch set 2 for a more rigorous
test in an independent patch set.

Change-Id: If3068535980eac2e58d4025444c65147a8c7fedc
Closes-Bug:#1703352
2017-07-29 06:34:25 -04:00
Amrith Kumar
59f416ae20 The correct option for label name in fat and vfat is '-n'
The code in mkfs correctly extends the command line with a '-n' for
vfat but does not currently do it for fat. This means that mkfs for
fat ends up with a '-L' which is what you'd do for everything like
ext[234].

The change just treats fat like vfat in the one place where this check
is required.

Change-Id: If65dfd949acdadff33a564640fb42ea73026a786
Closes-Bug: #1703063
2017-07-15 22:48:52 -04:00
Xinliang Liu
178db0c97b Fix mkfs use wrong label option for vfat
For vfat type, mkfs should use '-n' option for label.
e.g.:
mkfs -t vfat -n LABEL-STRING

Change-Id: I1414c5b8e0aeb240c3e6884e35ba75dde677db0c
2017-06-22 14:50:53 +08:00
Ian Wienand
5d5fa06e5c Sync after writing partition table
We introduced the "settle" in
I90103b59357edebbac7a641e8980cb282d37561b thinking that maybe kpartx
had not finished writing the partition.  This probably wasn't a bad
first assumption, since we used to have this -- but is seems
insufficient.

The other failiure here seems to be if kpartx hasn't actually seen the
updated partition table in the image, so it has correctly (in it's
mind) not mounted the partition.

Looking at strace of fdisk run manually on a loopback, it will do a
fsync on the raw device after writing and then a global sync as it
exits.

This replicates this; we flush and fsync in mbr.py in the exit handler
after writing the partition, before closing the file (i've updated one
of the unit tests to double-check the call).  In the partitioning.py
caller we execute a sync call too.

Since it does seem unlikely the "-s" option of kpartx is not working,
I've removed the udev settle work-around too.

Change-Id: Ia77a0ffe4c76854b326ed76490479d9c691b49aa
Partial-Bug: #1698337
2017-06-19 17:13:36 +10:00
Michael Johnson
250aeb5d21 Fix mkfs failure when loop device is not ready
There was a race in diskimage-builder where the mkfs call after a
kpartx -avs for the loop device would fail because the device was
not yet ready.  This adds a udevadm settle call after the kpartx
to make sure the udev event queue has cleared.

Change-Id: I90103b59357edebbac7a641e8980cb282d37561b
Closes-Bug: #1698337
2017-06-17 09:00:13 +10:00
Ian Wienand
6c394f5746 Pass all blockdevices to bootloader
Currently we only export "image-block-device" which is the loopback
device (/dev/loopX) for the underlying image.  This is the device we
install grub to (from inside the chroot ...)

This is ok for x86, but is insufficient for some platforms like PPC
which have a separate boot partition.  They do not want to install to
the loop device, but do things like dd special ELF files into special
boot partitions.

The first problem seems to be that in level1/partitioning.py we have a
whole bunch of different paths that either call partprobe on the loop
device, or kpartx.  We have _all_part_devices_exist() that gates the
kpartx for unknown reasons.  We have detach_loopback() that does not
seem to remove losetup created devices.  I don't think this does
cleanup if it uses kpartx correctly.  It is extremley unclear what's
going to be mapped where.

This moves to us *only* using kpartx to map the partitions of the loop
device.  We will *not* call partprobe and create the /dev/loopXpN
devices and will only have the devicemapper nodes kpartx creates.
This seems to be best.  Cleanup happens inside partitioning.py.
practice.  Deeper thinking about this, and more cleanup of the
variables will be welcome.

This adds "image-block-devices" (note the extra "s") which exports all
the block devices with name and path.  This is in a string format that
can be eval'd to an array (you can't export arrays).

This is then used in a follow-on
(I0918e8df8797d6dbabf7af618989ab7f79ee9580) to pick the right
partition on PPC.

Change-Id: If8e33106b4104da2d56d7941ce96ffcb014907bc
2017-06-08 17:14:22 +10:00
Ian Wienand
1d1e4ccb3e Move rollback into NodeBase object
Currently we pass a reference to a global "rollback" list to create()
to keep rollback functions.  Other nodes don't need to know about
global rollback state, and by passing by reference we're giving them
the chance to mess it up for everyone else.

Add a "add_rollback()" function in NodeBase for create() calls to
register rollback calls within themselves.  As they hit rollback
points they can add a new entry.  lambda v arguments is much of a
muchness -- but this is similar to the standard atexit() call so with
go with that pattern.  A new "rollback()" call is added that the
driver will invoke on each node as it works its way backwards in case
of failure.

On error, nodes will have rollback() called in reverse order (which
then calls registered rollbacks in reverse order).

A unit test is added to test rollback behaviour

Change-Id: I65214e72c7ef607dd08f750a6d32a0b10fe97ac3
2017-06-08 17:14:20 +10:00