Commit graph

36 commits

Author SHA1 Message Date
Steve Baker
1a4fb0b89b Do dmsetup remove device in rollback
Without this change, the final unmount will time out after the
rollbacks are called when the partitioning fails due to a user error.

dmsetup remove is called both for partition and LVM volume devices.

Change-Id: I99679ea00338d4018a95d4da9b21685161cd5049
2022-08-18 10:23:41 +12:00
Simon Westphahl
4a424ecabb Use kpartx option to update partition mappings
Fix cases of 'mkfs' failing because the partitions never showed up. Partition
mappings will now be updated instead of just adding them with 'kpartx'. That
means that 'kpartx' will also remove devmappings for deleted partitions.

Traceback of failing mkfs call:

2020-05-11 22:03:25.523 | INFO diskimage_builder.block_device.utils [-] Calling [sudo sync]
2020-05-11 22:03:25.539 | INFO diskimage_builder.block_device.utils [-] Calling [sudo kpartx -avs /dev/loop0]
2020-05-11 22:03:25.581 | INFO diskimage_builder.block_device.utils [-] Calling [sudo mkfs -t ext4 -i 4096 -J size=64 -L cloudimg-rootfs -U 21c6f9eb-4d52-4e5c-b9b7-796735de8909 -q /dev/mapper/loop0p1]
2020-05-11 22:03:25.700 | ERROR diskimage_builder.block_device.blockdevice [-] Create failed; rollback initiated
2020-05-11 22:03:25.700 | Traceback (most recent call last):
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/blockdevice.py", line 406, in cmd_create
2020-05-11 22:03:25.700 |     node.create()
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/level2/mkfs.py", line 133, in create
2020-05-11 22:03:25.700 |     exec_sudo(cmd)
2020-05-11 22:03:25.700 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/utils.py", line 143, in exec_sudo
2020-05-11 22:03:25.700 |     raise e
2020-05-11 22:03:25.700 | diskimage_builder.block_device.exception.BlockDeviceSetupException: exec_sudo failed
2020-05-11 22:03:25.700 | INFO diskimage_builder.block_device.level0.localloop [-] loopdev detach
2020-05-11 22:03:25.701 | INFO diskimage_builder.block_device.utils [-] Calling [sudo losetup -d /dev/loop0]
2020-05-11 22:03:25.732 | INFO diskimage_builder.block_device.level0.localloop [-] Remove image file [/tmp/dib_image.muyw7t1h/image0.raw]
2020-05-11 22:03:25.734 | ERROR diskimage_builder.block_device.blockdevice [-] Rollback complete, exiting
2020-05-11 22:03:25.740 | Traceback (most recent call last):
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/bin/dib-block-device", line 8, in <module>
2020-05-11 22:03:25.740 |     sys.exit(main())
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 120, in main
2020-05-11 22:03:25.740 |     return bdc.main()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 115, in main
2020-05-11 22:03:25.740 |     self.args.func()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/cmd.py", line 36, in cmd_create
2020-05-11 22:03:25.740 |     self.bd.cmd_create()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/blockdevice.py", line 406, in cmd_create
2020-05-11 22:03:25.740 |     node.create()
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/level2/mkfs.py", line 133, in create
2020-05-11 22:03:25.740 |     exec_sudo(cmd)
2020-05-11 22:03:25.740 |   File "/home/zuul/dib/lib/python3.6/site-packages/diskimage_builder/block_device/utils.py", line 143, in exec_sudo
2020-05-11 22:03:25.740 |     raise e
2020-05-11 22:03:25.740 | diskimage_builder.block_device.exception.BlockDeviceSetupException: exec_sudo failed

Change-Id: I374f7f22f9e93ef35eb5813712ca59e75f0733e8
Related-Bug: #1698337
2020-06-09 09:07:55 +02:00
Zuul
3be4b0c1fd Merge "Only detach device if all partitions have been cleaned"
2018-07-31 08:21:27 +00:00
Zuul
d50bd1deb3 Merge "Don't quote names with sgdisk"
2018-07-30 06:26:25 +00:00
Yolanda Robla
64bb87f7b5 Only detach device if all partitions have been cleaned
Currently there is a bug where the device is detached as soon as the
first partition is cleaned up, without considering that there may be
other partitions and volumes on it.  Ensure that the detach is done
only once all partitions have been cleaned, and add a test to verify
that this happens correctly.

Change-Id: I35c5a473509f17a70270a2cbf5bf579faaeb123a
Fixes-Bug: #1777861
2018-07-30 16:24:57 +10:00
Ian Wienand
f94943344f Call kpartx remove in umount, not cleanup
Similar to I697bfbf042816c5ddf170bde9534cc4f0c7279ff, the order of
things called is "dib-block-device umount" *then* "dib-block-device
cleanup".

Because we're doing the "kpartx -d" here in cleanup, it means that the
loop-device is removed in umount phase from level0/localloop.py, then
afterwards we try and remove the partitions.

Change-Id: I7af3c5cf66afd81a481f454b5207af552ad52a32
TODO: a test case to ensure the ordering
2018-06-29 11:22:33 +10:00
Ian Wienand
b0da703f46 Don't quote names with sgdisk
Our sgdisk calls are putting extra double-quotes around the names of
partitions.  This confuses sfdisk, which confuses growpart, which
confuses growroot ... and you don't get your partition grown for EFI
boot.

Ensure we just bunch arguments into the list directly (for Popen)
rather than string split and have to worry about quoting.  Add a check
for this to our GPT unit test, extending it to include a space in the
name of the root partition.

Change-Id: I0a8cb69bb4c9c0865fbaa63ba0d7210028da552e
2018-06-27 18:10:08 +10:00
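
The quoting problem fixed in b0da703f46 can be illustrated with a small sketch. The partition name and device path below are made up for illustration, and nothing is executed; the point is only how the argument list ends up looking when it is built directly versus produced by splitting a pre-quoted string.

```python
# Hypothetical partition name containing a space (illustration only).
name = "ESP root"

# Fragile: format a command string with embedded quotes, then split it.
# The quote characters survive as literal data and the name is cut in two.
as_string = 'sgdisk -c 1:"%s" /dev/loop0' % name
split_args = as_string.split()

# Robust: build the argument list directly, one element per argument.
# Popen-style calls pass each element through unchanged, so no quoting
# is needed at all.
list_args = ["sgdisk", "-c", "1:%s" % name, "/dev/loop0"]

print(split_args)
print(list_args)
```

With the list form, the name reaches the tool as a single argument without any stray double-quotes.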
Ian Wienand
55b479b54f GPT partitioning support
This adds support for a GPT label type to the partitioning code.  This
is a relatively straightforward translation of the partition config
into an sgdisk command line and subsequent call.

A unit test is added based on a working GPT/EFI configuration and the
fedora-minimal functional test is updated to build a single-partition
GPT-based image using the new block-device-gpt override element.  See notes
in the sample configuration files about partition requirements and
types.

Documentation has been updated.

Co-Authored-By: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Change-Id: I6b819a8071389e7e4eb4874ff7750bd192695ff2
2018-02-23 10:04:26 +11:00
Ian Wienand
5d5fa06e5c Sync after writing partition table
We introduced the "settle" in
I90103b59357edebbac7a641e8980cb282d37561b thinking that maybe kpartx
had not finished writing the partition.  This probably wasn't a bad
first assumption, since we used to have this -- but it seems
insufficient.

The other failure here seems to be if kpartx hasn't actually seen the
updated partition table in the image, so it has correctly (in its
mind) not mounted the partition.

Looking at strace of fdisk run manually on a loopback, it will do a
fsync on the raw device after writing and then a global sync as it
exits.

This replicates this; we flush and fsync in mbr.py in the exit handler
after writing the partition, before closing the file (I've updated one
of the unit tests to double-check the call).  In the partitioning.py
caller we execute a sync call too.

Since it does seem unlikely the "-s" option of kpartx is not working,
I've removed the udev settle work-around too.

Change-Id: Ia77a0ffe4c76854b326ed76490479d9c691b49aa
Partial-Bug: #1698337
2017-06-19 17:13:36 +10:00
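
The flush/fsync/sync sequence described in 5d5fa06e5c can be sketched roughly as follows. This is not the code from mbr.py; the helper name `write_and_sync` and the temporary file standing in for the image are assumptions made for illustration.

```python
import os
import tempfile


def write_and_sync(path, data, offset=0):
    """Write data at offset, then flush and fsync before the file closes."""
    with open(path, "r+b") as image:
        image.seek(offset)
        image.write(data)
        image.flush()              # push Python's userspace buffer to the kernel
        os.fsync(image.fileno())   # ask the kernel to commit this file to storage
    os.sync()                      # global sync, mirroring what fdisk does on exit


# Stand-in for an image file: one zeroed 512-byte sector.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 512)
image_path = tmp.name

# Write the two-byte boot signature at the end of the sector.
write_and_sync(image_path, b"\x55\xaa", offset=510)

with open(image_path, "rb") as f:
    sector = f.read()
os.unlink(image_path)
```

The ordering matters: `flush()` only empties Python's buffer, so `os.fsync()` has to come after it and before the file is closed.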
Michael Johnson
250aeb5d21 Fix mkfs failure when loop device is not ready
There was a race in diskimage-builder where the mkfs call after a
kpartx -avs for the loop device would fail because the device was
not yet ready.  This adds a udevadm settle call after the kpartx
to make sure the udev event queue has cleared.

Change-Id: I90103b59357edebbac7a641e8980cb282d37561b
Closes-Bug: #1698337
2017-06-17 09:00:13 +10:00
Ian Wienand
6c394f5746 Pass all blockdevices to bootloader
Currently we only export "image-block-device" which is the loopback
device (/dev/loopX) for the underlying image.  This is the device we
install grub to (from inside the chroot ...)

This is ok for x86, but is insufficient for some platforms like PPC
which have a separate boot partition.  They do not want to install to
the loop device, but do things like dd special ELF files into special
boot partitions.

The first problem seems to be that in level1/partitioning.py we have a
whole bunch of different paths that either call partprobe on the loop
device, or kpartx.  We have _all_part_devices_exist() that gates the
kpartx for unknown reasons.  We have detach_loopback() that does not
seem to remove losetup created devices.  I don't think this cleans
up correctly if it uses kpartx.  It is extremely unclear what's
going to be mapped where.

This moves to us *only* using kpartx to map the partitions of the loop
device.  We will *not* call partprobe and create the /dev/loopXpN
devices and will only have the devicemapper nodes kpartx creates.
This seems to be best practice.  Cleanup happens inside
partitioning.py.  Deeper thinking about this, and more cleanup of the
variables, will be welcome.

This adds "image-block-devices" (note the extra "s") which exports all
the block devices with name and path.  This is in a string format that
can be eval'd to an array (you can't export arrays).

This is then used in a follow-on
(I0918e8df8797d6dbabf7af618989ab7f79ee9580) to pick the right
partition on PPC.

Change-Id: If8e33106b4104da2d56d7941ce96ffcb014907bc
2017-06-08 17:14:22 +10:00
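
The "string format that can be eval'd to an array" mentioned in 6c394f5746 is not spelled out in the commit message. As a sketch only, assuming a bash associative-array style of `[name]=path` items (the helper name and the exact format are hypothetical, not taken from the source), the export could look like this:

```python
import shlex


def to_bash_assoc_items(devices):
    """Render a name -> path mapping as '[name]=path' items for bash.

    Hypothetical format: the result is meant to be placed inside
    'declare -A arr=( ... )' on the shell side.
    """
    return " ".join(
        "[%s]=%s" % (shlex.quote(name), shlex.quote(path))
        for name, path in devices.items()
    )


# Made-up device names and paths for illustration.
devices = {"root": "/dev/mapper/loop0p1", "ESP": "/dev/mapper/loop0p2"}
exported = to_bash_assoc_items(devices)
print(exported)
```

Quoting each name and path with `shlex.quote()` keeps the string safe to eval even if a name contains spaces.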
Ian Wienand
1d1e4ccb3e Move rollback into NodeBase object
Currently we pass a reference to a global "rollback" list to create()
to keep rollback functions.  Other nodes don't need to know about
global rollback state, and by passing by reference we're giving them
the chance to mess it up for everyone else.

Add an "add_rollback()" function in NodeBase for create() calls to
register rollback calls within themselves.  As they hit rollback
points they can add a new entry.  lambda v arguments is much of a
muchness -- but this is similar to the standard atexit() call so we
go with that pattern.  A new "rollback()" call is added that the
driver will invoke on each node as it works its way backwards in case
of failure.

On error, nodes will have rollback() called in reverse order (which
then calls registered rollbacks in reverse order).

A unit test is added to test rollback behaviour

Change-Id: I65214e72c7ef607dd08f750a6d32a0b10fe97ac3
2017-06-08 17:14:20 +10:00
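
The rollback pattern from 1d1e4ccb3e can be sketched as below. The `Node` class is a simplified, hypothetical stand-in for NodeBase that shows only the bookkeeping; the rollback actions are plain strings rather than real device operations.

```python
class Node:
    """Hypothetical stand-in for NodeBase; only rollback handling is shown."""

    def __init__(self, name):
        self.name = name
        self._rollbacks = []

    def add_rollback(self, func, *args, **kwargs):
        # Same shape as atexit.register(): a callable plus its arguments.
        self._rollbacks.append((func, args, kwargs))

    def rollback(self):
        # Undo in reverse order of registration.
        for func, args, kwargs in reversed(self._rollbacks):
            func(*args, **kwargs)


undone = []

loop = Node("loop")
loop.add_rollback(undone.append, "remove image file")
loop.add_rollback(undone.append, "detach loop device")

parts = Node("partitions")
parts.add_rollback(undone.append, "remove partition mappings")

# On failure the driver walks the created nodes backwards.
for node in reversed([loop, parts]):
    node.rollback()

print(undone)
```

Each node owns its own list, so no node can disturb the rollbacks registered by another.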
Ian Wienand
824a9e91c4 Add state to NodeBase class
Making the global state reference a defined part of the node makes
some parts of the block device processing easier and removes the need
for other global values.

The state is passed to PluginNodeBase.__init__() and expected to be
passed into all nodes as they are created.  NodeBase.__init__() is
updated with the new parameter 'state'.

The parameter is removed from the create() call as nodes can simply
reference it at any point as "self.state".

This is similar to 1cdc8b20373c5d582ea928cfd7334469ff36dbce, except it
is based on I68840594a34af28d41d9522addcfd830bd203b97 which loads the
node-list from pickled state for later cmd_* calls.  Thus we only
build the state *once*, at cmd_create() time as we build the node
list.

Change-Id: I468dbf5134947629f125504513703d6f2cdace59
2017-06-08 17:13:26 +10:00
Ian Wienand
b85de3cd9e Add state object, rename "results", add unit tests
A couple of things going on, but I think it makes sense to do them
atomically.

The NodeBase.create() argument "results" is the global state
dictionary that will be saved to "state.json", and re-loaded in later
phases and passed to them as the argument "state".  So for
consistency, call this argument "state" (this fits with the change
to start building the state dictionary earlier in the
PluginBase.__init__() calls).

Since the "state" is a pretty important part of how everything works,
move it into a separate object.  This is treated as essentially a
singleton.  It bundles it nicely together for some added
documentation [1].

We move instantiation of this object out of the generic
BlockDevice.__init__() call and into the actual cmd_* drivers.  This
is because there are two distinct instantiation operations -- creating a
new state (during cmd_create) and loading an existing state (other
cmd_*).  This is also safer -- since we know the cmd_* arguments are
looking for an existing state.json, we will fail if it somehow goes
missing.

To more fully unit test this, some testing plugins and new
entry-points are added.  These add known state values which we check
for.  These should be a good basis for further tests.

[1] as noted, we could probably do some fun things in the future like
make this implement a dictionary and have some safety features like
r/o keys.

Change-Id: I90eb711b3e9b1ce139eb34bdf3cde641fd06828f
2017-05-30 20:39:00 +10:00
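
The two instantiation paths described in b85de3cd9e (a new state during cmd_create, an existing state for the other cmd_* calls) can be sketched as follows. The `State` class, its method names and the stored key are hypothetical; only the "state.json" file name comes from the commit message.

```python
import json
import os
import tempfile


class State:
    """Hypothetical sketch of a state object backed by state.json."""

    def __init__(self, path):
        self.path = path
        self.data = {}

    @classmethod
    def create(cls, path):
        # Used by cmd_create: start with an empty state.
        return cls(path)

    @classmethod
    def load(cls, path):
        # Used by later cmd_* calls: a missing file is an error.
        state = cls(path)
        with open(path) as f:  # raises FileNotFoundError if absent
            state.data = json.load(f)
        return state

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.data, f)


workdir = tempfile.mkdtemp()
state_path = os.path.join(workdir, "state.json")

try:
    State.load(state_path)
    load_failed = False
except FileNotFoundError:
    load_failed = True

created = State.create(state_path)
created.data["blockdev"] = {"root": {"device": "/dev/mapper/loop0p1"}}
created.save()

reloaded = State.load(state_path)
os.remove(state_path)
os.rmdir(workdir)
```

Separating `create()` from `load()` makes the failure explicit when a later phase runs without a saved state.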
Andreas Florath
f314df12c3 Refactor: use lazy logging
As described in pep282 [1], the variable part of a log message
should be passed in via parameter.  In this case the parameters
are evaluated only when they need to be.

This patch fixes (unifies) this for DIB.

A pylint check was added so that this way of passing parameters to
the logging subsystem is enforced in future.  As a blueprint, a similar
(stripped-down) approach from cinder [2] was used.

[1] https://www.python.org/dev/peps/pep-0282/
[2] https://github.com/openstack/cinder/blob/master/tox.ini

Change-Id: I2d7bcc863e4e9583d82d204438b3c781ac99824e
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-05-30 14:39:58 +10:00
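
The difference that f314df12c3 is about can be shown with a small sketch. The `Expensive` class is a made-up helper that counts how often it is converted to a string; the logger name is arbitrary.

```python
import logging

log = logging.getLogger("lazy_logging_example")
log.setLevel(logging.INFO)  # DEBUG messages are disabled


class Expensive:
    """Counts how often it is rendered to a string."""

    renders = 0

    def __str__(self):
        Expensive.renders += 1
        return "expensive value"


value = Expensive()

# Eager: the message is formatted before log.debug() is even called,
# although the message is then thrown away.
log.debug("state is %s" % value)
eager_renders = Expensive.renders

# Lazy: the argument is passed separately and only formatted if the
# record is actually emitted, which it is not at INFO level.
log.debug("state is %s", value)
lazy_renders = Expensive.renders - eager_renders
```

In the lazy form the `__str__` call never happens for a disabled level, which is the saving the commit describes.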
Ian Wienand
deb832d685 Create and use plugin/node abstract classes
This completes the transitions started in
Ic5a61365ef0132476b11bdbf1dd96885e91c3cb6

The new file plugin.py is the place to start with this change.  The
abstract base classes PluginBase and NodeBase are heavily documented.
NodeBase essentially replaces Digraph.Node

The changes in level?/*.py make no functional changes, but are just
refactoring to implement the plugin and node classes consistently.
Additionally we have added asserts during parsing & generation to
ensure plugins implement PluginBase, and get_nodes() is always
returning NodeBase objects for the graph.

Change-Id: Ie648e9224749491260dea65d7e8b8151a6824b9c
2017-05-26 11:48:11 +10:00
Ian Wienand
75817ef205 Use networkx for digraph
This switches the code to use networkx for the digraph implementation.

Note that the old implementation specifically isn't removed in this
change -- for review clarity.  It will be replaced by a base class
that defines things properly to the API described below.

Plugins return a node object with three functions

 get_name() : return the unique name of this node

 get_nodes() : return a list of nodes for insertion into the graph.
  Usually this is just "self".  Some special things like partitioning
  add extra nodes at this point, however.

 get_edges() : return a tuple of two lists; edges_from and edges_to
  As you would expect the first is a list of node names that points to
  us, and the second is a list of node names we point to.  Usually
  this is only populated as ([self.base],[]) -- i.e. our "base" node
  points to us.  Some plugins, such as mounting, create links both to
  and from themselves, however.

Plugins have been updated, some test cases added (error cases
specifically)

Change-Id: Ic5a61365ef0132476b11bdbf1dd96885e91c3cb6
2017-05-26 11:42:10 +10:00
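
The three-function node API listed in 75817ef205 can be sketched as below. Note the swap: the commit uses networkx, while this sketch uses the standard library's graphlib (Python 3.9+) so that it runs without extra packages. The `Node` class and the node names are hypothetical.

```python
from graphlib import TopologicalSorter  # stdlib stand-in for networkx


class Node:
    """Hypothetical node implementing the three-function API."""

    def __init__(self, name, base=None):
        self.name = name
        self.base = base

    def get_name(self):
        return self.name

    def get_nodes(self):
        # Usually just "self".
        return [self]

    def get_edges(self):
        # (edges_from, edges_to): our base points to us, we point nowhere.
        edges_from = [self.base] if self.base else []
        edges_to = []
        return (edges_from, edges_to)


plugins = [
    Node("mkfs_root", base="root"),
    Node("root", base="image0"),
    Node("image0"),
]

sorter = TopologicalSorter()
for plugin in plugins:
    for node in plugin.get_nodes():
        name = node.get_name()
        sorter.add(name)
        edges_from, edges_to = node.get_edges()
        for src in edges_from:
            sorter.add(name, src)   # src must be processed before name
        for dst in edges_to:
            sorter.add(dst, name)   # name must be processed before dst

order = list(sorter.static_order())
print(order)
```

A topological order of the graph gives the sequence in which nodes can be created; walking it backwards gives a cleanup order.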
Ian Wienand
00da1982ce Add a more generic tree->graph parser
This moves to a more generic config parser that doesn't have plugins
parsing part of the tree.

I understand why it ended up that way; we have a "partitions" key which
has special semantics compared to other keys and there was a desire
to keep it isolated from core tree->graph code.  But this isn't really
isolated; you have to reverse-engineer several module-crossing
boundaries, extra classes and repetitive recursive functions.

Ultimately, plugins should have access to the node graph, but not
participate in configuration parsing.  This way we ensure that plugins
can't invent new methods of configuration parsing.

Note: unit tests produce the same tree -> graph conversion as the old
method.  i.e. this is not intended to have a functional change.

Change-Id: I8a5d62a076a5a50597f2f1df3a8615afba6dadb2
2017-05-26 10:13:14 +10:00
Ian Wienand
bc58b5c515 Move parts of Partition creation into object
Move Partition() object creation into the actual Partition object,
rather than having the logic within the Partitioning() object

Change-Id: I833ed419a0fca38181a9e2db28e5af87500d8ba4
2017-05-20 06:44:39 +00:00
Ian Wienand
d013496ba0 Split partition into its own file
Split Partition() into its own file for clarity.  This will be
followed by less dependence between Partitions and Partition

Change-Id: I860f6a1787c0e4fe99f93919ac37cf7d80bfaae9
2017-05-20 06:44:39 +00:00
Ian Wienand
4e08765f87 Move exception to its own file (again)
Moving the exception didn't cause problems in
I925ed62bdc808f0e07862f6e0905e80b50fbe942, but in later changes where
we split blockdevice.py up a bit more, we can get a bit tangled with
circular imports.

Change-Id: I8297483f64c4e1deecd5ec88ee40e9198bb83589
2017-05-20 06:44:39 +00:00
Jenkins
ca04348393 Merge "Remove _config_error thrower"
2017-05-18 02:37:53 +00:00
Ian Wienand
b91207ae47 Remove _config_error thrower
"log and throw" is arguably an anti-pattern; the error message either
bubbles up into the exception, or the handler figures it out.  We have
an example where this logs, and then the handler in blockdevice.py
catches it and logs it again.

Fewer layers are better; just raise the exception, and use log.exception
to get tracebacks where handled.

Change-Id: I8efd94fbe52a3911253753f447afdb7565849185
2017-05-18 10:37:56 +10:00
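
The "just raise, and log where handled" approach from b91207ae47 can be sketched as follows. `ConfigError`, `parse_size` and `main` are hypothetical names used only for illustration.

```python
import logging

log = logging.getLogger("config_example")


class ConfigError(Exception):
    """Hypothetical exception type for this sketch."""


def parse_size(value):
    # Just raise; the message travels with the exception.
    if not value.endswith("GiB"):
        raise ConfigError("size must end in GiB: %r" % value)
    return int(value[:-3])


def main(value):
    try:
        return parse_size(value)
    except ConfigError:
        # Log exactly once, with traceback, where the error is handled.
        log.exception("Invalid configuration")
        return None


good = main("10GiB")
bad = main("10")
```

Because `parse_size()` does not log, the error appears in the output once instead of once per layer.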
Ian Wienand
2d2b2725bd Remove PluginBase/NodePluginBase class
A majority of the "plugins" aren't implementing the plugin class.
Clearly we need some refactoring of the ideas here.  Remove for
simplicity.

Change-Id: If399a371b171f4fd17cfa5856fe55daca4c86e60
2017-05-17 09:03:42 +02:00
Andreas Florath
e4e23897a1 Refactor: block-device filesystem creation, mount and fstab
This patch finalizes the block device refactoring.  It moves the three
remaining levels (filesystem creation, mount and fstab handling) into
the new python module.

Now it is possible to use any number of disk images and any number of
partitions, and to mount them at different directories.

Notes:

 * unmount_dir : modified to only unmount the subdirs mounted by
   mount_proc_sys_dev().  dib-block-device unmounts
   $TMP_MOUNT_PATH/mnt (see I85e01f3898d3c043071de5fad82307cb091a64a9)

Change-Id: I592c0b1329409307197460cfa8fd69798013f1f8
Signed-off-by: Andreas Florath <andreas@florath.net>
Closes-Bug: #1664924
2017-05-12 13:52:02 +02:00
Ian Wienand
47140293b6 Move blockdevicesetupexception.py into blockdevice.py
Less is more when it comes to code :)

Change-Id: I925ed62bdc808f0e07862f6e0905e80b50fbe942
2017-05-11 15:22:41 +10:00
Ian Wienand
2a185ec6b6 block_device: reorder imports
Reorder imports to hacking standard (stdlib, third-party, project) [1]

[1] https://docs.openstack.org/developer/hacking/#import-order-template

Change-Id: I4aa73321e1e796ef6b8b079e42f90bf5c75388fe
2017-05-11 10:38:55 +10:00
Ian Wienand
0d8c4270c0 exec_sudo: check cmd for str, log output and raise exception
To avoid any confusion, commands passed to exec_sudo() should be a
list of "str"s.  Log a message if we see unicode issues.

This also adds a debug trace of all output.  stderr is captured.

This is modified to raise CalledProcessError on failure, like
check_call().  Calls that are ok to fail will need to explicitly catch
and ignore this.

The two calls that we expect to fail are wrapped

We wish to try rolling back if one of these command raises an
exception.  Modify the create handler to initiate rollback on all
exceptions.

Change-Id: Iee4fa41ffaf243e4728bf3a5eeec5c8fa8d2dadc
2017-05-11 09:45:25 +10:00
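
The behaviour described in 0d8c4270c0 (list of str, captured output, exception on failure) can be sketched as below. This is not the real exec_sudo(): the helper `run_checked` is hypothetical, it does not prepend sudo, and it runs the Python interpreter instead of a system command so that the sketch is harmless to execute.

```python
import subprocess
import sys


def run_checked(cmd):
    """Hypothetical helper: run cmd (a list of str) and return its output.

    Raises CalledProcessError on a non-zero exit, like check_call().
    """
    if not all(isinstance(arg, str) for arg in cmd):
        raise TypeError("cmd must be a list of str")
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # capture stderr together with stdout
        universal_newlines=True,
    )
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(
            proc.returncode, cmd, output=proc.stdout)
    return proc.stdout


ok_output = run_checked([sys.executable, "-c", "print('ok')"])

# A call that is allowed to fail must catch the exception explicitly.
try:
    run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
    failed_code = 0
except subprocess.CalledProcessError as e:
    failed_code = e.returncode
```

Raising by default means a forgotten error check shows up as a failure that triggers rollback rather than being silently ignored.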
Yolanda Robla
9b75eda51a Introduce exec_sudo command
Add a new method in the block device library called
exec_sudo, so it can be reused.

This is a partial refactor of change
I592c0b1329409307197460cfa8fd69798013f1f8

Change-Id: Id621f6d029e1275a35c4fd3f19b57c8518076134
Co-Authored-By: Andreas Florath <andreas@florath.net>
2017-05-05 16:11:41 +02:00
Yolanda Robla
08c36e4bf8 Add refactor of tree-like vs graph
Introducing the refactors of the block device to allow a tree-like
configuration, and start using it for the partitions level.
Based on patch I3600c6a3d663c697b59d91bd3fbb5e408af345e4

Change-Id: I58bb3c256a1dfd100d29266571c333c2d43334f7
Co-Authored-By: Andreas Florath <andreas@florath.net>
2017-05-03 05:27:43 +00:00
Yolanda Robla
943f1ccf04 Refactor block_device: isolate the getval call
Add a new getval call that allows values to be retrieved
from the block device.  Also isolate the block device
information into a 'blockdev' dictionary entry, to make it easier to
return with the getval command.

This is a refactor from the original code at
I3600c6a3d663c697b59d91bd3fbb5e408af345e4.

Change-Id: I93d33669a3a0ae644eab9f9b955bb5a9470cadeb
Co-Authored-By: Andreas Florath <andreas@florath.net>
2017-05-01 12:22:52 +02:00
Andreas Florath
57c9e0bb41 Use stevedore for plugin config of block device
This patch introduces the stevedore plugin mechanism for use
with the block device layer.  This makes it possible for
other projects to pass in their own block device plugins.

Change-Id: Id3ea56aaf75f5a20a4e1b6ac2a68adb12c56b574
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-03-24 21:42:47 +00:00
Ian Wienand
a8f2eaded8 Capture output in _exec_sudo
The stdout of the script is captured, so anything coming out from
these commands needs to be captured.  Move to check_process and show
the output as part of an error log in failure case.

Change-Id: I1150375cdc479d4f19b8ddeb49a824ab16fdf831
2017-03-23 09:31:37 +11:00
Yolanda Robla
96504a4de0 Use OrderedDict for partitions instead of simple dictionary
The order of the partitions is important and needs to be preserved.
A simple dict does not do this.  As a consequence,
checks like 'primary partition being first' fail because the
dictionary does not keep the partitions in a defined order.
Switching to OrderedDict solves the problem, as it preserves the
ordering it gets from the yaml blob.

Change-Id: Icfa9bd95ffd0203d7c3f6af95de3a6f848c2a954
2017-03-07 15:55:20 +01:00
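
The ordering guarantee that 96504a4de0 relies on can be sketched as below. The partition names and types are made up for illustration. Since Python 3.7 a plain dict also preserves insertion order, but OrderedDict makes the requirement explicit and also holds on older interpreters.

```python
from collections import OrderedDict

# Hypothetical partition config, inserted in the order it was declared.
partitions = OrderedDict()
partitions["root"] = {"type": "primary"}
partitions["data"] = {"type": "logical"}
partitions["swap"] = {"type": "logical"}

names = list(partitions)

# A check such as 'primary partition being first' can rely on the order.
first_is_primary = partitions[names[0]]["type"] == "primary"
print(names, first_is_primary)
```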
Andreas Florath
866a06f92d Refactor: block-device partitioning cleanup
Now that the main partitioning refactor patch is merged, there is
a small remnant of partition handling still in the disk-image-create
main.

This patch moves the functionality from disk-image-create to the
block-device/partitioning module: it is mostly a rewrite of the
original bash code in python.

Change-Id: Ia73baeca74180a7bc9ea487da03ff56d6a3070ce
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-03-07 18:43:09 +11:00
Andreas Florath
ec7f56c1b2 Refactor: block-device handling (partitioning)
During the creation of a disk image (e.g. for a VM), there is the need
to create, setup, configure and afterwards detach some kind of storage
where the newly installed OS can be copied to or directly installed
in.

This patch implements partitioning handling.

Change-Id: I0ca6a4ae3a2684d473b44e5f332ee4225ee30f8c
Signed-off-by: Andreas Florath <andreas@florath.net>
2017-01-24 19:59:10 +00:00