There are a couple of loops identified here that output a lot of
tracing each iteration for little value. Make them only trace at
higher levels (-x -x and above).
Points of actual interest within the loops are replaced with an
explicit "echo" statement.
Additionally, export DIB_DEBUG_TRACE explicitly and in all cases, not
just when tracing is turned on.
Change-Id: Id710c0b111fc1f5e1ae87fc35f6db28b24867bad
It was noticed on a very busy system this can take about 1s per loop.
This starts to add up on thousands of processes.
Firstly, prune out all the kernel threads. Then introduce a very
small inline python script to find any pids that seem to be in the
chroot without forking to examine each one. After that the existing
loop just kills anything as before.
Change-Id: Icc7bc7eda80ffcd636f97e6542d70c220e9c225e
Do md5 and sha256 in parallel to speed things up for larger images.
Change-Id: Ib782fe54e4286ba2749a7ab7247f5d41a887a370
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
In the error case, we get a spew of output as this check goes though
every pid checking if its in the chroot. Disable tracing around the
call.
Change-Id: Ie84f12974755c0c2c51d7e7697337ed9b32a4a1c
It has been observed that some chroot operations spawn additional
processes which rely on chroot files. More specifically, zypper, uses
gpg-agent to import and validate gpg keys for its repositories. This
gpg-agent process may stay alive for longer which prevents unmounting of
the tmpfs directory since the gpg-agent process still uses libraries etc
which were present in the chroot. We try to solve this by using walking
all the pids in /proc to find out the running processes in the chroot and
kill them gracefully. If that fails for whatever reason, then we simply
keep trying to umount the tmpfs directory before we give up.
The gpg-agent process usually terminates soon after its home directory
disappears but on fast systems we can reach the 'umount tmpfs' point
before gpg-agent terminates by itself. The solution is generic enough so
other 'chroot processes' can also be handled appropriately.
Change-Id: Iccf332678c79266113e76f062884fc5ee79e515d
In shade, we use both md5 and sha256 checksums to help validate the
integrity of an image. Rather then having nodepool do this each time
for every time, have diskimage-builder create these files when we
build the image.
We've added a flag (disabled by default) to toggle this functionality.
Change-Id: I5815ba69b7d477f1e91dc8ec0c69c86168770964
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
The temporary directories for image creation and building the OS both
started with 'image' as their names followed by some random
characters. During debugging this is annoying, because on first sight
it is not clear, where which files are stored.
This patch renames them to dib_build.XXXXXXXX and dib_image.XXXXXXXX.
This patch introduces no user-visible change:
the temporary directories are only used during the
run of disk-image-builder.
Change-Id: I249cdb7750fe9a746b375b462789cd9b82681a2e
Signed-off-by: Andreas Florath <andreas@florath.net>
There were a couple of functions which were unused:
ensure_nbd, map_nbd, unmount_qcow_image, mount_qcow_image, ensure_sudo
Because some of them use 'trap ... EXIT' this hinders introducing a
separate exit phase - therefore they are removed.
(It would also be impossible to use them in the current setup, because
they overwrite the 'trap ... EXIT' of the disk-image-creates 'main'.)
Change-Id: If932a557dca9aea4864154ad6c4f286373d6dd7c
Signed-off-by: Andreas Florath <andreas@florath.net>
This reverts commit f07e33a2e9.
This change reverts the revert while fixing the underlying issue --
$PIPESTATUS needed to be encapsulated in ${}s
Change-Id: I1df06ffa7aecf4ea4b8e187dc756e9fc779786bc
This reverts commit 0d1d6bec7c.
This patch breaks tripleo-ci (the instack.qcow2 images is failing
to build) and was merged without passing CI.
Closes-Bug: #1582115
Change-Id: Ic4725ad0689c937fb4c8c792e1eaff5f4ea9ada9
In phases which are called from eval_run_d (block-device.d) we do not
listen to exit 1's nor do we allow break=after-error. This is because
the run_d function is called in a subshell in order to grab its output.
This also turns on pipefail in the main disk-image-create script.
Change-Id: I88ab2e7104148437eabfe6880e3a1e5ebbb2c15d
Cleanup this function to work with a symlinked directory. Document
it's behaviour more exactly, and add a simple unit-test for it (not
run by default, due to doing things like mounting and unmounting
system dirs on a live system, which doesn't seem safe for CI. But it
is useful for developers ensuring sanity).
Change-Id: I335316019ef948758392b03e91f9869102a472b9
Due to a bug in how we were running some of our phases we were not
detecting some nonzero exit's. When this is fixed, dib fails early during
cleanup (leaving some resources attatched) due to not propertly
accounting for pipefail.
Change-Id: Icc0b35acbe035cac12a9291e2d07b6c690c3a6ad
With a slow file system, umount can return 0 and the immediately
following remove can fail with a "Device or resource busy" error.
This happened in DevStack in disk-image-create where unmount_image
is followed by an immediate cleanup_build_dir.
Solution is to apply same logic from bug 1332521 to allow the
remove to retry on failure (up to 5s) in case the umount has not
completed.
Change-Id: I3337e2b4ad0111e77f79dc179439cdfea8ebdeda
Closes-Bug: #1527721
Currently when these files are opened your editor doesn't know what to
do with them. Add #!/bin/bash to library functions so that editors,
diff-tools, etc can do syntax highlighting.
There are other ways to skin this cat, such as renaming to ".sh",
adding -* style editor flags, etc. We had this discussion in DevStack
too, and came to the conclusion the simplest thing that works for
everyone is to just put the #! at the top.
Change-Id: I4cf64321e14844696139f5d40e4d719436390b35
Temp dirs are created with mktemp and thus belong to the user. There
is no need to chown them unless we used `mount -t tmpfs`.
Move chown under the tmpfs_check conditional.
Change-Id: I37efe18ced3a06d461364dc5cb20600f1527e995
This reverts commit ea4a823810.
This function was actually still in use in lib/common-functions
and removing it causes the disk-image-get-kernel /bin
to fail entirely.
Change-Id: Icddb3ca369922a6ea915af8b1b62c434cb1bdf28
Closes-bug: 1464031
Split the cleanup_dirs function in two, i.e. cleanup of the build dir
and the image dir, and use the former to cleanup the temporary build
subdirs after their unmount, before the conversion to other disk
formats; they are not needed anyway at that point, and allows to save
disk space during the conversion phase.
Change-Id: Ie30d7e6033613d6979148423326ae7e17a7342e7
This allow custom elements to be added with symlink. Without -follow
a symlinked element is valid but scripts in *.d directory aren't used.
Change-Id: If50b7d9c3b1f6fe278c28488146709efe5cf065f
Closes-Bug: 1461124
By that point in the build it isn't generally useful, and it causes
confusion when builds fail because people think that's the error.
Change-Id: I26dee4ac0947b71a4a065ef6c5a18103e7df6667
Given this is often the final output, it can look like an error occured.
Changing the wording makes this clearer.
Change-Id: I70f157054e3120cffee6fa5241b1ffe0b7bfa650
I regularly see users report that their build fails because this unmount
line reports an error. Even though we dont bail here because of the ||
true, as a user it is hard to distinguish this from an error.
Change-Id: Ic43f4fb24c53c58329fdf501bba6ba14024ec2aa
Deprecated the `--expand-dependencies` flag from `element-info` usage.
The flag was required and not optional. We can rely on argparse to exit non-0
when the required positional argument is not provided.
Change-Id: Iaf8eb962eb600760974bc33c30b809a07a23278e
Closes-Bug: 1265649
When building the ramdisk we don't cleanup the temporary
directories after ourselves. This leaves /tmp/image.* directories
mounted and /tmp/image.* directories on the system.
Also the ramdisk-functions duplicate, from what I can see,
the cleanup function from common-functions. So when a job
is killed off it ends up leaving /tmp/image.* directories
on the system.
Change-Id: I2d73aabd0eb176027b4e7368580db08902e2b6ab
The element builds dracut from source on Ubuntu because the
Ubuntu dracut package is broken and very old, so it can't be
installed properly and causes a number of other issues that
are fixed by using a newer version of Dracut.
This initial version should work in virtualized environments.
Further validation of its suitability for real baremetal
deployments will need to be done in the future, but this should
be sufficient to enable that work.
Regarding Dracut specifically, in order to limit the changes
needed in the existing scripts this element continues to use a
cut down version of the /init script that we were building for the
existing ramdisk. However, instead of running it as pid 0 it is
run as a Dracut pre-mount hook. This allows Dracut to set up all
of the hardware and system bits, while falling early enough in the
Dracut sequence to complete the deployment before Dracut would try
to boot off the hard disk.
bp tripleo-juno-dracut-ramdisks
Change-Id: I144c8993fe040169f440bd4f7a428fdbe3d745cf
Until now there was a possibility for two elements to install hooks
with the same name, so one of them was overwritten. Change logic to
copy the hooks and fail in case one with the same name exists.
Change-Id: Ic2c46835b27c9319f7a889ffd0ccf3f5ccc1f0cd
Closes-Bug: 1251952
After being deprecated two releases ago, finally remove any reference
for the support of first-boot.d
Change-Id: I08d67404ef48cad61db3b18fb86e970abfa5d2b6
When uploading images to multiple clouds it is possible that the same
image will be needed in multiple formats to accomodate hypervisors
across clouds. Update disk-image-create's -t flag to take a list of
desired output image formats so that a single disk-image-create can
output all of the desired image formats.
Change-Id: If121b2342ae888855ba435aa3189f039e985b812
If we entered the cleanup trap due to exit with an error code we should
exit dib with an error code.
Change-Id: Iee1a05668b3239113fb91a2da0d9a66d7de4db6b
A user reported symptoms where the losetup line used to detach the
loopback device was failing in tar mode. We don't need to detach a
device that does not exist.
Change-Id: I807996e16199288927b49b4f300ae9b461cb8fe7
Closes-Bug: #1378033
When running inside a Docker container, we cannot rely on devices in
/dev/mapper to be automagically created by udev, because we probably
don't have a udev at all. To work around this, run dmsetup mknodes
after every kpartx run.
Change-Id: If7e30579224ce54c5ed26d08974d8293c144719a
Now that dib-run-parts has been moved to the dib-utils project, we
need to update diskimage-builder to use it instead of the version
directly in diskimage-builder.
This change removes the old copy of the dib-run-parts script in
the element, adds dib-utils as a dependency of diskimage-builder,
and updates the uses of dib-run-parts to correctly handle the fact
that it is now external to the project.
Requires I0be1f876d0e4a7d38e0d5c6010a552a8ebb158a4
Change-Id: Ia0a0df7784a14c49b5c47ac0b03e6c2602c84b3b
We need to be able to do install.d like things for ramdisks
themselves, but install.d runs outside the ramdisk context - and its
likely to break peoples brains if we mangle the two together - so this
adds a new hook point, ramdisk-install, specifically for installing
things into the ramdisk.
Change-Id: I37d1660309cda6e28bd0b316b08f61db4e080613
The two duplicated functions, save_image and finish_image will move
an existing image out of the way if it exists, but it isn't
configurable. Check an environment variable is 0 before doing so.
Switch save_image to just calling finish_image, rather than
duplicating its code exactly.
Change-Id: I26a5a8fa4b6e853c9440bffab195b0bc3728be40