I just saw in the trace output of a failure
> grep -o 'CentOS-.[^>]*GenericCloud-.[^>]*.qcow2'
> sort -r
> head -1
sort: fflush failed: 'standard output': Broken pipe
sort: write error
i.e. the "head -1" has exited after reading one line, but "sort -r"
still wants to write and thus has hit a pipe failure, and because we
run with "-o pipefail" this has halted the script.
This seems like it has been there more or less forever, maybe we just
got lucky hitting it now? Anyway, we can work around this by using a
process substitution and passing the output of this into head, this
way we won't hit a pipe failure.
I also updated the fedora path as it does the same thing.
Change-Id: I44d97e5bb31702aacf396e0229329a2ef9c64f2f
We've not really been using the Focal containerfile, as we move
forward jammy is a better choice for keeping stable as we might find
some new users for it.
Also add binutils to bindep for native bullseye builds (see
Icb0e40827c9f8ac583fa143545e6bed9641bf613)
Change-Id: I22ebe2bbccaec34180e58996b21e47bfc4f36055
03-reset-bls-entries was previously a pre-install script to run after
the machine-id was set, but a new kernel may be installed during the
install phase, which will install another bls entry file with a
filename which differs from the machine-id.
This means this package installed bls file won't be updated when
grub2-mkconfig is called, resulting in incorrect kernel args and boot
device in the entry file that will get booted by default.
By fixing the filenames after the new kernel is installed,
grub2-mkconfig will update the bls file that actually gets used on
boot.
Change-Id: I653bef9638e38ded68458fd40d90e30e5206caad
According to the systemd documentation[1], if /etc/machine-id is empty
it will be populated with a unique value, but not in a way which
triggers an actual first boot event (running units with
ConditionFirstBoot=yes set)
This change writes "uninitialized" to /etc/machine-id to ensure that
systemd-firstboot.service actually runs, and other units can use
first-boot-complete.target as a dependency to trigger on first boot.
Since /var/lib/dbus/machine-id is sometimes a symlink to
/etc/machine-id, it is truncated before writing to /etc/machine-id.
On older versions of systemd before first boot semantics were
formalised, any non-uuid value will trigger a new machine-id to be
generated, so "uninitialized" also works.
[1] https://www.freedesktop.org/software/systemd/man/machine-id.html#First%20Boot%20Semantics
Change-Id: I77c35e51a3da2e8a6b5a2c80d033a159b303c9af
This started a long way from here, when I noticed that "top" on centos
9-stream images wasn't working because ncurses-base wasn't installed.
This led me to the extant install of bash/glibc/ncurses-libs from
Iecf7f7e4c992bb23437b6461cdd04cdca96aafa6. However it didn't really
explain why these are brought in here.
Reading further it became clearer that over the years of distribution
additions, Fedora updates, etc. this has grown into a bit of a mess.
Refactor the release package installs into a more logical flow,
pulling out checks/comments for Fedora's of ancient history, etc.
Remove the 9-stream package installs; this isn't the place for them,
and the should be brought in by the base packages.
Ultimately, this is intendend to a be a no-op refactor.
Change-Id: Ie7d9a6497d0d20a3303ec0da3d0668c74efa2c3d
The recent git ownership-checking changes (see related bug for full
details) mean we can not run git in non-owned directories.
We have a couple of cases here where we have done a "pushd" to work in
the REPO_DEST context; this is the destination directory that is
inside the chroot so needs to be operated on as "root" (via sudo
calls). This certainly makes sense -- but given the new way of things
it can hide what context each call is working in, which is now very
important. Previously this worked because you could read it; now it's
doing the UID check too, calls in here without sudo now fail.
Remvoe the pushd's and make every call that works in REPO_DEST
explicit with -C, and add sudo calls around it.
Change-Id: Id1f6bd94c9c77ef6ab2b562a7e0bc48f749c58ac
Related-Bug: https://bugs.launchpad.net/devstack/+bug/1968798
The bootloader element installs the grub bootloader for whole-disk
images, but it also correctly sets values in /etc/default/grub and BLS
entries.
This value setting is useful even if the bootloader isn't installed.
For example, the overcloud-full partition image benefits from a
correct /etc/default/grub and BLS entries which ironic-python-agent
will use when it installs grub on the disk during baremetal deploy.
This change moves the actual grub install to the end of the script,
and if there is no $DIB_BLOCK_DEVICE set then install is skipped.
This allows overcloud-full to use the bootloader element instead of
the grub2 element, so the correct grub defaults are set on centos9,
including the correct root device on centos9.
Change-Id: I8cb34914bbbfa05521bbb71cc6637368b980358f
This reverts the mirror removal in
I817b412b7f06523df635e8b16111bc1081b40f66 and updates the test to F35,
which is mirrored.
Change-Id: I00d24690f57bedd3fc5ebbc18de0ed874ad1e4ef
In some build environments Docker is already installed - and adding
podman is not an option. Add a new variable to toggle this, and
rename the now incorrectly titled DIB_CONTAINERFILE_PODMAN_ROOT to
just ...RUNTIME_ROOT to match.
Change-Id: I677e4f491b40360dceabdf4f2a9e64c7cb493dc7
This adds a check for the root device having filesystem type btrfs,
and when it is assume there is a subvolume called "root". This fixes
extract-image when using Fedora-Cloud-Base btrfs images.
This should be sufficient until there is another btrfs base image with
a different subvolume layout.
Change-Id: Ib18979090585ba92566e523951b521b9d902fcb7
This change is proposed again, avoiding lsblk features missing from
older distros:
- lsblk is avoided entirely for a whole-disk image with a single
partition, which would be the majority of old image building jobs
- Field PARTTYPENAME not available on the lsblk in CentOS-8, instead
rely on the GUID being correct for EFI partitions
- Argument --output-all not available on the lsblk in CentOS-7, this
is just for logging debug, so can be removed
This reverts commit b06bac734c.
Change-Id: Ib0d4e7751fd968511fc7f672d524e58d1488ae11
This reverts commit 0630b3cb69.
Reason for revert: breaks compatibility with CentOS Stream 8, lsblk does not have PARTTYPENAME until version 2.35 and CS8 has version 2.32.1 installed
Change-Id: I7fc0e76f0eeb8594d8a0d57629b2c67526b961ad
Rocky Linux is very similar to CentOS 8. CentOS 8 required and forced
NetworkManager with glean so we update dib to do the same for Rocky.
Change-Id: I145e57d61059c2f34dc2d4810e83809b71c6aade
RHEL-9 base images are whole-disk images with the /boot/efi partition
correctly set up for EFI Secure Boot. This doesn't work with
extract-image because it only mounts the root partition, leaving
/boot/efi empty even though grub2-efi & shim packages are "installed".
This change mounts discovered partitions to mnt/boot, mnt/boot/efi so
all content can be extracted from the image.
Partition detection is done by reading block device attributes and
matching on Boot Loader Specification[1] UIDs or labels as observed in
supported base images.
[1] https://systemd.io/BOOT_LOADER_SPECIFICATION/
Change-Id: I8487002a18ae6ca98609ab68d92ae9173a2b864f
Similar to the CentOS-9-Stream fix [1] this change renames the default
BLS entry to match the current machine-id so that grub2-mkconfig calls
will refresh the kernel options.
However there is an additional issue with the rhel-9 base image. It is
unique in having a dedicated boot partition, so the path to the kernel
and initramfs don't include /boot. This results in an unbootable image
when /boot is a directory of the root partition.
These paths do not get corrected by calling grub2-mkconfig, so this
change performs a sed on the paths to fix them for a root partition
/boot.
[1] I327f5e7a95e47905c01138c8c4483f3f03e8efff
Change-Id: I37a1d310e1854f4a49725e355d484e456ea4fc7a
The check removed here came in with
I4481b43e4a8fe4144be9c7eb9d9c618bbb2df21e a long time ago. At that
time we were not building EFI images, and were building i386 images;
both of which are now untrue.
We can simplify this now by merging it into the gpt/mbr path. If we
are in there we know that we should set --target=i386-pc for BIOS
boot. For sanity check that we are x86 in this path -- PPC is handled
separately (although it's probably bit-rotted) and ARM64 is EFI.
Change-Id: Ie9839c9adc642b0dd688bced3faa46e9314e9799
Co-Authored-By: Clark Boylan <clark.boylan@gmail.com>
OpenDev relies on the epel role to configure the epel repository for our
image builds. Specifically we need epel to pull in haveged. Update the
epel role to recognize rocky and configure it properly.
Change-Id: I968d4702ef39590e972b782a09e18a5db40703ad
This fixes a regression introduced by
Ia99687815667c3cf5e82cf21d841d3b1008b8fa9
It turns out that [[ -d /usr/lib/grub/*-efi ]] is not a good check,
because [[ doesn't split that and try to glob match ( [ would ). This
has resulted in us triggering this path on ARM64.
This is an x86-64 only check, because on other platforms we either
don't support EFI or are EFI only. Restrict this check to get arm64
working again.
Change-Id: I6a75f8504826bcb0ac122d53dfb9faff975077f4
Gentoo updated the layout and files for vaidating stages
At least we can validate cryptographically and infer valid checksum now.
https://www.gentoo.org/news/2022/02/17/changed-signatures.html
Change-Id: I708b44419ae53dec2c19a2210ef427dcd2eb6002
Signed-off-by: Matthew Thode <mthode@mthode.org>
The fedora and fedora-minimal elements currently test Fedora <=34. The
opendev ci mirrors no longer mirror Fedora 34 which means we cannot rely
on those mirrors for these tests. Remove the mirror configuration when
testing these older Fedora builds.
Change-Id: I817b412b7f06523df635e8b16111bc1081b40f66
The reality is that "stable" is what is tested. This tries to give
enough info that users can ascertain what tests are running at any
given time and hence what elements are known to be working.
Additional, clarify the Fedora position in the README as now described
by above.
Closes: #1653561
Change-Id: Ifb91b9089790897861bd7e671c3dba59adac239d
GRUB_OPTS has never been documented as externally available, and is
not used. Assume it's value to simplify the code.
Move the grub version check separately, as we only support grub2
Remove references to buliding i386 images. I don't image it works in
any way.
Remove ci.md, which is no longer relevant.
Refactor the test for "building BIOS image on EFI system" consiberably
after these changes.
Change-Id: Ia99687815667c3cf5e82cf21d841d3b1008b8fa9
The dhcp-all-interfaces element does not work with the predictable names
scheme, fallback to the persistent names scheme as workaround.
Bug: 1960301
Change-Id: I117964a60615a5b7e9984f52f02cd018d1a48ed0
Using rpm -e to remove old kernels fails when other packages also
depend on the removed kernel.
This change reverts back to using dnf to remove the kernel, but also
sets the config value protect_running_kernel=False to avoid the issue
where the build host kernel version matches the version of the package
being deleted.
reverts commit 1ac31afd62.
Change-Id: Ie58630c23a34f2db34f3934abbd0c1076ab9d835
DevStack likes to use LC_ALL=en_US.UTF-8 because it does some
post-processing of output that depends on stable sort-order. Pull in
the langpack package that provides this by default -- fedora-minimal
was doing this via yum-minimal so it makes sense to be equivalent.
Change-Id: I799bcfd73e1cb76ee1808b3441f40b0525e3c73d
AFAICS, use of this was removed with
I7f98a13091056809fedae8a5c8ee10b0ef8bbb2a and I can't see any other
references to it. Correct the comment to describe how it works.
Change-Id: I5123729b7457dcbd4f4a51cff49904f7bd071e9b
Introduce new container image for Rocky Linux, a downstream clone of Red
Hat Enterprise Linux.
Keep non-voting in Check for a while before adding to any gate checks
Signed-off-by: Neil Hanlon <neil@shrug.pw>
Change-Id: Ib383f60bc23b434b400f85c376840a000cafc697
Related-Bug: https://review.opendev.org/805800/
For centos stream, the $releasever is just the major version. Several
of our .repo files are using $releasever in their path, and I think
that 8-stream installs are actually using 8 repos to install from.
For 9-stream, which doesn't have a corresponding 9, we're getting
errors enabling some of the aarch64 tests.
Replace all the $releasever expansions in the .repo files with the
exact version they are being installed for. They don't need to be
generic; we are installing these specific repos for each DIB_RELEASE,
so they don't mix-and-match.
Change-Id: I48d438d8f51280cd060433fc8a67358d8345287f
SUSE dropped OpenStack Cloud in 2019 [1], and as a result, some
OpenStack-related repositories were removed from openSUSE Download and
root filesystem images stopped being provided. This change deprecates
Leap releases before 15.3 and employs the extract-image script. It also
moves the extract-image script to the sysprep element, since now it's
also used by openSUSE-related elements.
Additionally, revert the "Remove opensuse related funtests" change [2]
so that the opensuse element is tested again and set the default Leap
release to 15.3.
[1] https://www.zdnet.com/article/suse-drops-openstacks/
[2] https://review.opendev.org/c/openstack/diskimage-builder/+/824002
Change-Id: I73d6323aa65cee69a55e54bc53ed682f096dfc89
We've moved away from building "stable"/"testing" targets, as they
move over time so you never know what you're building.
These testing targets are unused, remove them to remove confusion.
Change-Id: I2a53f70ed07873b9a408972d2162b6c10b050db5
This does a basic vm build test of bullseye-arm64, which currently is
missing from the ARM64 testing.
To keep runtimes a bit more reasonable, split the job into two parts,
one for deb distros and one for rpm.
Change-Id: I0f28ff92e1b8d08d56b82b392e2cc355d567d007
NetworkManager is quite capable to do automatic
interface configuration. NetworkManager will by default
try to auto-configure any interface with no configuration.
It will use DHCP for IPv4 and Router Advertisements to
decide how to initialize IPv6.
It will most likely do it just as good, or better than the
dhcp-all-interfaces.sh script.
Since dhcp-all-interfaces clean out all ifcfg files in
60-remove-cloud-image-interfaces it means NetworkManager will
by default attempt auto configuration for all interfaces.
This change add's and environment variable:
DIB_DHCP_NETWORK_MANAGER_AUTO (default: false)
When DIB_DHCP_NETWORK_MANAGER_AUTO is set to `true` only the
NetworkManager config will be written. The dhcp-all-interfaces
service will not be installed. Hence dhcp-all-interfaces will
not write any config files, allowing NetworkManager to just do
it's thing.
Change-Id: Id6f8d6aaaf52a78175bb6c065ec88274c364834e
This change:
- adds a note regarding an error when building focal ubuntu-minimal
images on operating systems with older versions of debootstrap
- adds a reference to where the DIB_RELEASE variable definition can be
found
Closes-Bug: #1941831
Change-Id: Ibc1e04dba0562c4f4909a8cb8af041d9b8ac45c4
This change replaces the call to grub2-switch-to-blscfg with a file
rename to update it to the actual machine-id.
grub2-switch-to-blscfg has issues in some build environments:
- When the build host is EFI boot, it assumes the image is, and
fails when config file /etc/grub2-efi.cfg is missing
- With recent cento9 images and a fedora build host it fails with:
grub2-probe: error: cannot find a device for / (is /dev mounted?)
Change-Id: I74ad800b702f2b491d958555cef8d7c7f63d74ac
In the grub2 element the grub2-efi-x64-modules package
is missing in the centos 9 section, this cause a failure
because grub2 cannot find the neccecary files when
installing the bootloader on EFI systems.
It seems grub2-efi-x64-modules was not included in release
9, this is likely why the block was added initially without
this package. Since it is now there, the Centos 9 specific
block is no longer needed.
Removing the rhel 8 block as well, as it is identical to the
family "redhat" block i.e it is redundant.
Closes-Bug: #1957169
Change-Id: Ia6b0ecf0cd15fb23c6740543940ee513a8602afe
This change removes the uninstall grub2-efi which was required for
prerelease rhel-9 images but now breaks current centos-9-stream
images. A different approach may be required for rhel-9 if the base
image remains different to centos-9-stream (such as populating the
empty /boot/efi partition from the base image)
This change also fixes the detection of whether this is an efi build
to check the block device instead of checking for whether a grub efi
package is installed. This fixes building a centos-9-stream whole-disk
image when package grub2-efi-x64 is installed but a legacy fallback
grub also needs to be installed.
Change-Id: I24baf553e1acd15a66737fc0b2a79d5335e28aa5
Partial-Bug: #1957789
This was introduced in [0] but we can include it in the existing elif
series instead.
[0] I2b75afd310f009ae8614f6ca75bb984b56d25c45
Change-Id: Ibe05f367be997efbd8c5ebec77503ebd9cda1c8b
Per the bug mentioned upstream, grub2-mkconfig will currently not set
the kernel options for BLS entries prefixed with a machine-id
different to the running system.
This affects the centos element, as the upstream .qcow2 comes with a
pre-existing BLS entry but a blank machine-id. This only affects
9-stream -- prior releases either don't use BLS or have entries
configured to use a common variable from grubenv which is updated
correctly.
We currently can not end-to-end test this in OpenDev because we run
our functional tests on Ubuntu Focal (they use devstack), whose kernel
can not read the XFS format on the 9-stream .qcow2. This expands the
functional tests (that run on Debian Buster, with a later kernel) to
add the vm element, so the bootloader path is exercised (this requires
a block-device too). This at least runs the bootloader install, we
can confirm the kernel options look right from the dumping provided
the logs.
Change-Id: I327f5e7a95e47905c01138c8c4483f3f03e8efff
The pip_args variable is not initialized when installing pip for
bullseye resulting in an unbound variable error when running
install_python3_pip on that debian version.
This patch fixes the issue moving pip_args inizialization to a common
place.
Change-Id: I1603c97871449b4f73e3062a705d655e9454bf33
A lack of space between package names was causing apt to fail.
[0] I2b75afd310f009ae8614f6ca75bb984b56d25c45
Change-Id: Ia7e005c2f583037ee44a3c364e3b8d79d51e03a2
Debian bullseye has removed python-pip and python-virtualenv
from its repos, let's install only pip and virtualenv python3 modules.
Also split pip installation based on python2 and python3 for
debian-based distributions.
Change-Id: I2b75afd310f009ae8614f6ca75bb984b56d25c45
This reverts I2701260d54cf6bc79f1ac765b512d99d799e8c43,
Idf2a471453c5490d927979fb97aa916418172153 and part of
Iecf7f7e4c992bb23437b6461cdd04cdca96aafa6 which added special flags to
update kernels via grubby.
These changes actually ended up reverting the behaviour on Fedora 35,
which is what led me to investigate what was going on more fully.
All distros still support setting GRUB_DEVICE in /etc/default/grub;
even the BLS based ones (i.e. everything !centos7).
The implementation *is* confusing -- in earlier distros each BLS entry
would refer to the variable $kernelopts; which grub2-mkconfig would
write into /boot/grub2/grubenv. After commit [1] this was reverted,
and the kernel options are directly written into the BLS entry.
But the real problem is this bit from [2]
get_sorted_bls()
{
if ! [ -d "${blsdir}" ] || ! [ -e /etc/machine-id ]; then
return
fi
...
files=($(for bls in ${blsdir}/${machine_id}-*.conf; do
...
}
i.e., to avoid overwriting BLS entries for other OS-boots (?),
grub2-mkconfig will only update those BLS entries that match the
current machine-id.
The problem for DIB is that we are clearing the machine-id early in
finalise.d/01-clear-machine-id, but then running the bootloader update
later in finalise.d/50-bootloader.
The result is that the bootloader entry generated when we installed
the kernel (which guessed at the root= device, etc.) is *not* updated.
Even more annoyingly, the gate doesn't pick this up -- because the
gate tests run on a DIB image that was booted with
"root=LABEL=cloudimg-rootfs" the kernel initially installed with
"install-kernel" (that we never updated) is actually correct. But
this fails when built on a production host.
Thus we don't need any of the explicit grubby updates; these are
reverted here. This moves the machine-id clearing to after the
bootloader setup, which allows grub2-mkconfig to setup the BLS entries
correctly.
[1] 4a742183a3
[2] https://src.fedoraproject.org/rpms/grub2/blob/rawhide/f/0062-Add-BLS-support-to-grub-mkconfig.patch
Depends-On: https://review.opendev.org/c/zuul/nodepool/+/818705
Change-Id: Ia0e49980eb50eae29a5377d24ef0b31e4d78d346
Patch allow to set path for local image source,
instead download latest or use the cached image.
This permit to build image also in environment without internet access.
re-propose of patch: https://review.opendev.org/c/openstack/diskimage-builder/+/809009
Change-Id: I54395b09af339caee040326b809e8fbf8b0e7d6a
A recent(-ish) change in git [1] has exposed a bug in caching that
appears in one very specific circumstance -- updating the
openstack/openstack super-repo [2].
This repo gets a submodule update every time something is pushed. By
using "--git-dir" while the cwd is one-level above the actual repo we
are confusing [1] which is not finding the submodule directories
correctly and giving us an error:
Could not access submodule 'foo'
for every submodule that has updated between now and the last time we
updated the cache. [3]
The git manual does warn about this
If you just want to run git as if it was started in <path> then use
git -C <path>.
Indeed, that is what we want to do in this path. Modify the calls to
use -C.
[1] 505a276596
[2] https://opendev.org/openstack/openstack/
[3] The result for opendev production is that image builds fail every
time an openstack/* project is checked in; we then race to retry
the build before another commit lands and updates the submodules
again.
Change-Id: Iadb23454e29d8869e11407e1592007b0f0963e17