When running under nodepool in a foreground, non-daemonized situation
without a tty (i.e. within a container) we're seeing this "wait" hang
indefinitely.
It is probably related to "outfilter.py" and output file descriptors,
although TBH we haven't completely root-caused it. I won't claim this
is a great solution, but it should hopefully let the dib process
finish and just die, where outfilter will disappear.
Change-Id: If78da54df3d4c240fee16aee4413ec554b37c1d6
The current implementation evauates the dib-init-system
script too early. Also it looks that there is no simple
way of getting the info about the init system automatically:
another element can install (later on) a different
init system. Therefore the only reliable way of setting
this is manual.
Change-Id: I6e9ffa1bdb3154f488f4fd335b197699b86aacd4
Signed-off-by: Andreas Florath <andreas@florath.net>
I commonly get asked for help when people are attempting to create
local image elements and they cannot get them to work.
diskimage-builder silently ignores element scripts that it doesn't
find to it's liking, such as non-executable or files with extensions
(.sh is a common mistake).
This patch extends the '-x' tracing flag down to dib-run-parts and
will cause it to print out helpful messages when these files would
otherwise be silently ignored.
Examples:
Ignoring non-executable files: 10-do-not-run-me
Ignoring non-conforming filenames: 10-I-can-run.sh
I am not enabling these by default as they can create extra noise
and require additional filesystem IO to produce.
Change-Id: Ic804efca3015c199440b4b10da951d71a815c64f
When the rdisc6 utility is available probe for router
advertisement. configure eni and rhel-netscripts interfaces
to do IPv6 address configuration according to the flags
in the RA recived from the router.
The systemd service file timeout is DIB_DHCP_TIMEOUT * 2,
so that DHCPv4 can timout, and dhcpv6 run before the service
times out.
Retries are commented in dhclient.conf, without it we end up
trying DIB_DHCP_TIMEOUT * 60 before the client move on to
IPv6.
WHEN:
Stateful address conf. : No
Stateful other conf. : No
THEN:
Do not run dhclient at all, autoconfiguration via
SLAAC only.
WHEN:
Stateful address conf. : No
Stateful other conf. : Yes
THEN:
Run "dhclient -6 -S", The ``-S`` option makes the
dhcp client not request an address, only other
options such as DNS servers and NTP servers from
DHCPv6 server.
WHEN:
Stateful address conf. : Yes
Stateful other conf. : Yes
THEN:
The dhcp client should request an address _and_ other
options such as DNS servers and NTP servers from
DHCPv6 server.
NOTE: No IPv6 support added for suse-netscripts
Closes-Bug: 1754219
Change-Id: Icdc79875c33f894ab7eaec8afdfb33a731efff99
Currently DIB_ADD_APT_KEYS only supports GPG armor keys, while
default Debuntu apt gpg keys are in keyring format.
Change-Id: I361c375e25b03a08b19052b10c6733939c8df921
This reverts commit a3e9e7f89e.
We still have some issues with vhd creation on RAX
In short, it appears that images fail to resize unless they have a
specific "creator" field. Revert this while we consider the options.
[1] https://bugs.launchpad.net/nova/+bug/862653
Change-Id: I2b6a3bfbfe28432fbb6a2ce4a0211939d224b8d5
The "ironic-agent" is copied to ironic-python-agent-builder and
hence it is deprecated from DIB.
Remove from functional testing
Change-Id: Ibc4f75b9d7e2a31994fc86d05bd57975f00fb74f
Task: 36198
Story: 2005114
This package is not installed by default on Debuntu, but is on RH
platforms. This is causing a build breakage as DIB_PYTHON_VIRTUALENV
tries to use this (I3414fb9e503f94ff744b560eff9ec0f4afdbb50e).
Add the package.
Change-Id: I9a551c57dd128bbb4b095c847f634c777b2cb553
To ensure dracut does not load nouveau we need to explicitly disable it via
omit_drivers.
This change adds a method to drop in arbitary dracut conf files to an element
which are picked up by dracut-regenerate and included in the chroot where we
run dracut.
The disable-nouveau element just adds a conf file with
`omit_drivers += " nouveau"`
The default dracut conf files in /usr/lib include a similar file to omit the
nvidia kernel modules.
Change-Id: I6375e4843fd08d1410141fbbd8658042dcd5ad05
Closes-bug: 1842664
Seeing this at the end of the tripleo overcloud full build:
99-selinux-fixfiles-restore: line 69: [: too many arguments
Change-Id: I8fb10f3d3d38723b41190ae1898757e6df073945
The vhdutil utility is completely dead; the whole subsystem it relies
on was removed with [1] so it's not even vaguely possible to keep it
up-to-date.
I took the .raw images on a nb and used the qemu-img there (so Xenial)
and generated some VPC images; uploaded them to rackspace and the all
seemed to boot fine. If there was a problem, maybe it's been fixed on
either the qemu or RAX side in the previous few years.
Thus swith to qemu-img to generate the vhd images too.
[1] https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=5c883cf036cf5ab8b1b79390549e2475f7a568dd
Change-Id: I3099d2ebb958370fcec623087a093b2c8dbdc6c4
Add option to set the suite subpath after the release name for the
security mirror URL independently in the debian-minimal element,
since this can differ between mirrors.
Change-Id: I4cc8f54fba012986423e30e19bff276208b8ad62
With the introduction of centos 8 we have constructs like
if [[ $DISTRO =~ (centos|fedora) && $DIB_RELEASE -ge 8 ]]
This is intended to match the "centos7" element (from the =~) but it
was missed that this is setting the DIB_RELEASE to "GenericCloud".
I think it makes more sense for this to be a numeric release, and
makes constructs like above work. There really isn't any other type
of image to choose here; thus we move it into a new, centos7
specific variable.
Note that when the centos 8 images are available, we want to move to a
generic "centos" element that will handle both 7 and 8 together (same
as rhel) based on DIB_RELEASE and deprecate centos7; this works with
that environment too.
Change-Id: I2e6b7848070d6452c0563e2a122447627c6e6bf7
It turns out that this breaks ipv6 config with NM. Instead what we want
is for glean to not up interfaces on boot (see the depends-on).
Change-Id: I6c5bc76c433e29f02d3266ab8f669015125ec954
Depends-On: https://review.opendev.org/#/c/688031
This adds CentOS 8 into functional and boot tests.
This completes centos-minimal support, documentation is updated and a
release note is added.
Change-Id: I435c2967b4f49faeb6d6edf189907b9f96e80357
As described inline, NetworkManager and dhcp-client make up the basic
networking for centos 8 installs; bring them into the base image.
Although in infra we then use simple-init, some other users find this
helpful.
Change-Id: Ib9f32e73bf9109cc1b659fe1deceb1a15301ffeb
By default network-scripts package isn't installed, so the directories
for these files don't exist either. Skip by default for Centos 8.
Change-Id: I194ec3735e17f27e586386541dc51f775b01e510
Use the wrapper calls from Ia267a60eecfa8f4071dd477d86daebe07e9a7e38
to install glean.
Using this wrapper means we cover all cases without more and more
branches; it should work for python2, python3 and also the special
case of RHEL/CentOS where dib-python points to the special
/usr/libexec/platform-python (which is python3.6 with inbuilt pip)
Change-Id: If624e8bb66ce0761fc0d5f34c2bed8b93a7daeee
NetworkManager with simple-init has proven to be stable in OpenStack
infra, switch to it by default for CentOS and Fedora. For CentOS 8
and Fedora, add a check to make it the only option. Thus only CenOS 7
remains optionally using the legacy scripts; this is likely not used
anywhere (infra is really the primary user, where NetworkManager is
already used); we can likely remove this variable (and hence path) in
a future cleanup.
In the setup, remove rhel7 element which was never really tested.
Reorganise the fallthrough to call out the default paths as doing
nothing.
Change-Id: Ic996956da4b85f7d95179b8df9881d5f52c091af
Currently, the serial console is hardcoded to ttyS0 in the bootloader
element. This is a challenge for users that want to build images for
some baremetal servers. Supermicro servers, for example, use ttyS1 for
the serial over lan interface.
This patch adds a new environment variable DIB_BOOTLOADER_SERIAL_CONSOLE
that can be set to override the default.
Change-Id: Ie8173be8690ac0b7164ce9e5b66d3c1c18f844d6
Add option to set the security mirror URL independently in the
debian-minimal element, since this can not be overriden by the
standard DIB_DISTRIBUTION_MIRROR variable.
Change-Id: I145844a410d06a479e68db1bf6d5d0159389305c
As described inline, deprecate the "source" install for CentOS 8.
Overwriting the packaged tools has long been a pain-point in our
images, and the best outcome is just not to play the game [1].
However, the landscape remains complicated. For example, RHEL/CentOS
8 introduces the separate "platform-python" binary, which seems like
the right tool to install platform tools like "glean" (simple-init)
with. However, platform-python doesn't have virtualenv (only the
inbuilt venv).
So that every element doesn't have to hard-code in workarounds for
these various layouts, create two new variables DIB_PYTHON_PIP and
DIB_PYTHON_VIRTUALENV to just "do the right thing". If you need is
"install a pip package" or "create a virtualenv" this should work on
all the platforms we support. If you know more specifically what you
want (e.g. must be a python3 virtualenv) then nothing stops elements
calling that directly (e.g. python3 -m virtualenv create); these are
just helper wrappers for base elements that need to be broadly
compatible.
[1] http://lists.openstack.org/pipermail/openstack-infra/2019-September/006483.html
Change-Id: Ia267a60eecfa8f4071dd477d86daebe07e9a7e38
Don't install the "yum" package, which is a backwards compat around
dnf. With 687003f we should not need the backwards compat links any
more.
Add libcurl to avoid conficts with in the curl "-minimal" packages
that happens on CentOS 8. But skip it on Fedora, because it seems to
create more problems there (not going to pretend it isn't all a
hack ... but it seems to work).
Change-Id: I1de2703eb5075a0a22837b6898bd8eb960d080dd
A few places we either assume centos uses "yum" directly, or have
switching based on the distro type.
In both cases, we can use ${YUM} directly to avoid ambiguity
Change-Id: I71095a9bd1862f8956b5982fbbb3e1d213926c14
The libselinux packages etc don't exist for Python 2 on Centos 8 [1].
Ensure the package map installs the python3 versions.
We could probably invert the logic now, and make it so Centos 7 is the
"special" version that overrides things to install python2. Left
alone for now to avoid changing too much at once.
[1] https://bugs.centos.org/view.php?id=16458
Change-Id: I944cf4f2902c28728aa5bb9e2a00b3eef122d52e
CentOS 8 has the "new" split-up locales packages. Fedora 24 is now
long gone, so take out the old branch and apply the lang package
install to Centos 8 as well.
The manual locale cleanup is not necessary on Centos 8; skip it.
Change-Id: Ib65fc15fe471348793fd6efb034517f11abd905e
The repo format has slightly changed for CentOS 8 (s/os/baseos/).
Make the chroot builder look for a more specific repos.d directory
first named for the distro variable, then fall back to to top-level
dir (this avoids having to constantly change fedora).
Update the gate mirror setup and roles for new Centos 8 paths too.
Change-Id: I5b7f0c3624cac1d7aa7ed8bf6286b85d808b9c9a
This is no longer a valid option for dnf, and it puts out a lot of
warnings constantly about the invalid entry [1]. Remove it.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1653831
Change-Id: Iba0585cab3e5e78e9324196f276b2341e7bb9e3c
Install the Python 3 libselinux packages for Fedora platforms. I
think this is the right choice; Fedora is a Python-3 only distro so we
shouldn't default to installing the python2 libraries.
This has a practical effect if you're using Ansible with
ansible_python_interpreter=/usr/bin/python3 as it needs these
packages.
There is some small chance of breakage if you're using Ansible still
with Python 2, I guess. In infra I notice we bring this in with
"zuul-worker" project-config element. On balance, I think that if you
need the Python 2 packages for some reason, it should be a special
install and not part of redhat-common.
Change-Id: Ibcec0b3660d01b861838c2ae87ca43d98953ce32
Two bugs are addressed.
1) The sysprep element was broken in that it only truncates
/etc/machine-id, but not /var/lib/dbus/machine-id. systemd will
not generate a new machine-id if /var/lib/dbus/machine-id is
present[1], it will simply copy it to /etc/machine-id.
We observed machine-ids being packaged in /var/lib/dbus/machine-id
on several distros: Ubuntu Bionic, Fedora 29, Debian Stretch.
CentOS 7 and Ubuntu Xenial do not contain packaged machine-id as
far as I can tell.
All test builds were performed using -minimal elements.
2) A second bug existed where debian-minimal did not run the sysprep
element at all, so a stretch image I tested contained a populated
/etc/machine-id AND a populated /var/lib/dbus/machine-id.
[1] https://www.freedesktop.org/software/systemd/man/machine-id.html#Initialization
Change-Id: Ibb28b6e90d966a845de38a2cd5a1e8babd2604bc
Similar to https://review.opendev.org/#/c/663693/, the x64 packages
should be used for x86 architectures.
Change-Id: I5e8a4d58e96d65eb60fc539b8a1d56853b12faac
Closes-Bug: 1843820
linux-firmware and linux-firmware-whence (meta package for mostly iwl
firmwares) packages account for approx. 289 M install size on a F30
system, and linux-firmware for approx. 176 M on CentOS 7. Users needing
these firmwares are eventually baremetal users and are not looking for a
very minimal operating system base install like virtual image users are.
Thus, a non-minimal OS element is better suited for them. Alternatively,
it could be later considered a dedicated firmware element.
This is inline with I8ce65e1d357d15e8ed8995ad1dcaea02bbd1986f.
Change-Id: If104fc3c1e9349b8d501a2351fff1ab4c0dbc6a4
This is consistent with the previous simplication of
build targets in the opendev environment to refer to
"opensuse15" being the alias of "latest stable openSUSE Leap 15.x".
Change-Id: I904a3ca0d6dbddd2bb1a673836ab6a0ad249526d
Add a new environment variable $DIB_GZIP_BIN allowing builders to
specify a different gzip (such as pigz) to be used when compressing
tgz images.
Change-Id: Ifb617568140a149e2fda241e07ff8a59429e6697
We have an application breaking because /usr/share/cracklib is being
deleted from the image. The application installs its dependencies,
including cracklib, but since yum shows that cracklib is already
installed, it does not reinstall it.
Change-Id: Id6fccf76c706dbc6c2124abcfd12c1f10cef5e09
Newer openSUSE distributions install an absolute link to /run/netconfig
as /etc/resolv.conf in $TARGET_ROOT. as that points outside
TARGET_ROOT, we unintentionally wipe the system resolv.conf here
and break our ability to finish building the image.
Change-Id: I9d5aaa9fad2f81dcabfe19e2f1e6b6e50af597d7
This is a follow-on to I475a253091cbaf63687b91c748c31a6753bb0f57 as we
are still seeing issues on some clouds with unconfigured networking.
We increase the timeout, but also make it configurable so we can
fiddle it without a dib release in the gate.
To follow-on from the experimentation done by clarkb, I can confirm by
emperical testing on a Centos 7 image (from today, today being this
change's date) that setting
net.ipv6.conf.all.autoconf=0
by itself is "fatal" and the interfaces do not come up; i.e. nm does
not by default seem to re-enable ipv6 for the interface. However,
explicitly adding:
IPV6INIT=yes
IPV6_AUTOCONF=yes
to the interface file *does* seem to make it work, even if
"all.autoconf=0" is set (then again, there's also bugs about the
effect of this [1]). However, no extant distribution (I can currently
find) does anything like this by default.
If this continues, this may be an option. Another might be to avoid
the use of the nm-settings-ifcfg-rh profiles and move directly to nm
ini files with glean.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=11655
Change-Id: I869ebffc8cde3bbff573f6583fd9dd02a5598590
Upstream is now publishing 17.1 profile systemd stages
Also updates the docs that were forgotten in the last patch
Change-Id: I0f2e7976845b1d3c55ffe8869eec0bc04a191252
As described inline, we need to ensure the underlying directories in
the image are correctly labeled, or we get all manner of services
failing during boot with selinux in enforcing mode. Although the
problem is generic, this first shows up in Fedora 30 as systemd has
become more strict about namespace failures (I think) [1].
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1663040#c22
Change-Id: I52c1cc719884879169b606b00651aa26f5b783f1
This patch adds option DIB_YUM_MINIMAL_EXTRA_REPOS to yum-minimal to
allow DIB users to include extra repositories to their final image.
Change-Id: I89549f4b0f4c9470143b5064817acab5043e31c5
Something (possibly [1], but that change is at best cryptic) has
changed such that we don't get correct /etc/os-release files
installed. This flows on to grub half-installing itself, enough to
not fail the build but not enough to make something bootable.
Installing the -cloud release package gets it back, and seems like a
sane choice for dib.
[1] 617b1bed34
Change-Id: Iff0413887fad798273b2bfcb140cc07f36d54a04
As noted in the change, 7fd52ba841
increased the size of the EFI partition considerably. This has meant
that our padding upwards of the disk size is insufficient and EFI
builds (arm64 in particular) is failing due to out-of-disk errors
during final image operations like installing kernels.
Similar to the discussion we had in
I65fa13a088eecdfe61636678578577ea2cfb3c0c, this feels a bit ugly
because we're mixing logic here with sizes specified in block-device
config files. But it boils down to the same problem; we are
calculating the disk size here and passing it to the block-layer, so
unless we want to make large changes to the status quo about where
these sizes are calculated, small adjustments here are the most KISS
solution.
Thus we check if we have selected the EFI bootloader element, and thus
assume there will be a large system EFI partition and expand the disk
size accordingly.
Change-Id: Ifa05366c2f2b95259f3312e4dde8c85347075ba1
This debug statement lists every element found and its dependencies on
every build; it's just noise unless you're debugging the element
dependency solver itself. Remove from output.
Change-Id: I9281b953d958a3fd5e20edbc560a341a2fcc3deb
This seems to miss the exit code of the dracut process, which actualy
caused some issues in I8511669e188717494daf2bc1384a6dd346f942a4 where
it would have been much clearer to stop after the initramfs generation
failed.
Add some debug messages, and catch any errors from the final call.
Change-Id: I6f89441ec4709f5199535e15a7cc53a3a8af273d
Should install "grub2-efi-aa64 grub2-efi-aa64-modules" instead of
"grub2-efi grub2-efi-modules" for arm64
Change-Id: Iee3191b0944b3b862890d166a9d36bd592fe8f7e
Closes-bug: #1839816
- DIB_DISTRIBUTION_MIRROR_UBUNTU_IGNORE matches when it is empty or not
set and DIB_DISTRIBUTION_MIRROR is being used. Checking for it being
set and not empty solves this.
- Normalizing bash conditionals for readability
Closes-Bug: #1808359
Change-Id: I87853fcda4c8b29a3f1720a2778debeb3acc3a53
Signed-off-by: Manuel Torrinha <manuel.torrinha@tecnico.ulisboa.pt>
The 'pypi' mirror element is generating invalid pip.conf files
when more than one "DIB_PYPI_MIRROR_URL" is specified.
This patch fixes the pip.conf file rendering to use a proper form
when there are multiple "extra_index_url" provided.
Closes-Bug: #1839558
Change-Id: Ibda0e7390955683560e09b486f636775643ff57c
python 3.6 warns about regexes like:
DeprecationWarning: invalid escape sequence \+
I noticed that debugging a trove job and it really led me in the wrong
way. Fix this with making it a raw string.
Change-Id: I58ee1a49d62316c6c3f0588832c97f659f7e460b
Per the inline comment, a machine-id is required for kernels to
install correctly (this may well be a bug, but the linked issue
remained inconclusive).
Add a call to make the machine-id before install packages.
Change-Id: If75d04376e62bfdfe14ee3ca4d0bd5c8b383c1b0
Redhat-Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1737355
The 17.1 profile changed the defaults used in portage for where we store
our repo, distfiles and binpkgs. Some portage related variables need to
be set deterministically. 17.1 is no enabled for Systemd's profile.
Change-Id: Ib55f6875c5cb461c3c530b51d7420ce3dc8da360
This element configures systemd to send its journal to the console,
which can then be retreived by server commands. In the case of
nodepool, if the image failed to boot the console will be dumped into
the logs when nodepool decides the node is not responding. Having
this can be very helpful diagnosing early boot errors.
Needed-By: https://review.opendev.org/#/c/669787/
Change-Id: I6b6df7023acb6b2f967b84840bc4b542ebc03727
Newer versions of open-iscsi seem to compile on Gentoo / musl. Use them
if we can. This also removes the cap on open-iscsi.
Change-Id: I596cb61494e459a419bce6a63deff89f9e78fe23
Previously we were trying to enable dbus-daemon service on all prior to
fedora 30. Unfortunately 28 and older don't have this service so this
broke those releases and only worked for 29. Fix this by only enabling
this service on fedora 29.
Change-Id: I1bd15dcf0bbe270afccb0c0c3ea6ad08862a53f1
The linux kernel and NetworkManager fight each other over control for
interface management when router advertisements are in use. Long story
short if the linux kernel configures a network interface for ipv6
before NetworkManager attempts to manage that interface then NM will
ignore the interface and not configure ipv4 on it.
This can happen because the kernel is configured to send router
advertisements solicitations which result in router advertisements which
the kernel uses to configure the interface(s). There is a default of a 1
second delay before sending the solicitation which in many cases is long
enough that NM has started before then. However, in slower environments
like those used for testing with qemu this isn't long enough.
Some testing by hand indicates that 15 seconds is about right so
increase the delay to 15 seconds via sysctl.conf.
Note this may increase boot times in ipv6 only environments (though it
is hard to be sure due to how systemd starts everything at once and does
socket activation and the like).
Change-Id: I475a253091cbaf63687b91c748c31a6753bb0f57
When virtualenv and setuptools gots installed from source and rpm
then their installation path lives at different places but when
the python script got called then that time it choses either of
rpm or source based path on system wide installation and leads to
different failure as their methods are not implemeted.
So by setting _clear_old_files to 0 will install
python3-virtualenv python3-pip python3-setuptools from rpms only
and avoid these failures.
Change-Id: I0c162f1fe8168513e352546ab8dd2b68fa65b88c
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
autounmask=y (default) changes portage depsolving, causing errors
(mostly often seen in perl and binpkg related issues).
Disabling this functionality for DIB builds is OK as the enviroment is
not passed on post build and the build process is not interactive
anyway.
Change-Id: Ife9ace246bec16864ee4982bc456763af5dff2e8
Signed-off-by: Matthew Thode <mthode@mthode.org>
debian-minimal depends on debootstrap which depends on dpkg
This needs to be installed early as dpkg installs the apt keys early via
02-add-apt-keys in pre-install.d
Change-Id: I8580849ceaa7a5152c94f29afa890ac6d6983fb1