As motivation for this, we have had two breakouts of dib in recent
memory. One was a failure to unmount through symlinks in the core
code (I335316019ef948758392b03e91f9869102a472b9) and the other was
removing host keys on the build-system
(Ib01d71ff9415a0ae04d963f6e380aab9ac2260ce).
For the most part, dib runs unprivileged. Bits of the core code are
hopefully well tested (modulo bugs like the first one!). We give free
rein inside the chroot (although there is still some potential there
for adverse external effects via bind mounts). Where we could be a
bit safer (and could have prevented at least the second of these
breakouts) is with some better checking that the "sudo" calls
*outside* the chroot at least looked sane.
This adds a basic check that we're using chroot or image paths when
calling sudo in those parts of elements that run *outside* the chroot.
Various files are updated to accommodate this check, mostly by just
ignoring it for existing code (I have not audited these calls).
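
A minimal sketch of the kind of check described above. It assumes a
"sane" sudo call is one that mentions a chroot or image path variable
(TMP_MOUNT_PATH, TARGET_ROOT and TMP_BUILD_DIR are used here as
examples) and that reviewed calls are skipped with a
"dib-lint: safe_sudo" comment. The function name, the variable list and
the marker are illustrative assumptions, not necessarily what the
change itself implements.

```shell
#!/bin/bash
# Hypothetical check: report sudo calls that do not reference a chroot
# or image path. Variable names and the opt-out marker are assumptions.
check_sudo_calls() {
    local file=$1
    local rc=0
    local lineno=0
    local line
    while IFS= read -r line; do
        lineno=$((lineno + 1))
        case "$line" in
            # reviewed calls carry an explicit opt-out marker
            *"dib-lint: safe_sudo"*) continue ;;
            # calls that mention a chroot or image path look sane
            *sudo\ *TMP_MOUNT_PATH*|*sudo\ *TARGET_ROOT*|*sudo\ *TMP_BUILD_DIR*)
                continue ;;
            # any other sudo call is reported
            sudo\ *|*[[:space:]]sudo\ *)
                echo "$file:$lineno: suspicious sudo call: $line"
                rc=1
                ;;
        esac
    done < "$file"
    return $rc
}
```

This is a plain text match, so it can be fooled; it is only meant to
catch the obvious cases.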
Nobody is pretending this type of checking makes dib magically safe,
or removes the issues with it needing to do things as root during the
build. But this can help find egregious errors like the key removal.
Change-Id: I161a5aea1d29dcdc7236f70d372c53246ec73749
If the initial yum install into the chroot fails, we can leave behind
a lockfile and an incorrectly modified rpmmacros.
Change this so we run the cleanup unconditionally.
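
As a rough sketch of "cleanup unconditionally", assuming a cleanup
function is passed in by name (install_with_cleanup and its arguments
are hypothetical, not names from the element):

```shell
#!/bin/bash
# Run a command, then always run the cleanup function, and finally
# return the exit status of the original command.
install_with_cleanup() {
    local cleanup=$1
    shift
    local rc=0
    # "|| rc=$?" records a failure without aborting under "set -e"
    "$@" || rc=$?
    "$cleanup"
    return $rc
}
```

A "trap ... EXIT" handler would be another way to get the same effect.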
Change-Id: Ia9f9c4c845e5f34d33ff9a4ab7226c9175283757
I have seen some occasional odd failures coming from the "dnf -y
update" done by elements/base/install.d/00-up-to-date.
dnf sometimes seems to think packages are not installed when they
really are. It then seems to try to re-install them, but notices they
are installed, and then bails with a failure exit [1]. The packages that
seem to cause this vary, but the common thread is that they seem to
have all been installed during the initial phase of installing the
package manager in the chroot.
I suspect that when we are building the chroot, we do our initial
install with the "external" yum & rpm. Then we start using the
dnf/yum in the chroot, but we're actually using meta-data created by
the *external* tools -- which could be vastly different versions or
who-knows-what. While I honestly don't have an exact root cause,
empirically I've found rebuilding the rpm db always seems to fix
things up.
So this change takes care to rebuild the rpm db with the chroot
version of rpm, and clear out the package metadata for a refresh with
"update". This should hopefully put us in a consistent state.
[1] http://paste.openstack.org/show/487356/
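
The rebuild-and-refresh step can be sketched roughly as below. The
"run" helper, the DRY_RUN switch and the use of "clean all" to drop the
package metadata are assumptions made for illustration; by default the
commands are only printed.

```shell
#!/bin/bash
# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Rebuild the rpm db with the rpm *inside* the chroot, then drop the
# cached package metadata so the next "update" fetches it afresh.
refresh_rpm_state() {
    local root=$1
    local pkgmgr=${2:-dnf}
    run sudo chroot "$root" rpm --rebuilddb
    run sudo chroot "$root" "$pkgmgr" clean all
}
```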
Change-Id: I565df23897ae511356c4861fdbe63823fa6b6ff9
We were getting some subtle issues in fedora-minimal builds that
turned out to be because /var/run was not a symlink to /run.
Upon further investigation, it turns out that yum is creating a
/var/run directory for its pid file when it starts working in the
empty chroot (which I verified by stracing it):
---
5905 stat("/home/ubuntu/tmp/dib-tmp/image.Ac4VZZsl/mnt/var/run", 0x7ffddffa0330) = -1 ENOENT (No such file or directory)
5905 mkdir("/home/ubuntu/tmp/dib-tmp/image.Ac4VZZsl/mnt/var/run", 0755) = 0
5905 open("/home/ubuntu/tmp/dib-tmp/image.Ac4VZZsl/mnt/var/run/yum.pid", O_WRONLY|O_CREAT|O_EXCL, 0644) = 6
---
Because this happens *before* we install "filesystem" (the package),
we mess up its symlinking.
To work around this, pre-install the trio of base packages (setup,
basesystem, filesystem) with rpm from outside the chroot.
Change-Id: I411b6ec9d91d95d3a0f98e76853086af3b70abe8
As described in the comment, systemd will create a broken
/etc/resolv.conf link if there is no file in the base-image (as you
can read in the bug, it is debated if this is a bug or a feature).
The solution is to leave a dummy /etc/resolv.conf file in the image.
Whatever network manager you choose (NetworkManager, glean,
cloud-config, etc) will overwrite this anyway.
It's just that some tools, such as dhclient, get confused with the
broken symlink. This affects you if you're using glean to configure
the network in a DHCP situation, for example -- dhclient won't
configure nameservers and everything goes to heck.
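
A small sketch of leaving a placeholder file behind, assuming the image
root is passed as the first argument. The function name is
hypothetical, and replacing an already-dangling link is an extra step
added here for illustration; a real element would also need sudo to
write into the image.

```shell
#!/bin/bash
# Make sure <root>/etc/resolv.conf exists as something readable.
ensure_resolv_conf() {
    local root=$1
    local conf=$root/etc/resolv.conf
    # -e follows symlinks, so a dangling link counts as "missing"
    if [ ! -e "$conf" ]; then
        mkdir -p "$root/etc"
        rm -f "$conf"
        echo "# placeholder; the network manager will overwrite this" > "$conf"
    fi
}
```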
Change-Id: I734834d03e7fdb13f9ab2e86f877b07bf4a84ff9
As described in the comments, CentOS overrides the "distroverpkg"
variable in yum.conf. This is the package that yum queries to
establish the value of the $releasever variable. On other platforms,
this defaults to "redhat-release" (which "fedora-release" provides) so
everything works. It is only when the base-system "distroverpkg"
refers to a package not in the chroot we hit the issue.
We can avoid this by setting the releasever variable via the
commandline.
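
For illustration, a helper that builds such a command line might look
like this. yum_chroot_cmd is a made-up name and the command is only
printed, not run; --installroot and --releasever are the standard yum
options.

```shell
#!/bin/bash
# Print a yum invocation that targets a chroot and pins $releasever
# explicitly instead of relying on the host's "distroverpkg".
yum_chroot_cmd() {
    local root=$1
    local releasever=$2
    shift 2
    echo yum -y --installroot="$root" --releasever="$releasever" "$@"
}
```

Calling "yum_chroot_cmd /mnt 7 install yum" prints the full command,
which makes it easy to see what would be passed to yum.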
Change-Id: I231c3277960992cd479b8aff7838f246397936f2
On Debian/Ubuntu installs of RPM, /usr/lib/rpm/macros sets
%_dbpath %(echo $HOME/.rpmdb)
which makes quite a bit of sense, because RPM is not the system
packager and thus RPM is set up to install things into a hierarchy in
the user's homedir.
However, this messes things up when building a Fedora chroot on an
Ubuntu platform.
We use RPM & yum from the base-system to bootstrap the Fedora chroot.
While both obey --root flags, they still pick up the %_dbpath macro
and so end up creating the RPM database in <chroot>/home/user/.rpmdb
After we have bootstrapped yum/dnf, we execute further installation
commands from inside the chroot -- where we now have the Fedora
version of /usr/lib/rpm/macros and hence have _dbpath set to
/var/lib/rpm -- except there is no rpm database there.
For anyone finding this in the future, the actual issue that
appears is:
$ sudo chroot /opt/dib_tmp/image.b6B5S3f6/mnt dnf makecache
Error: Failed to synchronize cache for repo 'fedora' from \
'https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=x86_64': \
Cannot prepare internal mirrorlist: file "repomd.xml" was not found in metalink
Note the issue there is that $releasever is not expanded, because the
rpmdb where this info is kept is not populated.
The trick is to make sure we override this value when using the host
rpm/yum to set up the chroot. The bare rpm calls, which we use to
install the repos, have a --dbpath argument where we can override
this. yum does not however, so we override this in the global
~/.rpmmacros while we are installing the packaging tools and
dependencies into the chroot.
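
The temporary ~/.rpmmacros override can be sketched as a pair of
helpers. The function names and the ".dib.bak" backup suffix are
assumptions for illustration; the idea is just to pin %_dbpath while
the host yum runs and to put things back afterwards.

```shell
#!/bin/bash
# Pin %_dbpath in ~/.rpmmacros, keeping a backup of any existing file.
set_rpm_dbpath() {
    local macros=$HOME/.rpmmacros
    if [ -e "$macros" ]; then
        cp "$macros" "$macros.dib.bak"
    fi
    echo "%_dbpath /var/lib/rpm" >> "$macros"
}

# Restore the original ~/.rpmmacros, or remove ours if there was none.
restore_rpmmacros() {
    local macros=$HOME/.rpmmacros
    if [ -e "$macros.dib.bak" ]; then
        mv "$macros.dib.bak" "$macros"
    else
        rm -f "$macros"
    fi
}
```

The restore step should run even if the install fails, otherwise the
modified ~/.rpmmacros is left behind on the build host.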
Copious comments are included, because this is super-confusing.
Change-Id: I20801150ea02d1c64f118eb969fb2aec473476f7
fedora-minimal fails to build on Ubuntu Trusty due to being unable to
find the initrd (see Id4c04d7ae20068643df34d2fa31068e8a917a52d).
This is a rather obscure problem that comes from the intersection of
several things.
The first thing to note is that the post-install scripts of the
kernel-core package use kernel-install [1]. For whatever reason, this
installs the kernel to /boot/MACHINE-ID/KERNEL-VERSION
MACHINE-ID comes from /etc/machine-id; a UUID that should have been
created by the systemd post-inst scripts with systemd-machine-id-setup
[2].
The chroot environment provided for root.d elements has no kernel
file-systems like /proc or /dev mounted. This is where differences in
the base-system come into play -- on more recent systems that
implement getrandom() systemd does not need /dev/urandom to generate
the machine-id [3]; we get a value and /etc/machine-id is populated.
On older platforms (Trusty), systemd-machine-id-setup fails (unable to
access /dev/urandom) and we end up with a blank /etc/machine-id. This
ends up making kernel-install (the script) fail during yum's
installation of kernel-core, which means the initrd is not installed
correctly.
We end up bailing out in fedora-minimal/install.d/99-ramdisk, where we
try to put the installed ramdisk in /boot for the later grub install
scripts to find.
The solution here is to mount the standard kernel file-systems within
the chroot before we try installing.
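
A rough sketch of that mount step. Which file-systems count as
"standard" (proc, dev and sys here), the "run" helper and the DRY_RUN
switch are all assumptions for illustration; by default the commands
are only printed.

```shell
#!/bin/bash
# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Mount the kernel file-systems inside the chroot so that tools such
# as systemd-machine-id-setup can reach /dev/urandom.
mount_kernel_fs() {
    local root=$1
    run sudo mkdir -p "$root/proc" "$root/dev" "$root/sys"
    run sudo mount -t proc none "$root/proc"
    run sudo mount --bind /dev "$root/dev"
    run sudo mount -t sysfs none "$root/sys"
}
```

Whatever is mounted this way also has to be unmounted again before the
image is finalised.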
[1] http://www.freedesktop.org/software/systemd/man/kernel-install.html
[2] http://www.freedesktop.org/software/systemd/man/systemd-machine-id-setup.html
[3] https://github.com/systemd/systemd/blob/master/src/basic/random-util.c
Change-Id: Ibcce35da928f64e6a719b070bcc833346ee7ee92
yum-minimal/root.d/08-yum-chroot runs before yum/root.d/50-yum-cache,
and thus if run on a completely fresh system will fail in
08-yum-chroot as the YUM_CACHE directory isn't made.
This is probably hidden by testing & nodepool builds, because they set
DIB_IMAGE_CACHE. It was hidden from me because locally I have done
builds using the "yum" element previously, which had created the
cache.
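
The fix amounts to creating the directory before it is used. A minimal
sketch, assuming the cache lives in a "yum" directory under
DIB_IMAGE_CACHE with ~/.cache/image-create as the fallback (the
function name and the default path are assumptions):

```shell
#!/bin/bash
# Create the yum cache directory if it is missing and print its path.
ensure_yum_cache() {
    local base=${DIB_IMAGE_CACHE:-$HOME/.cache/image-create}
    local cache_dir=${1:-$base/yum}
    # -p makes this a no-op when the directory already exists
    mkdir -p "$cache_dir"
    echo "$cache_dir"
}
```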
Change-Id: I333f5f7e67d198f75a522cc296c118c2e94a5ecb
I'm not sure why we try to do an extra install of these; it is done
inside the chroot in _install_repos. Currently it just gets skipped
saying the packages are already installed.
Change-Id: Ic7aa8cbe13e4347b447e84bb9c12483a4e125228
Add basic F22/dnf support to yum-minimal path. We extract common
code, add some comments and reduce duplication.
Change-Id: If4bd5f88e26bd6f2168958f1ec1efff1072de7ba
Move yum-based install into a function, to make way for a second
related function where we use dnf later.
Change-Id: Iad09f3753ecdfa0c10cb8a0970a3c8e5a2dccab1
fedora-release >= 22 has acquired a dependency on /bin/sh. This comes
from a %posttrans section of the spec file, which is symlinking the
os-release file.
As discussed in [1], the links are set up correctly in the rpm, so the
post-install script isn't doing anything. Thus we can safely ignore
the dependency with --nodeps.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1265873
Change-Id: Icf17c84580a75d42d8e90d5d6e81ae7f5f576c32
The centos-minimal approach of using rinse does not, it turns out, work
on CentOS. That's a bummer. It's also rather heavyweight. Instead, with
minor machinations, we can just use yum itself pointed at a chroot.
Also adding fedora-minimal element which creates a fedora image using
the new yum-minimal approach.
Co-Authored-By: Gregory Haynes <greg@greghaynes.net>
Change-Id: I026fd9d323e786dae5bb67824c6501067e1ceaa3