diskimage-builder/diskimage_builder/elements/bootloader/finalise.d/50-bootloader

261 lines
9.4 KiB
Text

2012-11-09 11:04:13 +00:00
#!/bin/bash
# Configure grub. Note that the various conditionals here are to handle
# different distributions gracefully.
if [ ${DIB_DEBUG_TRACE:-1} -gt 0 ]; then
    set -x
fi
set -eu
set -o pipefail
if [ ${DIB_EXTLINUX:-0} != "0" ]; then
    echo "DIB_EXTLINUX no longer supported"
    exit 1
fi
Pass all blockdevices to bootloader

Currently we only export "image-block-device" which is the loopback device (/dev/loopX) for the underlying image. This is the device we install grub to (from inside the chroot ...)

This is ok for x86, but is insufficient for some platforms like PPC which have a separate boot partition. They do not want to install to the loop device, but do things like dd special ELF files into special boot partitions.

The first problem seems to be that in level1/partitioning.py we have a whole bunch of different paths that either call partprobe on the loop device, or kpartx. We have _all_part_devices_exist() that gates the kpartx for unknown reasons. We have detach_loopback() that does not seem to remove losetup created devices. I don't think this does cleanup if it uses kpartx correctly. It is extremely unclear what's going to be mapped where.

This moves to us *only* using kpartx to map the partitions of the loop device. We will *not* call partprobe and create the /dev/loopXpN devices and will only have the devicemapper nodes kpartx creates. This seems to be best practice. Cleanup happens inside partitioning.py.

Deeper thinking about this, and more cleanup of the variables will be welcome.

This adds "image-block-devices" (note the extra "s") which exports all the block devices with name and path. This is in a string format that can be eval'd to an array (you can't export arrays). This is then used in a follow-on (I0918e8df8797d6dbabf7af618989ab7f79ee9580) to pick the right partition on PPC.

Change-Id: If8e33106b4104da2d56d7941ce96ffcb014907bc
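The "string format that can be eval'd to an array" described in the commit message above can be illustrated with a minimal sketch. The device names and paths here are hypothetical and the exact serialisation produced by the block-device layer is an assumption; only the variable names IMAGE_BLOCK_DEVICES and DEVICES come from the script. It needs bash, since associative arrays are a bash feature.

```bash
# Hypothetical value: the real string is produced elsewhere; these
# keys and paths are invented for illustration only.
export IMAGE_BLOCK_DEVICES="[image0]=/dev/loop3 [boot]=/dev/mapper/loop3p1 [root]=/dev/mapper/loop3p2"

# Arrays cannot be exported through the environment, so the consumer
# rebuilds an associative array from the exported string.
declare -A DEVICES
eval "DEVICES=( $IMAGE_BLOCK_DEVICES )"

echo "boot partition: ${DEVICES[boot]}"
echo "known devices:  ${#DEVICES[@]}"
```

Because the string is passed through eval, it has to come from a trusted producer; that is a reasonable assumption inside a build chroot but would not be for arbitrary input.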
2017-06-06 02:09:24 +00:00
# Some distros have pre-installed grub in some other way, and want to
# skip this.
if [[ -f "/tmp/grub/install" ]]; then
    exit 0
fi
BOOT_DEV=$IMAGE_BLOCK_DEVICE
# All available devices, handy for some bootloaders...
declare -A DEVICES
eval DEVICES=( $IMAGE_BLOCK_DEVICES )
DIB_BLOCK_DEVICE=${DIB_BLOCK_DEVICE:-}
# Right now we can't use pkg-map to branch by arch, so tag an
# architecture specific virtual package so we can install the
# right thing based on distribution.
if [[ "$ARCH" =~ "ppc" ]]; then
    install-packages -m bootloader grub-ppc64
elif [[ "${DIB_BLOCK_DEVICE}" == "mbr" ||
        "${DIB_BLOCK_DEVICE}" == "gpt" ]]; then
    install-packages -m bootloader grub-pc
elif [[ "${DIB_BLOCK_DEVICE}" == "efi" ]]; then
    install-packages -m bootloader grub-efi grub-efi-$ARCH
else
    install-packages -m bootloader grub-pc grub-efi grub-efi-$ARCH
fi
GRUBNAME=$(type -p grub-install) || echo "trying grub2-install"
if [ -z "$GRUBNAME" ]; then
    GRUBNAME=$(type -p grub2-install)
fi
if type grub2-mkconfig >/dev/null; then
    GRUB_MKCONFIG="grub2-mkconfig"
else
    GRUB_MKCONFIG="grub-mkconfig"
fi
if [[ ! $($GRUBNAME --version) =~ ' 2.' ]]; then
    echo "Failure: not grub2"
    exit 1
fi
# Some distros keep things in /boot/grub2, others in /boot/grub
if [ -d /boot/grub2 ]; then
    GRUB_CFG=/boot/grub2/grub.cfg
    GRUBENV=/boot/grub2/grubenv
else
    # NOTE(ianw) This used to be behind a "-d /boot/grub" but this
    # directory doesn't seem to exist for gentoo at this point;
    # something creates it later. So we just fallback to this
    # unconditionally.
    GRUB_CFG=/boot/grub/grub.cfg
    GRUBENV=/boot/grub/grubenv
    mkdir -p /boot/grub
fi
# When using EFI image-based builds, particularly rhel element
# based on RHEL>=8.2 .qcow2, we might have /boot/grub2/grubenv
# as a dangling symlink to /boot/efi because we have extracted
# it from the root fs, but we didn't populate the separate EFI
# boot partition from the image. grub2-install calls rename()
# on this file, so if it's a dangling symlink it errors. Just
# remove it if it exists.
if [[ -L $GRUBENV ]]; then
    rm -f $GRUBENV
fi
Fix BLS based bootloader installation

This reverts I2701260d54cf6bc79f1ac765b512d99d799e8c43, Idf2a471453c5490d927979fb97aa916418172153 and part of Iecf7f7e4c992bb23437b6461cdd04cdca96aafa6 which added special flags to update kernels via grubby. These changes actually ended up reverting the behaviour on Fedora 35, which is what led me to investigate what was going on more fully.

All distros still support setting GRUB_DEVICE in /etc/default/grub; even the BLS based ones (i.e. everything !centos7). The implementation *is* confusing -- in earlier distros each BLS entry would refer to the variable $kernelopts; which grub2-mkconfig would write into /boot/grub2/grubenv. After commit [1] this was reverted, and the kernel options are directly written into the BLS entry.

But the real problem is this bit from [2]

    get_sorted_bls()
    {
        if ! [ -d "${blsdir}" ] || ! [ -e /etc/machine-id ]; then
            return
        fi
        ...
        files=($(for bls in ${blsdir}/${machine_id}-*.conf; do
        ...
    }

i.e., to avoid overwriting BLS entries for other OS-boots (?), grub2-mkconfig will only update those BLS entries that match the current machine-id.

The problem for DIB is that we are clearing the machine-id early in finalise.d/01-clear-machine-id, but then running the bootloader update later in finalise.d/50-bootloader. The result is that the bootloader entry generated when we installed the kernel (which guessed at the root= device, etc.) is *not* updated.

Even more annoyingly, the gate doesn't pick this up -- because the gate tests run on a DIB image that was booted with "root=LABEL=cloudimg-rootfs" the kernel initially installed with "install-kernel" (that we never updated) is actually correct. But this fails when built on a production host.

Thus we don't need any of the explicit grubby updates; these are reverted here. This moves the machine-id clearing to after the bootloader setup, which allows grub2-mkconfig to setup the BLS entries correctly.

[1] https://src.fedoraproject.org/rpms/grub2/c/4a742183a39f344a7685bccdc76d5e64dea3766a?branch=master
[2] https://src.fedoraproject.org/rpms/grub2/blob/rawhide/f/0062-Add-BLS-support-to-grub-mkconfig.patch

Depends-On: https://review.opendev.org/c/zuul/nodepool/+/818705
Change-Id: Ia0e49980eb50eae29a5377d24ef0b31e4d78d346
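The effect of the machine-id glob quoted in the commit message above can be reproduced in a throwaway directory. This is a simulation under stated assumptions, not grub2-mkconfig's code: the helper name count_matching_entries, the machine-id value and the entry file name are all invented, and a cleared machine-id is assumed to behave like an empty string in the glob.

```bash
# Simulation only: mimics the "${blsdir}/${machine_id}-*.conf" glob
# with made-up names in a temporary directory.
blsdir=$(mktemp -d)
touch "$blsdir/0123456789abcdef-5.14.0.conf"

count_matching_entries() {
    # $1 is the machine-id the glob is built from
    n=0
    for bls in "$blsdir"/"$1"-*.conf; do
        # An unmatched glob stays literal, so check the file exists
        if [ -e "$bls" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

with_id=$(count_matching_entries 0123456789abcdef)
cleared_id=$(count_matching_entries "")
echo "entries found with machine-id present:  $with_id"
echo "entries found after machine-id cleared: $cleared_id"
rm -rf "$blsdir"
```

With the id in place the entry is found; with it cleared nothing matches, which corresponds to the situation the message describes where the entry written at kernel install time is never updated.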
2021-11-23 05:30:50 +00:00
echo "GRUB_DEVICE=LABEL=${DIB_ROOT_LABEL}" >> /etc/default/grub
echo 'GRUB_DISABLE_LINUX_UUID=true' >> /etc/default/grub
echo "GRUB_TIMEOUT=${DIB_GRUB_TIMEOUT:-5}" >>/etc/default/grub
echo 'GRUB_TERMINAL="serial console"' >>/etc/default/grub
echo 'GRUB_GFXPAYLOAD_LINUX=auto' >>/etc/default/grub
Correct boot path to cover FIPS usage cases

When you're booting a Linux system using dracut, i.e. with any redhat style distribution, dracut's internal code looks to validate the kernel hmac signature before proceeding to userspace. It does this by looking at the /boot/ folder for the kernel hmac file. And it normally does this with the root filesystem.

Except if the kernel is not on the root filesystem and is instead on a /boot filesystem, this breaks horribly. This is compounded because DIB enables the operator to restructure the OS image/layout to fit their needs.

In order for this to be navigated, as dracut is written, we need to pass a "boot=" argument to the kernel. So now we attempt to purge any prior boot entry in the disk image content, which is good because any filesystem operations invalidate it, and then we attempt to identify the boot filesystem, and save a boot kernel command line parameter so the resulting image can boot properly if FIPS was enabled in the prior image.

Regex developed with https://sed.js.org utilizing stdin:

    VAR="quiet boot=UUID=173c759f-1302-48a3-9d51-a17784c21e03 text"
    VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03"
    VAR="quiet boot=PARTUUID=173c759f-1302-48a3-9d51-a17784c21e03 reboot=meow"
    VAR="quiet boot=UUID=/dev/sda1 text"
    VAR="quiet boot=/dev/sda1"
    VAR="quiet boot=/dev/sda1 reboot=meow"
    VAR="quiet after_boot=1 reboot=meow boot=/dev/sda1"
    VAR="quiet after_boot=1 reboot=meow"

Which resulted in stdout:

    VAR="quiet text"
    VAR="quiet"
    VAR="quiet reboot=meow"
    VAR="quiet text"
    VAR="quiet"
    VAR="quiet reboot=meow"
    VAR="quiet after_boot=1 reboot=meow"
    VAR="quiet after_boot=1 reboot=meow"

Change-Id: I9034c21e84deda2ba2c0ec0d1d6d6595ed10bed4
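The stdin/stdout cases listed in the commit message above can be re-run locally with a small wrapper. The function name strip_boot_arg is a hypothetical helper introduced for this sketch; the sed expression is the one the script applies to /etc/default/grub, and GNU sed is assumed because of the `\+` operator.

```bash
# Same expression the script uses, applied to a string instead of
# editing /etc/default/grub in place. Assumes GNU sed.
strip_boot_arg() {
    printf '%s\n' "$1" | sed 's/\ boot=[0-9A-Za-z/=\-]\+//'
}

# A few of the inputs listed in the commit message
strip_boot_arg 'VAR="quiet boot=UUID=173c759f-1302-48a3-9d51-a17784c21e03 text"'
strip_boot_arg 'VAR="quiet boot=/dev/sda1 reboot=meow"'
strip_boot_arg 'VAR="quiet after_boot=1 reboot=meow boot=/dev/sda1"'
```

The leading space in the pattern is what keeps `after_boot=` and `reboot=` intact: only an argument that starts with ` boot=` is removed.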
2023-03-02 16:43:50 +00:00
# NOTE(TheJulia): We need to remove any boot entry from the /etc/default/grub
# file that may already exist, such as what was added by fips being set up
# either in the source image or by an element, as we repack the image
# with new filesystems.
# Matches any element which looks like " boot=" and the associated value
# in order for us to have a clean starting point to put a value in place,
# if applicable.
# Removes entry trailing with a space, or any entry where boot is set as
# the last argument on the line.
sed -i 's/\ boot=[0-9A-Za-z/=\-]\+//' /etc/default/grub
# NOTE(TheJulia): When using FIPS, dracut wants to evaluate
# the hmac files for the kernel checksum. However, if /boot is
# located on a separate filesystem from the root filesystem,
# then this fails. As a result, we need to identify IF /boot
# is a separate filesystem, and convey this fact as a boot
# argument so dracut does not halt the system on boot.
if [[ -n "${DIB_BOOT_LABEL}" ]]; then
    BOOT_FS="boot=LABEL=${DIB_BOOT_LABEL}"
else
    BOOT_FS=""
fi
if [[ -n "${DIB_BOOTLOADER_SERIAL_CONSOLE}" ]]; then
    SERIAL_CONSOLE="${DIB_BOOTLOADER_SERIAL_CONSOLE}"
elif [[ "powerpc ppc64 ppc64le" =~ "$ARCH" ]]; then
    # Serial console on Power is hvc0
    SERIAL_CONSOLE="hvc0"
elif [[ "arm64" =~ "$ARCH" ]]; then
    SERIAL_CONSOLE="ttyAMA0,115200"
else
    SERIAL_CONSOLE="ttyS0,115200"
fi
GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=${SERIAL_CONSOLE} no_timer_check"
echo "GRUB_CMDLINE_LINUX_DEFAULT=\"${GRUB_CMDLINE_LINUX_DEFAULT} ${DIB_BOOTLOADER_DEFAULT_CMDLINE} ${BOOT_FS}\"" >>/etc/default/grub
echo 'GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"' >>/etc/default/grub
# os-prober leaks /dev/sda into config file in dual-boot host
# Disable grub-os-prober to avoid the issue while running
# grub-mkconfig
# Setting a flag to track whether the entry is already there in grub config
PROBER_DISABLED=
if ! grep -qe "^\s*GRUB_DISABLE_OS_PROBER=true" /etc/default/grub; then
    PROBER_DISABLED=true
    echo 'GRUB_DISABLE_OS_PROBER=true' >> /etc/default/grub
fi
# GRUB_MKCONFIG call needs to happen after we configure
# /etc/default/grub above. Without this we can set inappropriate
# root device labels and then images don't boot.
#
# This produces a legacy config which both bios and uefi can boot
# Later we copy the final config to an efi specific location to
# support uefi specific functionality like secure boot.
$GRUB_MKCONFIG -o $GRUB_CFG
# If we are using BLS, for debugging purposes dump out the kernel
if [[ -e /boot/loader/entries ]]; then
    grubby --info=ALL
fi
# Remove the fix to disable os_prober
if [ -n "$PROBER_DISABLED" ]; then
    sed -i '$d' /etc/default/grub
fi
# Fix efi specific instructions in grub config file
if [ -d /sys/firmware/efi ]; then
    sed -i 's%\(initrd\|linux\)efi /boot%\1 /boot%g' $GRUB_CFG
fi
# When using efi, and having linux16/initrd16, it needs to be replaced
# by linuxefi/initrdefi. When building images on a non-efi system,
# the 16 suffix is added to linux/initrd entries, but we need it to be
# linuxefi/initrdefi for the image to boot under efi.
if [[ ${DIB_BLOCK_DEVICE} == "efi" ]]; then
    sed -i 's%\(linux\|initrd\)16 /boot%\1efi /boot%g' $GRUB_CFG
    # Finally copy the grub.cfg and grubenv to the EFI specific dir
    # to support functionality like secure boot. We make a copy because
    # /boot and /boot/efi may be different partitions and uefi looks
    # for a specific partition UUID preventing symlinks from working.
    if [ -d /boot/efi/$EFI_BOOT_DIR ] ; then
        cp $GRUB_CFG /boot/efi/$EFI_BOOT_DIR/grub.cfg
        # -e is the unambiguous spelling of the "file exists" test
        if [ -e $GRUBENV ]; then
            cp $GRUBENV /boot/efi/$EFI_BOOT_DIR/grubenv
        fi
    fi
fi
# Ensure paths in BLS entries account for /boot being a partition or part of the
# root partition
if [[ -e /boot/loader/entries ]]; then
    pushd /boot/loader/entries
    set +e
    mountpoint /boot
    bootmount=$?
    for entry in *; do
        if [[ $bootmount -eq 0 ]]; then
            sed -i "s| /boot/vmlinuz| /vmlinuz|" $entry
            sed -i "s| /boot/initramfs| /initramfs|" $entry
        else
            sed -i "s| /vmlinuz| /boot/vmlinuz|" $entry
            sed -i "s| /initramfs| /boot/initramfs|" $entry
        fi
    done
    set -e
    popd
    # Print resulting grubby output for debug purposes
    grubby --info=ALL
fi
if [[ ! "$ARCH" =~ "ppc" ]] && [[ -z "${DIB_BLOCK_DEVICE}" ]]; then
    echo "WARNING: No bootloader installation will occur."
    echo "To install a bootloader ensure you have included a block-device-* element"
    exit 0
fi
echo "Installing GRUB2..."
# We need --force so grub does not fail due to being installed on the
# root partition of a block device.
GRUB_OPTS="--force "
if [[ "$ARCH" =~ "ppc" ]] ; then
    # For PPC (64-Bit regardless of Endian-ness), we use the "boot"
    # partition as the one to point grub-install to, not the loopback
    # device. ppc has a dedicated PReP boot partition.
    # For grub2 < 2.02~beta3 this needs to be a /dev/mapper/... node; after
    # that a /dev/loopXpN node will work fine.
    $GRUBNAME --modules="part_msdos" $GRUB_OPTS ${DEVICES[boot]} --no-nvram
else
    # This set of modules is sufficient for all installs (mbr/gpt/efi)
    modules="part_msdos part_gpt lvm"
    if [[ ${DIB_BLOCK_DEVICE} == "mbr" || ${DIB_BLOCK_DEVICE} == "gpt" ]]; then
        if [[ ! "x86_64 amd64" =~ ${ARCH} ]]; then
            echo "*** ${ARCH} is not supported by mbr/gpt"
        fi
        $GRUBNAME --modules="$modules biosdisk" --target=i386-pc \
            $GRUB_OPTS $BOOT_DEV
    elif [[ ${DIB_BLOCK_DEVICE} == "efi" ]]; then
        # We need to manually set the target if it's different to
        # the host. Setup for EFI
        case $ARCH in
            "x86_64"|"amd64")
                # This call installs grub for BIOS compatibility
                # which makes portable EFI/BIOS images.
                $GRUBNAME --modules="$modules" --target=i386-pc $BOOT_DEV
                # Set the x86_64 specific efi target for the generic
                # installation below.
                GRUB_OPTS="--target=x86_64-efi"
                ;;
            # At this point, we don't need to override the target
            # for any other architectures.
        esac
        # If we don't have a distro specific dir with presigned efi targets
        # we install a generic one.
        if [ ! -d /boot/efi/$EFI_BOOT_DIR ]; then
            echo "WARNING: /boot/efi/$EFI_BOOT_DIR does not exist, UEFI secure boot not supported"
            # This tells the EFI install to put the EFI binaries into
            # the generic /BOOT directory and avoids trying to update
            # nvram settings.
            extra_options="--removable"
            $GRUBNAME --modules="$modules" $extra_options $GRUB_OPTS $BOOT_DEV
        fi
    fi
fi
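
The linux16/initrd16 rewrite that the script applies to $GRUB_CFG for EFI builds can be tried on sample input without touching a real config. This is a sketch: the menu entry lines and the helper name to_efi_commands are invented for illustration, while the sed expression is the one used in the script. GNU sed is assumed because of the `\|` alternation.

```bash
# Reads grub.cfg-style lines on stdin and rewrites the BIOS-style
# linux16/initrd16 commands to their EFI equivalents.
to_efi_commands() {
    sed 's%\(linux\|initrd\)16 /boot%\1efi /boot%g'
}

# Invented sample lines, shaped like entries a non-EFI build host emits
converted=$(printf '%s\n' \
    'linux16 /boot/vmlinuz-5.14.0 root=LABEL=cloudimg-rootfs ro' \
    'initrd16 /boot/initramfs-5.14.0.img' | to_efi_commands)
echo "$converted"
```

Lines that already use plain linux/initrd pass through unchanged, since the pattern only matches the 16 suffix followed by a /boot path.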