base_services_start: convert mcelog exception to hcn-init

We've had this 'exception' for mcelog.service failing in here for
years. Looking into it, it seems to now be fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1526725
and hasn't happened in our official instances for years (I guess
because they're all Intel boxes). However, we have a similar case
on ppc64le with hcn-init.service failing spuriously:
https://bugzilla.redhat.com/show_bug.cgi?id=1894654
so I'm just converting it into a workaround for that instead. We
could wire this up to be more sophisticated, with some kind of
array or hash of services that are allowed to fail and more
complex checking code, but let's not bother unless/until it's
necessary.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
Adam Williamson 2020-11-04 09:31:40 -08:00
parent b284e79250
commit 9eef80a85a

@@ -16,15 +16,16 @@ sub run {
     # if we have 0 failed services, we're good
     my $ret = script_run "grep '0 loaded units' /tmp/failed.txt";
     return if $ret == 0;
-    # if only mcelog failed, that's a soft fail
+    # if only hcn-init failed, that's a soft fail, see:
+    # https://bugzilla.redhat.com/show_bug.cgi?id=1894654
     $ret = script_run "grep '1 loaded units' /tmp/failed.txt";
     if ($ret != 0) {
         die "More than one services failed to start";
     }
     else {
-        # fail if it's something other than mcelog
-        assert_script_run "systemctl is-failed mcelog.service";
-        record_soft_failure;
+        # fail if it's something other than hcn-init
+        assert_script_run "systemctl is-failed hcn-init.service";
+        record_soft_failure "hcn-init failed - https://bugzilla.redhat.com/show_bug.cgi?id=1894654";
     }
 }
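
The commit message mentions that this could later be generalized to "some kind of array or hash of services that are allowed to fail". The following is only a rough sketch of what that might look like, not part of this commit. The names `%allowed_failures` and `classify_failed_units` are hypothetical, and the sketch assumes the list of failed unit names has already been extracted by some other means. It is plain Perl with no openQA calls, so it can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical allow-list: unit name => bug URL explaining why a failure
# of that unit is tolerated.
my %allowed_failures = (
    'hcn-init.service' => 'https://bugzilla.redhat.com/show_bug.cgi?id=1894654',
);

# Classify a list of failed unit names against the allow-list.
# Returns ('ok') when nothing failed, ('hard', @unexpected_units) when any
# failed unit is not allow-listed, and otherwise ('soft', @messages) with
# one message per tolerated failure.
sub classify_failed_units {
    my ($allowed, @failed) = @_;
    return ('ok') unless @failed;
    my @unexpected = grep { !exists $allowed->{$_} } @failed;
    return ('hard', @unexpected) if @unexpected;
    return ('soft', map { "$_ failed - $allowed->{$_}" } @failed);
}

# Example: only hcn-init failed, so this is classified as a soft failure.
my ($status, @details) = classify_failed_units(\%allowed_failures, 'hcn-init.service');
print "$status: @details\n";
```

In the test itself, a 'soft' result would presumably map to one `record_soft_failure` call per message and a 'hard' result to `die`, mirroring the two branches in the diff above.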