593 commits

Author SHA1 Message Date
Adam Williamson
9eef80a85a base_services_start: convert mcelog exception to hcn-init
We've had this 'exception' for mcelog.service failing in here for
years. Looking into it, it seems to now be fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1526725
and hasn't happened in our official instances for years (I guess
because they're all Intel boxes). However, we have a similar case
on ppc64le with hcn-init.service failing spuriously:
https://bugzilla.redhat.com/show_bug.cgi?id=1894654
so I'm just converting it into a workaround for that instead. We
could wire this up to be more sophisticated, with some kind of
array or hash of services that are allowed to fail and more
complex checking code, but let's not bother unless/until it's
necessary.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-04 09:34:12 -08:00
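
The "array or hash of services that are allowed to fail" idea from the message above could look roughly like the sketch below. It is written in Python purely for illustration and is not the code in the test suite. The function name, the constant, and the sample input are hypothetical; only the unit name hcn-init.service comes from the commit message.

```python
# Hypothetical allowlist of units whose failure is tolerated.
# Only "hcn-init.service" is taken from the commit message.
ALLOWED_TO_FAIL = {"hcn-init.service"}


def unexpected_failures(failed_units, allowed=ALLOWED_TO_FAIL):
    """Return the failed units that are not on the allowlist, sorted."""
    return sorted(set(failed_units) - set(allowed))


# Made-up input: one tolerated failure and one real problem.
print(unexpected_failures(["hcn-init.service", "example.service"]))
```

A test built on this would fail only when the returned list is non-empty, so adding a tolerated service means adding one name to the set.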
Adam Williamson
92d52c6ac6 Split VNC client steps in two for aarch64 timing issue
So, there's a problem with how we figure out the NetworkManager
connection to use in setup_tap_static: it expects the first
connection in the list to be the right one, but this is only
actually true so long as it's *active*. When we're in the tap
case, it's usually not going to actually *work* out of the box
on boot (or else we wouldn't need setup_tap_static at all...),
so some time after boot, NetworkManager gives up on it and marks
it as inactive. And after that, setup_tap_static won't work any
more.

I never noticed this as a problem before because usually we do
setup_tap_static before that point. But it seems in the vnc
client tests, on aarch64, desktop boot and login is slow enough
that by the time we switch to a VT and try to setup the network,
we're very close to that cutoff, and sometimes miss it.

This, I hope, avoids the problem by doing the network setup in
that test before we deal with the desktop login, then doing the
desktop login, then doing the actual VNC bits.

The alternative here would be to figure out a better way to do
setup_tap_static, but I can't.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 12:39:52 -07:00
Adam Williamson
bcefdd8357 Wait a bit before typing password on GNOME login screen
Seems like this often fails when booting the desktop disk image
on aarch64 if we start typing right away.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 12:39:52 -07:00
Adam Williamson
5341957960 cockpit: wait longer for services screen to load
It seems like it can be *really* slow on aarch64, since 218:
https://github.com/cockpit-project/cockpit/issues/14840
this should give it a total of 180 seconds on aarch64 (90 second
still screen timeout plus 30 second assert_screen timeout, with
1.5x scale).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 12:39:52 -07:00
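
The 180 second figure above can be checked with a small calculation. This is a sketch in Python, assuming (as the message's own arithmetic implies) that the 1.5x scale multiplies both timeouts equally; the function name is hypothetical.

```python
def effective_wait(still_screen_timeout, assert_screen_timeout, scale):
    """Total wall-clock wait in seconds once the timeout scale is applied."""
    return (still_screen_timeout + assert_screen_timeout) * scale


# 90 s still-screen timeout plus 30 s assert_screen timeout, scaled by 1.5.
print(effective_wait(90, 30, 1.5))
```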
Adam Williamson
c7a1b94c84 Enable aarch64 disk image testing, related fixes
This sets us up to test the release-blocking aarch64 disk images
(Minimal, Server and Workstation). It also allows for testing
armhfp disk images on aarch64 worker hosts (though my testing of
that isn't going too well so far), and fixes the initial-setup
handling for a change upstream ('use password' is now the default
so we don't need to choose it). We rewire disk image deployment
test loading to work through the generic loader code rather than
using ENTRYPOINT, as it allows us to more gracefully handle
graphical (Workstation) vs. console (Server, Minimal), moving
the code for handling console initial-setup to a helper function
just like the code for gnome-initial-setup and having _console_
wait_login call it when appropriate. We also tweak desktop_vt a
bit because now we need to switch from a console running as test
to a desktop, which breaks the assumption that the highest
numbered session of user test is the desktop...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 12:39:49 -07:00
Adam Williamson
b57b306d4b _iot_zezere_remote: be careful when quitting firefox
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-28 13:52:07 -07:00
Adam Williamson
505c556c67 support_server: disable systemd-resolved
We're setting up our own (dnsmasq) name server, we can't have
resolved running.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-28 09:07:46 -07:00
Adam Williamson
a9704de4bd FreeIPA: disable dnssec validation till weird bug is fixed
I noticed today that if we deploy FreeIPA with dnssec validation
enabled, dnf can't resolve dl.fedoraproject.org afterwards, which
is a problem because it means we wind up falling through to
random mirrors for metadata and package download once the server
is deployed, which can be slow and give old packages. This seems
to be why the server upgrade test on F33 is sometimes failing
because we get an older FreeIPA package on upgrade, even though
the newer one has been stable for a week.

It's difficult to pin down exactly where this bug is and fix it.
I've mailed some folks to try and work it out, but until that's
figured out, let's just disable dnssec validation.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-23 11:38:01 -07:00
Adam Williamson
0d8ceec820 Try to make desktop_browser more robust
We've been getting failures lately on the first page load, I
think because Firefox is getting even more grindy on startup. So
turn the 'sleep' into a 'wait_still_screen', extend another wait,
and tweak the 'browser' needle so it only matches after the
bookmark bar has loaded rather than as soon as half the chrome
appears. Also make all the wait_still_screens use similarity 45
for consistency (flashing cursor could be there on any of them).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-23 09:48:09 -07:00
Adam Williamson
bd7d3cd663 Fix desktop_terminal command check (thanks defolos)
This check wasn't working: the test passed whatever wait_serial
found. This version, suggested by defolos, works; I checked.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-15 16:25:26 -07:00
Adam Williamson
1c33d07d38 Drop workaround_ble26, bug was fixed months ago
https://pagure.io/background-logo-extension/issue/26 was fixed
months back, we don't need this any more.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-15 15:36:04 -07:00
Adam Williamson
a008ffb8be Simplify desktop notification checks (#195)
This is the best option I can come up with to deal with #195.
Update notifications seem to have become transient in KDE lately
(even in F31 and F32, if I'm looking at these screenshots right).
This actually simplifies things a lot to do more or less the
same in the KDE and GNOME paths: open the 'permanent' store of
notifications (in GNOME you get to it by clicking on the clock,
in KDE via the systray) and then look for no notifications (live
path) or only an update notification (post-install path). We
only run this test for composes so we shouldn't need to worry
about anything older than F32, and I believe this should work
for KDE in F32 and F33. I left out click_unwanted_notifications
for now as I'm hoping it should be unnecessary, but we can add
it back in if necessary.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-14 23:30:00 -07:00
Adam Williamson
94b47afc53 Tweak setup_tap_static and FreeIPA tests for resolved
This does some of the things suggested by cheimes in
https://bugzilla.redhat.com/show_bug.cgi?id=1880628#c24 . It
seems to make the replica tests work with resolved, still work
with pre-F33 resolving, and not break anything. Also remove the
workaround to disable resolved if it's running, as we can now
work with it.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-09 16:54:46 -07:00
Adam Williamson
40974c2f94 Simplify Krusader app test
We don't need a separate 'welcome' needle because it just matches
on an OK button anyway. So turn that needle into an OK needle
(we don't have any existing 'blue OK button' needle) and simplify
the logic to a single loop for kde_ok and krusader_settings_close.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-08 14:47:46 -07:00
Adam Williamson
200cab3899 disk_custom_resize_lvm: add some waits
On ppc64le it looks like this test is often failing because it
takes a second or two to update the partition list after we
click update settings, but we're not waiting for that, so we
wind up clicking in the wrong place because we match the next
partition needle before the list is refreshed but click after
it's refreshed. Let's hope these waits solve it.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-07 14:52:49 -07:00
Adam Williamson
fdf142dbd5 Disable systemd-resolved before deploying FreeIPA server/replica
Having systemd-resolved in use seems to cause problems for
FreeIPA servers:
https://bugzilla.redhat.com/show_bug.cgi?id=1880628
until the scripts are enhanced to do this or something, let's
disable it before server/replica deployment.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-18 13:49:11 -07:00
Adam Williamson
aab6935707 FreeIPA replica: don't re-do setup_tap_static after deployment
ipa-replica-install already changes the DNS config to use the
local bind instance, we don't need to do this and it's actually
wrong (as it bypasses the local BIND we should use and uses
the VM host's DNS servers instead).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-18 13:43:28 -07:00
Adam Williamson
c2e7ddba45 Fix DNS config in realmd_join_sssd and realmd_join_cockpit
Seems what they had before worked until systemd-resolved became
the default; now we need to make sure we do nmcli mod and then
bring the connection down and up, as we do in tapnet.pm. Writing
to resolv.conf is kinda "wrong" for resolved but I don't think
it really breaks anything so I think I'll just leave those bits
in until F32 goes EOL just in case they're still somehow needed
on F31 or F32.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-17 16:01:15 -07:00
Adam Williamson
314f8f84eb Another attempt to improve robustness of desktop_browser
https://openqa.fedoraproject.org/tests/667693#step/desktop_browser/8
shows us matching on Save File when the window is in kind of a
borked state; we'd probably wind up clicking on Open with,
because by the time we click the content of the window will have
moved to where it's actually supposed to be...so let's try this
to slow it down a bit.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-16 11:55:31 -07:00
Adam Williamson
ed5c06baa8 Upload pylorax.log when done building installer image
Handy to have it around to check for oddities.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-15 14:54:44 -07:00
Adam Williamson
a5d37e4c67 Need to bind mount the workarounds repo too
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-15 14:44:34 -07:00
Adam Williamson
e4d89b6d6b Tweak how we add repos to mock config a bit
Not sure if the other way was valid.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-15 14:25:05 -07:00
Adam Williamson
3dd33e3ef1 Use workarounds repo for installer and live build tests
Need this to pull in the kernel fix that's breaking install tests
at the moment.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-15 14:21:07 -07:00
Adam Williamson
9824f20566 Slow down desktop_browser a bit to try and make it more reliable
Getting some odd failures where the downloaded file doesn't show
up in the right place which I think might be due to over rapid
clicking here. Try and slow it down a bit.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-11 16:18:15 -07:00
Adam Williamson
e99a3fbdf9 os_release: adjust for Fedora CoreOS
...which just has to be another special flower.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 16:22:20 -07:00
Adam Williamson
478b7eff9e Add initial CoreOS product and test templates
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 14:49:29 -07:00
Adam Williamson
170ef0733a Use nmcli for static network stuff, not ifcfg files
This should work even if the ifcfg plugin is not present (hi,
CoreOS) or 'predictable' (har) network names are on.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 14:49:29 -07:00
Adam Williamson
52d52c7062 Add a bit of clean up before second run of postgresql-setup
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-01 15:46:10 -07:00
Adam Williamson
a3806af8ee Workaround RHBZ #1872511 by installing langpack
There's a complex bug in current Rawhide affecting the database
server test; it boils down to "deployment fails because LANG is
set to a locale for which the corresponding langpack is not
installed". As we know broadly what's going on there now, let's
work around it with a soft failure so we catch any later bugs in
the process.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-01 15:22:19 -07:00
Adam Williamson
387b09a53a Fix previous zezere change (use single quotes)
Stupid @.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-21 16:03:14 -07:00
Adam Williamson
30ab26fbe6 Tweak _iot_zezere_remote to keep retrying ssh for up to 10 mins
The time before the ssh key provision request goes through turns
out to be kinda unpredictable, so instead of just a hardcoded
wait then assuming it should succeed, let's do a loop-y retry
thing instead.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-21 15:02:01 -07:00
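
The "loop-y retry thing" described above can be sketched as a poll-until-deadline helper. This is an illustrative Python version, not the test's actual code: the function name, the 10 second interval, and the idea of passing the ssh attempt in as a callable are all assumptions. Only the 10 minute ceiling comes from the commit subject.

```python
import time


def retry_until(check, timeout=600, interval=10):
    """Call check() until it returns True or `timeout` seconds have passed.

    Returns True on success and False if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In practice `check` would wrap one ssh connection attempt and return whether it succeeded. Polling this way lets the test finish as soon as the provision request goes through, instead of always paying for the longest expected wait.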
Adam Williamson
b9f6ecd72d Conditionalize FreeIPA UI change, add 4.8.9 update to workarounds
The FreeIPA UI change that the previous commit adapted to is in
4.8.9. That's stable for Rawhide and F33 already, but still in
testing for F32, and won't go to F31. So we need to make the
change conditional on release number, and we also add the update
to workarounds for F32 so we don't have to do something awkward
while we wait for it to go stable.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-21 14:00:42 -07:00
Alexander Bokovoy
5713631a9c Update password change needle and code to FreeIPA 4.8.9
OTP field was moved into the last position in the password change dialog
to prevent issues with OTP code expiring while users enter their
passwords.

Signed-off-by: Alexander Bokovoy <abokovoy@redhat.com>
2020-08-21 18:05:33 +03:00
Adam Williamson
7682872d95 desktop_login: update reboot flow for GNOME changes in F33+
GNOME now also splits 'Restart...' and 'Power Off...' as KDE
does, so we need to tweak the conditional and add some needles.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-17 16:25:32 -07:00
Adam Williamson
025949f483 FreeIPA: fix reverse zone for 172.16.2 network use
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-12 11:18:41 -07:00
Adam Williamson
232b224d22 Add 'with swap' tests, drop swap parts from other tests (#180)
In Fedora 33, we generally no longer include a disk-based swap
partition by default (instead swap-on-ZRAM is used, see
https://fedoraproject.org/wiki/Changes/SwapOnZRAM ). This tweaks
our tests to account for that. In tests that aren't to do with
swap at all, we stop including a swap partition in order to be
closer to the default layout. We replace the old _no_swap blivet
and custom tests with _with_swap tests that, as the name implies,
*explicitly include* a swap partition, and adjust the postinstall
test to check the disk swap partition is there.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-11 15:09:33 -07:00
Adam Williamson
755ac778cc Wait longer for Zezere provision request to go through
30 seconds doesn't seem to be reliable enough. Let's try 60; if
that's not enough, I'll try to think of something smarter.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-07 16:25:07 -07:00
Adam Williamson
a0d4c2fc65 Add a keypress to the 'keepalive' loop in desktop_notifications
Just repositioning the mouse appears not to be enough to prevent
the session going idle any more, since the 20200731.n.0 compose.
Not sure what causes this, probably the kernel. Adding a space
keypress seems to help in both KDE and GNOME.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-06 18:15:49 -07:00
Adam Williamson
855aaef258 Try harder to be safe when quitting Firefox
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-05 13:56:07 -07:00
Adam Williamson
fed44e3fdb wait_still_screen after exiting firefox in server_cockpit_default
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-08-04 18:13:14 -07:00
Adam Williamson
aa41fe4e4e Automate QA:Testcase_Zezere_Ignition
This is a bit complex to automate, because we cannot really use
the production Zezere server (provision.fedoraproject.org) as
the test case shows, as we'd have to solve authentication and
we also don't really want to constantly keep registering new
hosts to it that are going to disappear and never be seen again.

So, instead we'll do it by setting up our *own* Zezere, and
provisioning our IoT system in that. We run two tests. The
'ignition' test is the actual IoT 'device'; all it really does
is boot up, sit around, and wait to be provisioned. The 'server'
test first sets up a Zezere server, then logs into it, adds an
ssh key, claims the IoT device, provisions it, and connects to
it to create a special file which tells the 'ignition' test
everything worked and it can close out.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-23 18:01:06 -07:00
Adam Williamson
72edbfe991 Use qemu host IP 172.16.2.2 not 10.0.2.2
This is to make the infra folks happy, apparently using 10.0.x.x
and 10.1.x.x is causing conflicts since our actual infra network
uses those ranges too.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-23 16:40:45 -07:00
Adam Williamson
d0274fe7f9 Tweak support_server DHCP range
It started too low, overlapped with some IPs we set static now.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-17 14:59:09 -07:00
Adam Williamson
ead05e6c32 Drop explicit install of fedora-repos-modular again
It actually is supposed to be installed by default, so if it's
missing that's a bug. It's been added to comps now so it should
be there from now on.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-10 18:41:05 -07:00
Adam Williamson
3f6ac527bb Apply overview workaround to yet one more test
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-09 14:29:49 -07:00
Adam Williamson
4a6cd8bcd5 Abstract out overview type-to-search bug workaround
And also use it in GNOME apps settings.pm test.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-09 14:25:22 -07:00
Adam Williamson
50d9d8bafa Workaround overview type issue in input test too
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-08 17:24:51 -07:00
Adam Williamson
4fee822475 modularity test: install fedora-repos-modular if necessary
It got split out and is not installed by default in 33-0.8. This
is intentional, not a bug, see https://pagure.io/fesco/issue/2114

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-08 14:52:11 -07:00
Adam Williamson
919c88f48f Add QA:Testcase_Clevis test (TPM-based automatic decryption)
This adds a test that automates
https://fedoraproject.org/wiki/QA:Testcase_Clevis. It requires
os-autoinst-4.6-18.20200623git5038d8c or newer, and a worker
host in the 'tpm' class which is set up to have an instance of
swtpm running at /tmp/mytpmX , where X is the worker instance
number, for each worker. The Fedora infrastructure ansible
plays have been updated to handle this via an instantiated
systemd service, which other instances can also adopt.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-02 16:44:55 -07:00
Adam Williamson
3189f1a62c Tweak NFS repo setup in _support_server to copy all files
`cp -R foo/*` doesn't get all files in `foo/`, it misses hidden
files. This turns out to be a problem with recent anaconda, as
it expects to find a .treeinfo file here. So, let's use rsync.
We could probably do this with cp too but I can't think of the
right arguments right now...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-18 17:00:42 -07:00
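
The hidden-file problem above is easy to reproduce. The sketch below uses Python rather than shell, relying on the fact that Python's glob module, like the shell, does not match names starting with a dot against `*`. The function names and the file visible.txt are made up for the demonstration; .treeinfo is the file named in the commit message.

```python
import glob
import os
import shutil
import tempfile


def copy_with_glob(src, dst):
    """Mimic `cp -R src/* dst`: the * pattern skips dot-prefixed names."""
    os.makedirs(dst, exist_ok=True)
    for path in glob.glob(os.path.join(src, "*")):
        target = os.path.join(dst, os.path.basename(path))
        if os.path.isdir(path):
            shutil.copytree(path, target)
        else:
            shutil.copy2(path, target)


def copy_everything(src, dst):
    """Copy the whole tree, hidden files included (Python 3.8+)."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo")
    os.makedirs(src)
    for name in (".treeinfo", "visible.txt"):
        open(os.path.join(src, name), "w").close()
    copy_with_glob(src, os.path.join(tmp, "globbed"))
    copy_everything(src, os.path.join(tmp, "full"))
    # The glob-based copy is missing .treeinfo; the full copy has it.
    print(sorted(os.listdir(os.path.join(tmp, "globbed"))))
    print(sorted(os.listdir(os.path.join(tmp, "full"))))
```

With cp itself, copying `foo/.` instead of `foo/*` (for example `cp -R foo/. dest/`) is one commonly used form that also picks up hidden files, since no shell glob is involved.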