Remove a bunch of needles that have not been used for some time,
plus a few workarounds that are similarly stale.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Inspired by openQA's 01-compile-check-all.t, this adds a perl
test which checks the syntax of main.pm and all lib and test
files, and hooks it up to CI. Requires os-autoinst and
perl-Test-Strict.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
There is nothing inherently 'root'-y about these so it makes no
sense to prefix their names with 'root-'. And why change from
'console' to 'terminal' compared to the naming used in the
actual qemu command and the log files? It's just confusing.
Let's be consistent (except for using - instead of _ here...
but - is easier to type!)
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The log files are all under the ostree deploy root, the 'real'
system root has nothing useful. Try and find the deploy root
and prepend it to all the relevant commands if we're a CANNED
install.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This was necessary for debugging the FreeIPA 4.8 pre-release
update bug, so let's have it for all runs, just in case.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We're getting an intermittent case where FreeIPA tests fail
because of a web server certificate issue. Click 'Advanced' in
Firefox when this happens so we can get a bit more info on the
problem.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Just like the installer image build test, only...it builds a live
image. This involves reimplementing quite a chunk of the Koji
livemedia task. Ah, well. Also involves rethinking the flavor
names a bit here, these seem...better.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This adds a test which builds a netinst image potentially with
the package(s) from the update, and uploads that image. It also
adds a test which runs a default install using that image. This
is intended to check whether the update breaks the creation or
use of install images; particularly this will let us test
anaconda etc. updates. We also update the minimal disk image
name, as we have to make it bigger to accommodate this test,
and making it bigger changes its name - the actual change to
the disk image itself is in createhdds. We also have to redo a
bunch of installer needles for F28 fonts, after I removed them
a month or so back...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
If a test fails to the dracut shell, we currently don't do
anything useful. This should recognize when that happens, and
upload rdsosreport.txt.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This should fix log collection when a French or Japanese test
fails before the test itself would have done this.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
If an update test fails before reaching advisory_post, we don't
generate the 'what update packages were installed' and 'were
any update packages *not* installed when they should have been'
logs, but these may well be useful for diagnosing the failure -
so let's also do the same stuff there. Only let's not do it all
twice.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sometimes we get a test failing because the SUT isn't connecting
to the network for some reason. In this case we never get any
logs, because `upload_logs` relies on being able to reach at
least the worker host system via the network.
This attempts to detect when we can't ping the worker host, and
in that case, send some info out over the serial line instead.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
We were doing this in a post-install test, but not on failures.
We need it to figure out why Firefox is crashing on aarch64...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Previous approach wouldn't work for tests that run after the
install test...let's just set a password from a chroot after
install completes. Don't really like this as it changes the
'real' install process a bit, but it's the least invasive short
term fix at least. We can maybe do something more sudo-y later
with a bit more thought.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It's really INSTALL_NO_USER, not USER_LOGIN='false'. Also, we
need to make root_console work with no root password, sigh.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Committing without review as this is pretty trivial and I've
had it on staging for the last few days without issue. Just gets
us somewhat better info for debugging FreeIPA issues.
Summary:
This is to handle cases like #1414904, where the system boots
to emergency mode. We really need logs to try and debug this.
Test Plan:
Force a test to hit emergency mode somehow (right now
you can just run base_services_start on Rawhide over and over
until you hit #1414904, but there's probably an easier way to
do it; I think there's a systemd boot arg to tell it which target
to boot, for example) and check logs get uploaded. Also check this
doesn't break log upload for a 'normal' failure.
Reviewers: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Reviewed By: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Subscribers: tflink
Differential Revision: https://phab.qa.fedoraproject.org/D1103
Summary:
This adds a couple of new exporter modules, renames main_common
to utils (this is a better name: openSUSE's main_common is
functions used in main.pm, utils is what they call their module
full of miscellaneous commonly-used functions), and moves a
bunch of utility functions that were previously needlessly
implemented as instance methods in base classes into the
exporter modules. That means we can get rid of all the annoying
$self-> syntax for calling them.
We get rid of `fedorabase` entirely, as it's no longer useful
for anything. Other base classes keep the 'standard' methods
(like `post_fail_hook`) and methods which actually need to be
methods (like `root_console`, whose behaviour is different in
anacondatest and installedtest).
Test Plan:
Do a full test suite run and check everything lines
up. There should be no functional differences from before at all,
this is just a re-org.
Reviewers: jskladan, garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Reviewed By: garretraziel_but_actually_jsedlak_who_uses_stupid_nicknames
Subscribers: tflink
Differential Revision: https://phab.qa.fedoraproject.org/D1080
The README looks pretty ugly on Pagure. So let's unwrap it.
Let's also move the function docs into the source files. We're
much more likely to keep them up to date that way, I think. We
should probably change over to proper perl POD documentation at
some point, but comments in-line are OK for now I think.
This should solve all those annoying "Failed to synchronize
cache for repo 'updates'" failures we've had: there's no need
for the 'updates' repository to be enabled when we've decided
we want the `repo_setup` changes to be made, and having it
enabled causes problems when we run right after the Rawhide
compose completes. We hit the awkward period where the rawhide
repo has been synced but mirrormanager has not been updated
with the new metadata checksums, so mirrormanager rejects the
metadata from dl.fp.o and DNF has to go out and hit other
mirrors until it finds one which hasn't synced yet. Since the
point of `repo_setup` is specifically to hack up the config so
we only use packages from the compose *anyway*, there's no
reason at all to worry about leaving 'updates' enabled and
nerfing it like we do 'fedora' and 'rawhide', we can just turn
it off.
Summary:
The current installedtest post_fail_hook assumes /var/tmp/abrt
exists at all, and dies if it doesn't, leading to no /var/log
upload. We can also avoid using openQA `script_output` - which
is annoyingly indirect and slow - by using this neat `test -n`
trick I found on SO. Let's also use it in the anacondatest
post_fail_hook to avoid uploading /var/tmp when it's empty
(which we currently do). This also drops the 0 arg from a few
more script_run calls, because it's safe to wait for the run
to complete and we should probably do so to avoid later typing
errors if the commands are slow.
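For illustration, one common form of the `test -n` trick is sketched below in Python rather than the Perl the tests use. The exact command used in the hooks is not quoted in this message, so the `ls -A` form and the helper name `dir_has_content` are assumptions.

```python
import shlex
import subprocess


def dir_has_content(path: str) -> bool:
    """Return True only if `path` exists and holds at least one entry."""
    # `ls -A` prints nothing for an empty or missing directory, so
    # `test -n` fails in both cases. stderr is discarded so a missing
    # directory does not produce noise.
    cmd = f'test -n "$(ls -A {shlex.quote(path)} 2>/dev/null)"'
    return subprocess.run(["sh", "-c", cmd]).returncode == 0
```

The point is that a single exit status covers both "directory missing" and "directory empty", so there is no output to capture and parse.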
Test Plan:
Cause both anaconda and installed tests to fail and
check the hooks work as intended. Maybe twiddle the failures to
ensure directories do and don't exist and/or have contents and
make sure things work OK. I've tested this to some degree and
I'm pretty sure it works right.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1041
It's not always in minimal installs. This is a simple change
and needed to make the post-fail hook work for minimal installs,
so pushing without review.
Summary:
os-autoinst implements `script_run` itself now, we aren't
required to implement it ourselves any more. os-autoinst's
implementation is better than ours, as it allows for verifying
the script actually ran (via the redirect-output-to-serial-
console trick).
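As a rough Python sketch of the general idea (not os-autoinst's actual implementation): append an `echo` of a unique marker plus the exit status to the command, redirect it to the serial device, then look for the marker in the serial log. The marker format, the `/dev/ttyS0` default and both function names are assumptions made for this example.

```python
import re


def wrap_for_serial(cmd: str, marker: str, serial_dev: str = "/dev/ttyS0") -> str:
    """Build a shell line that reports the exit status of `cmd` on the serial device."""
    # `$?` is expanded by the guest shell once `cmd` has finished.
    return f"{cmd}; echo {marker}-$?- > {serial_dev}"


def exit_status_from_serial(serial_log: str, marker: str):
    """Return the reported exit status, or None if the marker has not appeared yet."""
    match = re.search(re.escape(marker) + r"-(\d+)-", serial_log)
    return int(match.group(1)) if match else None
```

Until the marker shows up in the serial log, the caller knows the command is still running and should not start typing the next one.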
So this drops our implementation so we'll just use the upstream
one. Where I judged we don't want to bother with the 'check
the command actually ran' feature I've adjusted our direct
`script_run` calls to pass a wait time of 0, which skips the
'wait for command to run' stuff entirely and just does a simple
'type the string and hit enter'.
Because of how the inheritance works, our `assert_script_run`
calls already used the os-autoinst `script_run`, rather than
the one from our distribution.
This should prevent `prepare_test_packages` sometimes going
wrong right after removing the python3-kickstart package, as
we'll properly wait for that removal to complete now (before
we weren't, we'd just start typing the next command while it
was still running, which could result in lost keypresses).
Test Plan:
Check all tests still run OK (I've tried this on
staging and it seems fine).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1034
Summary:
Since we can match on multiple needles, we can drop the loop
from console_login and instead do it this way, which is simpler
and should work better on ARM (the timeouts will scale and
allow ARM to be slow here). Also move it to main_common as
there's no logical reason for it to be a class method.
Also remove the `check` arg. `check` was only set to 0 by two
tests, _console_shutdown and anacondatest's _post_fail_hook.
For _console_shutdown, I think I just wanted to give it the
best possible chance of succeeding. But we're really not going
to lose anything significant by checking, the only case where
check=>0 would've helped is if the 'good' needle had stopped
matching, and all sorts of other tests will fail in that case.
anacondatest was only using it to save a screenshot of whatever
was on the tty if it didn't reach a root console, which doesn't
seem that useful, and we'll get screenshots from check_screen
and assert_screen anyway.
Test Plan:
Run all tests, check they behave as expected and
none inappropriately fails on console login.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1016
use 'ps' output for Xorg and Xwayland. We'd need some new
openQA var to get this right by 'guessing', as it's vt1 for
Workstation when running live - so long as autologin worked -
but vt2 after install. We'd need a var or some other thing to
detect which case we're running in. LIVE doesn't do it, it's
set even when running a post-install test from a live image.
So instead let's just do it a bit more cleverly. This also
gives us a bit of insurance against changes in GDM, SDDM etc.
behaviour, so long as Xwayland or Xorg is running (and we can
add additional processes to the list, like gnome-shell, if
needed/appropriate). We assume the *final* listed process -
i.e. the most recently-started one - will be the desktop;
this covers gdm's behaviour of starting up on vt1 then running
the user session on vt2. We can make this function more complex
and add args if we ever get to the point where we have multi-
user tests running or anything (e.g. allow to pass a username
and only look for that user's processes).
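The selection logic can be sketched as follows, in Python for illustration only (the test code itself is Perl). It assumes the default `ps -e` column layout (PID, TTY, TIME, CMD); the function name `desktop_vt` is hypothetical.

```python
def desktop_vt(ps_output: str, names=("Xwayland", "Xorg")):
    """Return the VT number of the last listed matching process, or None."""
    vt = None
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        tty, comm = fields[1], fields[3]
        # Keep overwriting so the final match wins; processes not attached
        # to a ttyN (shown as "?") are ignored.
        if comm in names and tty.startswith("tty") and tty[3:].isdigit():
            vt = int(tty[3:])
    return vt
```

With a greeter on tty1 and a user session on tty2, the later entry wins and the function returns 2.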
Landing without review as this broke the live variant of the
test on Workstation in production (kinda not sure why it worked
in testing, or I didn't notice that it failed, but never mind).
I've tested it on staging.
Summary:
Very similar to the CLI update test, but using the desktops'
update applications. This is based off the CLI update test
branch as it uses the shared functions that branch introduced.
We do not use the fake update packages, as they don't really
do anything useful for these tests; for dnf they can help us
distinguish between issues with the dnf mechanism and issues
with the repos, but we can't really tell that in the graphical
case. So we only use the python3-kickstart package here.
Test Plan:
Run the test on both KDE and GNOME and ensure it
performs as intended. I've been testing it on staging, so you
can see it there.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1010
Summary:
tty2 is where wayland desktop sessions run. I think it makes
sense to use a high tty for the post_fail_hook, so we know the
lower ones can be used by the tests...
Test Plan:
Run a Workstation post-install test that fails, check
the hook works.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D1003
Summary:
the main thing this does is try and type slower in X - this
should cover nearly everywhere we type anything in X, and make
it type slower. We also add a bit more safety checking to some
old tests which didn't have it (mainly _do_install_and_reboot)
- wait_still_screen after typing to make sure all the keypresses
were registered before continuing.
This is an attempt to mitigate the problems we've seen where
the wrong text gets typed into the wrong places and the tests
break.
This branch is live on staging atm. It still has *some* issues,
but I do think it's an improvement.
Test Plan:
run the tests (probably several times), compare to
runs without the change, see if it's better or worse...
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D993
cockpit 118 just landed in Rawhide, and it seems the username
field on the login screen is no longer selected by default,
you have to hit tab to navigate to it. We could get smart and
store the cockpit version in a variable or something, but it
doesn't seem worth it for now, let's just use a simple 'if
rawhide' conditional which can be adjusted as necessary as
things change.
https://github.com/cockpit-project/cockpit/issues/5000
when we run firefox in a bare X session, by default we get an
800x600 firefox in a 1024x768 X server with some dead black
space to the right and bottom of the screen. Now it turns out
that if the mouse is in the dead space, Firefox will not get
any keystrokes we send.
This didn't use to be a problem, but I made it into one with
this os-autoinst change:
https://github.com/os-autoinst/os-autoinst/pull/559
that makes os-autoinst move the cursor to 1023,767 after each
`assert_and_click`, instead of 0,0 as it did before, unless the
cursor has previously been explicitly placed somewhere. So in
this case it gets moved to the dead space, and Firefox stops
responding to keypresses after the first `assert_and_click`.
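The geometry is easy to check with a small Python sketch. It assumes the 800x600 window sits at the top-left corner of the 1024x768 screen, which matches the dead space being to the right and bottom; the helper name is hypothetical.

```python
def point_in_window(x, y, width, height, origin=(0, 0)):
    """Return True if pixel (x, y) lies inside a window of the given size."""
    ox, oy = origin
    return ox <= x < ox + width and oy <= y < oy + height


# The old resting position is inside the Firefox window...
old_ok = point_in_window(0, 0, 800, 600)
# ...but the new one lands in the dead space.
new_ok = point_in_window(1023, 767, 800, 600)
```

Here `old_ok` is True and `new_ok` is False, which is why keystrokes stopped reaching Firefox.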
We could equally well fix this by setting the cursor to 0,0
after running Firefox, but I like this more as it makes sure
we won't run into the same problem some other way, and makes
the videos and screenshots look nicer.
This fixes the realmd_join_cockpit test that's been failing
ever since I installed an os-autoinst with that fix. Committing
without review as it's a straightforward fix and I want the
test working again...
Summary:
we have a long-standing problem with all the tests that hit
the repositories. The tests are triggered as soon as a compose
completes. At this point in time, the compose is not synced to
the mirrors, where the default 'fedora' repo definition looks;
the sync happens after the compose completes, and there is also
a metadata sync step that must happen after *that* before any
operation that uses the 'fedora' repository definition will
actually use the packages from the new compose. Thus all net
install tests and tests that installed packages have been
effectively testing the previous compose, not the current one.
We have some thoughts about how to fix this 'properly' (such
that the openQA tests wouldn't have to do anything special,
but their 'fedora' repository would somehow reflect the compose
under test), but none of them is in place right now or likely
to happen in the short term, so in the meantime this should
deal with most of the issues. With this change, everything but
the default_install tests for the netinst images should use
the compose-under-test's Everything tree instead of the 'fedora'
repository, and thus should install and test the correct
packages.
This relies on a corresponding change to openqa_fedora_tools
to set the LOCATION openQA setting (which is simply the base
location of the compose under test).
Test Plan:
Do a full test run, check (as far as you can) tests run sensibly
and use appropriate repositories.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D989
Summary:
again, added as a non-fatal module for realmd_join_cockpit as
it's convenient to do it here. Also abstract a couple of ipa
bits into a new exporter package in the style of SUSE's
mm_network, rather than using ill-fitting class inheritance as
we have before - we should probably convert our existing class
based stuff to work this way.
Also a few minor tweaks and clean-ups of the other tests:
The path in console_login() where we detect login of a regular
user when we want root or vice versa and log out was actually
broken because it would 'wait' for the result of the 'exit'
command, which obviously doesn't work (as it relies on running
another command afterwards, and we're no longer at a shell).
This commit no longer actually uses that path, but I spotted
the bug with an earlier version of this which did, and we may
as well keep the fix.
/var/log/lastlog is an apparently-extremely-large sparse file.
A couple of times it seemed to cause tar to run very slowly
while creating the /var/log archive for upload on failure. It's
no use for diagnosing bugs, so we may as well exclude it from
the archive.
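The exclusion can be illustrated with Python's `tarfile` module (the hook itself runs `tar` in the guest, so this is only a sketch of the same idea). The function name and the `var/log` archive prefix are assumptions for the example.

```python
import tarfile


def archive_logs(log_dir, archive_path, skip=("var/log/lastlog",)):
    """Archive `log_dir` as var/log, leaving out the member names in `skip`."""
    def drop_skipped(info):
        # Returning None from the filter excludes that member from the archive.
        return None if info.name in skip else info

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(log_dir, arcname="var/log", filter=drop_skipped)
```

`archive_path` should live outside `log_dir`, so the archive does not try to include itself.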
I caught cockpit webUI login failing one time when testing the
test, so threw in a wait_still_screen before starting to type
the URL, as we have for the FreeIPA webUI.
I also caught a timing issue with the FreeIPA webUI policy add
step; the test flips from the Users screen to the HBAC screen
then clicks the 'add' button, but there's actually an identical
'add' button on *both* screens, so it could wind up trying to
click the one on the Users screen instead, if the web UI took
a few milliseconds to switch. So we throw in a needle match to
make sure we're actually on the HBAC screen before clicking the
button.
We make the freeipa_webui test a 'milestone' so that if the
new test fails, restoring to the last-known-good milestone
doesn't take so long; it actually seems like openQA can get
confused and try to cancel the test if restoring the milestone
takes a *really* long time, and wind up with a zombie qemu
process, which isn't good. This seems to avoid that happening.
Test Plan:
In the simple case, just run all the FreeIPA-related
tests on Fedora 24 (as Rawhide is broken) and make sure they all
work properly. To get a bit more advanced you can throw in an
`assert_script_run 'false'` in either of the non-fatal tests to
break it and make sure things go properly when that happens (the
last milestone should be restored - which should be right after
freeipa_webui, sitting at tty1 - and run properly; things are
set up so each test starts with root logged in on tty1).
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D935
Summary:
This requires a few other changes:
* turn clone_host_resolv into clone_host_file, letting you clone
any given host file (cloning /etc/hosts seems to make both
server deployment and client enrolment faster/more reliable)
* allow loading of multiple POSTINSTALL tests (so we can share
the freeipa_client_postinstall test). Note this is compatible,
existing uses will work fine
* move initial password change for the IPA test users into the
server deployment test (so the client tests don't conflict over
doing that)
* add GRUB_POSTINSTALL, for specifying boot parameters for boot of
the installed system, and make it work by tweaking
_console_wait_login (doesn't work for _graphical_wait_login yet,
as I didn't need that)
* make the static networking config for tap tests into a library
function so the tests can share it
* handle ABRT problem dirs showing up in /var/spool/abrt as well
as /var/tmp/abrt (because the enrol attempt hits #1330766 and
the crash report shows up in /var/spool/abrt, don't ask me why
the difference, I just work here)
* specify the DNS servers from the worker host's resolv.conf as
the forwarders for the FreeIPA server when deploying it; if we
don't do this, rolekit defaults to using the root servers as
forwarders(!) and thus we get the public, not phx2-appropriate,
results for e.g. mirrors.fedoraproject.org, some of which the
workers can't reach, so PackageKit package install always fails
(boy, was it fun figuring THAT mess out)
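For the last item above, pulling the forwarder addresses out of a resolv.conf is simple enough to sketch. This is Python for illustration only, the function name is hypothetical, and how the addresses are then passed to the server deployment is not shown here.

```python
def nameservers(resolv_conf_text: str):
    """Return the addresses from `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';', so their first field never
        # equals the bare keyword and they are skipped naturally.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

The resulting list is what would be handed over as the forwarders, instead of letting the deployment fall back to the root servers.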
Even after all that, the test still doesn't actually pass, but
I'm reasonably confident this is because it's hitting actual bugs,
not because it's broken. It runs into #1330766 nearly every time
(I think I saw *one* time the enrolment actually succeeded), and
seems to run into a subsequent bug I hadn't seen before when
trying to work around that by trying the join again (see
https://bugzilla.redhat.com/show_bug.cgi?id=1330766#c37 ).
Test Plan:
Run the test, see what happens. If you're really lucky,
it'll actually pass. But you'll probably run into #1330766#c37,
I'm mostly posting for comment. You'll need a tap-capable openQA
instance to test this.
Reviewers: jskladan, garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D880
Summary:
This adds tests for the Server_cockpit_default and cockpit_basic
test cases. Some notes: I was initially thinking of combining
these into a single test with multiple test modules and coming
up with a system for doing wiki reporting based on individual
test module status, but because we'll also want to do a cockpit
FreeIPA enrol test, I decided against it. We don't really want
to combine all three because then we would skip the cockpit
tests whenever FreeIPA server deployment failed, which isn't
ideal. So since we'll need a separate FreeIPA enrolment test
anyway it doesn't really make sense to go to the trouble of
designing a system for loading multiple postinstall tests (though
I have an idea for that!) and a per-module wiki reporting system.
This was the most minimal and hopefully reliable method for
running Cockpit from a stock Server install that I could think
of. An alternative approach would be to have, say, the most
recent stable Workstation live as a 'stock' asset and have two
tests, one which runs a stock Server install and just waits and
another which boots the live image and accesses the cockpit
running on the other box, but that seems a bit over-complex. It
is not possible to have dependencies between tests for different
ISOs, in case you were wondering about having a Workstation live
test which runs parallel with a Server DVD test, we can't do
that. One funny thing is the font that winds up getting used for
the desktop, but I don't *think* that should be a problem.
Picking needles was a bit tricky; any improvement suggestions
are welcome. I'm hoping it turns out to be safe to rely on some
dbus log messages being present; I think logging into Cockpit
triggers activation of the realmd dbus interface, so there
*should* always be some messages related to that. An alternative
would just be to match on a sliver of the dark grey table header
and the light grey row beneath it and assume that'll always be
the first message (whatever the message is), but then we have to
find some area of the message details screen which is always
present for any message, and it just seems a tad more likely to
result in false passes. Similarly I'm making an assumption that
auditd is always going to show up on the first page of the
Services screen and the details screen will always show that
'loaded...enabled' text.
Test Plan:
Run the tests and see if they work! See
https://openqa.stg.fedoraproject.org/tests/21373 and
https://openqa.stg.fedoraproject.org/tests/21371 for my tests.
Reviewers: garretraziel
Reviewed By: garretraziel
Subscribers: tflink
Differential Revision: https://phab.qadevel.cloud.fedoraproject.org/D874