At our last meeting (https://meetbot.fedoraproject.org/meeting_matrix_fedoraproject-org/2024-09-1...), I had agreed to try to summarize our current state of discussion and to describe a possible solution. Here we go.
We have been discussing this topic for a long time; the initial tracking issue #63 (https://pagure.io/fedora-server/issue/63) is two years old.
Our initial intention was (and still is):
* Systematization of activities according to criteria and objective needs
* As a supplement to automated tests, covering aspects that may not be amenable to automated testing
* Integration and coordination with distribution-wide QA
* Discovery of new problems for which, of course, there are no automated tests (yet).
* This includes, among other things, monitoring the release changes that could potentially have side effects for Server.
* Checking the documentation for necessary updates
Topic "WHAT to test" ====================
One position was/is that manual testing is more or less completely redundant because everything relevant is now covered by automated testing.
One argument against this is that in the past, some problems were only noticed and found in the course of manual testing (e.g. software RAID when switching to GPT), or were not found at all or not in time because manual testing of a release was not carried out or was insufficient (e.g. the problems with LVM administration, one of the most important functionalities for Server).
And somehow it doesn't feel right not to test our installation media at all, and to perform an installation or upgrade with them for the first time only after the release has been published.
I suppose we can agree that "it is good to have human testing of the deliverables written to real physical media on real physical systems" (adamwill). And this is exactly what we did in the past. At times our manual testing program also included our central services, virtualization and containerization, which are highly consequential in the event of a failure.
Additionally, it seems agreeable "to test whatever workflows (we) have that *aren't* covered in the validation tests" (adamwill), although we may need to clarify what exactly the "validation tests" cover.
We can probably agree on the following list of human/manual test tasks:
* test DVD installation media on physical hardware
* test netboot installation media on physical hardware
* test VM (KVM) instantiation including first steps after first boot
* in both cases checking (a scriptable sketch of some of these checks follows this list):
**** Everything works without breakage
**** No distortion of the graphical or terminal output.
**** No irritating, inaccurate or misleading error messages.
**** besides the graphical guided steps, check shell access (<F1> etc.) incl. access to log files, print screen, etc.
**** Accuracy of the relevant documentation
* In case of DVD installation, additionally (here running on hardware):
**** Installing virtualization
**** Installing containerization systemd-nspawn
**** Installing containerization podman (as soon as we have documentation and procedures ready)
* Test the dnf upgrade procedure on hardware, on a "real life" instance, not just the minimal default
* Test the dnf upgrade procedure on a "real life" VM instance, not just the minimal default
* Create a list of any special or one-off tests that are likely to be needed, based on the list of changes
**** Perform and monitor these tests
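Some of the checks above lend themselves to a small helper the tester can run right after first boot. Here is a minimal sketch of such a post-boot check, not an agreed procedure; the exact set of checks is illustrative:

# Hypothetical post-boot helper for the checks listed above: ask systemd
# whether the freshly installed system came up cleanly and report any
# failed units for the tester's notes.
import subprocess

def run(cmd: list[str]) -> str:
    return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()

def post_boot_smoke_check() -> bool:
    ok = True
    # "running" means systemd reached its target with no failed units.
    state = run(["systemctl", "is-system-running"])
    if state != "running":
        print(f"system state is '{state}', expected 'running'")
        ok = False
    # List failed units explicitly, for the tester's report.
    failed = run(["systemctl", "--failed", "--no-legend"])
    if failed:
        print("failed units:\n" + failed)
        ok = False
    return ok

if __name__ == "__main__":
    raise SystemExit(0 if post_boot_smoke_check() else 1)

The human checks (output distortion, misleading messages, documentation accuracy) of course remain manual; a script can only cover the mechanical part.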
Topic "HOW to test" ====================
Previously, we created a corresponding list as a tracking issue. For this release, I created a wiki page, which is easier to use.
As adamwill noted, this page "definitely shouldn't exist, we should fold anything important it covers into one of the existing pages". It also takes us away from our long-standing goal of aligning our testing with the distribution-wide QA. It is a stopgap solution, because the current pages do not offer us this capability.
To organize the testing practically, we need a concise and clear list of all the tasks that need to be completed and that we can "tick off". It would be good to have a structure like the one offered by the wiki page (https://fedoraproject.org/wiki/Server/QA_Manual_Testing_Overview). And nothing that is not part of the Server test program belongs on this page.
It would be really great if the current QA pages could be added to or changed accordingly.
A good starting page would be the server page that is now sent in the announcement emails: https://fedoraproject.org/wiki/Test_Results:Fedora_41_Branched_20240924.n.0_...
We should remove all items that have nothing to do with Server, starting with the download list.
The test matrix and coverage page / lists should be split into automated tests and manual/human tests.
The lists themselves can probably consist largely of annotated links to existing pages, but preferably with anchors directly to the relevant spot. And they should already indicate who tested and, above all, with which result. And it must be clear at a glance what has not yet been tested.
And we need a place for one-off, release-specific tests, should the need arise.
Topic "SUPPLEMENTING the tests" ===============================
Of our server-specific services (or roles), only two are currently covered by tests: PostgreSQL and IPA. This needs to be completed.
The process is:
1. define these as blocking roles in the PRD/tech spec
2. cover them in the release criteria
3. write wiki test cases
4. automate them
The first task is done. We should continue with the second one. Pragmatically, we should focus on the services for which documentation and procedures already exist: virtualization, containerization (nspawn), web server and NFS server.
The biggest issue might be the Apache server. Its current installation procedure effectively results in an unusable and broken instance. There is a lot of work to be done.
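As an aside, even a trivial automated check of the kind meant in step 4 above would catch such a broken instance. A minimal sketch, assuming a plain httpd install answering on localhost; the expectations are illustrative, not our documented procedure:

# Hypothetical smoke check for the web server role: after installing and
# starting httpd, the unit should be active and the server should answer
# HTTP requests on localhost (any HTTP status counts as "alive").
import subprocess
import urllib.error
import urllib.request

def service_active(unit: str) -> bool:
    # "systemctl is-active" prints "active" for a running unit.
    res = subprocess.run(["systemctl", "is-active", unit],
                         capture_output=True, text=True)
    return res.stdout.strip() == "active"

def answers_http(url: str = "http://localhost/") -> bool:
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        return True   # the server answered, just not with a 2xx status
    except OSError:
        return False  # connection refused, timeout, ...

if __name__ == "__main__":
    ok = service_active("httpd") and answers_http()
    raise SystemExit(0 if ok else 1)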
But we should discuss this separately from the general release tests. We have tracking issue #61 for this (https://pagure.io/fedora-server/issue/61).
--
Peter Boy
https://fedoraproject.org/wiki/User:Pboy
PBoy@fedoraproject.org
Timezone: CET (UTC+1) / CEST (UTC+2)
Fedora Server Edition Working Group member
Fedora Docs team contributor and board member
Java developer and enthusiast
On Tue, 2024-09-24 at 16:36 +0200, Peter Boy wrote:
> A good starting page would be the server page that is now sent in the announcement emails: https://fedoraproject.org/wiki/Test_Results:Fedora_41_Branched_20240924.n.0_...
> We should remove all items that have nothing to do with Server, starting with the download list.
I disagree with this. The download list is there for a reason: when we didn't have it, we sent people to the pages, and they said "but where do I download the thing to test in the first place?" So we added the download lists.
What we *could* do is set the table up to be filtered on edition specific pages - only show Server images on the Server page, only show desktop images on the Desktop page etc. I can try and find some time to look into that (though I have about 40 things I'm trying to find time to look into already ATM :|). If anyone else wants to look, the code is https://pagure.io/fedora-qa/python-wikitcms/blob/main/f/src/wikitcms/page.py... . The fancy way to do it would be to add some kinda magic to the table to let different pages have different 'views' on it or something (I don't know off the top of my head if this is possible). The dumb-but-probably-easy way to do it would just be to generate several different tables, either as separate pages or all in one page in different sections set up to be transcluded by different test pages.
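For illustration, the dumb-but-probably-easy variant could look roughly like this. The image data and field names here are invented for the sketch, not the actual wikitcms structures (those live in page.py linked above):

# Hypothetical sketch: generate one download table per edition, so each
# edition-specific results page only lists its own images. Not the real
# wikitcms code, just the shape of the idea.
images = [
    {"edition": "Server", "arch": "x86_64", "url": "https://example.org/Server-x86_64.iso"},
    {"edition": "Server", "arch": "aarch64", "url": "https://example.org/Server-aarch64.iso"},
    {"edition": "Workstation", "arch": "x86_64", "url": "https://example.org/WS-x86_64.iso"},
]

def download_table(edition: str) -> str:
    """Render a MediaWiki table of download links for one edition."""
    rows = [f"|-\n| {img['arch']} || [{img['url']} download]"
            for img in images if img["edition"] == edition]
    return '{| class="wikitable"\n! Arch !! Image\n' + "\n".join(rows) + "\n|}"

# e.g. one table per edition, transcluded into the matching results page:
print(download_table("Server"))

Each generated table could then live on its own subpage set up for transclusion, or in per-edition sections of one page, as described above.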
> The test matrix and coverage page / lists should be split into automated tests and manual/human tests.
I don't really agree with this either, except possibly for cases where a test is human-run *by policy*. Otherwise these are not inherent characteristics of the test, but implementation details. Automation has happened gradually; at some point, *all* the tests in the matrix were human-run. We only automated some of the tests on the Server page fairly recently.
We do already distinguish test results from automated systems - they're the ones with the robot icon next to them. To me this is the right way to do it: automation is a property of the *result*, not of the *test*. Results from automation should always be posted within a few hours after the validation event is created, and you can always refer to past result pages to see which tests get results filed by automation.
Note we *do* already explicitly require a smoke test install of Server (along with all of the other release-blocking images) to real hardware using a real USB stick. This is in the Installation page, not the Server page: https://fedoraproject.org/wiki/Test_Results:Fedora_41_Branched_20241012.n.0_...
We *could* break out some rows of that table onto the various edition-specific pages, but it gets kinda awkward since Everything is in the list and there's no page for Everything. We'd need an instance of the table on multiple subpages *and* to keep it on Installation for Everything, which I don't really love. I do get the point that, if you specifically want to "help with testing Server", it's a bit awkward that you have to find or know that there are Server-relevant tests on the Installation and Base pages too. I'll try and think of ways to improve that.
Just for some context, the reason I want to emphasize use of the current setup is it's already the result of a process that started with more or less what you started with (ad-hoc, human created wiki pages). The very first one (I think) was https://fedoraproject.org/wiki/Releases/5/FC5FinalTreeTesting . The current wikitcms setup is a direct (though distant) descendant of that page. :D
It seems easy at first to just throw up a simple wiki page with some tables in it. Then you get more tests. Then people start asking questions like "where do I download this thing? How do I enter results exactly?". Then you want to get some data on the results over time. Then you start getting really freaking tired of manually creating these damn wiki pages every three days.
For instance, in your current page, you don't cover arches. You don't have a page per build, which makes comparing results from different builds and knowing which builds are "covered" difficult. You have the comments in-line in the tables, which will make the tables very hard to read if anyone puts a long comment in (this is why, on the wikitcms pages, the comments are footnotes). etc etc etc...this is all stuff we went through already, years ago. :D
We went through that whole cycle, and the current wikitcms setup is the end point of it. I'd really hate someone to have to go through that whole cycle again from scratch.
On 14.10.2024 at 18:31, Adam Williamson via server server@lists.fedoraproject.org wrote:
> On Tue, 2024-09-24 at 16:36 +0200, Peter Boy wrote:
>> A good starting page would be the server page that is now sent in the announcement emails: https://fedoraproject.org/wiki/Test_Results:Fedora_41_Branched_20240924.n.0_...
>> We should remove all items that have nothing to do with Server, starting with the download list.
> I disagree with this. The download list is there for a reason: when we didn't have it, we sent people to the pages, and they said "but where do I download the thing to test in the first place?" So we added the download lists.
> What we *could* do is set the table up to be filtered on edition specific pages - only show Server images on the Server page,
Sorry, that filtering is what I meant. So we have no disagreement.
> only show desktop images on the Desktop page etc. I can try and find some time to look into that (though I have about 40 things I'm trying to find time to look into already ATM :|). If anyone else wants to look, the code is https://pagure.io/fedora-qa/python-wikitcms/blob/main/f/src/wikitcms/page.py... . The fancy way to do it would be to add some kinda magic to the table to let different pages have different 'views' on it or something (I don't know off the top of my head if this is possible). The dumb-but-probably-easy way to do it would just be to generate several different tables, either as separate pages or all in one page in different sections set up to be transcluded by different test pages.
Yes, for my part, I would be content if we have a solution we want to build. Maybe it takes some time to get the resources to make it real. In the meantime, we can live with a makeshift solution.
>> The test matrix and coverage page / lists should be split into automated tests and manual/human tests.
> I don't really agree with this either, except possibly for cases where a test is human-run *by policy*. Otherwise these are not inherent characteristics of the test, but implementation details. Automation has happened gradually; at some point, *all* the tests in the matrix were human-run. We only automated some of the tests on the Server page fairly recently.
OK, I see that from a QA group view it doesn't make much sense to differentiate the tables.
But from a WG view we would need a clear and consistent table of what we need or should do. So probably your other idea, to provide a table of links to the tests in question, might be a good solution?
> We do already distinguish test results from automated systems - they're the ones with the robot icon next to them. To me this is the right way to do it: automation is a property of the *result*, not of the *test*. Results from automation should always be posted within a few hours after the validation event is created, and you can always refer to past result pages to see which tests get results filed by automation.
I totally agree with that.
> Note we *do* already explicitly require a smoke test install of Server (along with all of the other release-blocking images) to real hardware using a real USB stick. This is in the Installation page, not the Server page: https://fedoraproject.org/wiki/Test_Results:Fedora_41_Branched_20241012.n.0_...
> We *could* break out some rows of that table onto the various edition-specific pages, but it gets kinda awkward since Everything is in the list and there's no page for Everything. We'd need an instance of the table on multiple subpages *and* to keep it on Installation for Everything, which I don't really love. I do get the point that, if you specifically want to "help with testing Server", it's a bit awkward that you have to find or know that there are Server-relevant tests on the Installation and Base pages too. I'll try and think of ways to improve that.
That’s the issue, indeed! And I would love it if you can make a proposal for how to do it in a good way.
> Just for some context, the reason I want to emphasize use of the current setup is it's already the result of a process that started with more or less what you started with (ad-hoc, human created wiki pages). The very first one (I think) was https://fedoraproject.org/wiki/Releases/5/FC5FinalTreeTesting . The current wikitcms setup is a direct (though distant) descendant of that page. :D
I wholeheartedly support that. My original idea was to create a “test week” for Server, too. This would have everything that needs to be done in one place and very clearly laid out, with links to the corresponding tests in the overall view.
But you said that it couldn't be done with the available resources.
> It seems easy at first to just throw up a simple wiki page with some tables in it. Then you get more tests. Then people start asking questions like "where do I download this thing? How do I enter results exactly?". Then you want to get some data on the results over time. Then you start getting really freaking tired of manually creating these damn wiki pages every three days.
> For instance, in your current page, you don't cover arches. You don't have a page per build, which makes comparing results from different builds and knowing which builds are "covered" difficult. You have the comments in-line in the tables, which will make the tables very hard to read if anyone puts a long comment in (this is why, on the wikitcms pages, the comments are footnotes). etc etc etc...this is all stuff we went through already, years ago. :D
> We went through that whole cycle, and the current wikitcms setup is the end point of it. I'd really hate someone to have to go through that whole cycle again from scratch.
Again, I'm not committed to this table and I'm not happy with it. I see it as a stopgap because we haven't been able to create anything decent so far. Since the revival of the Server Working Group, we have stumbled from one temporary solution to the next. I want to move away from the unsystematic way we have been doing things so far. I'm not the expert on distribution QA. I'm just the fool trying to get the necessary testing organized and done, to make the best of an unsatisfactory situation, and to initiate the discussion about which of us does what.
As a first step, we should agree on what tests we want and need to do systematically and reasonably. You are the member of our working group who knows the most about this. So it would be best if you put together a to-do list on this.
If I understand our discussion so far correctly, that might be something like:
1) Smoke tests of the installation media (all those "USB" tests)
2) Possibly tests of server-specific procedures, e.g. installation of virtualization with an internal protected network and internal name resolution (but maybe this can also be automated; see the sketch below), or our special web server configuration
3) Review of all changes to a release to see if a change might particularly challenge Server (such as the aforementioned LVM change if it had been announced).
But you are the expert, not me.
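To make item 2 a bit more concrete, a first automated check could be as simple as asking libvirt whether the network is defined and active. A minimal sketch; the stock "default" network stands in here for our internal protected network:

# Hypothetical check for item 2 above: verify that a libvirt network is
# defined and active. "default" is just a stand-in name for the sketch.
import subprocess

def virsh(*args: str) -> str:
    """Run a read-only virsh query and return its output."""
    return subprocess.run(["virsh", *args], check=True,
                          capture_output=True, text=True).stdout

def check_network(name: str = "default") -> None:
    info = virsh("net-info", name)  # raises CalledProcessError if undefined
    active = any(line.startswith("Active:") and line.split()[-1] == "yes"
                 for line in info.splitlines())
    assert active, f"libvirt network '{name}' is not active"
    print(f"libvirt network '{name}' looks OK")

if __name__ == "__main__":
    check_network()

Name resolution inside the network would need its own check; this only covers the "network is up" part.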
And in a second step we should find a way to make it easy for every WG member to participate in the testing, probably by modifying the Server test announce page a bit.
And in a third step, we have to gradually supplement the automatic tests as far as possible by the central services that we have compiled in our technical specification.
Can you participate in the meeting later today? That would be very helpful. I would really like to bring this point, which has been occupying us for a long time, to a conclusion.
--
Peter Boy
https://fedoraproject.org/wiki/User:Pboy
PBoy@fedoraproject.org
Timezone: CET (UTC+1) / CEST (UTC+2)
Fedora Server Edition Working Group member
Fedora Docs team contributor and board member
Java developer and enthusiast