Proposed validation test case: root on LVM thinp

Adam Williamson awilliam at redhat.com
Sat Dec 14 00:49:44 UTC 2013


On Fri, 2013-12-13 at 15:49 -0700, Chris Murphy wrote:
> On Dec 13, 2013, at 3:13 PM, Adam Williamson <awilliam at redhat.com> wrote:
> 
> > I meant to write this test case anyhow, but today's brown paper bag bug
> > - https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=1040669 - gives
> > it a certain sense of importance, so I thought I'd best get it done now.
> > 
> > I wrote
> > https://fedoraproject.org/wiki/QA:Testcase_anaconda_lvmthinp_rootfs ,
> > and I propose we add it to the installation validation matrix for F21
> > and later, with release level Beta, as it matches this criterion:
> > 
> > https://fedoraproject.org/wiki/Fedora_20_Beta_Release_Criteria#Guided_partitioning
> > 
> > "When using the guided partitioning flow, the installer must be able
> > to: ... Complete an installation using any combination of disk
> > configuration options it allows the user to select"
> > 
> > LVM thinp is now one of the options on the 'filesystem type' dropdown on
> > the Installation Options screen, which is what 'disk configuration
> > options' in that criterion is talking about.
> > 
> > Anyone have feedback on the test case, or the proposed inclusion in the
> > matrices? I based the test case on the ext4 rootfs one but tidied it up
> > somewhat, and placed stronger restrictions on changing other
> > configuration during the test; I will probably go through all the
> > filesystem-y test cases and propose similar changes, if we think that's
> > a good idea. We should try to make the whole set consistent.
> 
> Yeah I'm holding off on a major f bomb email until I understand what
> happened and exactly when this broke. If you have some insight on that
> to save me regression testing, that would be helpful.

Already looked into it. it was broken by
dracut-034-64.git20131205.fc20.x86_64 , which landed in either TC5 or
RC1, I don't recall which, but either way, very late. We needed it to
fix https://bugzilla.redhat.com/show_bug.cgi?id=881624 and
https://bugzilla.redhat.com/show_bug.cgi?id=1038838 . However, dracut
team pulled their usual trick of sending us a new git snapshot with a
bunch of other unrelated changes. I told them on IRC that we weren't
happy about that but they really wanted the other changes and promised
they were all safe. I sat down with them and eyeballed the list of
changes in the RPM changelog and Bodhi ticket, and made sure none of
them looked like they affected real sensitive areas.

It turns out, though, that the new git snapshot also included at least
one other change which was not communicated in the package changelog or
to me while we were talking about it, and that change was very badly
broken - it contains at least two errors and one major logic fail in
seven lines of shell script, which is kind of impressive, in an odd way.
It tried to make 'hostonly' dracut builds only install the thinp tools
if a thinp was actually present on the system...but the check did not
work, so it just wound up never installing them. And it neglected to set
up the conditional so that 'generic' builds always included the tools,
instead they *never* get the tools.

Ultimately, the effect of the patch was that no initramfs dracut
generated could possibly have the thinp tools in it. If they'd only
managed to break _either_ hostonly _or_ generic mode then installs would
at least have _one_ kernel which would boot, but since they managed to
break both modes with one patch, neither kernel you have after a fresh
install works.

The change was utterly unnecessary and should never have been sent near
a release in a final freeze in the first place, and on top of that, it
appears to have been committed to upstream git on the _same day_ they
sent the build downstream for F20 final, 12-05:

https://git.kernel.org/cgit/boot/dracut/dracut.git/commit/?id=c21c4dc2b469107ac35d8c1157f245965fd55292

so it seems very likely it had no upstream review or testing at the time
it was sent down to Fedora. Needless to say, I'm not exactly happy with
this, especially since one of the blocker bugs we pulled the new dracut
to fix in the _first_ place -
https://bugzilla.redhat.com/show_bug.cgi?id=881624 - was caused by a
very similar logic error in upstream dracut.

I'd suggest to the dracut developers that they strengthen their
development practices and checks. Right now they seem to be developing
with the mindset "things should only be in the initramfs if we're
absolutely sure they should be": they should precisely reverse that
mindset, and make it "we should only include code that introduces the
possibility that something will be left out of an initramfs if we are
VERY sure that code is correct, and every single change of this kind
should be specifically vetted to ensure that generic builds are not
being incorrectly conditionalized or broken."
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net



More information about the test mailing list