On Tue, 29 May 2018, Stephen Gallagher wrote:
On Fri, May 25, 2018 at 10:26 PM Alexander Bokovoy <abbra(a)fedoraproject.org> wrote:
> > On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
> >
> > To be totally honest, I did not get this message either, Alex - my
> > understanding was that once we finally got all the intended packages
> > landing and the automated tests worked, you actually thought FreeIPA
> > was in acceptable shape for real use. If it was known that it was not,
> > we absolutely ought to have communicated this *far* more widely than on
> > a niche mailing list: FreeIPA is supposed to be a key feature of Fedora
> > Server which is itself a key edition of Fedora. This should have been
> > up-front in the release notes and the release announcement, or frankly,
> > should've caused us to rethink the release plans.
> I did say that several times, but the only real answer I got was: "these
> issues are not blocking criteria for Fedora Server". At some point you
> choose your own fights: fixing software or fixing release criteria. For
> Fedora 29 I'd like us to extend Fedora Server blocking criteria, now that
> the majority of porting has been completed.
>
>
To be clear, what I understood prior to F28 release was that the creation
of new FreeIPA Replicas was not working. Had I realized that the problem
was more in-depth, I absolutely would have hit the big red button on the
release. I may have been misunderstanding what you were telling me, but my
impression of the issues I heard from you was that it did *not* in fact
affect the set of things we were treating as blocking.
(For a bit of a history lesson, at the time when we first started shipping
a separate Server Edition, we expressly did not include replicas as a
blocking feature because we were trying to encourage people to use
RHEL/CentOS for the sort of environments where replicas would be required.
Also, replicas were MUCH harder to set up in those days than they are
today. We absolutely should have this on the blocking criteria for Fedora
29).
> For us, the push for Python 3 migration (we have to migrate the whole
> Python code base, not a selected module here or there), the NSS to OpenSSL
> migration, the mod_nss to mod_ssl migration, the NSS default database
> format migration, Apache's disregard for its ecosystem (ABI changes in
> mod_proxy in minor versions), and modularity inconsistency through the
> course of 2017 have killed a lot of productive time.
>
> > Obviously there was some sort of significant communication fail if
> > enough people missed the message that this got totally whiffed on, so
> > we should absolutely figure out what we can do better there.
> >
> > Perhaps this also suggests our existing release criteria and test cases
> > for FreeIPA are insufficient: if it can pass our existing tests and
> > thus appear to meet our existing criteria, yet be in your judgment "not
> > ready for production", that seems fundamentally wrong. How do you
> > think we could address that? Can you give some kind of summary of the
> > issues here, which we can use to think about how to extend the test
> > cases and criteria?
> The issues were listed in the email referenced by Jonathan already.
>
> - Replication failures alone should have been a blocker (they are for the
> FreeIPA team) but the Fedora Server criteria do not include them.
>
>
Yeah, see above. That was due to a historical decision that is no longer
appropriate as well as a misunderstanding on my part about the severity of
the problem; I honestly did not know that the problems extended to existing
replicas.
> - Broken NSS sqldb defaults caused us several months working on fixes. The
> latest one, https://bodhi.fedoraproject.org/updates/FEDORA-2018-8cf042000b,
> was only pushed after Fedora 28 release.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1568271 was found in late
> April, after we did fight all the previous issues. We started with NSS
> sqldb adaptation in October 2017.
>
>
This might be fodder for a separate thread, but has the FreeIPA team
considered dropping NSS as a crypto library entirely? It really seems that
the NSS upstream cares only about Firefox and is perfectly happy to break
all other consumers whenever they feel like it.
Yes, and we spent more than six months clearing the fallout. It is not
completed yet and will not be completed any time soon because openssl is
not a better crypto library, just a different beast.
Namely, it has issues with HSM support that prevent Dogtag from moving
away from NSS. 389-ds also uses NSS for its server-side operations,
implementing a hybrid mode where openldap libraries (compiled against
openssl) are fed with certificates extracted from NSS database at
runtime. There are numerous other issues in this migration path.
> - Only on Thursday this week did we finally track down a nasty
> python-ldap bug that crashed the FreeIPA framework every time the --all
> option was used on a host or service entry with additional access controls
> defined. This is not part of Fedora Server criteria but kills FreeIPA use
> with delegated permissions to retrieve Kerberos credentials.
>
>
We do need to add HBAC rules to the criteria as well (and I thought we did
have at least minimal testing for this), but I suspect this would *not*
have risen to the status of blocker, though it probably would have been a
lively conversation at the blocker bug meetings.
HBAC rules are the domain of SSSD. If that fails, existing OpenQA tests for
domain clients will notice it even with a default allow_all rule.
> - We had to do a lot of Python 3 porting work for other projects. Time is
> not unlimited, especially when it comes to releases and blocking criteria.
>
>
This is one place where I think the FreeIPA team needed to be more
proactive. Presumably, this work was known about well before Final Freeze.
Given FreeIPA's critical place in the Server Edition, it would have been
grounds for approaching the Server WG and FESCo about an adjustment to the
Fedora Schedule.
We've been saying that the existing schedule is unrealistic for at least 3-4
Fedora releases now. I don't think it is productive to ask for an extension
every time. Let's be clear: Fedora sets unrealistic goals to make it
possible to move forward over multiple releases. It is just unrealistic to
tackle them within the same release for such a complex infrastructure
arrangement as the one we deal with.
However, I'm very grateful to the Python team at Red Hat, who helped us
enormously over the past two years with Python 3 migrations in a number of
key components. I'm not talking about pure Python code, as in the majority
of cases Fedora has faced. Samba has ~200K lines of generated C Python
extension code that needs to be supported with both Python 2 and Python
3 at the same time.
> - Dogtag had to work on Tomcat 8.5 adaptation where an existing API it
> depended on was removed.
>
>
This is the sort of place where I think that modularity can help in the
future. Tomcat regularly breaks backwards-compatibility and I think we in
Fedora need to have a way to keep the known-working versions in the
distribution, even if it is non-default.
I don't think modularity could help here. Well, maybe with tomcat, but
it will not help with NSS and other low-level libraries.
We also had to help the Dogtag guys, who were heads down in Common Criteria
work for about a year. Python 3 migration for their installer came out
of this work in March. Without a Python 3-enabled Dogtag installer we
would not have been able to get rid of Python 2 in Fedora 28 at all.
> So on May 15th we released
> https://www.freeipa.org/page/Releases/4.6.90.pre2 which is now in Fedora
> 28 stable updates. We consider it one of the closer candidates to being
> stable. Between 4.6.90.pre1 and pre2 are two months of hard work across
> several sizeable projects (freeipa, sssd, 389-ds, MIT Kerberos, dogtag,
> nss, authselect, gssproxy, to name a few).
>
> We have a testing setup at FreeIPA upstream that allows us to test complex
> topologies. Only recently we were able to move to Fedora 28 testing there
> as we had issues with our components. There we test also what OpenQA is
> unable to test so far. I think 4.6.90.pre2 is in much better shape than
> what Fedora 28 had released. However, if we were to get it as a blocking
> release, Fedora 28 would have been delayed by at least a month.
>
> As I said, we had no choice: a push of NSS sqldb defaults change forced us
> to work on both nss-related code and openssl migration at the same time. It
> made it impossible to keep FreeIPA from Fedora 27 and do our work in a
> separate module. This was known since autumn 2017 and was a well voiced
> situation.
>
>
It may have been known since autumn 2017, but it was not sufficiently
voiced. As I said, I failed to understand the degree of trouble that
FreeIPA was in. I suspect some of that was communication fatigue with
failing to get me to understand, but your statement above that you
basically abandoned trying to get me to understand isn't a good outcome
either.
If nothing else, it might have been prudent to find another person to speak
to (or a different person to speak *for* you who might have more success).
Or perhaps to have proposed a voice conversation where at least I
could have heard the urgency in your voice that was apparently missing from
my reading of your IRC communiques.
No, this is not about you or me, Stephen, or anyone else specifically.
I think it is just a general issue with mandate-driven releases -- it
does not work when a committee issuing a mandate has no involvement in
the actual implementation. To me it looks like the Fedora Server SIG is
generally not interested in identity management area development (as
opposed to being able to consume the resulting features admin-wise) so we are
left with our own effort. It is the same with my interest in other areas
though, so people who do AI/ML stuff couldn't care less about taking my
advice either. ;)
--
/ Alexander Bokovoy