On Thu, Jan 28, 2016 at 08:51:29PM +0100, Bolke de Bruin wrote:
> On 28 Jan 2016, at 20:42, James Ralston <ralston(a)pobox.com> wrote:
>
> On Thu, Jan 28, 2016 at 8:18 AM, Bolke de Bruin <bdbruin(a)gmail.com> wrote:
>
>> As mentioned in another thread one of the Hadoop components (Ranger)
>> syncs all users and groups (including GIDs) on a regular basis to
>> provide authorization.
>
> Unfortunately, that is the problem. :-(
>
> Apache Ranger assumes that the back-end database for the passwd/group
> services is capable of enumeration. That is true for the "files"
> database, but is not guaranteed to be true for other databases.
>
> More simply put: there is no guarantee that getpwent()/getgrent() will
> enumerate all users/groups (respectively) known to the passwd/group
> services.
>
> At our site, we have a team that uses Hadoop, and they encountered
> this issue when we first deployed sssd. Their work-around was to
> manually create local passwd/group entries for the users/groups they
> wanted to be visible within Hadoop. That worked for them, because
> their Hadoop cluster was for only a handful of users, but that
> solution isn't going to work for a production Hadoop cluster of any
> significant size.
>
> I asked the developers on our Hadoop team to file a bug against Apache
> Ranger, but I don't know if they ever did.
Ranger is actually even worse. It currently reads /etc/passwd and /etc/group
directly, so it bypasses NSS entirely. I have a patch in the works that
addresses this by using getent instead.
Moreover, I am adding some config parameters that allow syncing/enumerating
specific groups that Ranger otherwise doesn't see. It might help your guys in
the future.
I still think Ranger is a load of crap, though: enumerating all users when
there are over 50,000 in our corporate directory is no fun. I am just trying to
make it a little more manageable.
A possible workaround for broken software like this is to include a
filter in the ldap_search_base option so that it matches only the
required users (perhaps keyed on a special attribute on the server, or
on a group you can easily match via memberOf). sssd would then only
see the entries matching the filter, and enumeration could be bearable.
(It still wouldn't include features like id views, though.)
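
A minimal sketch of such a filtered search base, using the
search_base?scope?filter syntax from sssd-ldap(5). The domain name, server
URI, DNs, and the "hadoop-users" group are made up for illustration, and it
assumes the LDAP server actually populates memberOf on user entries:

```ini
# Hypothetical domain section; every name and DN below is a placeholder.
[domain/example.com]
id_provider = ldap
ldap_uri = ldap://ldap.example.com

# Enumeration is what Ranger relies on; it is off by default.
enumerate = true

# Only users that are members of one specific group are visible to sssd.
ldap_user_search_base = ou=people,dc=example,dc=com?subtree?(memberOf=cn=hadoop-users,ou=groups,dc=example,dc=com)

# Only groups whose name matches the (assumed) hadoop-* naming scheme.
ldap_group_search_base = ou=groups,dc=example,dc=com?subtree?(cn=hadoop-*)
```

The same syntax should also work on ldap_search_base itself; splitting it
into the user and group variants just lets each get its own filter.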