arthurdejong / nss-pam-ldapd

NSS and PAM modules for lookups using LDAP

Home Page: https://arthurdejong.org/nss-pam-ldapd/

nslcd not switching back to primary LDAP server

FrankyGT opened this issue

I've already created a ticket for this at Red Hat, but I think this is a better place to report it.

We use nslcd to connect to our LDAP slaves, and we run into the following issue:

On our clients we have configured 4 LDAP slaves in the nslcd config file. The first two are the main ones, and the other two should only be used as fallback. What happens after rebooting the first two servers is that nslcd connects only to the fallback servers and never returns to the two main servers. The man page also describes the concept of "fallback servers":

          This option may be specified multiple times and/or with more URIs on the line, separated by space. Normally, only the first server will be used with the following servers as fall-back (see bind_timelimit below).

In our case, with 10,000 clients, we normally perform maintenance sequentially: we update server1, which moves all clients to server2; the next day we update server2, which moves all clients to server3; and so on. We end up with a single server serving all 10,000 clients, which overloads it and causes all kinds of nasty issues.
What I expect is that when the connection idle timeout (idle_timelimit) kicks in and the current TCP session is closed, the first server in the list is retried on the next connection instead of the last one used.
I looked into the nslcd code, and I think this issue is present in all versions of nslcd. A simple fix would be to add "session->current_uri = 0" on line 1065 of myldap.c.
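
To make the idea concrete, here is a minimal, self-contained sketch of the behaviour I have in mind. This is not the actual myldap.c code: the struct, the function names (sketch_connect, sketch_idle_close) and the simulated "reachable" flags are all made up for illustration, and only the current_uri field name is taken from nslcd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_URIS 4

/* Hypothetical, simplified stand-in for nslcd's session state. */
struct sketch_session {
  const char *uris[NUM_URIS]; /* servers in configured order */
  bool reachable[NUM_URIS];   /* simulated server availability */
  int current_uri;            /* index of the URI to try first */
  bool connected;             /* whether a connection is open */
};

/* Try servers starting at current_uri, going through the list once
   (wrapping around). Returns the index of the server that accepted
   the connection, or -1 if none is reachable. */
int sketch_connect(struct sketch_session *s)
{
  assert(s != NULL);
  int start = s->current_uri;
  for (int i = 0; i < NUM_URIS; i++) {
    int idx = (start + i) % NUM_URIS;
    if (s->reachable[idx]) {
      s->current_uri = idx; /* remember the server that worked */
      s->connected = true;
      return idx;
    }
  }
  s->connected = false;
  return -1;
}

/* Close an idle connection. With reset_to_first set, the next connect
   starts again at the first configured server (the proposed change);
   without it, the last used server is tried first (current behaviour
   as I understand it). */
void sketch_idle_close(struct sketch_session *s, bool reset_to_first)
{
  assert(s != NULL);
  s->connected = false;
  if (reset_to_first)
    s->current_uri = 0;
}
```

In this sketch, if the first server is down during the initial connect and comes back later, calling sketch_idle_close with reset_to_first set to false keeps the client on the second server, while setting it to true makes the next sketch_connect return to the first server.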

Hi @FrankyGT,

Thanks for your ticket and for providing a solution. Ideally we would have some kind of concept of primary and secondary servers, but the config is a bit too limited for that. With such a concept, nslcd would round-robin over the primary servers and only use the secondary servers if no primary server was available, but nslcd has indeed never supported that.

I think your change is useful, even though it only works if idle_timelimit is defined in the config (by default it is disabled). I've merged it as 6d5a2eb. I will try to get around to making another 0.9 release of nss-pam-ldapd sometime soon.
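
For reference, a config along these lines should benefit from the change. This is only a sketch: the host names are placeholders and the timeout value of 300 seconds is an arbitrary example, not a recommendation.

```
# /etc/nslcd.conf (fragment, hypothetical host names)
# Servers are tried in the order listed; the first two are the
# intended main servers, the last two the fallbacks.
uri ldap://ldap1.example.com/
uri ldap://ldap2.example.com/
uri ldap://ldap3.example.com/
uri ldap://ldap4.example.com/

# Close connections that have been idle for this many seconds.
# Disabled by default; the reset to the first server only happens
# when this is set.
idle_timelimit 300
```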

Thanks.