Operating the cluster

When running PhenixID Authentication Services (PAS) in a clustered configuration, it is important that services, nodes and external dependencies are operated, maintained and monitored closely. In general, PAS has a central role in the users' day-to-day work and should be handled accordingly.

Communication routes

Ensure that the required ports are open at all times and are not blocked by firewalls. Latency between cluster nodes should be kept as low as possible. Since PAS operates in an "eventually consistent" mode, the lower the latency, the better.

Short network glitches should not be a problem, but persistent interference in communications will cause significant performance problems and, in some cases, complete failure.
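
As a rough illustration of how reachability and connection latency between nodes could be checked, the sketch below attempts a TCP connection to each port on each node and records how long the connection took. The function names, host names and port numbers are hypothetical; substitute the addresses and ports used by your own cluster configuration.

```python
import socket
import time


def check_port(host, port, timeout=2.0):
    """Try a TCP connection; return (reachable, connect_time_ms)."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            return True, elapsed_ms
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False, None


def check_cluster(nodes, ports, timeout=2.0):
    """Check every port on every node and return one result dict per pair."""
    results = []
    for host in nodes:
        for port in ports:
            reachable, connect_ms = check_port(host, port, timeout)
            results.append(
                {
                    "host": host,
                    "port": port,
                    "reachable": reachable,
                    "connect_ms": connect_ms,
                }
            )
    return results
```

Calling `check_cluster(["node1.example.org", "node2.example.org"], [5701])` (placeholder hosts and port) from each node in turn gives a simple picture of which routes are open. The connect time is only a coarse indicator of latency, not a precise measurement.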

Synchronized time

It is absolutely essential that all nodes synchronize their clocks against the same time server(s).
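
A minimal sketch of how clock offsets could be compared across nodes is shown below. It assumes that the offset of each node relative to the time server has already been collected by some other means (for example from the output of the NTP client on each node). The function names and the 0.5 second tolerance are illustrative assumptions, not values defined by PAS.

```python
def nodes_out_of_sync(offsets, tolerance=0.5):
    """Return {node: offset} for nodes whose absolute offset (in seconds)
    exceeds the tolerance. The default tolerance is an arbitrary example."""
    return {
        node: offset
        for node, offset in offsets.items()
        if abs(offset) > tolerance
    }


def max_skew(offsets):
    """Return the largest difference between any two node offsets, in seconds."""
    values = list(offsets.values())
    if not values:
        return 0.0
    return max(values) - min(values)
```

Any node returned by `nodes_out_of_sync` is a candidate for investigation before it causes problems in the cluster.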

Restarting of services/nodes

Services or entire nodes should NOT be restarted in an unmonitored manner. When nodes need to be restarted, for example due to OS updates, ensure that each node has had time to reconnect to the cluster before restarting other nodes.
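
The one-node-at-a-time procedure described above could be sketched as follows. The `restart` and `has_rejoined` callables are hypothetical hooks that you would implement for your own environment (for example by invoking your service manager and by checking the cluster membership reported in the logs); they are not part of any PAS API. The timeout and polling interval are example values.

```python
import time


def rolling_restart(
    nodes,
    restart,
    has_rejoined,
    rejoin_timeout=300.0,
    poll_interval=5.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Restart nodes one at a time, waiting for each to rejoin the cluster.

    Returns the list of nodes that were restarted and rejoined.
    Raises TimeoutError and stops if a node does not rejoin in time,
    so that no further nodes are taken down.
    """
    done = []
    for node in nodes:
        restart(node)
        deadline = clock() + rejoin_timeout
        while not has_rejoined(node):
            if clock() >= deadline:
                raise TimeoutError(
                    f"{node} did not rejoin within {rejoin_timeout}s; "
                    f"already restarted: {done}"
                )
            sleep(poll_interval)
        done.append(node)
    return done
```

Stopping at the first node that fails to rejoin is a deliberate choice here: continuing would risk having several nodes out of the cluster at the same time.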

Verify logs

Make it a daily routine to check the cluster logs, so that anomalies or misbehaviour are identified at an early stage.
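
Part of the daily check could be automated with a small script such as the sketch below, which counts and collects log lines that contain a warning or error level keyword. The keywords in `LEVEL_PATTERN` and the function name are assumptions; adjust them to match the log format actually produced by your installation.

```python
import re
from collections import Counter

# Assumed level keywords; adapt to the actual log format.
LEVEL_PATTERN = re.compile(r"\b(WARN|WARNING|ERROR|FATAL)\b")


def summarize_log(lines):
    """Return (counts per level, list of flagged lines) for an iterable of log lines."""
    counts = Counter()
    flagged = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
            flagged.append(line.rstrip("\n"))
    return counts, flagged
```

The function accepts any iterable of lines, so an open file object can be passed directly, for example `with open(log_path) as fh: counts, flagged = summarize_log(fh)`, where `log_path` is the location of a log file on the node. A script like this complements, but does not replace, reading the logs for unexpected behaviour.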