Verifying the cluster, version 3.2 and later

Baseline setup

Setup will contain of two or more servers, with separate cluster for PhenixID Server Service and PhenixID Database service.
Servers running PhenixID Server should all point to database servers in the same order. Meaning that the parameter "db_url", in boot.json, is the same on all machines. Even though there might be a database service running locally. With this approach, all PhenixID Servers will write to the same database, that is then replicated to the other nodes of the database cluster.
The Database service should be started first, so databases can be reached when starting the PhenixID Server service.

The log file in use for PhenixID Server is /logs/server.log.
For the database, its /logs/orient-server.log.0.

Startup of PhenixID Database service

When starting the first node in the database cluster, the end of the log should look like below.
Showing that the service has been started correctly and that status for both the node and the database is "ONLINE".
As seen, only one node is started and part of the cluster so far.

2020-04-16 09:15:56:867 INFO  [CLUSTER1] Current node started as MASTER for database 'phenixid' [OHazelcastPlugin$6]
2020-04-16 09:15:57:711 INFO  Storage 'plocal:C:/Program Files/PhenixID/Database/databases/phenixid' is opened under OrientDB distribution : 2.2.37 (build a7541e7ceeabf592dd9a7b2928b6c023cbc73193, branch 2.2.x) [OLocalPaginatedStorage]
2020-04-16 09:15:57:742 INFO  [CLUSTER1] Reassigning ownership of clusters for database phenixid... [OHazelcastPlugin]
2020-04-16 09:15:57:742 INFO  [CLUSTER1] Reassignment of clusters for database 'phenixid' completed (classes=15) [OHazelcastPlugin]
2020-04-16 09:15:57:758 INFO  [CLUSTER1] Publishing ONLINE status for database CLUSTER1.phenixid... [ODistributedDatabaseImpl]
2020-04-16 09:15:57:758 INFO  [CLUSTER1] Updated node status to 'ONLINE' [OHazelcastPlugin]
2020-04-16 09:15:57:789 INFO  [CLUSTER1] Distributed servers status (*=current @=lockManager[CLUSTER1]):

+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
|Name                 |Status|Databases               |Conns|StartedOn|Binary          |HTTP          |UsedMemory            |
+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
|CLUSTER1(*)(@)       |ONLINE|phenixid=ONLINE (MASTER)|0    |09:15:51 |172.31.1.60:2424|127.0.0.1:2480|73.20MB/3.56GB (2.01%)|
+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
 [OHazelcastPlugin]
2020-04-16 09:15:57:805 WARNI Authenticated clients can execute any kind of code into the server by using the following allowed languages: [sql] [OServerSideScriptInterpreter]
2020-04-16 09:15:57:805 INFO  OrientDB Studio available at http://127.0.0.1:2480/studio/index.html [OServer]
2020-04-16 09:15:57:805 INFO  OrientDB Server is active v2.2.37 (build a7541e7ceeabf592dd9a7b2928b6c023cbc73193, branch 2.2.x). [OServer]
Click to copy

When starting the second node of the database cluster, node1 will have the following log entries.
Showing that the two nodes are now both ONLINE and the phenixid database as well.

[ODistributedStorage]
2020-04-16 09:18:04:543 INFO  [CLUSTER1] Received new status CLUSTER2.phenixid=SYNCHRONIZING [OHazelcastPlugin]
2020-04-16 09:18:04:543 INFO  [CLUSTER1]->[CLUSTER2] Deploying database 'phenixid' with delta of changes... [OSyncDatabaseDeltaTask]
2020-04-16 09:18:04:543 INFO  [CLUSTER1]->[CLUSTER2] Creating delta backup of database 'phenixid' (startLSN=LSN{segment=27, position=94250}) in directory: C:\Windows\TEMP\orientdb\backup_CLUSTER2_phenixid_server0.zip... [OSyncDatabaseDeltaTask]
2020-04-16 09:18:04:558 INFO  Exporting records after LSN=LSN{segment=27, position=94250}. Found 0 records [OLocalPaginatedStorage]
2020-04-16 09:18:04:558 INFO  [CLUSTER1]->[CLUSTER2] Delta backup of database 'phenixid' completed. range=LSN{segment=27, position=94250}-LSN{segment=28, position=52} [OSyncDatabaseDeltaTask]
2020-04-16 09:18:04:558 INFO  [CLUSTER1]->[CLUSTER2] Deploy delta database task completed [OSyncDatabaseDeltaTask]
2020-04-16 09:18:04:574 INFO  [CLUSTER1]->[CLUSTER2] - transferring chunk #1 offset=0 size=8... [OSyncDatabaseDeltaTask]
2020-04-16 09:18:04:761 INFO  [CLUSTER1]<-[CLUSTER2] Received updated status CLUSTER2.phenixid=ONLINE [OHazelcastPlugin]
2020-04-16 09:18:04:855 INFO  [CLUSTER1] Distributed servers status (*=current @=lockManager[CLUSTER1]):

+---------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
|Name                 |Status|Databases               |Conns|StartedOn|Binary           |HTTP          |UsedMemory            |
+---------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
|CLUSTER2             |ONLINE|phenixid=ONLINE (MASTER)|4    |09:17:51 |172.31.6.250:2424|127.0.0.1:2480|51.13MB/3.56GB (1.40%)|
|CLUSTER1(*)(@)       |ONLINE|phenixid=ONLINE (MASTER)|0    |09:15:51 |172.31.1.60:2424 |127.0.0.1:2480|82.42MB/3.56GB (2.26%)|
+---------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
Click to copy

Node2 will show similar log entries when starting up and joining the cluster:

[ODistributedStorage]
2020-04-16 09:18:03:554 INFO  [CLUSTER2] Current node is a MASTER for database 'phenixid' [ODistributedAbstractPlugin$3]
2020-04-16 09:18:03:617 WARNI [CLUSTER2]->[[CLUSTER1]] requesting delta database sync for 'phenixid' on local server... [OHazelcastPlugin]
2020-04-16 09:18:04:179 INFO  Storage 'plocal:C:/Program Files/PhenixID/Database/databases/phenixid' is opened under OrientDB distribution : 2.2.37 (build a7541e7ceeabf592dd9a7b2928b6c023cbc73193, branch 2.2.x) [OLocalPaginatedStorage]
2020-04-16 09:18:04:445 INFO  Found 3 records changed in last 6 operations [OLocalPaginatedStorage]
2020-04-16 09:18:04:460 INFO  [CLUSTER2] Executing the realignment of the last records modified before last close [#22:107498, #22:107499, #22:107500]... [OHazelcastPlugin]
2020-04-16 09:18:04:538 INFO  [CLUSTER2] Realignment completed. [OHazelcastPlugin]
2020-04-16 09:18:04:538 INFO  [CLUSTER2]->[CLUSTER1] Requesting database delta sync for 'phenixid' LSN=LSN{segment=27, position=94250}... [OHazelcastPlugin]
2020-04-16 09:18:04:570 INFO  [CLUSTER2]<-[CLUSTER1] Copying remote database 'phenixid' to: C:/Windows/TEMP//orientdb/install_phenixid_server1.zip [OHazelcastPlugin]
2020-04-16 09:18:04:570 INFO  [CLUSTER2]<-[CLUSTER1] Installing database 'phenixid' to: C:/Program Files/PhenixID/Database/databases/phenixid... [OHazelcastPlugin]
2020-04-16 09:18:04:585 INFO  [CLUSTER2]<-[CLUSTER1] Started import of delta for database 'phenixid' [OIncrementalServerSync$1]
2020-04-16 09:18:04:585 INFO  [CLUSTER2] - writing chunk #1 offset=0 size=8b [OHazelcastPlugin]
2020-04-16 09:18:04:585 INFO  [CLUSTER2] Database copied correctly, size=8b [ODistributedAbstractPlugin$5]
2020-04-16 09:18:04:710 INFO  [CLUSTER2]<-[CLUSTER1] Installed database delta for 'phenixid'. 0 total entries: 0 created, 0 updated, 0 deleted, 0 holes, 0 skipped [OIncrementalServerSync$1]
2020-04-16 09:18:04:741 INFO  [CLUSTER2] Installed database 'phenixid' (LSN=LSN{segment=28, position=52}) [OHazelcastPlugin]
2020-04-16 09:18:04:757 INFO  [CLUSTER2] Publishing ONLINE status for database CLUSTER2.phenixid... [ODistributedDatabaseImpl]
2020-04-16 09:18:04:804 INFO  [CLUSTER2] Reassigning ownership of clusters for database phenixid... [OHazelcastPlugin]
2020-04-16 09:18:04:804 INFO  [CLUSTER2] Reassignment of clusters for database 'phenixid' completed (classes=15) [OHazelcastPlugin]
2020-04-16 09:18:04:804 INFO  [CLUSTER2]<-[CLUSTER1] Installed delta of database 'phenixid' [OHazelcastPlugin]
2020-04-16 09:18:04:820 INFO  [CLUSTER2] Updated node status to 'ONLINE' [OHazelcastPlugin]
2020-04-16 09:18:04:851 INFO  [CLUSTER2] Distributed servers status (*=current @=lockManager[CLUSTER1]):

+------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
|Name              |Status|Databases               |Conns|StartedOn|Binary           |HTTP          |UsedMemory            |
+------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
|CLUSTER2(*)       |ONLINE|phenixid=ONLINE (MASTER)|4    |09:17:51 |172.31.6.250:2424|127.0.0.1:2480|51.13MB/3.56GB (1.40%)|
|CLUSTER1(@)       |ONLINE|phenixid=ONLINE (MASTER)|0    |09:15:51 |172.31.1.60:2424 |127.0.0.1:2480|82.42MB/3.56GB (2.26%)|
+------------------+------+------------------------+-----+---------+-----------------+--------------+----------------------+
 [OHazelcastPlugin]
2020-04-16 09:18:04:851 WARNI Authenticated clients can execute any kind of code into the server by using the following allowed languages: [sql] [OServerSideScriptInterpreter]
2020-04-16 09:18:04:851 INFO  OrientDB Studio available at http://127.0.0.1:2480/studio/index.html [OServer]
2020-04-16 09:18:04:851 INFO  OrientDB Server is active v2.2.37 (build a7541e7ceeabf592dd9a7b2928b6c023cbc73193, branch 2.2.x). [OServer]
Click to copy

When one of the nodes in the database cluster is stopped or restarted, this will be seen in the log:

[OHazelcastPlugin]
2020-04-16 09:32:12:656 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Shutdown request of [172.31.6.250]:2434 is handled [MigrationManager]
2020-04-16 09:32:12:672 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Re-partitioning cluster data... Migration queue size: 135 [MigrationManager]
2020-04-16 09:32:12:984 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Connection[id=1, /172.31.1.60:2434->/172.31.6.250:56641, endpoint=[172.31.6.250]:2434, alive=false, type=MEMBER] closed. Reason: Removing member [172.31.6.250]:2434, uuid: 66226d4b-afc4-4479-b7f4-5f4a6ac8c45c, requested by: [172.31.6.250]:2434 [TcpIpConnection]
2020-04-16 09:32:12:984 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Removing Member [172.31.6.250]:2434 - 66226d4b-afc4-4479-b7f4-5f4a6ac8c45c [ClusterService]
2020-04-16 09:32:12:984 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] 

Members [1] {
	Member [172.31.1.60]:2434 - 3d31c545-c1ab-400b-8fb2-822439fa3eb0 this
}
 [ClusterService]
2020-04-16 09:32:12:984 INFO  [CLUSTER1] Distributed transaction: rolled back 0 transactions and 0 single locks in database 'phenixid' owned by server 'CLUSTER2' [ODistributedDatabaseImpl]
2020-04-16 09:32:12:984 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Partition balance is ok, no need to re-partition cluster data...  [MigrationManager]
2020-04-16 09:32:12:984 INFO  [CLUSTER1] Removing server 'CLUSTER2' from all the databases (removeOnlyDynamicServers=true)... [OHazelcastPlugin]
2020-04-16 09:32:12:984 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] Committing/rolling-back alive transactions of Member [172.31.6.250]:2434 - 66226d4b-afc4-4479-b7f4-5f4a6ac8c45c, UUID: 66226d4b-afc4-4479-b7f4-5f4a6ac8c45c [TransactionManagerService]
2020-04-16 09:32:12:984 INFO  [CLUSTER1] Cannot remove server 'CLUSTER2' because it is enlisted in 'servers' of the distributed configuration for database 'phenixid' [OHazelcastPlugin]
2020-04-16 09:32:12:984 WARNI [CLUSTER1] Node removed id=Member [172.31.6.250]:2434 - 66226d4b-afc4-4479-b7f4-5f4a6ac8c45c name=CLUSTER2 [OHazelcastPlugin]
2020-04-16 09:32:12:984 INFO  [CLUSTER1] Reassigning ownership of clusters for database phenixid... [OHazelcastPlugin]
2020-04-16 09:32:13:000 INFO  [CLUSTER1] Reassignment of clusters for database 'phenixid' completed (classes=15) [OHazelcastPlugin]
2020-04-16 09:32:14:000 INFO  [172.31.1.60]:2434 [orientdb] [3.8.4] All migration tasks have been completed, queues are empty. [MigrationThread]
2020-04-16 09:32:18:547 INFO  [CLUSTER1] Distributed servers status (*=current @=lockManager[CLUSTER1]):

+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
|Name                 |Status|Databases               |Conns|StartedOn|Binary          |HTTP          |UsedMemory            |
+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
|CLUSTER1(*)(@)       |ONLINE|phenixid=ONLINE (MASTER)|7    |09:15:51 |172.31.1.60:2424|127.0.0.1:2480|91.87MB/3.56GB (2.52%)|
+---------------------+------+------------------------+-----+---------+----------------+--------------+----------------------+
 [OHazelcastPlugin]
Click to copy

Startup of PhenixID Server service

When starting the first node in the PhenixID Server cluster it should report the following:

2020-04-16 08:49:55,123 [NodeManager]  INFO: 11 module(s) successfully deployed on node 'PhenixID_Node1'
2020-04-16 08:49:55,685 [NodeVerticle]  INFO: Node 'PhenixID_Node1' started successfully
Click to copy

As well as an event, that can be seen in /logs/events.log and also in the configuration UI:

2020-04-16 08:50:10,702 [EVENT]  INFO: Apr 16 2020 08:50:10.702 CEST PhenixID1 CEF:0|PhenixID_Node1|PAS|3.2.0|EVT_000003|Server started|2|msg=Some<space>operations<space>may<space>still<space>be<space>pending
Click to copy

When node2 is started, the following will be seen in server.log on node1, after the nodes has finished communicating and agreed in the cluster:

2020-04-16 09:24:12,339 [ClusterConnector]  INFO: Node '/172.31.6.250' joined the cluster
Click to copy

And when one of the nodes is stopped or restarted, this will be seen in the log:

2020-04-16 09:29:02,504 [TcpIpConnectionMonitor]  WARN: [172.31.1.60]:5701 [phenixid] [3.8.4] Removing connection to endpoint [172.31.6.250]:5701 Cause => java.net.SocketException {Connection refused: connect to address /172.31.6.250:5701}, Error-Count: 5
2020-04-16 09:29:02,520 [ClusterConnector]  WARN: Node '/172.31.6.250' left the cluster
Click to copy

The database service should always be started before the PhenixID Server. If that is not the case, messages like this will occur in server.log:

2020-04-16 09:12:50,487 [EventHandler]  ERROR: Cannot create a connection to remote server address(es): [172.31.1.60:2424, 172.31.6.250:2424]
	DB name="phenixid"
Click to copy