Troubleshooting inter-cluster replication issues
1. Check that every cluster participating in replication has a unique cluster name.
2. Check that the “replicationjournal” datasource on every Secondary cluster points only to the CJ/Primary cluster (including the failover datasources).
3. Run the following command on the CJ cluster:
$RLI_HOME/bin/advanced/monitoring.bat(.sh) -d cloud-replication
This fetches the properties of the HDAP stores under replication.
The command performs the following operations:
- The Primary cluster sends Encrypt(RJ password + auth type + SSHA) to the remote HTTP servlet on the Secondary cluster, taking the hostname from the "replica" in ou=replication,cn=config.
- The HTTP servlet on the Secondary cluster fetches its local Encrypt(RJ password + auth type + SSHA) and compares the two values.
- If they match, the HTTP response returns the cn=Directory Manager user and password, fetched from vds_server.conf of ZK, to the Primary cluster.
- The Primary cluster uses the returned DM user and password to perform an LDAP bind and a base search on HDAP to fetch the "vdssynccursor".
- NOTE: The HTTP port is 8089 or 8090, depending on whether SSL is checked in the VDSHA datasource on the Secondary cluster...
- If any part of this process fails, you will see this error: | errors | ["SYNC_CURSOR_CONNECT"] |
4. For datasources configured over a secure port, make sure the Client Certificate Truststore (Settings >> Client Certificate Truststore) on each cluster contains the other cluster's server certificate.
5. Check that the 'vdsha' datasource is configured with the correct credentials and ports. Inconsistencies here can cause a red arrow with a cross to appear in the Replication Monitoring tab.
6. Make sure that no firewall is blocking ports 8089, 8090, 9100, and 9101.
7. In Active-Active clusters, check that the “vdssynccursor” on each cluster's HDAP store is close to the “lastchangenumber” in the “cn=replicationjournal” store.
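Steps 6 and 7 can be partly scripted. The following is a minimal shell sketch, not a RadiantOne tool. The helper names (port_open, check_replication_ports, cursor_lag, cursor_is_close) are hypothetical, the host name in the usage comment is a placeholder, and the default threshold of 100 changes is an arbitrary assumption. It also assumes that bash and the coreutils timeout command are installed, and that the numeric change numbers have already been read from “vdssynccursor” and “lastchangenumber”.

```shell
# Sketch only: helper names, host names and the lag threshold are
# hypothetical placeholders, not values taken from the product.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Step 6: report each replication port on the given host as reachable or not.
check_replication_ports() {
  for port in 8089 8090 9100 9101; do
    if port_open "$1" "$port"; then
      echo "$1:$port reachable"
    else
      echo "$1:$port blocked or closed"
    fi
  done
}

# Step 7: print how far a store's vdssynccursor trails the journal's
# lastchangenumber. Assumes both arguments are plain integers.
cursor_lag() {
  echo $(( $1 - $2 ))
}

# Succeed when the lag is between 0 and the threshold
# (third argument, default 100, an arbitrary assumption).
cursor_is_close() {
  lag=$(cursor_lag "$1" "$2")
  [ "$lag" -ge 0 ] && [ "$lag" -le "${3:-100}" ]
}

# Usage (placeholder host and values):
#   check_replication_ports secondary.example.com
#   cursor_lag 1500 1495        # prints 5
#   cursor_is_close 1500 1495 && echo "cursor is close to the journal"
```

A port reported as blocked or closed only shows that the connection failed from the machine running the script, so run it from each cluster toward the other. What counts as "close" for the cursor depends on the write rate of the deployment, which is why the threshold is left as a parameter.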