When a Child Aggregator node is started in a Version 5 MemSQL cluster, it runs for a brief moment and then shuts down. The last line printed to the Child Aggregator's memsql.log file is:
FATAL: Unable to obtain aggregator id from master aggregator at 10.144.201.20:3306 with error 2005: Timed out reading from socket after 10 seconds
To troubleshoot, check the position of the sharding database on each node in the cluster.
First, generate a cluster report using `memsql-ops report`.
From the report directory, the following command detects a sharding database that has fallen behind. More than one line of output (that is, more than one distinct Commits value) indicates an issue:
cluster-report-20180620T043709$ find . -name memsql_info.json | xargs cat | jq '.show_databases_extended | select( ."Database"| contains("sharding"))' | grep Commits | sort | uniq
Then find the node whose sharding database is behind by searching for the Commits value that differs from the others (4057 in this example):
cluster-report-20180620T043709$ find . -name memsql_info.json | xargs grep '"Commits": 4057'
./agent-A8e96af-10.144.201.56-9000-follower-child/memsql-D341F4B-child-3306/memsql_info.json: "Commits": 4057,
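The two lookups above can also be done in one pass. What follows is a minimal sketch, not part of MemSQL Ops: the function name `sharding_commits_by_node` is hypothetical, it assumes `jq` is installed, and it reuses the jq filter from the earlier command unchanged, so it relies on the same report layout that command does. It prints each sharding Commits line next to the memsql_info.json file it came from, sorted so that a node with a different value stands out.

```shell
# Hypothetical helper (not a MemSQL Ops command): list the sharding database's
# "Commits" line for every memsql_info.json under a cluster report directory,
# followed by the path of the file it came from.
sharding_commits_by_node() {
    report_dir="${1:-.}"   # defaults to the current directory
    find "$report_dir" -name memsql_info.json | while IFS= read -r info_file; do
        # Same jq filter as the command above, applied to one file at a time
        # so the file path can be printed alongside the value.
        jq '.show_databases_extended | select(."Database" | contains("sharding"))' "$info_file" \
            | grep Commits \
            | while read -r commits_line; do
                printf '%s\t%s\n' "$commits_line" "$info_file"
            done
    done | sort   # groups identical Commits values together
}
```

Run from inside the report directory as `sharding_commits_by_node .`; a node whose Commits value differs from the rest appears on its own line, with its path identifying the host and port.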
Restart the MemSQL node whose sharding database is behind. In this example, restart the Child Aggregator at 10.144.201.56.