The AGGREGATOR SET AS MASTER command is used to convert a child aggregator node into a master aggregator node. It is meant to be used under the guidance of MemSQL Support.
The AGGREGATOR SET AS MASTER is an offline command and requires stopping all traffic to the target MemSQL cluster.
Q: Why would I need to use the AGGREGATOR SET AS MASTER command?
A: If the current master aggregator is permanently offline (underlying host unrecoverable) OR if the current master aggregator is exhibiting anomalous behavior.
STEPS
- Stop the workload on the MemSQL cluster. This means disabling all query traffic to the aggregators in the cluster.
- Select target child aggregator to be converted to master. Check to make sure the child aggregator's sharding database is at the same exact position as the sharding database of the current master aggregator.
- Turn off aggregator_failure_detection on the target child aggregator.
- Confirm a master aggregator is not currently online in the cluster. If the master aggregator is online turn it off (memsql-ops memsql-stop)
- Run memsql>AGGREGATOR SET AS MASTER IDENTIFIED BY PASSWORD 'password_here'; on the target child aggregator. Note: This command will use the permission associated with the root@localhost account.
- Run memsql> SELECT @@master_aggregator;. The command should return 'self'
- Run memsql> SNAPSHOT DATABASE sharding;
- Turn on aggregator_failure_detection on the new master aggregator.
- Update the memsql.cnf file of the newly promoted Master Aggregator. Change line master_aggregator = <original-ma-ip-here> to master_aggregator
- Restart the new master aggregator
- Backup User Grants on All Aggregators
- Delete and Re-deploy all Child Aggregators in the cluster
Scenario: Promote a Child Aggregator to Master Aggregator
Consider the following cluster:
master-agg-and-leaf-ip-10-0-3-178 /home/admin $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version A741CD4 Acfe95c RUNNING CONNECTED MASTER 10.0.3.178 3306 5.8.0 6058F29 Aabbb60 RUNNING CONNECTED AGGREGATOR 10.0.2.243 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0
Step 1: Stop workload against the MemSQL Cluster
Step 2: Pick a Child Aggregator
Check the position of the sharding database on the master aggregator
memsql> show databases extended; +--------------------+---------+--------+--------+----------+---------+-------------+------------+--------------+ | Database | Commits | Role | State | Position | Details | AsyncSlaves | SyncSlaves | Memory (MBs) | +--------------------+---------+--------+--------+----------+---------+-------------+------------+--------------+ | information_schema | 66 | master | online | 0:177714 | | 0 | 0 | 0.00 | | memsql | 21 | master | online | 0:1644 | | 0 | 0 | 0.00 | | memsql_demo | 12 | master | online | 0:14779 | | 2 | 1 | 0.13 | | sharding | 23 | master | online | 0:1469 | | 0 | 3 | 0.00 | +--------------------+---------+--------+--------+----------+---------+-------------+------------+--------------+ 4 rows in set (0.01 sec)
Check the position of the sharding database on a child aggregator in the cluster
memsql> show databases extended; +--------------------+---------+------------+-------------+----------+---------+-------------+------------+--------------+ | Database | Commits | Role | State | Position | Details | AsyncSlaves | SyncSlaves | Memory (MBs) | +--------------------+---------+------------+-------------+----------+---------+-------------+------------+--------------+ | information_schema | 66 | master | online | 0:177714 | | 0 | 0 | 0.00 | | memsql | 20 | master | online | 0:1613 | | 0 | 0 | 0.00 | | memsql_demo | 12 | sync slave | replicating | 0:14779 | | 0 | 0 | 0.13 | | sharding | 23 | sync slave | replicating | 0:1469 | | 0 | 0 | 0.00 | +--------------------+---------+------------+-------------+----------+---------+-------------+------------+--------------+ 4 rows in set (0.00 sec)
Because both sharding databases are at the same position (0:1469) this child aggregator is a candidate to be promoted to a Master Aggregator.
Step 3: Turn Off aggregator_failure_detection on the target child aggregator
memsql> select @@aggregator_failure_detection; +--------------------------------+ | @@aggregator_failure_detection | +--------------------------------+ | ON | +--------------------------------+ 1 row in set (0.00 sec) memsql> SET GLOBAL aggregator_failure_detection = OFF; Query OK, 0 rows affected (0.00 sec) memsql> select @@aggregator_failure_detection; +--------------------------------+ | @@aggregator_failure_detection | +--------------------------------+ | OFF | +--------------------------------+ 1 row in set (0.00 sec)
Step 4: Confirm Master Aggregator is OFFLINE
master-agg-and-leaf-ip-10-0-3-178 /home/admin $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version A741CD4 Acfe95c RUNNING CONNECTED MASTER 10.0.3.178 3306 5.8.0 6058F29 Aabbb60 RUNNING CONNECTED AGGREGATOR 10.0.2.243 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0 master-agg-and-leaf-ip-10-0-3-178 /home/admin $ memsql-ops memsql-stop A741CD4 2017-06-21 18:43:28: J0cd3e3 [INFO] Stopping MemSQL node A741CD44ABDE3BBC9C0DE4FA30C8BBA2EF362DDD on Agent Acfe95c1c527a4610af53c0ca514f289a 2017-06-21 18:43:31: J0cd3e3 [INFO] Service script output: Stopping memsqld:.Waiting up to 60 seconds for a clean exit:. 2017-06-21 18:43:31: J0cd3e3 [INFO] Successfully stopped MemSQL node A741CD44ABDE3BBC9C0DE4FA30C8BBA2EF362DDD master-agg-and-leaf-ip-10-0-3-178 /home/admin $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version A741CD4 Acfe95c NOT RUNNING CONNECTED MASTER 10.0.3.178 3306 5.8.0 6058F29 Aabbb60 RUNNING CONNECTED AGGREGATOR 10.0.2.243 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0 master-agg-and-leaf-ip-10-0-3-178 /home/admin $ ps aux | grep memsqld | grep master
Step 5: Run AGGREGATOR SET AS MASTER
memsql> AGGREGATOR SET AS MASTER IDENTIFIED BY PASSWORD 'TPqMIzGqm27GEsv2'; Query OK, 1 row affected (1.91 sec)
Step 6: Run SELECT @@master_aggregator
memsql> SELECT @@master_aggregator; +---------------------+ | @@master_aggregator | +---------------------+ | self | +---------------------+ 1 row in set (0.00 sec)
Step 7: Run SNAPSHOT DATABASE sharding
memsql> SNAPSHOT DATABASE sharding; Query OK, 0 rows affected (0.28 sec)
Step 8: Turn On aggregator_failure_detection on the New Master Aggregator
memsql> select @@aggregator_failure_detection; +--------------------------------+ | @@aggregator_failure_detection | +--------------------------------+ | OFF | +--------------------------------+ 1 row in set (0.00 sec) memsql> SET GLOBAL aggregator_failure_detection = ON; Query OK, 0 rows affected (0.00 sec) memsql> select @@aggregator_failure_detection; +--------------------------------+ | @@aggregator_failure_detection | +--------------------------------+ | ON | +--------------------------------+ 1 row in set (0.00 sec)
Step 9: Update memsql.cnf
BEFORE
agg-and-leaf-ip-10-0-2-243 /ebs/memsql/child-3306 $ cat memsql.cnf ; ------------------------------------------------------------------------ ; THIS CONFIGURATION FILE IS MANAGED BY MEMSQL OPS ; MemSQL Ops controls the data in this file. Please be careful ; when editing it. ; For more information, see our documentation at http://docs.memsql.com ; ------------------------------------------------------------------------ [server] basedir = . bind_address = 0.0.0.0 core_file lc_messages_dir = ./share lock_wait_timeout = 60 snapshot_trigger_size = 256m socket = memsql.sock syslog_level = all syslog_protocol = rfc3164 tmpdir = . transaction_buffer = 64m ; ------------------------------------------------------------------------ ; MEMSQL OPS VARIABLES ; ; Variables below this header are controlled by MemSQL Ops. ; Please do not edit any of these values directly. ; ------------------------------------------------------------------------ master_aggregator = 10.0.3.178:3306 port = 3306
AFTER
agg-and-leaf-ip-10-0-2-243 /ebs/memsql/child-3306 $ cat memsql.cnf ; ------------------------------------------------------------------------ ; THIS CONFIGURATION FILE IS MANAGED BY MEMSQL OPS ; MemSQL Ops controls the data in this file. Please be careful ; when editing it. ; For more information, see our documentation at http://docs.memsql.com ; ------------------------------------------------------------------------ [server] basedir = . bind_address = 0.0.0.0 core_file lc_messages_dir = ./share lock_wait_timeout = 60 snapshot_trigger_size = 256m socket = memsql.sock syslog_level = all syslog_protocol = rfc3164 tmpdir = . transaction_buffer = 64m ; ------------------------------------------------------------------------ ; MEMSQL OPS VARIABLES ; ; Variables below this header are controlled by MemSQL Ops. ; Please do not edit any of these values directly. ; ------------------------------------------------------------------------ master_aggregator port = 3306
Step 10: Restart the New Master Aggregator
agg-and-leaf-ip-10-0-2-243 /ebs/memsql/child-3306 $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version 6058F29 Aabbb60 RUNNING CONNECTED MASTER 10.0.2.243 3306 5.8.0 A741CD4 Acfe95c NOT RUNNING CONNECTED MASTER 10.0.3.178 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0 agg-and-leaf-ip-10-0-2-243 /ebs/memsql/child-3306 $ memsql-ops memsql-restart 6058F29 2017-06-21 19:02:56: J8e052c [INFO] Stopping MemSQL node 6058F2902074F84323E3B4AC0ADBDA0BF7ED504A on Agent Aabbb60bd3c5b4b98b26159eaa2424429 2017-06-21 19:02:58: J8e052c [INFO] Service script output: Stopping memsqld:.Waiting up to 60 seconds for a clean exit:. 2017-06-21 19:02:58: J8e052c [INFO] Successfully stopped MemSQL node 6058F2902074F84323E3B4AC0ADBDA0BF7ED504A 2017-06-21 19:02:59: J73767f [INFO] Starting MemSQL node 6058F2902074F84323E3B4AC0ADBDA0BF7ED504A on Agent Aabbb60bd3c5b4b98b26159eaa2424429 2017-06-21 19:03:03: J73767f [INFO] Successfully started MemSQL node 6058F2902074F84323E3B4AC0ADBDA0BF7ED504A
Step 11: Backup User Grants on All Aggregators
It is best practice to maintain backups of user grants on aggregator nodes. If you do not have current user backups from all aggregator nodes stop now and create them.
You can use the following bash script backup_grants.sh to backup user grants from a MemSQL node:
#!/bin/bash #./user_backup.sh -o -h memsql_host -u user -p password -o /path/to/sqlfile #defaults outfile="grants-$(date +%F_%H_%M).sql" host="127.0.0.1" user="root" port="3306" while getopts ":o:h:u:p:P:" opt; do case $opt in o) outfile=$OPTARG ;; h) host=$OPTARG ;; u) user=$OPTARG ;; p) pass=$OPTARG ;; P) port=$OPTARG ;; esac done echo "Backing up grants from $host to the file $outfile" if [ -z "$pass" ]; then mysql_command="mysql -h$host -u$user -P$port " else mysql_command="mysql -h$host -u$user -P$port -p$pass" fi echo "Connection string is: $mysql_command" cmd="for i in \`$mysql_command -e 'select distinct grantee from information_schema.user_privileges;' | awk {'print \$1'}| grep -v 'grantee'\`; do $mysql_command -e "'"'"show grants for \$i"'"'" >> $outfile; done" eval $cmd sed -i '/Grants for.*/d' "$outfile" sed -i 's/$/;/' "$outfile"
Step 12: Delete and Re-deploy all Child Aggregators in the cluster
agg-and-leaf-ip-10-0-2-243 /ebs/memsql/master-3306 $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version 6058F29 Aabbb60 RUNNING CONNECTED MASTER 10.0.2.243 3306 5.8.0 A741CD4 Acfe95c NOT RUNNING CONNECTED MASTER 10.0.3.178 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0 agg-and-leaf-ip-10-0-2-243 /ebs/memsql/master-3306 $ memsql-ops memsql-delete A741CD4 This will completely delete the MemSQL node running on 10.0.3.178:3306, along with all of its data The target MemSQL node has role MASTER If you want to delete this node, please type the word DELETE: DELETE Deleting MemSQL node Successfully deleted MemSQL agg-and-leaf-ip-10-0-2-243 /ebs/memsql/master-3306 $ memsql-ops memsql-deploy -a Acfe95c -r aggregator -P 3306 2017-06-21 19:05:17: J159202 [INFO] Deploying MemSQL to 10.0.3.178:3306 2017-06-21 19:05:17: J159202 [INFO] Installing MemSQL 2017-06-21 19:05:20: J159202 [INFO] Finishing MemSQL Install Waiting for MemSQL to start... MemSQL successfully started agg-and-leaf-ip-10-0-2-243 /ebs/memsql/master-3306 $ memsql-ops memsql-list ID Agent Id Process State Cluster State Role Host Port Version 6058F29 Aabbb60 RUNNING CONNECTED MASTER 10.0.2.243 3306 5.8.0 EDD6E59 Acfe95c RUNNING CONNECTED AGGREGATOR 10.0.3.178 3306 5.8.0 5CCD123 Aabbb60 RUNNING CONNECTED LEAF 10.0.2.243 3307 5.8.0 F0EE144 Acfe95c RUNNING CONNECTED LEAF 10.0.3.178 3307 5.8.0