This page discusses common memory management topics. It includes the following sections:
Configuring Memory Limits
MemSQL has two engine variables that control its memory usage. Both are measured in megabytes.
MemSQL will not allocate more than
maximum_memory megabytes of memory. If a
SELECT query’s memory allocations put the memory usage by MemSQL over
maximum_memory, query execution stops and an error message is generated.
Query compilations whose allocations exceed this limit will also terminate the server. See Code Generation for more information on query compilations. By default,
maximum_memory is set to 90% of the physical memory on the host machine if no swap is enabled, and 100% of physical memory on the machine if swap is enabled.
MemSQL will not allow writes to any table once the cumulative memory in use by all tables in MemSQL reaches
maximum_table_memory (MemSQL will become read-only).
DELETE queries will still be allowed even once the limit is reached.
CREATE INDEX or
DROP INDEX statements will fail with an error message once the limit has been reached. Query compilations are also disallowed once the limit is reached.
This setting is designed to allow SELECT queries to allocate temporary memory for sorting, hash group-by, and so on. The
maximum_table_memory must be set to a value lower then
maximum_memory. By default,
maximum_table_memory is set to 90% of
maximum_memory, which translates to about 80% of physical memory on the host machine.
maximum_table_memory limit has been reached,
DELETE queries can still be executed to remove data from the table; however large
DELETE queries may fail if the memory used by MemSQL reaches
Caution should be taken as
DELETE queries allocate extra memory to mark rows as deleted. For rowstore tables, this equates to roughly 40 + 8*number_of_indexes bytes per deleted row. For columnstore tables, the memory usage will be lower because of how rows are marked to be deleted (roughly num_rows_in_table/8 bytes if you delete a row in every segment file in the table).
If the table is narrow, such as containing a small number of int columns,
DELETE queries will show up as a relatively large spike in memory usage compared to the size of the table.
The memory for a deleted row is reclaimed after the transaction commits and the memory is freed asynchronously by the garbage collector. For more information, see the DELETE reference topic.
Replicating databases will pause if memory use reaches
maximum_table_memory while replicating data. If memory becomes available again - say some data is deleted - replication will automatically continue replicating.
Maximum Table Memory Reached
If you see the error message below, the leaf at
'leafhost':leafport has reached
maximum_table_memory, which is the maximum cumulative memory in use by all tables on a node. MemSQL will not allow writes to any table once
maximum_table_memory is reached (MemSQL will become read-only).
Leaf Error (leafhost:leafport): Memory usage by MemSQL for tables (##### MB) has reached the value of 'maximum_table_memory' global variable (##### MB). This query cannot be executed.
You can resolve this issue with the steps below.
Check for Data Skew
First, determine which nodes are reaching their memory capacity. Do this by checking for data skew across your nodes. If skew is present (significantly higher memory consumption on specific nodes), you should re-evaluate your shard keys. If no skew is present, you should increase cluster memory or delete data. There are two ways to check for skew:
- Execute the following query to view per-table memory consumption and row counts in your MemSQL cluster for tables with more than 10000 rows:
SELECT DATABASE_NAME, TABLE_NAME, MIN(ROWS), MAX(ROWS), FLOOR(AVG(ROWS)) AS avg_rows, ROUND(STDDEV(ROWS)/AVG(ROWS),3) AS row_skew, MIN(MEMORY_USE), MAX(MEMORY_USE), SUM(MEMORY_USE)/(1024*1024) AS total_memory_mb, FLOOR(AVG(MEMORY_USE)) AS avg_memory, ROUND(STDDEV(MEMORY_USE)/AVG(MEMORY_USE),3) AS memory_skew FROM INFORMATION_SCHEMA.TABLE_STATISTICS GROUP BY 1, 2 HAVING SUM(ROWS) > 10000;
With any version of MemSQL, you can manually compare
Alloc_table_memory(memory used by tables) on each node:
mysql> SHOW VARIABLES LIKE "maximum_%"; +----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | maximum_memory | 48291 | | maximum_table_memory | 43462 | +----------------------+-------+ mysql> SHOW STATUS EXTENDED LIKE "Alloc_table_memory"; +--------------------+----------+ | Variable_name | Value | +--------------------+----------+ | Alloc_table_memory | 43462 MB | +--------------------+----------+ 1 row in set (0.01 sec)
If memory consumption is significantly higher on specific nodes, skew is present. If memory consumption is fairly uniform across leaves, no skew is present.
More Examples of Finding Data Skew
- Show skew across all partitions for a single table. Fill in
SELECT database_name, table_name, ordinal AS PARTITION_ID, rows, memory_use FROM information_schema.table_statistics WHERE table_name = '<table_name>';
- Show per-partition skew for columns in a group by. Helps identify a potential bottleneck after a subquery with a group by. If skew factor is high, that indicates that some groups have very high cardinality and some have very low cardinality and can lead to long single-threaded operations. Fill in
SELECT ROUND(STDDEV(c)/AVG(c),3)*100 AS group_skew, PARTITION_ID() FROM ( SELECT col1, col2, ..., coln, count(*) FROM <table_name> GROUP BY 1, 2) sub GROUP BY PARTITION_ID();
- Show data distribution for a table if you were to reshard on another column or set of columns. Fill in
SELECT WITH(leaf_pushdown=true) SUM(c) rows, PARTITION_ID() partition_id FROM ( SELECT count(*) c FROM <table_name> GROUP BY <potential_shard_key>) reshuffle GROUP BY PARTITION_ID();
If Skew is Present
MemSQL distributes data based on the SHARD KEY specified in a table’s schema. If no SHARD KEY is explicitly specified, the PRIMARY KEY is used as the default SHARD KEY. If the SHARD KEY shards on a low cardinality column, data will accumulate on certain nodes. To resolve this, you should export your data, modify your schema, and then reload your data. To export your data, see Exporting Data From MemSQL. After you backup your data, you can
DROP TABLE low_cardinality_shard_key_table, and then
CREATE TABLE with a higher cardinality SHARD KEY. When you reload the data, MemSQL will automatically shard your data based on the new SHARD KEY.
$ mysqldump -h 127.0.0.1 -u root my_db low_cardinality_shard_key_table --no-create-info > databackup.sql memsql> DROP TABLE low_cardinality_shard_key_table; memsql> source updated_shard_key_schema.sql;
If No Skew is Present
If nodes are reaching
maximum_table_memory without skew, the cluster itself is nearing maximum memory capacity. To resolve this, you need to add capacity or delete data. To add capacity, see Administering a Cluster. If additional leaf nodes are not available, use the instructions below to delete data in batches.
DELETE queries use 50 bytes of memory per record deleted until the
DELETE commits, so the
maximum_table_memory needs to be temporarily increased on all nodes during the
On each leaf node in your cluster, do the following:
Pause write workloads from an aggregator or your application.
memsql> select @@global.maximum_memory | @@global.maximum_memory | +-------------------------+ | 65536 | +-------------------------+ 1 row in set (0.00 sec) memsql> select @@global.maximum_table_memory +-------------------------------+ | @@global.maximum_table_memory | +-------------------------------+ | 57927 | +-------------------------------+ 1 row in set (0.00 sec)
maximum_table_memoryto 95% of
maximum_memory. The default value is 90% of
memsql> set @@global.maximum_table_memory=62259; Query OK, 0 rows affected (0.00 sec)
Execute small batches of DELETE queries. Limit to 100,000 records or less to reduce memory used until commit.
memsql> DELETE FROM mytable limit 100000; Query OK, 0 rows affected (1.64 sec)
maximum_table_memoryto original value.
memsql> set @@global.maximum_table_memory=57927; Query OK, 0 rows affected (0.00 sec)
Linux Out of Memory Behavior
Linux will kill processes using a lot of memory when the amount of free memory on the system is too low. MemSQL is often the target process killed. To check if MemSQL was killed by Linux as a result of the system running out of memory, run:
$ dmesg | grep -i "out of memory"
If the server has rebooted since MemSQL was killed, you can find the logs from the last boot by running:
$ cat /var/log/kern.log
Linux can be configured to disable this behaviour by changing the setting of
vm.overcommit_ratio, but this is not recommended. It is safer to configure MemSQL to use less memory by setting
maximum_memory to a lower value.