3.1 Release Notes
Upgrading to 3.1
MemSQL supports various methods for upgrading to version 3.1 depending on your current installation. For more information see: Upgrading MemSQL.
MemSQL 3.1 supports unmaterialized views. Views in MemSQL are not writeable, but do support online
MemSQL 3.1 supports automated cross-datacenter replication. Previously this had to be done with manual configuration of master and slave replicas. Now it can be done with a single command:
memsql> REPLICATE DATABASE db_name FROM user:'password'@'host':port;
Replication is done on a per-database basis. The topology and redundancy level of the secondary (replica) cluster do not need to match the topology or redundancy level of the primary cluster. Sharding of the replica database on the secondary cluster is managed automatically by MemSQL. Note that the secondary cluster must have access to the public IP addresses of all leaf nodes in the primary cluster. This is because MemSQL replicates data directly from the leaf nodes on the primary cluster to the leaf nodes on the secondary cluster. To get a list of leaf node public IPs, run
SHOW LEAVES on the master aggregator of the primary cluster.
Replica databases are read-only. The only supported DML is
SELECT and database DDL operations, such as
CREATE TABLE and
DROP TABLE, are disabled. However, distributed DDL such as
REBALANCE PARTITIONS work normally to perform distributed operations on the secondary cluster.
MemSQL offers a number of simple commands for managing replication. Users can pause and restart replication using
PAUSE REPLICATING and
CONTINUE REPLICATING, respectively. Running
STOP REPLICATING will stop replication and convert the replica database to an independent MemSQL database with all standard DDL and DML.
INSERT queries that insert multiple records are now transactional. While attempting to complete the query, if MemSQL encounters a duplicate key, the entire transaction will roll back. Note that transactionality slows multi-INSERT by roughly 20% (depending on network performance). For cases where performance is more important than transactionality, MemSQL 3.1 allows users to disable transactions using
INSERT IGNORE. In this case, MemSQL will ignore records with duplicate keys and, without rolling back, continue inserting records with unique keys. For more information see INSERT.
Additional ALTER TABLE Support
MemSQL 3.1 supports
ALTER TABLE ... DROP INDEX and
ALTER TABLE ... MODIFY COLUMN. Both are online operations. For more information, see ALTER TABLE.
Support for Uncorrelated Subselects with IN Lists
MemSQL 3.1 adds support for uncorrelated subselects with
IN lists. For example, the following query will now execute:
memsql> SELECT a.name FROM Table_A a WHERE a.user_id IN (SELECT b.user_id FROM Table_B b WHERE b.age > 23);
Previously queries with this structure could only execute if there was a foreign shard key relationship between the expression preceding
IN and the projection of the subquery. Now the above query will execute even if
user_id is not a foreign shard key.
MemSQL now also supports the
SQL_SMALL_RESULT keyword to be used with
IN list uncorrelated subselects.
SQL_SMALL_RESULT tells MemSQL to temporarily materialize the
IN list, rather than comparing the projection from the outer query with the result of the
IN list “natively” without materializing the list. When the result of the subquery defining the
IN list has relatively few results,
SQL_SMALL_RESULT will allow the query to execute faster. Note that MemSQL always materializes the
IN list unless the inner subquery is on a reference table or it can leverage a foreign shard key relationship.
SQL_SMALL_RESULT differently than MySQL does. In MySQL,
SQL_SMALL_RESULT is only used with
GROUP BY and
DISTINCT queries. MemSQL only uses
IN lists as described above. For other queries, MemSQL simply ignores
Streaming GROUP BY With SQL_BIG_RESULT
GROUP BY queries, MemSQL 3.1 supports the
SQL_BIG_RESULT keyword, which tells MemSQL to do as much sorting as possible on the leaves rather than performing all sorting on the aggregator.
SQL_BIG_RESULT can improve performance for
SELECT queries with large result sets because it parallelizes the sorting process and can save significant memory on the aggregator. This is the default behavior when the
GROUP BY is over indexed columns. For more information see SELECT.
LOAD DATA Improvements
MemSQL 3.1 improves
LOAD DATA in a number of ways. In particular, performance is improved thanks to the following changes:
- Files are streamed throughout the
LOAD DATAprocess rather than buffering the entire dataset in memory
- File contents are streamed directly to leaf nodes, whereas they used to be encoded as multi-inserts
- Various low-level optimizations
LOAD DATA is now transactional. If the query fails, the entire load rolls back. As with multi-inserts, transactionality can be turned off using the
IGNORE keyword. For more information see LOAD DATA.
Finally, it is now possible to run multiple concurrent
LOAD DATA queries at once.
Manually Triggered Snapshots
MemSQL 3.1 enables users to manually trigger full database snapshots using the
SNAPSHOT command on an aggregator. The snapshot process is the same one MemSQL uses for periodic, automatic snapshots. The advantage of manual snapshots is that they allow DBAs to effectively compress log files at will to expedite recovery time. For example, after running several DDL commands, such as
ALTER TABLE, replaying log files may take longer than desired since the database will be restored to an earlier state and then has to perform the
ALTER again. By triggering a snapshot after performing DDL operations, the latest snapshot will have all the changes and recovery does not require replaying the DDL.
Support for Distributed INSERT … SELECT
INSERT ... SELECT can now be used with arbitrary distributed tables. If the query can be completed while maintaining the foreign shard key relationship, the process happens entirely on the leaves. Otherwise, the data is streamed to the aggregator. For more information see INSERT.
New ATTACH LEAF Utilities
MemSQL 3.1 enables
ATTACH LEAF ALL, a command that attaches all leaves that are currently in the
DETACHED state. In addition, clusters configured with redundancy level 1 will automatically reattach leaves unless the user manually detaches them. For more information see ATTACH LEAF.
MemSQL 3.1 supports table-level security. User-based permissions are managed using
REVOKE, just as with database-level security. Table-level permissions can also be applied to views. MemSQL does not currently support row or column-level security. However, an easy way to enforce row or column-level access controls is to create a view that selects particular rows or columns and configure permissions on the view.