7.0 Release Notes

The main features of the MemSQL 7.0 release are highlighted below.

Release Highlights

  • Much improved performance of synchronous replication and durability. Our synchronous replication now has only a minor performance penalty compared to our already fast asynchronous replication – giving you better data consistency guarantees without incurring large performance penalties. You can read more about our improved synchronous replication here.

  • Selective queries on columnstore tables now run faster, via sub-segment access.

  • Hash indexes on columnstore tables are now supported. CREATE TABLE and ALTER TABLE now support adding and dropping hash indexes on columnstore tables.

  • By default, UPDATE and DELETE queries now lock columnstore tables at the row level, which improves concurrency. Previously, these queries locked columnstore tables at the segment level.

  • Sparse rowstore compression is now supported.

  • Queries on columnstore tables can now reorder filters to decrease execution time. Reordering occurs automatically and allows filters that are more selective to be evaluated first.
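
The benefit of evaluating more selective filters first can be illustrated with a small conceptual sketch. This is not MemSQL's implementation; the row layout, the filter names, and the helper functions (estimate_pass_rate, order_filters, scan) are hypothetical and exist only for illustration. The sketch estimates each filter's pass rate on a sample, sorts the filters so that the one rejecting the most rows runs first, and counts predicate evaluations.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]
NamedFilter = Tuple[str, Predicate]


def estimate_pass_rate(pred: Predicate, sample: List[Row]) -> float:
    """Fraction of sampled rows that satisfy the predicate."""
    return sum(1 for row in sample if pred(row)) / len(sample)


def order_filters(filters: List[NamedFilter], sample: List[Row]) -> List[NamedFilter]:
    """Sort filters so the one that rejects the most rows is evaluated first."""
    return sorted(filters, key=lambda named: estimate_pass_rate(named[1], sample))


def scan(rows: List[Row], filters: List[NamedFilter]) -> Tuple[List[Row], int]:
    """Apply filters in order with short-circuiting; return (matches, evaluations)."""
    matches, evaluations = [], 0
    for row in rows:
        for _, pred in filters:
            evaluations += 1
            if not pred(row):
                break  # row rejected, later filters are skipped
        else:
            matches.append(row)
    return matches, evaluations


# Hypothetical data: 1000 rows, one filter passes 50% of rows, the other 1%.
rows = [{"id": i, "region": i % 2, "bucket": i % 100} for i in range(1000)]
filters = [
    ("region = 0", lambda r: r["region"] == 0),
    ("bucket = 8", lambda r: r["bucket"] == 8),
]

original_matches, original_cost = scan(rows, filters)
reordered = order_filters(filters, rows[:200])
reordered_matches, reordered_cost = scan(rows, reordered)

print([name for name, _ in reordered])
print(original_cost, reordered_cost)
```

Both orders return the same rows; only the number of predicate evaluations differs, which is why the reordering can be applied automatically without changing query results.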


To deploy this release, follow the appropriate guide.

To upgrade a self-managed install to this release, follow this guide.

To make a backup of a database in this release or to restore a database backup to this release, follow this guide.

In addition to the release highlights noted above, the following new features and improvements are available in this release:

Query Execution

  • Improved Workload Management implementation that no longer needs to be tuned via the workload_management_expected_aggregators engine variable; this variable is now deprecated.
  • Support for multi-statement transactions that use distributed joins.
  • Support for multi-statement transactions that write to a database and then read from a different database.
  • Added a session variable node_degree_of_parallelism that specifies the number of threads per node to use for columnstore table scans.
  • Queries that use the IN filter on shard keys now perform better.
  • Hash joins that use reference tables now perform better.
  • The data_conversion_compatibility_level engine variable can now be set to 7.0. At this level, more invalid type conversions raise errors instead of being converted implicitly, which could produce unexpected results.
  • Now, a maximum of 1024 tables can be written to in a transaction.

Query Optimization

  • Column statistics are now automatically gathered on rowstore tables. This is in addition to column statistics being automatically gathered on columnstore tables, which was done previously.
  • Range statistics (histograms) are now automatically gathered on rowstore and columnstore tables.
  • Rowstore table sampling has been improved.
  • Now, a true row-level random sample of rows from columnstore tables is maintained automatically and used to estimate the selectivity of complex predicates.
  • All shapes of bushy joins are now supported.
  • Common Table Expressions (CTEs) can now be materialized; redundant expressions can use cached data instead of being recomputed.
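
As a rough illustration of how a row-level random sample can be used to estimate the selectivity of a complex predicate, consider the sketch below. It is a conceptual model only: reservoir sampling (Algorithm R) is used here as a stand-in technique, and the release notes do not state how MemSQL maintains its sample. The table contents, sample size, and predicate are hypothetical.

```python
import random
from typing import Callable, Iterable, List


def reservoir_sample(rows: Iterable[int], k: int, rng: random.Random) -> List[int]:
    """Keep a uniform random sample of k rows from a stream (Algorithm R)."""
    sample: List[int] = []
    for seen, row in enumerate(rows):
        if seen < k:
            sample.append(row)
        else:
            j = rng.randint(0, seen)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def estimate_selectivity(sample: List[int], predicate: Callable[[int], bool]) -> float:
    """Fraction of sampled rows that satisfy the predicate."""
    return sum(1 for row in sample if predicate(row)) / len(sample)


def complex_predicate(v: int) -> bool:
    """Hypothetical predicate combining two conditions on one column."""
    return v % 10 == 0 and v % 3 != 0


rng = random.Random(7)      # fixed seed so the sketch is repeatable
table = range(100_000)      # hypothetical single-column table
sample = reservoir_sample(table, 1_000, rng)

estimated = estimate_selectivity(sample, complex_predicate)
exact = sum(1 for v in table if complex_predicate(v)) / len(table)
print(f"estimated={estimated:.3f} exact={exact:.3f}")
```

Evaluating the predicate on the sample costs 1,000 checks instead of 100,000, and the estimate needs no assumption that the predicate's conditions are independent, which is the advantage of sampling for complex predicates.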

Data Storage, Loading, and Backup

  • Performance improvements for bulk inserts, large insert/selects, LOAD DATA, and pipelines to columnstore tables (particularly for wide columnstore tables).
  • Added support for publishing data to Kafka via SELECT … INTO KAFKA ….
  • Added support for Parquet in CREATE PIPELINE.
  • Improved error reporting for pipelines and LOAD DATA.
  • Incremental backup for columnstore data.
  • Added the experimental engine variable enable_varbuffer_dictionary_compression. When set to ON, this setting enables compression of identical strings in columns with the VARCHAR, VARBINARY, LONGTEXT, LONGBLOB, MEDIUMBLOB, BLOB, TINYBLOB, MEDIUMTEXT, TEXT, and TINYTEXT data types.

System Management

  • Added a column to information_schema.TABLE_STATISTICS to indicate whether a table is a rowstore or a columnstore.

Functional Extensions

  • Cross-database views are now supported.
  • Added built-in support for time series reporting through the new FIRST, LAST, and TIME_BUCKET functions.
  • GROUP_CONCAT() now supports the ORDER BY clause.
  • Added the MEDIAN() aggregate function.
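
The kind of report that time bucketing with first, last, and median values produces can be sketched conceptually as follows. This is not MemSQL SQL and not its implementation. It assumes that a time bucket is a timestamp rounded down to a fixed-width interval measured from an origin (the Unix epoch is assumed here), and that "first" and "last" are the values with the earliest and latest timestamps in each group. The function names and sample data are hypothetical; the function reference has the actual signatures and edge-case behavior.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, List, Tuple

EPOCH = datetime(1970, 1, 1)  # assumed bucket origin


def time_bucket(width: timedelta, ts: datetime, origin: datetime = EPOCH) -> datetime:
    """Round ts down to the start of its fixed-width bucket."""
    return ts - ((ts - origin) % width)


def summarize(ticks: List[Tuple[datetime, float]], width: timedelta) -> Dict[datetime, dict]:
    """Group rows by bucket, then take first, last, and median price per bucket."""
    groups = defaultdict(list)
    for ts, price in ticks:
        groups[time_bucket(width, ts)].append((ts, price))
    report = {}
    for bucket, rows in sorted(groups.items()):
        rows.sort()  # order by timestamp within the bucket
        prices = [price for _, price in rows]
        report[bucket] = {
            "first": rows[0][1],    # value at the earliest timestamp
            "last": rows[-1][1],    # value at the latest timestamp
            "median": median(prices),
        }
    return report


# Hypothetical price ticks, deliberately out of order.
ticks = [
    (datetime(2020, 1, 1, 9, 0, 30), 10.0),
    (datetime(2020, 1, 1, 9, 4, 59), 12.0),
    (datetime(2020, 1, 1, 9, 2, 0), 11.0),
    (datetime(2020, 1, 1, 9, 5, 0), 20.0),
    (datetime(2020, 1, 1, 9, 7, 0), 18.0),
]

report = summarize(ticks, timedelta(minutes=5))
for bucket, stats in report.items():
    print(bucket, stats)
```

With five-minute buckets, the three ticks between 09:00 and 09:05 fall into one group and the remaining two into the next, regardless of the order in which the rows arrive.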

Maintenance Release Changelog

2020-02-18 Version 7.0.12

  • Fixed an incorrect “Transaction rolled back mid-query” error that could occur during a multi-statement transaction if the transaction was started on at least 64 partitions.
  • Fixed an error that occurred when SELECT ... GROUP BY ... ORDER BY RAND() was run on a columnstore table.
  • Increased the number of columns that SELECT queries can return.
  • Removed a limitation on the number of columns in a table allowed for columnstore sampling. Increased the number of keys allowed for optimized columnstore JSON storage in some cases.
  • Fixed a deadlock that could occur when the RESTORE DATABASE command failed and attempted to remove the restored database partitions that were created before the command failed.
  • Now, cluster-wide operations that take a global lock, such as ADD LEAF and REBALANCE, update the STATE column of information_schema.PROCESSLIST. The update indicates that a command is blocked on the lock.

2020-01-27 Version 7.0.11

  • Added the columnstore_sample_per_partition_limit engine variable. This variable controls the maximum number of rows sampled per partition for columnstores.
  • Samples of columnstore tables that have many columns are no longer examined. Samples are still collected for these tables.
  • Fixed an issue where Kerberos-related CONFIG clause keys in CREATE PIPELINE ... were considered invalid by HDFS pipelines. Added the allow_unknown_configs option to such pipelines to skip CONFIG clause validity checks.
  • The memory cached by libc malloc is now periodically shrunk by calling malloc_trim().
  • Now, the table_name_case_sensitivity engine variable can be used to change whether MemSQL treats tables, views, and table aliases as case-sensitive or case-insensitive. This variable can only be changed if no user databases have been created on the cluster.
  • Fixed a crash that occurred when an UPDATE ... SET ... query set a sparse VARCHAR column that had already been assigned earlier in the same query. Fixed the same issue for sparse DECIMAL columns.
  • Fixed an issue where cross-database distributed joins could fail to clean up the temporary result tables created to run the join.

2020-01-13 Version 7.0.10

  • Now, the REPLICATE DATABASE command requires a master database and its slave database to have the same durability setting (SYNC or ASYNC).
  • Fixed an issue that caused STOP REPLICATING to encounter an error after a database was upgraded from MemSQL 6.x to MemSQL 7.0.
  • Now, LOAD DATA incorporates the behavior of the ignore_insert_into_computed_column engine variable.
  • Fixed a crash that sometimes occurred when calling the MAX function on an ENUM column.
  • Now, HDFS Pipelines skip files in the data source that are under the _temporary directory.
  • Added the clause NULL DEFINED BY ... OPTIONALLY ENCLOSED to the syntax of LOAD DATA .... When this clause is used, LOAD DATA counts NULL values enclosed with quotes as NULLs.
  • Reference databases are now snapshotted after ALTER TABLE is run. Also, all databases are now snapshotted after TRUNCATE is run.
  • Fixed a memory leak in queries that use full-text indexes.
  • Fixed an issue with database replication that occurred when the snapshots_to_keep engine variable was set to 1.
  • When columnstore sampling on a table is disabled, the existing samples associated with the table are now deleted. The sampling flag is also reset on all segments associated with the table, so they are resampled if columnstore sampling is re-enabled.

2019-12-10 Version 7.0.9

  • Initial GA release of MemSQL 7.0