7.0 Release Notes
The main features of the MemSQL 7.0 release are highlighted below.
Much improved performance of synchronous replication and durability. Synchronous replication now has only a minor performance penalty compared to our already fast asynchronous replication, giving you better data consistency guarantees without a large performance cost. You can read more about our improved synchronous replication here.
Selective queries on columnstore tables now run faster, via sub-segment access.
Hash indexes on columnstore tables are now supported.
ALTER TABLE now supports adding and dropping hash indexes on columnstore tables.
DELETE queries now lock columnstore tables at the row level, allowing improved concurrency. Previously, these queries locked columnstore tables at the segment level.
Sparse rowstore compression is now supported.
Queries on columnstore tables can now reorder filters to decrease execution time. Reordering occurs automatically and allows filters that are more selective to be evaluated first.
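The benefit of filter reordering can be illustrated with a small sketch. This is not MemSQL's implementation; it is a hypothetical Python model in which the helper `count_evaluations` and both predicates are invented for illustration. It counts how many predicate calls a short-circuiting conjunction of filters makes under each ordering.

```python
def count_evaluations(rows, filters):
    """Apply filters in order with short-circuiting.

    Returns (matching rows, total number of predicate calls).
    Hypothetical helper for illustration only.
    """
    calls = 0
    matches = []
    for row in rows:
        for predicate in filters:
            calls += 1
            if not predicate(row):
                break  # row rejected; later filters are skipped
        else:
            matches.append(row)  # every filter passed
    return matches, calls


rows = range(1000)
is_even = lambda r: r % 2 == 0                # passes half of the rows
is_multiple_of_100 = lambda r: r % 100 == 0   # passes 1 row in 100

unselective_first = count_evaluations(rows, [is_even, is_multiple_of_100])
selective_first = count_evaluations(rows, [is_multiple_of_100, is_even])

print(unselective_first[1], selective_first[1])
```

Both orderings return the same rows, but placing the more selective filter first rejects most rows after a single predicate call, so the second filter runs far less often.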
To deploy this release, follow the appropriate guide.
To upgrade a self-managed install to this release, follow this guide.
To make a backup of a database in this release or to restore a database backup to this release, follow this guide.
In addition to the release highlights noted above, the following new features and improvements are available in this release:
- Improved Workload Management implementation that no longer needs to be tuned via the workload_management_expected_aggregators engine variable; this variable is now deprecated.
- Support for multi-statement transactions that use distributed joins.
- Support for multi-statement transactions that write to a database and then read from a different database.
- Added a session variable, node_degree_of_parallelism, that specifies the number of threads per node to use for columnstore table scans.
- Queries that use the IN filter on shard keys now perform better.
- Hash joins that use reference tables now perform better.
- The data_conversion_compatibility_level engine variable can now be set to 7.0. With this setting, more invalid type conversions throw an error instead of being converted implicitly, which could produce unexpected results.
- Now, a maximum of 1024 tables can be written to in a transaction.
- Column statistics are now automatically gathered on rowstore tables. This is in addition to column statistics being automatically gathered on columnstore tables, which was done previously.
- Range statistics (histograms) are now automatically gathered on rowstore and columnstore tables.
- Rowstore table sampling has been improved.
- Now, a true row-level random sample of rows from columnstore tables is maintained automatically and used to estimate the selectivity of complex predicates.
- All shapes of bushy joins are now supported.
- Common Table Expressions (CTEs) can now be materialized; redundant expressions can use cached data instead of being recomputed.
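To show why a row-level random sample helps with complex predicates, here is a minimal sketch of sample-based selectivity estimation. It is an illustration only, not MemSQL's implementation: the table contents, the sample size, the predicate, and the helper `estimate_selectivity` are all assumptions made for the example.

```python
import random


def estimate_selectivity(sample, predicate):
    """Return the fraction of rows in `sample` that satisfy `predicate`.

    Hypothetical helper for illustration only.
    """
    if not sample:
        return 0.0
    return sum(1 for row in sample if predicate(row)) / len(sample)


random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic table; the column names are invented for the example.
table = [
    {"price": random.uniform(0, 100), "qty": random.randint(1, 10)}
    for _ in range(100_000)
]

# A uniform row-level sample, standing in for the maintained sample.
sample = random.sample(table, 1_000)

# A predicate over two columns, which per-column statistics alone
# cannot describe directly.
predicate = lambda r: r["price"] * r["qty"] > 500

estimated = estimate_selectivity(sample, predicate)
actual = estimate_selectivity(table, predicate)
print(f"estimated={estimated:.3f} actual={actual:.3f}")
```

Because the sample is drawn uniformly at the row level, the fraction of sampled rows that pass the predicate is an unbiased estimate of the fraction of all rows that pass it, whatever the shape of the predicate.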
Data Storage, Loading, and Backup
- Performance improvements for bulk inserts, large insert/selects, LOAD DATA, and pipelines to columnstore tables (particularly for wide columnstore tables).
- Added support for publishing data to Kafka via SELECT … INTO KAFKA ….
- Added support for Parquet in
- Improved error reporting for pipelines and
- Incremental backup for columnstore data.
- Added the experimental engine variable enable_varbuffer_dictionary_compression. When set to ON, this setting enables compression of identical strings in columns with the
- Added a column to information_schema.TABLE_STATISTICS to indicate whether a table is a rowstore or a columnstore.
- Added columns to the
- Added the
- Cross-database views are now supported.
- Added built-in support for time series reporting through the new
- GROUP_CONCAT() now supports the
- Added the
Maintenance Release Changelog
2020-02-18 Version 7.0.12
- Fixed an incorrect “Transaction rolled back mid-query” error that could occur during a multi-statement transaction if the transaction was started on at least 64 partitions.
- Fixed an error that occurred when SELECT ... GROUP BY ... ORDER BY RAND() was run on a columnstore table.
- Increased the number of columns that SELECT queries can return.
- Removed a limitation on the number of columns in a table allowed for columnstore sampling. Increased the number of keys allowed for optimized columnstore JSON storage in some cases.
- Fixed a deadlock that could occur when the RESTORE DATABASE command failed and attempted to remove the restored database partitions that were created before the command failed.
- Now, cluster-wide operations that take a global lock, such as REBALANCE, update information_schema.PROCESSLIST. The update indicates that a command is blocked on the lock.
2020-01-27 Version 7.0.11
- Added the columnstore_sample_per_partition_limit engine variable. This variable controls the maximum number of rows sampled per partition for columnstores.
- Samples of columnstore tables that have many columns are no longer examined. Samples are still collected for these tables.
- Fixed an issue with Kerberos-related CONFIG clause keys in CREATE PIPELINE ... being considered invalid by HDFS pipelines; added the allow_unknown_configs option to such pipelines to skip CONFIG clause validity checks.
- Now, periodically shrink the memory cached by libc malloc by calling
- Now, the table_name_case_sensitivity engine variable can be used to change whether MemSQL treats tables, views, and table aliases as case-sensitive or case-insensitive. This variable can only be changed if no user databases have been created on the cluster.
- Fixed a crash that occurred when running an UPDATE ... SET ... query that sets a sparse VARCHAR column that had been assigned to earlier in the same UPDATE ... SET ... query. Fixed the same issue that occurred when setting a sparse
- Fixed an issue where cross-database distributed joins could fail to clean up the temporary result tables created to run the join.
2020-01-13 Version 7.0.10
- Now, the REPLICATE DATABASE command requires a master database and its slave database to have the same durability setting (
- Fixed an issue that caused STOP REPLICATING to encounter an error after a database was upgraded from MemSQL 6.x to MemSQL 7.0.
- LOAD DATA incorporates the behavior of the
- Fixed a crash that sometimes occurred when calling the MAX function on an
- Now, HDFS Pipelines skip files in the data source that are under the
- Added the clause NULL DEFINED BY ... OPTIONALLY ENCLOSED to the syntax of LOAD DATA .... When this clause is used, NULL values enclosed with quotes as
- Now, snapshot reference databases after ALTER TABLE is run. Also, snapshot all databases after
- Fixed a memory leak in queries that use full-text indexes.
- Fixed an issue with database replication that occurred when the snapshots_to_keep engine variable was set to 1.
- Now, delete the existing samples associated with a table when columnstore sampling on the table is disabled. Also, reset the sampling flag on all segments associated with the table so they will be resampled if columnstore sampling is re-enabled.
2019-12-10 Version 7.0.9
- Initial GA release of MemSQL 7.0