7.0 Release Notes

The main features of the MemSQL 7.0 release are highlighted below.

Release Highlights

  • Synchronous replication and durability now perform much better. Synchronous replication now carries only a minor performance penalty compared to our already fast asynchronous replication, giving you stronger data consistency guarantees without a large performance cost. You can read more about our improved synchronous replication here.

  • Selective queries on columnstore tables now run faster, via sub-segment access.

  • Hash indexes on columnstore tables are now supported. CREATE TABLE and ALTER TABLE now support adding and dropping hash indexes on columnstore tables.

  • By default, UPDATE and DELETE queries now lock columnstore tables at the row level, which improves concurrency. Previously, these queries locked columnstore tables at the segment level.

  • Sparse rowstore compression is now supported.

  • Queries on columnstore tables can now reorder filters to decrease execution time. Reordering occurs automatically and allows filters that are more selective to be evaluated first.
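
The idea behind filter reordering can be illustrated with a small sketch. This is not MemSQL's implementation, which these notes do not describe. It is a minimal Python model that assumes selectivity is estimated from a sample of rows; the names estimate_selectivity, reorder_filters, and apply_filters are hypothetical. It shows why evaluating the most selective filter first reduces the total number of predicate evaluations.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def estimate_selectivity(pred: Predicate, sample: Sequence[Row]) -> float:
    """Fraction of sampled rows that pass the predicate (lower = more selective)."""
    if not sample:
        return 1.0
    return sum(1 for row in sample if pred(row)) / len(sample)


def reorder_filters(preds: Sequence[Predicate], sample: Sequence[Row]) -> List[Predicate]:
    """Order predicates so the one expected to reject the most rows runs first."""
    return sorted(preds, key=lambda p: estimate_selectivity(p, sample))


def apply_filters(rows: Sequence[Row], preds: Sequence[Predicate]) -> Tuple[List[Row], int]:
    """Apply predicates with short-circuiting; also count predicate evaluations."""
    matched: List[Row] = []
    evaluations = 0
    for row in rows:
        passed = True
        for pred in preds:
            evaluations += 1
            if not pred(row):
                passed = False
                break  # later filters are skipped for this row
        if passed:
            matched.append(row)
    return matched, evaluations


# Illustrative data: one filter passes every row, the other passes 1% of rows.
rows = [{"a": i, "b": i % 100} for i in range(1000)]
passes_all = lambda r: r["a"] >= 0
passes_few = lambda r: r["b"] == 0

original_order = [passes_all, passes_few]
reordered = reorder_filters(original_order, sample=rows[:200])

result_before, cost_before = apply_filters(rows, original_order)
result_after, cost_after = apply_filters(rows, reordered)
```

Both orders return the same rows. Only the amount of work differs, because rows rejected by the first filter never reach the second.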

Info

To deploy this release, follow the appropriate guide.

To upgrade a self-managed install to this release, follow this guide.

To make a backup of a database in this release or to restore a database backup to this release, follow this guide.

In addition to the release highlights noted above, the following new features and improvements are available in this release:

Query Execution

  • Improved Workload Management implementation that no longer needs to be tuned via the workload_management_expected_aggregators engine variable; this variable is now deprecated.
  • Support for multi-statement transactions that use distributed joins.
  • Support for multi-statement transactions that write to a database and then read from a different database.
  • Added a session variable node_degree_of_parallelism that specifies the number of threads per node to use for columnstore table scans.
  • Queries that use the IN filter on shard keys now perform better.
  • Hash joins that use reference tables now perform better.
  • The data_conversion_compatibility_level engine variable can now be set to 7.0. At this level, more invalid type conversions raise an error instead of being converted implicitly, which could give the user unexpected results.
  • Now, a maximum of 1024 tables can be written to in a transaction.

Query Optimization

  • Column statistics are now automatically gathered on rowstore tables. This is in addition to column statistics being automatically gathered on columnstore tables, which was done previously.
  • Range statistics (histograms) are now automatically gathered on rowstore and columnstore tables.
  • Rowstore table sampling has been improved.
  • Now, a true row-level random sample of rows from columnstore tables is maintained automatically and used to estimate the selectivity of complex predicates.
  • All shapes of bushy joins are now supported.
  • Common Table Expressions (CTEs) can now be materialized; redundant expressions can use cached data instead of being recomputed.

Data Storage, Loading, and Backup

  • Performance improvements for bulk inserts, large insert/selects, LOAD DATA, and pipelines to columnstore tables (particularly for wide columnstore tables).
  • Added support for publishing data to Kafka via SELECT … INTO KAFKA ….
  • Added support for Parquet in CREATE PIPELINE.
  • Improved error reporting for pipelines and LOAD DATA.
  • Added support for incremental backup of columnstore data.
  • Added the experimental engine variable enable_varbuffer_dictionary_compression. When set to ON, this setting enables compression of identical strings in columns with the VARCHAR, VARBINARY, LONGTEXT, LONGBLOB, MEDIUMBLOB, BLOB, TINYBLOB, MEDIUMTEXT, TEXT, and TINYTEXT data types.
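
Compression of identical strings is commonly done with dictionary encoding. The sketch below shows that general technique in Python. It is an assumption for illustration only, since these notes do not describe how enable_varbuffer_dictionary_compression stores data internally; the function names dictionary_encode and dictionary_decode are hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Store each distinct string once and replace every value with a small integer code."""
    dictionary: List[str] = []   # distinct strings, in order of first appearance
    index: Dict[str, int] = {}   # string -> position in the dictionary
    codes: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Illustrative column with repeated values.
column = ["us-east", "us-west", "us-east", "us-east", "eu"]
dictionary, codes = dictionary_encode(column)
restored = dictionary_decode(dictionary, codes)
```

The more often values repeat, the larger the saving, because each repeated string is replaced by a short code.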

System Management

  • Added a column to information_schema.TABLE_STATISTICS that indicates whether a table is a rowstore or a columnstore table.
  • Added columns to the MV_QUERY_ACTIVITIES and MV_QUERY_ACTIVITIES_EXTENDED_CUMULATIVE views.
  • Added the MV_GLOBAL_STATUS, MV_GLOBAL_VARIABLES, MV_SYSINFO_CPU, MV_SYSINFO_CPU_DISK, MV_SYSINFO_CPU_LIST, MV_SYSINFO_MEM, and MV_SYSINFO_NET views.

Functional Extensions

  • Cross-database views are now supported.
  • Added built-in support for time series reporting through the new FIRST, LAST, and TIME_BUCKET functions.
  • GROUP_CONCAT() now supports the ORDER BY clause.
  • Added the MEDIAN() aggregate function.
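
As a rough illustration of what time series bucketing does, the following Python sketch groups timestamped values into fixed-width buckets and picks the earliest and latest value in each bucket. It models the general idea only and is not the SQL syntax or the engine's implementation. The helper names time_bucket and first_last_by_bucket and the choice of the Unix epoch as the bucket origin are assumptions made for this example.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

# Assumed origin for aligning buckets; chosen for illustration only.
ORIGIN = datetime(1970, 1, 1)


def time_bucket(width: timedelta, ts: datetime, origin: datetime = ORIGIN) -> datetime:
    """Return the start of the fixed-width bucket that contains ts."""
    whole_buckets = (ts - origin) // width  # floor division yields an int
    return origin + whole_buckets * width


def first_last_by_bucket(
    points: Iterable[Tuple[datetime, float]], width: timedelta
) -> Dict[datetime, Tuple[float, float]]:
    """For each bucket, return (value at earliest timestamp, value at latest timestamp)."""
    buckets: Dict[datetime, List[Tuple[datetime, float]]] = {}
    for ts, value in points:
        buckets.setdefault(time_bucket(width, ts), []).append((ts, value))
    return {
        start: (
            min(items, key=lambda p: p[0])[1],  # "first" value in the bucket
            max(items, key=lambda p: p[0])[1],  # "last" value in the bucket
        )
        for start, items in buckets.items()
    }


# Illustrative readings grouped into five-minute buckets.
readings = [
    (datetime(2020, 1, 1, 9, 4, 59), 12.0),
    (datetime(2020, 1, 1, 9, 0, 30), 10.0),
    (datetime(2020, 1, 1, 9, 5, 0), 11.0),
]
summary = first_last_by_bucket(readings, timedelta(minutes=5))
```

A typical use is an open and close price per interval: the first value in each bucket is the open and the last is the close.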

Maintenance Release Changelog

2020-01-13 Version 7.0.10

  • Now, the REPLICATE DATABASE command requires a master database and its slave database to have the same durability setting (SYNC or ASYNC).
  • Fixed an issue that caused STOP REPLICATING to encounter an error after a database was upgraded from MemSQL 6.x to MemSQL 7.0.
  • Now, LOAD DATA incorporates the behavior of the ignore_insert_into_computed_column engine variable.
  • Fixed a crash that sometimes occurred when calling the MAX function on an ENUM column.
  • Now, HDFS Pipelines skip files in the data source that are under the _temporary directory.
  • Added the clause NULL DEFINED BY ... OPTIONALLY ENCLOSED to the syntax of LOAD DATA .... When this clause is used, LOAD DATA counts NULL values enclosed with quotes as NULLs.
  • Now, reference databases are snapshotted after ALTER TABLE is run. Also, all databases are snapshotted after TRUNCATE is run.
  • Fixed a memory leak in queries that use full-text indexes.
  • Fixed an issue with database replication that occurred when the snapshots_to_keep engine variable is set to 1.
  • Now, when columnstore sampling is disabled on a table, the existing samples associated with the table are deleted. Also, the sampling flag is reset on all segments associated with the table, so they are resampled if columnstore sampling is re-enabled.

2019-12-10 Version 7.0.9

  • Initial GA release of MemSQL 7.0