MemSQL Documentation

MemSQL is a high-performance, in-memory database that combines the horizontal scalability of distributed systems with the familiarity of SQL.


5.7 Release Notes

MemSQL 5.7 brings a number of improvements to query processing and performance, new query processing extensions, management enhancements, a new DATE_TRUNC function, and support for Amazon S3 as a data source for MemSQL Pipelines.

Important

The MemSQL Terms of Service for this software have changed. By downloading and/or using the software, you acknowledge you have read and agree to the MemSQL terms of service.

Amazon S3 Pipelines

Amazon S3 support has been added for MemSQL Pipelines. MemSQL can now ingest objects from S3 buckets in a massively parallel way with exactly-once semantics. See the S3 Pipelines docs for more information, including a quickstart for how to start using an S3 pipeline immediately.
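
As a rough sketch of what an S3 pipeline definition can look like: the pipeline name, bucket name, target table, and credential values below are placeholders, and the exact clauses should be checked against the S3 Pipelines docs for your version.

```sql
-- Hypothetical names: pipeline books_pipeline, bucket 'my-bucket-name', table classic_books.
-- The credential values are placeholders, not real keys.
CREATE PIPELINE books_pipeline
AS LOAD DATA S3 'my-bucket-name'
CREDENTIALS '{"aws_access_key_id": "YOUR_ACCESS_KEY_ID", "aws_secret_access_key": "YOUR_SECRET_ACCESS_KEY"}'
INTO TABLE classic_books
FIELDS TERMINATED BY ',';

-- Begin ingesting objects from the bucket.
START PIPELINE books_pipeline;
```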

Query Performance Enhancements

Query performance has been improved through enhancements to the query optimizer:

  • Scans with impossible predicates are optimized out in more situations. For example, a query like SELECT * FROM T WHERE 0=1 does not scan T; it returns immediately.
  • INSERT INTO…SELECT… queries with a computed shard key column on the target table can now be executed with a reshuffle, which improves performance when inserting large amounts of data.
  • Performance has been improved for joins using the null-safe equals operator <=>.
  • Performance has been improved for queries like SELECT COUNT(DISTINCT c1) FROM t GROUP BY c2 via an internal optimizer query rewrite.
  • Memory usage and performance have been improved for queries with multiple COUNT(DISTINCT …) expressions, via an internal optimizer query rewrite.
  • Short circuit evaluation for LIMIT queries has been improved. The aggregator node now terminates computations on leaf nodes as soon as the limit is reached.
  • Reshuffles are reduced when the rows inserted into a table are selected from the same table.
  • DATE_FORMAT is automatically rewritten to DATE_TRUNC when applicable, because the string manipulation done by DATE_FORMAT is more expensive than truncating the date. For example, DATE_FORMAT(date_col,"%Y-%m-01 00:00:00") can be automatically rewritten to DATE_TRUNC("month",date_col) in situations where its result is immediately converted back to a date.
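
The query shapes named in the list above can be sketched as follows. The tables t, a, and b and their columns are hypothetical, and the statements only illustrate the kinds of queries that benefit; they do not show the optimizer's internal rewrites.

```sql
-- Impossible predicate: the result is empty, so t does not need to be scanned.
SELECT * FROM t WHERE 0 = 1;

-- COUNT(DISTINCT ...) combined with GROUP BY.
SELECT c2, COUNT(DISTINCT c1) FROM t GROUP BY c2;

-- Several COUNT(DISTINCT ...) expressions in one query.
SELECT COUNT(DISTINCT c1), COUNT(DISTINCT c2) FROM t;

-- Join on the null-safe equals operator: two NULL keys compare as equal.
SELECT a.id, b.id FROM a JOIN b ON a.k <=> b.k;

-- LIMIT query: work on the leaves can stop once 10 rows have been produced.
SELECT * FROM t LIMIT 10;
```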

Query Processing Extensions

  • A LOAD DATA … SKIP … ERRORS option has been added, to skip loading of rows with errors. See the updated LOAD DATA docs for more information and a comparison between this feature and IGNORE.
  • Window functions with PARTITION BY in distributed joins are now supported.
  • Support for correlated subqueries in distributed plans has been improved.
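
A minimal sketch of the first two extensions, assuming hypothetical tables orders and customers and a hypothetical file path. The ALL variant of the SKIP … ERRORS option is used here as an assumption; the LOAD DATA docs list the variants that are actually supported.

```sql
-- Load a CSV file, skipping rows that produce errors instead of aborting the load.
-- The file path and table name are placeholders.
LOAD DATA INFILE '/tmp/orders.csv'
SKIP ALL ERRORS
INTO TABLE orders
FIELDS TERMINATED BY ',';

-- Window function with PARTITION BY evaluated over a join of two tables.
SELECT c.region, o.id, o.amount,
       RANK() OVER (PARTITION BY c.region ORDER BY o.amount DESC) AS amount_rank
FROM orders o
JOIN customers c ON o.customer_id = c.id;
```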

DATE_TRUNC

The DATE_TRUNC(period, timestamp) built-in function is now supported. DATE_TRUNC returns the beginning of the specified period (e.g. the start of Monday for the period "week"). It supports grouping on rounded time buckets for convenient summary reports by day, week, etc. See the DATE_TRUNC docs for more information.
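
For instance, a weekly summary report might be written along these lines. The events table and its created_at column are hypothetical.

```sql
-- Count events per week; each bucket is labeled with the start of its week.
SELECT DATE_TRUNC('week', created_at) AS week_start,
       COUNT(*) AS event_count
FROM events
GROUP BY DATE_TRUNC('week', created_at)
ORDER BY week_start;
```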

Manageability Enhancements

A distributed plan cache has been made available in information_schema through two new system views:

  • distributed_plancache displays plan caches for all nodes in the cluster. It contains rows for both leaf and aggregator queries.
  • distributed_plancache_summary has one record per query for the whole cluster. It sums up activity over all nodes.

Other manageability changes:

  • Full query text including parameters is now displayed in results for SHOW PROCESSLIST and queries on information_schema.processlist. In MemSQL 5.5 and older, only the parameterized query text was displayed. To control the visibility of query parameters, a new global variable named show_query_parameters has been added. For more information, see SHOW PROCESSLIST.
  • A lightweight process ID (thread ID) column LWPID has been added to the information_schema.processlist table.
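
These additions can be explored with queries along the following lines. This is a sketch only: the views' column sets are not listed here, so SELECT * is used, and the ID, USER, and INFO columns are assumed from the MySQL-style processlist layout.

```sql
-- One row per query for the whole cluster.
SELECT * FROM information_schema.distributed_plancache_summary LIMIT 10;

-- Per-node plan cache entries, for both leaf and aggregator queries.
SELECT * FROM information_schema.distributed_plancache LIMIT 10;

-- Running queries, including the new LWPID (thread ID) column.
SELECT ID, USER, LWPID, INFO FROM information_schema.processlist;

-- Inspect the variable that controls whether query parameters are shown.
SHOW VARIABLES LIKE 'show_query_parameters';
```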

Maintenance Release Changelog

The changelog below contains MemSQL Database improvements and bug fixes introduced in maintenance or revision releases. For a similar list for MemSQL Ops, see MemSQL Ops Releases.

2017-02-08 Version 5.7.3 (LATEST)

  • Fixed an issue where S3 pipelines would indefinitely suspend data extraction if connection issues or throttling occurred between Amazon S3 and MemSQL.
  • Fixed an issue where users with REQUIRE SSL permissions could not execute ANALYZE TABLE commands.

2017-01-26 Version 5.7.2

  • Fixed an issue where Rows_returned_by_reads and Rows_affected_by_writes in SHOW STATUS EXTENDED reported inaccurate values.
