5.0 Release Notes
New MemSQL Code Generation Architecture
MemSQL 5 features a completely revamped code generation architecture, which greatly improves query compilation speed. This enables much faster performance on ad-hoc SQL queries, such as those from Business Intelligence visualization dashboards.
This new code generation architecture introduces industry-first LLVM code generation for SQL queries, featuring a new SQL to LLVM to machine code compiler built into the engine. This allows much faster query compilation compared to prior versions of MemSQL, which generated and compiled query execution plans in C++.
Learn more about the new MemSQL code generation architecture in the Code Generation section.
Query Optimization Improvements
The MemSQL 5 query optimizer now uses more collected statistics and histograms to choose better query plans. It also adds improved plan invalidation to trigger recompilation of plans when data statistics change and a new plan could be faster. Learn more in the ANALYZE section.
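As a minimal sketch of how statistics might be refreshed so the optimizer can pick up a better plan, the statement below runs ANALYZE on a single table. The table name `orders` is hypothetical, and the exact options available are described in the ANALYZE section rather than here.

```sql
-- Hypothetical table name. Collect fresh statistics after a large data change
-- so that the optimizer can reconsider its plans for queries on this table.
ANALYZE TABLE orders;
```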
The query optimizer also features improved hash join selection, improved distributed and local join optimization, better selection of bushy joins, elimination of unnecessary tables, views, and filters, intelligent selection of distributed group by execution plans, optimizations to perform certain distributed joins on the aggregator for better concurrency, and many more improvements to choose more efficient query plans.
Columnstore Performance Improvements
MemSQL 5 includes a suite of features that improve columnstore query execution, including:
- Operations on compressed data: for some data types, MemSQL can apply filters without decompressing data from its serialized-for-disk format.
- Batch scanning: read data and apply filters in batches for faster scan performance.
- Prefetching: prefetching segments improves query execution when the table does not fit in memory.
In addition, columnstore query execution benefits from several other improvements including faster decimal aggregations, query optimizer improvements, and join execution improvements.
Other Query Execution Performance Improvements
MemSQL 5 includes various other query execution performance improvements, including improved hash join and hash group-by execution, constant folding, improved index scan performance in certain cases, and enhancements to take advantage of shard key matches in more cases.
EXPLAIN now produces a more informative description of query execution plans, showing a more detailed view of query execution operations as a hierarchical tree. The new representation is especially improved for analytical queries. More information can be found on the EXPLAIN page.
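To illustrate, EXPLAIN can be prefixed to an analytical query such as the one below. The `orders` and `customers` tables and their columns are hypothetical, and the output is omitted because its exact shape depends on the schema, data, and version.

```sql
-- Hypothetical schema: orders(customer_id, ...) and customers(id, region, ...).
-- Prefixing the query with EXPLAIN prints its execution plan instead of running it.
EXPLAIN
SELECT c.region, COUNT(*) AS order_count
FROM orders o
JOIN customers c ON o.customer_id = c.id
GROUP BY c.region;
```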
MemSQL 5 now supports temporary tables (CREATE TEMPORARY TABLE ...). Learn more in the Temporary Tables section.
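A minimal sketch of a temporary table used to stage intermediate results follows. The table and column names are hypothetical, and restrictions on temporary tables are covered in the Temporary Tables section.

```sql
-- Hypothetical staging table for intermediate results.
CREATE TEMPORARY TABLE recent_orders (
    id BIGINT,
    customer_id BIGINT,
    total DECIMAL(18, 2)
);

-- Populate it from a hypothetical permanent table, then query it like any other table.
INSERT INTO recent_orders
SELECT id, customer_id, total
FROM orders
WHERE created_at >= '2016-01-01';

SELECT customer_id, SUM(total) AS spent
FROM recent_orders
GROUP BY customer_id;
```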
Window Functions and OVER clause
MemSQL 5 now supports several window functions and the OVER clause. Window functions perform ranking and aggregate calculations across sets of rows that bear some relation to the current row. Learn more in the Window Functions Guide.
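The query below sketches both kinds of calculation, a ranking and an aggregate, over rows related to the current row. It assumes a hypothetical `orders` table and assumes that RANK and SUM are among the supported window functions; the Window Functions Guide has the authoritative list.

```sql
-- Hypothetical table: orders(id, customer_id, total).
-- Each output row keeps its own values and gains two per-customer calculations.
SELECT
    customer_id,
    id AS order_id,
    total,
    -- Rank of this order among the same customer's orders, largest total first.
    RANK() OVER (PARTITION BY customer_id ORDER BY total DESC) AS total_rank,
    -- Sum of all of this customer's order totals, repeated on every row.
    SUM(total) OVER (PARTITION BY customer_id) AS customer_total
FROM orders;
```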
Other SQL Features
MemSQL 5 adds support for several other SQL features, including but not limited to:
- UNION [DISTINCT] (in addition to UNION ALL)
- CREATE TABLE … SELECT
- DATETIME with microsecond precision (DATETIME(6)) in columnstore tables
- Non-alphanumeric table names
- Several new functions and operators, such as: ASCII, STR_TO_DATE, LEFT, RIGHT, FIELD, TO_BASE64, FROM_BASE64, GET_FORMAT
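The statements below sketch a few of the features listed above. All table and column names are hypothetical, and the function calls assume MySQL-compatible argument conventions.

```sql
-- UNION without ALL removes duplicate rows across the two result sets.
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- CREATE TABLE ... SELECT: create a table from the result of a query.
CREATE TABLE big_orders AS
SELECT * FROM orders WHERE total > 1000;

-- DATETIME(6) column in a columnstore table.
CREATE TABLE events (
    event_time DATETIME(6),
    payload VARCHAR(255),
    KEY (event_time) USING CLUSTERED COLUMNSTORE
);

-- A few of the new string and date functions.
SELECT
    LEFT('MemSQL', 3),
    RIGHT('MemSQL', 3),
    ASCII('A'),
    TO_BASE64('abc'),
    STR_TO_DATE('2016-05-31', '%Y-%m-%d');
```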
Breaking SQL Changes
MemSQL 5 removes a non-standard SQL semantics guarantee that MemSQL 4.1 and earlier provided when selecting non-aggregated, non-grouped fields alongside MAX. See MemSQL MIN/MAX Row Correlation.
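As a hedged illustration of the kind of query affected, assume the old guarantee concerned which row supplies the non-aggregated value (the linked page has the precise definition). The `players` table is hypothetical. A query that relied on that behavior can be rewritten so the row is chosen explicitly.

```sql
-- Pattern that relied on the non-standard behavior: `name` is neither
-- aggregated nor grouped, so which row it comes from is no longer guaranteed.
SELECT name, MAX(score) FROM players;

-- One explicit alternative: pick the row with the highest score directly.
SELECT name, score
FROM players
ORDER BY score DESC
LIMIT 1;
```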
SQL Compatibility Changes
MemSQL 5 allows creation of tables without a SHARD KEY or PRIMARY KEY, instead of requiring one to be specified. In this case MemSQL defaults to sharding rows uniformly, which is equivalent to specifying an empty shard key, SHARD KEY (). See Porting Tables to MemSQL for more information and for the performance considerations of using this default empty shard key.
CREATE TABLE statements supported prior to MemSQL 5, which include a SHARD KEY or PRIMARY KEY, are unchanged and use the same shard key as before.
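The three definitions below sketch the difference. Table and column names are hypothetical. The first two should shard rows the same way, while the third shards on an explicit column as before.

```sql
-- New in MemSQL 5: no SHARD KEY or PRIMARY KEY; rows are sharded uniformly.
CREATE TABLE clicks (
    user_id BIGINT,
    url VARCHAR(2048),
    clicked_at DATETIME
);

-- Equivalent definition with the empty shard key written out.
CREATE TABLE clicks_explicit (
    user_id BIGINT,
    url VARCHAR(2048),
    clicked_at DATETIME,
    SHARD KEY ()
);

-- Unchanged behavior: rows are sharded on the named column.
CREATE TABLE clicks_by_user (
    user_id BIGINT,
    url VARCHAR(2048),
    clicked_at DATETIME,
    SHARD KEY (user_id)
);
```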
MemSQL users can now grant database privileges using PAM, thus allowing MemSQL authentication through PAM. Learn more in the Using MemSQL and PAM section.
MemSQL 5 has improved replication divergence detection to improve cluster resiliency.
MemSQL Ops 5 now shows statistics for databases, tables, and data within tables in graphical format. Learn more in the Schema Explorer Page section.
MemSQL Ops 5 now shows statistics for queries running in the database in graphical format. Learn more in the Query Explorer Page section.
Data Import from MySQL, Amazon S3, and Hadoop
MemSQL Streamliner with MemSQL Ops 5 now supports defining data import jobs from MySQL database tables, CSVs in Amazon S3, and CSVs in HDFS. Learn more in the Data Import Page section.
Maintenance Release Changelog
2016-05-31 Version 5.0.9 (LATEST)
- Upgraded SSL framework OpenSSL to version 1.0.1t
2016-05-10 Version 5.0.8
- Support for the ANSI_QUOTES SQL mode
- Fixed a bug where some queries on columnstore tables could return wrong results if the order of columns in the clustered columnstore index differed from the order of the columns in the CREATE TABLE DDL
2016-04-25 Version 5.0.7
- Fixed an issue where replication did not work as expected after an upgrade to MemSQL 5: HA slaves would stop replicating and not automatically re-provision themselves
- Temp tables are now supported in distributed joins
- EXPLAIN now outputs warnings for missing statistics
- Various code generation and query execution performance improvements for rowstore
- Fixed several other small bugs in query execution