MemSQL 5.5 introduces MemSQL Pipelines, a native mechanism for scalable real-time data ingestion from a wide variety of streaming and static data sources. In addition, this release delivers up to 3x-5x faster query performance, driven by a novel hash-table design, use of Bloom filters, native support for semi-joins and anti-joins, and improved concurrency management for distributed joins. MemSQL 5.5 also improves ease of use with new functionality such as Query Profiling, Workload Management, a more scalable version of MemSQL Ops, and faster recovery for row stores.
MemSQL Pipelines is a native database feature that supports real-time data ingestion from Kafka streams with exactly-once semantics. Pipelines provides a robust, scalable, and highly performant way of extracting, transforming, and loading external data for distributed workloads. For details, see the Pipelines documentation.
MemSQL 5.5 introduces a new hash table design that, combined with Bloom filters, speeds up hash joins by up to 3x-5x.
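To illustrate why a Bloom filter helps a hash join, here is a minimal Python sketch. It is not MemSQL's implementation: the `BloomFilter` class, the `bloom_hash_join` function, and the filter sizing are all assumptions made for illustration. The filter is built alongside the hash table, and probe rows whose key is definitely absent are discarded before the hash table is touched.

```python
import hashlib


class BloomFilter:
    """Hypothetical fixed-size Bloom filter with k salted hash positions per key."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive k positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_hash_join(build_rows, probe_rows, key):
    """Inner-join two lists of dicts on `key`.

    Returns the joined rows and the number of probe rows the filter rejected.
    """
    table = {}
    bloom = BloomFilter()
    for row in build_rows:  # build phase: hash table plus filter
        table.setdefault(row[key], []).append(row)
        bloom.add(row[key])

    joined, skipped = [], 0
    for row in probe_rows:  # probe phase
        if not bloom.might_contain(row[key]):
            skipped += 1  # rejected without a hash-table lookup
            continue
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined, skipped
```

A Bloom filter can report false positives but never false negatives, so the join result is the same with or without it; only the number of hash-table probes changes.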
More efficient utilization of threads and connections delivers increased performance for distributed joins and enables higher levels of concurrent queries.
MemSQL now delivers native support for executing certain correlated subqueries as semi-joins and anti-joins.
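As a rough illustration of the two operators, the Python sketch below treats a semi-join as the equivalent of an `EXISTS` subquery and an anti-join as the equivalent of `NOT EXISTS`. The function names and row layout are hypothetical, and the sketch ignores SQL NULL semantics; it shows the logical result, not how MemSQL executes the query.

```python
def semi_join(left_rows, right_rows, key):
    """Keep each left row that has at least one match on `key` in right_rows.

    Unlike an inner join, a left row appears at most once in the output,
    no matter how many right rows match it.
    """
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] in right_keys]


def anti_join(left_rows, right_rows, key):
    """Keep each left row that has no match on `key` in right_rows."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] not in right_keys]
```

Because both operators only test whether a match exists, the right side can be reduced to a set of keys, which avoids producing and then de-duplicating joined rows.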
The query profiling feature enables customers to run a query with the PROFILE option and examine the execution details of each step of the query: the number of rows processed and the actual processing time. The query profiler helps trace performance issues to particular steps in query execution. For details, see the PROFILE documentation.
System administrators have the challenging task of coordinating query workloads from a large user base. The workload manager prevents system overload by limiting the resource usage of queries executing at any given time. It queues queries that cannot run immediately and runs them later, when capacity becomes available. For details, see the Workload Management documentation.
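The queueing behavior described above can be sketched as a simple admission controller. This is only a toy model under stated assumptions: the `WorkloadManager` class, the single abstract `cost` unit per query, and the first-in-first-out policy are all hypothetical and say nothing about how MemSQL actually measures or schedules resource usage.

```python
from collections import deque


class WorkloadManager:
    """Toy admission controller: run queries while capacity allows, queue the rest."""

    def __init__(self, capacity):
        self.capacity = capacity  # total resource units available (assumed unit)
        self.in_use = 0
        self.running = {}  # query id -> cost
        self.waiting = deque()  # (query id, cost), first in, first out

    def submit(self, query_id, cost):
        """Enqueue a query and return the ids of queries started as a result."""
        if cost > self.capacity:
            raise ValueError("query can never fit within capacity")
        self.waiting.append((query_id, cost))
        return self._admit()

    def finish(self, query_id):
        """Release a finished query's resources and start queued queries that now fit."""
        self.in_use -= self.running.pop(query_id)
        return self._admit()

    def _admit(self):
        started = []
        # Strict FIFO: stop at the first queued query that does not fit.
        while self.waiting and self.in_use + self.waiting[0][1] <= self.capacity:
            query_id, cost = self.waiting.popleft()
            self.running[query_id] = cost
            self.in_use += cost
            started.append(query_id)
        return started
```

Strict first-in-first-out ordering keeps a large queued query from being starved by a stream of small ones, at the price of leaving some capacity idle while it waits.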
MemSQL Ops provides administration of MemSQL clusters. With MemSQL 5.5, MemSQL Ops supports clusters with more than 100 nodes. Additionally, MemSQL Ops provides a user interface for the newly introduced MemSQL Pipelines. For more information, see MemSQL Ops Releases.
In the event of a node failure, MemSQL now offers faster recovery for row stores with secondary indexes.
Joining against a reference table removes the need to reshuffle data, because the join is co-located. MemSQL 5.5 adds support for reference tables in the column store.
It is now possible to debug LOAD DATA processes in MemSQL, which improves developer productivity. The LOAD DATA ... IGNORE command stores the errors it encounters during execution. These errors can be displayed with the new SHOW LOAD ERRORS command and then used for corrective action. Errors are stored in an in-memory buffer that is deleted when a subsequent query is run.
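The lifecycle of the error buffer can be modeled in a few lines of Python. Everything here is a hypothetical stand-in (the `Session` class, its method names, and the two-column row format are assumptions, and this is not a MemSQL client API); the sketch only mirrors the behavior described above, where bad rows are skipped, their errors are retained, and the buffer is discarded by the next query.

```python
class Session:
    """Toy model of a session that loads rows and buffers load errors in memory."""

    def __init__(self):
        self.table = []  # successfully loaded (id, amount) rows
        self.load_errors = []  # (line number, raw line, message)

    def load_data_ignore(self, lines):
        """Load 'id,amount' lines, skipping bad ones; return the number of rows loaded."""
        self.load_errors = []
        loaded = 0
        for line_no, line in enumerate(lines, start=1):
            try:
                ident, amount = line.split(",")
                row = (int(ident), float(amount))
            except ValueError as exc:
                # Record the problem instead of aborting the whole load.
                self.load_errors.append((line_no, line, str(exc)))
                continue
            self.table.append(row)
            loaded += 1
        return loaded

    def show_load_errors(self):
        """Return the buffered errors (assumed here not to clear the buffer)."""
        return list(self.load_errors)

    def run_query(self):
        """Any later query discards the error buffer, then returns the table."""
        self.load_errors = []
        return list(self.table)
```

In this model the errors must be inspected right after the load, before running anything else in the same session, which matches the note above about the buffer's lifetime.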
- Columnstore tables now replicate much faster.
- Pipelines transforms now execute much faster.
- Performance has been improved during cluster failover by reducing the duration of partition locks.
- Fixed an issue where broadcast operations on temporary tables could fail with an error in some cases.
- Fixed an issue with queries that require many seek operations.
- Fixed an issue with ALTER TABLE CHANGE where renaming a column could lead to table corruption on recovery.
- PROFILE is now more accurate for column store scans.
- Added a global variable that disables parallel query execution for leaf nodes: enable_multipartition_queries = false.
- Fixed an instability and accuracy issue that could occur for users upgrading query optimizer statistics from 5.1 to 5.5.