6.7 Release Notes
The MemSQL 6.7 release is focused on usability, compatibility, and performance. For usability, we’ve made major enhancements to our toolset for deploying and managing MemSQL, added ways to understand how tables are being used so that you can tune your table structures, and added automatic sampling for improved query optimization, resource management extensions, and more. Compatibility functions have been added for date handling and regular expression matching to make it easier to port queries from legacy databases. The performance of data loading and star join query processing has also improved dramatically.
Beginning with this release, MemSQL may be used free of charge in production for clusters with up to 128GB of total RAM capacity.
See the descriptions and changelog below for more information on these new features.
The following query execution features have been added:
- Star join query execution performance has improved through the addition of support for joining on integer columns in columnstore tables, using Intel Single-Instruction Multiple-Data (SIMD) instructions and operations directly on encoded (compressed) data. Some join/group by/aggregate queries have improved by over 100X.
- Faster filtering on columnstores
- Improved IN-list filter performance
Query Optimization, Compilation and Statistics
The following query optimizations have been added in this release:
- Cardinality estimation (CE) improvements for IN-lists using histograms
- Fast histogram creation on row store tables with random sampled scan
- Fast, accurate, uniform random sampling for complex predicates CE on row store tables using random sampled scan
- interpret_first setting to improve ad hoc (first run) query performance (experimental)
- OPTION clause for global query options, applicable to SELECT, INSERT…SELECT, UPDATE, DELETE
- Added warnings in EXPLAIN and information_schema for queries with comparisons between mismatched datatypes which may involve unsafe datatype conversions and/or degrade performance.
- New management views (MV_AGGREGATED_COLUMN_USAGE) to track how columns are used in queries to help with index tuning and application design
- Improvements to query compilation speed
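The random-sampling items in the list above can be illustrated with a minimal sketch of sample-based cardinality estimation. This is a textbook technique written under stated assumptions, not MemSQL's implementation: the function name estimate_cardinality, the dict-shaped rows, and the Bernoulli sampling scheme are all hypothetical choices made for illustration.

```python
import random
from typing import Callable, Iterable, Optional


def estimate_cardinality(
    rows: Iterable[dict],
    predicate: Callable[[dict], bool],
    sample_ratio: float,
    seed: Optional[int] = None,
) -> float:
    """Estimate how many rows satisfy `predicate` from a uniform random sample.

    Hypothetical helper for illustration only; it does not reflect how
    MemSQL samples row store tables internally.
    """
    if not 0.0 < sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be in (0, 1]")
    rng = random.Random(seed)
    matches = 0
    for row in rows:
        # Bernoulli sampling: keep each row independently with
        # probability sample_ratio, so the sample is uniform.
        if rng.random() < sample_ratio and predicate(row):
            matches += 1
    # Scale the sampled match count back up to the whole table.
    return matches / sample_ratio


if __name__ == "__main__":
    table = [{"qty": i, "region": i % 7} for i in range(100_000)]
    # A "complex" predicate that a single-column histogram cannot answer.
    est = estimate_cardinality(
        table, lambda r: r["qty"] % 10 == 0 and r["region"] in (1, 3), 0.05, seed=1
    )
    print(f"estimated matching rows: {est:.0f}")
```

Because the predicate is evaluated on actual sampled rows, correlations between columns are captured without any per-column independence assumption, which is the usual motivation for sampling over histograms when predicates are complex.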
The following new functional capabilities are supported:
- Dynamic SQL
- Regular expression support for advanced regular expressions (ARE) and common Perl regular expression extensions. Added functions
- APPROX_COUNT_DISTINCT_* functions based on the HyperLogLog algorithm to allow persisting and rolling up approximate count distinct state for fast count distinct estimates
- UNION/UNION ALL on the right side of
- SELECT FOR UPDATE to allow lock stability between statements for online transaction processing workloads
- MySQL binary protocol and prepared statements
- Fast random sampled scans on row store tables using table_name WITH(sample_ratio = <value>)
- Next generation of command-line tools that lay a foundation for flexible, efficient deployment and management of MemSQL clusters using industry-standard tools like Puppet, Chef, Ansible, and Salt
- memsql-deploy tool for those who wish to deploy with the MemSQL-provided toolset
- Fully supported APT and YUM repositories for secure distribution of MemSQL packages. The new toolchain is distributed and installed as RPM and DEB packages.
- memsql-toolbox, a stateless tool for administering MemSQL clusters, with performance and reliability improvements over the previous-generation memsql-ops tool
- New customer portal providing a single point of access for license management, downloading MemSQL software, discussion forums, and support
- MemSQL Studio graphical management console upgrades including 6.7 support and graphical
- Resource governor extensions to control CPU usage and concurrency
- Machine-learning-based workload management extensions to estimate resources used by queries the first time they are run and queue large queries appropriately to maintain throughput and response time under load
- Management view to show columnstore merge status
- Extensions to information_schema views to track status of statistics (last update and automatic statistics status)
- Additional cluster events tracked in mv_events view
- MV_CLUSTER_STATUS_VIEW to give cluster status in queryable form
- Added memory-related information to
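The APPROX_COUNT_DISTINCT_* item in the list above mentions persisting and rolling up approximate count distinct state. The sketch below shows why HyperLogLog state can be rolled up: two sketches merge by taking the register-wise maximum, and the result is identical to a sketch built over the combined data. The class HllSketch, its register count, and the SHA-1 based hash are assumptions made for this illustration and do not describe MemSQL's internal format or function signatures.

```python
import hashlib
import math


class HllSketch:
    """Tiny HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p: int = 10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        # Derive a 64-bit hash; the choice of SHA-1 here is an assumption.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                 # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HllSketch") -> "HllSketch":
        """Roll up two sketches by taking the register-wise maximum."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        out = HllSketch(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    day1, day2 = HllSketch(), HllSketch()
    for i in range(0, 600):
        day1.add(f"user{i}")
    for i in range(400, 1000):
        day2.add(f"user{i}")
    # Rolling up two daily sketches gives a two-day distinct estimate
    # without rescanning the underlying rows.
    print(round(day1.merge(day2).estimate()))
```

The design point is that the stored state is small and mergeable, so per-day or per-partition sketches can be persisted once and combined later at any coarser granularity.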
Data Loading, Backup and Restore
- Adaptive data compression for LOAD DATA (up to 2X performance gain)
- Tar backup to Amazon S3
- Backup/restore to/from Azure Blob Store
- Enhanced metadata for backups in MV_BACKUP_HISTORY view as well as backup files
- Enhanced validation prior to restore or disaster recovery replicate operations, to ensure there is enough space on the target system
- SELECT INTO OUTFILE on S3
- JSON and Avro support for LOAD DATA and pipelines
Maintenance Release Changelog
2019-01-14 Version 6.7.9
- RESTORE of partition databases on leaves always denotes the databases as a partition in metadata. Previously, it was possible to trick the RESTORE command into restoring the partition as a reference database (such as when running BACKUP on a leaf node).
2019-01-07 Version 6.7.8
- SHOW CLUSTER STATUS uses connection pooling to communicate with other nodes in the cluster. This allows the command to run more safely when it is executed repeatedly.
- Now provide more troubleshooting help when MemSQL runs out of memory when attempting to run a query.
- Fixed an issue where BLOBs were not being restored during recovery of a slave database.
- Fixed an issue where the state column in the information_schema.PROCESSLIST table was not being cleared properly after a query ended.
- Now provide more detailed information in the tracelog about heartbeat failures when a node fails.
- Added a system variable auditlog_disk_sync that delays when the audit log gets written to the disk. This delay improves the performance of audit logging. This system variable delays disk syncs by default (i.e. the default value is
- Fixed an issue where zero-byte files were being created in the plancache directory during a hard shutdown such as a power outage.
- Fixed a master aggregator crash that occurred when running REBALANCE ... FORCE on a database with a table that has a shard key on a computed column.
- memsqlctl create-node command now requires a password to be specified.
- Now, the memsqlctl restart-node commands do not end the memsqld_safe process if the commands cannot connect to the node.
2018-12-21 Version 6.7.7
- Fixed an issue where users could not log in to DR secondary clusters when sync_permissions is enabled.
- No longer truncate the results of the CURRENT_SECURITY_ROLES function to 64 characters if used inside a sub-query.
- Now generate an error when attempting to create an array having a negative size.
- Fixed an issue with the PERCENTILE_DISC function rounding to the wrong percentile.
- Fixed an issue where the master aggregator incorrectly populated the database name in the FILE field of the
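For readers unfamiliar with the semantics behind the PERCENTILE_DISC fix above, the sketch below implements the standard SQL definition: return the first value, in sorted order, whose cumulative distribution is at least the requested fraction. The function name percentile_disc is a hypothetical Python helper; the sketch says nothing about how the MemSQL bug arose or was fixed.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def percentile_disc(values: Sequence[T], p: float) -> T:
    """Discrete percentile: smallest value whose cumulative share is >= p."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    ordered = sorted(values)
    n = len(ordered)
    for i, v in enumerate(ordered, start=1):
        # The i-th smallest value covers a cumulative share of i / n.
        if i / n >= p:
            return v
    return ordered[-1]  # not reached, since n / n == 1.0 >= p


if __name__ == "__main__":
    data = [40, 10, 30, 20]
    print(percentile_disc(data, 0.5))
    print(percentile_disc(data, 0.76))
```

Comparing i / n against p directly, rather than computing an index such as ceil(p * n), avoids floating-point products that land just above an integer and select the next value by mistake.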
2018-12-18 Version 6.7.6
- Fixed optimizer error when using binary builtins.
- The information_schema.users table now includes local users when sync_permissions is enabled.
- Added systemd service file to memsql-server packages. This allows memsqlctl-managed nodes to start automatically when the cluster starts.
2018-12-11 Version 6.7.5
- Sped up LOAD DATA CSV parsing for files with short lines.
- Fixed a crash when parsing LOAD DATA statements that used some of the optional clauses in particular orders.
- Fixed a crash when revoking privileges from the root user when
- Fixed a MySQL wire protocol incompatibility with the FIELD_LIST protocol command. Newer MariaDB clients were getting “lost connection” errors as a result of this incompatibility.
- Reference tables now reserve AUTO_INCREMENT values in batches of 1000 instead of 1 million.
- memsqld_safe now exits when it receives SIGABRT instead of passing it through to memsqld.
- Removed misleading “REQUIRED_STRING” from the description of optional memsqlctl flags.
- Added the memsqlctl enable-high-availability to skip the safety check for free disk space.
2018-12-03 Version 6.7.4
- Fixed a crash during code-generation when a large string is used inside an
- Fixed a crash when querying information_schema.ADVANCED_HISTOGRAMS on a Disaster Recovery (DR) secondary cluster.
- When MemSQL hits an out of memory error, a trace entry containing the query text of any query using more than 100 MB is now written to the tracelog to help diagnose what was using memory at the time of the error. Also included in each entry are the current query memory, average query memory, and activity name.
- Fixed an incorrect result with a correlated subselect expression using an OR condition on a reference table.
2018-11-19 Version 6.7.3
- Improved out-of-memory handling when retrieving autostats during query compilation.
- Now allow SET SESSION autocommit = 0 commands against leaf nodes. Some MySQL clients will execute this statement when they connect, so disallowing it blocked users from connecting to leaves to monitor usage with clients that run this command. SET GLOBAL autocommit=0 remains blocked on leaf nodes. Disabling autocommit globally on leaves interferes with MemSQL’s ability to coordinate transactions.
- Now block SELECT queries with no FROM clauses from compiling when MemSQL memory use is nearing its limit. Previously, SELECT queries with no FROM clauses were allowed to compile even when memory was low, because they played a part in cluster health checks that run against leaves. This is no longer the case, so there is no reason to risk compiling them when memory use is close to its limit.
- Fixed an issue where the CRC32 builtin function was not returning results consistent with the crc32c algorithm when compared to the result produced by external libraries.
- Fixed tracking of missing blobs at the end of recovery. Previously, finishing recovery of a slave database in the middle of a columnstore transaction could cause a missing blob to go untracked.
- Added variable (enable_broadcast_left_join) to control whether the optimizer chooses the broadcast left join optimization.
- Updated Brazilian time zones due to recent changes made by the country of Brazil. Specifically America/Sao_Paulo. Note: All Brazil/ time zones are deprecated.
- Correctly block restoring an individual partition database backup on the master aggregator. Before this fix, the RESTORE command would fail but an empty reference database would be left behind that couldn’t be dropped. This fix also allows the leftover empty reference database to be dropped if it had already been created before upgrading.
- Now allow the memsqlctl configuration file path to be specified using the
- Fixed a bug in memsqlctl print-secure-key that caused it to incorrectly attempt writing the state file.
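The CRC32 item in the 6.7.3 list above refers to the crc32c algorithm. CRC-32C (Castagnoli) uses a different polynomial from the more common CRC-32 found in zlib, so the two produce different checksums for the same input. The following reference sketch can be used to cross-check a value against an independent implementation; the function name crc32c is a hypothetical helper written for this note, and the table-driven approach is a standard technique rather than MemSQL's code.

```python
import zlib

# Castagnoli polynomial in bit-reflected form, as used by CRC-32C.
_CRC32C_POLY = 0x82F63B78


def _build_table() -> list:
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ _CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes) -> int:
    """Return the CRC-32C checksum of `data` as an unsigned 32-bit integer."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    payload = b"123456789"
    # The standard check input yields different results for the two variants.
    print(hex(crc32c(payload)))      # CRC-32C (Castagnoli)
    print(hex(zlib.crc32(payload)))  # CRC-32 as implemented by zlib
```

When comparing a database checksum with an external library, it is worth confirming which of the two variants that library computes before concluding that the results disagree.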
2018-11-06 Version 6.7.1
- Initial GA release of MemSQL 6.7.