Replay Configuration and Error Handling

Replication and durability both rely on logs and snapshots, which are replayed to restore a database’s state.

How Snapshots and Logs are Used

How Snapshots and Logs are Used with Durability

In-memory database updates you make using DDL and DML commands are written to logs on disk. When the size of the updates reaches snapshot_trigger_size, a snapshot is taken and written to disk. A snapshot is a full backup of the database. Following the creation of a snapshot, subsequent DDL and DML in-memory updates are again written to the logs, until snapshot_trigger_size is again reached.

Following a server restart, the latest snapshot and the logs containing the updates made after the snapshot are loaded from disk and replayed in memory.
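The cycle described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class `ToyDurableStore`, its JSON record format, and the choice to keep only the latest snapshot and discard older log records are all hypothetical and do not reflect MemSQL's actual on-disk format.

```python
import json


class ToyDurableStore:
    """Toy model of log-plus-snapshot durability (hypothetical, not MemSQL's format)."""

    def __init__(self, snapshot_trigger_size):
        self.snapshot_trigger_size = snapshot_trigger_size  # bytes of logged updates
        self.memory = {}     # in-memory state
        self.snapshot = {}   # latest snapshot "on disk"
        self.log = []        # log records written since the latest snapshot
        self.log_bytes = 0   # size of those records

    def write(self, key, value):
        record = json.dumps([key, value])
        self.log.append(record)           # the update is written to the log
        self.log_bytes += len(record)
        self.memory[key] = value          # and applied in memory
        if self.log_bytes >= self.snapshot_trigger_size:
            self.take_snapshot()

    def take_snapshot(self):
        self.snapshot = dict(self.memory)  # full copy of the current state
        self.log = []                      # later updates start a fresh log
        self.log_bytes = 0

    def recover(self):
        """Rebuild the state as after a restart: load the snapshot, then replay the log."""
        state = dict(self.snapshot)
        for record in self.log:
            key, value = json.loads(record)
            state[key] = value
        return state


store = ToyDurableStore(snapshot_trigger_size=40)
for i in range(6):
    store.write(f"k{i}", i)

# The recovered state matches what was in memory before the simulated restart.
print(store.recover() == store.memory)
```

In this sketch, recovery never replays more log data than was written since the last snapshot, which is why the trigger size bounds the amount of log replay.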

How Snapshots and Logs are Used with Replication

With replication, database partitions are copied from a primary host to a secondary host.

When a replica is provisioned, it receives an initial snapshot from the master. The replica replays this snapshot.

Going forward, the replica receives and replays logs from the master. These logs contain the in-memory database updates (from DDL and DML commands) made on the master. When the size of the logs reaches snapshot_trigger_size, as set on the master, the replica takes a snapshot. After the snapshot is created, subsequent DDL and DML in-memory updates on the master are again written to the logs that are sent to the replica, until snapshot_trigger_size (as set on the master) is reached again.

Replay Configuration

You can tune the snapshot_trigger_size and snapshots_to_keep engine variables to make efficient use of the logs and snapshots.

A large snapshot_trigger_size decreases how often snapshots are taken, but it also increases the time needed to replay the snapshot.
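To make the frequency side of this trade-off concrete, here is a minimal arithmetic sketch. It assumes an idealized model in which a snapshot is taken each time exactly the trigger size of updates has been logged. The function name `snapshot_stats` and the sizes used are illustrative only and are not MemSQL defaults.

```python
MB = 1024 * 1024


def snapshot_stats(total_logged_bytes, snapshot_trigger_size):
    """Return (snapshots taken, log bytes written since the last snapshot).

    Idealized: one snapshot per snapshot_trigger_size bytes of logged updates.
    """
    snapshots_taken = total_logged_bytes // snapshot_trigger_size
    log_bytes_since_last = total_logged_bytes % snapshot_trigger_size
    return snapshots_taken, log_bytes_since_last


# Compare two hypothetical trigger sizes for the same volume of updates.
for trigger in (256 * MB, 1024 * MB):
    taken, pending = snapshot_stats(10_000 * MB, trigger)
    print(f"trigger={trigger // MB} MB: {taken} snapshots, "
          f"{pending // MB} MB of log since the last snapshot")
```

Under this model, a larger trigger size yields fewer snapshots for the same write volume, while the amount of log that can accumulate between snapshots grows up to the trigger size.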

A large snapshots_to_keep increases the number of snapshots available, but it also increases the amount of disk space needed to store the snapshots and logs.

snapshots_to_keep defaults to 2.

The datadir engine variable specifies the directory where the snapshots and logs are stored.

Replay Error Handling

This section lists errors that can occur when MemSQL processes the logs and the snapshots. It also discusses how MemSQL addresses the errors.

CRC32 Instruction not Supported

If your system hardware does not support the CRC32 instruction, you will see the following warning.

Warning: SSE4.2 is not supported. Resorting to software CRC32C. MemSQL recovery and log writing performance will be negatively impacted.

This warning is common on older processors and in some virtualized environments. In this case, MemSQL falls back to a software implementation of CRC32C, which slows down reading and writing log files. We recommend that production deployments of MemSQL run on environments that support this instruction.
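Before deploying, you may want to check whether a host advertises SSE4.2. The following is a minimal sketch that assumes a Linux x86 host, where `/proc/cpuinfo` lists CPU features on `flags` lines and SSE4.2 appears as `sse4_2`. The helper names are hypothetical and this is not a MemSQL tool.

```python
def supports_sse4_2(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists sse4_2."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and "sse4_2" in value.split():
            return True
    return False


def host_supports_sse4_2(path="/proc/cpuinfo"):
    """Check the local host; return None if the file cannot be read (e.g. not Linux)."""
    try:
        with open(path) as f:
            return supports_sse4_2(f.read())
    except OSError:
        return None


print(host_supports_sse4_2())
```

A result of `False` suggests the warning above would appear on that host. `None` means the check could not be performed.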

Data Corruption Found During Replay

During replay, if MemSQL encounters corrupted logs or snapshots, it puts the database in the unrecoverable state. Such a database usually auto-heals if the cluster runs in high availability (redundancy-2) mode and the corrupted logs or snapshots are on the replica partitions. During auto-healing, the primary host takes a snapshot and sends it to the replica. When auto-healing is complete, the secondary database resumes operation in the replicating state.
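As a rough illustration of how checksum-based corruption detection during replay can work, consider the sketch below. It makes several assumptions: the record layout (a payload paired with a checksum) is hypothetical, Python's `zlib.crc32` computes plain CRC-32 rather than the CRC32C named in the warning above, and the `"online"` state name is a placeholder. It does not describe MemSQL's actual log format or recovery code.

```python
import zlib


class CorruptLogError(Exception):
    """Raised when a record's stored checksum does not match its payload."""


def make_record(payload):
    # Hypothetical record layout: (payload bytes, CRC-32 of the payload).
    return payload, zlib.crc32(payload)


def replay(records):
    """Apply records in order; stop at the first checksum mismatch."""
    applied = []
    for index, (payload, checksum) in enumerate(records):
        if zlib.crc32(payload) != checksum:
            raise CorruptLogError(f"record {index} failed its checksum")
        applied.append(payload)
    return applied


def replay_database(records):
    """Return the resulting database state name for this toy model."""
    try:
        replay(records)
        return "online"          # placeholder name for a healthy state
    except CorruptLogError:
        return "unrecoverable"   # state named in the text above


good = [make_record(b"INSERT 1"), make_record(b"INSERT 2")]
damaged = [good[0], (b"INSERT X", good[1][1])]  # payload no longer matches checksum

print(replay_database(good))
print(replay_database(damaged))
```

In this toy model, a replica that ends up unrecoverable would need a fresh snapshot from the primary, which mirrors the auto-healing step described above.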