Command Reference

replicate [init | snapshot | full | fetch-schemas | infer-schemas] <additional_parameters>

The only parameter required in every mode is <src_conn_config_file>, specified with the path to the file.

For init, snapshot and full modes, the <src_conn_config_file> must be followed by <dst_conn_config_file>, including the path to this file.

For infer-schemas mode, <src_conn_config_file> must be followed by the keyword MEMSQL.

All other parameters are optional.

Modes

Operation Modes

Parameter
    Description

init <src_conn_config_file> <dst_conn_config_file> <additional_parameters>
    Replicate retrieves the existing source schemas and creates equivalent schemas on the destination.

snapshot <src_conn_config_file> <dst_conn_config_file> <additional_parameters>
    Replicate first creates the destination schemas. Once the schemas are created, all pre-existing data from the source is replicated to MemSQL. Any changes made to the data on the source (insert, update, or delete) after the replication has started are ignored. After the data is replicated, Replicate generates a summary file and exits. The summary file is written to ./data/default/snapshot_summary.txt if no instance name is specified, or to ./data/<instance_name>/snapshot_summary.txt if one is.

full <src_conn_config_file> <dst_conn_config_file> <additional_parameters>
    Replicate first creates the destination schemas. Once the schemas are created, all pre-existing data from the source is replicated to MemSQL. Data is then continually synchronized with minimal latency until Replicate is stopped via Ctrl-C (^C). Full mode is currently available only for an Oracle source, where change data capture (CDC) is used. The transition from snapshot migration to continuous replication is seamless, and Replicate guarantees "exactly-once" replication.

fetch-schemas <src_conn_config_file> <additional_parameters>
    Fetches and analyzes the contents of a source or MemSQL database and generates a schemas.yaml file in the current directory.

infer-schemas <src_conn_config_file> MEMSQL <additional_parameters>
    Infers schemas from an Oracle database and conforms them to a MemSQL database.
    Note: Inferring schemas is highly recommended before replicating data, as it allows the schemas involved to be examined and, if required, tailored to meet data replication requirements.
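The argument order for two of the modes above can be sketched as follows. The file paths are hypothetical placeholders, and the commands are printed rather than executed so they can be reviewed before running against a real source.

```shell
# Hypothetical paths; substitute your own connection configuration files.
SRC_CONFIG="conf/oracle_src.yaml"
DST_CONFIG="conf/memsql_dst.yaml"

# infer-schemas: the source config is followed by the keyword MEMSQL.
INFER_CMD="replicate infer-schemas $SRC_CONFIG MEMSQL"

# snapshot: the source config is followed by the destination config.
SNAPSHOT_CMD="replicate snapshot $SRC_CONFIG $DST_CONFIG"

# Print the commands for review instead of running them.
echo "$INFER_CMD"
echo "$SNAPSHOT_CMD"
```

Running the inferred-schema step first matches the recommendation in the note above; the snapshot command is then issued once the schemas look right.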

Write Modes

The following options apply to an individual table only when a conflict is encountered for that table. A conflict occurs when a table on the destination database has the same name as a table being replicated from the source database. If a conflict is encountered and none of these options is specified, Replicate exits with an error to preserve the pre-existing destination data. Only one of these options can be specified at a time.

Parameter
    Description

--append-existing
    Data from the source is appended to existing data in MemSQL. The existing schemas must be compatible with the data being inserted, and the user is responsible for resolving any primary key conflicts with the existing data.

--replace-existing
    Tables from the source replace conflicting tables in MemSQL. By default, the existing data in MemSQL is deleted when Replicate initializes. When --lazy-init is specified, the existing data is instead replaced at the last possible moment.

--truncate-existing
    Data from the source replaces the data in conflicting tables in MemSQL, while the MemSQL schema is retained. The existing schemas must be compatible with the data being replicated. If the destination table has extra columns, the replication succeeds as long as those columns are auto-generated or nullable. By default, the existing data in MemSQL is deleted when Replicate initializes. When --lazy-init is specified, the existing data is instead replaced at the last possible moment.
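Because only one write mode may be given at a time, a wrapper script could check the argument list before invoking Replicate. The function below is a hypothetical pre-flight check written for illustration, not part of Replicate itself; it simply counts how many of the three write-mode options appear.

```shell
# Hypothetical helper: count the write-mode options in an argument list.
count_write_modes() {
    count=0
    for arg in "$@"; do
        case "$arg" in
            --append-existing|--replace-existing|--truncate-existing)
                count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# One write mode: acceptable (file names are placeholders).
count_write_modes snapshot src.yaml dst.yaml --truncate-existing   # prints 1

# Two write modes: a wrapper should refuse to continue.
count_write_modes snapshot src.yaml dst.yaml --append-existing --replace-existing   # prints 2
```

A wrapper would proceed only when the count is 0 or 1. With 0, Replicate itself exits with an error if a table conflict is found, as described above.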

Configuration Files

Parameter
    Description

<src_conn_config_file>
    The YAML connection configuration file for the source, including the path. Required in all modes. Must be specified before the destination configuration file.

<dst_conn_config_file>
    The YAML connection configuration file for the destination, including the path. Required for init, snapshot, and full modes. Must be specified after the source configuration file.

--extractor <extractor_config_file>
    The YAML extractor configuration file, which can be used to fine-tune Replicate's behavior when retrieving data from a specified source. An extractor configuration file is typically optional; when it is not provided, default values are used. It may be required in specific cases.

--applier <applier_config_file>
    The YAML applier configuration file, which can be used to fine-tune Replicate's behavior when applying changes to MemSQL. An applier configuration file is typically optional; when it is not provided, default values are used. It may be required in specific cases.
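As an illustration of the ordering rules above, the following sketch assembles a snapshot command with both tuning files. All four file paths are hypothetical, and the command is printed rather than executed.

```shell
# Hypothetical paths; none of these file names come from Replicate itself.
SRC_CONFIG="conf/oracle_src.yaml"
DST_CONFIG="conf/memsql_dst.yaml"
EXTRACTOR_CONFIG="conf/extractor.yaml"
APPLIER_CONFIG="conf/applier.yaml"

# The source config comes first and the destination config second;
# the optional tuning files follow as named options.
CMD="replicate snapshot $SRC_CONFIG $DST_CONFIG --extractor $EXTRACTOR_CONFIG --applier $APPLIER_CONFIG"
echo "$CMD"
```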

Filters

Parameter
    Description

--filter <filter_file>
    The YAML file containing the filter that defines which data is replicated to MemSQL.

Maps

Parameter
    Description

--map <mapper_file>
    The YAML file containing the map that defines how data is replicated to MemSQL.
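A filter and a map can be supplied together. The sketch below assumes hypothetical file paths and only shows where the two options sit on the command line; the contents of the YAML files are not covered here.

```shell
# Hypothetical paths for the connection, filter, and mapper files.
SRC_CONFIG="conf/oracle_src.yaml"
DST_CONFIG="conf/memsql_dst.yaml"
FILTER_FILE="conf/filter.yaml"
MAPPER_FILE="conf/mapper.yaml"

# --filter selects which data is replicated; --map defines how it is replicated.
CMD="replicate snapshot $SRC_CONFIG $DST_CONFIG --filter $FILTER_FILE --map $MAPPER_FILE"
echo "$CMD"
```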

Schemas

Parameter
    Description

--src-schemas <schemas_file>
    The YAML file containing the schemas that define how source data is read and interpreted. Replicate fetches schemas from the source even if a schemas file is specified, but the definitions in schemas_file take precedence: for tables listed in schemas_file, the user-provided definitions are used; for all other tables, the definitions retrieved from the source are used.

--dst-schemas <schemas_file>
    The YAML file containing the schemas that define how data is replicated to MemSQL. The specified schemas_file is used as a template for the MemSQL database structure. Definitions provided using --dst-schemas take precedence over the schemas inferred from the source. For tables not included in schemas_file, the schemas are inferred from the source.
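One possible workflow that combines fetch-schemas with --src-schemas is sketched below. It assumes, without confirmation from this reference, that the schemas.yaml file generated by fetch-schemas can be edited and passed back as the schemas file; the connection file paths are hypothetical.

```shell
# Hypothetical connection configuration files.
SRC_CONFIG="conf/oracle_src.yaml"
DST_CONFIG="conf/memsql_dst.yaml"

# Step 1: fetch-schemas writes schemas.yaml to the current directory.
FETCH_CMD="replicate fetch-schemas $SRC_CONFIG"

# Step 2 (manual): edit schemas.yaml for the tables that need overrides.

# Step 3: pass the edited file back (assumption: the generated file is
# accepted by --src-schemas). Tables it lists use the user-provided
# definitions; other tables use definitions retrieved from the source.
SNAPSHOT_CMD="replicate snapshot $SRC_CONFIG $DST_CONFIG --src-schemas schemas.yaml"

echo "$FETCH_CMD"
echo "$SNAPSHOT_CMD"
```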

Other

Parameter
    Description

--resume
    Resumes real-time replication after Replicate, running in full mode, has stopped either because of an error or because the user pressed Ctrl-C (^C).

--id <instance_name>
    Differentiates between multiple Replicate instances, where instance_name is a user-defined string. If instance_name is not provided, default is used.

    Instance names are useful when running multiple replications. For example, you can run one replication (A), stop it, run another replication (B), stop it, and then resume the first replication (A). Without an instance_name, the context needed to resume replication A is overwritten as soon as replication B is started. If each instance of Replicate has a different instance_name, their corresponding namespaces (folders) are used to preserve the replication context.

    If successive replications are run with the same instance_name and without the --resume option, the --overwrite option must be used to overwrite the existing replication context of the specified instance name.

    Note: While multiple instances of Replicate can run simultaneously on the same machine, a single instance can potentially use all machine resources.
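The A/B scenario described for --id can be sketched as three commands. The instance names repl_a and repl_b and all file paths are hypothetical, and the placement of --id and --resume after the configuration files is an assumption; the commands are printed rather than executed.

```shell
# Hypothetical connection configuration files.
SRC_A="conf/oracle_a.yaml"
SRC_B="conf/oracle_b.yaml"
DST="conf/memsql_dst.yaml"

# 1. Start replication A under its own instance name, then stop it with Ctrl-C.
START_A="replicate full $SRC_A $DST --id repl_a"

# 2. Start replication B under a different name, so A's context is preserved.
START_B="replicate full $SRC_B $DST --id repl_b"

# 3. Resume replication A from its preserved context.
RESUME_A="replicate full $SRC_A $DST --id repl_a --resume"

printf '%s\n' "$START_A" "$START_B" "$RESUME_A"
```

Had both runs omitted --id, they would have shared the default instance name, and starting B would have overwritten the context needed for step 3.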