Reference Tables

Reference tables are relatively small tables that do not need to be distributed; instead, a full copy is present on every node in the cluster. They are created and written to on the master aggregator, and are implemented via primary-secondary replication from the master aggregator to every node in the cluster. Replication makes reference tables dynamic: updates you make to a reference table on the master aggregator are quickly reflected on every machine in the cluster.
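The replication model described above can be pictured with a small toy simulation. This is only an illustrative sketch in Python, not MemSQL code: the `Node` and `MasterAggregator` classes and the `write` method are hypothetical names, and replication is modeled as an immediate in-process copy, which simplifies whatever the real replication protocol does.

```python
class Node:
    """Hypothetical cluster node holding a full local copy of a reference table."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # primary key -> row


class MasterAggregator(Node):
    """Hypothetical master aggregator: the only place writes are accepted."""

    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = replicas  # every other node in the cluster

    def write(self, key, row):
        # Apply the write locally first (the primary copy) ...
        self.rows[key] = row
        # ... then push it to every secondary copy. Real replication would
        # go over the network; here it is a simple in-memory copy.
        for replica in self.replicas:
            replica.rows[key] = dict(row)


leaves = [Node(f"leaf-{i}") for i in range(3)]
master = MasterAggregator("master", leaves)

master.write(1, {"name": "Acme"})
master.write(1, {"name": "Acme Corp"})  # an update to an existing row

for leaf in leaves:
    print(leaf.name, leaf.rows)
```

After both writes, every leaf holds the same rows as the master aggregator, including the updated value for key 1.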

MemSQL aggregators can take advantage of reference tables’ ubiquity by pushing joins between reference tables and a distributed table onto the leaves. Imagine you have a distributed clicks table storing billions of records and a smaller customers table with just a few million records. Since the customers table is small, it can be replicated on every node in the cluster. If you run a join between the clicks table and the customers table, the bulk of the work for the join occurs on the leaves: each leaf can join its local portion of clicks against its own copy of customers.
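To make the clicks and customers example concrete, here is a minimal Python sketch of a join pushed down to the leaves. It is a toy model under stated assumptions, not a description of MemSQL's query engine: the sample data, the three-shard layout, and the `leaf_join` and `aggregate` helpers are all hypothetical.

```python
from collections import Counter

# Reference table (hypothetical data): customer_id -> customer name.
# Every leaf is assumed to hold its own full copy.
customers = {1: "Acme", 2: "Globex"}

# Distributed clicks table (hypothetical data): each leaf holds one shard
# of (click_id, customer_id) rows.
leaf_shards = [
    [(100, 1), (101, 2)],
    [(102, 1)],
    [(103, 2), (104, 2)],
]


def leaf_join(shard, customers_copy):
    """Runs on one leaf: inner-join local clicks with the local customers
    copy and count clicks per customer name."""
    counts = Counter()
    for _click_id, customer_id in shard:
        name = customers_copy.get(customer_id)
        if name is not None:  # inner join: skip clicks with no matching customer
            counts[name] += 1
    return counts


def aggregate(partials):
    """Runs on the aggregator: merge the small per-leaf partial results."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Each leaf joins against its own copy, so no customers rows move between nodes.
partials = [leaf_join(shard, dict(customers)) for shard in leaf_shards]
result = aggregate(partials)
print(dict(result))
```

In this sketch the aggregator only merges a handful of per-leaf counts, while the row-by-row join work stays on the leaves next to the clicks data.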


Reference tables are a convenient way to implement dimension tables.