Persistent Computed Columns

A computed column is a column defined by an expression that uses other columns in the table. MemSQL allows users to create persistent computed columns defined by an expression that combines other columns, constants, built-in functions, and operators. MemSQL’s computed columns are fully materialized and can be indexed like a standard column.

There are storage and performance tradeoffs to consider when using computed columns. At a high level, computed columns are a way to optimize computationally expensive read queries that use built-in functions or require additional data processing. They allow users to precompute values that would otherwise have to be computed as part of execution of a read query. However, computed columns also consume additional storage and require more computation on writes. Some example use cases for computed columns include:

Computed columns can be created as part of a CREATE TABLE statement, or can be added later using ALTER TABLE.

Computed Column Examples

The following statement creates a table t with an INT column a and a persistent computed column b that is defined using an expression that takes the value in a and adds 1 to it.

memsql> CREATE TABLE t (a INT PRIMARY KEY, b AS a + 1 PERSISTED INT);

Inserting values into column a will automatically populate column b using the column definition expression.

    memsql> INSERT INTO t (a) VALUES (1), (2);

    memsql> SELECT * FROM t;
    +---+------+
    | a | b    |
    +---+------+
    | 2 |    3 |
    | 1 |    2 |
    +---+------+

It is also possible to add a computed column to an existing table using ALTER TABLE.

    memsql> ALTER TABLE t ADD COLUMN c AS a * 2 - b PERSISTED INT;

    memsql> SELECT * FROM t;
    +---+------+------+
    | a | b    | c    |
    +---+------+------+
    | 2 |    3 |    1 |
    | 1 |    2 |    0 |
    +---+------+------+

Attempts to insert values directly into a computed column will result in an error, even if the values are “correct” with respect to the column definition.

    memsql> INSERT INTO t (a, b, c) VALUES (3, 4, 2);
    ERROR 1822 (HY000): Cannot insert into computed column 'b'

A computed column can be created using any expression that outputs a value that is a valid MemSQL data type, composed of operators and built-in functions. For example, computed columns allow you to extract values from a JSON blob so they can be scanned and read more quickly. However, note that aggregations, user-defined functions, and non-deterministic functions such as rand() and now() cannot be used in computed columns.

   memsql> SELECT * FROM j;
   +---+-------------------------------------------------+
   | a | b                                               |
   +---+-------------------------------------------------+
   | 1 | {"name":"Sandy Beach","occupation":"lifeguard"} |
   +---+-------------------------------------------------+

   memsql> ALTER TABLE j ADD COLUMN occupation AS b::$occupation PERSISTED TEXT;

   memsql> SELECT  * FROM j;
   +---+-------------------------------------------------+------------+
   | a | b                                               | occupation |
   +---+-------------------------------------------------+------------+
   | 1 | {"name":"Sandy Beach","occupation":"lifeguard"} | lifeguard  |
   +---+-------------------------------------------------+------------+

Modifying values in a column used to define a computed column will also update the affected values in the computed column.

    memsql> UPDATE j SET b::$occupation = 'circus clown';

    memsql> SELECT * FROM j;
    +---+----------------------------------------------------+--------------+
    | a | b                                                  | occupation   |
    +---+----------------------------------------------------+--------------+
    | 1 | {"name":"Sandy Beach","occupation":"circus clown"} | circus clown |
    +---+----------------------------------------------------+--------------+
Was this article useful?