CUBE and ROLLUP

The CUBE and ROLLUP operations let you group by different subsets of a specified set of columns in a SELECT clause. CUBE(column_list) groups by all subsets of the columns specified in column_list. For example, CUBE(a, b) groups by (a), (b), (a, b) and () (which represents the entire input set of rows). ROLLUP(column_list) groups by all subsets, inclucing the entire set of input rows, (), then the leftmost column, the two leftmost columns, and so on. For example, ROLLUP(a, b) groups by (), (a), and (a, b).

Syntax

Refer to the definition of the SELECT clause for where extended_grouping_expr can be used.

extended_grouping_expr ::= 
    CUBE ( col_name | expr | position}, ...)
  | ROLLUP ( col_name | expr | position}, ... )

Remarks

Output rows for CUBE and ROLLUP contain a NULL value in the position of a grouping column if that row represents an aggregate result that does not group by that column.

Examples

The following table contains some hypothetical sales quantities for products sold by state.

CREATE TABLE sales(state VARCHAR(30), product_id INT, quantity INT);
INSERT sales VALUES
  ("Oregon", 1, 10), ("Washington", 1, 15), ("California", 1, 40),
  ("Oregon", 2, 15), ("Washington", 2, 25), ("California", 2, 70);

This query uses CUBE to group by all subsets of columns in (state, product_id):

SELECT state, product_id, SUM(quantity)
FROM sales
GROUP BY CUBE(state, product_id)
ORDER BY state, product_id;

+------------+------------+---------------+
| state      | product_id | SUM(quantity) |
+------------+------------+---------------+
| NULL       |       NULL |           175 |
| NULL       |          1 |            65 |
| NULL       |          2 |           110 |
| California |       NULL |           110 |
| California |          1 |            40 |
| California |          2 |            70 |
| Oregon     |       NULL |            25 |
| Oregon     |          1 |            10 |
| Oregon     |          2 |            15 |
| Washington |       NULL |            40 |
| Washington |          1 |            15 |
| Washington |          2 |            25 |
+------------+------------+---------------+

The first row above represents the grand total. The second row represents the total for product 1 across all states. The fifth row represents the total for product 1 in California.

This query is the same as the one above but uses ROLLUP instead of CUBE:

SELECT state, product_id, SUM(quantity)
FROM sales
GROUP BY ROLLUP(state, product_id)
ORDER BY state, product_id;

+------------+------------+---------------+
| state      | product_id | SUM(quantity) |
+------------+------------+---------------+
| NULL       |       NULL |           175 |
| California |       NULL |           110 |
| California |          1 |            40 |
| California |          2 |            70 |
| Oregon     |       NULL |            25 |
| Oregon     |          1 |            10 |
| Oregon     |          2 |            15 |
| Washington |       NULL |            40 |
| Washington |          1 |            15 |
| Washington |          2 |            25 |
+------------+------------+---------------+

Notice that the results are the same as for CUBE except that there is no total for product_id 1 and product_id 2 across all states.

GROUPING() Function

The GROUPING(column_name) can be used in the output column list of a SELECT block to show whether an output row of a query with a GROUP BY clause has aggregated together all the input rows for different values of column_name. The output of GROUPING(column_name) is 1 if the result row has grouped together rows for column_name and 0 otherwise. For example:

SELECT state, product_id, SUM(quantity), GROUPING(state), GROUPING(product_id)
FROM sales
GROUP BY CUBE(state, product_id)
ORDER BY state, product_id;

+------------+------------+---------------+-----------------+----------------------+
| state      | product_id | SUM(quantity) | GROUPING(state) | GROUPING(product_id) |
+------------+------------+---------------+-----------------+----------------------+
| NULL       |       NULL |           175 |               1 |                    1 |
| NULL       |          1 |            65 |               1 |                    0 |
| NULL       |          2 |           110 |               1 |                    0 |
| California |       NULL |           110 |               0 |                    1 |
| California |          1 |            40 |               0 |                    0 |
| California |          2 |            70 |               0 |                    0 |
| Oregon     |       NULL |            25 |               0 |                    1 |
| Oregon     |          1 |            10 |               0 |                    0 |
| Oregon     |          2 |            15 |               0 |                    0 |
| Washington |       NULL |            40 |               0 |                    1 |
| Washington |          1 |            15 |               0 |                    0 |
| Washington |          2 |            25 |               0 |                    0 |
+------------+------------+---------------+-----------------+----------------------+

The first row of output has GROUPING(state) = 1 and GROUPING(product_id) = 1 because the row represents the grand total, meaning that rows for all state and product_id values have been combined together to form this row. The second row Has GROUPING(state) = 1 but GROUPING(product_id) = 0 because it represents a summary accross all state values for a specific product_id.

Was this article useful?