PERCENTILE_CONT and Median

In MemSQL, percentile functions are only available as window functions. There are two percentile functions: `PERCENTILE_CONT` and PERCENTILE_DISC. `PERCENTILE_CONT` is an inverse distribution function. It assumes a continuous distribution between values of the expression in the sort specification. Then, it interpolates the value of that expression at the given percentile, performing a linear interpolation. For an overview of window functions in MemSQL, see Window Functions Guide.

Syntax

``````PERCENTILE_CONT(percentile)
WITHIN GROUP (ORDER BY (col | expr))
OVER (
[PARTITION BY (col | expr), ...]
[frame_clause]
)

or equivalently

PERCENTILE_CONT(percentile)
OVER (
ORDER BY (col | expr)
[PARTITION BY {col | expr}, ...]
[frame_clause]
)
``````

Arguments

percentile

A numeric value between 0 and 1.

col

A column of a numeric data type.

expr

An expression that evaluates to a numeric data type.

Remarks

The `OVER()` clause defines how to partition the input set. If no `PARTITION BY` clause is specified, the entire input set is treated as a single partition.

Only one column or expression is allowed in the `ORDER BY` clause.

Example

The following example creates a table containing SAT scores. The resulting window frame shows the hypothetical 90th percentile SAT score given the existing scores.

``````memsql> CREATE TABLE SAT (math INT, reading INT, writing INT);

memsql> INSERT INTO SAT VALUES (800,800,800),(650,300,500),(700,500,520);

-> math + reading + writing AS total,
-> PERCENTILE_CONT(.9)
-> WITHIN GROUP (ORDER BY math + reading + writing)
-> OVER () AS `90th Percentile` FROM SAT;
+------+---------+---------+-------+-----------------+
| math | reading | writing | total | 90th Percentile |
+------+---------+---------+-------+-----------------+
|  650 |     300 |     500 |  1450 |       2264.0000 |
|  700 |     500 |     520 |  1720 |       2264.0000 |
|  800 |     800 |     800 |  2400 |       2264.0000 |
+------+---------+---------+-------+-----------------+
``````

Median

Compute the median using percentile = 0.5.

``````PERCENTILE_CONT( 0.5 ) ...

Example:
memsql> create table t (a int);
memsql> insert into t values (1),(2),(3),(166),(1024);

memsql> select a, avg(a) over () as mean, percentile_cont(.5) within group (order by a) over () as median from t;
+------+----------+--------+
| a    | mean     | median |
+------+----------+--------+
|    1 | 239.2000 | 3.0000 |
|    2 | 239.2000 | 3.0000 |
|    3 | 239.2000 | 3.0000 |
|  166 | 239.2000 | 3.0000 |
| 1024 | 239.2000 | 3.0000 |
+------+----------+--------+
``````