ColumnStore Distributed Aggregate Functions

MariaDB ColumnStore supports the following aggregate functions, these can be used in the SELECT, HAVING, and ORDER BY clauses of the SQL statement.

Function Description
AVG([DISTINCT] column) Average value of a numeric (INT variations, NUMERIC, DECIMAL) column
CORR(ColumnY, ColumnX) The correlation coefficient for non-null pairs in a group.
COUNT (*, [DISTINCT] column) The number of rows returned by a query or grouping. All datatypes are supported
COVAR_POP(ColumnY, ColumnX) The population covariance for non-null pairs in a group.
COVAR_SAMP(ColumnY, ColumnX) The sample covariance for non-null pairs in a group.
MAX ([DISTINCT] column) The maximum value of a column. All datatypes are supported.
MIN ([DISTINCT] column) The maximum value of a column. All datatypes are supported.
REGR_AVGX(ColumnY, ColumnX) Average of the independent variable (sum(ColumnX)/N), where N is number of rows processed by the query
REGR_AVGY(ColumnY, ColumnX) Average of the dependent variable (sum(ColumnY)/N), where N is number of rows processed by the query
REGR_COUNT(ColumnY, ColumnX) The total number of input rows in which both column Y and column X are nonnull
REGR_INTERCEPT(ColumnY, ColumnX) The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs
REGR_R2(ColumnY, ColumnX) Square of the correlation coefficient. correlation coefficient is the regr_intercept(ColumnY, ColumnX) for linear model
REGR_SLOPE(ColumnY, ColumnX) The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs
REGR_SXX(ColumnY, ColumnX) REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.
REGR_SXY(ColumnY, ColumnX) REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs.
REGR_SYY(ColumnY, ColumnX) REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs.
STD(), STDDEV(), STDDEV_POP() The population standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column
STDDEV_SAMP() The sample standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column
SUM([DISTINCT] column) The sum of a numeric (INT variations, NUMERIC, DECIMAL) column
VARIANCE(), VAR_POP() The population standard variance of a numeric (INT variations, NUMERIC, DECIMAL) column
VAR_SAMP() The population standard variance of a numeric (INT variations, NUMERIC, DECIMAL) column

Note

  • Regression functions (REGR_AVGX to REGR_YY), CORR, COVAR_POP and COVAR_SAMP are supported for version 1.2.0 and higher

Example

An example group by query using aggregate functions is:

select year(o_orderdate) order_year, 
avg(o_totalprice) avg_totalprice, 
max(o_totalprice) max_totalprice, 
count(*) order_count 
from orders 
group by order_year 
order by order_year;
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.

© 2021 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/columnstore-distributed-aggregate-functions/