any_value
Obtains an arbitrary row from each aggregated group. You can use this function to optimize a query that has a GROUP BY clause.
approx_count_distinct
Returns an approximate distinct count, similar to the result of COUNT(DISTINCT col).
approx_top_k
Returns the top k most frequently occurring item values in an expr along with their approximate counts.
avg
Returns the average value of selected fields.
bitmap
A simple example that illustrates the usage of several Bitmap aggregate functions. For detailed function definitions or more Bitmap functions, see bitmap-functions.
corr
Returns the Pearson correlation coefficient between two expressions. This function is supported from v2.5.10. It can also be used as a window function.
count
Returns the total number of rows specified by an expression.
count_if
Returns the number of records that meet the specified condition or 0 if no records satisfy the condition.
covar_pop
Returns the population covariance of two expressions. This function is supported from v2.5.10. It can also be used as a window function.
covar_samp
Returns the sample covariance of two expressions. This function is supported from v2.5.10. It can also be used as a window function.
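As a rough illustration of how the corr, covar_pop, and covar_samp entries above relate, the following Python sketch computes the three statistics with their textbook formulas. It is not StarRocks code: the helper names only mirror the function names, and details such as NULL handling or the result for degenerate inputs are assumptions made for this example.

```python
import math


def covar_pop(xs, ys):
    """Population covariance: mean of the products of deviations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        return None  # assumption: no result for empty or mismatched input
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def covar_samp(xs, ys):
    """Sample covariance: same sum of products, divided by n - 1."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None  # assumption: undefined for fewer than two pairs
    return covar_pop(xs, ys) * n / (n - 1)


def corr(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    cov = covar_pop(xs, ys)
    if cov is None:
        return None
    denom = math.sqrt(covar_pop(xs, xs) * covar_pop(ys, ys))
    return cov / denom if denom else None
```

For the perfectly linear pairs xs = [1, 2, 3, 4] and ys = [2, 4, 6, 8], this sketch gives a population covariance of 2.5, a sample covariance of 10/3, and a correlation of 1.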
ds_hll_count_distinct
Returns an approximate distinct count, similar to the result of COUNT(DISTINCT col). APPROX_COUNT_DISTINCT(expr) is a similar function.
group_concat
Concatenates non-null values from a group into a single string, separated by the sep argument, which defaults to a comma (,) if not specified. This function can be used to concatenate values from multiple rows of a column into one string.
grouping
Indicates whether a column is an aggregate column. If it is an aggregate column, 0 is returned. Otherwise, 1 is returned.
grouping_id
grouping_id is used to distinguish the grouping statistics results of the same grouping standard.
hll_raw_agg
An aggregate function that aggregates HLL fields. It returns an HLL value.
hll_union
Returns the concatenation of a set of HLL values.
hll_union_agg
HLL is an engineering implementation based on the HyperLogLog algorithm, which is used to save the intermediate results of the HyperLogLog calculation process.
mann_whitney_u_test
Description
max
Returns the maximum value of the expr expression.
max_by
Returns the value of x associated with the maximum value of y.
min
Returns the minimum value of the expr expression.
min_by
Returns the value of x associated with the minimum value of y.
multi_distinct_count
Returns the number of distinct values of expr, equivalent to count(distinct expr).
multi_distinct_sum
Returns the sum of distinct values in expr, equivalent to sum(distinct expr).
percentile_approx
Returns the approximation of the pth percentile, where the value of p is between 0 and 1.
percentile_cont
Computes the percentile value of expr with linear interpolation.
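To make "linear interpolation" concrete, here is a minimal Python sketch of the common continuous-percentile definition: locate the fractional position p * (n - 1) in the sorted data and interpolate between its two neighbours. The helper name mirrors the function name for readability only. Whether StarRocks uses exactly this position formula, and how it treats NULLs, are assumptions here.

```python
def percentile_cont(values, p):
    """Illustrative percentile with linear interpolation, p in [0, 1]."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    # Assumption: NULLs (None) are ignored, as aggregates usually do.
    data = sorted(v for v in values if v is not None)
    if not data:
        return None
    pos = p * (len(data) - 1)          # fractional index into sorted data
    lower = int(pos)                   # neighbour at or below the position
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower                 # how far we are towards the upper one
    return data[lower] + (data[upper] - data[lower]) * frac
```

With the values 1, 2, 3, 4 and p = 0.5, the position is 1.5, so the sketch returns the midpoint of 2 and 3, which is 2.5.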
percentile_disc
Returns a percentile value based on a discrete distribution of the input column expr. If the exact percentile value cannot be found, this function returns the larger value between the two closest values.
percentile_disc_lc
Returns a percentile value based on a discrete distribution of the input column expr. It behaves the same as percentile_disc, but the implementation algorithm is different. percentile_disc needs to obtain all input data, and the memory consumed by merge sorting to obtain percentile values is the memory of all input data. percentile_disc_lc, on the other hand, builds a hash table of key->count, so when the input cardinality is low, there is no obvious memory increase even if the input data size is large.
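The key->count idea can be sketched in a few lines of Python. This is an illustration of the technique only, not the engine's implementation: the function name is hypothetical, and the position rule ceil(p * (n - 1)) is an assumption chosen to match the "larger of the two closest values" behaviour described for percentile_disc.

```python
import math
from collections import Counter


def percentile_disc_from_counts(values, p):
    """Discrete percentile computed from a key -> count table."""
    # Memory grows with the number of distinct keys, not the number of rows.
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    if total == 0:
        return None
    # Assumed rule: 0-based index into the (virtual) sorted input.
    index = math.ceil(p * (total - 1))
    seen = 0
    for key in sorted(counts):         # only distinct keys are sorted
        seen += counts[key]
        if seen > index:               # this key covers the target index
            return key
    return None
```

For low-cardinality input such as one thousand 5s followed by one thousand 7s, the table holds only two entries, yet the result matches what indexing into the fully sorted input would give.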
retention
Calculates the user retention rate within a specified period of time. This function accepts 1 to 31 conditions and evaluates whether each condition is true. If the condition evaluates to true, 1 is returned. Otherwise, 0 is returned. It eventually returns an array of 0 and 1. You can calculate the user retention rate based on this data.
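The last step, turning the arrays of 0 and 1 into retention rates, might look like the Python sketch below. It assumes one array per user, with the first element marking the base condition (for example, activity on day 1). The function name and that convention are hypothetical and not part of the retention function itself.

```python
def retention_rates(per_user_flags):
    """Share of base users that also satisfy each later condition."""
    # Only users who satisfied the first (base) condition are counted.
    base_users = [flags for flags in per_user_flags if flags and flags[0] == 1]
    if not base_users:
        return []
    width = len(base_users[0])
    return [
        sum(flags[i] for flags in base_users) / len(base_users)
        for i in range(width)
    ]
```

Given the arrays [1, 1, 0], [1, 0, 0], [1, 1, 1], and [0, 0, 0], three users form the base, two of them satisfy the second condition, and one satisfies the third, so the rates are 1, 2/3, and 1/3.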
std
Returns the standard deviation of an expression. Since v2.5.10, this function can also be used as a window function.
stddev,stddev_pop,std
Returns the population standard deviation of the expr expression. Since v2.5.10, this function can also be used as a window function.
stddev_samp
Returns the sample standard deviation of an expression. Since v2.5.10, this function can also be used as a window function.
sum
Returns the sum of non-null values for expr. You can use the DISTINCT keyword to compute the sum of distinct non-null values.
var_samp,variance_samp
Returns the sample variance of an expression. Since v2.5.10, this function can also be used as a window function.
variance,var_pop,variance_pop
Returns the population variance of an expression. Since v2.5.10, this function can also be used as a window function.
window_funnel
Searches for an event chain in a sliding window and calculates the maximum number of consecutive events in the event chain. This function is commonly used for analyzing conversion rates. It is supported from v2.3.
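The general funnel idea can be sketched in Python as follows. This is a simplified model, not the StarRocks implementation: the argument layout (a list of (timestamp, event_name) pairs plus an ordered chain of names), the inclusive window boundary, and the absence of any matching modes are all assumptions made for illustration.

```python
def window_funnel(window, events, chain):
    """Longest prefix of `chain` reached within `window` of a first step."""
    if not chain:
        return 0
    ordered = sorted(events)           # process events in time order
    best = 0
    for i, (start, name) in enumerate(ordered):
        if name != chain[0]:
            continue                   # a window opens only at the first step
        level = 1
        for ts, event in ordered[i + 1:]:
            if ts - start > window:    # assumed: boundary itself is included
                break
            if level < len(chain) and event == chain[level]:
                level += 1             # next step of the chain was reached
        best = max(best, level)
    return best
```

For a user who views at time 0, adds to cart at time 10, and pays at time 50, the chain view, cart, pay reaches 2 steps with a window of 30 and all 3 steps with a window of 60.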