# Coefficient of Variation Function in PostgreSQL

By Susam Pal on 13 May 2010

Today I learnt how to create an aggregate function in PostgreSQL function that wraps the functionality provided by other aggregate functions in PosgreSQL. In this experiment, I created a `cv()` function that calculates the coefficient of variation. The function `cv(x)` is equivalent to `stddev(x)` / `avg(x)` where `x` represents the list of data points.

## Example Data Table

```\$ cat perf.sql
CREATE TABLE performance
(
name VARCHAR,
duration DOUBLE PRECISION
);

INSERT INTO performance VALUES ('RAND', 101.0);
INSERT INTO performance VALUES ('ZERO', 157.0);
INSERT INTO performance VALUES ('NONE', 209.0);
INSERT INTO performance VALUES ('TEST', 176.0);
INSERT INTO performance VALUES ('UNIT', 197.0);
INSERT INTO performance VALUES ('LOAD', 193.0);
INSERT INTO performance VALUES ('FREE', 198.0);
\$ psql statistics
psql (8.4.3)
Type "help" for help.
statistics=# \i perf.sql
DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
statistics=# select * from performance;
name | duration
------+----------
RAND |      101
ZERO |      157
NONE |      209
TEST |      176
UNIT |      197
FREE |      198
(7 rows)

statistics=#
```

## Useful Details to Create Our Function

```statistics=# SELECT aggtransfn, aggfinalfn, aggtranstype::regtype, agginitval
statistics-# FROM pg_aggregate
statistics-# WHERE aggfnoid='stddev(double precision)'::regprocedure;
aggtransfn  |     aggfinalfn     |    aggtranstype    | agginitval
--------------+--------------------+--------------------+------------
float8_accum | float8_stddev_samp | double precision[] | {0,0,0}
(1 row)

statistics=# SELECT aggtransfn, aggfinalfn, aggtranstype::regtype, agginitval
statistics-# FROM pg_aggregate
statistics-# WHERE aggfnoid='avg(double precision)'::regprocedure;
aggtransfn  | aggfinalfn |    aggtranstype    | agginitval
--------------+------------+--------------------+------------
float8_accum | float8_avg | double precision[] | {0,0,0}
(1 row)

statistics=#
```

## Function Definition

```\$ cat cv.sql
CREATE OR REPLACE FUNCTION finalcv(double precision[])
RETURNS double precision
AS \$\$
SELECT float8_stddev_samp(\$1) / float8_avg(\$1);
\$\$ LANGUAGE SQL;

CREATE AGGREGATE cv(double precision)
(
sfunc = float8_accum,
stype = double precision[],
finalfunc = finalcv,
initcond = '{0, 0, 0}'
);
```

## Usage

```\$ psql statistics
psql (8.4.3)
Type "help" for help.

statistics=# select stddev(duration), avg(duration) from performance;
stddev      |       avg
------------------+------------------
37.1682147873178 | 175.857142857143
(1 row)

statistics=# select stddev(duration) / avg(duration) as cv from performance;
cv
-------------------
0.211354592616754
(1 row)

statistics=# \i cv.sql
CREATE FUNCTION
CREATE AGGREGATE
statistics=# select cv(duration) from performance;
cv
-------------------
0.211354592616754
(1 row)

statistics=#
```

## Bessel's Correction

Checked whether Bessel's correction was used in the `stddev()` function of PostgreSQL. Yes, it was used.

```\$ octave -q
octave:1> std([101, 157, 209, 176, 197, 193, 198], 0)
ans =  37.168
octave:2> std([101, 157, 209, 176, 197, 193, 198], 1)
ans =  34.411
octave:3>
```

The `std()` function in MATLAB and GNU Octave applies Bessel's correction when invoked with the second argument as `0`.