PostgreSQL
Custom Aggregates in PostgreSQL
Given a bank account of debit and credit transactions, what is the greatest balance the account ever had? Given a hotel with arrivals and departures, what is the greatest number of guests the hotel ever had? Both of these are examples of finding the greatest running total. Finding the greatest running total is a great exercise with which to explore some of the lesser known features of PostgreSQL such as window functions, custom aggregates, and C functions.
The Setup
All code can be found on Github.
For this example, we will use a simple data schema that only contains an amount
column and id
column to provide ordering.
create table entries(
id serial primary key,
amount float8 not null
);
We will use random()
and generate_series()
to insert 1,000,000 rows of test data. By calling setseed()
before calling random()
we can ensure that this code always produces the same data.
select setseed(0);
insert into entries(amount)
select (2000 * random()) - 1000
from generate_series(1, 1000000);
Running Total
To find the greatest running total, we first have to find the running total for every row. This can be easily done with a window function.
select
id,
amount,
sum(amount) over (order by id asc) as running_total
from entries
order by id asc;
id | amount | running_total
---------+-----------------------+--------------------
1 | -462.016298435628 | -462.016298435628
2 | 162.440904416144 | -299.575394019485
3 | -820.292402990162 | -1119.86779700965
4 | -866.230697371066 | -1986.09849438071
5 | -495.30001822859 | -2481.3985126093
6 | 772.393747232854 | -1709.00476537645
7 | -323.866365477443 | -2032.87113085389
8 | -856.917716562748 | -2889.78884741664
9 | 285.323366522789 | -2604.46548089385
10 | -867.916810326278 | -3472.38229122013
-- snip --
The expression sum(amount) over (order by id asc)
can be read as sum amount for all rows ordered by id ascending from the first row to the current row. See the window function tutorial in the PostgreSQL documentation if you need a primer on window functions.
Greatest Running Total
Now that we have the running total for every row it should be simple to use the max
aggregate function to find the greatest running total.
select max(sum(amount) over (order by id asc))
from entries;
Unfortunately, we get an error:
ERROR: aggregate function calls cannot contain window function calls
LINE 1: select max(sum(amount) over (order by id asc))
Instead we have use a subquery.
select max(running_total)
from (
select sum(amount) over (order by id asc) as running_total
from entries
) t;
Here is the result:
max
------------------
396271.274807863
(1 row)
Time: 643.848 ms
Not too bad, but there are two potential areas for improvement: query simplicity and speed. What we really want to do is this:
select greatest_running_total(amount order by id asc)
from entries;
Note the order by id asc
in the aggregate. Because a greatest_running_total
function would require its inputs to be ordered to be correct, it is vital that we include this clause.
Custom Aggregates
The greatest_running_total
function doesn't exist, but PostgreSQL gives us the functionality to create our own aggregate functions. In this case, our greatest_running_total
aggregate should accept float8
values and return a float8
.
To create an aggregate function we first need a state transition function. This function will be called for each input row with the aggregate internal state and the current row value. The internal state needs to contain the current running total as well as the greatest running total. So we need a structure of two float8
values. Fortunately, PostgreSQL has the point
type which is exactly what we need (a float8
array would also work, but point
is simpler use and potentially faster).
The following state transition function is implemented in PL/pgSQL.
create function grt_sfunc(agg_state point, el float8)
returns point
immutable
language plpgsql
as $$
declare
greatest_sum float8;
current_sum float8;
begin
current_sum := agg_state[0] + el;
if agg_state[1] < current_sum then
greatest_sum := current_sum;
else
greatest_sum := agg_state[1];
end if;
return point(current_sum, greatest_sum);
end;
$$;
The point
agg_state
is used as a 2-element, zero-based array. agg_state[0]
is the current sum; agg_state[1]
is the greatest sum the aggregate has seen. We simply add agg_state[0]
and the current row value el
to get the new current sum. The new greatest sum is the greater of the old greatest sum (agg_state[1]
) and the new current sum. Finally, we return a new point value with the new current and greatest sums.
Because our aggregate's internal state is of type point
and the output of our aggregate is float8
, we need an aggregate final function that takes the final value of the aggregate's internal state and converts it to a float8
.
create function grt_finalfunc(agg_state point)
returns float8
immutable
strict
language plpgsql
as $$
begin
return agg_state[1];
end;
$$;
Lastly, we have to create the aggregate by providing the state transition function, internal aggregate state type, and the final function.
create aggregate greatest_running_total (float8)
(
sfunc = grt_sfunc,
stype = point,
finalfunc = grt_finalfunc
);
Let's try our new function.
select greatest_running_total(amount order by id asc)
from entries;
greatest_running_total
------------------------
396271.274807863
(1 row)
Time: 3386.443 ms
Success! The new function returns the same result and the query is much simpler. Unfortunately, performance took a huge hit. It is now over 5x slower than before. Clearly, this is not acceptable.
Custom Aggregates in C
The majority of the computation is in the state transition function. What if we implement that in C? The code below is the same logic implemented in C. For more details on C extensions see the documentation.
#include "postgres.h"
#include "fmgr.h"
#include "utils/geo_decls.h"
"postgres.h" and "fmgr.h" are needed by all custom C functions. "utils/geo_decls.h" is needed to import the Point
struct.
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
The PG_MODULE_MAGIC
macro that ensures extension won't load against incompatible version of PostgreSQL.
PG_FUNCTION_INFO_V1(grt_sfunc);
Datum
grt_sfunc(PG_FUNCTION_ARGS)
{
PG_FUNCTION_INFO_V1
is a macro that specifies that a function will use the version 1 calling convention. Version 1 functions always have the same signature.
Point *new_agg_state = (Point *) palloc(sizeof(Point));
A Point
is a C struct behind the SQL point
type. It has two fields x
and y
. These directly correspond to point[0] and point[1] in SQL.
palloc
is a PostgreSQL provided function to allocate memory. PostgreSQL will ensure all memory allocated with palloc
is released at an appropriate time.
double el = PG_GETARG_FLOAT8(1);
Here we use the PostgreSQL provided macro PG_GETARG_FLOAT8
to extract a float8
from the second argument to the function (in C arrays are zero based so argument 1 is the second argument).
bool isnull = PG_ARGISNULL(0);
if(isnull) {
new_agg_state->x = el;
new_agg_state->y = el;
PG_RETURN_POINT_P(new_agg_state);
}
If argument 0 (agg_state
) is null this is the first value provided to the aggregate. Return a new state with the current sum (x
) and greatest sum (y
) equal to that value. PG_ARGISNULL
is a macro that evaluates to true if the argument is null. PG_RETURN_POINT_P
is a macro the returns a point
as the result of the function.
Point *agg_state = PG_GETARG_POINT_P(0);
new_agg_state->x = agg_state->x + el;
if(new_agg_state->x > agg_state->y) {
new_agg_state->y = new_agg_state->x;
} else {
new_agg_state->y = agg_state->y;
}
PG_RETURN_POINT_P(new_agg_state);
}
With the null state for the first row handled, it is just a simple translation of logic from PL/pgSQL: Compute the new running total (x
), and update the greatest running total (y
) if the new running total is larger than the old.
Installing the new function takes several steps.
First, the shared libary must be built and installed. The following Makefile will accomplish that task. Just run make install
(PostgreSQL installed with homebrew on OSX should have all PostgreSQL dependencies installed, on Debian or Ubuntu install postgresql-server-dev-9.5
).
MODULES = grt
PGXS := $(shell pg_config --pgxs)
include $(PGXS)
Next, we must create the function in SQL.
create function
grt_sfunc( point, float8 )
returns
point
as
'grt.so', 'grt_sfunc'
language
c
immutable;
The final function and aggregate creation are unchanged.
create function grt_finalfunc(agg_state point)
returns float8
immutable
language plpgsql
as $$
begin
return agg_state[1];
end;
$$;
create aggregate greatest_running_total (float8)
(
sfunc = grt_sfunc,
stype = point,
finalfunc = grt_finalfunc
);
Let's see what happens:
select greatest_running_total(amount order by id asc)
from entries;
greatest_running_total
------------------------
396271.274807863
(1 row)
Time: 825.365 ms
4x faster than the PL/pgSQL version, but still a bit slower than the subquery version.
Summary
PostgreSQL gives us many ways to tackle problems. In the greatest running total example, the initial solution with a subquery and window function is best.
However, sometimes a calculation may be exceedingly difficult without a custom aggregate. In these cases, a PL/pgSQL implementation may be ideal. However, PL/pgSQL performance can be lacking.
If a custom aggregate is necessary and performance of PL/pgSQL is insufficient, a C function may be a solution. But this step should not be undertaken lightly. A bug in a C extension can crash PostgreSQL and even corrupt data. The subtleties of C and the deployment and portability difficulties of custom C are costs that should only be paid when there is no reasonable alternative.