Heading image for post: Generate Dates in PostgreSQL

PostgreSQL

Generate Dates in PostgreSQL

Profile picture of Josh Branchaud

PostgreSQL has shipped with the generate_series() function for quite some time. This function is often used to generate series of numeric data. For instance, in my post, Understanding Common Table Expressions with FizzBuzz, I used it to generate a series of integers from 1 to 100.

As of PostgreSQL 8.4, support for generating series of timestamp data was added. I don't see this used often, so let's take a look at it.

We could take a peak at the online documentation to see the function signature, or we could pull it up in psql using \df.

> \df generate_series()
                                                               List of functions
   Schema   |      Name       |         Result data type          |                        Argument data types                         |  Type
------------+-----------------+-----------------------------------+--------------------------------------------------------------------+--------
 pg_catalog | generate_series | SETOF bigint                      | bigint, bigint                                                     | normal
 pg_catalog | generate_series | SETOF bigint                      | bigint, bigint, bigint                                             | normal
 pg_catalog | generate_series | SETOF integer                     | integer, integer                                                   | normal
 pg_catalog | generate_series | SETOF integer                     | integer, integer, integer                                          | normal
 pg_catalog | generate_series | SETOF numeric                     | numeric, numeric                                                   | normal
 pg_catalog | generate_series | SETOF numeric                     | numeric, numeric, numeric                                          | normal
 pg_catalog | generate_series | SETOF timestamp with time zone    | timestamp with time zone, timestamp with time zone, interval       | normal
 pg_catalog | generate_series | SETOF timestamp without time zone | timestamp without time zone, timestamp without time zone, interval | normal

The last two records are what we are looking for -- support for timestamps with and without time zones.

Notice that each requires three arguments. The first two are the lower and upper bound timestamps of the series to be generated. The interval specifies what amount of spacing to put between each timestamp when generating the series.

Let's try it out by generating the series of days in 2017.

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 day'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-01-02 00:00:00
 2017-01-03 00:00:00
 2017-01-04 00:00:00
 2017-01-05 00:00:00
 2017-01-06 00:00:00
 2017-01-07 00:00:00
 2017-01-08 00:00:00
 2017-01-09 00:00:00
 2017-01-10 00:00:00
 2017-01-11 00:00:00
 2017-01-12 00:00:00
 ...

Take note that we have to satisfy the function signature, so we create a date using date and then coerce it to a timestamp. There are 365 results, so I've truncate them a bit.

PostgreSQL understands the calendar, so you can even count on proper handling of concepts like a leap year.

> select * from (
    select generate_series(
             (date '2020-01-01')::timestamp,
             (date '2020-12-31')::timestamp,
             interval '1 day'
           )
    ) as twenty_twenty(d)
  where date_part('month', twenty_twenty.d) = 2;

          d
---------------------
 2020-02-01 00:00:00
 2020-02-02 00:00:00
 2020-02-03 00:00:00
 2020-02-04 00:00:00
 2020-02-05 00:00:00
 2020-02-06 00:00:00
 2020-02-07 00:00:00
 2020-02-08 00:00:00
 2020-02-09 00:00:00
 2020-02-10 00:00:00
 2020-02-11 00:00:00
 2020-02-12 00:00:00
 2020-02-13 00:00:00
 2020-02-14 00:00:00
 2020-02-15 00:00:00
 2020-02-16 00:00:00
 2020-02-17 00:00:00
 2020-02-18 00:00:00
 2020-02-19 00:00:00
 2020-02-20 00:00:00
 2020-02-21 00:00:00
 2020-02-22 00:00:00
 2020-02-23 00:00:00
 2020-02-24 00:00:00
 2020-02-25 00:00:00
 2020-02-26 00:00:00
 2020-02-27 00:00:00
 2020-02-28 00:00:00
 2020-02-29 00:00:00

We repurposed our previous query for the year 2020. After wrapping it in a subquery, we are able to filter what we are looking at, in this case, just the month of February (i.e. when the month is 2). Notice it has 29 days!

We've only look at intervals of '1 day' so far, but we can do whatever interval we want. How about every 3 days?

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '3 days'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-01-04 00:00:00
 2017-01-07 00:00:00
 2017-01-10 00:00:00
 2017-01-13 00:00:00
 2017-01-16 00:00:00
 2017-01-19 00:00:00
 2017-01-22 00:00:00
 ...

Or every month?

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 month'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-02-01 00:00:00
 2017-03-01 00:00:00
 2017-04-01 00:00:00
 2017-05-01 00:00:00
 2017-06-01 00:00:00
 2017-07-01 00:00:00
 2017-08-01 00:00:00
 2017-09-01 00:00:00
 2017-10-01 00:00:00
 2017-11-01 00:00:00
 2017-12-01 00:00:00

We can even use a really odd interval of time.

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 month 1 day 1 hour'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-02-02 01:00:00
 2017-03-03 02:00:00
 2017-04-04 03:00:00
 2017-05-05 04:00:00
 2017-06-06 05:00:00
 2017-07-07 06:00:00
 2017-08-08 07:00:00
 2017-09-09 08:00:00
 2017-10-10 09:00:00
 2017-11-11 10:00:00
 2017-12-12 11:00:00

PostgreSQL's date and timestamp capabilities are quite powerful. What we've looked at above can be put to great use in report generation. For instance, if you need aggregate data for certain intervals of time, you can join your timestamped data against any generate_series table. Instead of pulling large amounts of data into application land for processing, it can be done in the database. Your CPU and your users will thank you.


Cover image by bady qb on Unsplash.com

More posts about sql PostgreSQL