Working with interdependent Postgres functions and materialized views

In Working with Postgres types I showed an example of a materialized view that depends on a typed set-returning function. Because Postgres knows about that dependency, it won’t allow DROP FUNCTION foo. Instead it requires DROP FUNCTION foo CASCADE.

A similar thing happens with materialized views that depend on tables or other materialized views. Let’s build a cascade of views and consider the implications.

create materialized view v1 as (
  select 
    1 as number,
    'note_count' as label 
);
SELECT 1

select * from v1;

number | label
-------+-------
     1 | note_count

Actually, before continuing the cascade, let’s linger here for a moment. This is a table-like object created without using CREATE TABLE and without explicitly specifying types. But Postgres knows the types.

\d v1;

Materialized view "public.v1"

Column  | Type 
--------+-----
number  | integer
label   | text

The read-only view can become a read-write table like so.

create table t1 as (select * from v1);
SELECT 1

select * from t1;

number | label
-------+-------
     1 | note_count

\d t1

Table "public.v1"

Column  | Type 
--------+-----
number  | integer
label   | text

This ability to derive a table from a materialized view will come in handy later. It’s also just interesting to see how the view’s implicit types become explicit in the table.

OK, let’s continue the cascade.

create materialized view v2 as (
  select 
    number + 1,
    label
  from v1
);
SELECT 1

select * from v2;

number | label
-------+-------
     2 | note_count

create materialized view v3 as (
  select 
    number + 1,
    label
  from v2
);
SELECT 1

select * from v3;

number | label
-------+-------
     3 | note_count

Why do this? Arguably you shouldn’t. Laurenz Albe makes that case in Tracking view dependencies in PostgreSQL. Recognizing that it’s sometimes useful, though, he goes on to provide code that can track recursive view dependencies.

I use cascading views advisedly to augment the use of CTEs and functions described in Postgres functional style. Views that refine views can provide a complementary form of the chunking that aids reasoning in an analytics system. But that’s a topic for another episode. In this episode I’ll describe a problem that arose in a case where there’s only a single level of dependency from a table to a set of dependent materialized views, and discuss my solution to that probem.

Here’s the setup. We have an annotation table that’s reloaded nightly. On an internal dashboard we have a chart based on the materialized view annos_at_month_ends_for_one_year which is derived from the annotation table and, as its name suggests, reports annotation counts on a monthly cycle. At the beginning of the nightly load, this happens: DROP TABLE annotation CASCADE. So the derived view gets dropped and needs to be recreated as part of the nightly process. But that’s a lot of unnecessary work for a chart that only changes monthly.

Here are two ways to protect a view from a cascading drop of the table it depends on. Both reside in a SQL script, monthly.sql, that only runs on the first of every month. First, annos_at_month_ends_for_one_year.

drop materialized view annos_at_month_ends_for_one_year;
create materialized view annos_at_month_ends_for_one_year as (
  with last_days as (
    select 
      last_days_of_prior_months( 
        date(last_month_date() - interval '6 year')
      ) as last_day
    ),
    monthly_counts as (
      select
        to_char(last_day, '_YYYY-MM') as end_of_month,
        anno_count_between( 
          date(last_day - interval '1 month'), last_day
        ) as monthly_annos
      from last_days
    )
    select
      end_of_month,
      monthly_annos,
      sum(monthly_annos) over 
        (order by end_of_month asc rows 
           between unbounded preceding and current row
        ) as cumulative_annos
    from monthly_counts
) with data;

Because this view depends indirectly on the annotation table — by way of the function anno_count_between — Postgres doesn’t see the dependency. So the view isn’t affected by the cascading drop of the annotation table. It persists until, once a month, it gets dropped and recreated.

What if you want Postgres to know about such a dependency, so that the view will participate in a cascading drop? You can do this.

create materialized view annos_at_month_ends_for_one_year as (
  with depends as (
    select * from annotation limit 1
  )
  last_days as (
    ),
  monthly_counts as (
    )
  select
    *
  from monthly_counts;

The depends CTE doesn’t do anything relevant to the query, it just tells Postgres that this view depends on the annotation table.

Here’s another way to protect a view from a cascading drop. This expensive-to-build view depends directly on the annotation table but only needs to be updated monthly. So in this case, cumulative_annotations is a table derived from a temporary materialized view.

create materialized view _cumulative_annotations as (
  with data as (
    select 
      to_char(created, 'YYYY-MM') as created
    from annotation
    group by created
  ) 
  select 
    data.created, 
    sum(data.count) 
      over ( 
        order by data.created asc 
        rows between unbounded preceding and current row
      )
    from data
    group by data.created 
    order by data.created 

drop table cumulative_annotations;

create table cumulative_annotations as (
  select * from _cumulative_annotations
);

drop materialized view _cumulative_annotations;

The table cumulative_annotations is only rebuilt once a month. It depends indirectly on the annotation table but Postgres doesn’t see that, so doesn’t include it in the cascading drop.

Here’s the proof.

-- create a table
create table t1 (number int);

insert into t1 (number) values (1);
INSERT 0 1

select * from t1;

number
-------
     1

-- derive a view from t1
create materialized view v1 as (select * from t1);
SELECT 1

select * from v1

number
-------
     1

-- try to drop t1
drop table t1;

ERROR: cannot drop table t1 because other objects depend on it
DETAIL: materialized view v1 depends on table t1
HINT: Use DROP ... CASCADE to drop the dependent objects too.

-- derive an independent table from t1 by way of a matview
drop materialized view v1;

create materialized view v1 as (select * from t1);
SELECT 1

create table t2 as (select * from v1);
SELECT 1

-- drop the matview
drop materialized view v1;

-- drop t1
drop table t1;

-- no complaint, and t2 still exists
select * from t2;

number
-------
     1

These are two ways I’ve found to protect a long-lived result set from the cascading drop of a short-lived table on which it depends. You can hide the dependency behind a function, or you can derive an independent table by way of a transient materialized view. I use them interchangeably, and don’t have a strong preference one way or another. Both lighten the load on the analytics server. Materialized views (or tables) that only need to change weekly or monthly, but were being dropped nightly by cascade from core tables, are now recreated only on their appropriate weekly or monthly cycles.

Working with interdependent Postgres functions and materialized views

Like this:

Published by Jon Udell

Leave a ReplyCancel reply

Share this:

Like this:

Published by Jon Udell

Leave a ReplyCancel reply

Discover more from Jon Udell