Metabase as a lightweight app server

In A virtuous cycle for analytics I said this about Metabase:

It’s all nicely RESTful. Interactive elements that can parameterize queries, like search boxes and date pickers, map to URLs. Queries can emit URLs in order to compose themselves with other queries. I came to see this system as a kind of lightweight application server in which to incubate an analytics capability that could later be expressed more richly.

Let’s explore that idea in more detail. Consider this query that finds groups created in the last week.

with group_create_days as (
     to_char(created, 'YYYY-MM-DD') as day
  from "group"
  where created > now() - interval '1 week'
from group_create_days
group by day 
order by day desc 

A Metabase user can edit the query and change the interval to, say, 1 month, but there’s a nicer way to enable that. Terms in double squigglies are Metabase variables. When you type {{interval}} in the query editor, the Variables pane appears.

Here I’m defining the variable’s type as text and providing the default value 1 week. The query sent to Postgres will be the same as above. Note that this won’t work if you omit ::interval. Postgres complains: “ERROR: operator does not exist: timestamp with time zone – character varying.” That’s because Metabase doesn’t support variables of type interval as required for date subtraction. But if you cast the variable to type interval it’ll work.

That’s an improvement. A user of this Metabase question can now type 2 months or 1 year to vary the interval. But while Postgres’ interval syntax is fairly intuitive, this approach still requires people to make an intuitive leap. So here’s a version that eliminates the guessing.

The variable type is now Field Filter; the filtered field is the created column of the group table; the widget type is Relative Date; the default is Last Month. Choosing other intervals is now a point-and-click operation. It’s less flexible — 3 weeks is no longer an option — but friendlier.

Metabase commendably provides URLs that capture these choices. The default in this case is METABASE_SERVER/question/1060?interval=lastmonth. For the Last Year option it becomes interval=lastyear.

Because all Metabase questions that use variables work this way, the notion of Metabase as rudimentary app server expands to sets of interlinked questions. In Working in a hybrid Metabase / Postgres code base I showed the following example.

A Metabase question, #600, runs a query that selects columns from the view top_20_annotated_domains_last_week. It interpolates one of those columns, domain, into an URL that invokes Metabase question #985 and passes the domain as a parameter to that question. In the results for question #600, each row contains a link to a question that reports details about groups that annotated pages at that row’s domain.

This is really powerful stuff. Even without all the advanced capabilities I’ve been discussing in this series — pl/python functions, materialized views — you can do a lot more with the Metabase / Postgres combo than you might think.

For example, here’s a interesting idiom I’ve discovered. It’s often useful to interpolate a Metabase variable into a WHERE clause.

select * 
from dashboard_users
where email = {{ email }} 

You can make that into a fuzzy search using the case-insensitive regex-match operator ~*.

select * 
from dashboard_users
where email ~* {{ email }}

That’ll find a single address regardless of case; you can also find all records matching, say, But it requires the user to type some value into the input box. Ideally this query won’t require any input. If none is given, it lists all addresses in the table. If there is input it does a fuzzy match on that input. Here’s a recipe for doing that. Tell Metabase that {{ email }} a required variable, and set its default to any. Then, in the query, do this:

select * 
from dashboard_users
where email ~*
    when {{ email }} = 'any' then ''
    else {{ email}}

In the default case the matching operator binds to the empty string, so it matches everything and the query returns all rows. For any other input the operator binds to a value that drives a fuzzy search.

This is all very nice, you may think, but even the simplest app server can write to the database as well as read from it, and Metabase can’t. It’s ultimately just a tool that you point at a data warehouse to SELECT data for display in tables and charts. You can’t INSERT or UPDATE or ALTER or DELETE or CALL anything.

Well, it turns out that you can. Here’s a Metabase question that adds a user to the table.

select add_dashboard_user( {{email}} )

How can this possibly work? If add_dashboard_user were a Postgres procedure you could CALL it from psql, but in this context you can only SELECT.

We’ve seen the solution in Postgres set-returning functions that self-memoize as materialized views. A Postgres function written in pl/python can import and use a Python function from a plpython_helpers module. That helper function can invoke psql to CALL a procedure. So this is possible.

We’ve used Metabase for years. It provides a basic, general-purpose UX that’s deeply woven into the fabric of the company. Until recently we thought of it as a read-only system for analytics, so a lot of data management happens in spreadsheets that don’t connect to the data warehouse. It hadn’t occurred to me to leverage that same basic UX for data management too, and that’s going to be a game-changer. I always thought of Metabase as a lightweight app server. With some help from Postgres it turns out to be a more capable one than I thought.


Posted in .

Leave a Reply