When the rubber duck talks back

I’d been needing to refactor the pagination logic in the Mastodon plugin for Steampipe. After a couple of abortive tries, I took another run at it this week with the help of the latest generation of LLM-powered coding assistants.

Here was the problem. The pre-release version of the plugin consolidated pagination for many tables in one place. That was a good thing, but the downside was that there was only one Steampipe table representing what should have been many. So you could say select * from mastodon_timeline, but then you had to qualify it with where timeline = 'home' or where timeline = 'local' and so on. For a user of the plugin this was awkward: you’d rather say select * from mastodon_timeline_home or select * from mastodon_timeline_local, and reserve the where clause for more specific purposes. The v1 plugin made separate tables, but duplicated the pagination logic on a per-table basis. It worked, and was good enough to ship the plugin in time to demo at FediForum, but it obviously needed improvement.

ChatGPT-4 and Sourcegraph Cody

Since then, Sourcegraph has released its new coding assistant, Cody, which you can run as a VSCode extension or on sourcegraph.com. This set up the possibility for an interesting comparison. ChatGPT-4 builds on OpenAI’s LLM; Sourcegraph’s Cody, on the other hand, uses Anthropic’s Claude.

Another key difference is that ChatGPT only has the context you paste into it. Cody, sitting inside VSCode, can see your repository and has all that context. And if you index your repo, which is something Sourcegraph is willing to do for beta users on request, then Cody has access to what are called embeddings that represent the structure of your code in various ways. These embeddings, according to Sourcegraph, can powerfully enhance your LLM prompts. Even without embeddings, Cody offers quite a range of assistance, from a high-level overview of what your repo does to line-level improvements. It’s all packaged, in the extension, as a set of recipes behind buttons with names like Explain selected code, Improve variable names, and Smell code. I haven’t yet used these recipes enough to form solid opinions, though. For this exercise I used Cody mostly in a ChatGPT-like conversational way. In that mode, it’s wonderful to be able to select the code you want to talk about, instead of pasting it into the chat.

In both cases, as should be no surprise, it wasn’t enough to just ask the tools to consolidate the pagination logic. They were perfectly happy to propose solutions that could never work and might not even compile. So I began with a simpler version of the problem. Mastodon uses the same pagination machinery for APIs that return arrays of different kinds of results: Statuses (toots), Accounts, and Notifications. By focusing on these separately I reduced the duplicate pagination from 13 instances to three. Then, in a separate pass, I worked out how to collapse those into a single paginate function that accepted one of three data-fetching function parameters.

I tried to pay careful attention to prompts and completions as I went along, but in the heat of the action I didn’t do a great job of that, partly because I was switching back and forth between the two tools. But I’m quite happy with the result. There was one key insight in particular which, fascinatingly, I am hard pressed to assign credit for. Was it me or one of the assistants? I think it was me, but in a way that doesn’t matter, and isn’t the point of this story.

The key insight

Here was the insight. When I was building the transitional paginateStatus function, the first attempt returned results to the calling code in each table’s List function, which was responsible for streaming the data to Steampipe. This led to a series of detours to work around the problem that the returned data could be quite large and chew up a lot of memory. That could probably be solved with a goroutine that would stream results back to the caller, instead of returning them as a batch. I prodded both LLMs to come up with that kind of solution and had no luck after several tries with each. But then came the insight: the helper functions could stream results directly to Steampipe, and just return nil or err to the calling List function.

With that dramatic simplification I was able to complete the phase 1 refactoring, which yielded three pagination functions: paginateStatus, paginateAccount, and paginateNotification. Phase 2, which consolidated those into a single paginate function, was a bit more prosaic. I did need some help understanding how the necessary switch statements could switch on the timeline types passed into the paginate function. Both assistants had seen lots of examples of this pattern, and both helpfully augmented my imperfect knowledge of golang idioms.
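To make the shape of the final code concrete, here’s a minimal Go sketch of the pattern, not the plugin’s actual code. The names (streamer, fetchPage, fetchHome, and friends) are hypothetical stand-ins: in the real plugin the streaming call corresponds to the Steampipe SDK’s StreamListItem, and the fetchers wrap go-mastodon timeline calls.

package pagination

import (
	"context"
	"fmt"
)

// streamer stands in for Steampipe's QueryData, whose StreamListItem method
// pushes one row at a time back to the engine.
type streamer interface {
	StreamListItem(ctx context.Context, item any)
}

// fetchPage returns one page of results plus the cursor for the next page
// ("" when the timeline is exhausted).
type fetchPage func(ctx context.Context, maxID string) (items []any, nextMaxID string, err error)

// paginate is the single consolidated loop: it streams each item straight to
// Steampipe and returns only nil or an error to the calling List function.
func paginate(ctx context.Context, s streamer, fetch fetchPage) error {
	maxID := ""
	for {
		items, next, err := fetch(ctx, maxID)
		if err != nil {
			return err
		}
		for _, item := range items {
			s.StreamListItem(ctx, item) // no large in-memory batch to return
		}
		if next == "" || len(items) == 0 {
			return nil
		}
		maxID = next
	}
}

// listTimeline shows how a table's List function can pick a fetcher by
// switching on the timeline type, then delegate paging to paginate.
func listTimeline(ctx context.Context, s streamer, timeline string) error {
	var fetch fetchPage
	switch timeline {
	case "home":
		fetch = fetchHome
	case "local":
		fetch = fetchLocal
	case "federated":
		fetch = fetchFederated
	default:
		return fmt.Errorf("unknown timeline: %s", timeline)
	}
	return paginate(ctx, s, fetch)
}

// Stub fetchers; in the real plugin these would wrap go-mastodon API calls.
func fetchHome(ctx context.Context, maxID string) ([]any, string, error)      { return nil, "", nil }
func fetchLocal(ctx context.Context, maxID string) ([]any, string, error)     { return nil, "", nil }
func fetchFederated(ctx context.Context, maxID string) ([]any, string, error) { return nil, "", nil }

The point of the pattern shows up in paginate: nothing accumulates in memory, and the calling List function gets back only nil or an error.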

Partnering with machine intelligence

I came away with a profound sense that the real value of these assistants isn’t any particular piece of code that they get “right” or “wrong” but rather the process of collaborating with them. When you’re working alone, you have an ongoing conversation with yourself, usually in your own head. The point of talking to a rubber duck is to voice that conversation so you can more effectively reason about it. Externalizing your thinking in that way is intrinsically valuable. But when the rubber duck talks back, it’s a whole new game. As Garry Kasparov famously wrote:

The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and coaching their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

I’m not worried about robot overlords. Instead, I look forward to collaborating with robot partners.


1 https://blog.jonudell.net/2022/11/28/autonomy-packet-size-friction-fanout-and-velocity/
2 https://blog.jonudell.net/2022/12/06/mastodon-steampipe-and-rss/
3 https://blog.jonudell.net/2022/12/10/browsing-the-fediverse/
4 https://blog.jonudell.net/2022/12/17/a-bloomberg-terminal-for-mastodon/
5 https://blog.jonudell.net/2022/12/19/create-your-own-mastodon-ux/
6 https://blog.jonudell.net/2022/12/22/lists-and-people-on-mastodon/
7 https://blog.jonudell.net/2022/12/29/how-many-people-in-my-mastodon-feed-also-tweeted-today/
8 https://blog.jonudell.net/2022/12/31/instance-qualified-mastodon-urls/
9 https://blog.jonudell.net/2023/01/16/mastodon-relationship-graphs/
10 https://blog.jonudell.net/2023/01/21/working-with-mastodon-lists/
11 https://blog.jonudell.net/2023/01/26/images-considered-harmful-sometimes/
12 https://blog.jonudell.net/2023/02/02/mapping-the-wider-fediverse/
13 https://blog.jonudell.net/2023/02/06/protocols-apis-and-conventions/
14 https://blog.jonudell.net/2023/02/14/news-in-the-fediverse/
15 https://blog.jonudell.net/2023/02/26/mapping-people-and-tags-on-mastodon/
16 https://blog.jonudell.net/2023/03/07/visualizing-mastodon-server-moderation/
17 https://blog.jonudell.net/2023/03/14/mastodon-timelines-for-teams/
18 https://blog.jonudell.net/2023/04/03/the-mastodon-plugin-is-now-available-on-the-steampipe-hub/
19 https://blog.jonudell.net/2023/04/11/migrating-mastodon-lists/
20 https://blog.jonudell.net/2023/05/24/when-the-rubber-duck-talks-back/

Migrating Mastodon lists

My original Mastodon account was on mastodon.social. I chose it because it’s the largest instance, it’s operated by the developers of the service, and it runs the reference implementation of the software. I wanted to experience the fediverse, initially, as most people do, then consider whether to migrate to another server — and if so, to which one.

Last week I migrated to social.coop which, like cosocial.ca, is what Darius Kazemi calls a community-owned independent social media site. I don’t think the acronym COISMS will fly, but the business model is just what I’ve been looking for. I don’t want to use “free” services that make me the product they sell. I want to pay for a service that’s funded by my dollars, not by my data. That’s why I bailed on Gmail years ago, in favor of Fastmail, and it’s why I’m happy to spend $10/month — the price of a couple of cappuccinos — to support the people who keep the lights on at social.coop and maintain civility.

The instructions for migrating from one server to another offer two paths: Profile redirect and Profile move. I chose the former, because I assumed I would retain API access to mastodon.social post-migration, and thus have the option to use Steampipe to query both accounts. Why do that? You can’t migrate your posts from the old server to the new one, only your followers and (optionally) your follows, bookmarks, blocks, and mutes. Steampipe can connect to two different Mastodon servers at the same time, so I figured I could run queries against both.

Lessons learned

That turned out to be a wrong assumption. I can reactivate my mastodon.social account by turning off the redirection, but I’m vague on the implications of doing that. Meanwhile it’s still possible to export my posts, so I can always reacquire them that way if needed.

Here was another wrong assumption. I think that if I’d chosen to move my account, instead of redirecting it, my profile would have transferred to the new server. Instead it was on me to fill out the new profile. I’d created @judell@social.coop but, when I kicked off the migration, I hadn’t yet completed the profile. So when followers were notified that judell@social.coop was now following them, there was no picture or bio or verified website. That led some people to think it was a bogus account and block it. It wasn’t a catastrophe, and I’ve spoken with a few of them to resolve the matter, but it was an unforced error that would have been easy to avoid.

The view from here

Here’s a comparison of weekly activity on two servers, by way of the server dashboard.

What’s it like moving to a server with only 1% of the flow that happens on mastodon.social? Not too different! Because I’ve moved my followers and follows, and because I interact primarily with the home timeline and with lists, the experience is mostly the same. For example, here’s the graph of boost relationships among servers in my home timeline. It looks just like it did before.

It’s wonderful to have a portable social graph that you can bring along as you move around in the fediverse. The key differences are that I’m meeting new people, and spending more time on the local timeline. On a big server like mastodon.social the local timeline feels like the Twitter firehose, for those who remember when that was still a thing you could watch. I’d look at it occasionally, because it could be a source of useful serendipity, but mostly it was just too random. The local timeline on social.coop is still somewhat random, but I find it more useful — not only because it represents a far smaller population but also because the people who’ve signed up share a common interest in the coop model.

I want my lists

There was, however, one big obstacle. You can export lists from the old server and import them into the new server, but there’s no export/import for the accounts you’ve assigned to those lists. Since I rely heavily on lists to navigate Mastodon flow in a focused and intentional way, this was a problem I really needed to fix. I gave Eliot Nash’s masto-list-importer a try, but when I attempted to transfer my 460 person/list mappings I hit API rate limits. So I worked out a Steampipe-based solution, which is a useful alternative and also a good way to illustrate the steps involved.

Let’s assume that I want to migrate again, from social.coop to fosstodon.org. I’ve exported the names of my lists from social.coop, and imported them to fosstodon.org. Now I need to populate those lists. The SQL solution I came up with proceeds by steps, each being a common table expression (CTE) that accomplishes one part of a complex transformation, then passes results to the next step.

Step 1: Gather the accounts assigned to each social.coop list.

We start by joining two tables to enumerate accounts assigned to each list.

with accounts_by_list as (
  select
    a.acct as old_account,
    a.id as old_account_id,
    l.id as old_list_id,
    l.title as title
  from
    social_coop.mastodon_my_list l
  join
    social_coop.mastodon_list_account a
  on
    l.id = a.list_id
)
select * from accounts_by_list

The output, in accounts_by_list, is effectively a temporary table that you can review to make sure the results are as expected. The ability to verify the output of each phase of a CTE pipeline, before sending it to the next phase, is a key advantage of this approach. Nested subqueries are much harder to debug!

Step 2: Map the accounts on social.coop to accounts on fosstodon.org

Two transformations happen in this phase. If an account is a bare username, say personA, then it’s a social.coop account. On fosstodon.org that same account will be represented as personA@social.coop. Conversely, if personB@fosstodon.org exists on social.coop, that account becomes the bare username personB there. All other accounts (e.g. personC@hachyderm.io) pass through unchanged.

with accounts_by_list as (
  -- as above
),
adjusted_accounts_by_list as (
  select
    old_account,
    old_account_id,
    old_list_id,
    title,
    case 
      when old_account !~ '@' then old_account || '@social.coop' 
      when old_account ~ '@fosstodon.org' then replace(old_account, '@fosstodon.org', '')
      else old_account
    end as new_account
  from
     accounts_by_list
)
select * from adjusted_accounts_by_list

Step 3: Map the list titles to the new server.

The names of the lists are the same in both places, but their ids differ. For example, my Fediverse list is 1043 on social.coop and 6771 on fosstodon.org. To invoke the API call that adds someone to a list, we’ll need to use the latter id.

with accounts_by_list as (
  -- as above
),
adjusted_accounts_by_list as (
  -- as above
),
adjusted_list_ids as (
  select
    a.*,
    l.id as new_list_id
  from
    adjusted_accounts_by_list a
  join
    fosstodon.mastodon_my_list l
  on
    a.title = l.title
)
select * from adjusted_list_ids

Step 4: Map the account ids to the new server.

Like lists, accounts on the new server have different ids, and those too are required for the API call. We can find the new ids by joining the new_account column from step 2 with the table fosstodon.mastodon_my_following.

with accounts_by_list as (
  -- as above
),
adjusted_accounts_by_list as (
  -- as above
),
adjusted_list_ids as (
  -- as above
),
adjusted_account_ids as (
  select 
    a.*,
    f.id as new_account_id
  from
    adjusted_list_ids a
  join
    fosstodon.mastodon_my_following f
  on
    f.acct = a.new_account
)
select * from adjusted_account_ids

Step 5: Build the API calls

There are lots of ways to skin this cat. Because you can use Steampipe as a component, you could use any programming language with a Postgres driver to run this query, and call the Mastodon API with the appropriate list and account ids.
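To make that concrete, here’s a hedged Go sketch of the programmatic route. It assumes Steampipe’s local service is running, that the connection details below match your service’s defaults, that the CTE pipeline has been rewritten to select just new_list_id and new_account_id, and that a MASTODON_TOKEN environment variable holds the destination server’s token. These are placeholder assumptions, a sketch rather than a tested migration script.

package main

import (
	"database/sql"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"

	_ "github.com/lib/pq" // Postgres driver for Steampipe's local service
)

// migrateListsQuery is a placeholder for the CTE pipeline above, rewritten to
// end with: select new_list_id, new_account_id from adjusted_account_ids
const migrateListsQuery = `
-- paste the full CTE pipeline here, ending with:
-- select new_list_id, new_account_id from adjusted_account_ids
`

func main() {
	// Connection details assume Steampipe's local service defaults; adjust to
	// match the output of `steampipe service start`.
	db, err := sql.Open("postgres",
		"host=localhost port=9193 user=steampipe dbname=steampipe sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query(migrateListsQuery)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	token := os.Getenv("MASTODON_TOKEN") // destination server's access token
	for rows.Next() {
		var listID, accountID string
		if err := rows.Scan(&listID, &accountID); err != nil {
			log.Fatal(err)
		}
		// Same endpoint the curl version uses: POST /api/v1/lists/:id/accounts
		endpoint := fmt.Sprintf("https://fosstodon.org/api/v1/lists/%s/accounts", listID)
		form := url.Values{"account_ids[]": {accountID}}
		req, err := http.NewRequest("POST", endpoint, strings.NewReader(form.Encode()))
		if err != nil {
			log.Fatal(err)
		}
		req.Header.Set("Authorization", "Bearer "+token)
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			log.Fatal(err)
		}
		resp.Body.Close()
		time.Sleep(time.Second) // same throttling pause as the curl version
	}
}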

And then, of course, there’s always curl.

with accounts_by_list as (
  -- as above
),
adjusted_accounts_by_list as (
  -- as above
),
adjusted_list_ids as (
  -- as above
),
adjusted_account_ids as (
  -- as above
)
select
  'curl -X POST -H "Authorization: Bearer ***" https://fosstodon.org/api/v1/lists/' ||
  new_list_id ||
  '/accounts/?' ||
  'account_ids[]=' ||
  new_account_id ||
  '; sleep 1;' as command
from
  adjusted_account_ids

To avoid throttling, I appended sleep 1 to each line. I saved the query in migrate-lists.sql, and exported the output to a file.

steampipe query --output csv migrate-lists.sql > migrate.sh

That was close, but not quite right. The output looked like this:

command
"curl -X POST -H ""Authorization: Bearer ***"" https://fosstodon.org/api/v1/lists/6771/accounts/?account_ids[]=279462; sleep 1;"
"curl -X POST -H ""Authorization: Bearer ***"" https://fosstodon.org/api/v1/lists/6771/accounts/?account_ids[]=109283421559929728; sleep 1;"

I needed to remove the header line, strip the double quotes at either end of each line, and collapse each doubled pair of quotes into one. For that I turned to ChatGPT (v4). It took a few tries to get it right, but soon enough it produced a working bash script that I’m very glad I did not have to write.

#!/bin/bash

# read the original script from migrate.sh and store it in an array
readarray -t original_script < migrate.sh

# remove the first line of the original script
modified_script=("${original_script[@]:1}")

# loop through the modified script array and replace double double-quotes with a single double-quote and remove double quotes at the beginning and end of each line
for (( i=0; i<${#modified_script[@]}; i++ )); do
    modified_script[$i]=$(sed 's/""/"/g' <<< "${modified_script[$i]}")
    modified_script[$i]=$(sed 's/^"//' <<< "${modified_script[$i]}")
    modified_script[$i]=$(sed 's/"$//' <<< "${modified_script[$i]}")
done

# print out the modified script with newlines
printf '%s\n' "${modified_script[@]}"

And that did the trick. With my lists fully populated on social.coop, I’m now reading Mastodon from my new home server in the same list-oriented way as I did on the old one.

If you move servers and want to migrate the people on your lists, you should definitely try masto-list-importer first, and consider the method here as a fallback if that doesn’t work for you. I agree with Eliot Nash that list transfer ought to be baked into the Mastodon UI. People rightly complain that it can be hard to find the people you want to follow and the topics you want to explore. You can follow hashtags, but that can be too random. Assigning people to topical lists strikes a good balance. But only if those list assignments are portable!



The Mastodon plugin is now available on the Steampipe Hub

When Twitter changed hands last November I switched to Mastodon; ever since I’ve enjoyed happier and more productive social networking. To enhance my happiness and productivity I began working on a Mastodon plugin for Steampipe. My initial goal was to study the fediverse writ large. Which people and which servers are powerful connectors? How do moderation policies work? What’s it like to join a small server versus a large one? These are important questions, and you can use the plugin to begin to answer them. But I soon realized that as a newcomer to a scene that’s been evolving for six years, and has not welcomed such analysis, I should start by looking for ways to enhance the experience of reading Mastodon. So I began building a set of dashboards that augment the stock Mastodon client or (my preference) elk.zone. And I’ve narrated that project in a series of posts.

Last week we released the plugin to the Steampipe hub. If you’ve installed Steampipe, you can now get the plugin using steampipe plugin install mastodon. The next phases of this project will explore using the plugin and dashboards in Steampipe Cloud, and speeding up the dashboards by means of persistent Postgres tables and Steampipe Cloud snapshots. Meanwhile, here’s a recap of what I’ve learned thus far.

See more with less distraction

While the dashboards use charts and relationship graphs, they are mainly tables of query results. Because Steampipe dashboards don’t (yet) render HTML, these views display plain text only — no images, no styled text. I’ve embraced this constraint, and I find it valuable in two ways. First, I’m able to scan many more posts at a glance than is possible in conventional clients, and more effectively choose which to engage with. When I described this effect to a friend he said: “It’s a Bloomberg terminal for Mastodon!” As those of us who rode the first wave of the blogosphere will recall, RSS readers were a revelation for the same reason.

Second, I find that the absence of images and styled text has a calming effect. To maintain a healthy information diet you need to choose sources wisely but, no matter where you go, sites deploy a barrage of attention-grabbing devices. I find dialing down the noise helpful, for the same reason that I often switch my phone to monochrome mode. Attention is our scarcest resource; the fewer distractions, the better.

There’s a tradeoff, of course; sometimes an image is the entire point of a post. So while I often read Mastodon using these Steampipe dashboards, I also use Elk directly. The Steampipe dashboards work alongside conventional Mastodon clients, and indeed depend on them: I click through from the dashboards to Elk in order to boost, reply, or view images. That experience is enhanced by instance-qualified URLs that translate foreign URLs to ones that work on your home server.

Manage people and lists

The ability to assign people to lists, and read in a list-oriented way, is a handy Twitter affordance that I never used much because it was easy to let the algorithms govern my information diet. Because Mastodon doesn’t work like that, lists have become the primary way I read the fediverse flow. Of the 800+ people I follow so far, I’ve assigned more than half to lists with titles like *Climate* and *Energy* and *Software*. To help me do that, several dashboards report how many of the people I follow are assigned to lists (or not).

I want as many people on lists as possible. So I periodically review the people I follow, put unassigned people on lists, and track the ratio of people who are, or aren’t, on lists. Here’s the query for that.

with list_account as (
  select
    a.id,
    l.title as list
  from
    mastodon_my_list l
    join mastodon_list_account a on l.id = a.list_id
),
list_account_follows as (
  select
    list
  from
    mastodon_my_following
    left join list_account using (id)
)
select 'follows listed' as label, count(*) from list_account_follows where list is not null
union
select 'follows unlisted' as label, count(*) from list_account_follows where list is null

When you read in a list-oriented way, as is also true when you read by following hashtags, there are always people whose chattiness becomes a distraction. To control that I’ve implemented the following rule: show at most one original toot per person per list per day. Will I miss some things this way? Sure! But if someone has said something that resonates with other people, I’m likely to hear about it from someone else. It’s a tradeoff that’s working well for me so far.

Here’s the SQL implementation of the rule.

with data as (
  select
    list_id,
    to_char(created_at, 'YYYY-MM-DD') as day,
    case
      when display_name = '' then username
      else display_name
    end as person,
    instance_qualified_url as url,
    substring(content from 1 for 200) as toot
  from
    mastodon_toot_list
  where
    list_id = '42994'
    and reblog is null -- not boosts
    and in_reply_to_account_id is null -- not replies
  limit
    40
)
select
  distinct on (person, day) -- only one per person per day
  day,
  person,
  toot,
  url
from
  data
order by
  day desc,
  person;

On the home timeline’s dashboard I’ve made it optional to include or hide boosts, which can be the majority of items. On the list-reading dashboard I’ve opted to always exclude them (reblog is null), and the idiom that enforces the one-per-day rule — select distinct on (person, day) — is simple, easy to understand, and easy to change.

Visualize relationships

I’ve so far found three ways in which relationship graphs can make Mastodon more legible. First, in Mastodon relationship graphs, I showed how to use SQL-defined nodes and edges to show boost relationships among people and servers. In another article I used the same tools to map relationships among people and tags. And most recently I used them to explore server-to-server moderation.

In all three cases the format conveys information not directly available from tabular views. Clusters of interesting people pop out, as do people who share tags. And when I graphed servers that block other servers I discovered an unexpected category: some servers that block others are themselves also blocked, like infosec.exchange in this example.


blocked and blocking mastodon servers

The Steampipe combo of SQL-oriented API access and dashboards as code is a uniquely productive way to build relationship graphs that can unlock insights in any domain. As we’ve seen with Kubernetes, they can help make cloud infrastructure more legible. The Mastodon graphs suggest that the same can happen in the social networking realm.

Use RSS feeds

When you append .rss to the URL of a Mastodon account, or tag, you produce an RSS feed like https://mastodon.social/@judell.rss or https://mastodon.social/tags/steampipe.rss. These feeds provide a kind of auxiliary API that includes data not otherwise available from the primary API: related tags, which appear in the feeds as RSS category elements. Steampipe really shines here thanks to the RSS plugin, which enables joins with the primary Mastodon API. This query augments items in a tag’s feed with the other tags that appear in each item.

with data as (
  select
    name,
    url || '.rss' as feed_link
  from
    mastodon_search_hashtag
  where
    query = $1
    and name = query
  limit 1
)
select
  to_char(r.published, 'YYYY-MM-DD') as published,
  d.name as tag,
  (
    select string_agg(trim(JsonString::text, '"'), ', ')
    from jsonb_array_elements(r.categories) JsonString
  ) as categories,
  r.guid as link,
  ( select content as toot from mastodon_search_toot where query = r.guid ) as content
from
  data d
join
  rss_item r
on
  r.feed_link = d.feed_link
order by
  r.published desc
limit 10

A similar query drives the graph discussed in Mapping people and tags on Mastodon.


In that example, surfacing the connection between a user, @themarkup, and a pair of tags, scotus and section230, was useful in two ways. First, it helped me instantly spot the item that I most wanted to read, which was buried deep in the search results. Second, it helped me discover a source that I’ll return to for guidance on similar topics. Of course I added that source to my Law list!

Own the algorithm

Everyone who comes to Mastodon appreciates not having an adversarial algorithm control what they see in their timelines. Most of us aren’t opposed to algorithmic influence per se, though; we just don’t like the adversarial nature of it. How can we build algorithms that work with us, not against us? We’ve already seen one example: the list-reading dashboard displays just one item per list per person per day. That’s a policy that I was able to define, and easily implement, with Steampipe. And in fact I adjusted it after using it for a while. The original policy was hourly; that was too chatty, so I switched to daily by making a trivial change to the SQL query.

In News in the fediverse I showed another example. The Mastodon server press.coop aggregates feeds from mainstream news sources. I was happy to have those feeds, but I didn’t want to see those news items mixed in with my home timeline. Rather, I wanted to assign them to a News list and read them only when I visit that list in a news-reading mindset. The fediverse offers an opportunity to reboot the social web and gain control of our information diets. Since our diets all differ, it ought to be possible — and even easy — for anyone to turn on a rule like *news only on lists, not timelines*. Steampipe can make it so.

Steampipe as component

When you ask people on Mastodon about these kinds of features, the response is often “Have you tried client X? It offers feature Y.” But that solution doesn’t scale. It would require massive duplication of effort for every client to implement every such policy; meanwhile, people don’t want to switch to client X just for feature Y (which might entail losing feature Z). Could policies be encapsulated and made available to any Mastodon client? It’s interesting to think about Steampipe as a component that delivers that encapsulation. A timeline built by SQL queries, and governed by SQL-defined policies, is a resource available to any app that can connect to Postgres, either locally or in Steampipe Cloud.

If you’re curious about the Steampipe + Mastodon combo, install the plugin, try out the sample queries, then clone the mod and check out the dashboards. Do they usefully augment your Mastodon reader? What would improve them? Can you use these ingredients to invent your own customized Mastodon experience? Join our Slack community and let us know how it goes!



Mastodon timelines for teams

Welcome to episode 17 of the Steampipe+Mastodon series, in which we introduce a new subplot: timeline history. So far, the examples I’ve shown and discussed work with current timelines. We’ve seen SQL queries that fetch results from realtime calls to the Mastodon API, and Steampipe dashboards that display those results. But Steampipe isn’t just an API siphon; it’s also a Postgres database. As such it supports the transient tables created by Steampipe’s foreign data wrapper and plugins, but it also enables you to create your own native tables. And you can use those native tables to accumulate data from the transient foreign tables.

Because saving and searching Mastodon data is a controversial topic in the fediverse — none of us wants to recapitulate Big Social — I’ve focused thus far on queries that explore recent Mastodon flow, of which there are plenty more to write. But nobody should mind me remembering my own home timeline, so a few weeks ago I made a tool to read it hourly and add new toots to a Postgres table.

Before you can add any toots to a table, of course, you’ve got to create that table. Here’s how I made this one.

create table mastodon_home_timeline as 
  select * from mastodon_toot_home limit 200

Once created, the table can be updated with new toots like so.

with data as (
  select
    account,
    -- more
    -- columns
    username
  from
    mastodon_toot_home
  limit 200
)
insert into mastodon_home_timeline (
  account,
    -- more
    -- columns
  username
)
select
  *
from 
  data
where 
  id not in ( select t.id from mastodon_home_timeline t )

To run that query from a crontab, on a machine where Steampipe is installed, save it as mastodon_home_timeline.sql, then schedule it.

15 * * * * cd /home/jon/mastodon; steampipe query mastodon_home_timeline.sql

That’s it! Now the number reported by select count(*) from mastodon_home_timeline is growing hourly.

I’ve only been collecting toots for a couple of weeks, and haven’t yet begun to explore that data; we’ll see what happens when we get there. Meanwhile, though, I want to show how such exploration can be a team exercise.

A friend of mine, whom I’ll call Elvis, shares my interest in teasing out connections among people, servers, and hashtags. He could capture his own timeline using the method shown here. But since we’ll be looking at this data together, we agreed that I’d gather both our timelines. To enable that, he shared a (revocable) Mastodon API token that I’ve used to configure Steampipe with credentials for both our accounts.

connection "mastodon_social_jon" {
    plugin = "mastodon"
    server = "https://mastodon.social"
    access_token = "..."
}

connection "mastodon_social_elvis" {
    plugin = "mastodon"
    server = "https://mastodon.social"
    access_token = "..."
}

Steampipe’s foreign data wrapper turns each of these named connections into its own Postgres schema. Although we happen to share the same home server, by the way, we needn’t. A team collaborating like this could pool timelines from mastodon.social and hachyderm.io and fosstodon.org and any other Mastodon-API-compatible server.

(You can do the same thing with AWS or Slack or GitHub or any other kind of account by defining multiple connections. Steampipe makes API calls concurrently across parallel connections.)

With this configuration I can read my timeline like so.

select * from mastodon_social_jon.mastodon_toot_home limit 200

And Elvis’ like so.

select * from mastodon_social_elvis.mastodon_toot_home limit 200

If I want to query both in realtime, for example to count the combined total, I can use a SQL UNION. Or I can define an umbrella connection that aggregates these two.

connection "all_mastodon" {
    plugin = "mastodon"
    type = "aggregator"
    connections = [ "mastodon_social_jon", "mastodon_social_elvis" ]
}

connection "mastodon_social_jon" {
    plugin = "mastodon"
    server = "https://mastodon.social"
    access_token = "..."
}

connection "mastodon_social_elvis" {
    plugin = "mastodon"
    server = "https://mastodon.social"
    access_token = "..."
}

Now the query select * from all_mastodon.mastodon_toot_home limit 200 makes API calls on behalf of both accounts — in parallel — and combines the results. When we follow the resulting URLs in order to reply or boost, we’ll do so as individual identities. And we’ll be able to use Steampipe queries and dashboards in that same single-user mode. But we’ll also be able to pool our timelines and point our queries and dashboards at the combined history.

Will that prove interesting? Useful? That remains to be seen. I think it’s one of many experiments worth trying as the fediverse sorts itself out. And I see Steampipe as one laboratory in which to run such experiments. With SQL as the abstraction over APIs, aggregation of connections, and dashboards as code, you have all the ingredients needed to iterate rapidly, at low cost, toward shared Mastodon spaces tailored for teams or groups.



Visualizing Mastodon server moderation

In Mapping the wider fediverse I showed how a new table added to the Mastodon plugin — mastodon_domain_block — enables queries that find which servers are moderating which other servers. For example, here are servers on nerdculture.de’s list of offenders.

select
  server,
  domain,
  severity
from
  mastodon_domain_block
where
  server = 'https://nerdculture.de'

+------------------------+--------------------------------+----------+
| server                 | domain                         | severity |
+------------------------+--------------------------------+----------+
| https://nerdculture.de | roysbeer.place                 | silence  |
| https://nerdculture.de | *.10minutepleroma.com          | suspend  |
| https://nerdculture.de | *.activitypub-troll.cf         | suspend  |

...snip...

| https://nerdculture.de | shitposter.club                | suspend  |
| https://nerdculture.de | wolfgirl.bar                   | suspend  |
| https://nerdculture.de | www2.gabbers.me                | suspend  |
+------------------------+--------------------------------+----------+

I used the new query pattern to build a dashboard to report, for each server in the home timeline:

  • The list of blocking servers.
  • The count of blocked servers for each blocking server.
  • The list of blocking servers for each blocked server, and the count of those blocking servers.

This was a good start, but I had a hunch that relationship graphs would reveal non-obvious connections among these servers. And indeed they do! Here’s a view of the new graph-enriched dashboard.

The left pane shows the blocked servers shunned by a blocking server selected from those in the home timeline. The right pane has the inverse view: the blocking servers that shun a selected blocked server. These were the two categories I defined for the first iteration of these graphs.

category "blocking_server" {
  color = "darkgreen"
  icon = "server"
}

category "blocked_server" {
  color = "darkred"
  icon = "server"
}

Here’s the code for one of the graphs.

graph {
  node {
    base = node.blocking_server
  }
  node {
    base = node.blocked_server
  }
  node {
    base = node.blocked_and_blocking_server
  }
  edge {
    args = [ self.input.blocking_server.value ]
    base = edge.match_blocked_server
  }
  edge {
    args = [ self.input.blocking_server.value ]
    base = edge.match_blocking_server
  }
}

Here is the definition of node.blocking_server, which refers to category.blocking_server.

node "blocking_server" {
  category = category.blocking_server
  sql = <<EOQ
    with servers as (
      select distinct
        blocking_server,
        blocked_server
      from
        blocking_servers(${local.limit})
    )
    select
      blocking_server as id,
      blocking_server as title
    from
      servers
    order by
      blocking_server
  EOQ
}

The FROM clause calls blocking_servers(), a set-returning function defined like so.

create or replace function public.blocking_servers(max int) returns table (
  blocking_server text,
  blocked_server text
) as $$
with servers as (
  select distinct
    server as domain,
    'https://' || server as server_url
  from
    mastodon_toot
  where
    timeline = 'home'
  limit max
),
blocking_and_blocked as (
  select
    s.domain as blocking_domain,
    d.domain as blocked_domain
  from
    servers s
  join
    mastodon_domain_block d
  on
    s.server_url = d.server
)
select
  blocking_domain,
  blocked_domain
from
  blocking_and_blocked
order by
  blocking_domain, blocked_domain
$$ language sql

I thought these ingredients would suffice. But when I began poking around in the graphs made with these definitions, infosec.exchange behaved strangely. Sometimes it would appear as a blocking server, other times as a blocked server. I was missing a category!

category "blocked_and_blocking_server" {
  color = "orange"
  icon = "server"
}

As you can see in the graph, infosec.exchange not only blocks 73 servers, it is also blocked by two servers: religion.masto.host and weatherishappening.net. Why?

weatherishappening.net calls infosec.exchange a “HOSTING DEATH SPIRAL FASCIST ORGANIZATION” and blocks it with “Limited” severity.

religion.masto.host blocks infosec.exchange at the “Suspended” level, but does not explain why (“Reason not available”).

Although those servers could, in turn, be blocked by others, no such blocks appear in the block lists of my immediate server neighborhood.

> select count(*) from blocking_servers(100) where blocked_server = 'weatherishappening.net'
+-------+
| count |
+-------+
| 0     |
+-------+

> select count(*) from blocking_servers(100) where blocked_server = 'religion.masto.host'
+-------+
| count |
+-------+
| 0     |
+-------+

There is another switch hitter in my neighborhood, though. c.im is blocked by me.dm and octodon.social.

Why?

me.dm (Medium) has “Suspended” c.im for “Hate speech”.

octodon.social has “Suspended” c.im for “Reason not available”.

When the opinions and policies of your server differ from those of mine, we see different realities through our respective lenses. Could such fragmentation drive the fediverse-curious back into the arms of Big Social? I’m sure that will happen — indeed is happening — to some degree.

But I hope that some of us, at least, will learn to thrive in diverse networks of online communities, aware of the kaleidoscopic interplay of filters but not overwhelmed by it. That skill will serve us well IRL too. To acquire it, we’ll need to visualize the operation of our filters. One great way to do that: SQL queries that drive relationship graphs.



Mapping people and tags on Mastodon

In Mastodon relationship graphs I showed how to use Steampipe to map Mastodon network neighborhoods. When I use the word map here, I’m channeling Denis Wood’s The Power of Maps:

Every map shows this … but not that, and every map shows what it shows this way … but not the other way. Not only is this inescapable but it is precisely because of this interested selectivity — this choice of word or sign or aspect of the world to make a point — that the map is enabled to work.

The aspect chosen by those neighborhood maps is the boost — the Mastodon version of a retweet. One of the maps focuses on a selected instance that appears in the home timeline. It shows people who belong to that instance and who boost toots from people on the same or different instances.

The other map zooms out to show boost relationships among all the instances that appear in the home timeline. This view wouldn’t be legible if it included people, so it omits them in order to focus on server-to-server relationships.

These maps represent (or as Denis Wood emphasizes, “re-present”) a set of toots. They omit original toots to which nobody replies, and they also omit replies, in order to focus on boost relationships. What about replies? That would be a different map, one that might also be interesting to draw.

Meanwhile, though, I’ve come up with another map to display the tags that appear in the results of a Mastodon tag search, along with the accounts that use those tags. It proved its worth today when I was looking for perspectives on Section 230 of the Communications Decency Act. As you’ve likely noticed, the US Supreme Court is reconsidering Section 230. My understanding of the topic wasn’t current; I wanted to refresh it, and I especially wanted to test whether Mastodon could provide a useful alternative to a conventional web search.

One affordance that Mastodon provides: search for toots that use the #Section230 tag. Here are two ways to map the results of that search.

On the left is a conventional Mastodon view: a list of toots that match the query. In this case the article I ultimately wanted to read appears way down in that list. The toot that announced it was from The Markup, “a nonprofit newsroom that investigates how powerful institutions are using technology to change our society.” The article, Section 230 Is a Load-Bearing Wall — Is It Coming Down?, transcribes part of a conversation with two legal scholars whom I know to be reliable guides to Net-related issues.

On the right is my Steampipe-based Mastodon tag explorer. Working with the same data, it surfaced The Markup’s article in a way that brought it immediately to my attention. The first thing that caught my eye was the conjunction of two tags: #section230 and #scotus. Since the Supreme Court’s interest in Section 230 is what’s driving the current news cycle, I wanted to hear from legal scholars qualified to discuss the Court’s interest in Section 230. So the tag conjunction was a significant landmark.

The map displayed two nodes that connect to both #section230 and #scotus. How did I choose between them? My prior familiarity with The Markup led me to click on that node and visit the Markup’s Mastodon instance where I read the article.

Had I been following The Markup then, as I am now, I would likely have seen the article on the news list to which I’ve assigned The Markup’s account. But that wouldn’t have changed the experience of searching for the #section230 tag. The relationship graph works by reformulating the results of that search. It omits the text of toots that contain the tag, and the images in those toots, in order to highlight two aspects of the result list: people (or accounts) and tags. It contextualizes those tags by charting their relative frequency in the result list. And it attaches, to each tag node, a link to a new graph focused on that tag.

This “interested selectivity” enables the map to do its work: find accounts that use given tags. Like a tag node, an account node provides a link — in this case, to the account’s Mastodon home page. It also reports the account’s description using a property that appears when hovering the node. So if I were unfamiliar with The Markup I could reveal its description without leaving the graph. Here’s the query that adds that property to the node.

select
  note
from 
  mastodon_search_account
where 
  query = 'https://mastodon.themarkup.org/@themarkup'

+---------------------+
| note                |
+---------------------+
|  Watching Big Tech. |
+---------------------+

That query is embedded in another query that joins across two Steampipe plugins: one that wraps the Mastodon API and another that queries RSS feeds. That’s because, as noted in Mastodon, Steampipe, and RSS, the RSS feeds that Mastodon provides for tags enrich the results available from the core API.

Enabling SQL to query diverse APIs in a common way is one of Steampipe’s core superpowers. Enabling such queries to form the nodes and edges of relationship graphs is another. Used together, these two superpowers enable maps that select what is helpful, omit what isn’t, and thus “re-present” information for a given purpose.



News in the fediverse

On a recent episode of You’re Wrong About, Sarah Marshall delivered a crisp summary of how the 24-hour news cycle came to be. This led me to realize that many among us do not remember when news was confined to time slots: the 7 o’clock news, the 11 o’clock news, the morning paper. I think it might be healthy to bring that back in some form. From the excitement I heard in their voices I gather that Sarah Marshall and her co-presenter Blair Braverman feel the same way. When I hear people complain about Mastodon’s lack of breaking news, I think: “Feature, not bug!” Maybe what the 24-hour cycle has been breaking is us.

So when press.coop launched today I was a bit conflicted. On the one hand, I would like to use Mastodon to read mainstream news headlines just as I once used RSS to do. (Full disclosure: press.coop is run as a public service by hello.coop whose founder, Dick Hardt, is a longtime friend.) On the other hand, when reading Mastodon timelines I’m enjoying NOT knowing what’s happening in the world right now.

What if you could exclude news from the home timeline, put it on a list, and go there when — and only when — in a news mindset? That’s a feature of the Fedilab client, I learned from Keith Soltys.

What would it take to implement the same idea in the Mastodon reader I’m developing? It couldn’t be just an extra WHERE condition, could it?

Well, actually, it could.

Problem solved. Now I can read news in Mastodon when and how I want, and never see it otherwise.

If you want that same control, you shouldn’t have to use a particular Android client, or Steampipe client, or any other. There’s a Cambrian explosion of such creatures right now. The more they can share important DNA, the better for all of us.

I hope that the Steampipe plugin for Mastodon, which enables the dashboards I’m building using Steampipe’s own dashboards-as-code system, can provide some useful common DNA. A rule like news only on lists, not timelines, once expressed in SQL, can be used (and thus not reinvented) by any kind of Steampipe (read: Postgres) client: psql (or any Postgres CLI), Metabase or Tableau or any Postgres-compatible BI tool, Python or JavaScript or any programming language. Steampipe is a versatile component. Its mapping from APIs to SQL can, in theory, provide the data layer for any application.

My Steampipe + Mastodon adventure notwithstanding, let’s hold on to that larger idea. The fediverse is our chance to reboot the social web and gain control of our information diets. Since our diets all differ, it ought to be trivial for anyone — in any client — to turn on a rule like news only on lists, not timelines. I’ve shown one way it can be trivial. Please show me others!



Protocols, APIs, and conventions

The Fediverse is wired together by protocols like ActivityPub and WebFinger which, as yet, I know very little about. That’s because the Steampipe plugin, which supports the dashboards I’ve been building and describing in this series, doesn’t require me to understand or use those protocols.

It does, however, require me to understand and use the Mastodon API. Mostly I use that API by way of the Go SDK for Mastodon (thanks, mattn!); sometimes I make REST calls directly. Either way, my read-only dashboards use a fairly small subset of the Mastodon API. The full API is quite broad and deep; it enables API clients to read from and write to Mastodon servers in all sorts of ways. Here are the chapters of the Mastodon API book: apps, accounts, admin, instance, search, statuses, timelines, notifications, oembed. These chapters define what’s common to all Mastodon clients, including web apps, phone apps, native OS apps, and Steampipe dashboards.
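For a sense of what the SDK route looks like, here’s a minimal sketch that reads the home timeline with mattn’s go-mastodon package. The server URL and the MASTODON_ACCESS_TOKEN environment variable are placeholder assumptions, not part of the plugin’s code.

package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/mattn/go-mastodon"
)

func main() {
	// Server and token are placeholders; set MASTODON_ACCESS_TOKEN yourself.
	c := mastodon.NewClient(&mastodon.Config{
		Server:      "https://mastodon.social",
		AccessToken: os.Getenv("MASTODON_ACCESS_TOKEN"),
	})
	// Fetch one page of the home timeline; nil pagination means "most recent".
	toots, err := c.GetTimelineHome(context.Background(), nil)
	if err != nil {
		log.Fatal(err)
	}
	for _, t := range toots {
		fmt.Printf("%s: %.80s\n", t.Account.Acct, t.Content)
	}
}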

So far I’ve ignored protocol-enabled interop in order to focus on API-enabled interop. I’m aware that the Fediverse includes much more than just Mastodon. I intend to explore BookWyrm, Friendica, Funkwhale, Lemmy, takahe, PeerTube, Pixelfed, and others in due time. But right now the Mastodon ecosystem is plenty to try to wrap my head around.

For example, there’s a new web client for Mastodon: elk.zone. With the recent addition of support for lists, it has become my favorite way to interact in Mastodon space. So naturally I wanted to be able to click through from Steampipe dashboards to Elk, and use it as an alternative to the batteries-included Mastodon web app.

It turned out to be easy to enable that integration. Not thanks to ActivityPub, and not even thanks to the API. It works thanks to a third level of interop at play: common patterns for account URLs and toot URLs.

Here’s the account URL for Ward Cunningham, who hangs his shingle at mastodon.radio: https://mastodon.radio/@k9ox. But as we saw in instance-qualified Mastodon URLs, if you visit that URL directly — and if it’s not your home server — you can’t follow Ward there, or add him to a list. You’ll need to copy that URL, paste it into your home server’s search box, run the search, and arrive at an instance-qualified URL where you can follow him or add him to a list: https://mastodon.social/@k9ox@mastodon.radio. If your home is fosstodon.org, this would instead be https://fosstodon.org/@k9ox@mastodon.radio.

Similarly, here is one of Ward’s toots at mastodon.radio: https://mastodon.radio/@k9ox/109802968820955379. If you want to reply or boost or favorite, you can’t do it there. The URL you need is again one that routes through your home server: https://mastodon.social/@k9ox@mastodon.radio/109802969999396562. Note that the ids for the same toot differ! That difference surprised me and some others, and is a topic for another episode. Here I’ll just note that these two patterns govern how we interact when crossing server boundaries in Mastodon using the stock web client.

When I started using Elk, another layer of pattern emerged. Here are those same URLs in Elk:

https://elk.zone/mastodon.social/@k9ox@mastodon.radio

https://elk.zone/mastodon.social/@k9ox@mastodon.radio/109802969999396562

As it turns out, I just needed to make two of the Steampipe plugin’s transform functions prepend elk.zone to the instance-qualified URLs, then make such prefixing a configuration option. Now when I visit Mastodon links from dashboards, to reply or boost or follow or enlist, I land in the Elk experience that I prefer.
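
For illustration, the same prefixing can also be sketched on the query side, without touching the plugin. Here’s a minimal example, assuming the instance_qualified_account_url column described elsewhere in this series; the plugin’s transform functions do the equivalent in Go.

select
  'https://elk.zone/' || regexp_replace(instance_qualified_account_url, '^https://', '') as elk_account_url
from
  mastodon_toot
where
  timeline = 'home'
limit 10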

ActivityPub and WebFinger are formal standards. I would describe the Mastodon API as a de-facto standard. But this prefixing maneuver is just a convention. It’s not guaranteed to work with another web client, and not even guaranteed to work across all URLs presented by the stock Mastodon client. That’s OK by me. Conventions are incredibly useful. The Twitter hashtag is just a convention, after all, inspired in turn by an IRC convention.

We’re in one of those Internet moments of rapid innovation, when new conventions can unlock emergent behaviors. It hadn’t even occurred to me that the Steampipe dashboards could support Elk. A few hours after I thought they might, they did. I’ve seen this kind of thing before, perhaps most notably when the blogosphere adopted <link rel="alternate" type="application/rss+xml" href="{feedUrl}"> to enable browsers to auto-discover RSS feeds. That happened about 20 years ago, and quite suddenly when a handful of leading blogging tools adopted the convention in a matter of days. It was a lot of fun to live through that era. If you missed it, enjoy the sequel that’s unfolding now!



Mapping the wider Fediverse

I began this journey convinced that Steampipe could help the Fediverse evolve, but not sure exactly how. My first thought was to use Steampipe’s API-wrangling superpower to study patterns of communication (and conflict) across the Fediverse. But as one of many Twitter escapees last November, I soon realized that the network I was joining reflected a culture that had been humming along nicely for six years and didn’t particularly want to be the object of sociological study.

As I argued in Autonomy, packet size, friction, fanout, and velocity, Mastodon bakes in certain kinds of friction for reasons. You’ve likely heard about a default unfriendliness to search, which is both a technical setting and a cultural choice that privileges the experience of current flow over the mining of past flow. Even more fundamentally, the ID of a toot not only differs from server to server but also obfuscates the toot’s date, another technical/cultural choice that means you can’t randomly access history by date. None of these frictions is insurmountable. They will be overcome for purposes good and bad. I hope and expect that communities will be able to choose their desired amounts and kinds of friction while still interoperating with others. But for my project it seemed that trying to survey the wider Fediverse wasn’t the right place to start.

So instead I began to explore a different way to read my home timeline. The dashboards that I’ve built and described in this series have become, for me at least, an effective way to scan recent Mastodon flow, then click into the stock client to post, reply, or boost. After overcoming a few obstacles, things are starting to feel like the Bloomberg terminal for Mastodon that I envision.

One of those obstacles was the awkward copy/paste/search of foreign toot URLs that was required in order to interact with them. That’s now overcome by Instance-qualified Mastodon URLs. Another obstacle was the difficulty of curating and effectively reading topical lists of people. The strategies described in Lists and people on Mastodon and Working with Mastodon lists have improved matters nicely. And relationship graphs turned out to be a more useful alternate view of the current flow than I had expected.

I think I’ve proven that a set of Steampipe dashboards, layered on a plugin that maps the Mastodon API to tables that the dashboards query, can improve the ability to absorb and react to Mastodon flow. An unproven corollary: Steampipe’s dashboards-as-code system is only one of potentially many clients of the Mastodon plugin. Any dashboarding system or web app or native app could tap into the same query capability to deliver still another way to experience the flow. But that’s a future episode.

Meanwhile, with a decent reading experience in place, this seemed like a good time to circle back to the question of surveying the wider Fediverse. To begin enabling that I added a couple of new tables to the plugin: mastodon_peer and mastodon_domain_block.

Here’s a query that uses mastodon_peer.

with data as (
  select
    'https://' || server as server
  from
    mastodon_toot
  where
    timeline = 'home'
  limit 100
),
servers as (
  select
    server,
    count(*) as occurrences
  from
    data
  group by
    server
)
select
  s.server,
  s.occurrences,
  count(p.peer) as peers
from
  servers s
join
  mastodon_peer p
on
  s.server = p.server
group by
  s.server,
  s.occurrences
order by
  peers desc

In English: gather the most recent 100 toots in my home timeline, count the occurrences of each origin server, then ask each origin server how many other servers it talks to. Unsurprisingly my home server, mastodon.social, is among the most frequent. And because it’s the marquee Mastodon server it has the most peers.

+----------------------------------+-------------+-------+
| server                           | occurrences | peers |
+----------------------------------+-------------+-------+
| https://mastodon.social          | 11          | 49732 |
| https://fosstodon.org            | 1           | 33973 |
| https://octodon.social           | 1           | 29983 |
| https://infosec.exchange         | 2           | 26833 |
| https://indieweb.social          | 9           | 26279 |
| https://hachyderm.io             | 3           | 19911 |
| https://social.treehouse.systems | 3           | 18110 |
| https://journa.host              | 1           | 18021 |
| https://nerdculture.de           | 9           | 17984 |
| https://werd.social              | 2           | 13792 |
| https://dan.mastohon.com         | 2           | 13351 |
| https://masto.nyc                | 1           | 10917 |
| https://mastodon.archive.org     | 1           | 9582  |
| https://social.fossdle.org       | 1           | 8343  |
| https://devdilettante.com        | 12          | 6898  |
+----------------------------------+-------------+-------+

Here’s a query that uses mastodon_domain_block.

with data as (
  select
    'https://' || server as server
  from
    mastodon_toot
  where
    timeline = 'home'
  limit 100
),
servers as (
  select
    server,
    count(*) as occurrences
  from
    data
  group by
    server
)
select
  s.server,
  s.occurrences,
  count(d.domain) as "blocked domains"
from
  servers s
join
  mastodon_domain_block d
on
  s.server = d.server
group by
  s.server,
  s.occurrences
order by
  "blocked domains" desc

This one says: Again gather the origin servers in my recent home timeline, but this time ask each one how many other servers it blocks. Here we see that octodon.social, which happened to show up in my timeline when I ran the query, blocks a lot more servers than mastodon.social does.

+--------------------------+-------------+-----------------+
| server                   | occurrences | blocked domains |
+--------------------------+-------------+-----------------+
| https://octodon.social   | 1           | 510             |
| https://mastodon.social  | 8           | 181             |
| https://hachyderm.io     | 4           | 125             |
| https://infosec.exchange | 4           | 66              |
| https://nerdculture.de   | 1           | 36              |
| https://indieweb.social  | 4           | 23              |
+--------------------------+-------------+-----------------+

One could, and maybe at some point I will, comprehensively acquire and store this data. But meanwhile, how might it enhance the experience of reading recent Mastodon flow? Here’s what I’ve got so far.

We’ve already seen the first table which answers the question: “How many servers does each of the origin servers in my timeline block?” The second table answers a different question: “Which servers are most often blocked by the origin servers in my timeline?”
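
That second view’s query isn’t shown in this post, but here’s a minimal sketch of the kind of query that could answer it, reusing the CTE pattern from above. The column names follow the mastodon_domain_block examples earlier in this post; treat it as an illustration rather than the dashboard’s actual query.

with data as (
  select
    'https://' || server as server
  from
    mastodon_toot
  where
    timeline = 'home'
  limit 100
),
servers as (
  select distinct server from data
)
select
  d.domain as blocked_domain,
  count(*) as occurrences,
  array_to_string(array_agg(s.server order by s.server), ', ') as blocking_servers
from
  servers s
join
  mastodon_domain_block d
on
  s.server = d.server
group by
  d.domain
order by
  occurrences desc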

The blocking server list column shows an interesting mix of consensus and variation, and I think that will be something to explore comprehensively across the Fediverse. But for now I like how this view contextualizes what’s in my home timeline. Like the relationship graphs, it’s highly dynamic because my immediate network neighborhood is changing all the time. Whatever the current set of servers happens to be, though, I now have some clues about how connected each of those servers is and how aggressively each one is blocking others. This feels like a good first step toward mapping the wider Fediverse.



Images considered harmful (sometimes)

The Mastodon dashboards I’ve been developing and describing in this series are backed by a Steampipe plugin that translates SQL queries to Mastodon API calls. Like all Steampipe plugins you can use this one to run those queries in all sorts of ways: from psql or another Postgres CLI (perhaps via cron, perhaps in a CI/CD pipeline); from Metabase or Grafana or any Postgres-compatible BI tool; from Python or JavaScript or any programming language. The Steampipe core is a versatile software component that you can plug into just about any environment.

There’s also, of course, Steampipe’s dashboards as code approach which powers the alternate Mastodon UX that I’ve been exploring in this series. You can think of this dashboarding tool as a primitive web browser with a freakish talent for querying Postgres and weaving SQL results into widgets such as infocards, input controls, charts, tables, and relationship graphs. You compose widgets using HCL (Hashicorp Configuration Language) instead of HTML, arrange them using basic layout syntax, and view them by connecting your browser to the local Steampipe dashboard server or to cloud.steampipe.io.

The alternate Mastodon UX built this way was aptly described (thanks again Greg Wilson!) as A Bloomberg terminal for Mastodon. As a reminder, the stock Mastodon web client looks like this.

And the pages served by the Mastodon dashboards look like this.

I can scan the latter view far more effectively.

Now, I’ve been trying out a bunch of alternate Mastodon clients lately. I love Ivory on my phone, and Elk in the browser, and I use them when I want an experience that feels social-media-ish. But I don’t use Ivory or Elk or Toot! or the stock Mastodon web clients (browser, phone) when I want an experience that feels RSS-reader-ish.

For some information landscapes I want to unfold a big map, spread it out, and scan for points of interest. Social media is that kind of landscape, RSS readers were the first way I surveyed it effectively, and these Mastodon dashboards are becoming my new way.

When I compare those two screenshots though, it’s not just the density of the latter that I notice, but also the absence of pictures. At first I fretted about that. Could the dashboard pages render full-strength HTML? Perhaps they will, and there are a few different ways that could happen, but meanwhile I’ve come to embrace the text-only constraint. When I’ve got my map unfolded, and I’m scanning my home timeline or my lists for things to click into, images can be a distraction. I wouldn’t have chosen to omit them, but I find that their absence enables me to focus very effectively on who is speaking, and what they are saying or boosting.

There is also, of course, the graphical view afforded by relationship graphs. These feel very map-like in the way they reveal clusters of people interacting with one another. I’m finding them more useful than anticipated.

But the text that I read in these dashboards is image-free for now. And I think it’s having a calming effect. There are plenty of times when I want images, don’t get me wrong, and that’s partly why I use a mix of conventional Mastodon clients. But there are times when I want to dial down the clutter, just like there are times when I switch my phone to monochrome. Two things can be true: The pictures we share with one another are a source of joy, and they are sensory overload. I’ll make them optional here when I can, and would like to have the option to use all social media interfaces in text mode.



Working with Mastodon lists

Since the early days of the blogosphere I have cherished the ability to view the world through the eyes of people more qualified than me to understand and explain what happens in particular domains. Although Twitter lists were a great way to collect people who provide those perspectives, I made little use of them. Prompted by Martin Fowler’s frustration with lists, I described my first steps toward reviving the technique in Lists and people on Mastodon.

First I encapsulated the relevant APIs in a pair of tables provided by the Steampipe plugin for Mastodon: mastodon_list and mastodon_list_account. Then I used those tables to enhance the Mastodon dashboard in a couple of ways. The followers and following tabs now show me who is or isn’t on a list.

And I can use the list tab to read recent posts from people on each list.

With these basic building blocks in place I want to be more intentional about curating these lists. To that end I thought I’d share the lists I’ve built so far, and invite suggestions. You can export your lists from the Mastodon web client using Preferences -> Data export -> Lists. The resulting CSV file has two columns: the name of a list and the Mastodon handle for a person.

list person
Library liza@post.lurk.org
Library alexwade@fosstodon.org
Library jdiffin@glammr.us

Here’s the export I really wanted.

list person server url note
Library liza post.lurk.org https://mastodon.social/@liza@post.lurk.org Technologist in Residence at the Harvard Library Innovation Lab. Interactive fiction, generative art, Democratic political tech. Amateur scholar of fringe utopian sci-fi. I will pet your dog.
Library alexwade fosstodon.org https://mastodon.social/@alexwade@fosstodon.org Librarian, open science advocate, geek. VP Data Products, @digitalscience (Seattle, WA) Formerly: AllenAI (AI2), CZI, Amazon, Microsoft Research, UW
Library jdiffin glammr.us https://mastodon.social/@jdiffin@glammr.us Head of Library Technology Services Section, NLM. Opinions are my own.

It’s easy to create that list using the Steampipe plugin for Mastodon.

steampipe query "select title as list, username, server, instance_qualified_account_url, note from mastodon_list l join mastodon_list_account a on l.id = a.list_id order by list, username" --output csv > mastodon_lists.csv

(I’m using instance-qualified URLs relative to my home server; if your home is elsewhere than mastodon.social, you’ll want to adjust the links accordingly.)

I’ve uploaded the output of that query to a Google sheet. The most recent addition to my climate list is Peter Gleick, who was quoted yesterday in Farhad Manjoo’s What Will ‘Weather Whiplash’ Mean for California? I’d been looking for an expert perspective on California water policy, so I checked to see if Peter is on Mastodon, found that he is, and added him to my climate list. Here’s an example of the kind of perspective I’m looking for.

California reservoirs have captured enough water from the recent storms to supply all urban water use in the state for a year. Not counting the massive snowpack. Those claims that “all the water is being wasted flowing to the ocean” is nonsense. — Mastodon link

Of course it’s a complex issue, and I don’t think I’m wrong to be very interested in strategies for capturing more rainfall and using it to recharge aquifers. But this was a useful reality check.

Who are the experts you follow in various domains? Can I cherrypick from your lists? And can we imagine a way of remixing and sharing these lists? I’m looking at you, Lucas Gonze, with fond memories of how Webjay did that for MP3 playlists. Can we do that for Mastodon lists too?



Of course the attention economy is threatened by the Fediverse

Megan McArdle says this week, in the Washington Post, that “Twitter might be replaced, but not by Mastodon or other imitators.” I’m not linking to the article, you can easily find it, but that title is all we need for my purpose here, along with this bit of context: she has 93K followers on Twitter.

Nobody wants to walk away from that kind of audience. Well, almost nobody. Sam Harris’ recent Twitter exit is a rare example of someone concluding that a large follower count is a net negative. If I were in his shoes I’m not sure I’d be able to do the same. When my own audience was at its peak — at BYTE during the dawn of the Internet, then at InfoWorld in the early years of the blogosphere — I could press the Publish button on my blog and watch in realtime as the responses rolled in on waves of dopamine. It’s addictive, there’s never enough, you’re always looking for the next hit.

When Twitter started, that momentum carried forward for a while. I never racked up a huge follower count — it maxed out just shy of 6K — but most of those people followed me early on, thanks to the ad-supported publications that had brought me to their attention. My Twitter following reached a plateau years ago. Did I wish for 100K followers? Sure, I’d be lying to pretend otherwise. But gradually I came to see that there was a sweet spot, somewhere between (let’s say) 200 and 15,000 followers, where it was possible to enjoy the kinds of pleasant and stimulating interaction that I’d first experienced in web forums and the blogosphere.

Until it wasn’t. Like a frog in slowly boiling water, I failed to notice how the Twitter experience degraded over time. Fewer and fewer of my 6K followers corresponded regularly, and my social graph there became stagnant. For me the Mastodon reboot has been a delightful replay of the early blogosphere: new acquaintances, collegial discussion, positive energy.

If you occupy a privileged position in the attention economy, as Megan McArdle does now, and as I once did in a more limited way, then no, you won’t see Mastodon as a viable replacement for Twitter. If I were still a quasi-famous columnist I probably wouldn’t either. But I’m no longer employed in the attention economy. I just want to hang out online with people whose words and pictures and ideas intrigue and inspire and delight me, and who might feel similarly about my words and pictures and ideas. There are thousands of such people in the world, not millions. We want to congregate in different online spaces for different reasons. Now we can and I couldn’t be happier. When people say it can’t work, consider why, and who benefits from it not working.

Here’s a graph of the Fediverse as it appears from my perspective right now.

It looks and feels healthy and it’s working just great. I don’t want us to replace Twitter, or imitate it. I want The Internet Transition that I hope is underway.

Mastodon relationship graphs

The new release of Steampipe is all about relationship graphs. Our blog post shows how these graphs provide contextual awareness for devops and security pros who can now see all the resources related to an EC2 instance, or determine at a glance whether the permissions related to an IAM role are properly scoped. As always, developers can explore and remix the code that builds these graphs, and adapt the idioms for their own purposes in any data domain.

These relationship graphs are driven by SQL queries that define nodes and edges. Such queries can use any column of any table provided by any Steampipe plugin to form nodes, and then edges between nodes. If you want to see connections among the people and objects represented by diverse APIs, you can now use SQL idioms to graph them. The only limit is your imagination.

Naturally I imagined graphing Mastodon relationships. So far I’ve built two graphs that visualize my home timeline. Here’s the first one.

Here we’re looking at the most recent 50 boosts (the Mastodon version of retweet) in my home timeline. This is the query to find them.

select
  *
from
  mastodon_toot
where
  timeline = 'home'
  and reblog_server is not null
  limit 50

If we focus on Brian Marick we can see that:

  • Brian belongs to mstdn.social
  • Brian boosted a post by Tim Bray
  • Tim belongs to hachyderm.io

So this graph shows people on a selected server boosting people on other servers. In this case mstdn.social is the selected server, but we can refocus the graph on any other server that’s sending boosts.

The second graph zooms out to show the web of boost relationships among servers. If anyone on infosec.exchange boosts anyone on mastodon.world, there’s an edge connecting the two nodes. Although it’s not happening anywhere in this graph, the arrow can point both ways and would if anyone on mastodon.world were also boosting anyone on infosec.exchange.

Let’s build up the first graph step by step.

Step 1: Identify the selected server

Here’s the definition of the node that represents the selected server.

node {
  category = category.selected_server
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      server as id,
      server as title,
      jsonb_build_object(
        'server', server
      ) as properties
    from
      mastodon_boosts()
    where
      server = $1
  EOQ
}

Per the documentation, a node’s query must at least select a column aliased as id. Here it’s the server column in a row returned by the above query. I’ve packaged that query into a SQL function, mastodon_boosts, to hide details (timeline = 'home' and reblog_server is not null limit 50) and make it easier to focus on what’s special about each node. In this case the special quality is that the server column, which gives the node its identity, matches the selected server.

If the graph block includes only this node, and mstdn.social is the selected server, here is the rendering. Not much to see here yet!

The node defines a bag of properties that can be any of the columns returned by the underlying query; these appear when you hover the node. The node also refers to a category that governs the node’s icon, color, and link. Here’s the category for the selected server.

category "selected_server" {
  color = "darkgreen"
  icon = "server"
  href  = "https://{{.properties.'server'}}"
}

Step 2: Identify boosted servers

Now we’ll add boosted servers. This node uses the same set of records: the 50 most recent boosts in my feed. Again it finds just those whose server column matches the selected server. But the id is now the reblog_server which is the target, instead of the origin, of boosts from the selected server.

node {
  category = category.boosted_server
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      reblog_server as id,
      reblog_server as title
    from
      mastodon_boosts()
    where
      server = $1
    EOQ
}

Here’s the graph with both selected_server and boosted_server nodes. We’ve used another category to differentiate the boosted nodes.

There’s only one selected server but it can send boosts to more than one boosted server. The default rendering folds them into one node but you can click to unfold and see all of them.

Step 3: Identify people who boost others

Where are the people? Let’s add them next, starting with the people who are sending boosts.

node {
  category = category.person
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      username as id,
      display_name as title,
      jsonb_build_object(
        'instance_qualified_account_url', instance_qualified_account_url
      ) as properties
    from
      mastodon_boosts()
    where
      server = $1
  EOQ
}

The username column gives the node its identity. Note also the property instance_qualified_account_url. That’s the synthetic column we added to the Mastodon plugin last time to ensure that links to people and toots will work properly in the Mastodon client. Because it’s included in a property here, and because category.person refers to that property, links representing people in the graph will resolve properly.

Step 4: Identify people who are boosted

This node takes its identity from the reblog_username column, and uses the synthetic column instance_qualified_reblog_url to provide the link.

node {
  category = category.boosted_person
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      reblog_username as id,
      reblog_username as title,
      jsonb_build_object(
        'instance_qualified_reblog_url', instance_qualified_reblog_url
      ) as properties
    from
      mastodon_boosts()
    where
      server = $1
  EOQ
}

Step 5: Connect boosters on the selected server to that server

So far we’ve seen only nodes, whose queries minimally return the id property. An edge connects nodes by way of a query that minimally returns columns aliased to from_id and to_id.

edge {
  sql = <<EOQ
    select
      username as from_id,
      server as to_id,
      'belongs to' as title
    from
      mastodon_boosts()
  EOQ
}

You’ll also want to provide a title to label the edge. Here this edge occurs twice to represent “John Mashey belongs to mstdn.social” and “Brian Marick belongs to mstdn.social.”

Step 6: Connect people on boosted servers to their servers

This edge works the same way, but captures the relationship between boosted people and their servers.

edge {
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      reblog_username as from_id,
      reblog_server as to_id,
      'belongs to' as title
    from
      mastodon_boosts()
    where
      server = $1
  EOQ
}

Step 7: Connect boosters to the people they boost

Finally we add an edge to connect boosters to the people they boost.

edge {
  category = category.boost
  args = [ self.input.server.value ]
  sql = <<EOQ
    select
      username as from_id,
      reblog_username as to_id,
      'boosts' as title,
      jsonb_build_object(
        'reblog_username', reblog_username,
        'reblog_server', reblog_server,
        'content', reblog ->> 'content'
      ) as properties
    from
      mastodon_boosts()
    where
      server = $1
  EOQ
}

And now we’ve completed the first graph shown above.

Graphing GitHub relationships

You can use this grammar of nodes and edges to describe relationships in any domain. Here’s a graph that looks across all the Steampipe-related repos and shows recently-updated PRs from external contributors.

And here’s one that uses any Steampipe plugin to show recently-updated pull requests for a selected repo.

These two views share a common SQL query and serve complementary purposes. The table is handy for sorting by date or author, the graph highlights one-to-many relationships.

Lifting the burden of context assembly

In What TimeDance got right I mourned the passing of a tool for scheduling meetings that had excelled at bringing together the messages and documents related to a meeting. I called this “context assembly” — a term I’d picked up from Jack Ozzie, cofounder of Groove, another collaboration tool whose passing I mourn. Context assembly is hard work. Too often the burden falls on people who only need to use that context and would rather not spend time and effort creating it.

We’ve seen how SQL can unify access to APIs. Now it can also help us see relationships among the data we extract from those APIs.



Instance-qualified Mastodon URLs

In Lists and people on Mastodon I showed how I added a list column to the following tab of the Mastodon browser I’m building. That was a step in the direction of easier and more powerful list management. It enables me to see whether the people I follow are assigned to lists, and to consider who should be on a list (or perhaps on a different list).

Today, as I began to use that new affordance in earnest, I discovered a new challenge. In order to assign someone to a list, or change a list assignment, I clicked the link in the account_url column to open that person’s profile in the Mastodon web app. That was fine for accounts on my home server, mastodon.social. An account URL like Shelley Powers’ https://mastodon.social/@burningbird brings me to Shelley’s profile on my home server where the list manager is available.

But if I’m following someone elsewhere, like Ward Cunningham at https://mastodon.radio/@k9ox, the account URL brings me to Ward’s profile on that server where the list manager isn’t available. In order to assign Ward to a list I had to capture his account URL, paste it into the search box in my home server’s web app, and then click the resulting link: https://mastodon.social/@k9ox@mastodon.radio.

That got old real fast, so I adjusted the following tab to display the latter flavor of URL which I’ll call an instance-qualified URL.

Steampipe provides a few ways to make that adjustment. As a user of the dashboard, you can use Postgres’ regular expression functions to do the transformation in the SQL query that drives the view. But you’d rather not have to. It’s much nicer if the plugin does that for you, so the SQL can just refer to a column called instance_qualified_url.
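
For illustration, here’s a minimal sketch of that query-side approach, assuming account URLs follow the https://server/@person pattern shown earlier and that mastodon.social is the home server. (Accounts already on the home server would still need their trailing @mastodon.social stripped, which is what the plugin-side transform shown below takes care of.)

select
  url,
  'https://mastodon.social/@' || regexp_replace(url, '^https://([^/]+)/@(.+)$', '\2@\1') as instance_qualified_url
from
  mastodon_following
limit 10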

I chose the latter approach. As the author of a Steampipe plugin you want to make life as easy as possible for users of the plugin. When you’re the author of both the plugin and the dashboard, as I am in this case, you can enjoy a nice virtuous cycle. As the dashboard evolves you discover ways to improve the plugin, which leads to more use of the dashboard, which suggests further opportunities to improve the plugin. I have been greatly enjoying the coevolution of these two components!

Adding a new column to a Steampipe table

To make the change, I extended the structure that defines the columns of the tables mapped from Mastodon Account API. A Steampipe plugin defines columns using a list of structs like this.

...,
{
	Name:        "url",
	Type:        proto.ColumnType_STRING,
	Description: "URL for the account.",
},
...,

That struct says: “When the name of a top-level field in the API response is url, tell Steampipe to make a database column with that name and with the Postgres type text.”

You can also transform values in API responses to synthesize new columns that don’t appear in API responses. Here’s the struct I added for this case.

...,
{	
	Name:        "instance_qualified_account_url",
	Type:        proto.ColumnType_STRING,
	Description: "Account URL prefixed with my instance.",
	Transform:   transform.FromValue().Transform(instanceQualifiedAccountUrl),
},
...

That one says: “Send the API response to the transform function instanceQualifiedAccountUrl, and use its result as the value of the column.”

Here’s the function.

func instanceQualifiedAccountUrl(ctx context.Context, input *transform.TransformData) (interface{}, error) {
	url := input.Value.(*mastodon.Status).Account.URL
	qualifiedUrl := qualifiedUrl(ctx, url)
	return qualifiedUrl, nil
}

It delegates the real work to another function.

func qualifiedUrl(ctx context.Context, url string) string {
	plugin.Logger(ctx).Debug("instanceQualifiedUrl", "server", homeServer, "url", url)
	re := regexp.MustCompile(`https://([^/]+)/@(.+)`)
	matches := re.FindStringSubmatch(url)
	if len(matches) == 0 {
		return url
	}
	server := matches[1]
	person := matches[2]
	qualifiedUrl := fmt.Sprintf("%s/@%s@%s", homeServer, person, server)
	plugin.Logger(ctx).Debug("instanceQualifiedUrl", "qualifiedUrl", qualifiedUrl)
	schemelessHomeServer := strings.ReplaceAll(homeServer, "https://", "")
	qualifiedUrl = strings.ReplaceAll(qualifiedUrl, "@"+schemelessHomeServer, "")
	plugin.Logger(ctx).Debug("qualifiedUrl", "qualifiedUrl", qualifiedUrl)
	return qualifiedUrl
}

Why? Two different sets of column definitions need the same transformation. instanceQualifiedAccountUrl works with responses from the Account API. But account URLs also appear in the Status API that drives timeline views. Those use a different transform function, instanceQualifiedStatusUrl, to do the same transformation for a different API response.

From account URLs to status URLs

The instance_qualified_account_url column solved the original problem. I was able to remove my plugin-author hat, put on my dashboard-author hat, and refer to account URLs as instance-qualified URLs in all the tabs that display them. Any such link now leads to a profile that I view through the lens of mastodon.social and that enables me to use the web app’s list manager directly, without the cumbersome copy/paste/search procedure.

My happy dance didn’t last long, though. Newly sensitized to that copy/paste/search friction, I realized it was still happening when I try to reply to items that appear in a timeline view. Here is a recent example: https://techpolicy.social/@mnot/109610641523489182.

That’s the URL displayed in the dashboard. When I click it I land on Mark’s server and can view the item, but if I try to reply I’m confronted with the dreaded copy/paste/search operation.

No problem! I’ll use a similar transform! Not so fast. I can form a URL like https://mastodon.social/@mnot@techpolicy.social/109610641523489182 but it doesn’t go anywhere.

If I do the copy/paste/search operation, I land on a similar-but-different URL: https://mastodon.social/@mnot@techpolicy.social/109610641692667630. It has the same structure but a different toot ID. This URL is also the one that appears in the web app’s home timeline, which is why I can reply directly from that view.

I’m out of my depth here so I’ll just end with an appeal for help. It makes sense that a home server will assign its own id to an item fetched from a foreign server, and that the web app will use that id. But I’m not seeing a way to acquire that id directly from the API. I suspect it’s possible to acquire it by way of search, but doing that for every item in a timeline will quickly exhaust the tight budget for API requests (just 300 every 5 minutes).

So, Lazy Mastodon, am I just stuck here or is there a way to transform foreign status URLs into instance-relative status URLs?

Update: Solved!

After chatting with Jari Pennanen I took another look and realized the needed ID was available in the API response after all; I just wasn’t using it (facepalm). And in fact there are two flavors of the ID — one for original toots, another for boosts. Columns for both cases are added here and the tweak to make the dashboard use them here.

Here is the result.

Thanks for being my rubber duck, Jari! The instance-qualified toot and reblog URLs make this dashboard massively more useful.



How many people in my Mastodon feed also tweeted today?

I last tweeted on Dec 22. (It was, unsurprisingly, a link to a blog post about Mastodon.) Today I wondered what percentage of the people who appear in my Mastodon timeline today also appeared on Twitter today.

To start, I wrote this query which tries to match Twitter and Mastodon usernames. When it finds a match, it reports the day on which that person last tweeted.

with mastodon as (
  select
    substring(username from 1 for 15) as username, -- twitter names are max 15 chars
    'from:' || substring(username from 1 for 15) as query -- we will query twitter using, e.g., 'from:judell'
  from
    mastodon_toot
  where
    timeline = 'home'
  limit
    500
)
select
  m.username as mastodon_person,
  t.author->>'username' as twitter_person,
  max(to_char(t.created_at, 'YYYY-MM-DD')) as last_tweet_day
from 
  mastodon m
left join
  twitter_search_recent t -- see https://hub.steampipe.io/plugins/turbot/twitter/tables/twitter_search_recent
on
  t.query = m.query
group by
  mastodon_person,
  twitter_person
order by
  last_tweet_day desc

This is my favorite kind of Steampipe query: two different APIs, each represented as a Postgres table, combined with a SQL JOIN.

The result looks like this, with nulls for failed matches.

+-----------------+-----------------+----------------+
| mastodon_person | twitter_person  | last_tweet_day |
+-----------------+-----------------+----------------+
| AlanSill        | null            | null           |
| Colarusso       | null            | null           |
| ...                                                |
| williamgunn     | null            | null           |
| xian            | null            | null           |
| ...                                                |
| futurebird      | futurebird      | 2022-12-29     |
| glynmoody       | glynmoody       | 2022-12-29     |
| ...                                                |
| khinsen         | khinsen         | 2022-12-23     |
| blaine          | blaine          | 2022-12-23     |
+-----------------+-----------------+----------------+

Next I created a table from the above query.

create table public.mastodon_twitter as 
  -- sql as above

And then ran this query.

select
  last_tweet_day,
  count(*)
from
  mastodon_twitter
where
  last_tweet_day is not null
group by
  last_tweet_day
order by
  last_tweet_day desc

Here’s the result.

+----------------+-------+
| last_tweet_day | count |
+----------------+-------+
| 2022-12-29     | 36    |
| 2022-12-28     | 6     |
| 2022-12-27     | 1     |
| 2022-12-26     | 1     |
| 2022-12-25     | 2     |
| 2022-12-23     | 2     |
+----------------+-------+

The 500 toots represented here were created by 93 people who tooted today.

select count(*) from mastodon_twitter

+-------+
| count |
+-------+
| 93    |
+-------+

Of those 93 people, 48 have matching usernames.

select count(*) from mastodon_twitter where last_tweet_day is not null

+-------+
| count |
+-------+
| 48    |
+-------+

Of the 48 with matching usernames, 36 also tweeted today.

So there’s my answer: of the people who appeared in my Mastodon home timeline (when I sampled it just now) whose usernames I could match on Twitter, 75% also appeared on Twitter today.



My belated introduction to Super Mario Brothers

On a hike today my friend noticed a clump of Amanita muscaria and took away this glorious specimen. As we headed down the trail he said: “I can never see one of these without hearing the music from Super Mario Brothers — beep, boop.” He went on to explain that the spotted red mushroom was a coveted power-up that made Mario grow large.

I knew nothing about this. I have never played Super Mario Brothers, or any other iconic game, or really any computer-based game at all, except briefly in the early 1990s when I played Rogue during an office craze that lasted a few weeks, and around the same time played Putt-Putt with our young kids.

This seems unusual for someone like me. I have spent my adult life deeply engrossed in computer-based activity, and am often described to others, by friends and family, as a “computer guy.” That makes me a geek and/or nerd by definition. But I’ve never been comfortable with either of those terms.

Over the years I’ve known a great many others in the geek/nerd tribe who self-identify as gamers. It always puzzles me. I’m perfectly happy to spend long hours in front of a screen reading or writing or coding. But games? I simply lack that gene. It baffles me how other members of the tribe — especially adults who like me already put in long hours in front of screens — could then pile up more screen time playing games, when they could be outside hiking, or cycling, or gardening, or doing really anything else at all.

Perhaps this explains my reluctance to self-identify as a member of the geek/nerd tribe. If I really belonged I’d be a gamer. Since I’m not I must be an impostor. I wonder if there’s a word — in some other language than English probably — for when you share certain core attributes with a tribe but not others?

Lists and people on Mastodon

I hadn’t thought to use Mastodon lists until I read the Frustration with lists chapter of Martin Fowler’s Exploring Mastodon, in which he writes:

I like lists because they allow me to divide up my timeline to topics I want to read about at different times. They are frustrating because the tools to manage them in Twitter are very limited, so it’s more hassle to set up the kind of environment I’d like. Mastodon also has lists, sadly its current management tools are equally bad.

This seemed like a good challenge for Steampipe. To tackle it, I first needed to add some new tables to the plugin to encapsulate the list APIs: mastodon_list and mastodon_list_account. I’ll save that story for another time. Here I’ll just show that together they enable queries like this.

select
  l.title as list,
  array_agg(a.username order by a.username) as people
from
  mastodon_list l
join
  mastodon_list_account a
on
  l.id = a.list_id
group by
  l.title
+--------------+--------------------------------------+
| list         | people                               |               
+--------------+--------------------------------------+
| Academic     | ____, ______, ____, ___              |
| Education    | ___, ______  ___, ______             |
| Energy       | ___, ______, ____ __                 |
| Fediverse    | ____ __,                             |
| Humor        | ____, ____ __, ____ __               |
| Journalism   | ___ __, ___ ____,  ___, ______       |
| Library      | __                                   |
| Net          | ___ __, _____, ___ __, __ __, ____   |
| Science      | __, ____ __, ______                  |
| Software     | ____ __, ______, ____ __             |
+--------------+--------------------------------------+

That’s a useful view, and I’ve now included it, but it didn’t address Martin’s specific need.

To manage these lists, I really need a display that shows every account that I follow in a table with its lists. That way I can easily see which list each account is on, and spot any accounts that aren’t on a list.

For that I needed to add a list column to the Following tab.

This was the original query.

select
  url,
  case when display_name = '' then username else display_name end as person,
  to_char(created_at, 'YYYY-MM-DD') as since,
  followers_count as followers,
  following_count as following,
  statuses_count as toots,
  note
from
  mastodon_following
order by
  person

The new version captures the above join of mastodon_list and mastodon_list_account, and joins that to the mastodon_following (people I follow) table. It’s a left join which means I’ll always get all the people I follow. If you’re not on a list, your list column will be null.

with data as (
  select
    l.title as list,
    a.*
  from
    mastodon_list l
  join
    mastodon_list_account a
  on
    l.id = a.list_id
),
combined as (
  select
    d.list,
    f.url,
    case when f.display_name = '' then f.username else f.display_name end as person,
    to_char(f.created_at, 'YYYY-MM-DD') as since,
    f.followers_count as followers,
    f.following_count as following,
    f.statuses_count as toots,
    f.note
  from
    mastodon_following f
  left join
    data d
  on
    f.id = d.id
)
select
  *
from
  combined
order by
  person

That query drives the new version of the Following tab.

It’s pretty sparse; I’ve only just begun adding people to lists. And honestly I’m not sure I’ll want to keep doing this curation: it’s the kind of thing that can become a burden, so I need to play around some more before I commit. Meanwhile, the default sort puts unlisted people first so they’re easy to find.

To provide a better way to find people who are on lists, I expanded the List tab in a couple of ways. It had included a dropdown of lists by which to filter the home timeline. Now that dropdown has counts of people on each list.

input "list" {
  type = "select"
  width = 2
  sql = <<EOQ
    with list_account as (
      select
        l.title
      from
        mastodon_list l
      join
          mastodon_list_account a
      on
        l.id = a.list_id
    ),
    counted as (
      select
        title,
        count(*)
      from
        list_account
      group by
        title
      order by
        title
    )
    select
      title || ' (' || count || ')' as label,
      title as value
    from
      counted
    order by
      title
  EOQ
}

I also used this query to expand the List tab.

select
  l.title as list,
  array_to_string( array_agg( lower(a.username) order by lower(a.username)), ', ') as people
from
  mastodon_list l
join
  mastodon_list_account a
on
  l.id = a.list_id
group by
  l.title

The result is the list / people table on the right.

I know that some won’t cotton to this SQL-forward programming model. But for others who will, I wanted to show a few detailed examples to give you a sense of what’s possible at the intersection of Mastodon and Steampipe.

If you’re not tuned into SQL (like I wasn’t for a very long time), here’s your takeaway: as SQL goes, this stuff is not too scary. Yes there are joins, yes there’s an array_agg that transposes a column into a list. It’s not beginner SQL. But lots of people know how to use join and array_agg in these ways, lots more could easily learn how, and with SQL ascendant nowadays these are skills worth having.


1 https://blog.jonudell.net/2022/11/28/autonomy-packet-size-friction-fanout-and-velocity/
2 https://blog.jonudell.net/2022/12/06/mastodon-steampipe-and-rss/
3 https://blog.jonudell.net/2022/12/10/browsing-the-fediverse/
4 https://blog.jonudell.net/2022/12/17/a-bloomberg-terminal-for-mastodon/
5 https://blog.jonudell.net/2022/12/19/create-your-own-mastodon-ux/
6 https://blog.jonudell.net/2022/12/22/lists-and-people-on-mastodon/
7 https://blog.jonudell.net/2022/12/29/how-many-people-in-my-mastodon-feed-also-tweeted-today/
8 https://blog.jonudell.net/2022/12/31/instance-qualified-mastodon-urls/
9 https://blog.jonudell.net/2023/01/16/mastodon-relationship-graphs/
10 https://blog.jonudell.net/2023/01/21/working-with-mastodon-lists/
11 https://blog.jonudell.net/2023/01/26/images-considered-harmful-sometimes/
12 https://blog.jonudell.net/2023/02/02/mapping-the-wider-fediverse/
13 https://blog.jonudell.net/2023/02/06/protocols-apis-and-conventions/
14 https://blog.jonudell.net/2023/02/14/news-in-the-fediverse/
15 https://blog.jonudell.net/2023/02/26/mapping-people-and-tags-on-mastodon/
16 https://blog.jonudell.net/2023/03/07/visualizing-mastodon-server-moderation/
17 https://blog.jonudell.net/2023/03/14/mastodon-timelines-for-teams/
18 https://blog.jonudell.net/2023/04/03/the-mastodon-plugin-is-now-available-on-the-steampipe-hub/
19 https://blog.jonudell.net/2023/04/11/migrating-mastodon-lists/
20 https://blog.jonudell.net/2023/05/24/when-the-rubber-duck-talks-back/

Create your own Mastodon UX

I’ve been discussing Mastodon UX wishlists with some new acquaintances there. This excerpt from A Bloomberg terminal for Mastodon concludes with part of my own wishlist.

In a Mastodon timeline, a chatty person can dominate what you see at a glance. When we participate in social media we are always making bids for one another’s attention. As publishers of feeds it’s wise to consider how a flurry of items can overwhelm a reader’s experience. But it’s also useful to consider ways that feed readers can filter a chatty source. Steampipe’s SQL foundation affords an easy and natural way to do that. Here’s part of the query that drives the list view.

select distinct on (list, person, hour) -- only one per list/user/hour
  person,
  url,
  hour,
  toot
from
  data
order by
  hour desc, list, person

It was easy to implement a rule that limits each person to at most one toot per hour. Next steps here will be to apply this rule to other views, show the number of collapsed toots, and enable such rules on a per-person basis.
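
For the collapsed-toot count, one approach would be a window function over the same data CTE used above. Here’s a sketch of the idea, not code from the dashboards, just an illustration:

with ranked as (
  select
    list,
    person,
    url,
    hour,
    toot,
    count(*) over (partition by list, person, hour) as toots_this_hour
  from
    data
)
select distinct on (list, person, hour)
  person,
  url,
  hour,
  toot,
  toots_this_hour - 1 as collapsed
from
  ranked
order by
  list, person, hour desc

Each row would then carry the number of additional toots hidden for that person in that hour.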

As a warmup exercise, I decided to first add a simple control for boosts that enables me to see my home timeline with or without boosts. To give technically-inclined readers a sense of what’s involved in doing this kind of thing with Steampipe, I’ll describe the changes here. I’m obviously biased but I find this programming environment to be accessible and productive. If it seems that way to you as well, you might want to try out some of the items on your own UX wishlist. And if you do, let me know how it goes!

Here are the original versions of the two files that I changed to add the new feature. First there’s home.sp which defines the dashboard for the home timeline.

dashboard "Home" {
  
  tags = {
    service = "Mastodon"
  }

  container {
    // a text widget with the HTML links that define the menu of dashboards
  }

  container {
    text {
    // a block that displays the HTML links that form a menu of dashboards
    }

    card {
    // a block that reports the name of my server
    }

    input "limit" {
      width = 2
      title = "limit"
      sql = <<EOQ
        with limits(label) as (
          values 
            ( '50' ),
            ( '100' ),
            ( '200' ),
            ( '500' )
        )
        select
          label,
          label::int as value
        from 
          limits
      EOQ
    }    
  }


  container { 

    table {
      title = "home: recent toots"
      query = query.timeline
      args = [ "home", self.input.limit ]
      column "person" {
        wrap = "all"
      }
      column "toot" {
        wrap = "all"
      }
      column "url" {
        wrap = "all"
      }
    }
  }

}

And here’s the new version. It adds an input block called boosts, and passes its value to the referenced query.

dashboard "Home" {
  
  tags = {
    service = "Mastodon"
  }

  container {
    // a text widget with the HTML links that define the menu of dashboards
  }

  container {
    text {
    // a block that displays the HTML links that form a menu of dashboards
    }

    card {
    // a block that reports the name of my server
    }

    input "limit" {
    // as above
    }

    input "boosts" {
      width = 2
      title = "boosts"
      sql = <<EOQ
        with boosts(label, value) as (
          values
            ( 'include', 'include' ),
            ( 'hide', ' ' ),
            ( 'only', ' 🢁 ' )
        )
        select
          label,
          value
        from
          boosts
      EOQ
    }

  }

  container { 

    table {
      // as above
      args = [ "home", self.input.limit, self.input.boosts ]
    }
  }

}

Steampipe dashboards are built with two languages. HCL (HashiCorp Configuration Language) defines the UX widgets, and SQL fills them with data. In this case we’re selecting static values for the boosts input. But any Steampipe query can run there! For example, here is the input block I use on the dashboard that filters the timeline by the list to which I’ve assigned people.

input "list" {
  type = "select"
  width = 2
  title = "search home timeline"
  sql = <<EOQ
    select
      title as label,
      title as value
    from
      mastodon_list
    order by
      title
  EOQ
}

Now here is the referenced query, query.timeline, from the file query.sp which contains queries used by all the dashboards.

query "timeline" {
  sql = <<EOQ
    with toots as (
      select
        account_url as account,
        case 
          when display_name = '' then user_name 
          else display_name
        end as person,
        case
          when reblog -> 'url' is null then
            content
          else
            reblog_content
        end as toot,
        to_char(created_at, 'MM-DD HH24:MI') as created_at,
        case
          when reblog -> 'url' is not null then '🢁'
          else ''
        end as boosted,
        case
          when in_reply_to_account_id is not null then ' 🢂 ' || ( select acct from mastodon_account where id = in_reply_to_account_id )
          else ''
        end as in_reply_to,
        case
          when reblog -> 'url' is not null then reblog ->> 'url'
          else url
        end as url
      from
        mastodon_toot
      where
        timeline = $1
      limit $2
    )
    select
      account,
      person || 
        case 
          when in_reply_to is null then ''
          else in_reply_to
        end as person,
      boosted || ' ' || toot as toot,
      url
    from
      toots
    order by
      created_at desc
  EOQ
  param "timeline" {}
  param "limit" {}
}

And here is the new version of that query.

query "timeline" {
  sql = <<EOQ
    with toots as (
    -- as above
    ),
    boosted as (
      select
        $3 as boost,
        boosted,
        account,
        in_reply_to,
        person,
        toot,
        url
      from
        toots
      order by
        created_at desc
    )
    select
      account,
      person ||
        case
          when in_reply_to is null then ''
          else in_reply_to
        end as person,
      boosted || ' ' || toot as toot,
      url
    from
      boosted
    where
      boost = boosted
      or boost = 'include'
      or boost = 'n/a'
  EOQ
  param "timeline" {}
  param "limit" {}
  param "boost" {}
}

The original version uses a single CTE (common table expression, aka WITH clause), toots, to marshal data for the concluding SELECT. The new version inserts another CTE, boosted, into the pipeline. It uses $3 to reference param "boost" {}, which maps to the self.input.boosts value passed from home.sp.

The SQL code is all standard. Postgres is the engine inside Steampipe, and I sometimes use Postgres-specific idioms, but I don’t think any of those are happening here.

The HCL code may be unfamiliar. Steampipe uses HCL because its core audience is DevSecOps pros who are familiar with Terraform, which is HCL-based. But it’s a pretty simple language that can be used to describe all kinds of resources. Here the resources are widgets that appear on dashboards.

The other thing to know, if you want to roll up your sleeves and try building your own dashboards, is that the developer experience is — again in my biased opinion! — pretty great because if you’re using an autosaving editor you’ll see your changes (to both HCL and SQL code) reflected in realtime.

To illustrate that, here’s the screencast we included in our blog post introducing the dashboard system.

Not shown there, because we wanted to focus on the happy path, is realtime feedback when your SQL queries provoke Postgres errors. The experience feels very much like the one Bret Victor champions in Inventing on Principle. The core principle: “Creators need an immediate connection to what they’re creating.”

Here’s the wrong way that too often constrains us.

If there’s anything wrong with the scene, or if I go and make changes, or if I have further ideas, I have to go back to the code, and I edit the code, compile and run, see what it looks like. Anything wrong, I go back to the code. Most of my time is spent working in the code, working in a text editor blindly, without an immediate connection to this thing, which is what I’m actually trying to make.

And here is the right way.

I’ve got this picture on the side, and the code on the side, and this part draws the sky and this draws the mountains and this draws the tree, and when I make any change to the code, the picture changes immediately. So the code and the picture are always in sync; there is no compile and run. I just change things in the code and I see things change in the picture.

A screenshot of live editing from Bret Victor's Inventing on Principle

We want to work the right way wherever we can. The experience isn’t available everywhere, yet, but it is available in Steampipe where it powerfully enables the experimentation and prototyping that many of us are inspired to do as we delve into Mastodon.

If you want to try this for yourself, please check out the setup instructions for the plugin that maps Mastodon APIs to Postgres tables, and the dashboards that use those tables, and ping me (on Mastodon if you like!) with any questions you may have.


A Bloomberg terminal for Mastodon

As I mentioned last time, the Steampipe dashboard for Mastodon has evolved in unexpected ways. I imagined that the components — a plugin that maps Mastodon APIs to Postgres foreign tables, and a suite of views that query the APIs — would combine to enable a broad overview of activity in the Fediverse. That didn’t pan out for two reasons.

First, I learned that the Mastodon community didn’t appreciate the kind of surveillance required for such analysis. That was the original community, I should stress, and things have changed dramatically, but I want to respect the original ethos. Plenty of people will, nevertheless, crawl and index the Fediverse, but I don’t need to put my shoulder to that wheel. And if I did I’d be pushing Steampipe out of its sweet spot: realtime acquisition, querying, and visualization of API-sourced data.

Second, Mastodon’s API allows 300 requests every 5 minutes. You can use Steampipe in batch mode to defeat that limit, and you can store data permanently in its Postgres database, but that cuts across the grain with respect to both Steampipe and Mastodon. All Mastodon clients are subject to the same API rate limit. If you use the web app, or one of the phone apps, you will likely never have seen a message announcing that you’ve hit the limit and need to wait a few minutes. I never saw that message until I started querying the API with Steampipe while also using the web app.

So if Mastodon culture and tech resist deep data mining, and the system is optimized for clients that live within an API budget of 300 requests every 5 minutes, what kind of Mastodon client could Steampipe enable? It wouldn’t be a conventional client because Steampipe is a read-only system. The path forward would be some kind of reader, or browser, that augments the interactive apps.

The outcome, so far, is a suite of dashboards that display tabular views (along with some charts) of the home, local, and federated timelines, of my toot history and my favorites, of my follows and followers, of my notifications, of searches for terms, people, and hashtags, and of the timelines formed by the lists to which I’ve assigned people I follow. These are all HTML tables rendered by Steampipe’s dashboard server. The columns are all sortable, and the cells of the tables can contain only links or plain text.

Given that the toot content returned from the Mastodon API is HTML, the plain-text-only constraint felt, initially, like a blocker. No images? No links in toot content? What good is that?

Some constraints are worth embracing, though, and that may prove true here. The views created this way put a lot of information onto the screen. Here’s my default view in the stock client.

At a glance I can see three items on the home timeline, and if I want to scroll through 100 items I can only do so awkwardly in small gulps.

Here’s my home timeline in the Steampipe dashboard. I can see a dozen items at a glance, and can easily scan 100 items in gulps of that size.

When I described this effect to Greg Wilson he gave me the title for this post: “That sounds like the Bloomberg terminal for Mastodon.” I’ve never used one, and I’m aware that its design is often derided as a UX disaster, but as I understand it the product is built to enable traders to scan fast-moving data feeds from many different sources. In that sense I do think it’s an interesting and useful comparison.

The underlying principle is one I’ve learned from Edward Tufte: present information at maximum density. Our brains are built to take in a lot of information at a glance, and if it’s organized well we can do that very effectively. It feels like that’s happening for me when I scan these densely-packed views of Mastodon activity.

To enhance the effect, I’ve begun to apply filters. In a Mastodon timeline, for example, a chatty person can dominate what you see at a glance. When we participate in social media we are always making bids for one another’s attention. As publishers of feeds it’s wise to consider how a flurry of items can overwhelm a reader’s experience. But it’s also useful to consider ways that feed readers can filter a chatty source. Steampipe’s SQL foundation affords an easy and natural way to do that. Here’s part of the query that drives the list view.

select distinct on (list, user_name, person, hour) -- only one per list/user/hour
  person,
  url,
  hour,
  toot
from
  data
order by
  hour desc, list, person

It was easy to implement a rule that limits each person to at most one toot per hour. Next steps here will be to apply this rule to other views, show the number of collapsed toots, and enable such rules on a per-person basis.

There are always links into the Mastodon web app, and I follow them when I want to view images, boost someone, or reply to someone. The dashboards help me scan a lot of Mastodon activity quickly, and decide which items I want to interact with. Your 500-character toot is all you’ve got to grab my attention, and I’ll only see it as an unformatted chunk of plain text. That’s a pretty severe constraint, and not everyone will want to embrace it, but it’s working pretty well for me so far.

I expect that our dashboard system will support formatted text and images in cells of HTML tables. When it does I’d like to make it an option you can turn on or off in Mastodon dashboards. What should the default be? I suspect I’ll want plain text and no images, especially if image captions can appear along with the text of toots. Some of the original Mastodon cultural norms aren’t surviving the onslaught of new people, but writing descriptions of images is one that’s held up so far, and it’s a wonderful thing. So write a short thoughtful post, write a caption for your image if you include one, and if you capture my attention I’ll click through to view and interact.


Browsing the Fediverse

A month ago, when the Great Discontinuity happened, I started working on a Steampipe plugin to enable SQL queries against the Mastodon API, along with a companion Steampipe “mod” (suite of dashboards) to display and chart the results of those queries.

I expect these dashboards will soon be available in Steampipe Cloud, where it will take just a few seconds to pop in your Mastodon access token (from, e.g., https://mastodon.social/settings/applications/new) and begin using the dashboards.

Meanwhile, if you’re so inclined, you can find the plugin here and the dashboards here. If you’re reasonably technical you can pretty quickly and easily install Steampipe, clone these repos, build the plugin, and start using the dashboards.

Why would you want to? My own motivation, originally, was to do Mastodon analytics. I thought Steampipe’s SQLification of the API would be a handy way to discern and monitor activity trends during a period of extraordinary flux. And that’s proven to be true, to a limited extent. Here’s a snapshot of the dashboard that uses the instance activity API.

I’m watching this chart with great interest. Where does it go from here? I’m not going to hazard a guess. Everything’s up in the air right now, and anything could happen.

But as I added tables to the plugin to encapsulate more of the Mastodon API, and added dashboards to visualize those tables, my focus shifted. I began to see the suite of dashboards as a Mastodon reader/browser that complements the web and phone clients, and that’s how I mainly use them now.

I think the key benefit is one of Edward Tufte’s core principles: information density. Each of these dashboards shows more activity than you can see at a glance in the web or phone interfaces. I find this very helpful for searching and browsing. When I see items of interest that I want to interact with, I click through to the web app in order to boost, reply, or favorite.

Will this way of browsing Mastodon appeal to you? To get a feel for what it’s like, here are snapshots of some of the dashboards I’ve built so far.

dashboard.Favorites

dashboard.Following

dashboard.Home

dashboard.List

dashboard.Me

dashboard.Notification

dashboard.PeopleSearch

dashboard.StatusSearch

dashboard.TagSearch

For me, at least, this approach has become an effective way to browse the fediverse, find interesting people, read what they boost, and keep track of my own activity.

If you are dev-minded, by the way, please note that these dashboards are just one way to skin the results of queries against the plugin. Any SQL client can connect to Steampipe’s Postgres endpoint. You could use dashboards like Metabase or Grafana, or you could embed Steampipe as a component in an app.


When your database is an HTTP client

Here are three things I once hoped — but no longer expect — to outlive:

1. PDF files

2. passwords

3. The occasional need to scrape data from web pages

PDF files and passwords are topics for another day, but web scraping is timely. Today I was asked to corral data from the Steampipe blog, and the path of least resistance was (of course!) to extract it from the site.

I was all warmed up for the exercise because we’d just published a post dramatically entitled “Why build an HTTP client into a database? So you can ingest web data directly!” In that post I show three solutions enabled by the Net plugin’s net_http_request table.

Since the dawn of the web, scraping has worked this way: use a script to fetch the data, then save it for retrieval and analysis. You might query the saved data with the scripting language itself, or with the query language of a database.

A couple of years ago I found a way to unify those ingredients: Run the script inside the database. You can do a lot with Postgres’ built-in procedural language, and even more if you activate Python inside Postgres. I went deeply into both and explained why in an earlier episode.

PL/Python was great for advanced uses, and I used it for a bit of web scraping too. It all worked fine, and I never thought to ask this question: “What if the database is the HTTP client, and SQL the common way to reason over data coming through that pipe?”

The examples in the post show what that’s like. In its simplest form you write a query like this.

select
  response_status_code,
  jsonb_pretty(response_headers) as headers
from
  net_http_request
where
  url = 'https://steampipe.io'

The result is like this.

+----------------------+-------------------------------------------------------+
| response_status_code | headers                                               |
+----------------------+-------------------------------------------------------+
| 200                  | {                                                     |
|                      |     "Age": [                                          |
|                      |         "45557"                                       |
|                      |     ],                                                |
|                      |     "Date": [                                         |
|                      |         "Fri, 09 Dec 2022 06:46:40 GMT"               |
|                      |     ],                                                |
|                      |     "Etag": [                                         |
|                      |         "W/\"614a142998557b388e053bfa4408cf70\""      |

The response_status_code is a regular Postgres column; the headers column is a JSONB column that you can index into (e.g. headers ->> 'Etag'). If you also select the response_body column of the net_http_request table you’ll get another regular Postgres column containing the text of the web page. If it’s HTML text, you can use regular expressions to match patterns in it. If it’s JSON text you can use Postgres’ JSON functions to query and transform it.
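
To make that concrete, here’s a minimal sketch of both cases; the regexp pattern and the placeholder JSON URL are my own illustrations, not examples from the plugin documentation.

-- HTML body: pull the page title out with a regular expression

select
  url,
  substring(response_body from '<title>([^<]*)</title>') as page_title
from
  net_http_request
where
  url = 'https://steampipe.io'

-- JSON body: cast the text to jsonb and query it with the JSON operators

select
  jsonb_pretty(response_body::jsonb) as body
from
  net_http_request
where
  url = 'https://example.com/data.json'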

You can join fields nested in the JSON with other Postgres columns. And those other columns can belong to tables populated in any of the ways Steampipe plugins populate tables: from JSON or GraphQL API endpoints, from CSV or Terraform or Yaml files, from anywhere really. As a developer writing Steampipe queries (and flowing results into dashboards) you see all of these sources as tables, you query them individually in all the ways Postgres can, and you join across diverse sources in a common way.

Of course web pages are structured in ways that regular expressions can’t easily grok. It’s easy to match links, but parsing HTML tag names and attributes is a job for a real parser. I’d made a start on an HTML plugin for Steampipe. There were already two tables: one to extract links from a web page, one to transform HTML tables to CSV format. So today, when tasked with tabulating blog metadata, I added a third table to enable these queries.

-- find the title

select
  page,
  tag_name,
  tag_content
from
  html_tag
where
  page = 'https://steampipe.io/blog/selective-select'
  and tag_name = 'title'

-- list the meta tags

select
  page,
  tag_name,
  tag_attrs
from
  html_tag
where
  page = 'https://steampipe.io/blog/selective-select'
  and tag_name = 'meta'

That’s a powerful way to reason over HTML data! It was easy for me to extend the HTML plugin in this way, and I assure you that I’m no 10x programmer. The Steampipe plugin SDK and the wonderful goquery package are doing all the heavy lifting. I just had to stitch the components together, and if you’re any kind of programmer, with or without Go experience, you could pretty readily do the same.

Mastodon, Steampipe, and RSS

I was determined to write my Mastodon #introduction today. To get started I used the tag search in the dashboard I’m building.

The idea was to look at a bunch of other #introduction posts to get a feel for how mine should go. When you search specifically for hashtags, the Mastodon search API returns this information.

"hashtags": [
    {
      "name": "introduction",
      "url": "https://mastodon.social/tags/introduction",
      "history": [
        {
          "day": "1574553600",
          "uses": "10",
          "accounts": "9"
        },
        // ...
      ]
    },

A first version of the dashboard, having only this data to work with, just listed the names of tags matching the search term along with corresponding URLs. Here was the initial query.

select 
  name,
  url
from 
  mastodon_search_hashtag 
where 
  query = 'introduction'

That produced a list of links, like https://mastodon.social/tags/introduction, to home pages for variants of the tag. These are useful links! Each goes to a page where you can see who is posting to the tag.

To make this view slightly more useful, I tapped the third element of the API response, history, in a revised query.

with data as (
  select 
    name,
    url,
    ( jsonb_array_elements(history) ->> 'uses' )::int as uses
  from 
    mastodon_search_hashtag 
  where 
    query = 'introduction'
)
select
  name,
  url,
  sum(uses)
from
  data
group by
  name, url
order by
  sum desc

These results help me decide which variant to use.

+-------------------+---------------------------------------------------+------+
| name              | url                                               | sum  |
+-------------------+---------------------------------------------------+------+
| introduction      | https://mastodon.social/tags/introduction         | 1816 |
| introductions     | https://mastodon.social/tags/introductions        | 218  |
| introductionpost  | https://mastodon.social/tags/introductionpost     | 19   |
| introductionfr    | https://mastodon.social/tags/introductionfr       | 6    |

But I still need to visit each link’s page to explore how it’s being used. It would be nice to surface more context in the dashboard, and I found a nifty way to do it, but first let’s dwell on the revised query for a minute. Postgres’ JSON features are powerful and it’s often a challenge (at least for me) to visualize how they work.

The Postgres jsonb_array_elements() function is what’s called a set-returning function. Here it unpacks Postgres’ JSON representation of the list of history structures returned from the Mastodon API. In its simplest form, the function call jsonb_array_elements(history) produces a temporary table with per-tag, per-day data.

select
  name,
  jsonb_array_elements(history) as history
from
  mastodon_search_hashtag 
where 
  query = 'introduction'

+--------------------------------+----------------------------------------------------+
| name                           | history                                            |
+--------------------------------+----------------------------------------------------+
| introduction                   | {"accounts":"16","day":"1670371200","uses":"19"}   |
| introduction                   | {"accounts":"250","day":"1670284800","uses":"269"} |
| introduction                   | {"accounts":"259","day":"1670198400","uses":"274"} |
| introduction                   | {"accounts":"253","day":"1670112000","uses":"270"} |
| introduction                   | {"accounts":"245","day":"1670025600","uses":"269"} |
| introduction                   | {"accounts":"345","day":"1669939200","uses":"383"} |
| introduction                   | {"accounts":"307","day":"1669852800","uses":"339"} |
| introductionsfr                | {"accounts":"0","day":"1670371200","uses":"0"}     |
| introductionsfr                | {"accounts":"0","day":"1670284800","uses":"0"}     |
| introductionsfr                | {"accounts":"0","day":"1670198400","uses":"0"}     |
| introductionsfr                | {"accounts":"0","day":"1670112000","uses":"0"}     |
| introductionsfr                | {"accounts":"0","day":"1670025600","uses":"0"}     |

history is a JSONB column that holds an object with three fields. The revised query uses Postgres’ JSON indexing operator ->> to reach into that object and hoist the number of daily uses into its own column, so it can be the target of a SQL SUM function.
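
To see that mechanism in isolation, here’s a tiny example that applies ->> and the cast to one of the history objects shown above:

select
  ('{"accounts":"16","day":"1670371200","uses":"19"}'::jsonb ->> 'uses')::int as uses

The ->> operator returns the value as text ('19'), and the ::int cast turns it into a number that SUM can add up across the unpacked rows.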

OK, ready for the nifty solution? Recall that https://mastodon.social/tags/introduction is the home page for that variant of the tag. There you can see introduction posts from people using the tag. Those posts typically include other tags. In the dashboard shown above you can see that Kathy Nickels is using these: #Music #Art #Equestrian #Nature #Animals. The tags appear in her introduction post.

I didn’t immediately see how to capture them for use in the dashboard. Then I remembered that certain classes of Mastodon page have corresponding RSS feeds, and wondered if the tag pages are members of one such class. Sure enough they are, and https://mastodon.social/tags/introduction.rss is a thing. That link, formed by tacking .rss onto the base URL, provides the extra context I was looking for. Here’s the final version of the query.

with data as (
  select
    name,
    url,
    ( jsonb_array_elements(history) ->> 'uses' )::int as uses
  from
    mastodon_search_hashtag
  where
    query = 'introduction'
),
uses as (
  select
    name,
    url || '.rss' as feed_link,
    sum(uses) as recent_uses
  from
    data
  group by
    name, url
)
select
  u.name,
  r.guid as link,
  to_char(r.published, 'YYYY-MM-DD') as published,
  r.categories
from
  uses u
join
  rss_item r
on
  r.feed_link = u.feed_link
where
  recent_uses > 1
order by
  recent_uses desc, published desc

The new ingredients, courtesy of the RSS feed, are: guid which links to an individual introduction like Kathy’s, published which is the day the introduction appeared, and categories which has the tags used in the introduction post. Sweet! Now I can scan the dashboard to get a sense of which introductions I want to check out.

The first three queries use the Steampipe plugin for Mastodon, and in particular its mastodon_search_hashtag table which encapsulates the Mastodon API for searching tags. The final version joins that table with the rss_item table provided by the RSS plugin, using the common base URL as the basis of the join.

This delights me in so many ways. When the blogosphere first emerged in the early 2000s, some of us discovered that the RSS protocol was capable of far more than just delivering feeds to RSS readers. The other new hot protocol in that era was XML web services. As an InfoWorld analyst I was supposed to be cheering the latter as an enterprise-grade technology, but I couldn’t help noticing that RSS kept turning out to be a great way to move data between cooperating systems. That’s always been true, and I love how this example reminds us that it’s still true.

I’m equally delighted to show how Steampipe enables this modern exercise in RSS-powered integration. Steampipe was, originally, an engine for mapping results from JSON API endpoints to SQL tables. Over time, though, it has broadened its view of what constitutes an API. You can use Steampipe to query CSV files, or Terraform files, or — as we see here — RSS feeds. Data comes in all kinds of flavors. Steampipe abstracts those differences and brings all the flavors into a common space where you can reason over them using SQL.

And finally, it’s just wonderful to be at the intersection of Mastodon, Steampipe, and RSS in this remarkable moment. I’ll readily admit that nostalgia is a factor. But RSS did bust things wide open twenty years ago, Mastodon’s doing that now, and I love that RSS can help it happen again.

Now I need to write that #introduction!


Autonomy, packet size, friction, fanout, and velocity

Nostalgia is a dangerous drug and it’s always risky to wallow in it. So those of us who fondly remember the early blogosphere, and now want to draw parallels to the fediverse, should do so carefully. But we do want to learn from history.

Here’s one way to compare five generations of social software along the five dimensions named in the title of this post.

              Autonomy    Packet Size   Friction    Fanout    Velocity
 
Usenet        medium      high          medium      medium    low

Blogosphere   high        high          high        low       low

Facebook      low         high          low         medium    high

Twitter       low         low           low         high      high

Fediverse     high        medium        high        medium    medium

These are squishy categories, but I think they surface key distinctions. Many of us who were active in the blogosphere of the early 2000s enjoyed a high level of autonomy. Our RSS readers were our Internet dashboards. We loaded them with a curated mix of official and individual voices. There were no limits on the size of packets exchanged in this network. You could write one short paragraph or a 10,000-word essay. Networking wasn’t frictionless because blog posts did mostly feel like essays, and because comments didn’t yet exist. To comment on my blog post you’d write your own blog post linking to it.

That friction limited the degree to which a post would fan out through the network, and the velocity of its propagation. The architecture of high friction, low fanout, and low velocity was a winning combination for a while. In that environment I felt connected but not over-connected, informed but not overloaded.

Twitter flipped things around completely. It wasn’t just the loss of autonomy as ads and algos took over. With packets capped at 140 characters, and tweets potentially seen immediately by everyone, friction went nearly to zero. The architecture of low friction created an addictive experience and enabled powerful effects. But it wasn’t conducive to healthy discourse.

The fediverse can, perhaps, strike a balance. Humans didn’t evolve to thrive in frictionless social networks with high fanout and velocity, and arguably we shouldn’t. We did evolve in networks governed by Dunbar’s number, and our online networks should respect that limit. We need less friction within communities of knowledge and practice, more friction between them. We want messages to fan out pervasively and rapidly within communities, but less so between them.

We’re at an extraordinary inflection point right now. Will the fediverse enable us to strike the right balance? I think it has the right architectural ingredients to land where I’ve (speculatively) placed it in that table. High autonomy. As little friction as necessary, but not too little. As much fanout and velocity as necessary, but not too much. Nobody knows how things will turn out, predictions are futile, behavior is emergent, but I am on the edge of my seat watching this all unfold.


Debuggable explanations

I’ve been reviewing Greg Wilson’s current book project, Software Design in Python. Like the earlier JavaScript-based Software Design by Example it’s a guided tour of tools, techniques, and components common to many software systems: testing frameworks, parsers, virtual machines, debuggers. Each chapter of each of these books shows how to build the simplest possible working version of one of these things.

Though I’ve used this stuff for most of my life, I’ve never studied it formally. How does an interpreter work? The chapter on interpreters explains the basic mechanism using a mixture of prose and code. When I read the chapter I can sort of understand what’s happening, but I’m not great at mental simulation of running code. I need to run the code in a debugger, set breakpoints, step through execution, and watch variables change. Then it sinks in.

The GitHub repo for the book includes all the text and all the code. I’d like to put them side-by-side, so that as I read the narrative I can run and debug the code that’s being described. Here’s how I’m doing that in VSCode.

This is pretty good! But it wasn’t dead simple to get there. In a clone of the repo, the steps included:

  • Find the HTML file for chapter 3.
  • Install a VSCode extension to preview HTML.
  • Find the code for chapter 3.
  • Adjust the code to not require command-line arguments.
  • Arrange the text and code in side-by-side panes.

Though it’s all doable, the activation threshold is high enough to thwart my best intention of repeating the steps for every chapter.

Whether in VSCode or another IDE or some other kind of app, what would be the best way to lower that activation threshold?

Curating the Studs Terkel archive

I can read much faster than I can listen, so I rarely listen to podcasts when I’m at home with screens on which to read. Instead I listen on long hikes when I want to shift gears, slow down, and absorb spoken words. Invariably some of those words strike me as particularly interesting, and I want to capture them. Yesterday, what stuck was these words from a 1975 Studs Terkel interview with Muhammad Ali:

Everybody’s watching me. I was rich. The world saw me, I had lawyers to fight it, I was getting credit for being a strong man. So that didn’t really mean nothing. What about, I admire the man that had to go to jail named Joe Brown or Sam Jones, who don’t nobody know who’s in the cell, you understand? Doing his time, he got no lawyers’ fees to pay. And when he get out, he won’t be praised for taking a stand. So he’s really stronger than me. I had the world watching me. I ain’t so great. I didn’t do nothing that’s so great. What about the little man don’t nobody know? He’s really the one.

I heard these words on an episode of Radio OpenSource about the Studs Terkel Radio Archive, an extraordinary compilation of (so far) about a third of his 5600 hours of interviews with nearly every notable person during the latter half of the twentieth century.

If you weren’t aware of him, the Radio OpenSource episode, entitled Studs Terkel’s Feeling Tone, is the perfect introduction. And it’s delightful to hear one great interviewer, Chris Lydon, discuss with his panel of experts the interviewing style of perhaps the greatest interviewer ever.

Because I’d heard Muhammad Ali’s words on Radio OpenSource, I could have captured them in the usual way. I always remember where I was when I heard a segment of interest. If that was 2/3 of the way through my hike, I’ll find the audio at the 2/3 mark on the timeline. I made a tool to help me capture and share a link to that segment, but it’s a clumsy solution.

What you’d really rather do is search for the words in a transcript, select surrounding context, use that selection to define an audio segment, and share a link to both text and audio. That’s exactly what I did to produce this powerful link, courtesy of WFMT’s brilliant remixer, which captures both the written words I quoted above and the spoken words synced to them.

That’s a dream come true. Thank you! It’s so wonderful that I hesitate to ask, but … WFMT, can you please make the archive downloadable? I would listen to a lot of those interviews if I could put them in my pocket and take them with me on hikes. Then I could use your remixer to help curate the archive.

GitHub for English teachers

I’ve long imagined a tool that would enable a teacher to help students learn how to write and edit. In Thoughts in motion I explored what might be possible in Federated Wiki, a writing tool that keeps version history for each paragraph. I thought it could be extended to enable the kind of didactic editing I have in mind, but never found a way forward.

In How to write a press release I tried bending Google Docs to this purpose. To narrate the process of editing a press release, I dropped a sample release into a GDoc and captured a series of edits as named versions. Then I captured the versions as screenshots and combined them with narration, so the reader of the blog post can see each edit as a color-coded diff with an explanation.

The key enabler is GDoc’s File -> Version history -> Name current version, along with File -> See version history‘s click-driven navigation of the set of diffs. It’s easy to capture a sequence of editing steps that way.

But it’s much harder to present those steps as I do in the post. That required me to make, name, and organize a set of images, then link them to chunks of narration. It’s tedious work. And if you want to build something like this for students, that’s work you shouldn’t be doing. You just want to do the edits, narrate them, and share the result.

This week I tried a different approach when editing a document written by a colleague. Again the goal was not only to produce an edited version, but also to narrate the edits in a didactic way. In this case I tried bending GitHub to my purpose. I put the original doc in a repository, made step-by-step edits in a branch, and created a pull request. We were then able to review the pull request, step through the changes, and review each as a color-coded diff with an explanation. No screenshots had to be made, named, organized, or linked to the narration. I could focus all my attention on doing and narrating the edits. Perfect!

Well, perfect for someone like me who uses GitHub every day. If that’s not you, could this technique possibly work?

In GitHub for the rest of us I argued that GitHub’s superpowers could serve everyone, not just programmers. In retrospect I felt that I’d overstated the case. GitHub was, and remains, a tool that’s deeply optimized for programmers who create and review versioned source code. Other uses are possible, but awkward.

As an experiment, though, let’s explore how awkward it would be to recreate my Google Docs example in GitHub. I will assume that you aren’t a programmer, have never used GitHub, and don’t know (or want to know) anything about branches or commits or pull requests. But you would like to be able to create a presentation that walks a learner through a sequence of edits, with step-by-step narration and color-coded diffs. At the end of this tutorial you’ll know how to do that. The method isn’t as straightforward as I wish it were. But I’ll describe it carefully, so you can try it for yourself and decide whether it’s practical.

Here’s the final result of the technique I’ll describe.

If you want to replicate that, and don’t already have a GitHub account, create one now and log in.

Ready to go? OK, let’s get started.

Step 1: Create a repository

Click the + button in the top right corner, then click New repository.

Here’s the next screen. All you must do here is name the repository, e.g. editing-step-by-step, then click Create repository. I’ve ticked the Add a README file box, and chosen the Apache 2.0 license, but you could leave the defaults — box unchecked, license None — as neither matters for our purpose here.

Step 2: Create a new file

On your GitHub home page, click the Repositories tab. Your new repo shows up first. Click its link to open it, then click the Add file dropdown and choose Create new file. Here’s where you land.

Step 3: Add the original text, create a new branch, commit the change, and create a pull request

What happens on the next screen is bewildering, but I will spare you the details because I’m assuming you don’t want to know about branches or commits or pull requests, you just want to build the kind of presentation I’ve promised you can. So, just follow this recipe.

  • Name the file (e.g. sample-press-release.txt)
  • Copy/paste the text of the document into the edit box
  • Select Create a new branch for this commit and start a pull request
  • Name the branch (e.g. edits)
  • Click Propose new file

On the next screen, title the pull request (e.g. edit the press release) and click Create pull request.

Step 4: Visit the new branch and begin editing

On the home page of your repo, use the main dropdown to open the list of branches. There are now two: main and edits. Select edits.

Here’s the next screen.

Click the name of the document you created (e.g. sample-press-release.txt) to open it.

Click the pencil icon’s dropdown, and select Edit this file.

Make and preview your first edit. Here, that’s my initial rewrite of the headline. I’ve written a title for the commit (Step 1: revise headline), and I’ve added a detailed explanation in the box below the title. You can see the color-coded diff above, and the rationale for the change below.

Click Commit changes, and you’re back in the editor ready to make the next change.

Step 5: Visit the pull request to review the change

On your repo’s home page (e.g. https://github.com/judell/editing-step-by-step), click the Pull requests button. You’ll land here.

Click the name of the pull request (e.g. edit the press release) to open it. In the rightmost column you’ll see links with alphanumeric labels.

Click the first one of those to land here.

This is the first commit, the one that added the original text. Now click Next to review the first change.

This, finally, is the effect we want to create: a granular edit, with an explanation and a color-coded diff, encapsulated in a link that you can give to a learner who can then click Next to step through a series of narrated edits.

Lather, rinse, repeat

To continue building the presentation, repeat Step 4 (above) once per edit. I’m doing that now.

… time passes …

OK, done. Here’s the final edited copy. To step through the edits, start here and use the Next button to advance step-by-step.

If this were a software project you’d merge the edits branch into the main branch and close the pull request. But you don’t need to worry about any of that. The edits branch, with its open pull request, is the final product, and the link to the first commit in the pull request is how you make it available to a learner who wants to review the presentation.

GitHub enables what I’ve shown here by wrapping the byzantine complexity of the underlying tool, Git, in a much friendlier interface. But what’s friendly to a programmer is still pretty overwhelming for an English teacher. I still envision another layer of packaging that would make this technique simpler for teachers and learners focused on the craft of writing and editing. Meanwhile, though, it’s possible to use GitHub to achieve a compelling result. Is it practical? That’s not for me to say; I’m way past being able to see this stuff through the eyes of a beginner. But if that’s you, and you’re motivated to give this a try, I would love to know whether you’re able to follow this recipe, and if so, whether you think it could help you to help learners become better writers and editors.

How to rewrite a press release: a step-by-step guide

As a teaching fellow in grad school I helped undergrads improve their expository writing. Some were engineers, and I invited them to think about writing and editing prose in the same ways they thought about writing and editing code. Similar rules apply, with different names. Strunk and White say “omit needless words”; coders say “DRY” (don’t repeat yourself). Writers edit; coders refactor. I encouraged students to think about writing and editing prose not as a creative act (though it is one, as is coding) but rather as a method governed by rules that are straightforward to learn and mechanical to apply.

This week I applied those rules to an internal document that announces new software features. It’s been a long time since I’ve explained the method, and thanks to a prompt from Greg Wilson I’ll give it a try using another tech announcement I picked at random. Here is the original version.

I captured the transformations in a series of steps, and named each step in the version history of a Google Doc.

Step 1

The rewritten headline applies the following rules.

Lead with key benefits. The release features two: support for diplex-matched antennas and faster workflow. The original headline mentions only the first; I added the second.

Clarify modifiers. A phrase like “diplex matched antennas” is ambiguous. Does “matched” modify “diplex” or “antennas”? The domain is unfamiliar to me, but I suspected it should be “diplex-matched” and a web search confirmed that hunch.

Omit needless words. The idea of faster workflow appears in the original first paragraph as “new efficiencies aimed at streamlining antenna design workflows and shortening design cycles.” That’s a long, complicated, yet vague way of saying “enables designers to work faster.”

Step 2

The original lead paragraph was now just a verbose recap of the headline. So poof, gone.

Step 3

The original second paragraph, now the lead, needed a bit of tightening. Rules in play here:

Strengthen verbs. “NOUN is a NOUN that VERBs” weakens the verb. “NOUN, a NOUN, VERBs” makes it stronger.

Clarify modifiers. “matching network analysis” -> “matching-network analysis”. (As I look at it again now, I’d revise to “analysis of matching networks.”)

Break up long, weakly-linked sentences. The original was really two sentences linked weakly by “making it,” so I split them.

Omit needless words. A word that adds nothing, like “applications” here, weakens a sentence.

Strengthen parallelism. If you say “It’s ideal for X and Y” there’s no problem. But when X becomes “complex antenna designs that involve multi-state and multi-port aperture or impedance tuners,” and Y becomes “corporate feed networks with digital phase shifters,” then it helps to make the parallelism explicit: “It’s ideal for X and for Y.”

Step 4

Omit needless words. “builds on the previous framework with additional” -> “adds”.

Simplify. “capability to connect” -> “ability to connect”.

Show, don’t tell. A phrase like “time-saving options in the schematic editor’s interface” tells us that designers save time but doesn’t show us how. That comes next: “the capability to connect two voltage sources to a single antenna improves workflow efficiency.” The revision cites that as a shortcut.

Activate the sentence. “System and radiation efficiencies … can be effortlessly computed from a single schematic” makes efficiencies the subject and buries the agent (the designer) who computes them. The revision recasts that passive construction in the active voice. Similar rules govern the rewrite of the next paragraph.

Step 5

When I reread the original fourth paragraph I realized that the release was touting not only faster workflow but also better collaboration. So I adjusted the headline accordingly.

Step 6

Show, don’t tell. The original version tells, the new one shows.

Simplify. “streamline user input” -> “saves keystrokes” (which I might further revise to “clicks and keystrokes”).

Final result

Here’s the result of these changes.

I haven’t fully explained each step, and because the domain is unfamiliar I’ve likely missed some nuance. But I’m certain that the final version is clearer and more effective. I hope this step-by-step narration helps you see how and why the method works.

The Velvet Bandit’s COVID series

The Velvet Bandit is a local street artist whose work I’ve admired for several years. Her 15 minutes of fame happened last year when, as reported by the Press Democrat, Alexandria Ocasio-Cortez wore a gown with a “Tax the Rich” message that closely resembled a VB design.

I have particularly enjoyed a collection that I think of as the Velvet Bandit’s COVID series, which appeared on the boarded-up windows of the former Economy Inn here in Santa Rosa. The building is now under active renovation and the installation won’t last much longer, so I photographed it today and made a slideshow.

I like this image especially, though I have no idea what it means.

If you would like to buy some of her work, it’s available here. I gather sales have been brisk since l’affaire AOC!

Subtracting devices

People who don’t listen to podcasts often ask people who do: “When do you find time to listen?” For me it’s always on long walks or hikes. (I do a lot of cycling too, and have thought about listening then, but wind makes that impractical and cars make it dangerous.) For many years my trusty podcast player was one or another version of the Creative Labs MuVo, which, as the ad says, is “ideal for dynamic environments.”

At some point I opted for the convenience of just using my phone. Why carry an extra, single-purpose device when the multi-purpose phone can do everything? That was OK until my Quixotic attachment to Windows Phone became untenable. Not crazy about either of the alternatives, I flipped a coin and wound up with an iPhone. Which, of course, lacks a 3.5mm audio jack. So I got an adapter, but now the setup was hardly “ideal for dynamic environments.” My headset’s connection to the phone was unreliable, and I’d often have to stop walking, reseat it, and restart the podcast.

If you are gadget-minded you are now thinking: “Wireless earbuds!” But no thanks. The last thing I need in my life is more devices to keep track of, charge, and sync with other devices.

I was about to order a new MuVo, and I might still; it’s one of my favorite gadgets ever. But on a recent hike, in a remote area with nobody else around, I suddenly realized I didn’t need the headset at all. I yanked it out, stuck the phone in my pocket, and could hear perfectly well. Bonus: Nothing jammed into my ears.

It’s a bit weird when I do encounter other hikers. Should I pause the audio or not when we cross paths? So far I mostly do, but I don’t think it’s a big deal one way or another.

Adding more devices to solve a device problem amounts to doing the same thing and expecting a different result. I want to remain alert to the possibility that subtracting devices may be the right answer.

There’s a humorous coda to this story. It wasn’t just the headset that was failing to seat securely in the Lightning port. Charging cables were also becoming problematic. A friend suggested a low-tech solution: use a toothpick to pull lint out of the socket. It worked! I suppose I could now go back to using my wired headset on hikes. But I don’t think I will.