Visible Workings (redux)

30 Dec 2008 ~ Jon Udell ~ 2 Comments

For me, one of 2008’s most important (but least remarked-upon) ideas was spelled out in this post, which details how Ward Cunningham implemented Brian Marick’s notion of Visible Workings. The idea, briefly, is that businesses can wear (non-confidential aspects of) their business logic on their sleeves, observable to all.

In a year of devastating consequences ensuing from the lack of transparency in business, you’d think Ward and Brian would be celebrated for this work. No such luck. Partly, I’m sure, because their insights flow from the realm of software development and software testing, and don’t generalize in an obvious way.

It struck me this morning that yesterday’s item on using del.icio.us to manage trusted feeds may help to broaden the appeal of the idea.

In that item I mainly talked about the logistical benefits of the approach. You write less code, and you get to leverage existing infrastructure for data management, web UI, collaboration, and syndicated alerts. That’s all good. But there’s also a transparency benefit which I neglected to point out.

At this moment, for example, del.icio.us/elmcity is a snapshot of the feeds and contributors known to, and classified by, the live version of my service at elmcity.info/events. That version uses private lists of trusted feeds, and of new and trusted contributors. I haven’t yet cut over to the newly-rewritten Azure version, but when I do, it will use these public lists instead.

The del.icio.us/elmcity snapshot reports that there are 41 Eventful contributors of which 37 are trusted and 4 are new.

Why are the four new contributors still sitting in the holding tank? One I mentioned yesterday. jheslin created a venue, but no events. I plan to delete that contributor and wait to see if he or she shows up again with actual event contributions.

That leaves TallWilly, blahblah25, and michellelewis. Why are they still sitting in the holding tank? Here’s the crucial point: I’m not sure. I know that I reviewed them when they showed up, and applied a policy. If it were written down, which until now it hasn’t been, it would use language like “legitimate” and “substantive” to define the kinds of contributions that move a new contributor into the trusted bucket. But I can’t actually say how I applied that policy in these cases.

So let’s investigate. First, TallWilly. Clicking through, I find that TallWilly is no longer an Eventful user. Obviously I’ll want to remove him from the new bucket. Implicit rule now stated: Must be an Eventful user.

Second, blahblah25. Clicking through, I find only one event. Seems legit, and so far I haven’t required more evidence than a single legit event, so why didn’t I promote blahblah25? Oh, I see. Jan 4, 1900 12:30 AM isn’t a reasonable start date. Implicit rule now stated: Date must be reasonable.

(Of course there’s more to the story here. blahblah25’s bogus date was either a human error or a software error, or both. Ideally the aggregator, when rejecting a contribution on that basis, would notify the contributor and invite a correction.)
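Once stated, a rule like "date must be reasonable" can be written down as code. Here is a minimal sketch in Python, the language of the existing elmcity.info aggregator. The function name and the one-day grace window are assumptions made for illustration; they are not the aggregator's actual policy.

```python
from datetime import datetime, timedelta

def is_reasonable_start(start, now, grace=timedelta(days=1)):
    """Return True if an event's start is not implausibly far in the past.

    Hypothetical rule: accept anything that starts no earlier than
    `now - grace`. The real policy may well differ.
    """
    return start >= now - grace

now = datetime(2008, 12, 30)
bogus = datetime(1900, 1, 4, 0, 30)   # the Jan 4, 1900 12:30 AM start date
concert = datetime(2009, 4, 3)        # the April 3, 2009 concert
print(is_reasonable_start(bogus, now))    # False
print(is_reasonable_start(concert, now))  # True
```

A check like this is also the natural place to hang the notification mentioned above: when it returns False, the aggregator knows which contributor to ask for a correction.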

Third, michellelewis. Why didn’t I decide to trust her? Turns out it was just a mistake! Clicking through, I find an entire schedule of concerts, including this one at Fritz Belgian Fries on April 3, 2009. That event, and future events posted by michellelewis, absolutely belong on the calendar.

I only discovered this mistake by reviewing the lists of new and trusted contributors. In the existing version of the system, I’m the only one who can do that. But in the new version, everyone can. More eyeballs, fewer bugs.

Even more interesting, to me, is the notion of developing and applying policy-driven business logic in a transparent way. Of course business processes can’t always work that way. But the default, now, is that none do. Sometimes, maybe more often than we imagine, we could flip that default. It would be an interesting experiment to try.

Databasing trusted feeds with del.icio.us

29 Dec 2008 / 30 Dec 2008 ~ Jon Udell ~ 17 Comments

In my last entry, I sketched a strategy for maintaining lists of the Eventful and Flickr accounts that I consider trusted sources for the elmcity.info event and photo streams. I didn’t spell out exactly how I plan to maintain those lists, in the Azure rewrite of the service that I’m now doing, but David Hochman read my mind:

It sure would be interesting to syndicate those lists from a trusted del.icio.us feed, leveraging tags as a public data store, and allowing others to trust your trusted lists.

It sure would. And that’s just what I’m doing.

Part One: The User’s View

Here’s the del.icio.us account:

delicious.com/elmcity

Here are the trusted ICS feeds:

elmcity/trusted+ics+feed

Here are the trusted Eventful contributors:

elmcity/trusted+eventful+contributor

Here are the new Eventful contributors — that is, ones I’ve not yet marked as trusted:

elmcity/new+eventful+contributor

This is wildly convenient in several ways. For starters, I get a feed of new Eventful contributors for free:

feeds.delicious.com/v2/rss/elmcity/eventful+new+contributor

Anyone who subscribes to that feed is alerted to the appearance of a previously-unseen contributor of events within 15 miles of Keene. Here’s one:

eventful.com/users/jheslin

Clicking that link reveals that jheslin has created one venue, but so far no events. That’s not enough evidence on which to base a trust/no-trust decision. So what I’d do, in that case, is just delete the del.icio.us bookmark. If the aggregator later sees an event from jheslin, he (or she) will show up again in the feed. In that case, if jheslin has created events that look legitimate, I can decide to trust him (or her). How? Trivially, by editing the bookmark and changing the new tag to trusted.

That’s easy enough, but I don’t want to be forever responsible for monitoring this feed and making trust decisions. And thankfully I needn’t be. When I delegate that job to somebody else, I’ll just need to transfer the credentials to the del.icio.us/elmcity account, and explain what it means for an Eventful account to be bookmarked at del.icio.us/elmcity with a new or trusted tag, and how to decide when to promote an Eventful account from new to trusted.

The same technique can apply to other account-based event sources — for example, upcoming.org. It also applies to feed-based sources. I’ve been encouraging event publishers in Keene to create iCalendar feeds. Those feeds have URLs, and to include them in the aggregation, somebody just needs to bookmark them under the elmcity account with the tags trusted and ics and feed. Like this.

Same for new and trusted Flickr accounts that feed the photos page, for blogs that feed the blog directory, and for any other class of resource that might be contributed.

Part Two: The Developer’s View

Notice that I haven’t had to write any Web forms, any Ajax code, any database CRUD (create/read/update/delete) logic. Del.icio.us, a database with a Web user interface, takes care of all that. Which is fine by me, because life’s too short to write any more CRUD or Web UI than I have to. I’d rather do more interesting things.

By the same token, life’s too short to write more than a few lines of code to drive the CRUD apparatus. As I mentioned last time, I’m writing the core of the Azure event aggregator in C# rather than Python, because IronPython isn’t yet ready for prime time on Azure. I worried that a C# implementation would be too verbose, but I’ve been pleasantly surprised.

Here’s a C# method that reads a del.icio.us RSS feed and returns a dictionary (aka hashtable, aka associative array) of titles and links:

00 const string rssbase = "http://feeds.delicious.com/v2/rss/elmcity";

01 public static Dictionary<string,string> get_delicious_feed(string args)
02  {
03  var dict = new Dictionary<string,string>();
04  string url = String.Format("{0}/{1}", rssbase, args);
05  var response = Utils.FetchUrl(url);
06  var xdoc = Utils.xdoc_from_xml_bytes(response.data);
07  var items = from item in xdoc.Descendants("item")
08  select new { Title = item.Element("title").Value,
09     Link = item.Element("link").Value, };
10  foreach (var item in items)
11    dict[item.Link] = item.Title;
12  return dict;
13  }

The Python equivalent is more concise, but not by much. I am, admittedly, deferring any discussion of the Utils class which I’m using to make the .NET Framework’s HttpWebRequest/HttpWebResponse classes feel more Pythonic to me.
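For comparison, a Python equivalent might look something like the sketch below. It is not the code running at elmcity.info. It skips the fetch step and parses bytes that are already in hand, and the sample feed is invented for the example (a real del.icio.us feed carries more elements per item).

```python
import xml.etree.ElementTree as ET

def feed_to_dict(rss_bytes):
    """Map each RSS item's link to its title, as the C# method does."""
    root = ET.fromstring(rss_bytes)
    return {item.findtext("link"): item.findtext("title")
            for item in root.iter("item")}

# Made-up sample feed with a single bookmark.
sample = b"""<rss version="2.0"><channel>
<item><title>jheslin</title><link>http://eventful.com/users/jheslin</link></item>
</channel></rss>"""
print(feed_to_dict(sample))
```

As in the C# version, the link is the key and the title is the value, so a bookmark is looked up by its URL.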

Also noteworthy here is the use of the generic collection class, Dictionary (lines 3, 11, 12), instead of the more Pythonic (and Java-like) Hashtable. I’ll also defer discussion of tradeoffs between Dictionary and Hashtable until I’ve learned more about them.

Finally, I’ll defer discussion of the LINQ-to-XML idioms (lines 6-10) until I’ve learned more about the tradeoffs between LINQ-to-XML and the XPath style which I’m more familiar with, and which is more widely available.

For now, I’ll just observe that this C# method is readable, debuggable, and Azure-deployable.

Here are some of the ways the above method will be used in the service:

get_delicious_feed("trusted+feed+ics")
get_delicious_feed("trusted+eventful+contributor")
get_delicious_feed("new+flickr+contributor")

For example, here’s the method that the aggregator uses to check whether or not to include an Eventful event contributed by a given Eventful account:

01 public static bool isTrustedEventfulContributor(string accountname)
02  {
03  var dict = get_delicious_feed("trusted+eventful+contributor");
04  var re = new Regex("eventful.com/users/([^/]+)/created/events");
05  return match_url(dict, re, accountname);
06  }

The regular expression at line 4 matches URLs like this:

eventful.com/users/judell/created/events

If you check the corresponding Eventful page you’ll see why the aggregator posts bookmarks with addresses in this format. That way, the human who’s monitoring the feed can easily click through to eyeball the events created by a new user whose legitimacy needs to be checked.

To see how isTrustedEventfulContributor makes its yes/no determination, we need to unpack the match_url method. Here’s the first version I wrote:

private static bool match_url(Dictionary<string,string> dict, 
  Regex re, string url)
  {
  bool isTrusted = false;
  Match m;
  foreach (string key in dict.Keys)
    {
    m = re.Match(key);
    if (m.Groups[1].Value == url)
      {
      isTrusted = true;
      break;
      }
    }
    return isTrusted;
  }

This worked, but didn’t have the concise, functional, Pythonic feel that I like. So I went back to the drawing board and came up with another version:

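private static bool match_url(Dictionary<string,string> dict, 
  Regex re, string url)
  {
  var keys = dict.Keys.ToList();  // ToList() requires System.Linq
  // Groups[1] is the captured account name; Groups[0] would be the whole match.
  var matched = keys.FindAll(x => re.Match(x).Groups[1].Value == url);
  return matched.Count == 1;
  }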

This works identically, and it’s much closer to what I’d do in Python: Filter a list using a lambda expression.
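For readers who want to see the Python idiom being imitated, here is a rough sketch of the same filter. It is an illustration rather than code from the aggregator. The function takes a plain list of bookmark links instead of a dictionary, and the helper name `captured` is made up.

```python
import re

def match_url(links, pattern, account):
    """Return True if exactly one bookmarked link names the given account."""
    def captured(link):
        # Group 1 holds the account name; no match yields None.
        m = pattern.search(link)
        return m.group(1) if m else None
    matched = list(filter(lambda link: captured(link) == account, links))
    return len(matched) == 1

pattern = re.compile(r"eventful\.com/users/([^/]+)/created/events")
links = ["http://eventful.com/users/judell/created/events"]
print(match_url(links, pattern, "judell"))   # True
print(match_url(links, pattern, "nobody"))   # False
```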

Part Three: Conclusion

If you’re not a programmer (and in particular, a programmer who would be interested in Azure, or in a comparison between C# and Python), your eyes glazed over when you got to part two. That’s fine. There’s still an important takeaway for you. Del.icio.us (and any del.icio.us-like service) is a database! You can use it, without doing any programming, to maintain lists of arbitrary sets of resources that can be queried and edited, with equal ease, by humans and by programs.

Whatever you can identify with a URL is fair game. You can invent your own simple business logic by defining rules for what tags to use, and when and how to change them. You can monitor RSS feeds, in any feedreader, in order to be alerted when monitored items change. You can share or delegate the work by sharing or delegating access to the del.icio.us account. And last but not least, when you need to get a programmer to make use of this database you and your collaborators have built, that person’s job will be drop-dead simple.

Lightweight event syndication with trusted feeds

24 Dec 2008 ~ Jon Udell ~ 6 Comments

If you check the elmcity.info events page for March 7, 2009 you’ll see that Beau Bristow is performing at Keene State College at 8PM. The Eventful item that has syndicated to the events page doesn’t say anything else. There’s no link to beaubristow.com, though it’s easy enough to find. And there’s no more precise venue than Keene State College, though that’ll be easy enough to find as well, when the time comes.

But the item carries enough information to participate in a (still mostly nascent) network of calendar events. Beau Bristow doesn’t know that his concert shows up at elmcity.info, or that on March 7 it’ll show up at citizenkeene.ning.org and cheshiretv.org. And he shouldn’t need to know. But he ought to be able to take it for granted that events he posts to some kind of syndication source — could be Eventful, could be another public service, could be a personal iCalendar feed — will propagate.

I am particularly fascinated by the lightweight, ad-hoc interaction between Eventful, Beau Bristow, and elmcity.info. This lightness is a powerful enabler. If you’re Beau, and you need to promote 18 events in 18 towns, some of which you may only visit once in your career, you don’t have time — and can’t pay for the help — to build relationships in all those places. But you can assert that you’ll be in those places, on specified dates, doing a specified thing. And under the right circumstances, that’s enough.

The question I’ve been exploring is how to create those circumstances. One aspect of the answer, and the one I want to focus on here, is trusted feeds.

Originally, at elmcity.info, any Flickr photo mentioning “Keene NH” showed up in the photo stream, and any Eventful event located within 15 miles of the center of Keene showed up in the event stream. That arrangement was clearly open to abuse. Even though Flickr and Eventful try to take responsibility for their stuff, my aggregator had to take more responsibility for the subsets of their stuff it manages. So I created two lists of trusted contributors. One is a list of Flickr account names, and the other is a list of Eventful account names.

When the aggregator runs, a couple of times a day, it puts previously-unseen account names into a holding tank and writes those names to RSS feeds which I monitor here and here.

Yesterday I found Dan York in the Flickr holding tank, and Beau Bristow in the Eventful holding tank. I happen to know Dan, but even if I didn’t, it only takes a minute to judge that his Flickr portfolio is legitimate. I don’t know Beau, but again it’s easy to determine that his Eventful presence is legitimate. So I marked both accounts as trusted, and today their contributions appear on the site.

If a trusted account ever abuses that trust, it’s easily revoked.
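The holding-tank logic described above is small enough to sketch in a few lines of Python. This illustrates the idea and is not the aggregator's code. The function name `triage` and the account name `beaubristow` are assumptions.

```python
def triage(seen, trusted, new):
    """Split account names seen in this run into published and held.

    Returns (publishable, newly_held): accounts already trusted, and
    previously-unseen accounts that should enter the holding tank.
    """
    publishable = sorted(a for a in seen if a in trusted)
    newly_held = sorted(a for a in seen if a not in trusted and a not in new)
    return publishable, newly_held

trusted = {"judell"}
new = {"jheslin"}   # already sitting in the holding tank
print(triage({"judell", "jheslin", "beaubristow"}, trusted, new))
```

Promoting an account means moving its name from the new set to the trusted set, and revoking trust is the reverse.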

When I tell folks about this model of event syndication, they sooner or later realize that it’s an invitation to spam and ask about that. My answer is trusted feeds. It would be impossible to moderate every event flowing through your network. But it’s easy to moderate a much smaller number of event sources.

Azure calendar aggregator: Part 1

22 Dec 2008 ~ Jon Udell ~ 5 Comments

For about a week now, I’ve been running a service in the Azure cloud that aggregates calendar events from Eventful.com and from a diverse set of iCalendar feeds. As I mentioned last month, my aim is to recreate and then extend my experimental elmcity.info community information hub, while exploring and documenting the evolution of Azure and the layered services emerging on top of it.

I haven’t written a whole lot about programming here for a while, because I’ve been trying to explain the whys and wherefores of syndication-oriented communication to a wider audience. But as I build out this service I’m learning a lot about cloud-based software development in general, and about Azure in particular, and I want to narrate this work. I’ll try to do it in a way that will inform developers who currently use Microsoft tools and technologies, as well as those who don’t. But I’ll also try to be accessible to folks who don’t write software, yet would like to learn something about the opportunities that cloud computing is creating as well as the challenges it poses.

The service, as it currently exists, is running as an Azure worker role. That means it does input, processing, and output, but presents no user interface. The inputs are Eventful.com, accessed by way of its API, and a growing set of public iCalendar feeds. The processing involves reading calendar events and normalizing them to a common intermediate format. The output is currently XML to the Azure blob store, one file for Eventful and another for the iCalendar feeds.

I’m only allocating one instance of this worker process, and that’s probably enough horsepower for any single community’s events. But I’d like to be able to scale out the aggregator to serve other communities as well, potentially many others. Turning up the dial to do that would be a nice illustration — and test — of the cloud computing fabric.

The existing aggregator at elmcity.info is written in Python, and my original plan was to port it with minimal change to IronPython on Azure. That didn’t work out because, although bare-bones IronPython code runs on Azure as I show here, you quickly run into restrictions imposed by Azure’s security sandbox. The trust policy, defined here, is based on a feature of the .NET platform known as code access security (CAS).

When you upload code to the Azure cloud, or run it in the local development fabric, the hosting environment only partly trusts your code, and also only partly trusts any components used by your code. This is part of a layered, defense-in-depth security strategy, prudent for the same reason that it’s prudent to run your own computer as a partly-trusted user instead of an all-powerful administrator. It is also problematic for the same reason. A lot of Windows applications used to require administrative privilege in order to run properly, and some — though fewer month by month — still do. Similarly, a lot of .NET components that run happily in the fully-trusted environment of your local computer won’t run in Azure’s medium-trust environment, or (what’s nearly equivalent) in Internet Information Server 7 (IIS 7) when its security mode is set to medium trust.

I am no expert on the subject of code access security, but here’s what I think:

The medium-trust policy is probably a good thing.
It does, however, impede instant gratification when you’re mixing components from various sources.
But that impedance will diminish as more component builders adopt the good practice of not making their components unnecessarily require full trust.

I think that IronPython is likely to become such a component, once the dust settles from the recent 2.0 release. (If you care about this issue, you can vote up its priority.) Meanwhile I’ve been working in C#, which has been a fascinating experience. On the one hand, I believe that dynamic languages like Python are an excellent choice for agile development everywhere, and especially in the fluid environment of the cloud. On the other hand, I’m not a language bigot and have always appreciated the virtues of statically-typed languages.

My basic philosophy has always been to use a mix of best-of-breed tools in order to gain maximum leverage. The combination of IronPython and C#, on the .NET platform, is a really powerful one, for the same reason that the Jython/Java combo is. On this project, even though I am not yet deploying any code written in IronPython, I often use IronPython to test C# components that I’ve written or acquired.

Along the way, I’ve been recalling something IronPython’s creator, Jim Hugunin, said at the Professional Developers Conference back in October. Jim’s talk followed one by Anders Hejlsberg, the creator of C#. Anders showed an experimental future version of C# that makes use of the Dynamic Language Runtime which supports IronPython and IronRuby on .NET. The effect was to create an island of dynamic typing within C#’s otherwise statically-typed world. We all appreciated the delicious irony of a static type called ‘dynamic’.

Jim might have sounded a bit wistful when he said: “I’m not sure what a dynamic language is any more.” But I think this blurring of boundaries is a wonderful thing. Many smart people I deeply respect value the static typing of C#. Some of the same smart people, and many different ones, value the dynamic typing in languages like Ruby and Python. If I can leverage the union of what all of those smart people find valuable, I’ll happily do so.

I’ll have more to say about this project, and of course code to share, as things evolve. Meanwhile, though, I want to acknowledge Doug Day at DDay Software. When I switched from Python to C#, the key component I needed was an iCalendar module equivalent to MaxM’s excellent Python iCalendar module, which I’m using at elmcity.info. Doug’s DDay.iCal met the need. It’s a solid, cleanly-built, open source .NET component that enables code written in any of the .NET family of languages to parse, and generate, iCalendar (RFC 2445) files.

And now back to the project, which reminds me of the era at BYTE during which I got to build stuff while writing about what I was building. It’s great fun. And as John Leeke so eloquently says, it engages the mind, the hands, and the heart.

My rationalization for buying a Wii Balance Board

19 Dec 2008 ~ Jon Udell ~ 14 Comments

This week’s ITConversations show features a cameo appearance by my wife Luann, who came home a couple of weeks ago raving about the Wii Balance Board that she’d been using in physical therapy. I talked with Luann about how her therapists, Anna Domyancic and Darren Gerber, are using the Balance Board — and the Wii Fit software — to help retrain her proprioceptors. Then I visited Keene Physical Therapy and Sports Medicine where Anna gave me a demo of their Wii setup and talked about how and why physical therapists are adopting the technology.

I still haven’t bought one, but there are 6 shopping days until Christmas so there’s plenty of time.

A recipe for industrial transformation

11 Dec 2008 ~ Jon Udell ~ 16 Comments

When Tom Raftery pointed me to this gloomy assessment I had to go back and remind myself of what I found hopeful in Saul Griffith’s extraordinary energy talk at ETech.

Saul concedes a 2-degree-C rise in temperature by 2033. The question is what it will take to hold the line. He thinks we’ll need to build and deploy something like this mix of clean new energy production:

100 sq meters of solar voltaic cells per second for the next 25 years (2TW)

50 sq meters of solar thermal mirrors per second for the next 25 years (2TW)

1 100 megawatt wind turbine every 5 minutes for the next 25 years (2TW)

1 3 gigawatt nuclear plant every week for the next 25 years (3TW)

3 100 megawatt geothermal steam turbines every day for the next 25 years (2TW)

1250 sq meters of bio-fuel-producing algae every second for the next 25 years (.5TW)

Can we do it? The recipe calls for 11.5 terawatts of new (and carbon-free) power supply over the next 25 years, and we created 6 in the last 25 years. So, it’s “within the scale of what we know how to do.”
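The 11.5 terawatt figure is just the sum of the six line items in the recipe, which is easy to check. The dictionary keys below are shorthand labels chosen for this sketch.

```python
# Terawatts of new capacity per source, as listed in the recipe above.
recipe_tw = {
    "solar photovoltaic": 2,
    "solar thermal": 2,
    "wind": 2,
    "nuclear": 3,
    "geothermal": 2,
    "algae biofuel": 0.5,
}
total_tw = sum(recipe_tw.values())
print(total_tw)  # 11.5
```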

Now consider these existing capacities:

Cans. We produce 110 billion aluminum cans per year. Turned into thermal mirrors, that’s 200GW solar thermal/year. “If you make Coke and Pepsi into solar thermal companies, in 10 years you get to your 2 terawatts of solar thermal. It’s within our industrial capacity to do that.”

Phones. “Nokia makes 9 phones/second. Within Nokia + Intel + AMD there is roughly the capacity to make the needed photovoltaics.”

Cars. “GM makes 1 car every 2 minutes. GM + Ford = 1 wind turbine every 5 minutes.”

Of course it’s crazy to imagine retargeting our industrial capacity in such dramatic fashion, and turning it on a dime, isn’t it?

Not necessarily. For months I’ve been meaning to blog a segment from a Lester Brown podcast, which I can’t find now, but here’s the same point from his book Plan B 3.0: Mobilizing to Save Civilization:

In his State of the Union address on January 6, 1942, one month after the bombing of Pearl Harbor, President Roosevelt announced the country’s arms production goals. The United States, he said, was planning to produce 45,000 tanks, 60,000 planes, 20,000 anti-aircraft guns, and 6 million tons of merchant shipping. He added, “Let no man say it cannot be done.”

No one had ever seen such huge arms production numbers. But Roosevelt and his colleagues realized that the world’s largest concentration of industrial power at that time was in the U.S. automobile industry. Even during the Depression, the United States was producing 3 million or more cars a year. After his State of the Union address, Roosevelt met with automobile industry leaders and told them that the country would rely heavily on them to reach these arms production goals. Initially they wanted to continue making cars and simply add on the production of armaments. What they did not yet know was that the sale of new cars would soon be banned. From early 1942 through the end of 1944, nearly three years, there were essentially no cars produced in the United States.

In addition to a ban on the production and sale of cars for private use, residential and highway construction was halted, and driving for pleasure was banned. Strategic goods—including tires, gasoline, fuel oil, and sugar—were rationed beginning in 1942. Cutting back on private consumption of these goods freed up material resources that were vital to the war effort.

The year 1942 witnessed the greatest expansion of industrial output in the nation’s history—all for military use. Wartime aircraft needs were enormous. They included not only fighters, bombers, and reconnaissance planes, but also the troop and cargo transports needed to fight a war on distant fronts. From the beginning of 1942 through 1944, the United States far exceeded the initial goal of 60,000 planes, turning out a staggering 229,600 aircraft, a fleet so vast it is hard even today to visualize it. Equally impressive, by the end of the war more than 5,000 ships were added to the 1,000 or so that made up the American Merchant Fleet in 1939.

In her book No Ordinary Time, Doris Kearns Goodwin describes how various firms converted. A sparkplug factory was among the first to switch to the production of machine guns. Soon a manufacturer of stoves was producing lifeboats. A merry-go-round factory was making gun mounts; a toy company was turning out compasses; a corset manufacturer was producing grenade belts; and a pinball machine plant began to make armor-piercing shells.

In retrospect, the speed of this conversion from a peacetime to a wartime economy is stunning. The harnessing of U.S. industrial power tipped the scales decisively toward the Allied Forces, reversing the tide of war. Germany and Japan, already fully extended, could not counter this effort. Winston Churchill often quoted his foreign secretary, Sir Edward Grey: “The United States is like a giant boiler. Once the fire is lighted under it, there is no limit to the power it can generate.”

This mobilization of resources within a matter of months demonstrates that a country and, indeed, the world can restructure the economy quickly if convinced of the need to do so. Many people—although not yet the majority—are already convinced of the need for a wholesale economic restructuring. The purpose of this book is to convince more people of this need, helping to tip the balance toward the forces of change and hope.

And FDR engineered that transformation in less time than we’ve been occupying Iraq. So as Jan 20 approaches, I find myself wondering if maybe, just maybe, the new guy can galvanize a similar response.

Two IronPythonic spreadsheets

9 Dec 2008 ~ Jon Udell ~ 2 Comments

I should get a life, I know, but I can’t help myself: one of my favorite pastimes is figuring out new ways to wrangle information. One of the reasons that IronPython had me at hello is that, my fondness for the Python programming language notwithstanding, IronPython sits in an interesting place: on Windows, side by side with Office, where a lot of information gets wrangled, particularly in spreadsheets.

There are now two interestingly different IronPython applications that marry Python and the spreadsheet. The first, Resolver One, I wrote about last year and featured in a screencast. In this case, IronPython runs the whole show. It drives the user interface, and it also drives the recalculation engine.

More recently Blue Reference, whose Inference suite integrates statistical and analytical tools like MATLAB and R into Office, has taken a different tack. Its Inference for .NET taps the general-purpose scripting capabilities of the dynamic .NET languages, including IronPython and IronRuby.

Now to be clear, I’m not in Blue Reference’s target market. Their customers are doing scientific and technical work that benefits from the ability to embed live R or MATLAB analysis into documents. I don’t know, but would be curious to find out, how those folks — or others — might also want to leverage more general-purpose glue languages like IronPython or IronRuby.

In any case, there are clear tradeoffs between the two approaches. With Inference, the IronPython engine is loosely coupled to the Office apps. That buys you the full fidelity of the applications, but costs you Pythonic impedance.

With Resolver One there is no impedance. The application and your data are made of Pythonic stuff. You give up a ton of affordances in order to get that unification, but it enables some really interesting things.

Here’s one example: row- and column-level formulae. This is a pretty handy idea all by itself. Instead of putting a formula into the first row of a column and then copying it down, you put it into the column header where it applies to the whole column automatically.

Michael Foord has a nice example (screencast, article) that shows how to do some nifty data aggregation using Python list comprehensions.

He starts with a worksheet of People:

Name	Age	Country	Job
Stan	23	USA	Blogger
Wendy	66	AUS	Analyst
Eric	33	UK	Developer

In a second worksheet, he aggregates by Country, like so:

Country	People	Number of People	Average Age
USA	[<Stan>,<Kenny>,<Craig>]	3	30.7
UK	[<Eric>,<Kyle>]	3	41.3

Here’s the column-level formula that does that:

=[person for person in <People>.ContentRows if person['Country'] == #Name#_]

In other words, for each row make a list of People whose Country attribute equals the value in the Name column of the row. And stick that value into the current cell. If you’re familiar with Python, you’ll notice that the syntax, [<Eric>,<Kyle>], looks like how Python prints out a list. That’s because it really is a Python list sitting in that cell.

Now the other columns can refer to that list. Here’s Number of People:

=len(#People#_)

Here’s Average Age:

=AVERAGE(person['Age'] for person in #People#_)

This idea of having live Python objects sitting in a spreadsheet is what really grabbed me the first time I saw Resolver, and it still does.
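Outside of Resolver One, the same aggregation can be sketched in plain Python. This is a rough equivalent of the three column formulas and does not show how Resolver evaluates them. It uses only the three People rows shown above, so its numbers will not match the second worksheet, and the `aggregate` function is a made-up name.

```python
people = [
    {"Name": "Stan", "Age": 23, "Country": "USA"},
    {"Name": "Wendy", "Age": 66, "Country": "AUS"},
    {"Name": "Eric", "Age": 33, "Country": "UK"},
]

def aggregate(people, country):
    """Return (names, count, average age) for one country."""
    matches = [p for p in people if p["Country"] == country]
    ages = [p["Age"] for p in matches]
    average = sum(ages) / len(ages) if ages else None
    return [p["Name"] for p in matches], len(matches), average

print(aggregate(people, "USA"))  # (['Stan'], 1, 23.0)
```

The difference in the spreadsheet is that the intermediate list lives in a cell, where the other columns can refer to it.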

Here’s another little example of my own. Yesterday I was revisiting some of the code I used in my crime analysis project. These kinds of projects invariably turn into pipelines that transform data one stage at a time. Typically I store those intermediate results in files, which tends to be awkward.

This time around, I did the pipeline as a Resolver spreadsheet like so:

The column-level formula on D combines the fields in A, B, and C into a URL-encoded string in D.

The formula on E calls a geocoding service with a URL made from the string in D and puts the XML result in E.

The formula on F parses the XML in E, creates a Python dictionary, and dumps that into F.

The formulae on G and H extract the lat and lon values out of the object in F and stick them into G and H.

I dunno, maybe it’s just me, but I think that’s cool.
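The column pipeline above can be sketched as a chain of small Python functions, one per column. This is only an illustration under stated assumptions: the post does not say what A, B, and C hold (street, city, and state are guesses), `fake_geocode` stands in for the real geocoding call, and its XML shape and coordinates are invented placeholders.

```python
from urllib.parse import quote_plus
import xml.etree.ElementTree as ET


def encode_address(street, city, state):
    """Column D: combine three fields into one URL-encoded string."""
    return quote_plus(f"{street}, {city}, {state}")


def fake_geocode(encoded_address):
    """Column E stand-in (hypothetical): a real version would fetch a
    geocoder URL built from D. The XML shape and values are placeholders."""
    return "<result><lat>42.0</lat><lon>-72.0</lon></result>"


def parse_result(xml_text):
    """Column F: parse the XML into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def run_pipeline(street, city, state, geocode=fake_geocode):
    """Run one row through every stage, keeping each intermediate result."""
    d = encode_address(street, city, state)
    e = geocode(d)
    f = parse_result(e)
    g, h = float(f["lat"]), float(f["lon"])  # Columns G and H
    return {"D": d, "E": e, "F": f, "G": g, "H": h}


row = run_pipeline("3 Washington St", "Keene", "NH")
print(row["D"], row["G"], row["H"])
```

Returning every stage in one dictionary mimics what the spreadsheet gives you for free: each intermediate result stays visible next to the others instead of living in a separate file.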

Wiring the web (redux)

4 Dec 2008 ~ Jon Udell ~ 14 Comments

Information technologists often recite David Wheeler’s famous aphorism:

Any problem in computer science can be solved with another layer of indirection.

Often, though, they omit the corollary:

But that usually will create another problem.

Those problems used to plague only IT folk. But now we’re all involved. Effective social information management is quite severely constrained by the fact that regular folks are not (yet) taught the basics of computational thinking.

For example, when I explain my community calendar project to prospective contributors, they invariably assume that I’m asking them to enter their data into my database. It’s quite hard to convey: that the site isn’t a database of events, only a coordinator of event feeds; that I’m only asking them to create feeds and give me pointers to their feeds; that this arrangement empowers them to control their information and materialize it in contexts other than the one I’m creating.

I’m having some success explaining this model, but it’s slow going. People don’t take naturally to the indirection and abstraction.

Here’s another example. I know various folks who are trying to create online resource directories of one kind or another. I’ve identified a pattern, which I call collaborative list curation, that is an ideal way to solve this problem. Consider this directory of blogs for the Monadnock region. It looks like any other such directory, but it’s made differently. Again, there is no explicit database. Entries come from the del.icio.us tag delicious.com/judell/monadnockblog — a personal collection whose items are, currently, the same as those in the global collection delicious.com/tag/monadnockblog.

I’m subscribed to the global collection at feeds.delicious.com/v2/rss/tag/monadnockblog which means I can monitor it for new items, vet them, and transfer those I want to include to my personal collection. If I wanted to delegate that editorial control, I would point my directory-making service at the del.icio.us account of a trusted associate and have it camp on that account’s monadnockblog tag instead of (or in addition to) my own.
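Here is a rough sketch of that curation loop, assuming the global tag feed is ordinary RSS 2.0 with `item`, `title`, and `link` elements. The sample feed text, the example.com URLs, and the helper names are all made up for illustration. A real version would fetch the feed instead of using an inline string.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for the global tag feed (hypothetical entries).
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Blog A</title><link>http://example.com/a</link></item>
<item><title>Blog B</title><link>http://example.com/b</link></item>
</channel></rss>"""


def feed_items(rss_text):
    """Map each item's link to its title."""
    root = ET.fromstring(rss_text)
    return {item.findtext("link"): item.findtext("title")
            for item in root.iter("item")}


def pending_review(global_items, personal_items):
    """Items in the global collection that the curator has not yet vetted."""
    return {link: title for link, title in global_items.items()
            if link not in personal_items}


def directory(trusted_collections):
    """Merge one or more trusted collections into a sorted directory."""
    merged = {}
    for items in trusted_collections:
        merged.update(items)
    return sorted(merged.items(), key=lambda pair: pair[1])


global_items = feed_items(SAMPLE_RSS)
personal_items = {"http://example.com/a": "Blog A"}
print(pending_review(global_items, personal_items))
print(directory([personal_items]))
```

Delegating editorial control then amounts to passing a different (or additional) trusted collection to `directory`. The directory itself never stores entries of its own.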

Of course this is all way too indirect for any normal person to grok, which is why nothing has been added to the global collection. Even many IT-savvy folks, I’m finding, don’t take naturally to this model.

That said, I’m finding that once I can get people to walk through one of these experiences, and see the connection — OK, I do this over here, and that happens over there, and it can also happen somewhere else, and I’m in control — the light bulb does go on.

Now we need to take forward-thinking evangelists like me out of the loop, and get people to discover for themselves how to wire the web. If Live Clipboard didn’t exist, we’d have to invent it. Oh wait. It doesn’t, and we do.

Mind, hands, and heart: John Leeke on Internet video for sharing knowledge about historic home preservation

1 Dec 2008 / 5 Dec 2008 ~ Jon Udell ~ 10 Comments

This week’s ITConversations show suffered a tragic glitch that rendered the audio unusable, but I was able to transcribe it as text. My guest is John Leeke, a carpenter who takes care of old buildings and shares his knowledge of the tools and best practices involved in doing that. His methods of sharing have evolved over many years. He started in the early 1980s as a writer for magazines like Old House Journal and Fine Woodworking, transitioned to Internet publishing when that became possible, and more recently has become a leader in the use of Internet video to communicate knowledge that’s embodied, as he likes to say, in the mind, the hands, and the heart.

His approach to Internet video exemplifies and weaves together a number of themes that I’ve focused on in recent years, including narration of work, online apprenticeship, tacit knowledge, screencasting to document our work in the virtual world, and video to document our work in the physical world.

JU: We got introduced by way of the folks at the Open University, whom I met when I visited the UK in January 2007 to speak at the Technology, Knowledge, and Society conference. They were showing me their FlashMeeting videoconferencing system, and they cited you as an example of somebody who’s making very practical use of the medium in your work, which is historic home renovation.

JL: Right. I’d been using FlashMeeting for about a year and a half then. They had singled me out because I wasn’t doing education, or developing the FlashMeeting system, like they were there at the Knowledge Media Institute. I was out in the real world doing things with it, demonstrating the horizontal movement of knowledge.

JU: That absolutely grabbed me. Ever since I got involved in Internet video, I saw there was a huge opportunity for horizontal, or direct, or peer-to-peer transfer of knowledge. In particular, of knowledge that is embodied, literally — it’s in your hands…

JL: It’s in your mind, your hands, and your heart. I’ve been sharing what I know through print media since the early 1980s. I grew up working in my father’s shop, in the 1950s, and then was out in the field working on historic buildings as a preservation carpenter for fifteen years. Then I fell into writing about my work: Homebuilding Magazine, Fine Woodworking, Old House Journal. I got pretty practiced at that by the late 1990s.

JU: You’ve published books too, right?

JL: Yes, I’ve self-published a series on caring for older buildings. Through the 1990s I knew that video would be important for my work, but I never came around to publishing anything in video. I didn’t have the time or dollars to put into it. But by 2003 and 2004, it was getting streamlined enough and easy enough to do over the Internet.

JU: As much use of online video as there is, I think we’ve barely scratched the surface when it comes to the sort of sharing of practical knowledge that you’ve been doing.

JL: It’s starting to happen. Just yesterday a colleague sent me a link to a YouTube video about how to draw and sketch the classical forms, like Ionic capitals. It was an architect showing how he sketched, and how he developed a balustrade for a fancy classical building. It showed him actually doing it. This wasn’t happening in the 1990s. You could do it, but it was a huge expensive production. Now you can do it for a couple of hundred dollars, and sometimes even less.

JU: Of course there’s still the question of why someone would do this. And in fact, the theme of the talk I gave at that conference was network-enabled apprenticeship. The idea was that throughout human history, people have learned trades and crafts by direct observation and imitation.

JL: Yeah, workers working side by side. And it’s more than observation. It’s the guiding of hands that makes that work. Internet video, even when it’s live, doesn’t get you all the way there. But it’s certainly a dramatic next level beyond print media, that’s for sure.

Expositional work online — presentation of words and pictures and even videos — it’s all presentational. Someone develops it, and as a separate event in time someone else comes and watches and learns. But when it’s live and interactive, that’s when you jump to the next level. Being there in person is best, of course, but this is a really valuable and powerful intermediate level because it opens up access to many more people than I can get together with personally, side by side.

JU: Can you give an example?

JL: In our work we’re often restoring old windows. This is the time of year when you have to take care of them. One of the details of that work is reglazing, where the glass meets the sash — the wooden frame that slides up and down. There’s a material called glazing compound, or putty, and it’s easy enough to use so that any handy person can do it, but it’s hard to get it so that it looks nice and smooth and even, if you haven’t done it before. Once you learn, it’s a cinch. And it’s easy to show someone how. I’ve taught eight-year-olds and eighty-year-olds how to run a perfect line. But you can’t do that with even a detailed series of photos.

JU: And you’ve tried…

JL: Yes, I’ve written three or four articles over the years, and each one is better, and you can learn a certain kind of thing from print and photos. You can learn what kind of putty to use, you can learn how to hold the putty knife. But until you see a putty knife in motion, and can respond in realtime — adjust the angle, a little more pressure — you can get it in thirty seconds if you’re side by side, and in a few minutes over interactive video.

JU: So you’re talking about a couple of levels here. The first is direct observation and imitation. My first revelation on that front was when I had to fix an old HP laser printer. I found a parts kit online that came with a video on a CD, and it enabled me to successfully disassemble and reassemble that printer. Later I realized there was no other way I could have done the job successfully. No written instruction would have gotten me there.

JL: Right, that’s one level and it works well when the printer you’re repairing is just like the one in the video. And when the job involves mechanical parts that lock and fit together.

But with the window putty, it’s different. You’re working with a plastic material. It’s as if you had to make those printer parts yourself. It’s basic stuff, not manufactured stuff.

JU: The motor skills are subtler, and the nonverbal communication is more critical.

JL: Right, and with the nonverbal communication as well as the visual, you really need to be able to go back and forth between the learner and the teacher. If you can do that within seconds — or if you’re standing next to someone, microseconds — that feedback between the eyes, the mind, the hand, the muscles, the tool, the material the tool is shaping — that’s how they learn so fast in person. And it can happen in seconds when you’re doing interactive video over the Internet.

JU: What’s your setup for doing these interactive training sessions over the Internet?

JL: I take my notebook computer, plug in my Sony HandyCam, and shoot whatever it is we’re teaching or discussing. It’s getting to the point where it’s all plug and play, and if I can do it, many people can.

JU: So that’s the broadcast piece of it, what’s the setup for interacting with people who are following along?

JL: That happens on a page at my website, HistoricHomeworks.com. Other people log in there to the FlashMeeting system, and if they have camera and audio at their end, I can see and hear them. Typical numbers are two or three participants, up to eight or ten. The live sessions are also catalogued for later viewing.

JU: The FlashMeeting system has some interesting features, including a method of visualizing the conversation so you can see who spoke when and for how long.

JL: That helps support the Knowledge Media Institute’s principal mission, which is to study and understand how knowledge spreads from person to person around the world. The analytical features built into FlashMeeting serve that mission.

It fascinates me. For example, you can see displayed on a map of the world the locations of viewers of these recorded sessions showing how to restore historic windows, or painting and restoring exterior woodwork. I can see where the interest is, and it turns out that people everywhere care about this stuff, because there are wooden buildings all around the world. On six of the seven continents there are people using these videos streaming from my office in Portland, Maine. At KMI they joke that they’re waiting for someone to start watching in Antarctica.

JU: It’s an interesting point because in the world of online media there’s a lot of emphasis on what’s new, but you’re operating out on the long tail. Your piece on interior storm windows was very relevant to me because I just went through the exercise of doing the stretch-and-seal method, and your demonstration of how to build reusable interior storms really got my attention.

That’s an idea a person might never encounter. But if you do, it doesn’t matter when. The publishing world calls this evergreen content: it’s valuable anytime.

JL: Right. There’s also a discussion on my website about this topic. It’s more expositional — words and pictures — and that goes hand in hand with the video. One of the limitations of the FlashMeeting system is that I can’t annotate the video, after the fact, with links to those materials.

JU: A lot of folks will look at this and say, OK, John Leeke is an unusual guy. He doesn’t just do the work, he also documents the work, and that’s great for him, but it’s not really relevant to most people who won’t have the time or inclination. For them, this process seems tangential.

But I think that’s often untrue. Here’s an example. I have a pellet stove, and there are a couple of maintenance procedures that I frankly screwed up the first time through because I didn’t absorb the understanding of how to do them from the manual. What struck me was that once I knew how to do it, I could have illustrated these procedures with a couple of five minute videos. And maybe I should just do that myself. But the thing is, if I’m the dealer, and I’m getting complaints from customers who are buying these things and then failing to understand the manual and screwing things up, it’s very much in my interest to do some of my own video documentation.

JL: Of course. And by the way, I’m not special. I’m just a carpenter up here in Maine, taking care of my own house. It just turns out that my work is also helping other people to take care of their houses. Well, yes, it’s not unique but special that I have this compulsion to share what I’m learning and figuring out. But the ability to share it — well, no matter who you are, if your neighbor sees you fixing your windows, and comes over and knocks on your door and asks about how to do it, you would show him. This is just an extension of that. Now we can have neighbors further afield.

JU: Yes. There was a time when the work people did was visible. You saw what they did.

JL: You saw what the people next to you did.

JU: That’s right. And you understood what the different kinds of work were, because you saw people doing that work. But then, in the industrial age, dad went off to work, he disappeared in the morning, and showed up again at the end of the day, and work was a black box. Who knew what dad did?

JL: That’s the industrial disconnect. And there’s a disconnect on the marketing side as well. Through the last half of the 20th century, as the industrial revolution gears up to grind itself into nothing — which is now happening — the method of marketing to more people than needed stuff was to disconnect the people from each other, so that everybody needed something, instead of sharing with their family or neighbors. Everybody needed their own lawnmower. But you figure your lawnmower is sitting idle in your garage for 99% of its time. One lawnmower could easily mow everybody’s lawn on the block.

But that’s the consumer culture that was developed by manufacturers. So very few people now know how to run that glazing compound to seal the glass to the wooden frame. This is purposeful. They don’t want people to know how to run glazing because that limits the market for vinyl plastic imitation windows.

So I only have one person on the block I can teach locally, but I can connect with more people with interactive video. Because of the access to the long tail, I can be teaching lots of people who need to know that.

JU: Here’s another aspect I wanted to ask you about. When it’s hard to see how work is done, it’s hard to know what it’s like to be a person who does that kind of work. Unless it’s in the family, you won’t see it, and even then you probably won’t. You don’t have the family or community scope in which to see other kinds of work being done. And lacking that, you can get pretty far down an educational path before you realize that the path isn’t for you at all.

JL: Right. So, I’ve been focused on task-specific demonstration, but you’re talking about another thing that’s happening with video over the Internet — life blogging, or life broadcasting. I don’t think anybody’s doing that as a tradesperson. What is it like to wake up at 4:30 AM, so you can be on the site working on the windows, all day long, and then get in your pickup truck and drive back home? As you say, a lot of people could go all the way through school, and study building construction at the college level, and then take specialty courses in historic carpentry work, and by the time they’re in their early 20s they’re well-educated and have a good set of hands-on skills — and then realize that they don’t like to get up early in the morning.

JU: You’ve painted the downside, and that’s fair, people should understand that, but on the upside, the life blogging should also communicate how you feel when you drive by a house that you’ve restored, and how you know the people living there feel as a result of the work you’ve done.

JL: Absolutely. This is the heart side of the work that the industrial revolution leaves out. It boils everything down to mind and hand, and leaves out the heart. That is the heart side, when you drive by those buildings you helped restore, last month or last year or 20 years ago. It is the reason why we get up early in the morning to go to work. You know that you’re helping people who live in and use those buildings.

JU: Now there are certainly many people who will feel that these methods they get paid to practice are proprietary knowledge they wouldn’t want to reveal. My argument is that in a lot of cases, by demonstrating expertise you’ll attract more work than you lose, and that it’ll often be more interesting and rewarding work. What’s your experience?

JL: Both of those ideas do play strongly in the building trades. It’s a real tradition to keep secrets. Going back hundreds and hundreds of years, with the guild systems, there were ways to control the sharing of that kind of knowledge. And it’s still the case. Not every plasterer who can do those decorative Ionic capitals wants everybody to know exactly how they do it. But they do want everybody to know that it can be done.

You’re right, this is how artisans can do good marketing — by letting people know what is involved, by showing some of these methods, and they don’t have to give up all their secrets in order to do that. But you can help people to understand that it’s not just a machine spitting out product, it’s people making stuff with their minds and their hands and their hearts.

That’s another part of how I use Internet video. I go to some of my colleagues’ shops, as well as my own, and show what this is all about, because it is not well understood by the public. Video can get to the nuances of the heart side of this work.

JU: Also, if you can show me how to take care of some basic things for myself, maybe I can turn around and hire you to do something really special.

JL: Yeah. I’m hoping that we’re now in a post-modern cultural movement, which is what I think you’re talking about. Back in the 1970s I was already working in this realm of making fine things by hand, and there was a groundswell of interest. That’s when Alex Haley’s Roots phenomenon happened. It was important because it touched the hearts of people in America. That’s really what our restoration work is about, it’s the connection with the people who once lived in these buildings. It wasn’t the national trust and the President telling us to save buildings, it was people who wanted to save them because their grandfathers built them.

JU: So where do you fall along the continuum of trade secrets and knowledge sharing?

JL: I’m at the extreme end of sharing everything I know. I’m a one-person microbusiness and always have been. I grew up in the midwest where sharing what you knew, and helping people, was what life was about, for everybody. That was the culture. It was a natural for me. It didn’t seem like it was worth keeping secrets.

My dad said that if you want to do well in trades, you have to let people know what you do. This is what it’s been all about for me — letting enough people know.

JU: And you have found incredible marketing power in doing what you do?

JL: Oh yeah. As I was working as a tradesperson in the 70s, and a contractor in the 80s, I made a shift because I’d been doing a good job of documenting my work. That’s something else I learned from my father. I also had the documents he created for his work, going back to the 20s, this huge information resource that I had to share.

JU: Really? What did he document?

JL: He documented his work in the arts and trades. He was a commercial artist through the 20s, then shifted into furniture and buildings at the craftsman/artisan level.

JU: And he left behind detailed logs of his practice?

JL: Yeah, detailed files of every project he ever worked on. So I learned that as part of my carpentry and woodworking, growing up in his shop, and continued it when I left his shop and came east to work on old buildings. So by the early 1980s I had this whole backlog of my own work to share. And by sharing it, I created extraordinary interest in my work. Back then it was through the print media — Fine Woodworking, Old House Journal, Fine Homebuilding — and a lot of people learned about the work I was doing to restore columns on old porches, saving windows, doing woodwork repairs. When I learned something I thought was worth sharing, I’d write an article about it. The editors loved it, and their readers did too, it was the authentic stuff, what was really going on out there in the field.

With that body of knowledge, by the late 1980s I was consulting on projects, helping people solve problems with their buildings. That meant I could be on even more projects, helping more people, and if I was writing about what I was learning, then each project was an order of magnitude larger. If I’m doing hands-on work on buildings I might only be helping a few people. If I’m consulting, it might be tens of people. If I’m writing, we figured ten or fifteen thousand people were using my articles. Each is a jump in magnitude. Then of course the Internet, where I got an early start in 1994 and 95.

JU: I’m sure a lot of folks will look at your example and feel that, since they’ll never become featured writers for magazines, there’s no point in doing this kind of sharing in a more modest way. But I think there’s benefit at any level of engagement. You’ve clearly thought through the dynamics of the communication pattern here: one-to-many, multi-level distribution. But for a lot of people, even with electronic media, that isn’t obvious. They’ll still spend a lot of time doing one-to-one communication. They’ll write something up, they’ll even take some pictures, but then they’ll just email that to somebody else.

JL: Two birds with one stone. I realized that if I wanted to accomplish the things I want to get done in my life, I have to get more than one result for every action or activity. The print — and now online — publications that I do are my marketing program, so I don’t have to spend money on advertising. And now you call, and want to talk with me, and if I was only getting one benefit from that, I wouldn’t be able to say yes. But I can already see two or three things that’ll come from talking to you, so I can say yes.

Say I’m thinking of taking on a project to help my neighbor rebuild her front steps. OK, I can earn some money. And I can take a series of photos for a print article, and that’ll bring some more income but it’ll also help with my personal goal of sharing more, and then I can easily shoot a little video that I can broadcast on the Internet and that will help an astonishing number of people. So I can’t say no, because I’m getting multiple benefits. But I would have to say no if the only benefit was getting paid to fix the steps.

JU: You’ve really thought it through.

JL: The key is that the video camera and the computer and the Internet are just tools, no different from my table saw and push stick, or my old wooden hand plane. They’re all just tools, and they’re all in the same kit for me, and I’m a tool user, and I help people with their old buildings.

How can people do this? I’ve found a balance. Instead of watching television, I make television.

JU: Well said. Thanks John!