The social scripting continuum

Back in June, IBM’s Tessa Lau joined me on my ITConversations podcast to discuss Koala, “a system for recording, automating, and sharing business processes performed in a web browser.” The service is now available on the AlphaWorks site as CoScripter, where the first script I tried was Tessa’s own Update your Facebook status. Here is the text of the script as it appears in the CoScripter wiki:

* go to "http://www.facebook.com"
* enter your "e-mail address" (e.g. tlau@tlau.org) into the "Email:" textbox
* enter your password into the "Password:" textbox
* click the "Login" button
* click the "Profile" link
* click the "Update your status..." link
* enter your status into the status field

Interestingly, there was a bug in that script. The fourth step was originally:

* click the "Password" button

Because there is no button labeled “Password” on Facebook’s login page, the script failed.[1] When I made the change from “Password” to “Login” in the CoScripter sidebar, I simultaneously fixed the script and added the corrected version to the wiki. After posting this entry, I added a comment to the wiki that points back here. All in all, it’s a nice illustration of the emerging style of social programming that we also see in applications like Yahoo! Pipes and Popfly.

As Tessa explains in the podcast, many scripts, including this Facebook example, require secrets, notably usernames and passwords. You can conveniently record these as name/value pairs stored in a personal database. I have two observations about that. First, secrets appear to be stored remotely. If so, I’d prefer to keep them local. (Update: They are indeed local; see Tessa’s comment below.) Second, there should be a way to qualify them by domain, because names like “Email Address” and “Password” will soon become overloaded.
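To make the domain-qualification idea concrete, here is a minimal sketch in Python. It is not how CoScripter resolves names (as the comment thread below shows, CoScripter matched the exact name the script asked for). The "name = value" file layout follows the entries quoted in the comments, and the function names and the rule for deriving a site label from a URL are assumptions made up for illustration.

```python
from urllib.parse import urlparse


def parse_personal_db(text):
    """Parse 'name = value' lines into a dict keyed by lowercased name."""
    entries = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        entries[name.strip().lower()] = value.strip()
    return entries


def site_label(url):
    """Hypothetical rule: 'http://www.facebook.com' -> 'facebook'."""
    host = urlparse(url).hostname or ""
    parts = [p for p in host.split(".") if p != "www"]
    return parts[0] if parts else ""


def lookup(entries, name, url):
    """Prefer a site-qualified entry, then fall back to the bare name."""
    qualified = f"{site_label(url)} {name}".strip().lower()
    if qualified in entries:
        return entries[qualified]
    return entries.get(name.lower())


# Sample entries; the values are placeholders, not real addresses.
db = parse_personal_db("""
facebook e-mail address = address1
e-mail address = address2
""")

facebook_value = lookup(db, "e-mail address", "http://www.facebook.com")
other_value = lookup(db, "e-mail address", "http://example.org")
```

With this rule, a script running against Facebook picks up the qualified entry, while the same step on any other site falls back to the generic one, so a single name like “Password” no longer has to serve every site.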

One of the delightful things about CoScripter is the simple and natural language used to express sequences of actions. It looks just like the instructions an ordinary user would write down for another ordinary user to follow. By embedding those instructions in an interpreter that makes it easy for anyone to run and debug them step by step, and by reflecting them into a versioned wiki, CoScripter creates a rich environment in which people can record, exchange, and refine their operational knowledge of web applications.
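To illustrate how plain instructions like these can become something an interpreter steps through, here is a toy parser in Python. It is only a sketch: CoScripter's real interpreter is far more forgiving, and the `Step` class, the regular expressions, and the function names below are assumptions that cover only the step shapes in the Facebook script above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    action: str                      # "go", "click", or "enter"
    target: str                      # URL or on-screen label
    kind: Optional[str] = None       # "button", "link", "textbox", "field"
    variable: Optional[str] = None   # personal-database name, for "enter"


# Hypothetical patterns; they handle only the phrasing used in this script.
GO = re.compile(r'^go to "(?P<url>[^"]+)"$')
CLICK = re.compile(r'^click the "(?P<label>[^"]+)" (?P<kind>button|link)$')
ENTER = re.compile(
    r'^enter your (?P<what>.+?) into the (?P<label>.+?) (?P<kind>textbox|field)$'
)
EXAMPLE = re.compile(r'\s*\(e\.g\.[^)]*\)')  # strips "(e.g. tlau@tlau.org)"


def unquote(text: str) -> str:
    return text.strip().strip('"')


def parse_step(line: str) -> Step:
    """Turn one bulleted instruction into a structured Step."""
    text = line.lstrip("* ").strip()
    if m := GO.match(text):
        return Step("go", m["url"])
    if m := CLICK.match(text):
        return Step("click", m["label"], m["kind"])
    if m := ENTER.match(text):
        variable = unquote(EXAMPLE.sub("", m["what"]))
        return Step("enter", unquote(m["label"]), m["kind"], variable)
    raise ValueError(f"unrecognized step: {line!r}")


SCRIPT = '''
* go to "http://www.facebook.com"
* enter your "e-mail address" (e.g. tlau@tlau.org) into the "Email:" textbox
* enter your password into the "Password:" textbox
* click the "Login" button
* click the "Profile" link
* click the "Update your status..." link
* enter your status into the status field
'''

steps = [parse_step(line) for line in SCRIPT.splitlines() if line.strip()]
```

Note that the buggy step, click the "Password" button, parses just as cleanly as the corrected one. The error only shows up when the step runs against a page that has no such button, which is why running and debugging step by step matters.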

Currently CoScripter is a creature of the web, and specifically of a Firefox-based, Flash-free web. Adapting it to another browser would be hard but doable. Adapting it to work with RIA (rich Internet application) plug-ins like Flash or Silverlight is really problematic, though, because RIA plug-ins don’t mesh very well with the web’s RESTful style.

There are minor exceptions. Back in 2004 I raised that issue in terms of Flash, and Adobe’s Kevin Lynch showed how to materialize URLs for states within a Flash application. But this doesn’t occur normally and naturally when you write a Flash application, as it does when you write a web application. Or rather, as it used to when you wrote a web application, because AJAX also tends to hide an application’s URL namespace.

Because the same issue is going to come up all over again in the context of Silverlight, now would be a good time to think about how Silverlight apps can expose automation interfaces that cooperate with the RESTful web they’re part of.

With any flavor of web application, whether it’s based on simple HTML and JavaScript, or enriched with AJAX, or turbocharged with Flash or Silverlight, it would be great not only to be able to automate as CoScripter can, but also to share and collaboratively refine the scripts. How can we best assure that possibility? Tessa Lau thinks that web accessibility guidelines represent our best hope. If CoScripter-style automation were to catch on it would be a further incentive to adopt those guidelines, and would likely reshape them in useful ways as well.

But why stop there? In principle there’s no reason why desktop applications can’t play the same game, and there are compelling reasons why they should. Today, for example, I found the answers to the 25 top “How do I?” questions asked about Word. Those answers are pointers to articles in the Microsoft knowledge base. For the ever-popular “How do I create mailing labels?”, the answer includes instructions like these:

  1. Open the document in Word, and then start the mail merge. To start a mail merge, follow these steps, as appropriate for the version of Word that you are running:
    • Microsoft Word 2002:
      On the Tools menu, click Letters and Mailings, and then click Mail Merge Wizard.
    • Microsoft Office Word 2003:
      On the Tools menu, click Letters and Mailings, and then click Mail Merge.
    • Microsoft Office Word 2007:
      On the Mailings tab, click Start Mail Merge, and then click Step by Step Mail Merge Wizard.
  2. Under Select document type, click Labels, and then click Next: Starting Document. Step 2 of the Mail Merge appears.
  3. Under Select starting document, click Change document layout or Start from existing document. With the Change document layout option, you can use one of the mail-merge templates to set your label options. When you click Label options, the Label Options dialog box appears. Select the type of printer (dot matrix or laser), the type of label product (such as Avery), and the product number. If you are using a custom label, click Details, and then type the size of the label. Click OK. With the Start from existing document option, you can open an existing mail-merge document and use that as your main document.
  4. Click Next: Select Recipients

The resemblance to CoScripter’s step-by-step instructions is striking. Why shouldn’t instructions like these be able to drive Word’s automation interfaces? Why couldn’t users create and share their own instructions? Sure, it’s a desktop application, but nowadays that’s just an endpoint along a continuum of application styles (HTML, JavaScript, AJAX, RIA, desktop app), all of which are connected and can communicate. Collaborative automation is just one of many opportunities to exploit that ability to communicate, but it’s a huge one.
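As a rough illustration of how close the knowledge-base phrasing already is to something machine-readable, here is a small Python sketch that turns the menu instructions quoted above into an ordered click path. It covers only the parsing half and does not call any Word automation interface. The function name and the pattern are assumptions that fit the "On the X menu, click Y, and then click Z." shape and nothing more.

```python
import re

# Hypothetical pattern for "On the <name> menu|tab, click A, and then click B."
INSTRUCTION = re.compile(
    r'^On the (?P<root>.+?) (?P<kind>menu|tab), (?P<rest>.+?)\.?$'
)


def menu_path(instruction):
    """Return the ordered list of labels to click, starting at the menu or tab."""
    match = INSTRUCTION.match(instruction.strip())
    if match is None:
        raise ValueError(f"unrecognized instruction: {instruction!r}")
    path = [match["root"]]
    for part in match["rest"].split(", and then "):
        part = part.strip()
        if part.startswith("click "):
            part = part[len("click "):]
        path.append(part)
    return path


word_2002 = menu_path(
    "On the Tools menu, click Letters and Mailings, "
    "and then click Mail Merge Wizard."
)
word_2007 = menu_path(
    "On the Mailings tab, click Start Mail Merge, "
    "and then click Step by Step Mail Merge Wizard."
)
```

A path like this is the kind of structure an automation layer could walk, one label at a time, in the same way CoScripter walks a page's links and buttons.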


[1] I suspect that Tessa planted that bug intentionally to see if we were paying attention!

18 thoughts on “The social scripting continuum”

  1. Tessa Lau

    Thanks for the thought-provoking write-up, and for fixing my script. :)

    The personal database is stored as a text file in your Firefox profile directory, not on the server. It will be interesting to watch how the vocabulary of personal database entries evolves over time. I imagine people will start creating a namespace for things like email addresses and passwords, so for example you’d call it a “facebook login” and “facebook password”. Different scripts could then refer to different entries by name. But I expect this to be driven by the community and how they decide to use this tool.

    I absolutely agree with you about scripting desktop applications. I’d love to see CoScripter’s “sloppy programming” approach be used to control all sorts of applications, and not just on the desktop. Could we use it to program VCRs? Or teach our parents how to use newfangled cellphones?

    But I think what’s most interesting is not CoScripter itself, but what it enables. We’ve watched Facebook grow from an application into a platform for developing social-network applications. I can’t wait to see what people build on the CoScripter platform.

  2. Jon Udell Post author

    “I imagine people will start creating a namespace for things like email addresses and passwords, so for example you’d call it a “facebook login” and “facebook password”.”

    How does that work, though? It seems you have to exactly match what the script calls for. So for example, in this case:

    * enter your “e-mail address” (e.g. tlau@tlau.org) into the “Email:” textbox

    I included these entries in my personal database:

    facebook e-mail address = address1
    e-mail address = address2

    What matched was address2.

  3. Tessa Lau

    Take a look at this script: http://services.alphaworks.ibm.com/coscripter/browse/script/859

    Someone else in the community has already made a copy of my script that uses the variables “facebook e-mail” and “facebook password” instead. So that’s what I mean by getting the community to standardize on variable naming. People who favor using site-specific names will tend to favor that script over mine, and that script will become more popular. And I’m hoping that eventually conventions will arise over what exactly to name things so that people can reuse variables across scripts.

  4. Jon Udell Post author

    “Someone else in the community has already made a copy of my script that uses the variables “facebook e-mail” and “facebook password” instead. So that’s what I mean by getting the community to standardize on variable naming.”

    Oh. Duh. I gotcha now :-)

  5. Mike

    I came across this post while searching for other people’s experiences with CoScripter. I find the natural language approach truly amazing, but the execution… well, it simply does not work with 90% of the websites that I use.

    On the other hand, for the last few months I have been using iMacros, which is a Firefox extension very similar to CoScripter. It uses the “classical” record & replay approach. With iMacros I have been able to automate about 60% of all websites using the visual recording only, and another 30% I got working after tweaking the recorded macro manually.

    >Because there is no button labeled “Password” on Facebook’s login page, the script failed….
    If you have to fix something like this manually, I do not think this is “natural” language processing. Every human would have executed the script correctly.

  6. engtech

    Thanks for pointing out CoScripter! This looks exactly like the kind of automation stuff I find interesting.

    But unfortunately it has “failure” written all over it — because you’re required to sign up for an IBM ID before downloading the extension or using the site.

    This would be a great tool for explaining actions to non-web-savvy users, but the requirements to get it running are such that they’ll never use it!

    Step 1: Long registration for an IBM ID

    Step 2: Install a Firefox extension

    Step 3: Load and run the script

    That’s way too complicated for the audience base that would find such scripts useful.

  7. Pingback: Automation and accessibility « Jon Udell

  8. Pingback: » CoScripter looks very cool » business|bytes|genes|molecules

  9. Nigel Snoad

    This looks like useful fun. Useful if we can indeed link it up to Pipes and the like. More useful if applications get these kinds of hooks built in, à la AppleScript. Very useful for things like unit-testing web apps.

    As you note, it’s very similar to how you write howtos, or for my line of work, how I write acceptance tests and specifications/scenarios.

    This, for me, is the strength of rules engines. And you can prove they don’t break.

  10. zaidimai

    [Quotes Tessa Lau’s comment of September 6, 2007 @ 5:27 pm in full.]

    I can’t understand how to do it.

  11. Pingback: Peter Van Dijck’s Guide to Ease » Blog Archive

  12. Chui Tey

    Hmmm, this is kind of a workaround for web applications that have functionality that isn’t URL-addressable. Wikipedia doesn’t have this kind of problem, e.g. http://en.wikipedia.org/w/index.php?title=Flood&action=edit&section=6 brings one to the edit screen.

    I’ve been toying with similar concepts for rich client applications. Using URL monikers, data inside rich client apps can be directly accessed, and forms and dialogs too. These URLs can be emailed from one user to another. What’s not to love about this kind of application?

  13. Pingback: Demo of Ubiquity for Firefox (by Aza Raskin) — Video Archive

  14. Pingback: Scratchpad

  15. Pingback: From screencasting to automation « Jon Udell

  16. Pingback: Jim Rohn Motivation
