In The Chess Master and the Computer, Garry Kasparov famously wrote:
The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.
The title of his subsequent TED talk sums it up nicely: Don’t fear intelligent machines. Work with them.
That advice resonates powerfully as I begin a second work week augmented by GitHub Copilot, a coding assistant based on OpenAI’s Generative Pre-trained Transformer (GPT-3). Here is Copilot’s tagline: “Your AI pair programmer: get suggestions for whole lines or entire functions right inside your editor.” If you’re not a programmer, a good analogy is Gmail’s offer of suggestions to finish sentences you’ve begun to type.
In mainstream news the dominant stories are about copyright (“Copilot doesn’t respect software licenses”), security (“Copilot leaks passwords”), and quality (“Copilot suggests wrong solutions”). Tech Twitter amplifies these and adds early hot takes about dramatic Copilot successes and flops. As I follow these stories, I’m thinking of another. GPT-3 is an intelligent machine. How can we apply Kasparov’s advice to work effectively with it?
Were I still a tech journalist, I'd be contributing to the first wave of hot takes. Now I spend most days in Visual Studio Code, the environment in which Copilot runs, working most recently on analytics software. I don't need to produce hot takes; I can just leave Copilot running and reflect on notable outcomes.
Here is one such outcome. Without Copilot I would have needed a broad search to recall the name of Python's date-formatting function: strftime. Then I'd have needed a narrower search to find the recipe for printing a date object in a format like Mon Jan 01. A good place for that search to land is https://strftime.org/, where Will McCutchen has helpfully summarized several dozen directives that govern the strftime function.
Here’s the statement I needed to write:
day = day.strftime('%a %b %d')
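Here are the needed directives as they appear in the documentation: %a is the abbreviated weekday name, %b is the abbreviated month name, and %d is the zero-padded day of the month.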
To prime Copilot I began with a comment:
# format day as Mon Jun 15
Copilot suggested the exact strftime incantation I needed.
This is exactly the kind of example-driven assistance that I was hoping @githubcopilot would provide. Life's too short to remember, or even look up, strptime and strftime.
(It turns out that June 15 was a Tuesday, that doesn't matter, Mon Jun 15 was just an example.) pic.twitter.com/a1epnaZRF9
— Jon Udell (@judell) July 5, 2021
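For readers who want to try the directives themselves, here is a minimal sketch. It assumes Python's default C locale (the %a and %b names are locale-dependent), and the variable names are only illustrative; it is not the code from my project.

```python
from datetime import date

# The example date from the tweet: June 15, 2021.
day = date(2021, 6, 15)

# %a = abbreviated weekday name, %b = abbreviated month name,
# %d = zero-padded day of the month.
formatted = day.strftime('%a %b %d')

print(formatted)  # Tue Jun 15 (under the default C locale)
```

As the tweet notes, June 15, 2021 was a Tuesday, so the output reads Tue rather than the Mon used in the priming comment.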
Now it’s not hard to find a page like Will’s, and once you get there it’s not hard to pick out the needed directives. But when you’re in the flow of writing a function, avoiding that context switch doesn’t only save time. There is an even more profound benefit: it conserves attention and preserves flow.
The screencast embedded in the above tweet gives you a feel for the dynamic interaction. When I get as far as # format day as M, Copilot suggests MMDDYYY even before I write Mon, then adjusts as I do that. This tight feedback loop helps me explore the kinds of natural examples I can use to prime the system for the lookup.
I’m reminded of Language evolution with del.icio.us, from 2005, in which I explored the dynamics of the web’s original social bookmarking system. To associate a bookmarked resource with a shared concept you’d assign it a tag broadly used for that concept. Of course the tags we use for a given concept often vary. Your choice of cinema or movie or film was a way to influence the set of resources associated with your tag, and thus encourage others to use the same tag in the same way.
That kind of linguistic evolution hasn’t yet happened at large scale. I hope Copilot will become an environment in which it can. Intentional use of examples is one way to follow Kasparov’s advice for working well with intelligent systems.
Here’s a contrived Copilot session that suggests what I mean. The result I am looking for is the list [1, 2, 3, 4, 5].
l1 = [1, 2, 3]
l2 = [3, 4, 5]
# merge the two lists
l3 = l1 + l2 # no: [1, 2, 3, 3, 4, 5]
# combine as [1, 2, 3, 4, 5]
l3 = l1 + l2 # no: [1, 2, 3, 3, 4, 5]
# deduplicate the two lists
l1 = list(set(l1)) # no: [1, 2, 3]
# uniquely combine the lists
l3 = list(set(l1) | set(l2)) # yes: [1, 2, 3, 4, 5]
# merge and deduplicate the lists
l3 = list(set(l1 + l2)) # yes: [1, 2, 3, 4, 5]
The last two Copilot suggestions are correct; the final (and simplest) one would be my choice. If I contribute that choice to a public GitHub repository, am I voting to reinforce an outcome that's already popular? If I use the second comment (combine as [1, 2, 3, 4, 5]) instead, am I voting for a show-by-example approach (like Mon Jun 15) that isn't yet as popular in this case but might become so? It's hard for me, and likely even for Copilot itself, to know exactly how Copilot works. That's going to be part of the challenge of working well with intelligent machines. Still, I hope for (and mostly expect) a fruitful partnership in which our descriptions of intent will influence the mechanical synthesis even as it influences our descriptions.
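One caveat about the contrived session: a Python set does not promise any particular order, so list(set(...)) happens to come out sorted for these small integers but shouldn't be relied on in general. The sketch below is my own illustration, not a Copilot suggestion, and the variable names are hypothetical. It compares plain concatenation with two ways of making the order explicit.

```python
l1 = [1, 2, 3]
l2 = [3, 4, 5]

# Plain concatenation keeps the duplicate 3.
concatenated = l1 + l2

# Deduplicate through a set, then sort to make the order explicit.
sorted_unique = sorted(set(l1 + l2))

# Deduplicate while preserving first-seen order (dicts keep
# insertion order in Python 3.7 and later).
ordered_unique = list(dict.fromkeys(l1 + l2))

print(concatenated)    # [1, 2, 3, 3, 4, 5]
print(sorted_unique)   # [1, 2, 3, 4, 5]
print(ordered_unique)  # [1, 2, 3, 4, 5]
```

Which variant fits depends on whether the merged list should be sorted or should keep the order in which items first appeared.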