Don’t Repeat Yourself (DRY) is a touchstone principle of software development. It’s often understood to inveigh against duplication of code. Copying a half-dozen lines from one program to another is a bad idea, DRY says, because if you change your mind about how that code works, you’ll have to revise it in several places. Better to convert those lines of code into a function that you write once and reuse.
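As a minimal sketch of that advice, here is what "write once and reuse" might look like in Python. The function names and the email-cleanup scenario are hypothetical, chosen only for illustration; the point is that the cleanup rule lives in one function instead of being pasted into each caller.

```python
def normalize_email(raw: str) -> str:
    """The one authoritative place where email cleanup is defined."""
    return raw.strip().lower()


def register_user(raw_email: str) -> dict:
    # Previously this function carried its own pasted copy of the cleanup logic.
    return {"email": normalize_email(raw_email)}


def find_user(raw_email: str, users: list) -> list:
    # Same here: the pasted copy is replaced by a call to the shared function.
    email = normalize_email(raw_email)
    return [user for user in users if user["email"] == email]
```

If the cleanup rule changes later, only normalize_email needs to be revised.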
More broadly, the DRY principle asserts:
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
Code and data are two kinds of knowledge that ought to be represented canonically, and repeated — if at all — only by mechanical derivation, never by variation.
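One way to picture "mechanical derivation" is a single definition from which every other representation is generated. The sketch below assumes a made-up list of fields (FIELDS is hypothetical, not taken from any real schema) and derives both a SQL statement and a CSV header from it, so neither can drift from the original.

```python
# Single authoritative definition (a hypothetical schema, for illustration only).
FIELDS = [("id", "INTEGER"), ("email", "TEXT"), ("created", "TEXT")]


def create_table_sql(table: str) -> str:
    """Derive a CREATE TABLE statement from FIELDS."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in FIELDS)
    return f"CREATE TABLE {table} ({columns})"


def csv_header() -> str:
    """Derive a CSV header line from the same FIELDS list."""
    return ",".join(name for name, _ in FIELDS)
```

Adding a field to FIELDS changes both outputs at once; nobody edits the copies by hand.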
I often violate the DRY principle by indulging in CopyAndPasteProgramming. In my defense I point to another principle, CodeHarvesting, which defends duplication as a necessary stepping stone.
Letting a duplication of logic live for now, in order to see how to best universalize it at some later point.
For me, at least, that’s what tends to work best. A common theme doesn’t emerge until I’ve seen — and ideally others have seen and reacted to — several variations on that theme. This kind of duplication — deferred universalization — is beneficial, right?
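To make that concrete, here is a small, invented example of letting two variations live side by side before harvesting the common theme. All three function names are hypothetical. Only after the second copy exists does it become obvious that the separator is the one thing that varies.

```python
# Variation 1: written first, for URLs.
def slugify_title(title: str) -> str:
    return title.strip().lower().replace(" ", "-")


# Variation 2: copied and tweaked later, for file names.
def slugify_filename(name: str) -> str:
    return name.strip().lower().replace(" ", "_")


# Harvested version: the difference between the copies becomes a parameter.
def slugify(text: str, separator: str = "-") -> str:
    return text.strip().lower().replace(" ", separator)
```

With the generalization in hand, the two earlier variations can be retired or reduced to thin wrappers around slugify.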
Here’s another kind. In the JavaScript world the dominant engine of reuse is the Node Package Manager (NPM). When I first started using it a few years ago, I was shocked at the amount of duplication it entails. When you install an NPM package, the modules it depends on are copied into a subdirectory. If those modules depend on others, they are copied into yet deeper subdirectories. For even a simple JavaScript program you can end up with a forest of thousands of files.
A similar thing happens in the Python world. It’s a best practice, nowadays, to use a tool called virtualenv to create, for each Python program you run, an isolated environment with the particular Python interpreter and set of modules needed by that particular program. In practice that means, again, copying lots of files.
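For a rough feel of what gets created, the sketch below uses venv, the standard library module that serves a similar purpose, as a stand-in for the virtualenv tool; that substitution is my assumption, and the two are not identical. The function name and the temporary location are hypothetical. The sketch builds an environment without pip and counts the files in it; a real environment with pip and installed packages would hold many more.

```python
import tempfile
import venv
from pathlib import Path


def make_isolated_env(env_dir: Path) -> int:
    """Create a virtual environment and return how many files it contains."""
    # with_pip=False skips bootstrapping pip, which keeps the sketch fast and offline.
    venv.create(env_dir, with_pip=False)
    return sum(1 for path in env_dir.rglob("*") if path.is_file())


def demo() -> int:
    # Throwaway location, used only for illustration and removed afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        return make_isolated_env(Path(tmp) / "demo-env")
```

Calling demo() returns the file count for a bare environment on your machine; the number depends on the platform and Python version.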
Arguably these duplications don’t violate DRY because they are mechanical copies that won’t vary from their originals. But they can! And here too I am prone to indulge in local variation to explore possibilities that might or might not warrant generalization.
While pondering the vices and virtues of duplicative software construction I reread Metamagical Themas, the compendium of Douglas Hofstadter’s columns in Scientific American. (The title is an anagram of Mathematical Games, the column he inherited from Martin Gardner.) In Variations on a Theme as the Crux of Creativity he states the case as plainly as anywhere. At the core of creative thought are “slippery” concepts that we develop in a virtuous cycle of innovation:
Once you have decided to try out a new way of viewing a phenomenon, you can let that view suggest a set of knobs to vary. The act of varying them will lead you down new pathways, generating new images ripe for perception in their own right.
This sets up a closed loop:
– fresh situations get unconsciously framed in terms of familiar concepts;
– those familiar concepts come equipped with standard knobs to twiddle;
– twiddling those knobs carries you into fresh new conceptual territory.
We need to get DRY eventually in order to maintain stable systems. But the countervailing state needn’t be WET (“write everything twice”, “we enjoy typing” or “waste everyone’s time”). Instead I propose DRYWV: Do Repeat Yourself, With Variations.
Every piece of knowledge should have a single, unambiguous, authoritative representation within a system. But how do we arrive at such knowledge? I think we have to DRYWV our way there.
Interesting point about the massive duplication in NPM and virtualenv. Minor correction: Martin Gardner’s column was “Mathematical Games.”
“Mathematical Games” -> Thanks! The funny thing is that I wrote out both titles, one under the other, and transposed one to the other to prove to myself they are anagrams. And then — how appropriate! — Games slipped to Themes.
In other slips related to the letter ‘T’, Martin Gardner has transformed into “Margin Gardner”.
Thanks!
Duplication is appropriate during the red and green stages (of TDD), but should be eliminated in the refactor stage before checking in.
My subjective experience of working on a big C# project for about a year tells me that DRYWV is perhaps the essence of C# programming with design patterns. A large project needs bridges and factories in order to remain stable; bridges and factories promote the use of interfaces; and having an interface implies writing objects whose properties and methods are very similar yet slightly different.
“Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.” Such novel perceptions bear the intention of rhapsodizing reality into shocked silence. This sort of theory is okay if you have twelve or so pieces of knowledge kicking around in your brain, but neither humans nor programs generally qualify. Anyway, what matters is the point at which the amount of code you’d be repeating becomes a tacit weight: the number of lines and the context. The DRY principle is yet another way of restating what beginners realize in their first programs. The KISS principle is much clearer to a person. Or TL;DT: Too long; didn’t type.
DRYWV is a cutesy idea you thought of. Thank you so much for sharing.
Coding and philosophy. Brilliant. :-) Thanks!