On this week’s Interviews with Innovators I spoke with Janis Dickinson, director of citizen science at the Cornell Lab of Ornithology. We talked about several of the lab’s projects that involve collection and analysis of volunteer observations about birds and bird habitats.
Courtesy of the eBird project, for example, here is a view of first sightings of common bird species in New Hampshire. At first glance it might be tempting to see the preponderance of dates in the current decade as an effect of global warming. But to support that interpretation, you’d have to answer a bunch of questions about the evolution of record-keeping over the period, and the distribution, reliability, and bias of volunteer observers.
Extracting signal from noise is, of course, one of the classic bread-and-butter activities of information science. What’s fascinating here is the Web 2.0 angle. Birdwatchers are famously passionate data collectors who develop reputations among their peers. When they contribute their data to eBird — and thence to the Avian Knowledge Network — those reputations can begin to be measured, and used to tune the analysis of a large body of contributed data.
For example, the all-time latest reported sighting of the Nelson’s Sharp-tailed Sparrow in New Hampshire was on Nov 24, 2007, by Michael Harvey. Is that unusually late? And if so, is it credible? To answer these questions, Cornell’s data crunchers can compare what was and wasn’t reported in the region around that time, by observers whose reputations are one kind of signal that emerges from noisy data.
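
To make the idea a bit more concrete, here is a minimal sketch of what a reputation-weighted check might look like. This is not Cornell's or eBird's actual method, which I haven't seen. The `Checklist` type, the `weighted_support` function, the observer names, the reputation weights, and the checklists themselves are all invented for illustration. The sketch simply asks: among checklists filed within a week of the sighting, what reputation-weighted share also reported the species?

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Checklist:
    """One observer's report for one day (hypothetical structure)."""
    observer: str
    day: date
    species: frozenset
    reputation: float  # assumed weight in [0, 1]; how it is derived is out of scope


def weighted_support(checklists, species, target_day, window_days=7):
    """Reputation-weighted share of nearby checklists that include the species.

    Returns None when no checklists fall inside the window, since there is
    then no evidence either way.
    """
    nearby = [c for c in checklists
              if abs((c.day - target_day).days) <= window_days]
    total = sum(c.reputation for c in nearby)
    if total == 0:
        return None
    reported = sum(c.reputation for c in nearby if species in c.species)
    return reported / total


SPARROW = "Nelson's Sharp-tailed Sparrow"

# Made-up checklists, not real eBird data.
checklists = [
    Checklist("observer_a", date(2007, 11, 22), frozenset({SPARROW, "Song Sparrow"}), 0.75),
    Checklist("observer_b", date(2007, 11, 25), frozenset({"Song Sparrow"}), 0.5),
    Checklist("observer_c", date(2007, 11, 26), frozenset({SPARROW}), 0.25),
    Checklist("observer_d", date(2007, 10, 1), frozenset({SPARROW}), 1.0),  # outside the window
]

score = weighted_support(checklists, SPARROW, date(2007, 11, 24))
print(f"Reputation-weighted support: {score:.2f}")
```

A score near 1 would mean that trusted observers in the area were reporting the same species at about the same time, while a score near 0 would mean they were out looking and not finding it. A real analysis would also have to account for geography, observer effort, and how the reputations themselves are estimated.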