Where is the money going?

Over the weekend I was poking around in the recipient-reported data at recovery.gov. I filtered the New Hampshire spreadsheet down to items for my town, Keene, and was a bit surprised to find no descriptions in many cases. Here’s the breakdown:

# of awards                               25
# of awards with descriptions              5   20%
# of awards without descriptions          20   80%
$ of awards                       10,940,770
$ of awards with descriptions      1,260,719   12%
$ of awards without descriptions   9,680,053   88%
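
For anyone who wants to reproduce this kind of breakdown, here is a minimal Python sketch. It assumes the spreadsheet has been exported to CSV with columns named award, amount, city, and description; those names are illustrative, not the real recovery.gov headers. The first four sample rows reuse award numbers and amounts from the Keene table in this post, with shortened descriptions, and the Concord row is invented purely to show the filter.

```python
import csv
import io

# Sample rows: the first four award numbers and amounts come from the Keene
# table in this post; the Concord row is made up to show the filter working.
# The column names are assumptions, not the real spreadsheet headers.
SAMPLE_CSV = """award,amount,city,description
EE00161,"2,601,788",Keene,
S394A090030,"1,471,540",Keene,
2009RKWX0608,"459,850",Keene,COPS Hiring Recovery Program
NH36S01050109,"413,394",Keene,ARRA Capital Fund Grant
HYPOTHETICAL-1,"100,000",Concord,Made-up row for another town
"""


def description_breakdown(rows, city):
    """Count and total one city's awards, split by whether a description exists."""
    stats = {"with": {"count": 0, "dollars": 0},
             "without": {"count": 0, "dollars": 0}}
    for row in rows:
        if row["city"].strip().lower() != city.lower():
            continue  # skip awards for other towns
        key = "with" if row["description"].strip() else "without"
        stats[key]["count"] += 1
        # Amounts are formatted like "2,601,788", so drop the commas first.
        stats[key]["dollars"] += int(row["amount"].replace(",", ""))
    return stats


def percent(part, whole):
    """Whole-number percentage, or 0 when there is nothing to divide by."""
    return round(100 * part / whole) if whole else 0


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
stats = description_breakdown(rows, "Keene")
total_count = stats["with"]["count"] + stats["without"]["count"]
total_dollars = stats["with"]["dollars"] + stats["without"]["dollars"]

for key in ("with", "without"):
    count = stats[key]["count"]
    dollars = stats[key]["dollars"]
    print(f"{key} descriptions: {count} awards ({percent(count, total_count)}%), "
          f"${dollars:,} ({percent(dollars, total_dollars)}%)")
```

Pointing csv.DictReader at a real export instead of SAMPLE_CSV should be the only change needed, apart from adjusting the column names.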

In this case, the half-dozen largest awards aren’t described:

award | amount | funding agency | recipient | description
EE00161 | 2,601,788 | | Sothwestern Community Services Inc |
S394A090030 | 1,471,540 | | Keene School District |
AIP #3-33-SBGP-06-2009 | 1,298,500 | | City of Keene |
2W-33000209-0 | 1,129,608 | | City of Keene |
2F-96102301-0 | 666,379 | | City of Keene |
2F-96102301-0 | 655,395 | | City of Keene |
0901NHCOS2 | 600,930 | | Sothwestern Community Services Inc |
2009RKWX0608 | 459,850 | Department of Justice | KEENE, CITY OF | The COPS Hiring Recovery Program (CHRP) provides funding directly to law enforcement agencies to hire and/or rehire career law enforcement officers in an effort to create and preserve jobs, and to increase their community policing capacity and crime prevention efforts.
NH36S01050109 | 413,394 | Department of Housing and Urban Development | KEENE HOUSING AUTHORITY | ARRA Capital Fund Grant. Replacement of roofing, siding, and repair of exterior storage sheds on 29 public housing units at a family complex

That got me wondering: Where does the money go? So I built a little app that explores ARRA awards for any city or town: http://elmcity.cloudapp.net/arra. For most places, it seems, the ratio of awards with descriptions to awards without isn’t quite so bad. In the case of Philadelphia, for example, “only” 27% of the dollars awarded ($280 million!) are not described.

But even when the description field is filled in, how much does that tell us about what’s actually being done with the money? We can’t expect to find that information in a spreadsheet at recovery.gov. The knowledge is held collectively by the many people who are involved in the projects funded by these awards.

If we want to materialize a view of that collective knowledge, the ARRA data provides a useful starting point. Every award is identified by an award number. These are, effectively, webscale identifiers — that is, more-or-less unique tags we could use to collate newspaper articles, blog entries, tweets, or any other online chatter about awards.
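
As a rough illustration of that collation, here is a minimal Python sketch. It assumes we already have a set of known award numbers (taken from the spreadsheet) and a list of (url, text) pairs gathered from wherever the chatter lives. The function name, the example.com URLs, and the sample texts are all hypothetical; only the award numbers come from the Keene data.

```python
import re
from collections import defaultdict


def collate_by_award(award_numbers, documents):
    """Map each known award number to the URLs of documents that mention it."""
    # Build one alternation of escaped literals, longest first. The lookarounds
    # keep a number from matching inside a longer run of letters, digits,
    # or hyphens.
    literals = sorted(award_numbers, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(re.escape(n) for n in literals) + r")(?![\w-])"
    )
    index = defaultdict(list)
    for url, text in documents:
        # De-duplicate so a document is listed once per award number.
        for number in sorted(set(pattern.findall(text))):
            index[number].append(url)
    return dict(index)


# Award numbers from the Keene table in this post.
awards = {"S394A090030", "2009RKWX0608", "2W-33000209-0"}

# Hypothetical documents, invented for illustration only.
docs = [
    ("https://example.com/school-news", "Notes on award S394A090030."),
    ("https://example.com/police-blog", "Tagged: 2009RKWX0608 and S394A090030"),
    ("https://example.com/unrelated", "A near miss: 2009RKWX06089."),
]

index = collate_by_award(awards, docs)
for number, urls in sorted(index.items()):
    print(number, urls)
```

Exact string matching is the whole trick here: because the identifiers are more-or-less unique, no fuzzy matching or entity extraction is needed to group the mentions.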

To promote this idea, the app reports award numbers as search strings. In Keene, for example, the school district got an award for $1.47 million. The award number is S394A090030. If you search for that you’ll find nothing but a link back to a recovery.gov page entitled Where is the Money Going?
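
Turning an award number into a search link takes very little code. The sketch below is one way to do it in Python, not necessarily how the app does it; the function name is made up, and the default base URL assumes Google's ordinary search endpoint with its q parameter.

```python
from urllib.parse import urlencode


def search_url(award_number, base="https://www.google.com/search"):
    """Build a search link for one award number (hypothetical helper)."""
    # Wrap the number in double quotes so the engine treats it as one exact
    # phrase, then let urlencode escape spaces, '#', and the quotes themselves.
    return base + "?" + urlencode({"q": f'"{award_number}"'})


# Award numbers from the Keene table in this post.
school_link = search_url("S394A090030")
airport_link = search_url("AIP #3-33-SBGP-06-2009")
print(school_link)
print(airport_link)
```

The quoting matters for award numbers such as AIP #3-33-SBGP-06-2009, which contain a space and a # that would otherwise break the query.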

Recovery.gov can’t bootstrap itself out of this circular trap. But if we use the tags that it has helpfully provided, we might be able to find out a lot more about where the money is going.

25 thoughts on “Where is the money going?”

  1. Tom Lee

    You’re absolutely right that project award identifiers ought to be given more prominence. Unfortunately, I think that emphasis probably needs to start higher up the chain than just the data consumer. In our work with the FAADS and FPDS systems (upon which Recovery reporting is based), we’ve found that records aren’t given award identifiers reliably enough to even total up the funding for a project across its various payments (or obligations, more accurately).

    Other identifiers in the system have their problems, too. The system relies on DUNS numbers to identify recipients, but these aren’t particularly reliable, and are largely controlled by a private entity. Using a proprietary identifier was a terrible decision; the government really needs to fix this.

    Finally, I’d note that the quality of the ARRA data is also definitely in question. I think there’s reason to expect that it’s better than the normal FAADS disclosure, if only because of the political attention it’s garnered. But more worrying to me than the missing descriptions you point to are the records that might be missing — something that we can’t spot except by doing a cross-walk with other funding records (which is easier said than done).

    But let me end on a more cheerful note: I’m glad that the attention that Recovery funds are attracting is helping to expose some of these problems. They’ve been there for a long time — maybe now we can get them fixed!

  2. Jon Udell Post author

    we’ve found that records aren’t given award identifiers reliably enough to even total up the funding for a project across its various payments (or obligations, more accurately).

    Doesn’t surprise me a bit.

    And yes, the higher-ups ought to mandate metadata hygiene.

    But even were that to happen, the knowledge of how that money is actually flowing through our society is held collectively. And it will take collective effort to materialize it.

    As broken as the award-numbering system is, it exists. And we could do a lot with it right now, if we could easily tag those numbers onto the things we read and write online.

    And…it’s getting a lot easier lately to envision a silo-crossing web application that could make that tagging possible. Fun, even.

  3. Jon Udell Post author

    Finally, I’d note that the quality of the ARRA data is also definitely in question.

    Ya think? :-)

    If I were applying for a $2.6M award I think I’d bother to write “Southwestern” vs “Sothwestern Community Services” — c’mon, please.

    But more worrying to me than the missing descriptions you point to are the records that might be missing — something that we can’t spot except by doing a cross-walk with other funding records (which is easier said than done).

    A high-level cross-check is possible, right?

    I.e., is the sum of recipient-reported dollars received within shouting distance of the sum of government-reported dollars given?

    Anyone done this?

    1. Tom Lee

      This is possible, but difficult. You can’t easily start with the budget, because of its complexity and because it’s spread across years in ways that are hard to account for. You can’t get the Treasury records of expenditures because they haven’t been scrubbed to protect recipients’ privacy (if this could be fixed, it could probably be the best way to do a crosswalk with obligations). Instead you have to go to each agency, ask for/demand their financial records, then compare them to what was reported in FAADS/FPDS. Often you’ll have to do significant amounts of manual reconciliation to account for deviations from the reporting guidelines — to separate what’s just confusing from what’s genuinely missing or in error.

      This is never done in a comprehensive way. In practice the issue keeps popping up when GAO tries to do a report about a specific question, finds it has to use this data, and notices that the data isn’t really good enough for serious analysis (at this point they usually proceed with caveats). Here’s one from 2005 that’s nominally about rural economic development; here’s a more recent one about the nonprofit sector.

      On the upside, the text of FFATA states that the Comptroller General will be reporting about the state of these systems before the new year. It’ll be very interesting to see what that report finds.

  4. Jon Udell Post author

    I’m glad that the attention that Recovery funds are attracting is helping to expose some of these problems. They’ve been there for a long time — maybe now we can get them fixed!

    I hope so. Let’s please just not miss the opportunity to apply our cognitive surplus to the fix.

  5. Eric

    Hi Jon,

    We’ve been working on some of these issues with some recommendations here:

    http://recovery.berkeley.edu/tech/

    My colleague Raymond Yee has also been trying to do some cross checking. One bit of background is knowing the universe of accounts involved in the Recovery Act. We’ve made a FOIA request for getting a list of all the Treasury Accounts used for the Recovery, but we don’t have those data yet.

    Thanks for writing about this! It’s a really interesting topic!
    -Eric

  6. Raymond Yee

    Hi Jon,

    I see that Eric Kansa has already posted about our work and our FOIA request. I’m redesigning my Mixing and Remixing Information course at Berkeley (for next semester) to focus on making sense of ARRA spending. I’m glad to see that you’ve gotten into looking at ARRA data yourself and hope that you’ll want to do even more work in the area.

    -Raymond

  7. Jon Udell Post author

    Eric, good to meet you. And Raymond, nice to hear from you!

    We’ve made a FOIA request for getting a list of all the Treasury Accounts used for the Recovery, but we don’t have those data yet.

    At what feed could an interested party track the progress of that request?

  8. Jon Udell Post author

    This is never done in a comprehensive way.

    Suppose that it were. Suppose that every line item in those recipient reports added up — within shouting distance — to the totals reported by the government. We still wouldn’t know jack about outcomes.

    But collectively that knowledge exists. It’s distributed across a broad swath of funders, recipients, and beneficiaries. It was never before possible to collate their narratives. Now it is possible. That doesn’t mean that it will happen. But it could.

  9. Pingback: Mixing and Remixing Information » MRI 2010: Making Sense of the Stimulus

    1. Raymond Yee

      Faye — I’d be curious to hear how you are making use of recovery.gov and whether you are directly analyzing any of the data (which you can download) to help “connect minority-owned businesses with stimulus opportunities”.

  10. Pingback: Notional Slurry » links for 2009-11-11

  11. Robert G

    Make sure you filter by P for Prime and S for Sub-recipients. Subs do not provide a description of the work; only Primes do, yet both are available from the download center.

    To find the Prime description use the same Award Key as the Sub.

  12. Robert G

    For the city of Keene there are 5 Prime Recipients and all reported a description.

    You can type the Award ID into either Google or the Recipient Search scope, and the resulting Award Summary page gathers all of the recipient-reported info on a single page.

  13. Jon Udell Post author

    For the city of Keene there are 5 Prime Recipients and all reported a description.

    Thanks Robert, that helps a bit. In the case of the example I searched for…

    http://elmcity.info/doublesearch/?q=S394A090030

    …the Prime Recipient is:

    EXECUTIVE OFFICE OF THE STATE OF NEW HAMPSHIRE

    And the description is:

    The purpose of this grant is to support and restore funding for elementary, secondary, and postsecondary education and, as applicable, early child hood education programs and services in States and local educational agencies

    That description covers, for the whole state, 161 awards totaling $192,121,666.

    There’s obviously more to the story. But when I search for the identifier S394A090030, all that comes back is the page I referenced before at recovery.gov. And also, now, some links to this blog.

    My point, again: S394A090030 is a webscale identifier that thousands of people involved in hundreds of funded projects could be using to tell much more of the story. I think a lot of them would like to write and tell it, and a lot of us would like to read and hear it.

    I don’t expect the ARRA spreadsheet alone to tell the story. It can’t possibly. But it could be a table of contents for a book of stories written by many people.

    1. Raymond Yee

      I like Jon’s analogy of the ARRA spreadsheets to “table of contents for a book of stories written by many people.” In our report “Web Services for recovery.gov”, we argue for the use of various identifiers to help make such a table of contents: http://escholarship.org/uc/item/0fv601z8?pageNum=7#page-7

      For example, the treasury account symbol (TAS) has been of particular interest to me. One of the grants to Keene, NH (2009RKWX0608) is part of the Community Oriented Policing Services super-program of the Recovery Act:

      COMMUNITY ORIENTED POLICING SERVICES
      For an additional amount for “Community Oriented Policing
      Services”, for grants under section 1701 of title I of the 1968
      Omnibus Crime Control and Safe Streets Act (42 U.S.C. 3796dd)
      for hiring and rehiring of additional career law enforcement officers
      under part Q of such title, notwithstanding subsection (i) of such
      section, $1,000,000,000.

      (See http://www.govtrack.us/congress/billtext.xpd?bill=h111-1&version=enr&nid=t0:enr:237 )

      All the grants/contracts tied to Community Oriented Policing (COPS) are tied to a TAS of 15-0412 (15 is the Treasury symbol for the Department of Justice)

      Wouldn’t it be useful to be able to find all the programs funded under COPS (TAS = 15-0412) across the country to see what the $1 billion appropriated for this purpose is doing? And, of course, COPS is just one of many different programs funded under ARRA….

      1. Jon Udell Post author

        Wouldn’t it be useful to be able to find all the programs funded under COPS (TAS = 15-0412) across the country

        Sure. In this case, there are a couple of ways to isolate the 7 COPS awards for NH.

        1. The titles match: “The COPS Hiring Recovery Program (CHRP) provides funding directly to law enforcement agencies…”

        2. The award numbers share a common pattern:

        2009RKWX0612
        2009RKWX0609
        2009RKWX0613
        2009RKWX0617
        2009RKWX0608
        2009RKWX0614
        2009RKWX0616

        If these correspondences hold across all 50 spreadsheets, then there’s an easy algorithmic way to tag all COPS entries with TAS_15-0412 and link them to other things so tagged.

        Ideally the tagging would be done at the source. But there’s no need to wait for that to happen. If it’s useful and important, it can be done in a view overlaid onto the source.

  14. Robert G

    Agreed. The AwardID alone is obviously insufficient though as it refers only to Primes. You need a key pair of AwardID and OrderID to specifically identify the Subs.

    Agencies also have their own systems for creating the AwardID number so there are some inconsistencies.

  15. Jon Udell Post author

    The AwardID alone is obviously insufficient though as it refers only to Primes.

    The award number alone is not ideal, but it’s what we’ve got and what we are likely to have in the foreseeable future.

    Given that reality, the award number can be augmented with other clues (either from within the ARRA data or from outside it) to create views of the data that give more traction.

    You need a key pair of AwardID and OrderID to specifically identify the Subs.

    Unless OrderID is missing, as is true for the COPS example discussed in #14 above.

    But in any case, if that kind of key pair is a requirement, few will be able to meet it, and little or no collective annotation will emerge.

    From a pure information management perspective, you want to uniquely identify records with key pairs.

    But from a social information management perspective, you want to keep the activation threshold really low, so that it’s quick and easy for people to associate things with other things.

  16. Jon Udell Post author

    Cool!

    This example does, BTW, amplify the point already discussed in this thread about Primes vs Subs. The Stimulus Watch page reports only Primes (and actually, just 4 of 5 of them), summing to $1.2 million. But there are 25 Keene-related awards summing to $10.9 million.

    ARRA’s default data model doesn’t deliver that wider view, but we can — I would argue must — augment it so we can expose and annotate more of what’s going on.
