My contribution to Silona Bonewald week was an interview about her new project, citability.org. Silona proposes two new features for government websites. First, change tracking. Second, permalinks for documents, sections, and paragraphs.
Nobody will dispute the need for, or utility of, these features. The question is how to implement them across a sprawling landscape of content management systems and publishing procedures that still, in many cases, regard print as canonical and the web as an afterthought.
In a follow-on discussion with Silona, on the citability wiki, I recalled a little-known and rarely used feature of PDF documents. You can form URLs that point to specific pages. And with the right preparation, you can even form URLs that point to named destinations within pages.
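As a minimal sketch of what such URLs look like: Adobe's PDF open parameters let a URL fragment carry `page=` or `nameddest=`, and a viewer that honors them opens the document at that spot. The document URL and the destination name `para-0042` below are hypothetical, and the named destination would have to exist in the PDF already. Not every viewer honors these fragments.

```python
from urllib.parse import quote, urldefrag


def pdf_page_url(pdf_url: str, page: int) -> str:
    """Return a URL asking a PDF viewer to open at a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    base, _ = urldefrag(pdf_url)  # drop any fragment already on the URL
    return f"{base}#page={page}"


def pdf_destination_url(pdf_url: str, name: str) -> str:
    """Return a URL asking a PDF viewer to open at a named destination."""
    base, _ = urldefrag(pdf_url)
    # Percent-encode the name so spaces and '#' cannot break the fragment.
    return f"{base}#nameddest={quote(name, safe='')}"


# Hypothetical document and destination name, for illustration only.
print(pdf_page_url("http://example.gov/reports/budget.pdf", 12))
print(pdf_destination_url("http://example.gov/reports/budget.pdf", "para-0042"))
```

A Purple Number for a PDF paragraph could then be nothing more than a link of the second form, provided each paragraph has its own named destination.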
Those of us fluent in web-friendly document formats like HTML and XML will tend to recommend that these become canonical. But having recently observed what happened when the old-fashioned non-XML method of math typesetting was supported by WordPress.com, I have to ask: How much more mileage might we be getting out of the existing print-oriented systems?
I am not an expert user of PDF authoring tools, nor an expert user of software libraries that enable programmatic manipulation of PDF files. But some of you are. What would it take, I wonder, to post-process the kinds of PDF files that governments typically produce, in order to add Purple Numbers?
August 25, 2009 at 9:11 am
Considering that many (most?) official types of document are pretty well locked down, wouldn't it be pretty hard for anyone other than the owner to add anything, Jon?
August 25, 2009 at 9:43 am
What about Section 508 in the States, Jon? If the department has to provide alternative formats, how might you relate ‘purple number XXX’ to the same point in the audio or braille version of the document?
August 25, 2009 at 4:10 pm
I’m not sure what the permalink behind the purple number would be in a PDF. I would hate it to be something that causes the PDF to be downloaded from a web location and then scrolled to the particular place. That might be better than having to download the PDF and then search for a cited place in it, but I can see mischief here too.
I put purple links in my web-based work to provide anchors that are useful in citing passages, but I am not so sure this is a great idea for anything but self-referenced fragment IDs into XML and HTML documents.
August 25, 2009 at 4:22 pm
> I would hate it to be something that causes
> the PDF to be downloaded from a web
> location and then scrolled to the
> particular place.
That is, in fact, how it works. It is an application of the same HTTP Byte Range feature that I was exploring when I did the MP3 Sound Bites hack (http://www.oreillynet.com/pub/a/network/2004/09/03/primetime.html).
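For readers who have not met byte ranges, here is a rough sketch of the request side only. It builds, but does not send, a GET that asks for part of a file. The URL is hypothetical, and how a given PDF viewer actually chooses its ranges is not shown here.

```python
from urllib.request import Request


def range_request(url: str, start: int, end: int) -> Request:
    """Build (without sending) a GET request for bytes start..end inclusive."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    # A server that supports ranges answers 206 Partial Content;
    # one that does not may ignore the header and send the whole file.
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


# Hypothetical URL: ask for the first kilobyte of a PDF.
req = range_request("http://example.gov/reports/budget.pdf", 0, 1023)
print(req.get_header("Range"))
```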
I think it may make sense to separate two different concerns here, however. One is the creation of a namespace that will unify citations pointing into government PDFs. Another is the implementation of a mechanism to serve that namespace.
In the HTML realm, Silona is already talking about a shadow service that augments gov sites with features like Purple Numbers. I guess the same could be done for the PDF realm.
August 28, 2009 at 1:05 am
let’s get rid of .pdf as an “archival” format,
and not waste our time and energy trying to
juryrig methods to make them less useless…
please…
-bowerbird
August 28, 2009 at 8:41 am
> let’s get rid of .pdf as an “archival” format
I violently agree. However, there’s a ton of it out there, and tons more in the pipeline, and methods that make that tonnage less useless are useful in proportion to the tonnage.
August 28, 2009 at 9:59 am
Adobe’s Acrobat Reader supports linking within a PDF but others, for example Foxit Reader, do not.
I am very glad to see efforts to help government departments create citable documents. A simple first step would be to just add a revision date and number every paragraph. Have a clear, one-page instruction sheet on how to do this within Microsoft Word. Then distribute it to every Town Manager and government office manager in the US. Perhaps make it a one-page advertisement in trade journals!
August 28, 2009 at 12:31 pm
jon said:
> and methods that make that tonnage
> less useless are useful in proportion to
> the tonnage.
are they? i’m not so sure. i think rather
they make us think we can “live with” .pdf.
instead, let’s work on a better alternative.
i’ve got one, and i’d love for you to see it,
evaluate it, and — if you think it worthy –
champion it.
-bowerbird
October 19, 2009 at 12:34 am
[...] Purple Numbers for PDF documents? (jonudell.net) [...]
October 28, 2009 at 3:22 pm
[...] Purple Number plugin for WordPress, when Google coughed up Jon Udell’s post: “Purple Numbers for PDF documents?”. I’m glad to see such an industry stalwart plugging for Purple Numbers. While this [...]