<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: That first step can be a doozy</title>
	<atom:link href="http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/</link>
	<description>Strategies for Internet citizens</description>
	<lastBuildDate>Sun, 12 Feb 2012 18:22:41 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Where the oil comes from: Not from where I thought &#171; Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125837</link>
		<dc:creator><![CDATA[Where the oil comes from: Not from where I thought &#171; Jon Udell]]></dc:creator>
		<pubDate>Sun, 09 Nov 2008 18:39:10 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125837</guid>
		<description><![CDATA[[...] Importing. There are a few different ways to grab data from a web page. You can have Dabble DB parse the page, or you can copy/paste. In this case, I wound up trying both and had better luck with the latter. But we&#8217;re still very much in an era when data published to the web is not really intended to be used as data. That first step can be a doozy. [...]]]></description>
		<content:encoded><![CDATA[<p>[...] Importing. There are a few different ways to grab data from a web page. You can have Dabble DB parse the page, or you can copy/paste. In this case, I wound up trying both and had better luck with the latter. But we&#8217;re still very much in an era when data published to the web is not really intended to be used as data. That first step can be a doozy. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: What is an Internet operating system? &#171; Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125384</link>
		<dc:creator><![CDATA[What is an Internet operating system? &#171; Jon Udell]]></dc:creator>
		<pubDate>Mon, 22 Sep 2008 22:50:15 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125384</guid>
		<description><![CDATA[[...] services woven into the web&#8217;s fabric were hard to use back then, and in many ways still are. One key enabler for the Internet OS, therefore, would be a framework for defining and deploying [...]]]></description>
		<content:encoded><![CDATA[<p>[...] services woven into the web&#8217;s fabric were hard to use back then, and in many ways still are. One key enabler for the Internet OS, therefore, would be a framework for defining and deploying [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125343</link>
		<dc:creator><![CDATA[Jon Udell]]></dc:creator>
		<pubDate>Sun, 14 Sep 2008 17:06:15 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125343</guid>
		<description><![CDATA[&gt; I tend to favor the XML method

In general I do too, but it hasn&#039;t prevailed so far.

In the case of Wikipedia, there&#039;s a much more straightforward possibility. The wikitable markup is essentially CSV with pipes instead of commas:

&#124; 0:00 &#124;&#124; [[The Kills]] &#124;&#124; &quot;[[Sour Cherry]]&quot;

Why not put a bug next to every table that extracts its contents as CSV or tab-delimited?]]></description>
		<content:encoded><![CDATA[<p>&gt; I tend to favor the XML method</p>
<p>In general I do too, but it hasn&#8217;t prevailed so far.</p>
<p>In the case of Wikipedia, there&#8217;s a much more straightforward possibility. The wikitable markup is essentially CSV with pipes instead of commas:</p>
<p>| 0:00 || [[The Kills]] || &#8220;[[Sour Cherry]]&#8221;</p>
<p>Why not put a bug next to every table that extracts its contents as CSV or tab-delimited?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125329</link>
		<dc:creator><![CDATA[Tim]]></dc:creator>
		<pubDate>Fri, 12 Sep 2008 23:14:26 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125329</guid>
		<description><![CDATA[This is a good argument for technical folk involved in &lt;b&gt;publishing&lt;/b&gt; information to think about what they do, too.

We need (easier) ways to publish data as XML documents or fragments, or at the least to have a way to tag data on pages to make it easily consumable.

Some ideas: 

- publish pages like that as XML with an XSLT stylesheet for presentation;

- modify HTML standards to allow some sort of  tag, which can contain only table elements within it. 

I tend to favor the XML method - if you want the page to be presented as HTML it&#039;s fairly trivial to transform XML to HTML and CSS using XSL and we&#039;re not heading down the direction of screwing up HTML and browser (in)compatibility.]]></description>
		<content:encoded><![CDATA[<p>This is a good argument for technical folk involved in <b>publishing</b> information to think about what they do, too.</p>
<p>We need (easier) ways to publish data as XML documents or fragments, or at the least to have a way to tag data on pages to make it easily consumable.</p>
<p>Some ideas: </p>
<p>- publish pages like that as XML with an XSLT stylesheet for presentation;</p>
<p>- modify HTML standards to allow some sort of  tag, which can contain only table elements within it. </p>
<p>I tend to favor the XML method &#8211; if you want the page to be presented as HTML it&#8217;s fairly trivial to transform XML to HTML and CSS using XSL and we&#8217;re not heading down the direction of screwing up HTML and browser (in)compatibility.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125327</link>
		<dc:creator><![CDATA[Jon Udell]]></dc:creator>
		<pubDate>Fri, 12 Sep 2008 17:33:05 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125327</guid>
		<description><![CDATA[&gt; No mention of IE’s “Export to Microsoft 
&gt; Excel” contextual menu when right-clicking
&gt; on HTML tables?

Good grief, you&#039;re right! How did I not know, or discover, this? 

Well, it&#039;s a sure bet I&#039;m not alone in my ignorance. This leads to two questions:

1. What uncommon search strategy would have located this nugget?

2. How would the nugget need to have been packaged in order to yield to common search strategies?

Anyway, thanks Alf!]]></description>
		<content:encoded><![CDATA[<p>&gt; No mention of IE’s “Export to Microsoft<br />
&gt; Excel” contextual menu when right-clicking<br />
&gt; on HTML tables?</p>
<p>Good grief, you&#8217;re right! How did I not know, or discover, this? </p>
<p>Well, it&#8217;s a sure bet I&#8217;m not alone in my ignorance. This leads to two questions:</p>
<p>1. What uncommon search strategy would have located this nugget?</p>
<p>2. How would the nugget need to have been packaged in order to yield to common search strategies?</p>
<p>Anyway, thanks Alf!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alf</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125324</link>
		<dc:creator><![CDATA[alf]]></dc:creator>
		<pubDate>Fri, 12 Sep 2008 15:23:18 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125324</guid>
		<description><![CDATA[No mention of IE&#039;s &quot;Export to Microsoft Excel&quot; contextual menu when right-clicking on HTML tables?]]></description>
		<content:encoded><![CDATA[<p>No mention of IE&#8217;s &#8220;Export to Microsoft Excel&#8221; contextual menu when right-clicking on HTML tables?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125320</link>
		<dc:creator><![CDATA[Jon Udell]]></dc:creator>
		<pubDate>Fri, 12 Sep 2008 01:04:57 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125320</guid>
		<description><![CDATA[&gt; I wonder if we’re just conditioned to look
&gt; for the programmable, re-usable solution 
&gt; and so don’t spot that the 
&gt; so-simple-it-doesn’t-need-re-using 
&gt; solution is easier.

Probably so. Interesting to be reminded by these examples, though, what a grab-bag clipboard formats can be, and therefore how unpredictable the results.]]></description>
		<content:encoded><![CDATA[<p>&gt; I wonder if we’re just conditioned to look<br />
&gt; for the programmable, re-usable solution<br />
&gt; and so don’t spot that the<br />
&gt; so-simple-it-doesn’t-need-re-using<br />
&gt; solution is easier.</p>
<p>Probably so. Interesting to be reminded by these examples, though, what a grab-bag clipboard formats can be, and therefore how unpredictable the results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Data friction : business&#124;bytes&#124;genes&#124;molecules</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125319</link>
		<dc:creator><![CDATA[Data friction : business&#124;bytes&#124;genes&#124;molecules]]></dc:creator>
		<pubDate>Fri, 12 Sep 2008 00:41:49 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125319</guid>
		<description><![CDATA[[...] don&#8217;t think I need to add anything to this line (at the end of a typically great post by [...]]]></description>
		<content:encoded><![CDATA[<p>[...] don&#8217;t think I need to add anything to this line (at the end of a typically great post by [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrian McEwen</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125317</link>
		<dc:creator><![CDATA[Adrian McEwen]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 22:51:44 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125317</guid>
		<description><![CDATA[For me, running it from FF3 into Excel 2007 with a simple Ctrl-V works but has the complication of active hyperlinks.  As you say, paste special would be the way to go.

I wonder if we&#039;re just conditioned to look for the programmable, re-usable solution and so don&#039;t spot that the so-simple-it-doesn&#039;t-need-re-using solution is easier.]]></description>
		<content:encoded><![CDATA[<p>For me, running it from FF3 into Excel 2007 with a simple Ctrl-V works but has the complication of active hyperlinks.  As you say, paste special would be the way to go.</p>
<p>I wonder if we&#8217;re just conditioned to look for the programmable, re-usable solution and so don&#8217;t spot that the so-simple-it-doesn&#8217;t-need-re-using solution is easier.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug K</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125315</link>
		<dc:creator><![CDATA[Doug K]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 20:12:56 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125315</guid>
		<description><![CDATA[hm. I do this to get race results from triathlons into spreadsheets. Typically follow a variant of the non-techie approach: select the table in Firefox, &#039;view selection source&#039;, cut/paste html into either Excel or Word, either as special or Ctrl-V, whichever treats it better. 

Results that aren&#039;t in an HTML table require more fudging in Word: scan/replace the spaces with some other character which can then be replaced with a tab so the whole thing can then be made into a table and cut/pasted to Excel. Usually, for example, names are separated by a single space, whereas the virtual columns of data are separated by two or more spaces, so a careful scan/replace strategy can get it all lined up.

I&#039;m going to try the Data-&gt; from Web next, thanks ;-)]]></description>
		<content:encoded><![CDATA[<p>hm. I do this to get race results from triathlons into spreadsheets. Typically follow a variant of the non-techie approach: select the table in Firefox, &#8216;view selection source&#8217;, cut/paste html into either Excel or Word, either as special or Ctrl-V, whichever treats it better. </p>
<p>Results that aren&#8217;t in an HTML table require more fudging in Word: scan/replace the spaces with some other character which can then be replaced with a tab so the whole thing can then be made into a table and cut/pasted to Excel. Usually, for example, names are separated by a single space, whereas the virtual columns of data are separated by two or more spaces, so a careful scan/replace strategy can get it all lined up.</p>
<p>I&#8217;m going to try the Data-&gt; from Web next, thanks ;-)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy Baio</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125314</link>
		<dc:creator><![CDATA[Andy Baio]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 19:02:29 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125314</guid>
		<description><![CDATA[Turns out I was remembering incorrectly.  For me, when pasting from Firefox, Excel 2004 for Mac &lt;b&gt;automatically&lt;/b&gt; applies the default Text-to-Columns transform.]]></description>
		<content:encoded><![CDATA[<p>Turns out I was remembering incorrectly.  For me, when pasting from Firefox, Excel 2004 for Mac <b>automatically</b> applies the default Text-to-Columns transform.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125313</link>
		<dc:creator><![CDATA[Jon Udell]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 18:43:53 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125313</guid>
		<description><![CDATA[&gt; Yeah, that’s basically what I did. Select
&gt; the text from Wikipedia, paste into Excel. 

Why/how did you apply Text to Columns then?]]></description>
		<content:encoded><![CDATA[<p>&gt; Yeah, that’s basically what I did. Select<br />
&gt; the text from Wikipedia, paste into Excel. </p>
<p>Why/how did you apply Text to Columns then?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Udell</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125312</link>
		<dc:creator><![CDATA[Jon Udell]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 18:43:17 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125312</guid>
		<description><![CDATA[&gt; I wonder whether in this case you and I
&gt; would fall foul of the expert’s trait of
&gt; making it overly difficult?

Almost certainly. That’s why I floated this item.

&gt; But it would be one of the last things
&gt; that I’d try.

Fascinating, isn’t it.

When I try this from IE the CTRL-V result is as above: mostly good, some correction needed, correction complicated by active hyperlinks.

When I try from FF the CTRL-V result is garbage. But Paste Special -&gt; As Text gives the mostly-good result without problematic hyperlinks.]]></description>
		<content:encoded><![CDATA[<p>&gt; I wonder whether in this case you and I<br />
&gt; would fall foul of the expert’s trait of<br />
&gt; making it overly difficult?</p>
<p>Almost certainly. That’s why I floated this item.</p>
<p>&gt; But it would be one of the last things<br />
&gt; that I’d try.</p>
<p>Fascinating, isn’t it.</p>
<p>When I try this from IE the CTRL-V result is as above: mostly good, some correction needed, correction complicated by active hyperlinks.</p>
<p>When I try from FF the CTRL-V result is garbage. But Paste Special -&gt; As Text gives the mostly-good result without problematic hyperlinks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy Baio</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125310</link>
		<dc:creator><![CDATA[Andy Baio]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 17:56:39 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125310</guid>
		<description><![CDATA[Thanks for the writeup, Jon!

@Adrian: Yeah, that&#039;s basically what I did.  Select the text from Wikipedia, paste into Excel.  I had to tweak some of the cells after that, but that&#039;s pretty much it.]]></description>
		<content:encoded><![CDATA[<p>Thanks for the writeup, Jon!</p>
<p>@Adrian: Yeah, that&#8217;s basically what I did.  Select the text from Wikipedia, paste into Excel.  I had to tweak some of the cells after that, but that&#8217;s pretty much it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrian McEwen</title>
		<link>http://blog.jonudell.net/2008/09/11/that-first-step-can-be-a-doozy/#comment-125309</link>
		<dc:creator><![CDATA[Adrian McEwen]]></dc:creator>
		<pubDate>Thu, 11 Sep 2008 17:27:00 +0000</pubDate>
		<guid isPermaLink="false">http://jonudell.wordpress.com/?p=623#comment-125309</guid>
		<description><![CDATA[I wonder whether in this case you and I would fall foul of the expert&#039;s trait of making it overly difficult?

I was thinking about how my non-techie girlfriend would approach the problem.  Having observed her cut-and-pasting things from web pages into Word when she wants to reorganise them, I think she&#039;d try the same with Excel.

Lo and behold, selecting all the relevant text from that Wikipedia page, switching to Excel and hitting Ctrl-V gives me it all in separate cells automatically.  

But it would be one of the last things that I&#039;d try.]]></description>
		<content:encoded><![CDATA[<p>I wonder whether in this case you and I would fall foul of the expert&#8217;s trait of making it overly difficult?</p>
<p>I was thinking about how my non-techie girlfriend would approach the problem.  Having observed her cut-and-pasting things from web pages into Word when she wants to reorganise them, I think she&#8217;d try the same with Excel.</p>
<p>Lo and behold, selecting all the relevant text from that Wikipedia page, switching to Excel and hitting Ctrl-V gives me it all in separate cells automatically.  </p>
<p>But it would be one of the last things that I&#8217;d try.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

