From an undisclosed location somewhere on the east coast, middleware maven Steve Vinoski joins me for this week’s Friday podcast. Earlier this month Steve announced that he was leaving IONA to join a stealth-mode startup. He can’t discuss his new job yet, but I took this opportunity to ask him to review his long career working with distributed systems and reflect on lessons learned.
It’s a timely conversation because Steve was originally slated to represent IONA at next week’s W3C Workshop on Web of Services for Enterprise Computing. I’d hoped to attend that conference as well, but all attendees have to present position papers, and I’d be the wrong guy to represent Microsoft’s position. So instead I asked Steve what he would have said there, and I chimed in with some things that I would have said, and we both had a lot of fun.
I have to take issue with your point (quoting Joe Gregorio) to the effect that an enterprise developer with 300,000 users has reached the pinnacle of the profession while a Web developer with 300K users is ready to move out of the family garage. First, those 300K Web users are probably doing mostly read operations on public information. That wouldn’t be much of a challenge for even a novice enterprise developer either. The challenge in the IT shops of the world includes a) there are a lot more update operations than in typical web applications; b) those updates have to succeed or fail completely, none of this “go back and check the result URI to see if the transaction went through” stuff; c) the data is generally confidential, at least to outsiders and often to all but a specified group of insiders; d) there is less homogeneity of the underlying protocols, etc. — you can’t just give every chunk of information a URI and get/put/post/delete to it from anywhere; and e) those “legacy systems chugging along” work a whole lot better than HTTP for end to end secure, reliable messages, and the IT managers of the world have no interest in moving to the “modern” world of HTTP’s quality of service. Only a handful of Web companies (Amazon, eBay, etc.) have managed to do all this stuff; that work has required hundreds of millions of dollars to hire extremely bright people who still needed several iterations of their architectures to get to the current quality of service levels … and they do *not* interoperate with one another except via “behind the eyeballs” integration (or read-only operations, of course).
That said, I agree with the thrust of your comments that the “dispute” is overblown, this is really a continuum of concerns and technologies, we need to be looking for common ground, etc. People who have the luxury of building “enterprise” apps from the ground up really should be thinking very hard about how the Web manages to work as well as it does, and should leverage well-proven ideas such as identifying everything with a URI and designing operations to be safe GETs or idempotent saves/deletes wherever possible. But by the same token, people developing Web applications that must be secure, reliable, transactional, neutral with respect to back-end protocols, etc. should be thinking harder about the WS technologies now that they are starting to mature and a common subset is implemented in a demonstrably interoperable fashion by MS WCF, Sun Tango, etc. There’s a lot more method underlying the apparent madness than Tim Bray et al. give it credit for, there’s a lot more real-world success than is widely known, and whatever its flaws, the alternative of starting from scratch *to address the challenges that WS addresses* is not available to any but the most deep-pocketed organizations.
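To make the “safe GETs or idempotent saves/deletes” point concrete, here’s a minimal sketch using only the JDK’s built-in com.sun.net.httpserver; the /records URI, the port, and the in-memory store are invented purely for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Each record gets its own URI (/records/{id}); GET is safe, PUT and DELETE are
// idempotent, so a client whose request times out can simply retry them.
public class ResourceSketch {
    static final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/records/", ResourceSketch::handle);
        server.start();
    }

    static void handle(HttpExchange ex) throws IOException {
        String id = ex.getRequestURI().getPath().substring("/records/".length());
        switch (ex.getRequestMethod()) {
            case "GET":      // safe: no state change on the server
                String value = store.get(id);
                respond(ex, value == null ? 404 : 200, value == null ? "" : value);
                break;
            case "PUT":      // idempotent: same body, same final state, however often it is sent
                store.put(id, new String(ex.getRequestBody().readAllBytes(), StandardCharsets.UTF_8));
                respond(ex, 200, "stored\n");
                break;
            case "DELETE":   // idempotent: deleting twice still ends in "deleted"
                store.remove(id);
                respond(ex, 204, "");
                break;
            default:         // POST and friends are neither safe nor idempotent
                respond(ex, 405, "");
        }
    }

    static void respond(HttpExchange ex, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ex.sendResponseHeaders(status, bytes.length == 0 ? -1 : bytes.length);
        if (bytes.length > 0) {
            try (OutputStream out = ex.getResponseBody()) { out.write(bytes); }
        }
        ex.close();
    }
}
```

Repeating the PUT or the DELETE leaves the server in exactly the same state, which is what makes retrying a timed-out request safe; only the POST-style operations need heavier machinery.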
Well said, Mike!
And thanks to you, Jon, for convincing me that listening to podcasts is an efficient use of otherwise wasted time, e.g. while the machine is busy doing a massive software update :-) It used to bug me that I had to listen to rather than read some of your best stuff.
The VM approach you mentioned sounds similar to something that the Second Life technical team presented at Lang.NET last year. They’re in the process of converting their custom scripting engine, Second Life Script, to be JIT compiled and run under Mono. The interesting part is that they’re doing some pretty extreme customizations to Mono to allow microthreading and in-process stack serialization, which will let their scripts move from server to server in a way that’s transparent to the running process. Their server farm of 1800+ Debian boxes is partitioned geographically, so each server supports the users and objects in a specific area of their virtual world. This system will let the scripts travel with their users.
There’s an interesting video here: http://download.microsoft.com/download/9/4/1/94138e2a-d9dc-435a-9240-bcd985bf5bd7/Jim-Cory-SecondLife.wmv
It seems conceptually similar to what you were talking about, in which services would flow through the network in the same way content currently does.
“It seems conceptually similar to what you were talking about, in which services would flow through the network in the same way content currently does.”
Yes, and your mention of Second Life and Mono reminds me of a follow-on point I neglected to make. One candidate for the Basic Unit of Distributable Computation is in fact not the virtual machine instance with a whole copy of Linux or Windows or some other OS running in it, but instead a hypervisor with a managed runtime like the JVM or CLR riding atop.
The bellwether for this is BEA’s WLS-VE, which I think stands for WebLogic Server Virtualization Edition — basically the JVM running on a VMWare hypervisor, we don’t need no stinking OS.
A compute fabric with these kinds of critters (JVM-style or CLR-style or both) running around in it — that’s a hell of an interesting thing to imagine.
Sorry Mike, your arguments just don’t hold water, IMHO.
“there are a lot more update operations than in typical web applications, those updates have to succeed or fail completely, none of this “go back and check the result URI to see if the transaction went through” stuff; ”
Last I checked, I haven’t seen a POST half-succeed.
Put another way – how is a POST backed by a durable queue any different from a fat client sending a message to a proprietary durable queue?
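Here’s a rough sketch of what I mean, using only the JDK’s built-in HttpServer; the /orders path, the port, and the append-only file standing in for an MQ/JMS queue are all invented for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// A POST front-ending a durable queue: the message is forced to disk before the
// client gets its 202 Accepted, so an accepted request survives a crash, just as
// it would in a proprietary store-and-forward queue.
public class QueuedPostSketch {
    static final Path QUEUE = Path.of("inbox.log");   // stands in for MQ/JMS here

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/orders", QueuedPostSketch::handle);
        server.start();
    }

    static void handle(HttpExchange ex) throws IOException {
        if (!"POST".equals(ex.getRequestMethod())) {
            ex.sendResponseHeaders(405, -1);
            ex.close();
            return;
        }
        String body = new String(ex.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
        String id = UUID.randomUUID().toString();
        // Append and sync before acknowledging: the "durable" part of the queue.
        Files.writeString(QUEUE, id + "\t" + body + System.lineSeparator(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND, StandardOpenOption.SYNC);
        // 202 Accepted: the work is queued, not done; a worker drains inbox.log later,
        // and the client can poll the status URI to see whether it went through.
        ex.getResponseHeaders().set("Location", "/orders/status/" + id);
        ex.sendResponseHeaders(202, -1);
        ex.close();
    }
}
```

The message hits durable storage before the client sees its 202, which is the same guarantee the fat client gets from the proprietary queue.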
“there is less homogeneity of the underlying protocols, etc. — you can’t just give every chunk of information a URI and get/put/post/delete to it from anywhere”
Why not? That’s exactly what companies are doing today when building SOAP wrappers on top of their legacy. Why can’t it just be HTTP?
“those “legacy systems chugging along” work a whole lot better than HTTP for end to end secure, reliable messages”
No. Firstly, there’s little “end to end”; it’s usually “hop to hop”, with enormous resources dedicated to the neurosurgery of getting each hop to work with the next, whether that’s MQ, or CORBA, or DCOM, or WS-*. And securely? Identity propagation is pretty rare in practice — how many Kerberos or SAML experts are out there?
More typically, I see lots of “service accounts” with very simplistic passwords, even in environments with heavy regulation and compliance rules. As for message integrity, are they really using SSL or digital signatures everywhere, or are they assuming a trusted physical network? Mostly it’s the latter.
Thirdly, I see a lot of reconciliation batch processes to catch the records that are dropped between supposedly “reliable” message queue networks, because they can’t get two phase commit to work at scale.
“and the IT managers of the world have no interest in moving to the “modern” world of HTTP’s quality of service.”
This is the great myth of enterprise IT. Reliability is an end to end quality, something that the application has always had to cover. Yes, there’s a performance and convenience argument for letting lower-level messaging systems like WS-RM or JMS take some of it on, but that still doesn’t solve double submissions from a UI, or double submissions from a buggy intermediary. That requires an application-level check, end to end.
Distributed XA transactions are an exception, as their TX boundaries are essentially an application-driven end to end protocol. The problem is that they don’t scale beyond a handful of participants, and they wreak havoc on availability.
Stu, I wouldn’t presume to quarrel with your apparent real-world experience. I’m not convinced about the larger implications, however. First, “I haven’t seen a POST half-succeed”. Uhh, I have. Look at all the blog threads out there with duplicate comments. Look at all the forms that do something with real money and say “Don’t hit the back button, that may duplicate the transaction.” There are plenty of situations where the requested operation succeeds on the back end but the HTTP request times out. That can be worked around by making updates idempotent, but lots of people don’t seem to have worked out the details of how to make that work with their existing systems.
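For what it’s worth, one common way to work out those details is a client-supplied idempotency token that the server remembers. Here’s a minimal sketch; the class and method names are invented, and a real system would persist the token and the receipt in the same transaction as the business update rather than keeping them in memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One way to make a "charge the card" update idempotent: the client sends a
// one-time token with the request (a hidden form field or a header), and the
// server performs the side effect at most once per token, replaying the stored
// result on a retry.
public class IdempotentCharge {
    private final Map<String, String> completed = new ConcurrentHashMap<>();

    // Returns the receipt for this token, charging the account only the first time.
    public String charge(String token, String account, int cents) {
        return completed.computeIfAbsent(token, t -> doCharge(account, cents));
    }

    private String doCharge(String account, int cents) {
        // A real system would debit the account and persist the receipt in the
        // same transaction as the token, so a crash can't split them.
        return "receipt-for-" + account + "-" + cents;
    }

    public static void main(String[] args) {
        IdempotentCharge svc = new IdempotentCharge();
        String first  = svc.charge("tok-123", "acct-42", 995);
        String replay = svc.charge("tok-123", "acct-42", 995); // back button or timed-out retry
        System.out.println(first.equals(replay));              // true: charged once, same receipt
    }
}
```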
As for “with enormous resources dedicated to the neurosurgery of getting each hop to work with the next, whether that’s MQ…”. You may be right. All I know is that during my RESTifarian days I was laughed at in a couple of enterprisey prospects for my employer’s HTTP-interface product by people who explained that they used MQ products to send gigabytes of data around the world daily over complex multi-hop networks with unreliable underlying connections, without worry, without constant monitoring of results and retries on failure, and without neurosurgery. I’ve heard similar stories from others who tried to promote HTTP-based solutions to an MQ-based audience.
As for transactions … again I wouldn’t dispute your experience, but *I* won’t do business again with companies who can’t at least fake a two phase commit — if they take my money, they’d better deliver the product… if they can’t deliver they’d better roll back the charge. My point was not that this is easy … in fact only a handful of companies *really* do it well on Web scale. I do believe that those who do work well at enterprise scale and want to expose their supply chains or whatever over the Web are better off leveraging the WS-* technologies than trying to invent a raw HTTP/XML solution from scratch. I would love to hear about counter-examples… besides Amazon, eBay, etc., who have spent fortunes hiring geniuses and going through multiple iterations to figure it all out.
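One pattern I’ve seen for “faking” that two phase commit without distributed XA is a compensating action: take the money, attempt delivery, and refund the charge if delivery fails. The sketch below assumes invented Payments and Shipping interfaces standing in for real back-end systems, and in practice the compensation itself would need to be durable and retryable.

```java
// Compensating actions instead of a distributed two phase commit: charge the
// account, try to ship, and refund the charge if shipping fails. Payments and
// Shipping are invented stand-ins for real payment and fulfillment systems.
public class CompensationSketch {
    interface Payments { String charge(String account, int cents); void refund(String chargeId); }
    interface Shipping { void ship(String orderId); }

    private final Payments payments;
    private final Shipping shipping;

    public CompensationSketch(Payments payments, Shipping shipping) {
        this.payments = payments;
        this.shipping = shipping;
    }

    public void placeOrder(String orderId, String account, int cents) {
        String chargeId = payments.charge(account, cents);   // step 1: take the money
        try {
            shipping.ship(orderId);                          // step 2: deliver the product
        } catch (RuntimeException failure) {
            payments.refund(chargeId);                       // compensate: roll back the charge
            throw failure;                                   // and surface the failure to the caller
        }
    }
}
```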
There are definitely lots of people who need (whether or not they currently achieve) reliability, security, identity management, transaction management, etc. at enterprise scale, but with the Web in the picture at some point in the dataflows. Are they better off trying to rebuild that using nothing but the Web technologies and the REST design pattern … or tunneling WS messages over HTTP and praying that the consultant or ESB vendor makes it all work? Neither extreme seems sensible; understanding the capabilities and limitations of both and judiciously choosing the right mix seems far more pragmatic.