This week’s Perspectives is a two-parter: an interview and companion screencast on the topic of cluster computing in the classroom. The interview is with Kyril Faenov, the General Manager of the Windows HPC (high performance computing) unit, and the screencast is with Rich Ciapala, a program manager for Microsoft HPC++ Labs.
The project demonstrated in the screencast, and discussed in the interview, is called CompFin Lab. It’s a system that lets professors give their students a way to run computationally expensive financial models on large quantities of data. From the student’s perspective, you go to a SharePoint server, select a computational model, pick a basket of stocks, and run the model. Behind the scenes the task is partitioned and sprayed across a cluster of computers, then the results are gathered and presented in an Excel spreadsheet.
From the professor’s point of view, some .NET programming is required. But a framework abstracts the mechanics of dealing with the cluster, so the professor can focus on the logic of the model itself.
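To make that division of labor concrete, here’s a minimal sketch of the partition, spray, and gather pattern. It is not the CompFin Lab framework, which is .NET-based and whose API isn’t shown here. Everything in it is an assumption for illustration: the `PRICES` data, the `average_price` model, the `run_on_cluster` helper, and the local thread pool standing in for a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical price history per ticker; a real system would read from a data store.
PRICES: Dict[str, List[float]] = {
    "AAA": [10.0, 11.0, 12.0],
    "BBB": [20.0, 19.0, 21.0],
    "CCC": [5.0, 5.5, 6.0],
    "DDD": [40.0, 42.0, 41.0],
}


def average_price(ticker: str) -> float:
    """The 'model': the only piece the professor would need to write."""
    return mean(PRICES[ticker])


def partition(items: List[str], n_parts: int) -> List[List[str]]:
    """Split the basket into n_parts chunks by dealing items out round-robin."""
    return [items[i::n_parts] for i in range(n_parts)]


def run_chunk(model: Callable[[str], float], chunk: List[str]) -> Dict[str, float]:
    """Work done on one node: apply the model to every ticker in its chunk."""
    return {ticker: model(ticker) for ticker in chunk}


def run_on_cluster(
    model: Callable[[str], float], basket: List[str], n_workers: int = 2
) -> Dict[str, float]:
    """Hypothetical framework entry point: partition, spray, then gather."""
    chunks = partition(basket, n_workers)
    # A thread pool stands in for the cluster's nodes in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda chunk: run_chunk(model, chunk), chunks))
    # Gather: merge the per-node partial results into one table.
    results: Dict[str, float] = {}
    for partial in partials:
        results.update(partial)
    return results


results = run_on_cluster(average_price, ["AAA", "BBB", "CCC", "DDD"])
```

The point of the shape is that `average_price` knows nothing about chunks, workers, or merging. Swapping in a different model means changing one function, which is the kind of separation the framework is described as providing.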
There are a couple of key points about the evolution of high-performance computing that I want to highlight here. First, there’s what Kyril calls “the gravitational pull of data.” Increasingly, people and organizations are building vast repositories of data that other people and organizations will want to analyze in computationally expensive ways. It’s great to have access to a compute cluster in the cloud that can do the heavy lifting, but when datasets get really big you get bottlenecked trying to send the data to where the code runs. At a certain point you’d rather send the code to where the data lives.
A second and related point is that in our current model for large-scale cloud-based computing, there are only a handful of what I call intergalactic clusters — namely, those operated by Google, Yahoo, Amazon, and Microsoft. These are one-of-a-kind behemoths. You can’t replicate one of them locally and apply it to your terabytes of data. So as Kyril and his team build out their cloud-based HPC services, they’re working to ensure the services can be replicated locally.
Maybe the most optimal thing is for you to stand up a 1000-node cluster with each node having a terabyte of disk. We want to enable that. We want to be able to tell our customers: Here’s how we run these large-scale data-driven HPC applications, and here’s how, within a day or two, you can stand up one of these yourself.
The idea is that if you build one of those for your own terabyte trove of astronomical or climatological data, you can run your own computations against that data, and you can also share that capability with other people and organizations who want to run their code against your data.