HPC shifting sands: Distributed or centralized infrastructure

As NVIDIA takes the HPC world by storm by lending the power of its GPUs (graphics processing units) to HPC applications, ISVs, system architects, and users are all trying to find their footing as they plan how they’ll use HPC in the next year or two. To make a historical comparison, the core counts, memory footprints, and I/O bandwidth achievable in a single workstation today would have been enough to make HPC aficionados salivate only two years ago.

Indeed, “deskside” clusters were the prediction in 2007: with more computing power than any one user could ever need, they were expected to bring about the demise of the centralized cluster. Well, the HPC community is a greedy bunch. And in some ways the prediction has come half true: “deskside clusters” have actually developed into “desktop SMPs”.

It seems that the line between needing a cluster and getting by with a personal workstation is shifting. I believe that’s due to the enormous computing power offered by GPU accelerators. Yes, most applications aren’t ready yet, and, yes, the Tesla hardware is expensive. But those two barriers hold only today and are likely to disappear quickly.

Time will bring more mainstream applications into GPU enablement and force their now-square peg into NVIDIA’s round hole. The elite prices of GPGPUs are nothing more than market perception: I just purchased an NVIDIA card with 96 cores for less than $200 for a media center PC.

This trend adds up to a temporary shift in the centralized vs. distributed computing tug-of-war. Every one of the workstations on a corporate refresh cycle will have graphics capability. Servers, on the other hand, are often on a slower refresh cycle and are scrutinized more carefully by corporate bean counters who won’t buy GPU servers until CIOs and CTOs tell them to.

Desktop compute harvesting, once a fringe HPC endeavor, may suddenly offer a better ROI than ever as a technique for providing HPC scale at minimal incremental cost. The question is: Will the window of opportunity for such techniques close before a complete solution can be developed and delivered?

Of course, the inclusion of GPUs in servers will happen. When that occurs, the balance will shift back. Until then, though, the horsepower of cloud or even local clusters has become a little less attractive when there’s a processing stallion underneath the desk waiting to be bridled.

