A Missed Opportunity for Cluster Users - Aggregated Memory

Clusters for batch workloads continue to find new areas where they are useful. In the late 90s, clusters were originally referred to as "Beowulf" clusters (a term I believe was coined by Don Becker) and were focused on running parallel applications based on the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM): a simulation domain was decomposed into multiple pieces, those pieces were distributed to member compute hosts, and the simulation ran iteratively, updating boundary conditions over a network interconnect.

Today clusters are used more widely than ever before, in part because more application simulation algorithms have been recast in a manner friendly to using MPI for communication; however, this is just a fraction of the use cases now commonly deployed on clusters. Other uses include: financial services institutions using clusters to calculate risk in their portfolios; electronic design firms running simulations and regression tests on chip designs to push die sizes ever smaller; movie houses creating special effects and sometimes building and visualizing entire virtual worlds with clusters; scientists running Monte Carlo style simulations to explore the range of design scenarios for a risk situation; and pharmaceutical companies probing genomes to speed their drug discovery processes.

These use cases are largely serial applications run in batches, each often executed thousands to millions of times. Clusters provide aggregate throughput at a price point that has proven extremely advantageous. We at Platform Computing like to think we helped this market develop by supplying the leading, fastest, most scalable and most robust workload management suite available.

An emerging trend with these serial applications is that the amount of memory required to run a simulation is growing faster than the memory capacity of commodity servers. Additionally, clusters are used for many purposes simultaneously, so jobs large and small must compete for resources. Large jobs tend to get "starved": many small jobs can fit on a single server, and while they run they prevent a large-memory job from finding enough free memory on any one host. The result is that users either compromise on simulation detail to shrink the memory footprint or are forced to buy non-commodity "big iron" that can house 512GB of memory or more.
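To make the starvation pattern concrete, here is a toy sketch in Python (not Platform LSF code). The server size, job mix, and round-robin placement are all invented for illustration: the cluster has plenty of free memory in aggregate, yet no single host can start a large-memory job.

```python
# Toy illustration of memory fragmentation across a cluster.
# The server capacity and job sizes below are invented for this example.

SERVER_MEM_GB = 64   # hypothetical commodity server
NUM_SERVERS = 8

free = [SERVER_MEM_GB] * NUM_SERVERS   # free memory per server

# Spread 32 small 12 GB jobs round-robin, as a load-balancing scheduler might.
for n in range(32):
    host = n % NUM_SERVERS
    if free[host] >= 12:
        free[host] -= 12

print("Free memory per server:", free)          # 16 GB everywhere
print("Total free memory:", sum(free), "GB")    # 128 GB in aggregate

# A 100 GB job cannot start anywhere, even though the cluster as a whole
# has 128 GB free; the memory is simply on the wrong servers.
big_job_gb = 100
print("Can the big job start?", any(f >= big_job_gb for f in free))
```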

The frustrating part of this trend is that there is usually plenty of memory in the cluster to satisfy an application's requirement; however, that memory is spread across separate servers.

Virtualization is often viewed as a non-HPC technology. Though I am fighting this blanket assumption, that is not the point of this blog post. Virtualization, in this case, offers mobility to a workload in a way that was not possible before. By this I mean live migration: the ability of a virtual machine to move between hardware resources while it is running. Before virtualization, the only way an application could be moved from one server to another was by checkpointing it, and even then it suffered from having to shut down and restart, as well as from restrictions on the characteristics of the hosts eligible to restart the job.

In the case of memory limitations preventing jobs from running, it is now possible to continuously "pack" a workload onto the minimum number of servers, maximizing the availability of large chunks of free memory on individual hosts. If this is done automatically and to good effect, large-memory jobs will no longer be starved and can instead launch immediately.
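As a sketch of what such packing could look like, here is a simple first-fit-decreasing consolidation pass in Python. This is not Platform Adaptive Cluster's actual algorithm; the server capacity, the job layout, and the idea of returning a list of migrations for a hypervisor to carry out as live migrations are all assumptions made for illustration.

```python
# Sketch of a consolidation ("packing") pass, assuming live migration exists.
# This is NOT Platform Adaptive Cluster's actual algorithm: it is a plain
# first-fit-decreasing repacking over hypothetical servers and running jobs.

def plan_consolidation(layout, capacity_gb=64):
    """layout: dict of server name -> list of (job_id, mem_gb) currently running.

    Returns (new_layout, migrations), where migrations is a list of
    (job_id, source_server, target_server) tuples a real system would hand
    to the hypervisor as live migrations.
    """
    # Gather all running jobs, largest memory footprint first.
    jobs = sorted(
        ((jid, mem, src) for src, jl in layout.items() for jid, mem in jl),
        key=lambda j: j[1],
        reverse=True,
    )

    new_layout = {name: [] for name in layout}
    free = {name: capacity_gb for name in layout}
    migrations = []

    for jid, mem, src in jobs:
        # First fit: fill earlier servers, so later servers drain completely.
        for target in layout:
            if free[target] >= mem:
                new_layout[target].append((jid, mem))
                free[target] -= mem
                if target != src:
                    migrations.append((jid, src, target))
                break

    return new_layout, migrations


# Before: three 64 GB hosts each run two 20 GB jobs, so no host has more
# than 24 GB free and a 40 GB job cannot start anywhere.
before = {
    "hostA": [("j1", 20), ("j2", 20)],
    "hostB": [("j3", 20), ("j4", 20)],
    "hostC": [("j5", 20), ("j6", 20)],
}
after, moves = plan_consolidation(before)
print(moves)   # j3, j5 and j6 migrate; hostC ends up empty with 64 GB free
```

Smarter placement than first fit is obviously possible, but even this simple pass illustrates the point: once jobs can move while running, free memory can be gathered into whole servers rather than left stranded in small pieces.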

Platform's Adaptive Cluster product will be able to do just this by leveraging the power of virtualization. It may even turn out that the gain in throughput more than offsets the performance overhead associated with virtualization. Time will tell.
