Performance and Productivity of an HPC Cluster (3)

My last blog post described the results of refreshing a 32-node CFD (computational fluid dynamics) cluster with a completely different solution from the one used in the previous cluster. With Platform HPC, the refreshed cluster had a significantly lower job failure rate, a 15% increase in usage, and a 25% increase in throughput.


Once the HPC cluster was in production with happy users, the management team could start planning how to further improve application performance and user productivity. They identified two areas to investigate: GPU adoption and new ways to speed up the design process.


GPU Use


Using GPUs to accelerate applications is what this organization is looking at next. Some commercial ISVs (independent software vendors) have already released applications that support GPU acceleration, for example, ANSYS. Adding GPUs to the cluster could dramatically reduce application run time, but it also increases cluster complexity. Fortunately, Platform HPC can help reduce that complexity. With Platform HPC, it’s much easier for administrators to deploy the software that GPUs require. Platform Computing has repackaged NVIDIA’s CUDA software into a format that can be deployed automatically across all compute nodes in a cluster by using the cluster management capability of Platform HPC. The workload scheduler in Platform HPC is also GPU aware, so users running GPU jobs don’t need to worry about which nodes have GPUs or whether those GPUs are already in use by other users. These capabilities would go a long way toward helping this organization overcome the hurdles of adopting GPU technology.
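
To make the idea of GPU-aware scheduling concrete, here is a minimal sketch in Python. It is not Platform HPC's scheduler or its API; the Node class, the place_job function, and the best-fit policy are all hypothetical and only illustrate how a scheduler can track free GPUs so that users never have to pick a node themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical view of one compute node as a scheduler might track it."""
    name: str
    total_gpus: int
    used_gpus: int = 0

    @property
    def free_gpus(self) -> int:
        return self.total_gpus - self.used_gpus


def place_job(nodes: List[Node], gpus_needed: int) -> Optional[str]:
    """Reserve GPUs for a job and return the chosen node name.

    Returns None when no node currently has enough free GPUs, in which
    case a real scheduler would keep the job pending.
    """
    candidates = [n for n in nodes if n.free_gpus >= gpus_needed]
    if not candidates:
        return None
    # Best fit (an assumed policy): take the node with the fewest free
    # GPUs that still fits, leaving larger nodes open for larger jobs.
    chosen = min(candidates, key=lambda n: n.free_gpus)
    chosen.used_gpus += gpus_needed
    return chosen.name


if __name__ == "__main__":
    cluster = [Node("node01", 0), Node("node02", 2), Node("node03", 4)]
    for request in (2, 2, 4):
        print(request, "GPU(s) ->", place_job(cluster, request))
```

The user only states how many GPUs the job needs. Which node has them, and whether another job is already using them, is bookkeeping the scheduler does on the user's behalf.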


Design Process Acceleration


After users moved from the command line interface to the web interface, they still relied on some complex scripts to automate job flows. These scripts were originally built by a few power users. New users typically just copied them and wrapped them with a few lines of code as required, so when there was a problem, it was very difficult to debug. One technology the company is evaluating to alleviate this problem is Platform Process Manager. Platform Process Manager allows users to define a flow graphically without writing scripts, and it can be integrated with Platform HPC’s web interface. Once a flow is developed, users can click a button to launch it, and Platform Process Manager takes care of the job dependencies. The company is exploring whether Platform Process Manager can automate its job flows to further increase user productivity. The graphical view also makes job flows much easier to maintain and improve, and it can serve as documentation of best practices in the engineering process.
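
The dependency handling described above can be sketched in a few lines of Python. This is not Platform Process Manager or its API; the flow layout, the job names, and the run_flow helper are hypothetical, and the sketch assumes Python 3.9 or later for the standard graphlib module. It shows only the core idea: a flow is a set of jobs plus the jobs each one waits for, and a job starts only after its dependencies have finished.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Callable, Dict, List, Set


def run_flow(flow: Dict[str, Set[str]],
             run_job: Callable[[str], bool]) -> List[str]:
    """Run jobs in dependency order and return the jobs that completed.

    flow maps each job name to the set of jobs it depends on.
    run_job is a caller-supplied function that returns True on success.
    The flow stops at the first failed job, so later jobs never run on
    bad input. A circular dependency raises graphlib.CycleError.
    """
    completed: List[str] = []
    for job in TopologicalSorter(flow).static_order():
        if not run_job(job):
            break
        completed.append(job)
    return completed


if __name__ == "__main__":
    # Hypothetical CFD flow: mesh, then solve, then post-process, then report.
    cfd_flow = {
        "solve": {"mesh"},
        "postprocess": {"solve"},
        "report": {"postprocess"},
    }
    try:
        done = run_flow(cfd_flow, lambda job: print("running", job) is None)
        print("completed:", done)
    except CycleError as err:
        print("flow has a circular dependency:", err)
```

Keeping the dependencies in one declared structure, rather than spread across copied wrapper scripts, is what makes a failed flow easier to debug: the point of failure and everything downstream of it are explicit.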


The partnership with Platform Computing allows this design firm to improve digital design efficiency and user productivity. By running more simulations to improve design quality, the firm uses high performance computing to gain a competitive edge.
