What a Part-Time HPC Cluster Admin Needs

In most organizations that use small and medium HPC clusters (fewer than 200 nodes), an HPC cluster is treated as a separate system from an IT management perspective. As a result, the IT administration effort allocated to it is very limited. Often, a part-time Linux administrator is tasked with taking care of an HPC cluster used by tens or even hundreds of users. Because most IT administrators are not HPC experts, they usually rely on the management tools included with the HPC cluster package for their daily work. Those packages are usually a stack of open source software with very limited support. Because the components are assembled for their functionality, they are not integrated with one another. What IT administrators really need is a robust, easy-to-use management tool that keeps the cluster up and running, rather than spending their time integrating the software stack provided with the cluster. This is a very different use case from that of large HPC shops, which have plenty of in-house HPC expertise.

We’ve designed and built Platform HPC specifically for organizations that need small or medium-sized clusters. Starting with an easy one-step installation, Platform HPC automates many of the complex tasks of managing a cluster, including provisioning and patching nodes, integrating applications, and troubleshooting user problems, which saves time and hassle for part-time IT administrators. A web interface also makes remote administration a reality. As long as an administrator has a web browser that can reach the head node of the HPC cluster, they can easily monitor, manage, and troubleshoot the cluster.

At the same time, the tools and technologies included in Platform HPC were originally designed for large-scale environments. Under the hood, those tools are powerful, scalable, and highly customizable. For those running a small cluster, whether or not it is growing rapidly, Platform HPC would be the best choice now and in the future.

