Here come the GPGPUs

Unquestionably, much of the buzz at SC'10 in New Orleans was about the performance HPC users can attain using GPUs in hybrid computing architectures, much as Blue Gene generated over the past few years. Now, GPUs from AMD/ATI and Nvidia are taking hybrid compute offload out of the realm of proprietary architectures and putting it in the hands of almost every workstation user in the world.

When used correctly (for applications that fit within the physical constraints of the GPU), the performance increase from GPUs can be staggering. Conservative estimates for currently ported solvers show 4 to 5 times the speed of conventional processing, with many reaching 7 to 9 times. With numbers like that, the HPC community takes notice!

The surprise in the GPU market is just how one-sided the supplier field seems to be. At SC'10, processor giant Intel only very quietly started talking about its accelerator codenamed "Knights Ferry." The technology was being demoed in the Intel booth, but with clear implications that it won't be generally available for quite some time.

Meanwhile, Nvidia is becoming a juggernaut. Its much-bemoaned CUDA language is being upgraded very quickly, with new releases taking note of criticisms and addressing them one by one. By sometime next year, CUDA 4.0 should deliver major enhancements to the compilers and debuggers ISVs use, helping them produce better, higher-performing code for Nvidia's crop of new GPGPUs.

Alas, OpenCL seems to be playing step-sister to CUDA's Cinderella. No major ISVs have announced--much less released--applications ported to leverage the AMD/ATI FireStream series. What's on the horizon for AMD? The Fusion architecture seems terribly compelling, but the question is, will Nvidia already own the GPGPU market by the time an alternative is available?

Types of analytics

In a previous blog, I spoke about running analytics on a grid comprised of commodity hardware. In this blog I will discuss the different types of analytics and where they are used. Predictive analytics is used in actuarial science, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and other fields. Sales analytics is the process of determining how successful a company's sales forecast is and how best to predict future sales. In financial services, credit scoring analytics is used to estimate the risk of future payment defaults. Financial services firms also perform rigorous risk and credit analysis, which is required by industry regulation and provides insight into firms' exposure to different types of risk. Other types include marketing, collections, fraud, pricing, supply chain, telecommunications, and transportation analytics.
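To make the credit scoring case concrete, here is a minimal sketch of how a predictive model maps applicant risk indicators to a default probability. The scorecard weights and feature names are made-up illustrative values, not taken from any real credit model:

```python
import math

# Hypothetical scorecard -- weights are illustrative, not from a real model.
# Each feature is a risk indicator scaled to the range 0..1.
WEIGHTS = {"late_payments": 1.5, "utilization": 1.2, "income_band": -0.8}
BIAS = -1.0

def default_probability(applicant):
    """Logistic model mapping risk indicators to a probability of default."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in applicant.items())
    return 1.0 / (1.0 + math.exp(-z))
```

A lender would train such weights on historical repayment data; here a high-utilization applicant with many late payments simply scores a higher default probability than a low-risk one.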

Platform Symphony is the leading grid management middleware solution for accelerating advanced analytics and "doing more with less." In the next blog, we'll discuss how Platform Symphony can be used to reduce analytics runtimes or do more analysis with the same hardware.

Overheard at SC’10…

It was a busy but exciting week for me last week in New Orleans during Supercomputing 2010 (SC’10). I’m really pleased to report that our complete cluster management solution, Platform HPC, which was announced one week prior to the conference, resonated very well with many of the show attendees I talked to.

As our VP of Product Management and Marketing Ken Hertzler noted in his blog announcing the product release a couple of weeks ago, Platform HPC was designed to make cluster building an easier, less painful process for people who need clusters but may not be cluster experts. It's a complete solution that boasts built-in workload management, application integration, and a leading MPI, each of which gives Platform HPC a leg up on our competition. In addition to completeness, ease of use and commercial support are two other aspects that set Platform HPC apart. In packaging this product, we found that support was a huge differentiator with our channel IHV and ISV partners--they told us they always have happier customers when they can sell a complete cluster management solution where commercial support is delivered together with their hardware or applications. They spoke, so we listened!

So you can imagine how pleased we were to also be listening in on some of the comments overheard in our booth at SC’10 last week…

“I thought the best thing about your tool was the completely integrated solution stack it provides. This provides a strong competitive advantage over other advanced scheduling/provisioning/and monitoring tools in that you are able to offer an entire solution stack and have control over it to meet customers’ needs. I don't think Adaptive Computing has this, as Moab requires plugins to xCAT, CMU, or ROCKS.”

SC’10 in Review

We’ve just arrived back from another year at the Supercomputing conference where Platform had a terrific week, and we heard all the industry scuttlebutt about a number of developments that will be affecting everyone over the next year. Over the next few blog entries, I’ve taken the liberty of writing up a short review of some of the events, rumors, and observations that we at Platform found to be the most interesting at the show in 2010.

One of the ceremonies that marks every Supercomputing conference is the unveiling of the TOP500 list, with attendees always taking a keen interest in the newest systems rounding out the ten fastest supercomputers on the planet. This year's conference was different in a few ways, chiefly because China, which had never held the top spot in the TOP500 honor roll before, took both third place and the coveted king-of-the-hill spot with its "Tianhe-1A" computer. This top contender was built on a thoroughly hybrid architecture, containing Intel and AMD processors, GPUs from Nvidia, and even a Chinese-designed and -made processor in parts of the system.

Perhaps Tianhe-1A differs from its recent "purpose-built" progenitors (Blue Gene/L by IBM and the Earth Simulator by NEC, to name just two) because its end goal is to be in service to Chinese research facilities and private industry, both of which will purchase time on the system to run their most challenging simulations to date. (Maybe the West should take notice of this cooperative approach and consider it for its next bid for the top spot.)

On the downside, the show floor carried the implication that Intel has all but officially put the Itanium processor out to pasture. No matter what some may say, it's difficult to dispute that, for many HPC applications, the Itanium was a quantum leap ahead of its time. Had Intel continued to pour its tsunami of R&D funding into Itanium, there is little doubt it would have continued on a path to ever higher levels of greatness. But, alas, even though a good idea before its time is still a good idea, it's not always a successful one. (I'm sure Steve Jobs would agree!) So the IA64 platform didn't even make the grade to be mentioned or displayed in Intel's booth, and HP, the last OEM still shipping the processors, has relegated them to its non-HPC-targeted Integrity lineup.

There was also another abandonment story on the lips of many attendees. Oracle seems to be walking away from the HPC customers and market that Sun flirted with for many years. Though present at the show, their name didn't even make it to the published list of vendors. In the workload management turf war, rumors of Oracle's decision have the sharks at Altair, Adaptive Computing, and Platform Computing circling in anticipation of several companies jumping off the SGE ship.

Finally, Microsoft continues to invest heavily in what for them must be a very small niche market. They had the largest booth of the entire show, and were sponsors for several events throughout the week. Their investment in HPC seems to be gaining them both mindshare and market share, as they are no longer seen as a joke by the HPC community. Still, their numbers remain in the single digits in terms of production clusters that actually run jobs using their OS.

Overall, it was a very lively and well-attended conference with many HPC consumers renewing their interest after the last two years of economic hardship and belt-tightening. We hope the trend will continue and the 2011 show in Seattle will be standing room only!

Live from New Orleans at SC10 – Platform LSF Version 8!

Having just arrived at the bustling Supercomputing 2010 show in New Orleans, I’m pleased to say that the latest version of our flagship product Platform LSF Version 8 was announced today at the show. We believe LSF 8 is the industry’s most comprehensive workload scheduling solution for HPC available today. Our goal with Platform LSF 8 is to help IT organizations really maximize their compute resource utilization and get the most mileage out of their HPC infrastructure.

HPC technology has advanced quite a bit in recent years, with dramatically improved application processing performance across a widening range of disciplines - from computer-generated designs and graphics to scientific calculations. But in speaking with our customers, we've noticed that even sophisticated HPC centers are still struggling to take full advantage of the processing capacity within their HPC infrastructure. Customers are increasingly using clusters to run a variety of applications, and many are also dealing with multiple SLA requirements, which can make it very difficult to orchestrate job management and submissions to ensure optimum cluster utilization and performance.

With that in mind, we built LSF 8 to better handle even more complex workload scheduling in distributed HPC environments, allowing more work to be done with fewer computing resources in the fastest time possible. It provides new intelligent workload scheduling capabilities, such as letting IT managers delegate administrative rights to the appropriate user groups, and incredibly useful enhanced fairshare rules. These new HPC management features demonstrate why our workload solutions continue to lead the market.

The new bells and whistles in version 8 include:
  • the ability to delegate administrative rights to line of business managers
  • live, dynamic cluster reconfiguration
  • guaranteed resources to ensure service level agreements (SLAs) are met
  • flexible fairshare scheduling policies
  • unparalleled scalability to support the large clusters in use today; certified up to 100,000 slots.
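The fairshare idea in that list can be sketched in miniature. This is an illustrative toy, not Platform LSF's actual formula: each user's dispatch priority rises with their assigned shares and falls as they consume the cluster, so heavy users naturally yield to light ones:

```python
# Toy sketch of dynamic fairshare priority (illustrative only -- not LSF's
# real algorithm). Each user is described by (shares, running_jobs, cpu_time).
def fairshare_priority(shares, running_jobs, cpu_time_used, decay=0.5):
    """Higher value means the user's next job dispatches sooner."""
    return shares / (1.0 + running_jobs + decay * cpu_time_used)

def next_user(usage):
    """Pick the user whose pending job should dispatch next."""
    return max(usage, key=lambda user: fairshare_priority(*usage[user]))

# Two users with equal shares: alice has 5 jobs running, bob has none,
# so bob's pending work goes first.
usage = {"alice": (10, 5, 2.0), "bob": (10, 0, 0.0)}
```

Real schedulers add historical decay windows, hierarchical share trees, and per-queue policies on top of this basic shape.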

The ultimate goal of all these new features is higher processing utilization. This will allow our customers to keep down IT costs while improving application processing throughput and getting jobs done faster!

Platform makes HPC in the Cloud a Reality

I wasn't kidding when I mentioned that we've been busy these past couple of months...besides launching Platform LSF 8 today, we've also announced three solutions geared toward helping HPC users take advantage of cloud bursting. HPC users who face unpredictable workloads can now save time and money by tapping the cloud when internal resources are overloaded. We've outlined three separate paths for HPC users to dynamically scale resources to handle the peak workloads of HPC applications:

Solution 1 – Integrated Cluster with the Cloud – Uses Platform LSF’s dynamic host capabilities along with a cloud service, such as the Amazon Virtual Private Cloud (VPC), to provide cloud resources that appear to operate within the HPC datacenter (local IP addresses, host names, etc.).

Solution 2 - MultiCluster to the Cloud – Allows users to start up a new cluster in any cloud or hosting provider environment without a dedicated link such as Amazon VPC.

Solution 3 - Dynamic Cluster Extension to the Cloud – Platform LSF and Platform ISF create a powerful and flexible cloud management solution that dynamically scales clusters to include resources within the data center infrastructure or externally to a cloud provider based on the application workload.

All of these paths to HPC clouds are based on products we ship today. They allow customers to improve utilization and resource capacity while avoiding the costs and delays due to lack of resources. For more information get a copy of our HPC Cloud Bursting whitepaper.

If you’re here in New Orleans attending Supercomputing ’10 (SC’10), feel free to stop by the Platform Computing booth (#2739), and we can show you a demo of these new products and offerings. Hope to see you there!

Supercomputing goes commercial in China

Just having come back from the HPC China Conference in Beijing two weeks ago where TianHe 1A was announced, I am now packing for SC’10 in New Orleans.

For those of you unfamiliar with TianHe 1A, it is currently the world’s fastest supercomputer, capable of computing 2.507 petaflops, or 2,507 trillion calculations per second. TianHe 1A, together with Nebulae (the world’s second fastest supercomputer according to the July 2010 TOP500), will be hosted in government-funded supercomputing centers in China. However, unlike many other countries where the top systems almost exclusively run research code, supercomputing centers in China will run like commercial companies, selling computing capacity to industry and research firms. Their model center, the Shanghai Supercomputing Center (#24 on the July 2010 TOP500), has about 300 customers and operates at full capacity 24x7. As a result, these big systems will run more commercial code than the systems in national labs. Such commercialized operations will demand high software quality and strong local support.

The recently launched Platform HPC matches needs like these very well. As a complete management solution with a leading MPI implementation, Platform HPC brings performance and productivity to production environments. It provides all the necessary components to manage large systems, including a user web portal, application integrations, resource allocation and job scheduling, cluster management and monitoring, and job usage accounting. Most importantly, it is supported by a single vendor, which dramatically decreases troubleshooting time.

I’m looking forward to seeing how such an HPC cloud model on top systems shapes the future of the HPC community.

Introducing Platform HPC, the most complete cluster solution available

The past few weeks have been really busy here at Platform leading up to today, but I’m proud to say that we have just released the latest product in our flagship HPC product family, Platform HPC. As the most complete cluster software solution currently on the market, Platform HPC makes cluster management and workload scheduling easy for users who are not experts in clusters but need HPC clusters to help improve their application performance. One main goal with Platform HPC is to allow technical application users across a variety of vertical markets—from engineering and scientific research to oil and gas and digital media—to spend their time and energy concentrating on their research and designs rather than setting up and managing their computing environments.

Until now, clusters have been a daunting prospect for most workstation and technical application users. While engineers, scientists and graphic artists may all be expert users of the apps they work with on a daily basis, most are not IT experts—and within most organizations, cluster building has been left to the IT department because clusters can be challenging to set up. To address this, we specifically designed Platform HPC with technical app users in mind so that both experienced and novice HPC users can quickly and easily deploy, run and manage their own clusters while still meeting the application performance and workload management demands they require.

Another primary goal of Platform HPC is to make it easier for users to drive their cluster from their applications. Built on our Platform LSF workload scheduler, Platform HPC is accessible through a web portal that integrates, via application templates, with some of the leading technical applications on the market. We’re supporting apps including Abaqus, ANSYS, BLAST, Fluent, LS-DYNA, MSC Nastran and Schlumberger among others, with plans to add more in the next few months. Platform HPC uses a single installer, is pre-certified by leading hardware and software vendors, and is available through our certified channel partners. The new product also allows multiple apps to run on the cluster regardless of OS—it supports both Linux and Windows environments. Talk about maximizing your existing infrastructure and compute resources! We also fully support the product and provide one point of contact for support, despite the product being sold through the channel - just another way to make clusters easier and more approachable for users who aren’t cluster experts!

Platform support for GPU Environments

In addition to our Platform HPC product launch, we’re also pleased to announce our continued support for GPU-aware application workload scheduling and monitoring. Both Platform LSF and Platform HPC now incorporate GPU-aware scheduling features for better and more efficient workload management in GPU environments. For more on our support of GPUs, see today’s release.

Finally, we’ll be at Supercomputing ’10 (SC’10) in New Orleans next week, so please stop by the Platform booth (#2739) if you’d be interested in a demo. And stay tuned for more exciting Platform product news at SC’10, as well!

Analytics on a compute grid?

Analytics programs have become an increasingly important tool for many companies to help provide insight into how the business is performing. The need to measure performance and gauge ROI across lines of business has been particularly important during the global recession over the past two years. At the same time, the amount of data that companies amass daily and must process to analyze performance has exploded. As a result, analytics programs are becoming increasingly bloated, expensive and difficult to run within many organizations.

This problem is due in part to the fact that many analytics workloads run on legacy environments (hardware, software, and processes). These environments are very expensive to own and maintain in terms of both CapEx and OpEx. Given market trends of huge increases in data volumes and users, and analytical models requiring near real-time service level agreements (SLAs), these legacy systems have become very expensive and are beginning to hinder, rather than enable, the growth of the business. Furthermore, scalability and performance issues are also beginning to hinder the adoption and ease of use of BI tools.

What is the alternative? Running analytics applications on a compute grid comprised of commodity hardware. This enables a lower-cost, higher-performance approach by simplifying the environment and making it easier for users to move to this model. Rather than overtaxing legacy systems, grid solutions allow you to make the most of the systems you already have in place. Grid provides high availability, lower costs and virtually unlimited scalability.

Platform Symphony is the leading grid management middleware solution. It enables massive parallelization of embarrassingly parallel, compute- and data-intensive applications using policy-based grid management that enforces application service levels. Platform Symphony also manages a shared pool of resources to meet the dynamic demands of multiple applications, enabling enterprise-class availability, security, and scalability and providing maximum utilization across large, diverse pools of resources.
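To see why this class of analytics parallelizes so well, consider a toy Monte Carlo risk calculation. The sketch below uses Python's multiprocessing as a stand-in for grid workers (Platform Symphony's own API is not shown); the portfolio model and the value-at-risk quantile are made-up illustrative choices. Each scenario is independent, so the work fans out across workers with no coordination between tasks:

```python
import random
from multiprocessing import Pool

def simulate_chunk(args):
    """One grid task: simulate a batch of independent loss scenarios."""
    seed, n = args
    rng = random.Random(seed)  # per-task seed keeps tasks independent
    # Toy portfolio loss: sum of 10 unit-normal shocks per scenario.
    return [sum(rng.gauss(0.0, 1.0) for _ in range(10)) for _ in range(n)]

def value_at_risk(n_scenarios=8000, workers=4, quantile=0.95):
    """Fan scenarios out across workers, then take the loss quantile."""
    per_task = n_scenarios // workers
    with Pool(workers) as pool:
        chunks = pool.map(simulate_chunk,
                          [(seed, per_task) for seed in range(workers)])
    losses = sorted(loss for chunk in chunks for loss in chunk)
    return losses[int(quantile * len(losses))]
```

Because the tasks never communicate, doubling the workers roughly halves the runtime - exactly the property a grid scheduler exploits when spreading such workloads over commodity nodes.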