
Why Combine Platform Computing with IBM?

You may have read the news that Platform Computing has signed a definitive agreement to be acquired by IBM, and you may wonder why. I’d like to share with you our thinking at Platform and our relevance to you and to the dramatic evolution of enterprise computing. Even though I’m not an old man yet and am usually too busy doing stuff, for once I will try to look at the past, present and future.

After the first two generations of IT architecture, centralized mainframe and networked client/server, IT has finally advanced to its third generation architecture of (true) distributed computing. An unlimited number of resource components, such as servers, storage devices and interconnects, are glued together by a layer of management software to form a logically centralized system – call it virtual mainframe, cluster, grid, cloud, or whatever you want. The users don’t really care where the “server” is, as long as they can access application services – probably over a wireless connection. Oh, they also want those services to be inexpensive and preferably on a pay-for-use basis. Like a master athlete making the most challenging routines look so easy, such a simple computing model actually calls for a sophisticated distributed computing architecture. Users’ priorities and budgets differ, apps’ resource demands fluctuate, and the types of hardware they need vary. So, the management software for such a system needs to be able to integrate whatever resources, morph them dynamically to fit each app’s needs, arbitrate amongst competing apps’ demands, and deliver IT as a service as cost effectively as possible. This idea gave birth to commodity clusters, enterprise grids, and now cloud. This has been the vision of Platform Computing since we began 19 years ago.



Just as client/server took 20 years to mature into the mainstream, clusters and grids have taken 20 years, and cloud for general business apps is still just emerging. Two areas have been leading the way: HPC/technical computing, followed by Internet services. It’s no accident that Platform Computing was founded in 1992 by Jingwen Wang and me, two renegade Computer Science professors with no business experience or even interest. We wanted to translate ’80s distributed operating systems research into cluster and grid management products. That’s when the lowly x86 servers were becoming powerful enough to do the big proprietary servers’ job, especially if a bunch of them banded together to form a cluster, and later on an enterprise grid with multiple clusters. One system for all apps, shared with order. Initially, we talked to all the major systems vendors to transfer university research results to them, but there were no takers. So, we decided to practice what we preached. We have been growing and profitable all these 19 years with no external funding. Using clusters and grids, we replaced a supercomputer at Pratt & Whitney to run 10 times more Boeing 777 jet engine simulations, and we supported CERN in processing insane amounts of data looking for the God Particle. While the propeller heads were having fun, enterprises in financial services, manufacturing, pharmaceuticals, oil & gas, electronics, and the entertainment industries turned to these low cost, massively parallel systems to design better products and devise more clever services. To make money, they compute. To out-compete, they out-compute.

The second area adopting clusters, grids and cloud, following HPC/technical computing, is Internet services. By the early 2000s, business at Amazon and Google was going gangbusters, yet they wanted a more cost effective and infinitely scalable system versus buying expensive proprietary systems. They developed their own management software to lash together x86 servers running Linux. They even developed their own middleware, such as MapReduce for processing massive amounts of “unstructured” data.



This brings us to the present day and the pending acquisition of Platform by IBM. Over the last 19 years, Platform has developed a set of distributed middleware and workload and resource management software to run apps on clusters and grids. To keep leading our customers forward, we have extended our software to private cloud management for more types of apps, including Web services, MapReduce and all kinds of analytics. We want to do for enterprises what Google did for themselves, by delivering the management software that glues together whatever hardware resources and applications these enterprises use for production. In other words, Google computing for the enterprise. Platform Computing.



So, it’s all about apps (or IaaS :-)). Old apps going distributed, new apps built as distributed. Platform’s 19 years of profitable growth has been fueled by delivering value to more and more types of apps for more and more customers. Platform has continued to invest in product innovation and customer services.

The foundation of this acquisition is the ever expanding technical computing market going mainstream. IDC has been tracking this technical computing systems market segment at $14B, or 20% of the overall systems market. It is growing at 8%/year, or twice the growth rate of servers overall. Both IDC and users also point out that the biggest bottleneck to wider adoption is the complexity of clusters and grids, and thus the escalating needs for middleware and management software to hide all the moving parts and just deliver IT as a service. You see, it’s well worth paying a little for management software to get the most out of your hardware. Platform has a single mission: to rapidly deliver effective distributed computing management software to the enterprise. On our own, especially in the early days when going was tough, we have been doing a pretty good job for some enterprises in some parts of the world. But, we are only 536 heroes. Combined with IBM, we can get to all the enterprises worldwide. We have helped our customers to run their businesses better, faster, cheaper. After 19 years, IBM convinced us that there can also be a “better, faster, cheaper” way to help more customers and to grow our business. As they say, it’s all about leverage and scale.

We all have to grow up, including the propeller heads. Some visionary users will continue to buy the pieces of hardware and software to lash together their own systems. Most enterprises expect to get whole systems ready to run their apps, but they don’t want to be tied down to proprietary systems and vendors. They want choices. Heterogeneity is the norm rather than the exception. Platform’s management software delivers the capabilities they want while enabling their choices of hardware, OS and apps.


IBM’s Systems and Technology Group wants to remain a systems business, not a hardware business nor a parts business. Therefore, IBM’s renewed emphasis is on systems software in its own right. IBM and Platform, the two complementary market leaders in technical computing systems and management software respectively, are coming together to provide overall market leadership and help customers to do more cost effective computing. In IBM speak, it’s smarter systems for smarter computing enabling a Smarter Planet. Not smarter people. Just normal people doing smarter things supported by smarter systems.

Now that I hopefully have you convinced that we at Platform are not nuts coming together with IBM, we hope to show you that Platform’s products and technologies have legs to go beyond clusters and grids. After all, HPC/technical computing has always been a fountainhead of new technology innovation feeding into the mainstream. Distributed computing as a new IT architecture is one such example. Our newer products for private cloud management, Platform ISF, and for unstructured data analytics, Platform MapReduce, are showing some early promise, even awards, followed by revenue. 

IBM expects Platform to operate as a coherent business unit within its Systems and Technology Group. We got some promises from folks at IBM. We will accelerate our investments and growth. We will deliver on our product roadmaps. We will continue to provide our industry-best support and services. We will work even harder to add value to our partners, including IBM’s competitors. We want to make new friends while keeping the old, for one is silver while the other is gold. We might even get to keep our brand name. After all, distributed computing needs a platform, and there is only one Platform Computing. We are an optimistic bunch. We want to deliver to you the best of both worlds – you know what I mean. Give us a chance to show you what we can do for you tomorrow. Our customers and partners have journeyed with Platform all these years and have not regretted it. We are grateful to them eternally.

So, with a pile of approvals, Platform Computing as a standalone company may come to an end, but the journey continues to clusters, grids, clouds, or whatever you want to call the future. The prelude is drawing to a close, and the symphony is about to start. We want you to join us at this show.

Thank you for listening.

ISC Cloud 2011

The European conference concerning the intersection between cloud computing and HPC has just finished, and it's very pleasing to report that this conference delivered considerable helpings of useful and exciting information on the topic.

Though cloud computing and HPC have tended to stay separate, interest within the HPC community has been building since the SC’10 conference, primarily because access to additional, temporary resources is very tempting. There are other reasons for HPC users and architects to evaluate the viability of cloud as well, including total cost of ownership comparisons, and startup businesses that need temporary access to HPC but lack the capital to purchase dedicated infrastructure.

Conclusions varied from presenter to presenter, though some things were generally agreed upon:

  1. If using Amazon EC2, HPC applications must use the Cluster Compute instances to achieve performance comparable to local clusters.
  2. Fine-grained MPI applications are not well suited to the cloud, simply because none of the major vendors offer InfiniBand or another low-latency interconnect on the back end.
  3. Running long term in the cloud, even with favorable pricing agreements, is much more expensive than running in local data centers, as long as those data centers already exist. (No one presented a cost analysis that included the data center build costs as an amortized cost of doing HPC.)


Another interesting trend was the different points of view depending on where the presenter came from. Generally, researchers from national labs had the point of view that cloud computing was not comparable to their in-house supercomputers and was not a viable alternative for them. Also, compared to the scale of their in-house systems, the resources available from Amazon or others were seen as quite limited.

Conversely, presenters from industry had the opposite point of view (notably a presentation given by Guillaume Alleon from EADS). Their much more modest requirements seemed to map much better into the available cloud infrastructure and the conclusion was positive for cloud being a capable alternative to in-house HPC.

Perhaps this is another aspect of the disparity between capability and capacity HPC computing. One maps well into the cloud, the other doesn't.

Overall it was a very useful two days. My only regret was not being able to present Platform's view on HPC cloud. See my next blog for some technologies to keep an eye on for overcoming cloud adoption barriers. Also, if anyone is interested in HPC and the cloud, this was the best and richest content event I've ever attended. Highly recommended.

One small step for man, one enormous leap for science

News from CERN last week that E may not, in fact, equal mc² was earth shattering. As the news broke, physicists everywhere quivered in the knowledge that everything they once thought true may no longer hold and the debate that continues to encircle the announcement this week is fascinating. Commentary ranges from those excited by the ongoing uncertainties of the modern world to those who are adamant mistakes have been made.

This comment from Matt Alden-Farrow on the BBC sums up the situation nicely:

“This discovery could one day change our understanding of the universe and the way in which things work. Doesn’t mean previous scientists were wrong; all science is built on the foundation of others work.”

From our perspective, this comment not only sums up the debate, but also the reality of the situation. Scientific discoveries are always built on the findings of those that went before and the ability to advance knowledge often depends on the tools available.

Isaac Newton developed his theory of gravity when an apple fell on his head – the sophisticated technology we rely on today just didn’t exist. His ‘technology’ was logic. Einstein used chemicals and mathematical formulae which had been discovered and proven. CERN used the Large Hadron Collider and high performance computing.

The reality is that scientific knowledge is built in baby steps, and the time these take is often determined by the time it takes for the available technology to catch up with the existing level of knowledge. If we had never known Einstein’s theory of relativity, who’s to say that CERN would have even attempted to measure the speed of particle movement?

The Economics of Cloud, Part I - IDC HPC User Forum

After attending IDC’s HPC User Forum in Houston last month and participating in an HPC cloud panel, it became clear to me that many potential cloud users are still confused about the economics of cloud and when it's beneficial. One of the complaints we heard many times was that Amazon’s pricing model is significantly (nearly three times) more expensive than an outright hardware purchase. While true, users who complain about this fact may be, at least partially, missing the primary use case for external cloud computing.


Our cloud panel didn’t have enough time to delineate the conditions and workloads where cloud computing offers economic advantages, so it seems appropriate to start that discussion here in the first of a series on the Economics of Cloud.


There are several factors that should be considered before undertaking an HPC cloud computing pilot; most are helpful, though not strictly required, conditions. These include:

· Practical input and output data sizes, or post-processing methods that can be run in place to avoid data transfer

· Serial or coarse-grained parallel workloads

· Data security policies that can be satisfied by the cloud

· Application, OS and performance requirements that lead to acceptable performance in the cloud

· Unsteady workload requirements (meaning the amount of resource a workload requires varies over time)


This last factor is the one that might be the most confusing. Using IaaS can be very cost effective if the results from a workload are highly valuable and short lived. Conversely, results of unknown value and lengthy execution durations or large data requirements can have enormous charges associated with them.


One simple way of visualizing this is to compare the peak workload (expressed as a fraction of the available local resource) with the average workload. The difference between these two values, if significant, is a good indicator of whether cloud computing could have a positive ROI. If this effect is plotted over time and the average and peak lines are overlaid, the term "peak shaving" is clearly an apt description of the benefit cloud computing can offer.
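To make that comparison concrete, here is a minimal sketch, with an entirely made-up hourly demand trace and local capacity, of how the peak, the average, and the burstable fraction of a workload might be computed:

```python
# Hypothetical sketch: estimate the "peak shaving" potential of a workload.
# The hourly demand trace (in cores) and the local capacity are invented inputs.

def peak_shaving_indicator(demand_cores, local_capacity):
    """Return (average, peak, fraction of core-hours exceeding local capacity)."""
    average = sum(demand_cores) / len(demand_cores)
    peak = max(demand_cores)
    # Core-hours above the locally owned capacity are the candidates for
    # bursting to an external cloud rather than buying more hardware.
    overflow = sum(max(0, d - local_capacity) for d in demand_cores)
    return average, peak, overflow / sum(demand_cores)

if __name__ == "__main__":
    hourly_demand = [40, 45, 50, 300, 320, 60, 45, 40]   # spiky, illustrative only
    avg, peak, burst_fraction = peak_shaving_indicator(hourly_demand, local_capacity=100)
    print(f"average {avg:.0f} cores, peak {peak} cores, "
          f"{burst_fraction:.0%} of core-hours exceed local capacity")
```

A large gap between the peak and average lines, as in this invented trace, is exactly the situation where pay-per-use capacity tends to pay off.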


Invariably, a steady workload is most efficiently processed on a local data center resource when compared to pay-per-use rates. Indeed, most IaaS providers have built a factor of two to three times hardware costs into their pricing to account for the opportunity value of near-instantaneous access to compute resources. Thus, paying this "tax" for a steady workload could have disastrous financial consequences if adopted as a strategy.
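As a back-of-the-envelope illustration (all of the prices below are invented, not any provider's actual rates), the steady-workload "tax" works out roughly like this:

```python
# Invented numbers: a steady, always-on workload run locally vs. pay-per-use.
LOCAL_COST_PER_CORE_HOUR = 0.05   # assumed amortized hardware + power + admin
CLOUD_MULTIPLIER = 2.5            # the two-to-three-times pricing factor above

cores, hours_per_year = 500, 24 * 365
local_annual = cores * hours_per_year * LOCAL_COST_PER_CORE_HOUR
cloud_annual = local_annual * CLOUD_MULTIPLIER
print(f"local: ${local_annual:,.0f}/yr   cloud: ${cloud_annual:,.0f}/yr   "
      f"premium: ${cloud_annual - local_annual:,.0f}/yr")
```

For a workload that never goes idle, that premium buys nothing, which is where the disastrous financial consequences come from.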


Anyone interested in permanent or long term cloud resource access should probably investigate longer term service contracts with a selected IaaS provider if local resources are not an option. Such an alternative agreement could easily change any potential negative financial estimates for the benefits of cloud.

HPC from A-Z (part 8) - H

This blog post will focus on the letter H, and how HPC is being used to crunch data to determine the origins of the universe!


High Energy Physics – When it comes to high energy physics, there is only one name in the frame: CERN. The European Organization for Nuclear Research is using HPC for the Large Hadron Collider, a 27-kilometer particle accelerator ring, buried about 100 meters underground, designed to improve our understanding of atoms and the universe. This is quite possibly the most fascinating and imaginative use of HPC. Can you think of one that trumps it?


CERN depends on computing power to ensure that 17,000+ scientists and researchers in 270 research centers across 48 countries can collaborate on a global scale to solve the mysteries of matter and the universe. An HPC environment is central to this, and enables scientists and researchers to quickly analyze the data.

Not to burst your bubble…

Probably everyone who has ever been in an HPC environment as a user has run into resource constraints. Limitations on hardware, licenses, memory, or maybe even GPUs in hybrid compute engines rank among the biggest constraints most users are facing today. The problem comes when you run into these limitations in the middle of running a job. What do you do then? “Hitting the stops,” so to speak, will often trigger another procurement cycle and consume lots of resources in terms of analysis, internal meetings, planning, as well as the purchase and deployment phases before any additional real work can get completed.


Cloud computing, or in this case cloud bursting, offers an approach to mitigate the process and limitations that most HPC consumer corporations go through today. Certainly, using resources outside the firewall can have its own challenges for corporate users, but those aren’t the focus of this blog.


Assume for a moment that security, licensing, provisioning latency and data access are not a problem. Of course, they’re all major issues that need to be addressed to make a cloud solution usable, but bear with me. There are still some important questions that need to be answered:

  • What are the appropriate conditions to start up and provision cloud infrastructure?
  • What jobs should be sent to the cloud once that infrastructure is provisioned and which should stay local and wait?


This second question is the focus here. Often in science, the hardest part of solving a tough problem is stating the question properly. In this case, the question is nicely represented by the inequality below, where each term represents a portion of elapsed time. For cloud computing to be advantageous from a performance perspective:


(Data upload to cloud) + (Cloud Pend time) + (Cloud Run time) + (Data download from cloud) < (Local Pend time) + (Local Run time)


Such a statement allows us to draw a few conclusions about the conditions for when cloud bursting is advantageous for the HPC user:

  • When local pend time estimates for a job get very large
  • When local elapsed run time is large -- a corollary to this condition is that if the job can be parallelized, but there are insufficient resources locally to run it quickly, then bursting the job to the cloud may return results to the user sooner than allowing the job to run on insufficient local resources
  • When the job’s data transfer requirements into and out of the cloud are small

In addition to those conditions, we start to see where several of the real challenges lie for a scheduler making the right decision about which jobs get sent to the cloud and which don’t. For instance, most schedulers today do not consider the data volume associated with a job. But in a cloud scenario, the associated data transfer times could be 2-50x the runtime of a job, dependent not only on file size but also on the available transfer bandwidth. Schedulers will need to evolve on several levels to tackle this challenge:
  • Allow users to indicate the files (both input and output) required for each job
  • Estimate pend and run times for disparate infrastructures
  • Estimate the run time for jobs which run in parallel
Be on the lookout for solutions from Platform Computing that start to address exactly these challenges.
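As a closing illustration, here is a minimal, hypothetical sketch of the turnaround-time inequality above. Every input is an estimate supplied by the caller (precisely the estimates today's schedulers don't yet produce), and it does not reflect any actual Platform scheduler interface.

```python
# Hypothetical sketch of the cloud-bursting decision inequality above.
# All inputs are caller-supplied estimates (bytes and seconds); a real
# scheduler would have to derive them per job and per infrastructure.

def should_burst(upload_bytes, download_bytes, bandwidth_bytes_per_s,
                 cloud_pend_s, cloud_run_s, local_pend_s, local_run_s):
    """Return True if the estimated cloud turnaround beats the local turnaround."""
    transfer_s = (upload_bytes + download_bytes) / bandwidth_bytes_per_s
    cloud_total = transfer_s + cloud_pend_s + cloud_run_s
    local_total = local_pend_s + local_run_s
    return cloud_total < local_total

if __name__ == "__main__":
    # Made-up job: 2 GB in, 10 GB out, ~100 Mbit/s link, long local queue.
    print(should_burst(upload_bytes=2e9, download_bytes=10e9,
                       bandwidth_bytes_per_s=12.5e6,
                       cloud_pend_s=300, cloud_run_s=3600,
                       local_pend_s=7200, local_run_s=5400))   # -> True
```

Even this toy version shows why a scheduler needs per-job file lists and per-infrastructure pend and run time estimates: without them, none of the terms on the left-hand side can be filled in.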

HPC from A-Z (part 2) - B

This is our second post in the HPC ABC series. Last week we mused upon the different industries beginning with the letter A that could benefit from HPC. The idea of this series is to help realize the potential HPC has to solve problems or enhance development and design for a number of industries. Predictably enough, we’re focusing on the letter B in this post.

Biology - One of our customers, The Sanger Institute, is a genome research institute that is primarily funded by the Wellcome Trust. It has participated in some of the most important advances in genomic research, developing new understanding of genomes and their role in biology. That type of research requires a great deal of computational power so scientists can perform large scale analysis, such as quickly comparing similar genomic structures.

For more information on how Sanger benefits from an HPC environment please have a look at our video.

HPC is helping biology researchers find out what we are made of. Next week we’ll look at how HPC is helping companies develop and design better consumer products.

Straw Poll Shows HPC Looking to the Cloud

If what we saw at the Supercomputing 2010 (SC’10) Conference in New Orleans proves to be any indicator, 2011 will see real cloud deployments for many HPC users. As Chris Porter reported in his blog from the show, and as we reported in today’s press release on the short survey we did at the conference, the cloud is coming to HPC and it’s coming fast!

We spoke to 100 delegates at the conference from a number of disciplines—research, government, education, manufacturing to name a few—and almost two-thirds of them have already been experimenting with public and private cloud environments within their organizations. What’s more—they’re generally happy with their cloud experiences thus far—and many of those who have not done cloud trials are planning to do so within the next 12 months.

What a far cry from last year. In our 2009 survey, users were only “considering” establishing a private cloud, let alone actually starting a cloud initiative. What’s driving this move toward experimenting with both public and private clouds? According to our survey, it’s the ability to offload applications and workloads to public cloud providers (23 percent) and to burst workloads (15 percent). HPC users have been generally skeptical about the hype around cloud but eager to reduce costs as they require more performance and scale. Our recent webinar on "when offloading to the cloud works and doesn't," along with the proper cloud strategy, offers some considerations for organizations looking at HPC in the cloud.

If you’re interested in learning more, check out our whitepaper on HPC cloud scenarios—we’ve got customers running some really fascinating HPC solutions in the cloud right now.

Storm is brewing – HPC Cloud

At the 2009 Supercomputing conference in Portland last year, Platform Computing showed off our first-generation cloud computing management tool, Platform ISF. At that time, the “cloud” buzzword was still fairly new to the HPC community and had several stern critics in the HPC space. To many HPC folk, “cloud” meant virtualization, and virtualization meant low performance. Very few other vendors at the conference last year even used the C-word, and when they did it was to describe other types of computing (e.g. enterprise computing, dynamic datacenter provisioning, etc.). So for a while many believed, as we did, that virtualization takes the “H” out of “HPC.”

In contrast to last year’s conference, this year both software vendors and infrastructure vendors were present talking about making cloud adoption easy, with every hardware vendor trying to persuade potential customers that building a private cloud using their hardware was a smart choice – especially when the vendor offers their own IaaS model for workload overflow (Platform calls that “Cloud Bursting”).

Also in contrast to 2009, this year showed that hypervisors and processors alike have matured to better support near-hardware performance with virtualization. Indeed, the performance chasm for some applications has narrowed to a crack (for more on this, see our Platform whitepaper). Also, perceptions of the cloud have started to change in the HPC community. For the right jobs, virtualization doesn’t have to mean an unacceptable performance burden, and the advantages it brings to management, not to mention flexibility, are hard to ignore.

This year at SC’10 we gave almost the same demonstration, with more polish. The difference was that the reaction had turned from disdain and skepticism to curiosity and interest. Yes, there are still several issues that need to be sorted before cloud computing is simple for HPC (licensing, data movement, and data security are the biggies). Nevertheless, HPC users are finally beginning to think about the cloud, and performance is becoming less and less of an issue. Amazon, for instance, let their HPC performance data walk and talk for itself at the show: a cluster using HPC on EC2 placed 230th in the TOP500 (see http://www.top500.org/system/10661). So there’s no debating it--you can do HPC in the external public cloud – at least if you’re running Linpack.

Even if your application may be difficult to adapt to the cloud, the barriers are falling one by one. So taking the longer term view, in the next 5 years, HPC in the cloud doesn’t seem only feasible, it seems--as we at Platform Computing believe--inevitable.