
Hadoop MapReduce low latency matters! BGI Shenzhen wins IDC Innovation Award


Last week IDC released the winners for the HPC Innovation Excellence Awards.

As per IDC’s announcement, BGI Shenzhen saved millions of dollars while enabling faster processing of their large genome data sets. The IBM Platform Symphony product was used for the Hadoop MapReduce applications. Platform Symphony provides a low-latency Hadoop MapReduce implementation. In conjunction with its unique shared memory management and its data-aware, low-latency scheduler, it accelerates many life sciences applications by as much as 10 times over open source Apache Hadoop.
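For readers less familiar with the programming model behind these genomics workloads, below is a minimal, illustrative sketch of a k-mer counting job written in the Hadoop Streaming style (a mapper and a reducer that communicate over stdin/stdout). It is not BGI's pipeline; the script name, the k-mer length and the input file are hypothetical, and a low-latency runtime such as Platform Symphony would dispatch many such map and reduce tasks across a cluster rather than through a shell pipeline.

#!/usr/bin/env python
# Minimal k-mer counting in MapReduce style (Hadoop Streaming convention:
# mapper and reducer read stdin and write tab-separated key/value pairs).
# Illustrative sketch only, not BGI's actual application.
import sys
from itertools import groupby

K = 8  # k-mer length, chosen arbitrarily for this example

def mapper(lines):
    """Emit (k-mer, 1) for every k-mer in each input sequence line."""
    for line in lines:
        seq = line.strip().upper()
        for i in range(len(seq) - K + 1):
            print(f"{seq[i:i+K]}\t1")

def reducer(lines):
    """Sum the counts for each k-mer; input must arrive sorted by key."""
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for kmer, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{kmer}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    # Local usage: kmer.py map < reads.txt | sort | kmer.py reduce
    mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)

A MapReduce runtime performs the same map, shuffle/sort and reduce phases, only at cluster scale and, in Symphony's case, with much lower task dispatch latency.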

Below is an excerpt from the release:

BGI Shenzhen (China). BGI has developed a set of distributed computing applications using a MapReduce framework to process large genome data sets on clusters. By applying advanced software technologies including HDFS, Lustre, GlusterFS, and the Platform Symphony MapReduce framework, the institute has saved more than $20 million to date. For some application workloads, BGI achieved a significant improvement in processing capabilities while enabling the reuse of storage, resulting in reduced infrastructure costs while delivering results in less than 10 minutes, versus the prior time of 2.5 hours. Some of the applications enabled through the MapReduce framework included: sequencing of 1% of the Human Genome for the International Human Genome Project; contributing 10% to the International Human HapMap Project; conducting research in combating SARS, and a German variant of the E. coli virus; and completely sequencing the rice genome, the silkworm potato genome, and the human gut metagenome. Project leader: Lin Fang

Congratulations to the BGI team for their well deserved recognition.

Rohit Valia
Director, Marketing
HPC Cloud and Analytics

And the winner is... Jaguar Land Rover for business IT project of the year!

At a glittering event in London, Platform customer Jaguar Land Rover (JLR) was awarded the prestigious Business IT Project of the Year award at the annual UK IT Industry Awards, hosted by Computing Magazine in the UK.

Platform Computing worked with Jaguar Land Rover to create an advanced IT environment to underpin the organisation’s virtual car product development, complying with strict safety and environmental regulations. JLR has deployed a state-of-the-art system consisting of scalable compute clusters and engineering workstations, all built from commodity technologies. The project was recognised for its complexity and for its ability to reduce the time to market, engineering costs and environmental impact of product development for JLR.

Big congratulations to the team for this significant achievement!

Big Data report from SC’11

In my previous blog I expressed high expectations for the Big Data-related activities at this year’s Supercomputing conference. Coming back from the show, I’d say the enthusiasm and knowledge on Big Data within the HPC community truly surprised me. Here are the major highlights from the show:

  • Good flow of traffic at the Platform booth for Platform MapReduce. Many visitors stopped by our booth to learn more about Platform MapReduce – a distributed runtime engine for MapReduce applications. I found it easy to talk to the HPC crowd because many folks in this community are already familiar with Platform LSF and Platform Symphony; both are flagship products from Platform that have been deployed and tested in large-scale distributed computing environments for many years. Since Platform MapReduce is built on the same core technology as those mature products, the HPC community quickly understood the key features and functions it brings to Big Data environments. Even though many users are still at an early stage of either developing MapReduce applications or looking into new programming models, they understand that a sophisticated workload scheduling engine and resource management tool will become critically important once they are ready to deploy their applications into production. Many HPC sites were also interested in exploring the potential of leveraging their existing infrastructure for processing data-intensive applications. For instance, we were frequently asked how MPI and MapReduce jobs can coexist on the same cluster (a toy sketch of the MPI side of such a mix follows this list). The good news: Platform MapReduce is the only solution that can support such mixed workloads.
  • “Hadoop for Adults” -- This was a quote from one of the attendees after sitting through our breakfast briefing on overcoming MapReduce barriers. We LOVE it! The briefing drew over 130 people and well exceeded our expectations! Our presentation on how to overcome the major challenges in current Hadoop MapReduce implementations generated great interest. “Hadoop for Adults” sums up the distinct benefits Platform MapReduce brings. Platform Computing knows how to manage large-scale distributed computing environments, and bringing that same technology into Big Data environments is a natural extension for us. The reaction at SC’11 to Platform MapReduce was encouraging and a validation of our expertise in scheduling and managing workloads and overall infrastructure in a data center.
  • Growing momentum on application development. As sophisticated as always, the HPC community is at the forefront of developing applications to solve data-intensive problems across various industries and disciplines: cyber security, bioinformatics, the electronics industry and financial services are just a few examples. Many Big Data related projects are being funded at HPC data centers and we expect a proliferation of applications to come out of those projects very soon.
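As referenced in the first bullet above, the MPI side of a mixed cluster typically looks like a tightly coupled, CPU-bound job such as the toy mpi4py sketch below, while the MapReduce side consists of data-parallel map and reduce tasks; a shared scheduler has to place both on the same resources without starving either. This is a generic illustration that assumes the mpi4py package and an MPI runtime are installed; it is not part of Platform MapReduce itself.

# Toy CPU-bound MPI job (Monte Carlo estimate of pi) of the kind that might
# share a cluster with data-intensive MapReduce tasks under a common scheduler.
# Launch with an MPI runtime, e.g. "mpirun -n 16 python pi_mpi.py".
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

samples = 1_000_000  # samples drawn by each rank
hits = sum(1 for _ in range(samples)
           if random.random() ** 2 + random.random() ** 2 <= 1.0)

# Combine the partial counts from all ranks on rank 0.
total_hits = comm.reduce(hits, op=MPI.SUM, root=0)
if rank == 0:
    print("pi ~= %.5f (from %d ranks)" % (4.0 * total_hits / (samples * size), size))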

The show is officially over but the excitement around Big Data will continue. For me, not only have I gained tremendous insights on the Big Data momentum in HPC, but I’m also pleased to see the overwhelming reaction for Platform MapReduce within the HPC community. Nothing beats pitching the right product to the right audience!  

Linux Interacting with Windows HPC Server

There are many interesting technology showcases at SC’11 this week. One of the Birds of a Feather sessions this week will discuss a solution implemented at the Virginia Tech Transportation Institute (VTTI) that mixes Linux with Windows HPC Server in a cluster for processing large amounts of data.


Without a proper tool or a lot of practice, getting Linux and Windows to work together seamlessly to provide a unified interface for end users is a very challenging task. Having both systems coexist in an HPC cluster environment adds an order of magnitude of additional complexity compared to an already complex Linux-only HPC cluster.


This is because Windows and Linux “speak very different languages” in many areas, such as user account management, file paths and directory structure, cluster management practices, application integration and so on.
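As one small, concrete illustration of those differences, the sketch below uses Python's standard pathlib module to translate a Windows UNC path into the corresponding POSIX path on a shared filesystem. The share name and mount point are made up for the example; a real mixed cluster needs this kind of mapping, plus account and scheduler integration, handled consistently, which is what Platform HPC automates.

# Translate a Windows UNC path to the equivalent Linux mount path so a job
# submitted from either OS can find its data. Share and mount names are made up.
from pathlib import PureWindowsPath, PurePosixPath

SHARE_MOUNTS = {
    r"\\fileserver\projects": PurePosixPath("/mnt/projects"),  # hypothetical mapping
}

def windows_to_linux(unc_path: str) -> PurePosixPath:
    win = PureWindowsPath(unc_path)
    mount = SHARE_MOUNTS[win.drive]        # win.drive holds the \\server\share prefix
    return mount.joinpath(*win.parts[1:])  # parts[0] is the drive+root; the rest is relative

if __name__ == "__main__":
    print(windows_to_linux(r"\\fileserver\projects\run1\input.dat"))  # /mnt/projects/run1/input.dat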


The good news is that the Platform Computing engineering team did some heavy lifting in product development for this project. Platform HPC integrates with the full software stack required to run an HPC Linux cluster. Its major differentiator compared to alternative solutions is that Platform HPC is application aware. When adding Windows HPC Server into the HPC cluster, the solution delivered by Platform HPC provides a unified user experience across Linux and Windows and hides the differences and complexity between the two operating systems.


The Platform HPC team has developed a step-by-step guide for implementing an end-to-end solution covering provisioning of both Windows and Linux, unified user authentication, unified job scheduling, automated workload-driven OS switching, application integration, and unified end-user interfaces.



This solution significantly reduces the complexity of a mixed Windows and Linux cluster, so users can focus on their applications and productive work instead of managing that complexity.

Why Combine Platform Computing with IBM?

You may have read the news that Platform Computing has signed a definitive agreement to be acquired by IBM and you may wonder why. I’d like to share with you our thinking at Platform and our relevance to you and to the dramatic evolution of enterprise computing. Even though I am not an old man yet and am usually too busy doing stuff, for once I will try to look at the past, present and future.

After the first two generations of IT architecture, centralized mainframe and networked client/server, IT has finally advanced to its third generation architecture of (true) distributed computing. An unlimited number of resource components, such as servers, storage devices and interconnects, are glued together by a layer of management software to form a logically centralized system – call it virtual mainframe, cluster, grid, cloud, or whatever you want. The users don’t really care where the “server” is, as long as they can access application services – probably over a wireless connection. Oh, they also want those services to be inexpensive and preferably on a pay-for-use basis. Like a master athlete making the most challenging routines look so easy, such a simple computing model actually calls for a sophisticated distributed computing architecture. Users’ priorities and budgets differ, apps’ resource demands fluctuate, and the types of hardware they need vary. So, the management software for such a system needs to be able to integrate whatever resources, morph them dynamically to fit each app’s needs, arbitrate amongst competing apps’ demands, and deliver IT as a service as cost effectively as possible. This idea gave birth to commodity clusters, enterprise grids, and now cloud. This has been the vision of Platform Computing since we began 19 years ago.



Just as client/server took 20 years to mature into the mainstream, clusters and grids have taken 20 years, and cloud for general business apps is still just emerging. Two areas have been leading the way: HPC/technical computing followed by Internet services. It’s no accident that Platform Computing was founded in 1992 by Jingwen Wang and me, two renegade Computer Science professors with no business experience or even interest. We wanted to translate ‘80s distributed operating systems research into cluster and grid management products. That’s when the lowly x86 servers were becoming powerful enough to do the big proprietary servers’ job, especially if a bunch of them banded together to form a cluster, and later on an enterprise grid with multiple clusters. One system for all apps, shared with order. Initially, we talked to all the major systems vendors to transfer university research results to them, but there were no takers. So, we decided to practice what we preached. We have been growing and profitable all these 19 years with no external funding. Using clusters and grids, we replaced a supercomputer at Pratt & Whitney to run 10 times more Boeing 777 jet engine simulations, and we supported CERN to process insane amounts of data looking for the God Particle. While the propeller heads were having fun, enterprises in financial services, manufacturing, pharmaceuticals, oil & gas, electronics, and the entertainment industries turned to these low cost, massively parallel systems to design better products and devise more clever services. To make money, they compute. To out-compete, they out-compute.

The second area adopting clusters, grids and cloud, following HPC/technical computing, is Internet services. By the early 2000s, business at Amazon and Google was going gangbusters, yet they wanted a more cost effective and infinitely scalable system versus buying expensive proprietary systems. They developed their own management software to lash together x86 servers running Linux. They even developed their own middleware, such as MapReduce for processing massive amounts of “unstructured” data.



This brings us to the present day and the pending acquisition of Platform by IBM. Over the last 19 years, Platform has developed a set of distributed middleware and workload and resource management software to run apps on clusters and grids. To keep leading our customers forward, we have extended our software to private cloud management for more types of apps, including Web services, MapReduce and all kinds of analytics. We want to do for enterprises what Google did for themselves, by delivering the management software that glues together whatever hardware resources and applications these enterprises use for production. In other words, Google computing for the enterprise. Platform Computing.



So, it’s all about apps (or IaaS). Old apps going distributed, new apps built as distributed. Platform’s 19 years of profitable growth has been fueled by delivering value to more and more types of apps for more and more customers. Platform has continued to invest in product innovation and customer services.

The foundation of this acquisition is the ever expanding technical computing market going mainstream. IDC has been tracking this technical computing systems market segment at $14B, or 20% of the overall systems market. It is growing at 8%/year, or twice the growth rate of servers overall. Both IDC and users also point out that the biggest bottleneck to wider adoption is the complexity of clusters and grids, and thus the escalating needs for middleware and management software to hide all the moving parts and just deliver IT as a service. You see, it’s well worth paying a little for management software to get the most out of your hardware. Platform has a single mission: to rapidly deliver effective distributed computing management software to the enterprise. On our own, especially in the early days when the going was tough, we have been doing a pretty good job for some enterprises in some parts of the world. But, we are only 536 heroes. Combined with IBM, we can get to all the enterprises worldwide. We have helped our customers to run their businesses better, faster, cheaper. After 19 years, IBM convinced us that there can also be a “better, faster, cheaper” way to help more customers and to grow our business. As they say, it’s all about leverage and scale.

We all have to grow up, including the propeller heads. Some visionary users will continue to buy the pieces of hardware and software to lash together their own systems. Most enterprises expect to get whole systems ready to run their apps, but they don’t want to be tied down to proprietary systems and vendors. They want choices. Heterogeneity is the norm rather than exception. Platform’s management software delivers the capabilities they want while enabling their choices of hardware, OS and apps. 


IBM’s Systems and Technology Group wants to remain a systems business, not a hardware business or a parts business. Therefore, IBM’s renewed emphasis is on systems software in its own right. IBM and Platform, the two complementary market leaders in technical computing systems and management software respectively, are coming together to provide overall market leadership and help customers to do more cost effective computing. In IBM speak, it’s smarter systems for smarter computing enabling a Smarter Planet. Not smarter people. Just normal people doing smarter things supported by smarter systems.

Now that I hopefully have you convinced that we at Platform are not nuts coming together with IBM, we hope to show you that Platform’s products and technologies have legs to go beyond clusters and grids. After all, HPC/technical computing has always been a fountainhead of new technology innovation feeding into the mainstream. Distributed computing as a new IT architecture is one such example. Our newer products for private cloud management, Platform ISF, and for unstructured data analytics, Platform MapReduce, are showing some early promise, even awards, followed by revenue. 

IBM expects Platform to operate as a coherent business unit within its Systems and Technology Group. We got some promises from folks at IBM. We will accelerate our investments and growth. We will deliver on our product roadmaps. We will continue to provide our industry-best support and services. We will work even harder to add value to our partners, including IBM’s competitors. We want to make new friends while keeping the old, for one is silver while the other is gold. We might even get to keep our brand name. After all, distributed computing needs a platform, and there is only one Platform Computing. We are an optimistic bunch. We want to deliver to you the best of both worlds – you know what I mean. Give us a chance to show you what we can do for you tomorrow. Our customers and partners have journeyed with Platform all these years and have not regretted it. We are grateful to them eternally.

So, with a pile of approvals, Platform Computing as a standalone company may come to an end, but the journey continues to clusters, grids, clouds, or whatever you want to call the future. The prelude is drawing to a close, and the symphony is about to start. We want you to join us at this show.

Thank you for listening.

ISC Cloud 2011

The European conference concerning the intersection between cloud computing and HPC has just finished, and it's very pleasing to report that this conference delivered considerable helpings of useful and exciting information on the topic.

Though cloud computing and HPC have tended to stay separate, interest within the HPC community has been growing since the SC2010 conference, primarily because access to additional temporary resources is very tempting. Other reasons for HPC users and architects to evaluate the viability of cloud include total cost of ownership comparisons, and startup businesses which may need temporary access to HPC but do not have the capital to purchase dedicated infrastructure.

Conclusions varied from presenter to presenter, though some things were generally agreed upon:

  1. If using Amazon EC2, HPC applications must use the Cluster Compute instances to achieve performance comparable to local clusters.
  2. Fine-grained MPI applications are not well suited to the cloud, simply because none of the major vendors offer InfiniBand or another low-latency interconnect on the back end.
  3. Running long term in the cloud, even with favorable pricing agreements, is much more expensive than running in local data centers, as long as those data centers already exist. (No one presented a cost analysis that included the data center build costs as an amortized cost of doing HPC; a rough sketch of such a comparison follows this list.)
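To make that last point concrete, here is a rough, hypothetical sketch of the kind of comparison presenters were making. Every figure in it is a made-up placeholder and would need to be replaced with real cloud pricing, hardware quotes and utilization data.

# Back-of-the-envelope cloud vs. on-premise HPC cost comparison.
# All numbers are hypothetical placeholders, not measured data.

def cloud_cost(node_hours, price_per_node_hour):
    """Pure pay-per-use cost."""
    return node_hours * price_per_node_hour

def onprem_cost(node_hours, nodes, capex_per_node, lifetime_years,
                opex_per_node_year, datacenter_build=0.0):
    """Amortized hardware plus operations; optionally amortize the facility too."""
    years = node_hours / (nodes * 24 * 365)  # wall-clock years of use at full load
    capex = nodes * capex_per_node * (years / lifetime_years)
    opex = nodes * opex_per_node_year * years
    facility = datacenter_build * years / lifetime_years
    return capex + opex + facility

if __name__ == "__main__":
    hours = 256 * 24 * 365  # one year of a fully busy 256-node cluster
    print("cloud:              $%.0f" % cloud_cost(hours, 1.60))
    print("on-prem:            $%.0f" % onprem_cost(hours, 256, 6000, 3, 1500))
    print("on-prem + facility: $%.0f" % onprem_cost(hours, 256, 6000, 3, 1500, 2_000_000))

With these placeholder numbers the existing data center wins comfortably, and folding an amortized facility build into the on-premise figure narrows, but does not close, the gap, which is exactly the caveat noted in point 3.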


Another interesting trend was the different points of view depending on where the presenter came from. Generally, researchers from national labs had the point of view that cloud computing was not comparable to their in-house supercomputers and was not a viable alternative for them. Also, compared to the scale of their in-house systems, the resources available from Amazon or others were seen as quite limited.

Conversely, presenters from industry had the opposite point of view (notably a presentation given by Guillaume Alleon from EADS). Their much more modest requirements seemed to map much better into the available cloud infrastructure and the conclusion was positive for cloud being a capable alternative to in-house HPC.

Perhaps this is another aspect of the disparity between capability and capacity HPC computing. One maps well into the cloud, the other doesn't.

Overall it was a very useful two days. My only regret was not being able to present Platform's view on HPC cloud. See my next blog for some technologies to keep an eye on for overcoming cloud adoption barriers. Also, if anyone is interested in HPC and the cloud, this was the best and richest content event I've ever attended. Highly recommended.

One small step for man, one enormous leap for science

News from CERN last week that E may not, in fact, equal mc² was earth shattering. As the news broke, physicists everywhere quivered in the knowledge that everything they once thought true may no longer hold, and the debate that continues to encircle the announcement this week is fascinating. Commentary ranges from those excited by the ongoing uncertainties of the modern world to those who are adamant mistakes have been made.

This comment from Matt Alden-Farrow on the BBC sums up the situation nicely:

“This discovery could one day change our understanding of the universe and the way in which things work. Doesn’t mean previous scientists were wrong; all science is built on the foundation of others work.”

From our perspective, this comment not only sums up the debate, but also the reality of the situation. Scientific discoveries are always built on the findings of those that went before and the ability to advance knowledge often depends on the tools available.

Isaac Newton developed his theory of gravity when an apple fell on his head – the sophisticated technology we rely on today just didn’t exist. His ‘technology’ was logic. Einstein used chemicals and mathematical formulae which had been discovered and proven. CERN used the Large Hadron Collider and high performance computing.

The reality is that scientific knowledge is built in baby steps, and the time these take is often determined by the time it takes for the available technology to catch up with the existing level of knowledge. If we had never known Einstein’s theory of relativity, who’s to say that CERN would have even attempted to measure the speed of particle movement?

Blog Series – Five Challenges for Hadoop MapReduce in the Enterprise, Part 3


Challenge #3: Lack of Application Deployment Support

In my previous blog, I explored the shortcomings in resource management capabilities in the current open source Hadoop MapReduce runtime implementation. In this installment of the “Five Challenges for Hadoop MapReduce in the Enterprise” series, I’d like to take a different view on the existing open source implementation and examine the weaknesses in its application deployment capabilities. This is critically important because, at the end of the day, it is the applications that a runtime engine needs to drive; without a sufficient support mechanism, a runtime engine will be of only limited use.
 
To better illustrate the shortcomings in the current Hadoop implementation’s application support, we use the diagram below to demonstrate how the current solution handles workloads.


As shown in the diagram, the current Hadoop implementation does not provide multiple workload support. Each cluster is dedicated to a single MapReduce application, so if a user has multiple applications, s/he has to run them serially on that same resource or buy another cluster for the additional application. This single-purpose resource implementation creates inefficiency, a siloed IT environment and management complexity (IT ends up managing multiple resources separately).

Our enterprise customers have told us they require a runtime platform designed to support mixed workloads running across all resources simultaneously so that multiple lines of business can be served. Customers also need support for workloads that may have different characteristics or are written in different programming languages. For instance, some of those applications could be data intensive, such as MapReduce applications written in Java, while some could be CPU intensive, such as Monte Carlo simulations, which are often written in C++ -- a runtime engine must be designed to support both simultaneously. In addition, the workload scheduling engine in this runtime has to be able to handle many levels of fair share scheduling priorities and also be capable of handling exceptions such as preemptive scheduling (a toy sketch of fair share ordering follows below). It needs to be smart enough to detect resource utilization levels so it can reclaim resources when they become available. Finally, a runtime platform needs to be application agnostic so that developers do not have to make code changes or recompile in order for the runtime engine to support their applications. The architecture design of the current Hadoop implementation simply does not provide those enterprise-class features required in a true production environment.
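To illustrate what fair share ordering means here, consider the generic, textbook-style sketch below (it is not Platform's scheduling algorithm): queues representing users or business lines each have an entitled share, and the scheduler dispatches from whichever non-empty queue is currently most under-served relative to its share.

# Toy fair-share ordering: each queue has a configured share and a record of
# recent usage; the most under-served queue (lowest used/share ratio) goes first.
# Generic illustration only, not Platform's implementation.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Queue:
    name: str
    share: float               # entitled fraction of the cluster
    used: float = 0.0          # recently consumed core-hours
    pending: List[str] = field(default_factory=list)

def pick_next(queues: List[Queue]) -> str:
    candidates = [q for q in queues if q.pending]
    q = min(candidates, key=lambda q: q.used / q.share)  # most under-served first
    job = q.pending.pop(0)
    q.used += 1.0              # charge one unit of usage for the dispatched job
    return f"{q.name}:{job}"

if __name__ == "__main__":
    queues = [Queue("risk", 0.5, used=10, pending=["mc-sim-1", "mc-sim-2"]),
              Queue("etl",  0.5, used=2,  pending=["mapreduce-1"])]
    for _ in range(3):
        print(pick_next(queues))  # etl goes first because it is further below its share

Preemption and reclaim then layer policy on top of this ordering: a higher-priority queue may suspend running work from an over-served one, and lent resources are taken back once their owner has demand again.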


The Release of Platform HPC 3.0.1 Dell Edition

Last week, Platform HPC 3.0.1 Dell Edition was released. One of the significant features of this release is support for management node high availability on the latest version of Red Hat Enterprise Linux, 6.1. Why is support for management node HA special when many cluster management tools support it already?


Well, Platform HPC is not just a cluster management solution. It is an end-to-end cluster management and productivity tool. When handling management node failover, Platform HPC not only needs to ensure that all the cluster management functionalities can fail over, it also has to ensure that other functionalities fail over at the same time, so that end users and administrators won’t notice a difference before and after the failover. The functionalities covered by failover include, but are not limited to:

  1. Provisioning
  2. Monitoring & alerting
  3. Workload management service
  4. Web portal for users and administrators

In a heavy production environment, the failover function of the user web portal is far more critical than the failover of the cluster management functionality. This ensures users have non-stop access to the cluster through the web portal even if the primary management node running the web server is down.
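As a generic illustration of the failover pattern itself (Platform HPC's actual HA mechanism is more involved than this), a standby management node watches the primary's heartbeat and takes over the whole set of services, not just cluster management, when the heartbeat goes stale:

# Minimal active/standby failover loop sketch. Generic illustration only.
import time

SERVICES = ["provisioning", "monitoring", "workload-manager", "web-portal"]
HEARTBEAT_TIMEOUT = 30.0  # seconds without a heartbeat before failing over

def should_fail_over(last_heartbeat: float, already_active: bool) -> bool:
    """Return True if this standby node should now take over as primary."""
    return (not already_active) and (time.time() - last_heartbeat) > HEARTBEAT_TIMEOUT

def take_over():
    for svc in SERVICES:
        print(f"starting {svc} on standby node")  # placeholder for real service start-up

if __name__ == "__main__":
    # Simulate a primary whose last heartbeat arrived 60 seconds ago.
    if should_fail_over(time.time() - 60, already_active=False):
        take_over()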


Other capabilities included in Platform HPC 3.0.1 Dell Edition include:


  1. Dell hardware specific setup for management through IPMI
  2. One-To-Many BIOS configuration via the idrac-bios-tool
  3. Dell OpenManage integration
  4. Mellanox OFED 1.5.3-1 kit
  5. CUDA 4 kit
  6. Management software integration provided by QLogic and Terascale
  7. Dell fully automated factory install

With these complete and integrated software components, the cluster solution that Platform Computing delivers together with Dell has come a long way since the release of Open Cluster Stack in 2007.

HPC from A - Z (Part 26) - Z

Z is for… Zodiac
The wonders of the universe will remain of interest to the human race until the end of time or at least until we think we know everything about everything. Whichever comes first…

With the developments in technology of late, the amount of information we have about constellations and other universes is growing exponentially. For all we know, in fifty years we may be taking holidays in space or have discovered a form of life just like us in a far away galaxy.

But understanding what is really out there is far more complex than books and films make it seem. The pretty pictures of constellations don’t do astronomy justice! The amount of detail needed to track star and planet movements and understand which direction constellations are moving in requires some seriously high resolution telescopes. Just think about the amount of ‘zoom’ required to detect traces of flowing liquid on Mars. This is well beyond the capabilities of your standard Canon or Nikon – that’s for sure!

With high resolution come high data volumes. So, like all the posts before this, HPC is crucial for cosmology, astrophysics, and high energy physics research. Without it, results could take years to find instead of months or minutes. By the time the path of the celestial sphere is mapped it could easily be into its second or even third cycle.

HPC can also be used in more theoretical contexts. For example, researchers at Ohio State University needed the compute power provided by the Ohio Supercomputing Center's Glenn Cluster to run the simulations and modeling required for their study on the effects of star formation and the growth of black holes.

As we finally reach the end of our ABC series, there’s no denying the critical role that compute power plays in our day-to-day lives. Technology is developing at a startling pace, and with each and every new development comes more data and a consequent need to process and make sense of it. Without HPC our technological advancements would not be nearly as fast, and we as a society would not have the insight and capabilities that we do today.

Blog Series – Five Challenges for Hadoop MapReduce in the Enterprise, Part 2

Challenge #2: Current Hadoop MapReduce implementations lack flexibility and reliable resource management

As outlined in Part 1 of this series, here at Platform, we’ve identified five significant challenges that we believe are currently hindering Hadoop MapReduce adoption in enterprise environments. The second challenge, addressed here, is a lack of flexibility and reliable resource management in the open source solutions currently on the market.

Current Hadoop MapReduce implementations derived from open source are not equipped to address the dynamic resource allocation required by various applications. They are also susceptible to single points of failure at the HDFS NameNode as well as the JobTracker. As mentioned in Part 1, these shortcomings are due to the fundamental architectural design of the open source implementation, in which the job tracker is not separated from the resource manager. As IT continues its transformation from a cost center to a service-oriented organization, the need for an enterprise-class platform capable of providing services for multiple lines of business will rise. In order to support MapReduce applications running in a robust production environment, a runtime engine offering dynamic resource management (such as borrowing and lending capabilities) is critical for helping IT deliver its services to multiple business units while meeting their service level agreements. Dynamic resource allocation capabilities promise to not only yield extremely high resource utilization but also eliminate IT silos, therefore bringing tangible ROI to enterprise IT data centers.
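To illustrate what borrowing and lending between lines of business can look like, here is a generic sketch under assumed share and demand values; it is not Platform MapReduce's actual policy engine.

# Toy resource borrowing/lending: each line of business owns a slice of the
# cluster; idle slots are lent to whoever has unmet demand and are reclaimed
# simply by satisfying the owner's own demand first on the next pass.
# Generic sketch, not Platform's implementation.

def allocate(total_slots, owned, demand):
    """owned/demand are dicts keyed by line of business; returns slots granted."""
    grant = {lob: min(owned[lob], demand[lob]) for lob in owned}  # own share first
    spare = total_slots - sum(grant.values())
    for lob in owned:                                             # lend spare to unmet demand
        extra = min(spare, demand[lob] - grant[lob])
        grant[lob] += extra
        spare -= extra
    return grant

if __name__ == "__main__":
    owned = {"risk": 60, "marketing-etl": 40}
    print(allocate(100, owned, {"risk": 20, "marketing-etl": 70}))
    # {'risk': 20, 'marketing-etl': 70}: marketing borrows 30 idle risk slots.
    print(allocate(100, owned, {"risk": 60, "marketing-etl": 70}))
    # {'risk': 60, 'marketing-etl': 40}: the owner reclaims its share; nothing left to lend.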

Equally important is high reliability. An enterprise-class MapReduce implementation must be highly reliable so there are no single points of failure. Some may argue that the existing solution in Hadoop MapReduce has shown very low rates of failure and therefore reliability is not of high importance.  However, our experience and long history of working with enterprise-class customers has proved that in mission critical environments, the cost of one failure is measured in millions of dollars and is in no way justifiable for the organization. Eliminating single points of failure could significantly minimize the downtime risk for IT. For many organizations, that translates to faster time to results and higher profits.

HPC from A-Z (part 25) - Y

Y is for… yield strength

Think how many children play with plasticine and Play-Doh and day dream about being an artist or a sculptor when they grow up. The majority of adults actually follow different career paths but the true passions of a child often resonate with their adult selves in a different and more advanced form. Just look at the number of engineers in the world!

‘Yield strength’ or ‘yield point’ is, in many ways, basic knowledge for ‘grown up’ sculptors. It is the stress at which a material begins to deform and can no longer return to its original shape. This is vital information for design, as it represents an upper limit on the load that can be applied to a surface, and similarly it is an important consideration in materials production. How would we create new objects and materials without this knowledge?

Times have moved on since my Play-Doh days and the engineers of today no longer need to determine the yield strength of a material by stacking weights on top of materials one by one. Instead, this type of testing is nearly all done through simulations so that the extensive (and expensive) real world testing is only conducted on a select few prototypes. Think about the materials used to create space shuttles. Not only do they cost a small fortune but making a mistake and using a material with the wrong yield strength could actually impact human life. Simulation is crucial to avoid these types of issues.
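As a toy example of the kind of check a simulation performs at every element of a model (real solvers work with full stress tensors and far more sophisticated material models; the values below are rough textbook figures used only for illustration):

# Simple axial-stress check against yield strength: stress = force / area,
# compared to the material's yield strength with a safety factor.
# Material values are approximate textbook figures, for illustration only.

YIELD_STRENGTH_MPA = {
    "structural steel": 250.0,
    "6061-T6 aluminum": 276.0,
    "Ti-6Al-4V titanium": 880.0,
}

def axial_stress_mpa(force_newtons: float, area_mm2: float) -> float:
    return force_newtons / area_mm2  # N/mm^2 is the same unit as MPa

def stays_below_yield(material: str, force_newtons: float, area_mm2: float,
                      safety_factor: float = 1.5) -> bool:
    """True if the member stays safely below yield under the applied load."""
    return axial_stress_mpa(force_newtons, area_mm2) * safety_factor <= YIELD_STRENGTH_MPA[material]

if __name__ == "__main__":
    # 50 kN pulling on a 400 mm^2 steel rod: 125 MPa * 1.5 = 187.5 MPa <= 250 MPa, so it passes.
    print(stays_below_yield("structural steel", 50_000, 400))

A real virtual test repeats this kind of evaluation across millions of mesh elements and load cases, which is where the cluster comes in.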

The role of HPC? It’s all about the high performing cluster technology that engineers at the heart of material development need to vet prototypes without ever having to develop scale models. It’s a profit enabler, speeding product development and time to market for new materials and for the designers that make use of them.

HPC from A-Z (part 24) - X

X is for X-ray analysis

There’s no denying that science is significantly more advanced than ever before. In fact, I would argue that today’s doctors and scientists would be lost if asked to work in the conditions experienced by their counterparts in times gone by.

Modern technology has allowed medicine and associated scientific disciplines to go beyond the simple diagnosis of coughs and colds or a broken leg. Instead, researchers can now extract huge amounts of detail about atom arrangement and electron density using techniques such as X-ray crystallography modelling, and gather physical proof for their theories.

For example, a deeper understanding of biological molecules means bioscientific researchers can examine molecular dynamics, look further into protein analysis and identification, sequence alignment and annotation, and structure determination analysis. The results are fascinating and they are really driving the growth of knowledge in the scientific profession. What many don’t realize, however, is the sheer volume of data behind discoveries and how much information advanced research techniques actually generate.

Another example is medical imaging studies which use data from digitized X-ray images. In this case HPC is used to glean information from numerous medical databases. In fact, supercomputing facilities, along with industry, are exploring ways today that will ensure access to massive amounts of stored digital data for future research.

With HPC able to provide such insight into scientific happenings here on earth, just think what insights could come from Chandra about the wider Universe (and ET!) in years to come…

HPC from A-Z (part 23) - W

W is for weather mapping

From Argentina to Afghanistan, Lisbon to London, and Boston to Beijing, strip away all our cultural differences and you are left with one single unifying trait of humanity: we love to talk about the weather.

Weather by its very nature is unpredictable, and it is this uncertainty that has been a thorn in the side of humanity since the beginning. So, in a similar way that Ancient Greek sailors consulted an oracle for the likelihood of a smooth ocean voyage in 300BC, we now check the weather forecast on our phones before we leave for work in 2011.

Nowadays of course, our ability to predict the weather is significantly more accurate than it was in Ancient Greece, but it is still by no means an exact science.

For The University of Oklahoma, weather forecasting applications put one of the heaviest burdens on its computing resources. These HPC applications help crunch masses of data from satellites and radars, which is then amalgamated throughout the day and processed into a forecast so that hundreds of thousands of Americans know whether to pack an umbrella or sunscreen.

Of course, while the ability to accurately predict the weather can help keep us dry or plan what to do with our long holiday weekends, it also provides a much more important service in helping scientists to predict the occurrence of natural disasters. HPC is crucial for the successful research of severe storm patterns in Oklahoma, the surrounding region, and across the United States, and is literally helping save lives.

HPC from A-Z (part 22) - V

V is for… virtual prototyping

While virtual prototyping may sound like a complicated process, it’s actually just a form of product development that uses Computer Aided Design or Engineering to develop a virtual prototype before putting a physical object into production. In other words, it’s about building big expensive stuff like spaceships, race cars, and battleships.

Organizations like NASA aren’t just going to experiment with billions of dollars in the hope that the resulting rocket successfully launches. When you’re investing that much money, you want a guarantee that your product is 99.9% likely to work as it should and the people using it are going to come out in one piece! It’s in the design and test phase of building these products that HPC plays a crucial role.

In the last blog post we talked about developing fictional worlds for computer games. HPC helps to easily produce a realistic virtual environment for testing manufacturing designs in a very similar way. And no company better understands how important this is than engine manufacturer MTU Aero Engines. They use Platform LSF to prioritize jobs so that time-sensitive tests can be completed quickly and to make use of the maximum possible computing power to create a realistic test environment. In fact, the technology is so popular that they keep coming back for more, and the organization has had to create a special queue that allows users to submit low-priority, short-duration jobs to test for immediate problems before submitting for thorough testing.

HPC from A-Z (part 21) - U

U is for universe modeling

As we near the end of our HPC ABC we’ve seen a wide range of HPC examples. It literally does not get bigger than this one…

Our H blog post explained how CERN’s Large Hadron Collider is using HPC to crunch data and determine the origins of the universe. What many don’t know is it can also be used to see how our universe might grow. According to Scientific American, the diameter of the observable universe is at least 93 billion light years, or 8.80×10²⁶ meters [1]. So, the big question is how fast does it expand? HPC can help calculate the rate at which the universe is expanding.
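As a back-of-the-envelope illustration of the physics such calculations rest on (the real codes are enormous cosmological simulations, not one-liners), Hubble's law v = H0 × d relates a galaxy's recession velocity to its distance; the constant used below is an approximate, commonly quoted value.

# Hubble's law: recession velocity v = H0 * d.
# H0 ~ 70 km/s per megaparsec is an approximate, commonly quoted value.
H0_KM_S_PER_MPC = 70.0

def recession_velocity_km_s(distance_mpc: float) -> float:
    return H0_KM_S_PER_MPC * distance_mpc

if __name__ == "__main__":
    for d in (10, 100, 1000):  # distances in megaparsecs
        print(f"{d:>5} Mpc -> {recession_velocity_km_s(d):>8.0f} km/s")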

Not only can HPC help model our universe, it can also help create fictional universes. Parallel processing can help render imagined virtual worlds. Many of today’s games are incredibly immersive and realistic, and HPC can play an important part in powering that level of reality by reducing the amount of time it takes to create complex graphics.

HPC from A-Z (part 20) - T

T is for transactional processing, time travel and tea

When it comes to “T” I can think of a few great examples of HPC use, or potential use, ranging from the very practical, to science fiction, and then back down to earth.

The very practical example is transactional processing. It can be a complicated process, but HPC can really simplify it. Consider, for example, the computing processes involved when moving a pot of money from one account to another. HPC can be used to automate this activity by processing transactions in parallel, speeding up the time it takes for money to travel from one account to another (a toy sketch follows below).
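As a toy sketch of that idea (a real payments system adds atomicity, ordering and durable storage guarantees that this example deliberately ignores), independent transfers in a batch can be validated and applied by a pool of parallel workers:

# Process a batch of independent account transfers in parallel worker processes.
# Toy example only: no atomicity, ordering or durable storage is handled here.
from concurrent.futures import ProcessPoolExecutor

def apply_transfer(transfer):
    """Validate and 'apply' a single transfer; returns a result record."""
    src, dst, amount = transfer
    status = "applied" if amount > 0 else "rejected"
    return (src, dst, amount, status)

if __name__ == "__main__":
    transfers = [("acct-001", "acct-042", 250.00),
                 ("acct-007", "acct-013", 75.50),
                 ("acct-042", "acct-007", -10.00)]
    with ProcessPoolExecutor() as pool:  # workers run on separate CPU cores
        for result in pool.map(apply_transfer, transfers):
            print(result)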

If HPC can help speed up machine processes, perhaps one day it can help us travel beyond the speed of light, possibly into the future or back into the past. HPC could model the perfect time travelling machine, which is optimized to travel at fast speeds across the cosmos.

Closer to home, HPC could also help create the perfect cup of tea. Calculations could be performed to discover the perfect tea leaf-to-water-and-milk ratio. Bad cups of tea could become a thing of the past!

HPC from A-Z (part 19) - S

S is for Space exploration!

Are we alone in the cosmos? What is dark matter? What is the universe expanding into? The nature of the cosmos has fascinated us earthlings since we first looked up at the sky and began to wonder. We’ve come a long way since then. We've ditched the loin cloths, created the telescope and even set foot on the moon – but there are many questions which remain unanswered and HPC can be used to help to answer them.

In 1999, a project called SETI@home was set up to search for signs of intelligent life in the universe using volunteers’ idle computers. Back then, SETI borrowed cycles from these computers across the Internet, using their compute resources to analyse intergalactic data. Now while SETI@home didn’t quite manage to find E.T, it’s just one great example of how HPC can benefit space exploration. Put a little more thought into it though, and HPC could be used for a whole lot more; everything from crunching telescope data, analysing rocket and shuttle stability, to scrutinising far away galaxies which could host potential alien life forms.

Whatever the next great step for mankind is, you can bet that HPC will be somehow involved.

The Experience of Building a Scalable Supercomputer

This week, the TOP500 ranked the system at Taiwan’s National Center for High Performance Computing as #42 in their June 2011 bi-annual supercomputer list. This system was provided by Acer together with its technology partners, AMD, DataDirect Networks, QLogic, and Platform Computing. Platform Computing provided the management software and MPI libraries for the system, as well as services for deploying these software components.


During the period of system installation and configuration, a number of areas demonstrated the advantages of partnering with Platform Computing:


(1) Management software: Platform HPC was chosen to manage the system. The scalability and maturity of the software components simplified the installation and the configuration of the management software layer. Both the workload scheduler (based on Platform LSF) and MPI library (Platform MPI) on the system scale effortlessly.


(2) MPI expertise: To achieve maximum Linpack performance results, it is critical to ensure MPI performance is optimized. During the installation and configuration stage, the Platform MPI development team provided numerous best practices to help maximize the benchmarking results, from checking cluster health to MPI performance tuning. They collaborated closely with developers from QLogic, who provided the InfiniBand interconnects.
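One simple health check of that kind is a two-rank ping-pong that measures point-to-point latency before any Linpack run. The sketch below uses the mpi4py binding purely for illustration; it is not the exact procedure used on this system.

# Two-rank MPI ping-pong: a standard sanity check of interconnect latency.
# Run with exactly two ranks, e.g. "mpirun -n 2 python pingpong.py".
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
ITERS = 1000
msg = bytearray(8)  # tiny message, so the timing reflects latency rather than bandwidth

comm.Barrier()
start = MPI.Wtime()
for _ in range(ITERS):
    if rank == 0:
        comm.Send(msg, dest=1)
        comm.Recv(msg, source=1)
    elif rank == 1:
        comm.Recv(msg, source=0)
        comm.Send(msg, dest=0)
elapsed = MPI.Wtime() - start

if rank == 0:
    # Each iteration is a full round trip; report the one-way latency in microseconds.
    print("one-way latency: %.2f us" % (elapsed / ITERS / 2 * 1e6))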


(3) Dynamic zoning: The system will be used by multiple research user groups, with a separate workload management instance for each group. Based on the workload of each user group, the size of each workload management zone will change from time to time. Each zone has its own user account management system and scheduling policies. Platform HPC was set up to easily manage such dynamic configuration changes.


The maturity of Platform HPC, as well as the expertise of Platform Computing’s development and services teams, played a key role in ensuring the success of this Acer project. The maximized performance and stability of the benchmarking runs enabled the results to be submitted in time for the June TOP500 list. But most importantly, when the system is in the hands of hundreds of users in production, the robustness of the workload management, the performance of MPI, and the support from the experts who built the software will make a difference in the quality of service delivered by this top Taiwanese supercomputer.

HPC from A-Z (part 18) - R

R is for Reservoir modelling

It might just look like thin brown treacle to you and me, but crude oil is a big money business.

Millions of years’ worth of pressure under the earth’s surface has turned the tiny plants and animals of prehistoric Earth into the modern world’s most valuable resource – powering vehicles, industries and economies across the globe. As such, the financial rewards for finding and trading in oil are substantial. However, when you’re using millions of pounds worth of equipment, including a 30ft drill to bore holes into the planet’s crust, then equally, so are the risks. Choosing the wrong spot to drill can be an expensive mistake.

StatoilHydro ASA, a Norway-based oil and gas company, is one of the world’s largest crude oil traders. It relies on sophisticated 3D simulation programmes to search for natural oil-wells in the Earth’s crust -- if you want to strike it rich you need to be drilling in the right place. I don’t need to tell you that this process involves vast amounts of data, large numbers of complex calculations and requires thousands of iterations to produce accurate results. To put it simply, it’s a very, very big job.

To ensure StatoilHydro had the required resources to power such colossal calculations, it installed an HPC environment, which is now invaluable to its reservoir engineers worldwide. It allows its users to run significantly more simulations, which in turn means much greater accuracy when drilling.

And accuracy is important when only a small error in location can cost many millions of dollars. This isn’t ‘pin the tail on the donkey’ – it’s an exact science.