Kernel recompilation & HPC application performance

Published: 08th February 2011
Some questions never die, and whether to recompile the kernel to improve an application's performance is one of them. I have heard it from users in many domains (CFD, seismic, financial, oil & gas, academic, bio-molecular modeling, and so on). It always starts the same way.

"I think I should recompile the kernel of my cluster so I can have better performance. What do you think?"

And my answer is always "No". It does sound logical: you compile your code with the best possible optimizations and you get better performance (in most cases, I should add). Why would the same not apply to the kernel? After all, the kernel is what manages my processes and runs my system. It’s easy to start the debate this way, but doing so misses a key aspect: how much of the application's run time the kernel actually accounts for.

Here are a few key questions to ask before you start on this (almost always) fruitless exercise:

How much time does your (scientific) code actually spend in the kernel while it is running?

How much of that time is spent doing something useful, rather than waiting on something else (our good old friends, disk I/O and interrupt handling)?


With newer interconnects like InfiniBand, which use user-level drivers and kernel bypass to drastically reduce latency (barring the initial setup time), how much performance improvement can you really expect from recompiling your kernel?
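The first two questions can be answered with a quick measurement before anyone touches the kernel. The sketch below is a minimal illustration in Python, assuming a hypothetical compute-bound workload (`sample_kernel`) standing in for your real code; it uses the standard `os.times()` call to split CPU time into user time and system (kernel) time. For an existing binary, the `time` command reports the same user/sys split.

```python
import os


def cpu_split(workload):
    """Run workload() and return (user_seconds, system_seconds, kernel_share).

    kernel_share is the fraction of CPU time spent in kernel mode;
    it is 0.0 if the run was too short for the clock to register.
    """
    before = os.times()
    workload()
    after = os.times()
    user = after.user - before.user
    system = after.system - before.system
    total = user + system
    kernel_share = system / total if total > 0 else 0.0
    return user, system, kernel_share


def sample_kernel():
    """Hypothetical stand-in for a compute-bound scientific loop."""
    acc = 0.0
    for i in range(1, 2_000_000):
        acc += 1.0 / (i * i)
    return acc


if __name__ == "__main__":
    user, system, share = cpu_split(sample_kernel)
    print(f"user={user:.3f}s system={system:.3f}s kernel share={share:.1%}")
```

If the kernel share turns out to be only a few percent, even a large speedup of the kernel itself can only recover a small part of the total run time.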

Kernel recompilation can also bring cluster management headaches:

Deploying the new kernel to every node in the cluster

Recompiling the kernel every time a new security or performance-related patch is released

Recompiling your hardware drivers to match the new kernel

Stability and performance issues in drivers caused by your choice of compiler optimizations

Not knowing which areas of the kernel code are adversely affected by your choice of optimizations

And not to forget, some ISVs support their code only on certain kernels. Once you start running your ISV code on a different kernel, goodbye vendor support!

A more practical approach is to look into the application itself and optimize its code, whether through good old hand tuning, performance libraries, or straightforward compiler optimizations. A word of caution: if you are dealing with floating-point arithmetic, in single or double precision, tread carefully with the more aggressive compiler optimizations. Several compilers relax their floating-point accuracy guarantees at higher optimization levels, so results can differ from those of an unoptimized build.
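One reason aggressive optimizations can change numerical results is that floating-point addition is not associative, so a compiler that reorders a sum for speed may round differently. The short Python sketch below only illustrates that underlying effect with two groupings of the same three numbers; it does not model any particular compiler or flag.

```python
def sum_left_to_right(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def sum_regrouped(a, b, c):
    """Evaluate a + (b + c), as a reordering optimization might."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 0.1, 0.2, 0.3
    original = sum_left_to_right(a, b, c)
    regrouped = sum_regrouped(a, b, c)
    print(repr(original), repr(regrouped))
    print("identical results:", original == regrouped)
```

The two results differ only in the last bit here, but in long reductions or ill-conditioned problems such differences can accumulate, which is why it is worth comparing results against a conservatively compiled reference build.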


Simple techniques like data decomposition, functional decomposition, overlapping computation with communication, and pipelining improve the efficiency of your available compute resources. They will yield a better return on investment, especially as we move into an increasingly many-core environment.
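As a rough illustration of two of these techniques, the sketch below splits the data into chunks (data decomposition) and fetches the next chunk in a background thread while the current one is being processed (overlapping communication with computation). Everything here is a hypothetical stand-in: `fetch_chunk` simulates a network or disk transfer with a short sleep, and `compute` is a toy kernel. A real HPC code would more likely use non-blocking MPI calls for the same pattern.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(index, size=4):
    """Hypothetical stand-in for communication or I/O: return chunk `index`."""
    time.sleep(0.01)  # simulated transfer latency
    return list(range(index * size, (index + 1) * size))


def compute(chunk):
    """Toy compute kernel: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def pipelined_sum(n_chunks, size=4):
    """Process chunks in order while prefetching the next one."""
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_chunk, 0, size)
        for i in range(n_chunks):
            chunk = pending.result()  # wait for the chunk in flight
            if i + 1 < n_chunks:
                # Start the next transfer before computing on this chunk.
                pending = pool.submit(fetch_chunk, i + 1, size)
            total += compute(chunk)
    return total


if __name__ == "__main__":
    print(pipelined_sum(3))
```

The result matches what a purely sequential loop would produce; the only change is that transfer latency is hidden behind computation whenever both take comparable time.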

There is a paper on how profile-based optimization of the kernel yielded a significant performance improvement. More on that here

And results from a recent article on Gentoo essentially show that, for most applications and use cases, it does not make much sense to compile and build your own kernel.




This article is free for republishing
Source: http://alfredwinston.articlealley.com/kernel-recompilation--hpc-application-performance-2016719.html

