Thursday, April 9, 2015

HPC is dying?

Jonathan Dursi has a new post out that is causing a storm: "HPC is dying, and MPI is killing it". Jonathan makes a lot of good points, many of which I agree with, but I think many of those commenting on it have fallen into the common trap of treating it as a technical problem.

It's a social problem. It's a Knowledge Problem.

Find me a faculty member or a grad student who knows about Spark or Chapel (RCE 80). Heck, find me a faculty member or grad student who even uses BLAS or LAPACK. Those two libraries have been around for decades, yet an understanding of their benefits and availability is rare among faculty and students.

I am fully on board with Jonathan and, to put words in his mouth, HPC needs to be a big tent: for research we need to be open to all technologies that demonstrate value, and not cling to a single solution.

So, getting back to the Knowledge Problem: where is the information? MapReduce and its successors Spark and Flink come from the data-intensive, internet-scale application world. Honestly, they come from business and are coming back to academia, where most of us MPI folk are. These are just two different communities solving their own problems. Getting them to talk when they have no common goals other than scale and performance is like mixing oil and water.

There is also a generational gap. I spent some time evaluating running Spark (really YARN containers) and HPC/MPI codes next to each other without any hacks, and I got pushback from both communities. Each saw the other as a plaything: a novelty that could be useful, but not where effort is being invested.

As for Chapel and PGAS, most of this is also a problem of getting information out. People don't know these languages exist. Chapel has the added problem that funding for the base effort was limited, and the rest was left to a community effort in a community that didn't see a driving need for it. Even in a world where simpler methods would be useful, adoption will stay low and never hit critical mass.

An example of a simpler method would be MPI bindings for Python or another easy-to-bootstrap language. We don't see as much new code being written here as one would expect, either. Why deal with stdlib when you could have all the simplicity of Python, a mature, stable, easy language, and still use the MPI we are all so desperately clinging to?

It will take a generation, and new domains entering the space of stodgy FORTRAN and C programmers. We see this in genomics, where many codes are written in Java, Perl, or Python, languages that a 'respectable' HPC programmer would never touch.

This is how new things will happen.  The old guard that made the last innovation will on average not bring you the next innovation.  The ice company didn't bring us the refrigerator.
