Tuesday, February 10, 2015

Hardware is not a service

In my last post, "Will scrutiny of HPC resource allocation rise over time?", I tried to equate dollar values between different common resource providers.  This is a common task that researchers must face, and this post is for them:

Hardware is not a service.

Amazon will give you hardware as a service, but not an HPC or data service.  The cloud has made it much easier to put together something that resembles an HPC service, but let's list the features an HPC service has:
  • Compute Nodes
  • Ample cooling and power to operate equipment at scale
  • Ample Network (maybe InfiniBand or other RDMA/MPI network)
  • Ample WAN for data ingest and egress
  • Queue Software and management
  • Application Software (MPI, Scientific Libraries)
  • Support Staff to fix/install all the above software
  • Storage, maybe MPI-IO capable like Lustre/GPFS/PVFS/OrangeFS
  • Maybe Auditing for HIPAA, IPHI, EAR conformance.
If you, as a researcher, compare the cost of an HPC service like XSEDE, Flux or Penguin to a cloud provider's instance cost or node cost, you will have bills later that you didn't plan on.  In fact, you are comparing only the first bullet and maybe the second and third.  Let's take AWS as an example.  In my last post I stated that a c4.8xlarge reserved for 3 years costs $3,820.  What would it cost to make that into a service?

$3,820 3 year node cost
$2,370 3 year 1 TB EBS Volume
$685 3 years 50GB/month data transfer out of AWS
$70,000 3 years of 1/3rd time Linux Admin

Total Hardware: $6,875 for 3 years
Total w/Labor: $76,875
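
If you want to redo this arithmetic with your own numbers, here is a minimal Python sketch. The dollar figures are the ones listed above; the dictionary and variable names are just labels I made up for illustration, so swap in whatever line items apply to you.

```python
# Three-year cost line items in USD, copied from the list above.
# The names below are illustrative labels, not anything from AWS.
HARDWARE_ITEMS = {
    "c4.8xlarge node, reserved 3 years": 3820,
    "1 TB EBS volume, 3 years": 2370,
    "50 GB/month data transfer out, 3 years": 685,
}
LABOR_ITEMS = {
    "1/3-time Linux admin, 3 years": 70000,
}


def total(items):
    """Sum the dollar values of a dict of line items."""
    return sum(items.values())


node_only = HARDWARE_ITEMS["c4.8xlarge node, reserved 3 years"]
hardware_total = total(HARDWARE_ITEMS)
service_total = hardware_total + total(LABOR_ITEMS)

print(f"Node only:      ${node_only:,}")
print(f"Total hardware: ${hardware_total:,}")
print(f"Total w/labor:  ${service_total:,}")
# Ratio of the full service cost to the bare node price.
print(f"Service vs. node price: {service_total / node_only:.1f}x")
```

With the numbers from this post, the full service comes out to roughly 20 times the bare node price, which is the gap you miss when you compare only instance costs.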

I'm going to ignore the labor costs. Many researchers will see that cost buried in their institution's budget; just be aware of it and what it really costs.  Worst case, a student is managing everything, distracted from their research, and eventually graduates.  Notice I left all software out.  There are open-source versions of all the software mentioned; in fact, many times they are the industry standard and should be used. That doesn't change the fact that a good HPC admin and support staff will know the tweaks and which packages are out there.

In conclusion, be careful when comparing solutions; some are far from equal. This comparison does not take into consideration performance differences between virtualized cloud providers and bare-metal offerings.  It also does not consider the impact of network performance on parallel codes, as most providers offer only Ethernet-based networking.
