Continuing my tirade against HPC condos, I will now propose what I think the solution is: HPC as a Service, or HAAS.
HAAS provides a number of advantages over condos, the main one being flexibility in construction and utilization.
For HPC operators, leasing cores by the hour, month, etc. leaves ownership of the hardware with the administering organization. Because users don't own the hardware, operators can swap out the underlying hardware as needs demand. An example would be replacing older hardware with newer, more powerful hardware to free data center space for expansion, to meet cooling needs, etc.
With a condo this meant long, drawn-out buyouts and negotiations, with the overhead of negotiating with every hardware owner. With HAAS the service is maintained and can be moved from older equipment to new equipment without involving users, as long as the service sold before and after the hardware swap is the same or improved.
For funding agencies, the HAAS model provides better utilization of hardware. The cost passed to the user, and thus to the funding source, should be less than in the condo because more of the capital depreciation is recovered. This comes from oversubscription of resources, so their average utilization as a unit is higher than in the condo. Remember that in the condo, groups own gear, and if a group is not currently using its gear, no other group can. In the HAAS model, oversubscription can reach 50% or higher of the available hardware, driving capital utilization higher. Thus more research for less buck.
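The utilization argument above can be sketched with a little arithmetic. This is a minimal illustration only: the function name and every number in it (capital cost, core count, lifetime, utilization levels) are hypothetical placeholders, not figures from any real cluster.

```python
def cost_per_used_core_hour(capital_cost, cores, lifetime_hours, utilization):
    """Capital cost spread over the core-hours that actually get used.

    utilization is the fraction of available core-hours doing work (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return capital_cost / (cores * lifetime_hours * utilization)


# Hypothetical inputs, chosen only to show the shape of the comparison.
CAPITAL = 100_000.0          # dollars of hardware (assumed)
CORES = 1_000                # cores purchased (assumed)
LIFETIME = 4 * 365 * 24      # four years of wall-clock hours (assumed)

condo = cost_per_used_core_hour(CAPITAL, CORES, LIFETIME, utilization=0.4)
haas = cost_per_used_core_hour(CAPITAL, CORES, LIFETIME, utilization=0.8)

print(f"condo-style: ${condo:.4f} per used core-hour")
print(f"HAAS-style:  ${haas:.4f} per used core-hour")
```

Whatever the real figures are, the cost per used core-hour scales inversely with utilization, so doubling utilization on the same hardware halves the capital cost behind each hour of research.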
Users gain flexibility in utilization. Groups with small budgets can now use large chunks of HPC resources for short periods, opening HPC to an entirely new class of user. To illustrate with the Michigan Flux Project: a group with a budget of $1000 could not even buy one node in a condo, but can purchase 89 cores of compute for 1 month, or 1 core for 89 months. The options here are the biggest benefit to be realized by moving to a HAAS model.
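The trade-off between cores and months in that example can be sketched as a small calculation. The rate below is only the one implied by the example ($1000 for 89 core-months, roughly $11.24 per core-month), not a quoted price, and the function name is hypothetical.

```python
from fractions import Fraction

# Rate implied by the example above: $1000 buys 89 core-months.
# Derived for illustration; not an official price.
IMPLIED_RATE = Fraction(1000, 89)  # dollars per core-month


def allocation_options(budget, rate, durations_months):
    """For each duration, the number of whole cores the budget can sustain."""
    total_core_months = Fraction(budget) / rate
    return {months: int(total_core_months // months) for months in durations_months}


options = allocation_options(1000, IMPLIED_RATE, [1, 12, 89])
for months, cores in options.items():
    print(f"{cores} core(s) for {months} month(s)")
```

Exact fractions are used instead of floats so that the whole-core rounding is not thrown off by floating-point error.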
Existing groups with large funding and continuous needs also benefit from this flexibility, as they can now procure additional resources for short periods to augment their standard needs. This burst use, in emergencies or for other sporadic needs, went unsatisfied under pure condo models.
I personally think more HPC providers should move to a HAAS model. These models are already used in the commercial space by providers like Amazon, Penguin and IBM. They are also heavily used in the academic space where funding does not change hands, e.g. XSEDE. As soon as funding comes into play, the push is for condos because of funding requirements, and this is unfortunate.
Nice article, Brock. Seems like a condo-type model would be a huge investment over time and be underutilized compared to HAAS.
In admin and user time, I think the condo and HAAS models take the same amount of investment; it is the resulting output in productivity that is different. Picture the same resources organized to get more science out.