Researchers who have traditionally used HPC clusters have been asking how they can use Amazon Web Services (AWS), aka the cloud, to run their workloads. AWS gives you great hardware infrastructure, but it is really just renting you bare machines by the hour.
I should stress that using AWS is not magic. If you are new to cloud computing, there is a lot you need to know to avoid extra costs or the risk of losing data. Before you start, contact ARC at firstname.lastname@example.org.
Admins of HPC clusters know that it takes a lot more than metal to make a useful HPC service, which is what researchers really want. Researchers don't want to spend time installing and configuring queueing systems, exporting shared storage, and building AMIs.
Luckily for the community, the nice folks at MIT created StarCluster. StarCluster is really a set of prebuilt AMIs plus Python code that uses the AWS API to create HPC clusters on the fly. The AMIs also include many common packages such as MPI libraries, compilers, and Python packages.
There is a great Quick-Start guide from the StarCluster team. Users can follow it as written, but HPC users at the University of Michigan can use the ARC cluster Flux, which has StarCluster installed as an application. Users only need a user account to access the login node, from which they can create clusters on AWS.
$ module load starcluster/0.95.5
Following the rest of the Quick-Start guide will get your first cluster up and running.
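As a rough sketch of what that first session might look like, the function below strings together the commands the upstream Quick-Start walks through. The names `mykey` and `mycluster` are placeholders, `run` and `first_cluster` are hypothetical helpers written for this example, and it assumes your StarCluster config already holds your AWS credentials and a matching key entry. By default it only prints each command so you can review the sequence before running anything that costs money.

```shell
#!/bin/sh
# Sketch of a first session, loosely following the StarCluster Quick-Start.
# "mykey" and "mycluster" are placeholder names, not values from this post.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

first_cluster() {
    # Make the starcluster command available on the Flux login node.
    run module load starcluster/0.95.5
    # Create an EC2 keypair and save the private key locally.
    run starcluster createkey mykey -o "$HOME/.ssh/mykey.rsa"
    # Launch a cluster from the default template in your config.
    run starcluster start mycluster
    # Log in to the master node.
    run starcluster sshmaster mycluster
    # Shut everything down when finished so billing stops.
    run starcluster terminate mycluster
}
```

Calling `first_cluster` prints the five commands in order. Setting `DRY_RUN=0` would execute them instead, though in practice you would probably run them one at a time and edit the config between the `createkey` and `start` steps.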
Common StarCluster Tasks
Switch Instance Type
Make a shared Disk on EBS
Add/Remove Nodes to Cluster
$ starcluster addnode -n <number> <clustername>
$ starcluster removenode -n 3 smallcluster
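If you resize clusters often, the two commands above can be wrapped in a small helper. This is only a sketch: `resize_cluster` is a hypothetical function written for this example, not part of StarCluster, and it defaults to a dry run that prints the command instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper: grow or shrink a cluster by a signed node count.
# With DRY_RUN=1 (the default) the command is printed instead of executed.
resize_cluster() {
    cluster=$1
    delta=$2
    # A leading + means add nodes, a leading - means remove nodes.
    case $delta in
        +*) sub=addnode;    count=${delta#+} ;;
        -*) sub=removenode; count=${delta#-} ;;
        *)  sub=; count= ;;
    esac
    # Reject a missing cluster name or a count that is not a plain integer.
    case $count in
        ''|*[!0-9]*) cluster= ;;
    esac
    if [ -z "$cluster" ]; then
        echo "usage: resize_cluster <clustername> <+N|-N>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "starcluster $sub -n $count $cluster"
    else
        starcluster "$sub" -n "$count" "$cluster"
    fi
}
```

For example, `resize_cluster smallcluster -3` prints the same `removenode` command shown above, and `DRY_RUN=0 resize_cluster smallcluster +2` would actually add two nodes.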