Saturday, November 7, 2015

SSH Directly to XSEDE Resources with GSISSH

Many people know about the XSEDE Single Sign On Login Hub. Fewer know that you can build your own version of it on your local systems. To create the sign-on hub, XSEDE uses the Globus Toolkit.

The steps include:

  • Build Globus Toolkit with GSI enabled
  • Download XSEDE Certificates
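Once the toolkit and certificates are in place, the client-side flow looks roughly like the sketch below. Treat it as a minimal outline rather than a recipe: the build steps, the certificate bundle location, the MyProxy hostname, and the placeholder resource login host are my assumptions and should be checked against the current XSEDE and Globus documentation.

    # 1. Build and install the Globus Toolkit with GSI-OpenSSH.
    #    Exact configure flags and make targets vary by toolkit version;
    #    follow the installer's README.
    ./configure --prefix=/usr/local/globus
    make && make install

    # 2. Install the XSEDE CA certificates where GSI looks for them.
    mkdir -p /etc/grid-security/certificates
    # ...unpack the XSEDE certificate bundle into that directory...

    # 3. Fetch a short-lived proxy credential from the XSEDE MyProxy server,
    #    then gsissh to a resource without typing a password again.
    export PATH=/usr/local/globus/bin:$PATH
    myproxy-logon -s myproxy.xsede.org -l your_xsede_username -t 12
    gsissh login.resource.xsede.org   # replace with the login host from the resource's user guide

With a valid proxy in hand, gsiscp and gsisftp work the same way as their OpenSSH counterparts.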

Big Data and Data Job Openings

Advanced Research Computing - Technology Services (ARC-TS) at the University of Michigan has four new job openings as part of our Data Science Initiative and in support of our ongoing efforts in High Performance Computing. The positions range from entry level to senior.

ARC-TS builds and operates research computing platforms. These platforms will include High Performance Computing (HPC) Linux clusters, High Throughput Computing (HTCondor), data-intensive (Hadoop, SQL, and NoSQL) systems, and containerized/virtualized systems (OpenStack, Docker).
Big Data System Administrator Senior

This position will act as a senior technical resource and be the primary position responsible for creating, operating, and expanding our Hadoop and Spark infrastructure.

Research Database Administrator Senior

This position will act as a senior technical resource responsible for creating and operating our research database infrastructure: designing, building, operating, and supporting database platforms. These platforms will contain SQL, NoSQL, and columnar data stores.

Research Cloud Administrator Intermediate

This position will act as a technical resource as part of a team that will create and operate our private cloud infrastructure, and will be responsible for designing, building, operating, and supporting a research private cloud. The private cloud will host administrative systems, databases, and other services.

Friday, September 11, 2015

New Job Openings

Over at ARC-TS we have two new job openings:

Advanced Research Computing - Technology Services, the HPC, big data, and all-around research computing group, is expanding, and we have two new job postings. While these postings refer to recent awards, the positions are backed by firm money.

HPC Storage Administrator Senior

This position will be primarily responsible for procurement, testing, development, implementation, and user integration of a Ceph-based storage system for the National Science Foundation's CC*DNI-funded OSiRIS project. Open Storage Research InfraStructure, or OSiRIS, will provide computable storage to a geographically distributed set of science users via virtualization technologies including Red Hat Enterprise Virtualization (RHEV), software-defined networking, and Shibboleth.

HPC System Administrator Associate

As a member of a high-performing team, the selected candidate will be responsible for user support, systems analysis, implementation, and troubleshooting of moderate to complex technical issues and projects.

Wednesday, July 29, 2015

Should all large allocations come with ECSS support?

A quick thought; please forgive its underdeveloped nature.

I'm sitting in the XSEDE15 Champions Fellows panel, and I'm noticing a trend: on each project they worked on, they were able to get huge speedups in the codes.

Given the size of some proposals, and the dollar value that translates into, if this behavior holds (huge speedups from ECSS efforts) for some class of requests*, should ECSS review be required? It might cost less overall, and any changes to those codes would benefit anyone else using them on other systems.

*I'm thinking that large community codes that have already been heavily optimized probably won't see this benefit; I have in mind large requests for privately developed codes with no prior ECSS relationship.

Just thoughts, but at 1 million CPU hours, labor starts looking cheap.

Friday, June 19, 2015

RCE-Cast hits 100

Jeff and I would like to thank all our listeners who have stuck with us going all the way back to 2009!

We released our 100th episode today, with Eli Dart talking about Fasterdata; be sure to check it out.

If you are new to the show, it is a podcast we host for all things scientific computing and/or nerdy. Be sure to check out the back catalog. The best kind of support you can give us is leaving a rating in iTunes, referring us to your friends, or sending us requests for the show.

Here's to 100 more!

Brock Palen

Monday, June 1, 2015

GridFTP Log Analysis with Logstash and Kibana / Elastic Search

As noted in my post about Lustre Stats with Graphite and Logstash, we are huge fans of the ELK (Elasticsearch, Logstash, Kibana) stack. In that last example we didn't use the full ELK stack, but in this example we are going to use ELK for what it was meant for: log parsing and dashboarding.

We run a GridFTP server. GridFTP, for those who don't know, is a better-performing way to move data around. If you want to set up GridFTP, use the Globus Connect Server; it's much easier than setting up the certificate system yourself, and it is quickly becoming the standard auth and identity provider for national research systems.
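If you go the Globus Connect Server route, the install is short. The package and command names below are as I recall them from Globus Connect Server v4, so double-check against the current Globus installation guide:

    # After enabling the Globus package repository for your distro
    # (Debian/Ubuntu hosts use apt-get instead of yum):
    yum install globus-connect-server

    # Point the config at your Globus account and name the endpoint,
    # then let the setup tool register the endpoint and fetch certificates.
    vi /etc/globus-connect-server.conf
    globus-connect-server-setup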

GridFTP logs each transfer with your server. What I want to know is where, who, and how much data is going through the server. I have been running this setup for a while now, but it could use some refining. You can find my full Logstash config as of this writing in a Gist.

First the results:

Logstash has a number of filters that make this easier. We use the regular grok filter to match the transfer stats lines from the GridFTP log. You could modify this to capture the entire log in Elasticsearch for archival reasons. Then the kv (key=value) filter works wonders on the log's key=value entries, doing most of our work for us.
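To make that concrete, here is a trimmed-down sketch of the relevant filter section. The field names (DATE, NBYTES, DEST, and friends) are what the stock GridFTP transfer log uses as far as I know; adjust the pattern if your log format differs.

    filter {
      # Pull the timestamp off the front of a transfer record and keep the
      # rest of the line for the kv filter; non-matching lines get tagged
      # with _grokparsefailure and can be dropped or archived as you like.
      grok {
        match => { "message" => "^DATE=%{NOTSPACE:DATE} %{GREEDYDATA:xfer_kv}" }
      }

      # Explode the remaining key=value pairs (USER, FILE, NBYTES, DEST,
      # TYPE, CODE, ...) into individual Elasticsearch fields.
      kv {
        source      => "xfer_kv"
        field_split => " "
        value_split => "="
      }
    }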

I have to use a few grok filters to isolate the IP of the remote server, but once that's done Logstash has a built-in geoip filter that tags every transfer with geolocation information, which is what lets the maps be created. Oh, and in the dashboard those maps are interactive, so you can view transfers from just one country by clicking on that country, or by adding a direct filter for the country code, zip code, etc. Really handy.
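The IP isolation plus geoip step looks something like this; I'm assuming here that the remote host shows up in a DEST field wrapped in square brackets, which is how our transfer log records it:

    filter {
      # Strip the brackets off DEST (e.g. "[192.0.2.10]") into remote_ip.
      grok {
        match => { "DEST" => "\[%{IP:remote_ip}\]" }
      }

      # Tag the event with city/country/lat-lon so Kibana can draw the maps.
      geoip {
        source => "remote_ip"
      }
    }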

Individual transfers are also mapped to the campus they come from, if they originate from a University address. Our subnets across the three campuses are known and published, so we use the cidr filter to add a tag for each campus, which lets us look at traffic from a specific campus. Again, really handy, and I would love contributions that also tag what traffic comes over Internet2 / MiLR versus the commodity internet.
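One cidr block per campus does the tagging. The networks below are placeholders (RFC 5737 documentation ranges), not our real published subnets, and the tag names are just examples:

    filter {
      cidr {
        address => [ "%{remote_ip}" ]
        network => [ "192.0.2.0/24" ]
        add_tag => [ "campus_ann_arbor" ]
      }
      cidr {
        address => [ "%{remote_ip}" ]
        network => [ "198.51.100.0/24" ]
        add_tag => [ "campus_dearborn" ]
      }
      cidr {
        address => [ "%{remote_ip}" ]
        network => [ "203.0.113.0/24" ]
        add_tag => [ "campus_flint" ]
      }
    }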

A few warnings: the bandwidth calculation is commented out for a reason. It works, but not all GridFTP log entries are complete enough to do the calculation, which makes Ruby angry and makes Logstash hang.
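If you do want the calculation, one way to keep Logstash from hanging is to guard the ruby filter with a conditional so it only runs when every needed field is present. This is an untested sketch against the Logstash 1.x/2.x event API (event['field']), and it ignores the fractional seconds in the GridFTP timestamps:

    filter {
      if [DATE] and [START] and [NBYTES] {
        ruby {
          init => "require 'time'"
          code => "
            # GridFTP stamps look like 20150601123456.123456; drop the
            # fractional part and compute a whole-second duration.
            fmt  = '%Y%m%d%H%M%S'
            t0   = Time.strptime(event['START'].split('.').first, fmt)
            t1   = Time.strptime(event['DATE'].split('.').first, fmt)
            secs = t1 - t0
            event['mbytes_per_sec'] = event['NBYTES'].to_f / (1024 * 1024) / secs if secs > 0
          "
        }
      }
    }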

So it was very easy to use Logstash to understand the GridFTP log files, and then the rest of the ELK stack let us quickly build dashboards for our file transfers.

I was inspired to write this after a presentation at XSEDE14 where the classic system of scripts plus copied log files was employed, and I kept thinking there must be an easier way to handle GridFTP logs. The solution here is near real-time, and we have found it to be very durable.