Friday, June 29, 2012

of the art, and velocity of research computing...

Hot on the news of Google Compute Engine (it looks epic!):

So, we run about 20,000 CPUs, but as with all things it is never quite enough! Literally, with GCE being less than one day old, this ticket came into our service desk system at 5:13 this morning...

I love this stuff!

From: Redacted @ rchelp
Date: Fri, Jun 29, 2012 at 5:13 AM
Subject: [#25078] Support Google Compute Engine as a fallback option?


Sometimes I have had problems waiting for the long queues. After yesterday's announcement, I was wondering if you can add support for (or simply teach users how to do) submitting jobs the new Google Compute Engine.



What an epic request eh?

We do so love our jobs, and this one ticket request is clearly a classic. It absolutely epitomizes Research Computing being so close to the bleeding edge! We are also constantly reminded that our scientists, whom we are honored to work with, are quite simply the best in the world. Oh, and by the way - they quite rightly demand the highest of standards!

It is all pretty awesome!

I love this ticket so much - this is basically a reflection of what we do!

p.s. As you can see below I'm on it!

The very moment I get an account on GCE that is... ;-)

Wednesday, June 27, 2012

participating: postdoc networking night

Sponsor: FAS Office of Postdoctoral Affairs

This will be an informal event for postdocs to learn more about different career paths in the sciences. We will open the event with some refreshments and casual networking for about 15 minutes. Once everyone has arrived, we will ask each of the speakers to give a 2- to 3-minute introduction including: 1) what their organization does, 2) their role in the organization, and 3) their personal career path. We will then invite the speakers to sit at assigned groups with the postdocs. From that point, postdocs will have 20-30 minutes to gather information and ask questions before moving to another group. There will be 3 rotations. The goal of this event is to expose postdocs to a variety of potential career paths, and we are grateful to you for your participation.

I'm totally looking forward to taking part in this tonight!

Monday, June 25, 2012

does science ever sleep? "on the psychology of cluster computing..."

It's been a while since I posted anything, and this one is hopefully both fun and informative for folks who either run, or run on, HPC cluster systems.

Ok here we go...

So we run ca. 20,000 processors in our cluster. It is all time-shared and allows over 5,000 folks to submit their work in batches, at any time of the day or night. There are times when it all gets rather hectic, I can tell you!

Recently we have been working on a C/MySQL interface to our flat-file batch logs, going back over the last three or so years, to look for trends and spot which groups are running which codes and when they are running them.

There are about 82 million rows in the database right now, so there are some pretty interesting SQL queries on the go, but I digress. I'll talk more about the interface and database in a follow-up post - it is Michele's project, and you can read about it here on her GitHub:

The C code is also pretty darn cool and groovy, but I'm biased ;-)

Anyway, it turns out it's pretty fun to look at the global stats. First up, let us look at how all our jobs have been submitted over each month. This one is pretty straightforward for trending capacity:

Ignore the negative job number; this stuff is still a work in progress and it will get better with time. This is all 10,000ft stuff right now ;-) Ok, good, so now let's look at all our jobs based on the day of the week (I was looking for good days to do downtime and updates):

Interesting... And finally, a look at jobs by hour of the day. Again, we could use this to spot good opportunities to carry out some deferred cluster maintenance:

Science NEVER Sleeps!

Is a little bit of a misnomer, don't ya think? :-) Mind you, when you consider our population of customers (grad student heavy), and also correlate this with how many help tickets we get at around 6pm on any given evening because of a failed submission script, and then the number of requests we see on Saturday morning from folks upset about code failing...

It all makes sense.
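If you want to play along at home, the three views above (by month, by day, by hour) boil down to bucketing submission timestamps. Here is a minimal Python sketch of that idea. To be clear, this is not our C/MySQL interface: it assumes you already have submission times as ISO-8601 strings, and the function name `bucket_submissions` and the sample data are made up purely for illustration.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def bucket_submissions(timestamps):
    """Count job submissions per month, per weekday, and per hour of day."""
    by_month, by_weekday, by_hour = Counter(), Counter(), Counter()
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        by_month[t.strftime("%Y-%m")] += 1      # e.g. "2012-06"
        by_weekday[WEEKDAYS[t.weekday()]] += 1  # Monday is weekday() == 0
        by_hour[t.hour] += 1                    # 0..23
    return by_month, by_weekday, by_hour

# Hypothetical sample data, not taken from our logs.
sample = [
    "2012-06-21T17:05:00",  # a Thursday
    "2012-06-22T17:40:00",  # a Friday
    "2012-06-22T04:45:00",  # a Friday
    "2012-06-23T09:10:00",  # a Saturday
]
by_month, by_weekday, by_hour = bucket_submissions(sample)
```

At 82 million rows you would want the database to do the grouping rather than pulling everything into Python, but the shape of the calculation is the same.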

Most jobs are fired off at about 5pm on a Thursday or Friday night (nearly 38 million jobs!). By comparison there is very little going on on a Saturday night (only 13 million jobs), but then there is a clear uptick in rate from Sunday night into Monday morning.
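For the maintenance-window angle, one rough way to turn this into a shortlist is to rank every (weekday, hour) slot by how many jobs were submitted in it. The sketch below assumes submission counts are a fair proxy for how busy the cluster is, which is a simplification since long-running jobs outlive their submission hour. The name `quietest_slots` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def quietest_slots(timestamps, n=3):
    """Return the n (weekday, hour, count) slots with the fewest submissions."""
    counts = Counter()
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        counts[(t.weekday(), t.hour)] += 1
    # Consider all 168 weekly slots, so empty slots show up with a count of 0.
    all_slots = [(day, hour) for day in range(7) for hour in range(24)]
    # Sort by count first, then by slot, so ties resolve in a stable order.
    ranked = sorted(all_slots, key=lambda slot: (counts[slot], slot))
    return [(WEEKDAYS[day], hour, counts[(day, hour)])
            for day, hour in ranked[:n]]

# Hypothetical sample data, not taken from our logs.
sample = [
    "2012-06-21T17:05:00",  # a Thursday
    "2012-06-22T17:40:00",  # a Friday
    "2012-06-23T09:10:00",  # a Saturday
]
candidates = quietest_slots(sample, n=3)
```

With real data you would expect many slots to be close, so treat the ranking as a starting point for a conversation rather than an answer.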

I looked at my own jobs, and sure enough, other than some crazy 4:30-5:00am submission I halfway remember, I follow the trend fairly well. You can get a hint of our new charting method from the snap below; it has all sorts of neat JavaScript predictive text entry courtesy of Michele Clamp. You can also see we make use of the totally awesome charting software. Very cool.

Data-driven summary: smart kids will get up at 6am on Sat. to run their science...

Oh wait, no! That makes absolutely no sense. hehehe :-)

(c) 2018 James Cuff