Saturday, April 20, 2013

little known technologies that helped save the human genome


What is that chunky wiring monstrosity?

That, my friends, is the utter magic that happens when you have to get creative to solve problems as a team. Back in the day, there were no clouds, and Amazon was still just selling books. There was also no "devops". I mean, heck, we hardly had functioning cell phones!

So we faced a problem. We had no space, not enough kit, and a race to further inform the human genome. These were the days before "map reduce" or any of that clever stuff. We wanted to find a way to get DNA onto multiple computer clusters in a real hurry. Tim Cutts, on the team with Guy Coates, came up with this epic system that, in essence, relies on that set of little wires you see above.

We were using RLX Technologies blades which could stack 24 little motherboards into a 3U chassis! These were some interesting days as we stacked up 768 of these puppies in 2 racks - we used 52U monster racks back then, that needed a stepladder to get to the top blades! However, this did allow us to fit 16 chassis or 384 individual computers in a single rack! Each blade had two network ports that were presented out the back via a nifty RJ21 connection...




You can see that the networking could get very quickly out of hand... 12*4 wires is 48 ethernet connections out of a single box!


Yeah, it got out of hand fast! To run the 768 machines we had purchased, with two ports each, we would have needed a network switch with 1,536 ports. Remember, this is 2002. That would have been a crazy switch to buy... even today it would be pretty nuts to do it...
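For anyone who wants to check the sums, here is a tiny sketch using only the figures quoted in this post (24 blades per 3U chassis, 16 chassis per rack, 2 racks, 2 network ports per blade). The variable names are just mine for illustration.

```python
# Back-of-the-envelope figures, all taken from the post itself.
BLADES_PER_CHASSIS = 24   # RLX blades in one 3U chassis
CHASSIS_PER_RACK = 16
RACK_UNITS_PER_CHASSIS = 3
RACKS = 2
PORTS_PER_BLADE = 2

blades_per_rack = BLADES_PER_CHASSIS * CHASSIS_PER_RACK      # machines in one rack
total_blades = blades_per_rack * RACKS                       # machines overall
ports_per_chassis = BLADES_PER_CHASSIS * PORTS_PER_BLADE     # Ethernet links out of one box
total_ports = total_blades * PORTS_PER_BLADE                 # switch ports needed
rack_units_used = CHASSIS_PER_RACK * RACK_UNITS_PER_CHASSIS  # out of 52U per rack

print(blades_per_rack, total_blades, ports_per_chassis, total_ports, rack_units_used)
# prints: 384 768 48 1536 48
```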

So enter Tim Cutts, and his magic wires!

Because we were basically doing a file system distribution and image run (we talked about it later here), we could effectively make a next-to-next-to-next network, taking each node and hooking it up to its friend next door. This way we had a network that was perfect for distributing data from one end of the chain all the way to the other.

We called this a "Distribution Area Network".

We basically took Dolly as our inspiration:

     +--------+  Data   +----------+  Data  +----------+
     | Master |-------->| Client 1 |------->| Client 2 |
     +--------+         +----------+        +----------+
         ^                   |                   |
         | Data              | Data              | Data
         |                   V                   V
      +------+            +------+            +------+
      | Disk |            | Disk |            | Disk |
      +------+            +------+            +------+
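
The picture above can be sketched in a few lines of Python. To be clear, this is not Dolly's code or anything we ran back then; it is a toy model of the idea, where in-memory buffers stand in for both the wires and the disks, and the names `read_chunks`, `relay` and `distribute` are made up for this example. Each client writes a chunk to its own disk and then hands the same chunk to its neighbour, so the master only ever sends one copy.

```python
import io


def read_chunks(source, chunk_size):
    """Master side: read the image and emit it in fixed-size chunks."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def relay(upstream, disk):
    """Client side: store each chunk locally, then pass it down the chain."""
    for chunk in upstream:
        disk.write(chunk)
        yield chunk


def distribute(image, n_clients, chunk_size=4):
    """Push one image through a chain of n_clients; return each client's copy."""
    disks = [io.BytesIO() for _ in range(n_clients)]  # stand-ins for local disks
    stream = read_chunks(io.BytesIO(image), chunk_size)
    for disk in disks:
        # Hook each client up to its neighbour: its input is the previous output.
        stream = relay(stream, disk)
    for _ in stream:
        # Draining the far end of the chain pulls every chunk through every client.
        pass
    return [disk.getvalue() for disk in disks]


copies = distribute(b"ACGTACGTTTGA", n_clients=3)
print(all(copy == b"ACGTACGTTTGA" for copy in copies))
# prints: True
```

Adding another client to the chain costs one more short wire rather than more ports on a central switch, which is what made the daisy chain so attractive here.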

We were using the local 2.5" disks as a software RAID stripe of 2 x 40GB disks per blade, so there were 1,536 little spindles (roughly 60TB in total) in there, each blade with local copies of the human genome and analysis data needed to get our science out of the door.
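
The storage sums work out the same way. This is a quick sketch from the numbers above; I am assuming the 60TB figure counts 1,024GB to the terabyte, which is the only way the rounding lands exactly.

```python
# Disk figures from the post: 768 blades, each striping 2 x 40GB disks.
TOTAL_BLADES = 768
DISKS_PER_BLADE = 2
GB_PER_DISK = 40

spindles = TOTAL_BLADES * DISKS_PER_BLADE      # individual disks
gb_per_blade = DISKS_PER_BLADE * GB_PER_DISK   # size of each blade's stripe
total_gb = spindles * GB_PER_DISK
total_tb = total_gb / 1024                     # assumption: 1,024GB per TB

print(spindles, gb_per_blade, total_gb, total_tb)
# prints: 1536 80 61440 60.0
```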

This all kinda looks a bit like something released not that long ago eh?   Seeing these recent announcements of so called "high density CPU servers" reminds me of that phrase...

What was once old. Is new again. I guess?

As a footnote, here is the part that to me was even more staggering. Tim sent the folks over at the manufacturers what was basically an upgraded beer mat design. We, much like the world famous Dr. Birney, also loved to design things at the pub :-) When the first example "wire" came back, we plugged it in on a test chassis, set up some IP addresses... sent some pings... crossed our fingers!

and the bloody thing went and worked the FIRST TIME!

I still so clearly remember laughing in our ISG portakabin.

p.s.
While researching this article, I stumbled across proof I also used Pine on Tru64 ;-)




[any opinions here are all mine, and have absolutely nothing to do with my employer]
(c) 2011 James Cuff