Ever hear of Nostrasamus?

[This is reposted from our Tumblr, where Sam reflects on his time in the lab since the beginning, back in 2007.]

That’s not a typo.  Of course everyone’s heard of that old fool Nostradamus, who was supposed to be able to predict the fate of the world (including the OJ Simpson trial!).  However, I think he’s a miserable failure.

NostraSAMus (that refers to me, by the way), on the other hand, seems to actually have a good grasp on reality and the future.  As reference, notice what I wrote in this 2007 Genefish post (my first post to the lab blog, in fact!): “Soon, I’m sure we’ll all be able to buy personal GPS units so that we’ll be able to post our real-time locations on Google Earth/Maps and setup an RSS feed for our blogs without having to even be at a computer.  Hmmm, maybe I should get that idea notarized/patented…”

Now, another question is whether I was being sarcastic (I’m never sarcastic, though [/sarcasm]) or, more likely, an amazing visionary who can see into the future.  I suspect the latter (just for reference, the first GPS-equipped iPhone wouldn’t be released until 2008, the year after that post).

Anyway, now that I’m done fluffing my ego, I was simply reminiscing about the lab and this blog a bit and I stumbled across that amazing post from 6 years ago.  Granted, I, of course, knew that I’d still be working in the lab 6 years from when I started in March 2007.  You can even ask Steven what my response was to his interview question of:

SR – “Where do you see yourself in 5yrs?”

My response: “Celebrating the 5yr anniversary of you asking me that question.”  (Thank you, Mitch Hedberg!)

Despite my predictive abilities, I never envisioned how the lab would evolve over these last 6 years.  We started out with skyscrapers of yeast plates:

Moved on to octopus eggs:

Continued with lab “jokes”:

Acquired some new personnel:

The big boss man got tenured (and he’s happy about it)!

And we’ve finally settled on bioinformatics as our lab activity of choice.  We’ve had to dive headfirst into the world of big data acquisition, manipulation, and visualization.  And, as a small lab, dealing with big data sets (e.g. 120 million sequences from a single sample) has been a fun, enlightening, and frustrating learning experience.  This is particularly true since all of us are bench-trained biologists, have little-to-no “programming” experience, and are trying to discover (and rely upon) freeware for all of this.  We’ve been utilizing software like:

SQLShare – A web-based way to use SQL (good for manipulating/joining enormous tables together) without having to deal with the backend stuff, like schema and other weird tech jargon we don’t grasp.

R – Programming language geared towards statistical data analysis and visualization.  Think command line.  Ugh.  RStudio is a nice “skin” that provides a friendlier GUI for R.

Galaxy – A web-based means of analyzing virtually any type of “-omics” data, including basic manipulation of text files, FASTA files, BED files, etc.  Has a variety of high-throughput sequencing analysis tools, too.

iPlant Discovery Environment – Similar to Galaxy but their servers are much, much faster at virtually everything.  And, they also have de novo assembly software built-in (like SOAPdenovo, Trinity, and Velvet), which Galaxy does not.

BSMAP – For mapping and analyzing bisulfite-treated sequences.

IPython – A notebook for tracking your code that allows you to execute the code directly in the notebook.

Integrative Genomics Viewer (IGV) – For visualizing various features annotated within a given nucleotide/protein sequence.

There are others that we have tried (and continue to try) that often serve a fringe purpose, and there are others that we’d like to keep using but simply need more time to practice with.  Unfortunately, time is probably one of the most precious commodities in the lab and it’s difficult to set aside serious chunks of time to really learn, and apply, tools like Perl, Python, BioPerl, BioPython, Quadrigram, and Orange Canvas.  We know these tools are immensely powerful and useful, but it’s hard to sit at the computer learning these things from the beginning while your data is just sitting around waiting to be dealt with in some fashion.

Additionally, in order to handle all this data, we’ve purchased our own server!  Keeping with a trend of computer naming in the lab (most of our computers are named after birds, which should be obvious, since we do all of our work on shellfish), we call it The Eagle.  It’s a Synology DS413 and it’s been one of the best purchases the lab has ever made.  It has (temporarily) resolved multiple issues:

- Data storage capacity (8TB!)

- Automated “backup” (RAIDed hard drives ensure that when one HDD fails, we don’t lose any data)

- Data hosting for generating URLs for our files.  Additionally, the Synology also helps to replace Dropbox, which our data sets had finally outgrown.

- Direct downloads to the server eliminate the need to download large files to a desktop, only to then upload them to the server.
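
For the curious, here is a rough Python sketch of the idea behind the RAID “backup” item above: how a parity block lets an array rebuild the contents of one dead drive.  It is an illustration only.  It assumes a single-parity layout (RAID 5 style), which may not be how The Eagle is actually configured, and the drive contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one equal-sized block.
data_drives = [b"ACGT", b"TTAG", b"GGCA"]

# A fourth drive stores the parity block (XOR of all the data blocks).
parity = xor_blocks(data_drives)

# Simulate losing drive 1, then rebuild it from the survivors plus parity.
failed = 1
survivors = [block for i, block in enumerate(data_drives) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_drives[failed])
```

Because XOR-ing a block with itself cancels out, combining the surviving blocks with the parity block leaves only the missing block.  Lose two drives at once in this scheme, though, and the data is gone, which is why “backup” gets the quotation marks.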

All in all, the last 6yrs have been pretty awesome over here in Fisheries Teaching & Research at the University of Washington.  The lab is never short on exciting new work and approaches to science.  Plus, we really have had (and continue to have) great people in the lab (personality-wise and scientific-mind-wise), which is critical to a fun, stimulating, creative laboratory.

Well, enough of the brown nosing.  Time for you grad students to stop wasting time reading this blog and get back to work!

In the meantime, I think I’ll walk up to Big Time for lunch and a beer before I leave early today to play softball…

