Monday, May 30, 2011

Setbacks and Starting Fresh/Community Bonding Period--Take 2

Happy Memorial Day everyone!  Finally feels like summer here in Chicago. First day in a while that it hasn't felt like November. Just wanted to provide you with a few updates as to what has been going on the past week.

First, remember my last post when I was really excited to get started because my code base was finally built? Well, lesson #1: never get too excited. Somehow, some way, I managed to seriously botch what I had spent the past few weeks doing and had to have Rutger help me out. Nothing says mentor-mentee bonding like a 4-hour VNC session!

That is one extremely useful thing I have learned to do thus far: VNC. I had never run it on my Mac before and spent an evening testing it out with my dad. It's actually very simple once you figure it out. First, for Macs, you go to System Preferences, Sharing, and then check the box that says "Screen Sharing". There, you can click on Computer Settings and set the password that people will use to connect to your computer over VNC. I also had to set up permissions on my router--that was probably the hardest part. And finally, I found that the easiest way to figure out your IP address is http://whatismyipaddress.com. Pretty cool :) Although I have to say it is extremely freaky to see your mouse moving on its own.

Thankfully I had this all figured out by the time I was in real need. After creating an endless number of new projects, I ended up having to check out the code in Eclipse and then install Maven via my terminal rather than through Eclipse.

So I think I am OFFICIALLY ready to go now. Before this drama happened, I was just about ready to start writing code :( The first thing I was working on was expressing row-segment metadata for NeXML. At first, I was a little nervous. It seemed like there were a MILLION places to start. But then, the advice of a professor (I really wish I could remember which one so I could give him credit for his words of wisdom that I use daily) resounded in my head--Step 1: Don't Panic. Okay...trying my best. Step 2: Start with what you know. Okay...I know that I am working on the NeXML section. I also know what metadata annotations entail, so I was looking for keywords. And I ALSO vaguely remember an e-mail from a while back that said that annotations for georeference information had been included. After spending some time looking at how the pieces fit together, I found that I was going to be editing the populateXmlMatrix() function within the NexmlMatrixConverter class. I checked with Rutger and it was very exciting to know that I ended up being correct! Yay.

Now that I am finally on the right page, I will be putting together a JUnit test to make this work. I have never written one before so it is going to be challenging, but I am looking at code that was already written to see how others have done it and am reading tutorials as well. If I can get it working, I will be committing my first piece of code!

Other things I will be working on this week (this is mainly to keep tabs on myself):
--Progress report.
--Add charsets. There are two ways to do so, so I need to get back to Rutger about which way would be best and whether changes to the NeXML API are necessary.
--Fill in Wiki.

Okay that's all for now. I know I'm supposed to be taking the day off for Memorial Day, but I at least wanted to work a little in the morning to make up for some lost time from last week. Ta-ta for now!

Monday, May 23, 2011

If you build it...

...you can finally get started! So my code base is built (kinda--more in a minute) and just in time to get down to real work. But boy was that a challenge. The only error that occurs when I build it is:

The markup in the document following the root element must be well-formed. styles.xml


So not really sure what is going on, but if I comment out lines 3 and 4, the code builds. So moving on now....
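For anyone who hits the same message: as far as I can tell, the Java XML parser typically reports it when it finds more markup after the root element has already been closed, for example a second top-level element. Here is a minimal sketch that reproduces this kind of error. The class name, the helper method, and the XML strings are made up for illustration; this is not the actual styles.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns null if the XML parses cleanly, otherwise "line N: message". */
    public static String firstError(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // The parser reports where it gave up, which helps locate the bad line.
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Hypothetical content: one root element, properly closed.
        String good = "<resources>\n  <style name=\"a\"/>\n</resources>\n";
        // Same content plus a stray element after the root has closed.
        String bad = good + "<style name=\"b\"/>\n";

        System.out.println("good: " + firstError(good));
        System.out.println("bad:  " + firstError(bad));
    }
}
```

If something like this is what is wrong with the file, commenting out the stray lines would make the error go away, which matches what I saw. The cleaner fix would be to move those lines inside the root element.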

To kick everything off, I am going to be expressing CHARSET free text and expressing the row-segment metadata (GenBank accession number) for NeXML. Right now, it only seems to be working for NEXUS. And don't worry...I am working on my directions for setting up the code base as well so more to come.

Okay back to work. Bye bye.

Wednesday, May 18, 2011

Community Bonding Period--Take 1

So it has been a crazy past few weeks. Last-minute schoolwork, finals, graduation, etc. lead me to say that I am happy to be FINITO! As this is the last week before the actual coding kickoff for Google Summer of Code, there are plenty of loose ends that I am tying up during what is known as the Community Bonding Period, which involves getting to know your mentor (mine are mainly Rutger Vos, plus Bill Piel--see left panel), setting up your code base, and figuring out what the heck exactly you are doing. So that is what I am doing now.

Main thing I'm doing now: trying to set up the coding framework. And I have to admit that it is fairly intimidating. There are so many pieces to fit together!!! And the how-to manual is rather thin (as in nearly non-existent). So this task alone has led me to Google and then Google some more and then Bing it every once in a while. I hope to put together some sort of coherent and very clear instruction page that maybe the next person trying to get involved will be able to follow. But more to come on that.

If you would like to read more about what exactly it is that I am doing, feel free to refer to: http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2011#Automated_submission_of_rich_data_to_TreeBASE. This also includes information on the other Google Summer of Code projects going on within NESCent.

My next post will involve a more detailed account of what I did to set up (or have set up so far) the code base, errors and ruts that I have run into, and how I have solved them. I didn't want to bore anyone too much with that info. It's not exactly light reading.

Hello world!

Greetings to my viewers!  For my first post, I would just like to warmly welcome you to my blog and explain to you a little about myself and how I became interested in this project.

I just successfully completed a Bioinformatics and Biology degree from Loyola University Chicago. I became interested in phylogenetics after taking the course Bioinformatics and learning different ways to construct trees and analyze the data. This prompted me to make this field the focus of my degree research, and I was able to start working in a phylogenetics and evolutionary biology lab under the supervision of Dr. Sushma Reddy. My project involved (and still involves) the investigation of the phylogeny and diversification of Pomatorhinus ruficollis (scimitar babbler) throughout Southeast Asia using mitochondrial and nuclear genes. Through this lab and project I really began to gain a full perspective of the field of phylogenetics. I have been able to work on both the phyloinformatics side of things and the biological side. This has allowed me to understand how the integration of the two is crucial for a complete understanding and analysis. I really started getting excited about this field after making several visits to the Field Museum of Natural History in Chicago and seeing all the research that goes on there. Very cool stuff.

Anyways, I swear I am going somewhere with all of this. I found out about Google Summer of Code through my bioinformatics department and was absolutely thrilled when I saw a list of "phyloinformatics projects". PERFECT!! I had been doing a little coding here and there, but not a major project. I was so excited to have a chance to string together everything I have learned and really contribute to something useful. The bioinformatics major definitely makes you take your fair share of comp sci classes, but sometimes the flow of classes is a little disjointed and there is only so much you can cram in within four years.

The TreeBASE project caught my eye for a few reasons. 1) I had the qualifications for it (hehe); 2) I have worked with TreeBASE before so I was familiar with its purpose; 3) The goals of the project interested me because the addition of metadata is extremely valuable to the user doing research in this field. Annotations that connect to GenBank and the inclusion of georeferencing information are extremely useful, but oftentimes not possible to find in one place. I hope to be able to help TreeBASE become a one-stop shop for phylogenetics researchers!

Okay, enough for now. I will add more very shortly about what exactly it is I am going to be doing this summer and what I have done to prep for it thus far during our "community-bonding period" :). But now, it is lunch time!