I'm very excited to have been accepted for this year's Google Summer of Code (GSOC). In recent days I have been busy preparing my master's thesis and defense, so this news is a good stress reliever for me. The project I'm going to work on is "Phylogenetics in Biopython: Filling in the gaps", which aims to implement some phylogenetics algorithms for Biopython. I believe it will be an exciting coding experience.
Get to Know GSOC
I first learned about GSOC from the BioJava homepage, when I was trying to use BioJava for my own bioinformatics work. Because I thought most of the applicants and BioJava contributors came from a computer science background, I never had the courage to apply. Last September, I got the chance to meet Professor Allen and Karen when they were visiting our lab. Karen told us more about GSOC and about NESCent, which had been a mentoring organization for several years. I must say this finally inspired me to apply for GSOC this year.
As the name implies, this project is to implement some phylogenetic algorithms that are currently absent from Biopython's Bio.Phylo package. The package already provides some basic phylogenetics functions, such as tree operations, parsers for Newick, Nexus and PhyloXML, and wrappers for PhyML, RAxML and PAML. However, some important components are still missing that would better support phylogenetic workflows: simple tree construction algorithms, consensus tree searching, tree comparison and visualization. In this project, I will focus on the first two: tree construction and consensus tree searching. The tree construction part includes three algorithms: UPGMA, Neighbor Joining, and Maximum Parsimony. The consensus tree part includes another three: Strict, Majority-rule and Adams Consensus. After this project, there will be two separate modules providing those algorithms in the Bio.Phylo package.
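To give a flavor of the simplest of these algorithms, here is a minimal UPGMA sketch in plain Python. It is not Biopython code and not the planned module design: the function name `upgma` and the dictionary-based distance input are assumptions made only for illustration, and branch lengths are left out to keep it short. The idea is to repeatedly merge the two closest clusters and set the distance from the merged cluster to every other cluster to the size-weighted average of the old distances.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return the topology as a Newick string.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between labels a and b.
    Both the function name and this input layout are hypothetical.
    """
    # Each active cluster maps its label to the number of leaves it contains.
    sizes = {name: 1 for name in names}
    d = dict(dist)  # work on a copy so the caller's distances are untouched

    while len(sizes) > 1:
        # 1. Pick the closest pair of active clusters.
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = "(%s,%s)" % (a, b)

        # 2. The distance from the new cluster to every other cluster is
        #    the size-weighted average of the two old distances.
        for other in sizes:
            if other in (a, b):
                continue
            d[frozenset((merged, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])

        # 3. Replace the two old clusters with the merged one.
        sizes[merged] = sizes.pop(a) + sizes.pop(b)

    return next(iter(sizes)) + ";"


# Toy distance matrix for four taxa (made-up numbers).
example = {
    frozenset("AB"): 2, frozenset("AC"): 6, frozenset("AD"): 10,
    frozenset("BC"): 6, frozenset("BD"): 10, frozenset("CD"): 10,
}
print(upgma(["A", "B", "C", "D"], example))  # (((A,B),C),D);
```

A real module would work on a proper distance matrix class and return a tree object with branch lengths rather than a string; the sketch only shows the clustering loop.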
Work for the Next Two Weeks
The coding period starts on June 17. During the next two weeks, I will read the related source code in Biopython and try to draft a design for the two modules.