Student Name: Eric Talevich
Mentor Name: Brad Chapman
Co-Mentors: Christian Zmasek
Title: Biopython support for parsing and writing phyloXML
Abstract: PhyloXML is an XML format for phylogenetic trees, designed to allow storing information about the trees themselves (such as branch lengths and multiple support values) along with data such as taxonomic and genomic annotations. Connecting these pieces of evolutionary information in a standard format is key for comparative genomics.

A BioPerl driver for phyloXML was created during the 2008 Summer of Code; this project aims to build a similar module for the popular Biopython package.
Public info:

The emerging phyloXML format offers a consistent way to store and share information about richly annotated phylogenetic trees, including geographic, taxonomic and sequence-level data as well as other XML-based extensions. However, researchers can benefit from it only if existing libraries and toolkits support the format. To support phyloXML in Biopython, I created a pair of new modules, Tree and TreeIO, which provide a common interface for reading and writing phylogenetic trees in several file formats and operate on a common basic object representation. These modules are generalized to provide a foundation for enhanced phylogenetics support in Biopython in the future.
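
To give a feel for what a phyloXML file contains, here is a minimal sketch that reads a small hand-written tree using only Python's standard library. It does not use the Tree or TreeIO modules and is not how they are implemented; the sample document, the EXAMPLE constant and the leaf_summaries helper are all made up for illustration. It assumes the standard phyloXML namespace (http://www.phyloxml.org) and the clade, name, branch_length, confidence and taxonomy elements.

```python
import xml.etree.ElementTree as ET

# Namespace used by phyloXML documents; "phy" is an arbitrary local prefix.
PHYLOXML_NS = {"phy": "http://www.phyloxml.org"}

# Hypothetical three-leaf tree, written by hand for this example.
EXAMPLE = """\
<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <name>Example tree</name>
    <clade>
      <clade>
        <branch_length>0.06</branch_length>
        <confidence type="bootstrap">89</confidence>
        <clade>
          <name>A</name>
          <branch_length>0.102</branch_length>
          <taxonomy>
            <scientific_name>Escherichia coli</scientific_name>
          </taxonomy>
        </clade>
        <clade>
          <name>B</name>
          <branch_length>0.23</branch_length>
        </clade>
      </clade>
      <clade>
        <name>C</name>
        <branch_length>0.4</branch_length>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>
"""


def leaf_summaries(xml_text):
    """Return (name, branch_length, scientific_name) for each leaf clade."""
    root = ET.fromstring(xml_text)
    leaves = []
    # Visit every clade in document order, at any nesting depth.
    for clade in root.iterfind(".//phy:clade", PHYLOXML_NS):
        # A clade with a direct child clade is an internal node; skip it.
        if clade.find("phy:clade", PHYLOXML_NS) is not None:
            continue
        name = clade.findtext("phy:name", namespaces=PHYLOXML_NS)
        length = clade.findtext("phy:branch_length", namespaces=PHYLOXML_NS)
        species = clade.findtext(
            "phy:taxonomy/phy:scientific_name", namespaces=PHYLOXML_NS
        )
        leaves.append((name, float(length) if length else None, species))
    return leaves


if __name__ == "__main__":
    for leaf in leaf_summaries(EXAMPLE):
        print(leaf)
```

Even in this toy case, the tree topology, branch lengths, support values and taxonomic annotations all live in one nested structure, which is what makes a shared object representation across file formats useful.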

For more on how to use these modules, see the PhyloXML, Tree and TreeIO pages on the Biopython wiki.

I'll continue to work on the Tree and TreeIO modules with Biopython. To track this project, follow my phyloxml branch on GitHub. The code should be merged into the mainline sometime within the next few months and included in the next regular Biopython release after that.

Additional info: http://www.biopython.org/wiki/PhyloXML