Bioinformatics != sequence analysis

By einar

April 14, 2007

This post sums up my frustration in trying to use Python for my daily work. Like Perl and Ruby, it has its own Bio project for dealing with biological data. However, the current implementation leaves a lot to be desired. A lot of functionality that doesn’t deal with sequence analysis is missing, even for simple tasks such as fetching annotations from Entrez Gene (which is present in Bioperl, for example). Also, documentation for some modules is lacking or nonexistent (why keep a parser for Affymetrix CEL files when there is no information on how to use it, let alone on which formats it supports?). Basically, maintenance is good for everything related to sequence analysis… the rest is somewhat in slumber.

I can understand that Bioconductor has the spotlight regarding microarrays, but some of us don’t want to use R for that purpose (in my case, also to avoid duplicating tasks in my laboratory). At least for annotations, some support would be welcome, so that people aren’t forced to reinvent the wheel every time. I hope to find enough time to complete and polish my “annotation project” so that it can be helpful to someone (but with my PhD thesis coming up, it won’t be anytime soon).
