Add the whole blog

Luca Beltrame 2020-12-28 18:06:15 +01:00
parent 0d2f58ce7a
commit c4f23c1529
Signed by: einar
GPG key ID: 4707F46E9EC72DEC
418 changed files with 15708 additions and 0 deletions

@@ -0,0 +1,17 @@
---
author: einar
categories:
- Science
comments: true
date: "2007-04-14T07:55:45Z"
header:
  image_fullwidth: banner_other.jpg
slug: bioinformatics-sequence-analysis
title: Bioinformatics != sequence analysis
disable_share: true
wordpress_id: 226
---
This post sums up my frustration in trying to use Python for my daily work. Like Perl and Ruby, it has [its own Bio version](http://biopython.org) to deal with biological data. However, the current implementation leaves a lot to be desired. A lot of functionality unrelated to sequence analysis is missing, even for simple tasks such as fetching annotations from [Entrez Gene](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene), although it is present in Bioperl, for example. Also, documentation for some modules is lacking or nonexistent (why keep a parser for Affymetrix CEL files when there is no information on how to use it, let alone on which formats it supports?). Basically, maintenance is good for everything related to sequence analysis... the rest is somewhat in slumber.
I can understand that [Bioconductor](http://bioconductor.org) has the spotlight when it comes to microarrays, but some of us don't want to use R for that purpose (in my case, also to avoid duplicating tasks in my laboratory). At least for annotations, some basic support would be welcome, so that people aren't forced to reinvent the wheel every time. I hope to find enough time to complete and polish my "annotation project" so that it can be helpful to someone (but with my PhD thesis coming up, it won't be anytime soon).