Add the whole blog
parent 0d2f58ce7a
commit c4f23c1529
418 changed files with 15708 additions and 0 deletions
21
content/post/2006-11-25-a-simple-annotator.markdown
Normal file
@@ -0,0 +1,21 @@
---
author: einar
categories:
- General
- Linux
- Science
comments: true
date: "2006-11-25T09:06:10Z"
header:
  image_fullwidth: banner_other.jpg
slug: a-simple-annotator
title: 'A simple annotator '
disable_share: true
wordpress_id: 132
---

Over the past two days I've written a simple annotator program that, given an input list of RefSeq genes, automatically determines the relevant Entrez Gene IDs and annotates them using the flat files provided by the [NCBI](http://www.ncbi.nlm.nih.gov). A direct conversion was not possible due to limitations in Biopython's parsers, but I managed to use the GenBank parser to identify and extract the references to the Gene IDs and put them in a list.
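
To illustrate the extraction step, here is a minimal sketch. It is not the code described in this post: instead of Biopython's GenBank parser it uses a plain regular expression from the standard library, and it assumes the Gene ID appears as a `/db_xref="GeneID:..."` qualifier in the record. The function name `extract_gene_ids` and the sample record are hypothetical.

```python
import re

# Assumption: Gene IDs appear as /db_xref="GeneID:<digits>" qualifiers.
GENE_ID_PATTERN = re.compile(r'/db_xref="GeneID:(\d+)"')


def extract_gene_ids(lines):
    """Return unique Gene IDs found in GenBank-format lines, in order of first appearance."""
    seen = set()
    gene_ids = []
    for line in lines:
        for gene_id in GENE_ID_PATTERN.findall(line):
            if gene_id not in seen:
                seen.add(gene_id)
                gene_ids.append(gene_id)
    return gene_ids


# Hypothetical, heavily shortened record used only for illustration.
sample_record = [
    "LOCUS       NM_000546",
    "     gene            1..2512",
    '                     /gene="TP53"',
    '                     /db_xref="GeneID:7157"',
    "     CDS             143..1324",
    '                     /db_xref="GeneID:7157"',
    "//",
]

print(extract_gene_ids(sample_record))
```

The same ID usually shows up on several features of one record, which is why the sketch deduplicates while preserving order.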
Once that was done, I built a series of dictionaries while reading the annotation file, holding data such as gene name, symbol, chromosome and cytoband. Using the list I had already obtained, it was easy to create a new file with the required fields.
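
The dictionary step could look roughly like the sketch below. The column layout is an assumption: a simplified, tab-separated table loosely modelled on NCBI's `gene_info` file, not the real file format. The names `load_annotations` and `write_annotated` are hypothetical, as is the sample data.

```python
import csv
import io

# Assumed, simplified layout: tax_id, GeneID, symbol, chromosome, cytoband, name.
SAMPLE_ANNOTATION = (
    "#tax_id\tGeneID\tSymbol\tchromosome\tmap_location\tdescription\n"
    "9606\t7157\tTP53\t17\t17p13.1\ttumor protein p53\n"
    "9606\t672\tBRCA1\t17\t17q21.31\tBRCA1 DNA repair associated\n"
)


def load_annotations(handle):
    """Build one dictionary per field, each keyed by Gene ID."""
    fields = {"symbol": {}, "chromosome": {}, "cytoband": {}, "name": {}}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header line
        _tax_id, gene_id, symbol, chromosome, cytoband, name = row
        fields["symbol"][gene_id] = symbol
        fields["chromosome"][gene_id] = chromosome
        fields["cytoband"][gene_id] = cytoband
        fields["name"][gene_id] = name
    return fields


def write_annotated(gene_ids, fields, out):
    """Write one tab-separated row per Gene ID that has an annotation."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["GeneID", "Symbol", "Chromosome", "Cytoband", "Name"])
    for gene_id in gene_ids:
        if gene_id not in fields["symbol"]:
            continue  # no annotation available for this ID
        writer.writerow([
            gene_id,
            fields["symbol"][gene_id],
            fields["chromosome"][gene_id],
            fields["cytoband"][gene_id],
            fields["name"][gene_id],
        ])


fields = load_annotations(io.StringIO(SAMPLE_ANNOTATION))
output = io.StringIO()
write_annotated(["7157", "99999999"], fields, output)
print(output.getvalue())
```

IDs without an annotation are silently skipped here; whether to skip them or emit an empty row is a design choice the post does not specify.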
Along the way I learnt a bit more about using iterators to skip headings and so on. The code is not yet sufficiently generic, but once I finish toying with it I may publish it under the GPL v2 for "general" consumption (assuming anyone would use it).
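
For readers curious about the iterator trick, the sketch below shows three common ways to skip heading lines in Python. These are generic standard-library idioms, not necessarily the ones used in the annotator; the helper names and sample lines are hypothetical.

```python
from itertools import dropwhile, islice


def skip_fixed_header(lines, count=1):
    """Skip a known number of heading lines."""
    return islice(lines, count, None)


def skip_comment_header(lines, prefix="#"):
    """Skip leading lines that start with a comment prefix."""
    return dropwhile(lambda line: line.startswith(prefix), lines)


sample_lines = [
    "# generated file",
    "# columns: id symbol",
    "7157 TP53",
    "672 BRCA1",
]

# Option 1: advance the iterator by hand, one heading at a time.
rows = iter(sample_lines)
next(rows, None)
next(rows, None)
by_hand = list(rows)

# Option 2: skip a fixed number of lines.
by_count = list(skip_fixed_header(sample_lines, count=2))

# Option 3: skip however many comment lines come first.
by_prefix = list(skip_comment_header(sample_lines))

print(by_hand, by_count, by_prefix)
```

All three work on file objects as well as lists, since a file opened for reading is itself an iterator over its lines.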