Initial import of new posts
parent e4bafbb361
commit 0e12688f04
391 changed files with 14594 additions and 0 deletions

_posts/2006-11-25-a-simple-annotator.markdown (Normal file, 19 lines)
@@ -0,0 +1,19 @@
---
author: einar
comments: true
date: 2006-11-25 09:06:10+00:00
layout: post
slug: a-simple-annotator
title: 'A simple annotator '
wordpress_id: 132
categories:
- General
- Linux
- Science
---

Over the past two days I've written a simple annotator program that, given an input list of RefSeq genes, automatically determines the relevant Entrez Gene IDs and annotates them using the flat files provided by the [NCBI](http://www.ncbi.nlm.nih.gov). A direct conversion was not possible due to limitations in Biopython's parsers, but I managed to use the GenBank parser to identify and extract the references to the Gene IDs and put them in a list.
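
To give a rough idea of the Gene ID extraction step, here is a minimal sketch. It is not the code described above: instead of Biopython's GenBank parser it uses a plain regular expression from the standard library, and it assumes the cross-references appear in the flat file as `/db_xref="GeneID:..."` qualifiers. The function name `extract_gene_ids`, the sample record and its accession and IDs are all made up for illustration.

```python
import re

# Matches qualifiers such as /db_xref="GeneID:1234" in GenBank-format text.
GENE_ID_PATTERN = re.compile(r'/db_xref="GeneID:(\d+)"')


def extract_gene_ids(genbank_text):
    """Return the Gene IDs found in the text, in order of first appearance."""
    seen = set()
    gene_ids = []
    for match in GENE_ID_PATTERN.finditer(genbank_text):
        gene_id = match.group(1)
        # The same ID is usually repeated on several features; keep it once.
        if gene_id not in seen:
            seen.add(gene_id)
            gene_ids.append(gene_id)
    return gene_ids


# Hypothetical, heavily shortened records used only to exercise the function.
SAMPLE_RECORDS = '''\
LOCUS       NM_EXAMPLE1
FEATURES             Location/Qualifiers
     gene            1..100
                     /gene="EX1"
                     /db_xref="GeneID:1234"
     CDS             10..90
                     /gene="EX1"
                     /db_xref="GeneID:1234"
//
LOCUS       NM_EXAMPLE2
FEATURES             Location/Qualifiers
     gene            1..200
                     /gene="EX2"
                     /db_xref="GeneID:5678"
//
'''

if __name__ == "__main__":
    print(extract_gene_ids(SAMPLE_RECORDS))
```

A regular expression is cruder than a real parser (it ignores record boundaries, for instance), but it shows the idea of collecting the IDs into a list for the later steps.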
Once that was done, I built a series of dictionaries while reading the annotation file, one for each kind of data, such as gene name, symbol, chromosome and cytoband. Using the list of Gene IDs I had already obtained, it was easy to create a new file with the required fields.
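
The dictionary step could look roughly like the sketch below. It assumes a tab-delimited annotation file with a header row; the column names (`GeneID`, `Symbol`, `description`, `chromosome`, `map_location`) are assumptions loosely modelled on NCBI's gene annotation files, and the helper names `load_annotations` and `write_annotated`, along with the sample data, are hypothetical.

```python
import csv
import io

# Assumed column names; adjust them to whatever the real annotation file uses.
FIELDS = ("Symbol", "description", "chromosome", "map_location")


def load_annotations(handle):
    """Build one dictionary per field, each mapping Gene ID to that field's value."""
    tables = {field: {} for field in FIELDS}
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for field in FIELDS:
            tables[field][row["GeneID"]] = row[field]
    return tables


def write_annotated(gene_ids, tables, out):
    """Write one tab-delimited line per requested Gene ID, '-' where data is missing."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(("GeneID",) + FIELDS)
    for gene_id in gene_ids:
        writer.writerow([gene_id] + [tables[f].get(gene_id, "-") for f in FIELDS])


# Made-up annotation data, only for demonstration.
SAMPLE_ANNOTATIONS = (
    "GeneID\tSymbol\tdescription\tchromosome\tmap_location\n"
    "1234\tEX1\texample gene 1\t7\t7q11.2\n"
    "5678\tEX2\texample gene 2\tX\tXp22.1\n"
)

if __name__ == "__main__":
    tables = load_annotations(io.StringIO(SAMPLE_ANNOTATIONS))
    output = io.StringIO()
    write_annotated(["5678", "9999"], tables, output)
    print(output.getvalue(), end="")
```

Keeping one dictionary per field mirrors the description above; a single dictionary of rows keyed by Gene ID would work just as well.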
During this process I learnt a bit more about how to use iterators to skip headings and the like. The code is not yet sufficiently generic, but once I finish toying with it, I may publish it under the GPL v2 for "general" consumption (assuming anyone would use it).
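
For the curious, skipping headings with iterators can be sketched along these lines. This is only an illustration, not the annotator's code: the helper names and the sample lines are invented, and it assumes header lines either start with `#` or occupy a known number of leading lines.

```python
from itertools import dropwhile, islice


def skip_comment_header(lines, prefix="#"):
    """Drop the leading run of lines starting with prefix; later ones are kept."""
    return dropwhile(lambda line: line.startswith(prefix), lines)


def skip_first(lines, count=1):
    """Drop a fixed number of leading lines, whatever they contain."""
    return islice(lines, count, None)


# Made-up input standing in for an open file, which is also an iterator of lines.
SAMPLE_LINES = [
    "#Format: GeneID Symbol",
    "#Generated for illustration",
    "1234\tEX1",
    "5678\tEX2",
]

if __name__ == "__main__":
    for line in skip_comment_header(SAMPLE_LINES):
        print(line)

    # The same idea by hand: consume the heading with next(), then loop.
    iterator = iter(SAMPLE_LINES)
    next(iterator, None)  # discard the first line; None avoids StopIteration
    print(list(iterator))
```

Because both helpers return lazy iterators, they work on large files without reading everything into memory first.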