| author | categories | comments | date | header | slug | tags | title | omit_header_text | disable_share | wordpress_id |
|---|---|---|---|---|---|---|---|---|---|---|
| einar | | true | 2007-11-15T19:57:16Z | | gene-identifiers | | Gene identifiers | true | true | 336 |

While working today on an annotation class in Python I stumbled on a problem. Normally I work with lists of genes that are consistent, i.e. all Entrez Gene IDs (or RefSeq IDs, or Genome Browser IDs...), but today I had a list of mixed identifiers.
The obvious next idea was "let's implement auto-detection of common identifiers in the class". The problem is... is there any actual documentation on how these identifiers are formatted? So far, using regular expressions, I've tracked down a few:
- RefSeq
- GenBank
- Entrez Gene
- UCSC Genome Browser
- Ensembl
However, I have no idea whether I have covered every variant of these IDs. Does anyone know where to look this information up?
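As a rough illustration, regex-based auto-detection along these lines could be sketched as follows. The patterns are assumptions based on commonly seen accession formats, not the ones from the actual class, and they are certainly incomplete. The names `ID_PATTERNS`, `guess_id_type` and `group_by_type` are made up for this example.

```python
import re

# Hypothetical patterns based on commonly seen formats; not exhaustive.
# Order matters: the first pattern that matches wins, so the most
# specific formats come first and the bare-integer check comes last.
ID_PATTERNS = [
    # Ensembl: "ENS", optional species code, feature letter, 11 digits
    ("ensembl", re.compile(r"ENS[A-Z]*[GTPE]\d{11}(\.\d+)?")),
    # RefSeq: two-letter prefix, underscore, digits, optional version
    ("refseq", re.compile(r"[A-Z]{2}_\d+(\.\d+)?")),
    # UCSC known genes: "uc", three digits, three lowercase letters
    ("ucsc", re.compile(r"uc\d{3}[a-z]{3}(\.\d+)?")),
    # GenBank nucleotide: 1 letter + 5 digits, or 2 letters + 6 digits
    ("genbank", re.compile(r"([A-Z]\d{5}|[A-Z]{2}\d{6})(\.\d+)?")),
    # Entrez Gene: a plain integer
    ("entrez_gene", re.compile(r"\d+")),
]


def guess_id_type(identifier):
    """Return the name of the first matching ID type, or None."""
    identifier = identifier.strip()
    for name, pattern in ID_PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    return None


def group_by_type(identifiers):
    """Split a mixed list into one list per guessed ID type."""
    groups = {}
    for identifier in identifiers:
        groups.setdefault(guess_id_type(identifier), []).append(identifier)
    return groups
```

With these assumed patterns, `group_by_type(["7157", "NM_000546", "ENSG00000139618"])` puts each identifier under its own key, and anything unrecognised ends up under `None` so it can be inspected by hand.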
(On a related note: my thesis defense will be on January 14th, 2008, so I have to get the printing going)