---
author: einar
categories:
- Science
comments: true
date: "2007-11-15T19:57:16Z"
header:
  image_fullwidth: banner_other.jpg
slug: gene-identifiers
tags:
- annotation
- bioinformatics
- microarray
- python
title: Gene identifiers
omit_header_text: true
disable_share: true
wordpress_id: 336
---
While working today on an annotation class in Python I stumbled on a problem. Normally I work with lists of genes that are consistent, i.e. all Entrez Gene IDs (or RefSeq IDs, or Genome Browser IDs...), but today I had a list of mixed identifiers.
The obvious next idea was "let's implement auto-detection of common identifiers in the class". The problem is: is there any actual documentation on how these identifiers are formatted? So far, using regular expressions, I've tracked down a few:
* RefSeq
* GenBank
* Entrez Gene
* UCSC Genome Browser
* Ensembl
However, I have no idea whether I have covered every variant of these IDs. Does anyone know where to look this information up?
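
To make the idea concrete, here is a minimal sketch of what regex-based auto-detection could look like. It is not the code from the annotation class: the names `ID_PATTERNS` and `detect_id_type` are hypothetical, and the patterns are simplified assumptions that only cover the most commonly seen shape of each identifier (for example `NM_000546` for RefSeq or `ENSG00000141510` for Ensembl), so real lists will contain variants they miss.

```python
import re

# Assumed, simplified patterns: one common shape per identifier type.
# Order matters: the purely numeric Entrez Gene check comes last.
ID_PATTERNS = [
    # Two-letter prefix, underscore, digits, optional version (NM_000546.5)
    ("refseq", re.compile(r"^[A-Z]{2}_\d+(\.\d+)?$")),
    # ENS + optional species code + G/T/P + 11 digits (ENSG00000141510)
    ("ensembl", re.compile(r"^ENS[A-Z]*[GTP]\d{11}(\.\d+)?$")),
    # UCSC known gene style: "uc", 3 digits, 3 lowercase letters (uc002gig.1)
    ("ucsc", re.compile(r"^uc\d{3}[a-z]{3}(\.\d+)?$")),
    # GenBank nucleotide: 1 letter + 5 digits or 2 letters + 6 digits
    ("genbank", re.compile(r"^([A-Z]\d{5}|[A-Z]{2}\d{6})(\.\d+)?$")),
    # Entrez Gene IDs are plain integers
    ("entrez_gene", re.compile(r"^\d+$")),
]


def detect_id_type(identifier):
    """Return the name of the first matching pattern, or None if nothing matches."""
    identifier = identifier.strip()
    for name, pattern in ID_PATTERNS:
        if pattern.match(identifier):
            return name
    return None


# Classify a mixed list, as in the situation described above.
mixed = ["NM_000546", "7157", "ENSG00000141510", "uc002gig.1", "U49845"]
for gene_id in mixed:
    print(gene_id, detect_id_type(gene_id))
```

Keeping the patterns in an ordered list makes it easy to add a new identifier type later, and returning `None` for anything unrecognized lets the caller decide how to handle IDs the patterns do not cover.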
(On a related note: my thesis defense will be on January 14th, 2008, so I have to get the printing going.)