Add the whole blog

Luca Beltrame 2020-12-28 18:06:15 +01:00
parent 0d2f58ce7a
commit c4f23c1529
Signed by: einar
GPG key ID: 4707F46E9EC72DEC
418 changed files with 15708 additions and 0 deletions

@ -0,0 +1,20 @@
---
author: einar
categories:
- General
- Science
comments: true
date: "2006-12-13T21:13:59Z"
header:
image_fullwidth: banner_other.jpg
slug: working-with-genome-browser-data
title: Working with Genome Browser data
disable_share: true
wordpress_id: 138
---
Over the past two days I've been tackling an annotation problem. I'm trying to provide annotations for the genes found in regions whose DNA copy number is significantly altered (found thanks to the [STAC](http://www.genome.org/cgi/content/abstract/16/9/1149) method). The idea was to annotate those regions (which span one megabase) using the [UCSC Table Browser](http://genome.ucsc.edu/cgi-bin/hgTables).
However, the task was impractical to do by hand, so I decided to automate it a bit. I converted the data into ranges and then used the KnownGene annotation file (downloaded from UCSC) to work out which genes fell in which region. That last part wasn't easy at all (at least in Python), as I had to check for ranges and adjust for consecutive regions. The code is terribly ugly, so I'll try to clean it up before posting it.
If I can, I'll try to integrate it with the other scripts I've written to make a small annotation pipeline.
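
To give a feel for the range-matching step, here is a minimal Python sketch. It is not the script mentioned above, only an illustration under some assumptions: the `Region` and `Gene` tuples, the function names, and the example coordinates are all made up, and coordinates are taken to be 0-based and half-open (start included, end excluded). Consecutive regions on the same chromosome are merged first, then each merged region is checked against every gene for overlap.

```python
from collections import namedtuple

# Hypothetical record types; a real KnownGene file has many more columns.
Region = namedtuple("Region", "chrom start end")
Gene = namedtuple("Gene", "name chrom start end")


def merge_consecutive(regions):
    """Merge regions on the same chromosome that touch or overlap."""
    merged = []
    for region in sorted(regions):
        last = merged[-1] if merged else None
        if last and last.chrom == region.chrom and region.start <= last.end:
            # Extend the previous region instead of starting a new one.
            merged[-1] = Region(last.chrom, last.start, max(last.end, region.end))
        else:
            merged.append(region)
    return merged


def genes_in_regions(regions, genes):
    """Map each merged region to the names of the genes overlapping it."""
    hits = {}
    for region in merge_consecutive(regions):
        hits[region] = [
            gene.name
            for gene in genes
            # Two half-open intervals overlap when each starts before the other ends.
            if gene.chrom == region.chrom
            and gene.start < region.end
            and gene.end > region.start
        ]
    return hits


# Made-up example data: two adjacent 1 Mb regions on chr1 and one on chr2.
example_regions = [
    Region("chr1", 0, 1000000),
    Region("chr1", 1000000, 2000000),
    Region("chr2", 5000000, 6000000),
]
example_genes = [
    Gene("geneA", "chr1", 900000, 1100000),   # straddles the two chr1 regions
    Gene("geneB", "chr1", 2500000, 2600000),  # outside every region
    Gene("geneC", "chr2", 5999000, 6010000),  # overlaps the end of the chr2 region
]
example_hits = genes_in_regions(example_regions, example_genes)
```

Merging first means a gene that straddles two adjacent regions, like `geneA` here, is reported once for the combined region rather than twice. Scanning every gene for every region is slow on a full annotation file; sorting the genes or using an interval tree would be the next step.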