dennogumi/content/post/2007-06-20-data-handling.markdown at cf5d4f6d7ff75c4f2c801659be3fc8d051fff3e2

Luca Beltrame 64b24842b8

Update all posts to not show the header text

2021-01-13 00:05:30 +01:00

author

categories

comments

date

header

slug

title

omit_header_text

disable_share

wordpress_id

einar

Science

true

2007-06-20T17:50:39Z

image_fullwidth
banner_other.jpg

data-handling

Data handling

true

264

As the people who read my science-related posts already know, [I'm in the middle of doing a meta-analysis]({{ site.url }}/2007/05/28/more-meta-analysis-difficulty/). That brought up a problem, so to speak, and it's related to annotations.

Probes on microarrays are referenced to genes (to over-simplify): usually these references are made with the latest version of the genome available. As the map of the genome is not static but a moving target, these annotations tend to become obsolete. And that unfortunately leads to problems when you compare experiments made in different time frames.

To be precise, the papers I'm taking the data from were published in 2005 and 2006, but the actual experiments were performed earlier. One uses the annotation data from the Affymetrix HG-U133A chip, which (along with the whole HG-U133 family) has been shown to be outdated by Dai and coworkers. The other uses Entrez Gene identifiers, but some IDs are no longer valid or overlap.

How can such a situation be solved? For some experiments there's not much to do, except perhaps reannotate the IDs using an automated system (I believe this is possible). For others (Affy chips), the paper I linked gives a possible (and effective: we've tested it in our group) solution: creating new "meta-probes" that reflect the updated annotations.
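To give an idea of what an automated reannotation could look like, here is a minimal Python sketch. It assumes you already have a history table that maps each discontinued identifier to its replacement (or to nothing, if the record was withdrawn). The function names, the dict layout and the IDs are all made up for illustration; this is not the method used in either paper.

```python
from collections import defaultdict


def resolve(gene_id, history):
    """Follow replacement links until reaching an ID that is still current.

    `history` is a hypothetical dict: discontinued ID -> replacement ID,
    or None when the record was withdrawn without a replacement.
    Returns None if the ID cannot be resolved.
    """
    seen = set()
    current = gene_id
    while current in history:
        if current in seen:
            return None  # circular chain in the table: treat as unresolvable
        seen.add(current)
        current = history[current]
        if current is None:
            return None  # withdrawn, nothing to map to
    return current


def reannotate(expression, history):
    """Remap a dataset keyed by old gene IDs onto current IDs.

    `expression` is a hypothetical dict: gene ID -> measured value.
    Returns (remapped, dropped): `remapped` collects all values that end up
    under the same current ID, so overlaps stay visible instead of being
    silently overwritten; `dropped` lists IDs with no valid replacement.
    """
    remapped = defaultdict(list)
    dropped = []
    for gene_id, value in expression.items():
        current = resolve(gene_id, history)
        if current is None:
            dropped.append(gene_id)
        else:
            remapped[current].append(value)
    return dict(remapped), dropped


# Made-up IDs: "100" was replaced by "200", then by "300"; "400" was withdrawn.
history = {"100": "200", "200": "300", "400": None}
expression = {"100": 1.5, "300": 2.5, "400": 0.7, "500": 3.0}
remapped, dropped = reannotate(expression, history)
```

Values are kept in lists on purpose: when two old IDs collapse onto the same current one, deciding how to combine them (average, keep one, discard both) is a choice you should make yourself, not something the remapping should hide.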

In any case, keep this in mind should you want to compare different microarray datasets.
