---
author: einar
categories:
- Science
comments: true
date: "2007-05-28T19:27:29Z"
header:
  image_fullwidth: banner_other.jpg
slug: more-meta-analysis-difficulty
title: More meta-analysis difficulty
omit_header_text: true
disable_share: true
wordpress_id: 253
---

**UPDATE:** Today I found out that J Brooks (the corresponding author of Zhao's paper) has agreed to send the data I needed. Thanks a lot!

When you do bioinformatics, you often test your own procedures not only on your own data, but also on publicly available datasets provided by other people. [As I stated previously]({{ site.url }}/2006/11/10/the-joy-of-meta-analysis/), that's what meta-analysis is. I'm doing a bit of that for my thesis, and I recently noticed that some datasets, while public, are far from complete.

I was looking at the data published by [Zhao _et al._](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=16318415&query_hl=1&itool=pubmed_docsum) today, and while it's a rather interesting dataset (177 samples of renal cell carcinoma compared to Human Universal Reference RNA), there is little or no information regarding the samples themselves. As I'm running analyses comparing different tumor grades, this information is essential for me. However, neither the supplementary materials nor the paper provide it. Basically, this makes the whole dataset a lot less useful than it could be.

On the same note, evaluating the [results by Jones _et al._](http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=16115910) presented a different problem: the aging annotation of the Affymetrix HG-U133A chip. Dai _et al._ have shown [an interesting approach to reannotation for several Affymetrix chips](http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=16284200), so I thought I could use that. However, while the supplementary materials provide normalized data, they do not include the CEL files. Reannotation regroups individual probes into new probe sets, so it needs the probe-level intensities that only the CEL files contain.
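To illustrate why already-summarized data is not enough, here is a minimal Python sketch. Everything in it is hypothetical: the probe names, intensities and both annotation maps are made up, and the median is only a stand-in for a real summarization method. It is not the procedure used by Dai _et al._, just the general idea of regrouping probes under an updated annotation.

```python
from collections import defaultdict
from statistics import median

# Hypothetical probe-level intensities, of the kind stored in a CEL file.
probe_intensity = {
    "p1": 100.0, "p2": 120.0, "p3": 110.0,
    "p4": 900.0, "p5": 950.0, "p6": 40.0,
}

# Hypothetical original annotation: two probe sets of three probes each.
original_map = {
    "p1": "setA", "p2": "setA", "p3": "setA",
    "p4": "setB", "p5": "setB", "p6": "setB",
}

# Hypothetical updated annotation: p6 is reassigned to a different gene.
updated_map = {
    "p1": "geneA", "p2": "geneA", "p3": "geneA",
    "p4": "geneB", "p5": "geneB", "p6": "geneC",
}


def summarize(intensities, probe_to_group):
    """Collapse probe intensities into one value per group (median as a stand-in)."""
    groups = defaultdict(list)
    for probe, group in probe_to_group.items():
        groups[group].append(intensities[probe])
    return {group: median(values) for group, values in groups.items()}


original_summary = summarize(probe_intensity, original_map)
updated_summary = summarize(probe_intensity, updated_map)

print(original_summary)
print(updated_summary)
```

With only `original_summary` at hand (which is roughly what the supplementary tables offer), there is no way to recover the value for `geneB` under the updated annotation, because the contribution of the reassigned probe can no longer be separated out.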

Personally, I think that all journals should make submission to databases such as ArrayExpress mandatory. MIAME was meant to ensure that enough information about a microarray experiment is available, and it's a shame that there are still so many hurdles when someone wants to make use of someone else's data.
  *[MIAME]: Minimum Information About a Microarray Experiment