---
author: einar
comments: true
date: 2006-07-08 08:16:30+00:00
layout: page
slug: the-power-of-the-shell
title: The power of the shell
wordpress_id: 86
categories:
- General
- Linux
- Science
header:
  image_fullwidth: banner_other.jpg
---

Yesterday I was trying to adjust some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data it was designed for). I had a big (116,000 rows) tab-delimited text file, and I needed only some of its columns.

<!-- more -->

Most people would just try Excel (ugh), but since it has way too many limitations (a worksheet of that era topped out at 65,536 rows), is unstable, and runs on Windows, I had to find another way. The _awk_ command was what I needed, given that my input was a text file:

[code]awk ' { print $1"\t"$7 } ' CAKI1_CNAT.txt > CAKI-1.txt
awk ' { print $1"\tchr"$2"\t"$3"\t"$3 } ' CAKI1_CNAT.txt > CAKI-1.ann[/code]

With two commands I created the two files I needed for the obscure software I was testing, without a single headache. The first command wrote a file with only columns 1 and 7, while the second used the first three columns, adding "chr" to the text in the second column and writing the third column twice.

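To see what those two commands do without the real array file, here is a minimal sketch on a two-row stand-in. The file names `sample.txt`, `sample.data` and `sample.ann` and all the values are made up for illustration, and the column meanings (probe ID, chromosome, position) are my assumption based on the "chr" prefix. The sketch also sets the field separator explicitly with `-F'\t'`, which the original commands did not do. With awk's default whitespace splitting, an empty field or a field containing a space would shift the column numbers.

```shell
# Tiny tab-delimited stand-in for CAKI1_CNAT.txt (hypothetical values).
# Assumed layout: probe ID, chromosome, position, three filler columns, a value.
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n'  > sample.txt
printf 'SNP_A-2\tX\t2500\ta\tb\tc\t1.8\n' >> sample.txt

# -F'\t' splits input on tabs only; OFS='\t' makes the commas in print emit tabs.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' sample.txt > sample.data

# Columns 1 to 3, with "chr" prefixed to column 2 and column 3 written twice.
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' sample.txt > sample.ann

cat sample.data
cat sample.ann
```

If the real file starts with a header row, adding the condition `NR > 1` before the braces would skip it. Whether it did is not something I can tell from the commands above.
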
A simpler and more elegant solution for the first file would probably have been _cut_, which splits on tabs by default:

[code]cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt[/code]

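The two approaches only match exactly when awk is told to split on tabs. The following sketch checks that on a hypothetical three-row file (`demo.txt` and its contents are invented for the comparison), where one row has an empty field and another has a space inside a field.

```shell
# Hypothetical input: row 2 has an empty second field, row 3 has a space in it.
printf 'id1\tA\tx\tx\tx\tx\t0.5\n'    > demo.txt
printf 'id2\t\tx\tx\tx\tx\t0.7\n'    >> demo.txt
printf 'id3\tC D\tx\tx\tx\tx\t0.9\n' >> demo.txt

# cut uses a tab as its default delimiter.
cut -f1,7 demo.txt > by_cut.txt

# awk with an explicit tab separator, and awk with default whitespace splitting.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' demo.txt > by_awk.txt
awk '{ print $1"\t"$7 }' demo.txt > by_default_awk.txt

# cmp -s is silent and reports equality only through its exit status.
cmp -s by_cut.txt by_awk.txt && echo "cut and tab-aware awk agree"
cmp -s by_cut.txt by_default_awk.txt || echo "default whitespace splitting differs"
```

For the second file _cut_ alone is not enough, since it can select columns but cannot add the "chr" prefix or repeat a column.
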
Either way, these are things that make my job easier. Try doing that with cmd.exe.