dennogumi.org-archive/_posts/2006-07-08-the-power-of-the-shell.markdown

---
author: einar
comments: true
date: 2006-07-08 08:16:30+00:00
layout: page
slug: the-power-of-the-shell
title: The power of the shell
wordpress_id: 86
categories:
- General
- Linux
- Science
header:
  image_fullwidth: banner_other.jpg
---

Yesterday I was trying to adjust some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data it was designed for). I had a big (116,000 rows) tab-delimited text file and I needed only some of its columns.

Most people would just reach for Excel (ugh), but since it has way too many limitations, is unstable, and runs on Windows, I had to find another way. Since my input was a text file, the awk command was what I needed:

[code]awk ' { print $1"\t"$7 } ' CAKI1_CNAT.txt > CAKI-1.txt
awk ' { print $1"\tchr"$2"\t"$3"\t"$3 } ' CAKI1_CNAT.txt > CAKI-1.ann
[/code]

With two commands, and without a single headache, I created the two files I needed for the obscure software I was testing. The first command writes a file with only columns 1 and 7. The second takes the first three columns, prefixes the text in the second column with "chr", and writes the third column twice.
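To make the column shuffling concrete, here is a minimal sketch on a made-up three-row file. The file names (`sample.txt`, `sample_values.txt`, `sample.ann`) and all the values are invented for illustration, since the real layout of CAKI1_CNAT.txt is not shown here. The sketch also sets the field separator to a tab explicitly, which the commands above do not do: by default awk splits on any run of whitespace, so this only matters if a field contains spaces.

```shell
# Hypothetical sample data: 7 tab-separated columns (all values are invented)
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n' > sample.txt
printf 'SNP_A-2\t1\t2500\ta\tb\tc\t1.9\n' >> sample.txt
printf 'SNP_A-3\tX\t4000\ta\tb\tc\t3.0\n' >> sample.txt

# Keep only columns 1 and 7, reading and writing tab-separated fields
awk 'BEGIN { FS = OFS = "\t" } { print $1, $7 }' sample.txt > sample_values.txt

# Columns 1 to 3, with "chr" prefixed to column 2 and column 3 written twice
awk 'BEGIN { FS = OFS = "\t" } { print $1, "chr" $2, $3, $3 }' sample.txt > sample.ann

# Show both results
cat sample_values.txt sample.ann
```

Setting `OFS` means the commas in `print` emit tabs, so the `"\t"` strings do not have to be spelled out between every pair of fields.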

A simpler and more elegant solution for the first file would probably have been cut, which splits on tabs by default: [code]cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt[/code]
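As a quick sanity check that the two approaches agree, here is a small sketch on invented data (the file names `demo.txt`, `by_cut.txt` and `by_awk.txt` are hypothetical). One caveat it illustrates: cut splits on tabs only, while awk splits on any whitespace unless told otherwise, so the outputs are only guaranteed to match when awk is given `-F'\t'` or when no field contains spaces.

```shell
# Hypothetical tab-separated data; the second row has a space inside field 2
printf 'id1\tfoo\tx\nid2\tfoo bar\ty\n' > demo.txt

# cut uses the tab as its default delimiter
cut -f1,3 demo.txt > by_cut.txt

# Tell awk to split on tabs too, so "foo bar" stays one field
awk -F'\t' '{ print $1 "\t" $3 }' demo.txt > by_awk.txt

# cmp is silent and returns 0 when the two files are byte-for-byte equal
cmp by_cut.txt by_awk.txt && echo "identical"
```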

Either way, these are things that make my job easier. Try doing that with cmd.exe.