dennogumi/content/post/2006-07-08-the-power-of-the-shell.markdown
Luca Beltrame 64b24842b8
Update all posts to not show the header text
2021-01-13 00:05:30 +01:00

---
author: einar
categories:
- General
- Linux
- Science
comments: true
date: 2006-07-08T08:16:30Z
header:
  image_fullwidth: banner_other.jpg
slug: the-power-of-the-shell
title: The power of the shell
omit_header_text: true
disable_share: true
wordpress_id: 86
---

Yesterday I was trying to adjust some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data it was designed for). I had a big (116,000 rows) tab-delimited text file and I only needed some of its columns.

Most people would just reach for Excel (ugh), but it has way too many limitations, is unstable, and runs on Windows, so I had to find another way. Since my input was a plain text file, the awk command was just what I needed:
[code]
awk ' { print $1"\t"$7 } ' CAKI1_CNAT.txt > CAKI-1.txt
awk ' { print $1"\tchr"$2"\t"$3"\t"$3 } ' CAKI1_CNAT.txt > CAKI-1.ann
[/code]

With two commands, and without a single headache, I created the two files I needed for the obscure software I was testing. The first command wrote a file with only columns 1 and 7. The second took the first three columns, prefixed the text in the second column with "chr", and wrote the third column twice.
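To see what the two commands do without a 116,000-row file at hand, here is a minimal sketch on a made-up three-row input. The file names, the row contents, and the temporary directory are all hypothetical stand-ins, not the real CAKI1_CNAT.txt. The sketch also passes `-F'\t'` and sets `OFS`, which the original commands did not do, so that awk splits and joins strictly on tabs.

```shell
# Hypothetical stand-in for the real input: 3 rows, 7 tab-separated columns.
workdir=$(mktemp -d)
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n' >  "$workdir/sample.txt"
printf 'SNP_A-2\t1\t2500\ta\tb\tc\t1.9\n' >> "$workdir/sample.txt"
printf 'SNP_A-3\tX\t4000\ta\tb\tc\t3.0\n' >> "$workdir/sample.txt"

# Keep columns 1 and 7. -F sets the input separator, OFS the output separator.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' \
    "$workdir/sample.txt" > "$workdir/sample.dat"

# Column 1, "chr" glued in front of column 2, then column 3 written twice.
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' \
    "$workdir/sample.txt" > "$workdir/sample.ann"

# Show both results.
cat "$workdir/sample.dat" "$workdir/sample.ann"
```

Setting the separators explicitly matters only if a field can contain spaces, since awk splits on any run of whitespace by default. For purely tab-delimited data without spaces, the shorter form works just as well.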

A simpler and more elegant solution for the first file would probably have been cut, which splits on tabs by default: [code]cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt[/code]
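As a quick sanity check that cut and awk really produce the same file here, the following sketch runs both on a tiny invented input and compares the results byte for byte. The file names and data are assumptions made up for illustration.

```shell
# Hypothetical two-row, seven-column, tab-delimited input.
tmp=$(mktemp -d)
printf 'id1\t1\t100\tx\ty\tz\t0.5\nid2\t2\t200\tx\ty\tz\t1.5\n' > "$tmp/in.txt"

# Same selection done two ways: fields 1 and 7.
cut -f1,7 "$tmp/in.txt" > "$tmp/by_cut.txt"
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' "$tmp/in.txt" > "$tmp/by_awk.txt"

# cmp -s is silent and only sets the exit status.
if cmp -s "$tmp/by_cut.txt" "$tmp/by_awk.txt"; then
    result=identical
else
    result=different
fi
echo "$result"
```

cut only selects fields, and it always writes them in the order they appear in the input. The second file, with its "chr" prefix and repeated column, still needs awk.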

Either way, these are things that make my job easier. Try doing that with cmd.exe.