dennogumi/content/post/2006-07-08-the-power-of-the-shell.markdown at 64b24842b8561e9333952f43be39623f1b8551fb

Luca Beltrame 64b24842b8

Update all posts to not show the header text

2021-01-13 00:05:30 +01:00

---
author: einar
categories:
- General
- Linux
- Science
comments: true
date: 2006-07-08T08:16:30Z
header:
  image_fullwidth: banner_other.jpg
slug: the-power-of-the-shell
title: The power of the shell
omit_header_text: true
disable_share:
wordpress_id:
---

Yesterday I was trying to adjust some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data the program was designed for). I had a big (116,000 rows) tab-delimited text file and I needed only some of its columns.

Most people would just reach for Excel (ugh), but since it has far too many limitations, is unstable, and runs on Windows, I had to find another way. Since my input was a text file, the awk command was what I needed:

[code]
awk ' { print $1"\t"$7 } ' CAKI1_CNAT.txt > CAKI-1.txt
awk ' { print $1"\tchr"$2"\t"$3"\t"$3 } ' CAKI1_CNAT.txt > CAKI-1.ann
[/code]

With two commands, and without a single headache, I created the two files I needed for the obscure software I was testing. The first command created a file with only columns 1 and 7, while the second created one with the first three columns, prefixing "chr" to the text in the second column and printing the third column twice.
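One caveat with the awk commands above: by default awk splits fields on any run of whitespace, so a column containing a space would shift the field numbers. Here is a minimal sketch that sets the separators to tab explicitly. The file `sample.txt` and its contents are made up for illustration; they are not the real `CAKI1_CNAT.txt`.

```shell
# Hypothetical sample data: 7 tab-separated columns, not the real array file.
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n' > sample.txt
printf 'SNP_A-2\t2\t2500\ta\tb\tc\t1.9\n' >> sample.txt

# -F'\t' splits the input on tabs only; OFS makes "," in print emit a tab.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' sample.txt > sample_values.txt

# Same idea for the annotation file: prefix "chr" to column 2, repeat column 3.
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' sample.txt > sample.ann
```

Setting `OFS` also avoids having to splice `"\t"` between every pair of fields by hand.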

A simpler and more elegant solution for the first file would probably have been cut:

[code]
cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt
[/code]
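To check that the two approaches agree, here is a small sketch on a made-up file (`demo.txt` and its contents are assumptions for illustration only). It compares the output of cut with that of a tab-aware awk call:

```shell
# Hypothetical two-line, tab-delimited input with 7 columns.
printf 'id\tchrom\tpos\ta\tb\tc\tlog2\n' > demo.txt
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n' >> demo.txt

# cut uses tab as its default delimiter, so no extra options are needed.
cut -f1,7 demo.txt > by_cut.txt

# The awk version, with the tab separator set explicitly.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' demo.txt > by_awk.txt

# cmp -s is silent and exits 0 only when the files are byte-identical.
cmp -s by_cut.txt by_awk.txt && echo "identical"
```

Note that cut always emits fields in their original order and cannot repeat or modify them, so the second file (with the "chr" prefix and the doubled third column) still calls for awk.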

Either way, these are things that make my job easier. Try doing that with cmd.exe.
