---
author: einar
comments: true
date: 2006-07-08 08:16:30+00:00
layout: page
slug: the-power-of-the-shell
title: The power of the shell
wordpress_id: 86
categories:
- General
- Linux
- Science
header:
image_fullwidth: banner_other.jpg
---
Yesterday I was trying to adapt some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data it was designed for). I had a big (116,000 rows) tab-delimited text file, and I needed only some of its columns.
<!-- more -->
Most people would just reach for Excel (ugh), but it has way too many limitations, it is unstable, and it runs on Windows, so I had to find another way. Since my input was a plain text file, the _awk_ command was just what I needed:
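[code]# Columns 1 and 7; -F and OFS make awk split and join on tabs only
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' CAKI1_CNAT.txt > CAKI-1.txt
# Column 1, "chr" + column 2, then column 3 twice
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' CAKI1_CNAT.txt > CAKI-1.ann[/code]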
With two commands, and without a single headache, I created the two files I needed for the obscure software I was testing. The first command writes a file with only columns 1 and 7; the second writes the first three columns, prefixing the second with "chr" and printing the third twice.
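To see what the two commands produce, here is a minimal sketch on a made-up three-row file. The name `sample.txt` and every value in it are invented for illustration (the real file certainly looks different), and `-F'\t'` with `OFS` is used so that awk splits and joins on tabs only:

```shell
# Hypothetical 7-column, tab-delimited sample; all values are invented.
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.1\n'  > sample.txt
printf 'SNP_A-2\t1\t2500\ta\tb\tc\t1.9\n' >> sample.txt
printf 'SNP_A-3\tX\t300\ta\tb\tc\t3.0\n'  >> sample.txt

# Keep only columns 1 and 7.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' sample.txt > sample.values

# Column 1, "chr" glued to column 2, then column 3 printed twice.
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' sample.txt > sample.ann

# Show both results.
cat sample.values sample.ann
```

Each output file has one line per input row, so a quick `wc -l` on the input and the outputs is a cheap sanity check on a 116,000-row file.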
A simpler and more elegant solution for the first file would probably have been _cut_, which splits on tabs by default:
[code]cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt[/code]
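One difference between the two tools is worth knowing: `cut -f` splits on tabs only, whereas awk without `-F` splits on any run of whitespace. A small sketch of the effect, using a made-up file `spaced.txt` whose first field contains a space (an assumption for illustration, not something I saw in my data):

```shell
# Hypothetical two-column file; the first field contains a space.
printf 'probe A\t1.5\n' > spaced.txt

# cut splits on tabs only, so both fields come through intact.
cut -f1,2 spaced.txt

# awk's default splitting treats the space as a separator: $2 becomes "A".
awk '{ print $1"\t"$2 }' spaced.txt

# Telling awk to split on tabs gives the same result as cut.
awk -F'\t' '{ print $1"\t"$2 }' spaced.txt
```

For files where no field contains spaces and no field is empty, both approaches give the same result.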
Either way, these are things that make my job easier. Try doing that with cmd.exe.