The power of the shell

Yesterday I was trying to adjust some files so that a program could use Affymetrix SNP array data (instead of the arrayCGH data it was designed for). I had a big (116,000 rows) tab-delimited text file and I needed only some of its columns.

Most people would just reach for Excel (ugh), but it has way too many limitations, it is unstable, and it runs on Windows, so I had to find another way. Since my input was a text file, the awk command was just what I needed:
[code]awk '{ print $1"\t"$7 }' CAKI1_CNAT.txt > CAKI-1.txt
awk '{ print $1"\tchr"$2"\t"$3"\t"$3 }' CAKI1_CNAT.txt > CAKI-1.ann[/code]

With two commands I created the two files I needed for the obscure software I was testing, without a single headache. The first command creates a file with only columns 1 and 7. The second writes four columns: column 1, column 2 with “chr” prepended, and column 3 twice.
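One caveat: by default awk splits fields on any run of whitespace, so a field containing a space would shift the column numbers. Here is a minimal sketch of a stricter variant, assuming the input really is tab-delimited throughout. The file sample.txt and its values are made up for illustration and stand in for CAKI1_CNAT.txt.

```shell
# Build a tiny made-up, tab-delimited stand-in for CAKI1_CNAT.txt (values are invented).
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.0\n' > sample.txt
printf 'SNP_A-2\tX\t2500\ta\tb\tc\t1.5\n' >> sample.txt

# -F'\t' splits the input on tabs only; OFS='\t' makes each comma in print emit a tab.
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' sample.txt > sample_values.txt

# Same idea for the annotation file: column 1, "chr" + column 2, then column 3 twice.
awk -F'\t' -v OFS='\t' '{ print $1, "chr" $2, $3, $3 }' sample.txt > sample.ann
```

Setting OFS also keeps the print statement readable, since the tabs no longer have to be spelled out between every pair of fields.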

A simpler and more elegant solution for the first file would probably have been cut, which splits on tabs by default:
[code]cut -f1,7 CAKI1_CNAT.txt > CAKI-1.txt[/code]
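cut only covers the first file, though. It always writes the selected fields in their original order, once each, and it cannot add text, so the second file still needs awk. A small sketch of that difference, using a made-up one-line file (demo.txt and its values are hypothetical):

```shell
# One invented tab-delimited row standing in for the real data.
printf 'SNP_A-1\t1\t1000\ta\tb\tc\t2.0\n' > demo.txt

# cut and awk give the same result when you just select columns 1 and 7.
cut -f1,7 demo.txt > by_cut.txt
awk -F'\t' -v OFS='\t' '{ print $1, $7 }' demo.txt > by_awk.txt
cmp by_cut.txt by_awk.txt

# Asking cut for "3,1" still writes column 1 first: the list selects fields, it does not order them.
cut -f3,1 demo.txt > cut_order.txt

# awk prints fields in whatever order (and as many times) as you list them.
awk -F'\t' -v OFS='\t' '{ print $3, $1 }' demo.txt > awk_order.txt
```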

Either way, these are things that make my job easier. Try doing that with cmd.exe.
