Tips and tricks *Nix

A collection of quick-and-dirty tricks for the standard command-line tools in *Nix and Cygwin. I use these snippets regularly, so I am copying them here as a central repo for the ageing brain.


Here's the GAWK manual, for reference…

Print unique lines based on the first column (from Stack Overflow):

awk -F, '!seen[$1]++' Input.csv
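
To see what the one-liner does, here is a minimal sketch with a made-up three-line sample instead of Input.csv. The expression seen[$1]++ is 0 (false) the first time a value turns up in column 1, so !seen[$1]++ is true only for the first occurrence, and awk prints just that line.

```shell
# Hypothetical sample data; the first comma-separated column is the key.
sample='a,1
b,2
a,3'

# Keep only the first line seen for each distinct value in column 1.
unique=$(printf '%s\n' "$sample" | awk -F, '!seen[$1]++')
printf '%s\n' "$unique"
```

The second "a" line is dropped because its key has already been counted once.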

Filter column-based text file:

awk '{ if ($4 == "TheString") {print $0 } }' INFILE > OUTFILE
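
The same filter can be written more briefly, because awk prints the whole line by default when a pattern has no action. A small sketch with invented sample rows (the row names and values are placeholders):

```shell
# Hypothetical whitespace-separated sample; column 4 is the one being tested.
sample='r1 x y TheString
r2 x y Other
r3 x y TheString'

# A bare pattern with no action prints every matching line.
filtered=$(printf '%s\n' "$sample" | awk '$4 == "TheString"')
printf '%s\n' "$filtered"
```

Only the r1 and r3 rows survive, the same result the longer if/print form gives.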

Pass variable from bash into awk (Stackoverflow)

The most portable way to do it is the -v option. Put a space after -v (awk -v var=, not awk -vvar=), otherwise it will be less portable. Note that awk interprets escape sequences in the assigned value, which is why the \n below comes out as a line break:
variable="line one\nline two"
awk -v var="$variable" 'BEGIN {print var}'
# Output:
# line one
# line two
# --- Multiple vars
# This should be compatible with most awk implementations, and the variables are
# available in the BEGIN block as well:
awk -v a="$var1" -v b="$var2" 'BEGIN {print a,b}'
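
Because -v processes backslash escapes, a value containing literal backslashes (a Windows path under Cygwin, say) gets mangled. One alternative, sketched here with a made-up value, is to hand the string over through the environment and read it from awk's ENVIRON array, which leaves it untouched:

```shell
# Hypothetical value containing backslashes that should stay literal.
variable='C:\new\table'

# With -v, awk turns \n into a newline and \t into a tab.
via_v=$(awk -v var="$variable" 'BEGIN { print var }')

# Passed through the environment, the value arrives unchanged.
via_env=$(var="$variable" awk 'BEGIN { print ENVIRON["var"] }')

printf '%s\n' "$via_env"
```

Which of the two is wanted depends on whether the escapes are meant to be expanded.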

Removing/adding whitespace

Sometimes one just wants to remove the line breaks in a file and have all the data on one line (e.g. as input for a "for" loop):

echo $(<file.txt) | tr -d ' '

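If the tr part of the expression is removed, the unquoted command substitution is split into words and echo joins them again with a single space each, so the former lines end up separated by spaces (from Stack Overflow).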
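
A small sketch of both variants, using a throwaway temporary file in place of file.txt. It uses $(cat ...) rather than bash's $(<...) so that it also runs in a plain POSIX shell; that substitution is my assumption, not part of the original snippet.

```shell
# Throwaway three-line file standing in for file.txt.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Unquoted substitution: line breaks become single spaces.
joined=$(echo $(cat "$tmp"))

# Adding tr -d ' ' removes those spaces as well.
packed=$(echo $(cat "$tmp") | tr -d ' ')

rm -f "$tmp"
printf '%s\n%s\n' "$joined" "$packed"
```

Bear in mind that the unquoted form also expands wildcards such as * in the data, so it is best kept for simple input.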

Using tr to replace spaces with tabs:

tr -s ' ' '\t' < INFILE > OUTFILE
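
A quick check of the space-to-tab translation on a made-up input line. With -s and two sets, tr first translates every space to a tab and then squeezes each run of tabs into one, so a double space still yields a single tab:

```shell
# Hypothetical input with a double space between the first two fields.
tabbed=$(printf 'a  b c\n' | tr -s ' ' '\t')

# The same line written with explicit single tabs, for comparison.
expected=$(printf 'a\tb\tc')

[ "$tabbed" = "$expected" ] && echo "fields are tab-separated"
```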

Interrogating EBCDIC headers of SEG-Y files

When trying to plot SEG-Y files with GMT, one needs to know something about the data. Using Unix's dd command, one can interrogate the 3200-byte EBCDIC text header at the start of the file (here $1 is the SEG-Y file passed to a script):

dd if="$1" conv=ascii ibs=3200 count=1 | awk 'BEGIN{RS="C[0-9 ][0-9]"}{printf "C%2d%s\n",NR,$0}'
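
A variant, not the command above: since the text header is laid out as 40 cards of 80 characters, one can also let fold break the decoded block into lines. The sketch below builds a synthetic header first so that it runs without a real SEG-Y file. The card text is invented, and it assumes a dd that supports conv=ebcdic and conv=ascii (GNU and BSD dd do).

```shell
# Build a synthetic 3200-byte header: 40 cards of 80 characters, no newlines.
# (Hypothetical content; a real SEG-Y file would supply this block.)
hdr=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 40; i++) printf "%-80s", sprintf("C%2d TEST CARD", i) }' > "$hdr.ascii"
dd if="$hdr.ascii" of="$hdr" conv=ebcdic 2>/dev/null

# Decode the first 3200 bytes and break them into 80-character cards.
cards=$(dd if="$hdr" conv=ascii ibs=3200 count=1 2>/dev/null | fold -w 80)
first_card=$(printf '%s\n' "$cards" | sed -n '1p')
card_count=$(printf '%s\n' "$cards" | awk 'END { print NR }')

rm -f "$hdr" "$hdr.ascii"
printf '%s\n' "$cards"
```

This relies on the fixed 80-column layout rather than on the "C 1" to "C40" markers, so it also copes with headers where those markers are missing.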