Here are my responses to a few help-requests at unix.ittoolbox.com:
1. The original request:
"hi
i have string like below
aabccdee
how to extract just the unique values from above..
i want my output to be
abcde
"
My solution:
[shan@ipc4 ~]$ sed -n 's@\([a-zA-Z]\)\1*@\1@pg' < input_file > txt.output
or (a smart guy's tr solution):
echo "aabccdee" | tr -s '[:alpha:]'
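Both commands above only squeeze runs of adjacent repeated letters, which is enough for the sample string. If "unique" is meant to cover letters that repeat later in the string as well (an assumption on my part, the request does not say), something like the sketch below could work. The function name unique_chars is made up for illustration.

```shell
# unique_chars: hypothetical helper name, not from the original request.
unique_chars() {
    # fold -w1 puts each character on its own line,
    # awk keeps only the first occurrence of each line,
    # tr -d '\n' joins the surviving characters back into one string.
    printf '%s\n' "$1" | fold -w1 | awk '!seen[$0]++' | tr -d '\n'
    echo
}

unique_chars "aabccdeea"   # prints: abcde
```

The order of first appearance is kept, so no sorting is involved. Note that fold -w1 is only safe for single-byte characters.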
2. The original request:
i have a around ~15k csv files created in the month of nov,
each file has around 100 fields, i have to get the unique of 10th field.
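My solution (tried out here on files matching "*in", counting the 3rd space-separated field; for the CSV files in the request, change the pattern to "*.csv", the separator to -F"," and the field to $10):
find . -name "*in" -exec awk -F" " '{print $3}' {} \; | tr -sc 0-9A-Za-z '\012' | sort | uniq -c | sort -nr
sample output: (count : field value)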
43 Makefile
38 YPPUSH
36 YPDIR
36 in
19 YPSRCDIR ...
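For the CSV case as actually requested, the same pipeline might look like the sketch below. It assumes the files end in .csv and that no field contains a quoted comma (awk -F',' does not understand CSV quoting). The function name count_tenth_field is hypothetical.

```shell
# count_tenth_field: hypothetical helper; the directory defaults to ".".
count_tenth_field() {
    dir=${1:-.}
    # "{} +" hands many files to each awk run instead of starting
    # one awk process per file, which matters with ~15k files.
    # Lines with fewer than 10 fields are skipped.
    find "$dir" -type f -name '*.csv' \
        -exec awk -F',' 'NF >= 10 { print $10 }' {} + |
        sort | uniq -c | sort -nr
}

# Usage: count_tenth_field /path/to/csv/dir
```

The output has the same shape as the sample above: a count followed by the distinct value, most frequent first.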
3. The original request:
I have a file of the following format:
gi|*170079665*|ref|YP_001728985.1| gi|*15829256*|ref|NP_308029.1|
99.76 820 2 0 1 820 1 820 0.0 1645
gi|*170079666*|ref|YP_001728986.1| gi|*15829257*|ref|NP_308030.1|
99.35 310 2 0 1 310 1 310 4e-177 613
:
:
:
:
For each of the lines in this file I want to create a file having the two
numbers in bold in this format :
*170079665
**15829256*
My solution: (there are simple ones with perl/python... but, hey, shell is great!)
$ mkdir tmp; export j=1
$ for i in $(cut -d'|' -f2,6 < input-file); do
echo $(echo "$i" | cut -d"|" -f1) >> ./tmp/file.$j
echo '*'$(echo "$i" | cut -d"|" -f2) >> ./tmp/file.$j
let j=$j+1
done
input-file:
gi|*170079665*|ref|YP_001728985.1| gi|*15829256*|ref|NP_308029.1| 99.76 820 2 0 1 820 1 820 0.0 1645
gi|*170079800*|ref|YP_001728235.1| gi|*15829846*|ref|NP_308029.1| 99.76 820 2 0 1 820 1 820 0.0 1645
output:
$ ls tmp
file.1 file.2 file.3 file.4
$ cat ./tmp/file.1
*170079665*
**15829256*
$ cat ./tmp/file.2
*170079800*
**15829846* ...
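The for loop above relies on word splitting, so a stray space or a glob match in the extracted fields could trip it up. A variant that reads the file line by line might look like the sketch below. It assumes the same layout as the sample input (the two IDs are the 2nd and 6th "|"-separated fields); the function name split_ids and the variable names are made up for illustration.

```shell
# split_ids: hypothetical helper. $1 = input file, $2 = output directory.
split_ids() {
    mkdir -p "$2"
    n=1
    # IFS='|' makes read split each line on pipes only;
    # fields 2 and 6 carry the two IDs, the rest is ignored.
    while IFS='|' read -r f1 first f3 f4 f5 second rest; do
        # Second line gets the extra leading '*', as in the output above.
        printf '%s\n*%s\n' "$first" "$second" > "$2/file.$n"
        n=$((n + 1))
    done < "$1"
}

# Usage: split_ids input-file ./tmp
```

Each file is written with ">" rather than ">>", so rerunning it does not keep appending to old output.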
cheers!