Suppose a directory holds three files, file 1, file 2 and file 3, all with the same header names. Is it possible in awk to compare them and write out the frequency of occurrence of the differences?

file 1

    c1  c2  c3  c4
    d   d   d   d
    d   d   d   d
    d   d   d   d

file 2

    c1  c2  c3  c4
    d   d   d   d
    d   v   d   d
    d   d   d   d

file 3

    c1  c2  c3  c4
    d   d   r   d
    d   f   d   d
    d   d   d   d

Step 1: compare file 1 and file 2.

temp.output

    c1  c2  c3  c4
    0   0   0   0
    0   1   0   0
    0   0   0   0

Then compare file 2 and file 3 and overwrite temp.output with the updated frequency.

final.output

    c1  c2  c3  c4
    0   0   1   0
    0   2   0   0
    0   0   0   0

The original directory may contain many files, and I want each of them processed in an orderly manner, i.e. file1.txt with file2.txt, then file2.txt with file3.txt.
Let me suggest converting the input files to lines. With this, you can apply awk easily.
The paste -s <file> command is your ally. Below you can see how to sort the files and, once sorted, convert them to lines:

    $ cat file1.txt
    c1  c2  c3  c4
    d   d   d   d
    d   d   d   d
    d   d   d   d
    $ ls
    file1.txt  file2.txt  file3.txt
    $ ls | sort
    file1.txt
    file2.txt
    file3.txt
    $ ls | sort | xargs -L 1 -I {} /bin/bash -c 'echo -n {}" "; paste -s {}'
    file1.txt c1  c2  c3  c4  d  d  d  d  d  d  d  d  d  d  d  d
    file2.txt c1  c2  c3  c4  d  d  d  d  d  v  d  d  d  d  d  d
    file3.txt c1  c2  c3  c4  d  d  r  d  d  f  d  d  d  d  d  d
    $

Once the files are lines, you can use awk to iterate over the fields (NF tells you how many there are). We will use several rules.
For every line, compare whether field i differs from the previously saved value and increment the result accordingly. Skip the comparison for the first line with the (NR != 1) selector.

    (NR != 1) { for (i = 1; i <= NF; i++) { if (last[i] != $i) { result[i]++; } } }

In the same awk call, include a rule that updates the array keeping the last values:

    { for (i = 1; i <= NF; i++) { last[i] = $i } }

Finally, print out the file name and the status of the results:

    { printf("%s", $1); for (i = 1; i <= NF; i++) { printf(" %d", result[i]) } print "" }

Here is the whole command:

    $ ls | sort | xargs -L 1 -I {} /bin/bash -c 'echo -n {}" "; paste -s {}' | awk '(NR != 1) { for (i = 1; i <= NF; i++) { if (last[i] != $i) { result[i]++; } } } { for (i = 1; i <= NF; i++) { last[i] = $i } } { printf("%s", $1); for (i = 1; i <= NF; i++) { printf(" %d", result[i]) } print "" }'
    file1.txt 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    file2.txt 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
    file3.txt 2 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0
    $

This output starts with the filename, followed by the accumulated differences in:
- the filename (always different)
- the column names (c1, c2, c3, c4 are always the same)
- then the 12 values. The relevant data starts at field 7.
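
The three counting rules are easier to follow when spread over several lines. The sketch below wraps them in a shell function; the name count_changes is hypothetical and not part of the original command. It reads the flattened lines on standard input and prints the same running counters as the one-liner.

```shell
# count_changes: hypothetical wrapper around the three awk rules above.
# Input: one line per file, "<filename> <field> <field> ...".
# Output: the filename followed by one running change counter per field.
count_changes() {
    awk '
        # Rule 1: from the second line on, count every field that differs
        # from the value saved for the previous line.
        NR != 1 {
            for (i = 1; i <= NF; i++)
                if (last[i] != $i)
                    result[i]++
        }
        # Rule 2: remember the current values for the next comparison.
        {
            for (i = 1; i <= NF; i++)
                last[i] = $i
        }
        # Rule 3: print the filename and the counters accumulated so far.
        {
            printf("%s", $1)
            for (i = 1; i <= NF; i++)
                printf(" %d", result[i])
            print ""
        }
    '
}
```

For example, piping the three lines "a x y", "b x z" and "c w z" into count_changes prints "a 0 0 0", "b 1 0 1" and "c 2 1 1": the first counter tracks the name, which changes on every line.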
 
You can format this again with awk, inserting new lines when appropriate:

    awk '{ print ""; printf("%s", $1); for (i = 7; i <= NF; i++) { if (((i - 7) % 4) == 0) print "" ; printf(" %d", $i) } print "" }'

Here you have the complete run:

    $ ls | sort | xargs -L 1 -I {} /bin/bash -c 'echo -n {}" "; paste -s {}' | awk '(NR != 1) { for (i = 1; i <= NF; i++) { if (last[i] != $i) { result[i]++; } } } { for (i = 1; i <= NF; i++) { last[i] = $i } } { printf("%s", $1); for (i = 1; i <= NF; i++) { printf(" %d", result[i]) } print "" }' | awk '{ print ""; printf("%s", $1); for (i = 7; i <= NF; i++) { if (((i - 7) % 4) == 0) print "" ; printf(" %d", $i) } print "" }'

    file1.txt
     0 0 0 0
     0 0 0 0
     0 0 0 0

    file2.txt
     0 0 0 0
     0 1 0 0
     0 0 0 0

    file3.txt
     0 0 1 0
     0 2 0 0
     0 0 0 0
    $
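
As a variant, and not the method used above, the flattening, counting and formatting can be folded into one awk call that reads the files directly. This is only a sketch: the function name diff_grid and its cols argument are hypothetical, it assumes every file has exactly one header line, and unlike the pipeline above it skips the header fields instead of counting them.

```shell
# diff_grid: hypothetical single-pass variant of the pipeline above.
# Usage: diff_grid <cols> <file>...
# Assumption: each file has one header line, then rows of <cols> values.
diff_grid() {
    cols=$1; shift
    awk -v cols="$cols" '
        # Print the counters accumulated up to the file just finished,
        # breaking the line after every <cols> values.
        function print_grid(   k) {
            if (name == "") return
            print name
            for (k = 1; k <= n; k++)
                printf("%d%s", count[k], (k % cols == 0 || k == n) ? "\n" : " ")
        }
        # First line of each file: emit the previous grid, skip the header.
        FNR == 1 { print_grid(); name = FILENAME; n = 0; nfile++; next }
        # Data lines: number the cells 1..n and compare each one with the
        # value the previous file had in the same position.
        {
            for (i = 1; i <= NF; i++) {
                n++
                if (nfile > 1 && last[n] != $i) count[n]++
                last[n] = $i
            }
        }
        END { print_grid() }
    ' "$@"
}
```

Called as diff_grid 4 file1.txt file2.txt file3.txt, it prints each filename followed by its grid of counters. Passing a glob such as file*.txt also works, since the shell expands globs in sorted order, which replaces the ls | sort step.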