bash - Using sed or awk to extract the lines between 2 patterns


I am new to shell scripting. I tried many things from old threads to retrieve a message from a log file, but failed to get the desired output.

Below is a sample of how the message looks:

00:31:54.184 mnk  4155809232 (monklog:391): result of mapping : s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369
00:31:54.184 mnk  4155809232 (monklog:391): .05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt
00:31:54.184 mnk  4155809232 (monklog:391): /sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa
00:31:54.184 mnk  4155809232 (monklog:391):  , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||
00:31:54.184 mnk  4155809232 (monklog:391): ||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
00:31:54.184 mnk  4155809232 (monklog:391): s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369
00:31:54.184 mnk  4155809232 (monklog:391): .05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt
00:31:54.184 mnk  4155809232 (monklog:391): /sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa
00:31:54.184 mnk  4155809232 (monklog:391):  , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||
00:31:54.184 mnk  4155809232 (monklog:391): ||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
00:31:54.184 mnk  4155809232 (monklog:406): ||29/04/2015 01:31:00|||||||||^m

I need the message starting at s| and ending just before ^m.

I tried these commands:

awk '/s|/{flag=1}/|^m/{flag=0}flag' $log  > output2.txt
sed -n '/: s|/,/|^m/p' $log > output.txt

Both give me the input back unchanged as output. Please help. Thanks.


Expected output:

s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369.05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt/sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||
s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369.05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt/sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||

Each set should come out on a single line.

A sed-based approach:

$ sed -n '/s\|/,/\^m/{
            /s\|/  {s/.*s|/s|/};
                   {s/.*[0-9]\+): //;H}
            /\^m/  {g;s/\n//g;s/\^m.*//p;};
         }' file.log
s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369.05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt/sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||s|aaaaa|bbbbb|32|d|m|28/04/2015|ccc|33208369.05|28/04/2015|0428|c|105840.|dddd|fffff|9511705558|/ctc/097/eeeeee eee|/pt/sc/tt/12/sn/eee eeeeeee/ceey/ee -eee aa aaaa s.a.b. de c.v./dc/aaaaa , aaaaa aaaa/na/aaaaa,/sk/aaaaa|d|m|28/04/2015|mxn|11111.17||||||||ssssss|ssssss|qwerrt-aaaaaa|ggggggg||||||||||||||||00|||||||||

Explanation:

  1. For lines between s| and ^m - '/s\|/,/\^m/{
  2. If the line contains s|, remove everything up to s| - /s\|/{s/.*s|/s|/};
  3. Remove everything up to <digits>): and append the remaining string to the hold space - {s/.*[0-9]\+): //;H}. This removes the prefix text 00:31:54.184 mnk 4155809232 (monklog:391):
  4. For lines matching ^m, copy the entire hold space into the pattern space, remove the newlines (which were added by the H command), remove everything from ^m onward, and print - /\^m/ {g;s/\n//g;s/\^m.*//p;};
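
The hold-space mechanics in steps 3 and 4 can be tried in isolation. This is only a minimal sketch on a made-up three-line input (a, b, c), not the log from the question; the variable name joined is just for illustration:

```shell
# Toy input (hypothetical, not from the log): three lines a, b, c.
# H appends each line to the hold space, preceded by a newline.
# On the last line ($), g copies the hold space into the pattern space,
# s/\n//g deletes the embedded newlines, and p prints the joined result.
joined=$(printf 'a\nb\nc\n' | sed -n 'H;${g;s/\n//g;p;}')
echo "$joined"   # prints: abc
```

With lowercase h in place of H, each line would overwrite the hold space instead of appending to it, so only the last line would survive.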

Similar logic using awk:

$ awk -v FS="[0-9]+): " '
     /s\|/ && (!a) { a = gensub(/.*s\|/,"s|","",$2); next; }
     /\^m/ && a    { print a gensub(/\^m.*/,"","",$2); a=0; }
     a             { a = a $2 };
     ' file.log
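
For the layout asked for in the question (one set per line, with the ^m line left out), a variant along the following lines may be closer. It is a sketch under several assumptions, not a tested solution for the real log: ^m is taken to be the two literal characters shown in the sample, every line is assumed to carry a prefix ending in "): ", and the first set is assumed to follow the text "result of mapping : ". The function name extract_sets and the shortened sample lines are made up for illustration. It uses only POSIX awk features (sub and index), so gensub is not needed.

```shell
# extract_sets (hypothetical helper): join each s|... set onto one line.
extract_sets() {
  awk '
    {
      sub(/^[^)]*\): /, "")             # drop the "00:31:54.184 mnk ... (monklog:391): " prefix
      sub(/^result of mapping : /, "")  # drop the label in front of the first set
    }
    index($0, "^m") > 0 {               # terminator line: flush the current set, skip this line
      if (rec != "") print rec
      rec = ""; collecting = 0
      next
    }
    index($0, "s|") == 1 {              # a line that STARTS with s| opens a new set
      if (rec != "") print rec
      rec = $0; collecting = 1
      next
    }
    collecting { rec = rec $0 }         # continuation line: append without a separator
    END { if (rec != "") print rec }    # flush a set that never met a ^m line
  ' "$@"
}

# Shortened, made-up lines modelled on the sample in the question.
result=$(printf '%s\n' \
  '00:31:54.184 mnk  4155809232 (monklog:391): result of mapping : s|aaaaa|33208369' \
  '00:31:54.184 mnk  4155809232 (monklog:391): .05|ssssss|ggggggg||' \
  '00:31:54.184 mnk  4155809232 (monklog:391): s|bbbbb|33208369' \
  '00:31:54.184 mnk  4155809232 (monklog:391): .05|mxn||' \
  '00:31:54.184 mnk  4155809232 (monklog:406): ||29/04/2015 01:31:00|||^m' |
  extract_sets)
echo "$result"
```

Testing for s| only at the start of the stripped line is deliberate: continuation lines such as ||||ssssss|ssssss|... also contain s|, and a plain "contains s|" test would treat them as the start of a new set. On a real file the call would be extract_sets file.log.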
