Using awk or sed, how can I select the lines that occur between two different marker patterns? There may be multiple sections marked with these patterns.
For example: Suppose the file contains:
abc
def1
ghi1
jkl1
mno
abc
def2
ghi2
jkl2
mno
pqr
stu
And the starting pattern is abc and the ending pattern is mno.
So, I need the output as:
def1
ghi1
jkl1
def2
ghi2
jkl2
I am using sed to match the pattern once:
sed -e '1,/abc/d' -e '/mno/,$d' <FILE>
Is there any way in sed or awk to do it repeatedly until the end of file?
Use awk with a flag to trigger the print when necessary:
$ awk '/abc/{flag=1;next}/mno/{flag=0}flag' file
def1
ghi1
jkl1
def2
ghi2
jkl2
How does this work?
/abc/ and /mno/ each match lines containing that text.
/abc/{flag=1;next} sets the flag when the text abc is found, then skips the line so the marker itself is not printed.
/mno/{flag=0} unsets the flag when the text mno is found.
The final flag is a pattern with the default action, which is to print $0: if flag is non-zero, the line is printed.
For a more detailed description and examples, together with cases when the patterns are either shown or not, see How to select lines between two patterns?.
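The walkthrough above can be sketched as a spelled-out version of the same program. This is only an illustration: sample is a hypothetical helper standing in for the question's file, between is a name chosen here, and the bare flag pattern is written with its default action made explicit.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# Same logic as the one-liner, one rule per line (hypothetical function name).
between() {
    awk '
        /abc/ { flag = 1; next }  # start marker: raise the flag and skip the marker line
        /mno/ { flag = 0 }        # end marker: lower the flag before the print rule runs
        flag  { print $0 }        # the bare "flag" pattern with its default action spelled out
    '
}

sample | between
```

Rule order matters here: because the print rule comes last, both marker lines have already been handled (skipped or un-flagged) by the time it is evaluated.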
Using sed:
sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }'
The -n option means do not print by default. The pattern looks for lines containing just abc to just mno, and then executes the actions in the { ... }. The first action deletes the abc line; the second the mno line; and the p prints the remaining lines. You can relax the regexes as required. Any lines outside the range of abc..mno are simply not printed.
The -e option introduces (part of) the script that sed should execute. If you want or need to use several arguments to include the entire script, then you must use -e before each such argument; otherwise, it's optional (but explicit).
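To see the range command and the role of -e in practice, here is a small sketch that runs the script once as a single argument and once split across several -e arguments. The sample helper and the function names are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# Whole script in one argument; -e could be omitted here.
one_arg() {
    sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }'
}

# Same script split into pieces; every piece needs its own -e.
many_args() {
    sed -n -e '/^abc$/,/^mno$/{' -e '/^abc$/d' -e '/^mno$/d' -e 'p' -e '}'
}

sample | one_arg
echo ---
sample | many_args
```

Both invocations should print the same six lines, since sed joins the -e pieces into one script.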
d to all lines up to the first match, and then another d to all lines starting with the second match? sed -n '1,/\\begin{document}/d;/\\end{document}/d;p'. (This is cheating a little bit, since the second part does not delete up to the document end, and I would not know how to cut multiple parts as the OP asked for.)
$ mark, as in /^abc$ and others
This might work for you (GNU sed):
sed '/^abc$/,/^mno$/{//!b};d' file
Delete all lines except those between the abc and mno marker lines.
!d;//d golfs 2 characters better :-) stackoverflow.com/a/31380266/895245
{//!b} prevents the abc and mno from being included in the output, but I can't figure out how. Could you explain?
//!b means: if the current line matches neither of the regexes that delimit the range, branch to the end of the script, so the line is printed; every other line is deleted.
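The explanation above can be sketched as an annotated, multi-line version of the same command. This assumes GNU sed, as the answer itself does; sample and between_gnu are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# The one-liner /^abc$/,/^mno$/{//!b};d written one command per line.
between_gnu() {
    sed '
# Inside an abc..mno range:
/^abc$/,/^mno$/{
# the empty regex reuses whichever marker regex was tested last;
# a line matching neither marker branches to the end and is auto-printed
//!b
}
# everything else (marker lines and lines outside any range) is deleted
d
'
}

sample | between_gnu
```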
sed '/^abc$/,/^mno$/!d;//d' file
golfs two characters better than ppotong's {//!b};d. The empty forward slashes // mean "reuse the last regular expression used", and the command does the same as the more understandable:
sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d' file
This seems to be POSIX:
If an RE is empty (that is, no pattern is specified) sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified.
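As a quick check of that equivalence, the sketch below runs the short and the long form on the question's sample and compares the results. It assumes a sed that implements the empty-regex behaviour quoted above; sample, short_form and long_form are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# Golfed form: // reuses the last regex that was actually tested.
short_form() {
    sed '/^abc$/,/^mno$/!d;//d'
}

# Long form: both marker regexes written out.
long_form() {
    sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d'
}

if [ "$(sample | short_form)" = "$(sample | long_form)" ]; then
    echo "same output"
fi
sample | short_form
```

On a marker line the last regex tested is the one that just matched, so //d removes it; on a line in the middle of a range the last regex tested is the end marker, which did not match, so the line survives.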
From the previous response's links, the one that did it for me, running ksh on Solaris, was this:
sed '1,/firstmatch/d;/secondmatch/,$d'
1,/firstmatch/d: from line 1 until the first time you find firstmatch, delete.
/secondmatch/,$d: from the first occurrence of secondmatch until the end of file, delete.
Semicolon separates the two commands, which are executed in sequence.
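A small sketch of what this command does on multi-section input. The with_intro and no_intro helpers are hypothetical inputs: the question's sample with and without an extra first line (intro) that is not in the original file.

```shell
#!/bin/sh
# Hypothetical input: the sample with one extra line before the first marker.
with_intro() {
    printf '%s\n' intro abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# The sample exactly as in the question, start marker on line 1.
no_intro() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

first_section() {
    sed '1,/abc/d;/mno/,$d'
}

# Only the first section survives: everything from the first mno on is deleted.
with_intro | first_section
echo ---
# When the start marker is on line 1, the search for the end of the
# 1,/abc/ range begins on line 2, so the range runs to the second abc.
no_intro | first_section
```

So this form extracts a single section only, and which section it extracts depends on whether the first marker sits on line 1.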
1,) come before /firstmatch/? I'm guessing this could also be phrased '/firstmatch/1,d;/secondmatch,$d'?
Something like this works for me:
file.awk:
BEGIN {
    record = 0
}
/^abc$/ {
    record = 1
}
/^mno$/ {
    record = 0
    print "s=" s
    s = ""
}
# Collect only lines that are neither marker.
!/^(abc|mno)$/ {
    if (record == 1) {
        s = s "\n" $0
    }
}
using: awk -f file.awk data
...
edit: O_o fedorqui's solution is way better/prettier than mine.
Don_crissti's answer from Show only text between 2 matching pattern?
firstmatch="abc"
secondmatch="cdf"
sed "/$firstmatch/,/$secondmatch/!d;//d" infile
which is much more efficient than the awk approach; see here.
perl -lne 'print if((/abc/../mno/) && !(/abc/||/mno/))' your_file
I tried to use awk to print lines between two patterns where pattern2 also matches pattern1. The pattern1 line should be printed as well.
e.g. source
package AAA
aaa
bbb
ccc
package BBB
ddd
eee
package CCC
fff
ggg
hhh
iii
package DDD
jjj
should produce the output
package BBB
ddd
eee
Here pattern1 is package BBB and pattern2 is package \w*. Note that CCC isn't a known value, so it can't be matched literally.
In this case, neither @scai's awk '/abc/{a=1}/mno/{print;a=0}a' file nor @fedorqui's awk '/abc/{a=1} a; /mno/{a=0}' file works for me.
Finally, I managed to solve it with awk '/package BBB/{flag=1;print;next}/package \w*/{flag=0}flag' file, haha.
A little more effort results in awk '/package BBB/{flag=1;print;next}flag;/package \w*/{flag=0}' file, which also prints the pattern2 line, that is,
package BBB
ddd
eee
package CCC
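Both variants can be tried on the sample with the sketch below. Two things here are assumptions rather than part of the answer: the pkgs helper is a hypothetical stand-in for the source file, and the end pattern is written as /^package / instead of /package \w*/, since \w is not available in every awk.

```shell
#!/bin/sh
# Hypothetical helper: emits the package sample from this answer.
pkgs() {
    printf '%s\n' 'package AAA' aaa bbb ccc 'package BBB' ddd eee \
        'package CCC' fff ggg hhh iii 'package DDD' jjj
}

# Print the start marker, stop before the next package line.
start_only() {
    awk '/^package BBB$/{flag=1;print;next} /^package /{flag=0} flag'
}

# Print the start marker and the package line that ends the section.
start_and_end() {
    awk '/^package BBB$/{flag=1;print;next} flag; /^package /{flag=0}'
}

pkgs | start_only
echo ---
pkgs | start_and_end
```

The next in the first rule is what keeps the package BBB line from being treated as its own end marker.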
This can also be done with logical operations and increment/decrement operations on a flag:
awk '/mno/&&--f||f||/abc/&&f++' file
flag to f, in the spirit of some good ol' code golf fun. :-)
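To see how the golfed expression relates to the flag approach, the sketch below runs both on the question's sample and compares the output. The helper and function names are hypothetical; the comments describe how the expression evaluates on this particular input.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# Evaluated as (/mno/ && --f) || f || (/abc/ && f++):
#   on abc: f is 0, f++ yields 0 (no print) and leaves f at 1
#   inside: f is 1, so the line prints
#   on mno: --f yields 0 and f is now 0, so nothing prints
golfed() {
    awk '/mno/&&--f||f||/abc/&&f++'
}

# The flag version from the accepted approach, for comparison.
flagged() {
    awk '/abc/{flag=1;next}/mno/{flag=0}flag'
}

if [ "$(sample | golfed)" = "$(sample | flagged)" ]; then
    echo "same output"
fi
```

The comparison only covers well-formed input where every mno is preceded by an abc; a stray mno before any abc would drive f negative in the golfed form.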
awk '/abc/{a=1}/mno/{print;a=0}a' file.
awk '/abc/{a=1} a; /mno/{a=0}' file - with this, putting a condition before the /mno/ we make it evaluate the line as true (and print it) before setting a=0. This way we can avoid writing print.
awk '/abc/,/mno/' file
awk 'flag; /PAT1/{flag=1; next} /PAT1/{flag=0}' file would make.
[pattern] { action } or pattern [{ action }]. 2. An action consists of one or more awk statements, enclosed in braces (‘{…}’). So the ending flag is an abbreviation of flag {print $0}.
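The three marker-inclusive commands from these comments can be compared side by side with the sketch below. The sample helper and the function names are hypothetical, introduced only for the comparison.

```shell
#!/bin/sh
# Hypothetical helper: emits the sample file from the question.
sample() {
    printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu
}

# Explicit print on the end marker, then clear the flag.
with_print() {
    awk '/abc/{a=1}/mno/{print;a=0}a'
}

# Test the flag before the end-marker rule clears it.
with_order() {
    awk '/abc/{a=1} a; /mno/{a=0}'
}

# Range pattern: prints from each abc through the next mno.
with_range() {
    awk '/abc/,/mno/'
}

sample | with_print
```

On this input all three should print the same ten lines, markers included; the range pattern is the shortest, while the flag forms are easier to adapt when only one marker should be shown.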