130 lines
2 KiB
Plaintext
130 lines
2 KiB
Plaintext
.TH REGEXP 6
|
|
.SH NAME
|
|
regexp \- regular expression notation
|
|
.SH DESCRIPTION
|
|
A
|
|
.I "regular expression"
|
|
specifies
|
|
a set of strings of characters.
|
|
A member of this set of strings is said to be
|
|
.I matched
|
|
by the regular expression. In many applications
|
|
a delimiter character, commonly
|
|
.LR / ,
|
|
bounds a regular expression.
|
|
In the following specification for regular expressions
|
|
the word `character' means any character (rune) but newline.
|
|
.PP
|
|
The syntax for a regular expression
|
|
.B e0
|
|
is
|
|
.IP
|
|
.EX
|
|
e3: literal | charclass | '.' | '^' | '$' | '(' e0 ')'
|
|
|
|
e2: e3
|
|
| e2 REP
|
|
|
|
REP: '*' | '+' | '?'
|
|
|
|
e1: e2
|
|
| e1 e2
|
|
|
|
e0: e1
|
|
| e0 '|' e1
|
|
.EE
|
|
.PP
|
|
A
|
|
.B literal
|
|
is any non-metacharacter, or a metacharacter
|
|
(one of
|
|
.BR .*+?[]()|\e^$ ),
|
|
or the delimiter
|
|
preceded by
|
|
.LR \e .
|
|
.PP
|
|
A
|
|
.B charclass
|
|
is a nonempty string
|
|
.I s
|
|
bracketed
|
|
.BI [ \|s\| ]
|
|
(or
|
|
.BI [^ s\| ]\fR);
|
|
it matches any character in (or not in)
|
|
.IR s .
|
|
A negated character class never
|
|
matches newline.
|
|
A substring
|
|
.IB a - b\f1,
|
|
with
|
|
.I a
|
|
and
|
|
.I b
|
|
in ascending
|
|
order, stands for the inclusive
|
|
range of
|
|
characters between
|
|
.I a
|
|
and
|
|
.IR b .
|
|
In
|
|
.IR s ,
|
|
the metacharacters
|
|
.LR - ,
|
|
.LR ] ,
|
|
an initial
|
|
.LR ^ ,
|
|
and the regular expression delimiter
|
|
must be preceded by a
|
|
.LR \e ;
|
|
other metacharacters
|
|
have no special meaning and
|
|
may appear unescaped.
|
|
.PP
|
|
A
|
|
.L .
|
|
matches any character.
|
|
.PP
|
|
A
|
|
.L ^
|
|
matches the beginning of a line;
|
|
.L $
|
|
matches the end of the line.
|
|
.PP
|
|
The
|
|
.B REP
|
|
operators match zero or more
|
|
.RB ( * ),
|
|
one or more
|
|
.RB ( + ),
|
|
zero or one
|
|
.RB ( ? ),
|
|
instances respectively of the preceding regular expression
|
|
.BR e2 .
|
|
.PP
|
|
A concatenated regular expression,
|
|
.BR "e1\|e2" ,
|
|
matches a match to
|
|
.B e1
|
|
followed by a match to
|
|
.BR e2 .
|
|
.PP
|
|
An alternative regular expression,
|
|
.BR "e0\||\|e1" ,
|
|
matches either a match to
|
|
.B e0
|
|
or a match to
|
|
.BR e1 .
|
|
.PP
|
|
A match to any part of a regular expression
|
|
extends as far as possible without preventing
|
|
a match to the remainder of the regular expression.
|
|
.SH "SEE ALSO"
|
|
.IR awk (1),
|
|
.IR ed (1),
|
|
.IR grep (1),
|
|
.IR sam (1),
|
|
.IR sed (1),
|
|
.IR regexp (2)
|