
NAME
 
regexp – Plan 9 regular expression notation

DESCRIPTION
 
This manual page describes the regular expression syntax used
by the Plan 9 regular expression library regexp9(3). It is the
form used by egrep(1) before egrep got complicated.
A regular expression specifies a set of strings of characters.
A member of this set of strings is said to be matched by the regular
expression. In many applications a delimiter character, commonly
/, bounds a regular expression. In the following specification
for regular expressions the word ‘character’ means any character
(rune) but newline.
The syntax for a regular expression e0 is
 
e3: literal  charclass  '.'  '^'  '$'  '(' e0 ')'
e2: e3
REP: '*'  '+'  '?'
e1: e2
e0: e1

A literal is any nonmetacharacter, or a metacharacter (one of
.*+?[]()\^$), or the delimiter preceded by \.
A charclass is a nonempty string s bracketed [s] (or [^s]); it
matches any character in (or not in) s. A negated character class
never matches newline. A substring a−b, with a and b in ascending
order, stands for the inclusive range of characters between a
and b. In s, the metacharacters −, ], an initial ^, and the regular
expression delimiter
must be preceded by a \; other metacharacters have no special
meaning and may appear unescaped.
A . matches any character.
A ^ matches the beginning of a line; $ matches the end of the
line.
The REP operators match zero or more (*), one or more (+), zero
or one (?), instances respectively of the preceding regular expression
e2.
A concatenated regular expression, e1e2, matches a match to e1
followed by a match to e2.
An alternative regular expression, e0e1, matches either a match
to e0 or a match to e1.
A match to any part of a regular expression extends as far as
possible without preventing a match to the remainder of the regular
expression.

SEE ALSO

