Regular Expressions
Refer to the Perl documentation for regular expressions for more information.
Escape sequences for common classes of characters
-
\d
-
Matches any digit
-
\w
-
Matches a-z, A-Z, 0-9, _
-
\s
-
Matches whitespace: spaces, tabs, newlines
-
\b
-
Matches a word boundary (not necessarily a space: foo.bar is two words)
-
\D
-
Opposite of \d
-
\W
-
Opposite of \w
-
\S
-
Opposite of \s
-
\B
-
Opposite of \b
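The escape sequences above can be tried out with a short Perl script. This is a minimal sketch; the sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $text = "foo.bar 42";

# \d+ matches a run of digits.
my ($digits) = $text =~ /(\d+)/;           # "42"

# \w+ stops at the period, because "." is not a word character.
my ($first_word) = $text =~ /(\w+)/;       # "foo"

# \b marks the edges of each word, so foo.bar yields two words.
my @words = $text =~ /\b([a-z]+)\b/g;      # ("foo", "bar")

# \S+ matches a run of non-whitespace characters.
my ($chunk) = $text =~ /(\S+)/;            # "foo.bar"

# \D+ matches a run of non-digits, including the trailing space.
my ($non_digits) = $text =~ /(\D+)/;       # "foo.bar "

print "$digits $first_word @words $chunk\n";
```

In list context a match returns its captured groups, which is why each result is assigned through a parenthesized list.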
Special symbols for matching
-
\
-
Escape the special meaning of the following character
-
.
-
Match any character except a newline. \. matches a literal period.
-
$
-
Matches the end of a line
-
^
-
Matches the start of a line. ^$ will match an empty line.
-
|
-
Alternation: match one or the other. one|two will match either the word one or the word two.
-
[]
-
Character class. Matches any one of the characters in the brackets:
[two] will match one of the letters t, w, or o. A hyphen inside the
brackets means a range of characters: [0-9] matches any digit. Note: if
the hyphen is first, then it means a literal hyphen, not a range
indicator: [-09] matches a hyphen, 0, or 9.
-
[^]
-
Negate the character class. Matches any one character that is not in the
brackets: [^two] will match any one character except the letters t, w,
or o. Note: ^ does not match the start of a line when it is inside [].
-
( )
-
Remember the text that matches the pattern inside the parentheses.
The matched subpatterns are stored in special variables: $1, $2, $3, etc.
These can be referenced until the next successful match changes them.
([hj]ello) will match either hello or jello and store the matched word in $1.
-
(?: )
-
Same as ( ), but does not set the $ variables. Used only to group patterns.
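As an illustration of the symbols above, the following Perl sketch exercises the anchors, the period, character classes, and both kinds of parentheses. The input strings and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input lines, chosen only for illustration.
my @lines = ("hello world", "", "jello-0", "a.c", "abc");

# ^$ matches only the empty line; grep in scalar context counts the matches.
my $empty = grep { /^$/ } @lines;                 # 1

# a.c matches both "a.c" and "abc"; a\.c matches only the literal period.
my $any_char = grep { /^a.c$/ } @lines;           # 2
my $literal  = grep { /^a\.c$/ } @lines;          # 1

# ( ) captures into $1; check that the match succeeded before reading $1.
my $captured = "";
if ($lines[2] =~ /([hj]ello)/) {
    $captured = $1;                               # "jello"
}

# (?: ) groups without capturing, so the only capture is the character class.
my ($tail) = $lines[2] =~ /(?:h|j)ello([-09])/;   # "-"

# [^two] matches one character that is not t, w, or o.
my ($not_two) = "two3" =~ /([^two])/;             # "3"

print "$empty $any_char $literal $captured $tail $not_two\n";
```

Reading $1 only inside the if block avoids picking up a stale value left over from an earlier successful match.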
Special symbols for repetition
-
*
-
Match 0 or more occurrences of the preceding pattern
-
+
-
Match 1 or more occurrences of the preceding pattern
-
?
-
Match 0 or 1 occurrences of the preceding pattern
-
{n}
-
Match exactly n occurrences of the preceding pattern
-
{m,n}
-
Match from m to n occurrences of the preceding pattern.
-
{,n}
-
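Match from 0 to n occurrences of the preceding pattern. (? = {,1})
Note: Perl accepts an empty lower bound only from version 5.34 on; in
earlier versions write {0,n}.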
-
{m,}
-
Match m or more occurrences of the preceding pattern. (* = {0,}; + = {1,})
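To see how the repetition symbols differ, the sketch below runs each one against the same made-up list of strings. The patterns are anchored with ^ and $ so that the whole string must match. The upper-bound case is written as {0,2} so that the script also runs on Perl versions older than 5.34.

```perl
use strict;
use warnings;

# Hypothetical test strings, chosen only for illustration.
my @candidates = ("", "a", "aa", "aaa", "aaaa");

# Collect the strings that each anchored pattern accepts.
my @star      = grep { /^a*$/ }     @candidates;  # all five strings
my @plus      = grep { /^a+$/ }     @candidates;  # every string except ""
my @optional  = grep { /^a?$/ }     @candidates;  # "" and "a"
my @exactly_2 = grep { /^a{2}$/ }   @candidates;  # "aa"
my @two_to_3  = grep { /^a{2,3}$/ } @candidates;  # "aa" and "aaa"
my @up_to_2   = grep { /^a{0,2}$/ } @candidates;  # "", "a", "aa"
my @at_least3 = grep { /^a{3,}$/ }  @candidates;  # "aaa" and "aaaa"

print scalar(@star), " ", scalar(@plus), " ", scalar(@optional), "\n";
```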
A few examples
-
fred|barney|wilma|betty
-
Matches any one of the words fred, barney, wilma, betty
-
Miami, (?:FL|OH)
-
Matches Miami, FL or Miami, OH. Does not set $1. The pattern
could also be Miami, (FL|OH), in which case $1 would be either FL or
OH. The grouping is necessary: without it, the pattern Miami, FL|OH
would match either Miami, FL or just OH.
-
[cm][ao][dp]
-
Matches cad, cap, cod, cop, mad, map, mod, mop.
-
\d{3}-\d{2}-\d{4}
-
Matches the format of a U.S. Social Security number: three digits, two digits, and four digits, separated by hyphens
-
^\s*(\w+)\s*$
-
Matches an entire line that contains exactly one word, optionally surrounded
by whitespace. Stores the matched word in $1.
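The example patterns above can be checked with a short script. This is only a sketch: the test strings, including the digit sequence, are invented, and the digit pattern is additionally anchored with ^ and $ here so that it must match the whole string.

```perl
use strict;
use warnings;

# Alternation of whole words.
my $is_name = "wilma" =~ /fred|barney|wilma|betty/ ? 1 : 0;      # 1

# Capturing group: the matched state is returned and stored in $1.
my ($state) = "Miami, OH" =~ /Miami, (FL|OH)/;                   # "OH"

# Without grouping, the alternatives are "Miami, FL" and "OH",
# so a bare "OH" matches the ungrouped pattern but not the grouped one.
my $ungrouped = "OH" =~ /Miami, FL|OH/ ? 1 : 0;                  # 1
my $grouped   = "OH" =~ /Miami, (?:FL|OH)/ ? 1 : 0;              # 0

# Three character classes in a row.
my @hits = grep { /^[cm][ao][dp]$/ } qw(cad cop mud map dad);    # cad, cop, map

# Digits in a 3-2-4 layout; the sample number is made up.
my $ssn_like = "123-45-6789" =~ /^\d{3}-\d{2}-\d{4}$/ ? 1 : 0;   # 1

# One word, optionally padded by whitespace.
my ($word) = "   perl  \n" =~ /^\s*(\w+)\s*$/;                   # "perl"
my $two_words = "two words" =~ /^\s*(\w+)\s*$/ ? 1 : 0;          # 0

print "$is_name $state $ungrouped $grouped @hits $ssn_like $word $two_words\n";
```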