Perl Regular Expressions

Pattern Matching in Perl

Perl has two pattern-matching operators: the match operator, m// (usually written /pattern/), and the substitution operator, s///.

The match operator searches a string for the specified pattern and returns true or false.  The substitution operator performs the same search and, in addition, replaces the matched portion of the string with a replacement.

These operators (along with the translation operator, tr///) operate on the default string $_ unless the =~ operator binds them to another string.  For example:

$a =~ /pattern/;
returns true if the string $a contains the pattern.  The patterns in the match and substitution operators are specified using regular expressions.
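As a minimal sketch of the operators described above, the following script uses the match and substitution operators both with =~ and with the default string $_.  The variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Match with the binding operator: true if $line contains "perl".
my $line  = "learning perl regular expressions";
my $found = ($line =~ /perl/) ? 1 : 0;

# Substitution with the binding operator: replaces the first match in place
# and returns the number of substitutions made.
my $count = ($line =~ s/perl/Perl/);

# Without =~, the match operator acts on the default string $_.
my @hits;
for ("apple", "banana", "cherry") {
    push @hits, $_ if /an/;    # same as $_ =~ /an/
}

print "found=$found count=$count line=$line hits=@hits\n";
```

Note that the substitution changes $line itself, so the string must be a variable rather than a literal.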

Regular Expressions in Perl

Perl uses regular expressions similar to those in Unix editors (e.g., ed, vi, and sed) and filters (e.g., awk and grep), and loosely related to the wildcard patterns of Unix shells (e.g., csh).  The set of regular expression patterns in Perl is extensive; the most common ones are listed here.

 
Pattern                       Meaning
abc                           Matches a, b, and c, in order
a|b|c                         Matches a or b or c
x*;  x+;  x?                  Zero or more of x;  One or more of x;  Zero or one of x
.                             Matches any character except newline
^;  $                         Matches beginning of string;  Matches end of string
[a-z0-9]                      Any character in the set, e.g., lower-case alphabetic and digit characters
[^a-z0-9]                     Any character not in the set, e.g., any character except lower-case alphabetic and digit characters
\d;  \D                       Any digit, i.e., [0-9];  Any non-digit, i.e., [^0-9]
\w;  \W                       Any word character, i.e., [a-zA-Z0-9_];  Any non-word character, i.e., [^a-zA-Z0-9_]
\s;  \S                       Any whitespace character, e.g., space, tab, newline;  Any non-whitespace character
\b;  \B                       Word boundary (outside [ ]);  Non-word boundary
\n;  \r;  \t;  \b;  \f;  \0   Newline;  Return;  Tab;  Backspace (inside [ ]);  Formfeed;  Null
\nnn;  \xnn;  \cX             ASCII character for octal value;  ASCII character for hexadecimal value;  ASCII control character
\meta                         Matches the metacharacter literally, e.g., \. matches a period and \[ matches a left bracket
(abc)                         Groups abc and remembers the match for backreference
\1;  \2; ...                  Backreference to what was matched by the first, second, etc. remembered match
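A few of the patterns in the table can be tried out with a short script.  This is only a sketch: the helper matches() and the test strings are hypothetical and chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, 0 otherwise.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

print matches("abc123",      qr/^[a-z]+\d+$/), "\n";  # letters followed by digits
print matches("cat dog",     qr/\bdog\b/),     "\n";  # "dog" as a whole word
print matches("hotdog",      qr/\bdog\b/),     "\n";  # no word boundary before "dog"
print matches("3.14",        qr/^\d\.\d+$/),   "\n";  # \. is a literal period
print matches("hello hello", qr/^(\w+) \1$/),  "\n";  # \1 repeats the first group
```

Only the third test fails to match, because the characters on both sides of the position before "dog" in "hotdog" are word characters.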


Examples

A Unix password file has the format:

username:password:userid:groupid:name:home:shell
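Print the password file lines for users whose shell is /usr/sh.
while (<>) {
    # The shell is the last field, so require the preceding colon
    # and anchor the match at the end of the line.
    if (/:\/usr\/sh$/) {
        print;
    }
}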
Extract and print usernames starting with the letter a from a Unix passwd file.
while (<>) {
    # [^:]* (rather than [^:]+) also accepts the one-letter username "a".
    if (/^(a[^:]*):/) {
        print "$1\n";
    }
}
Swap the userid and groupid in a Unix passwd file.
while (<>) {
    # $1 = username, $2 = password (may be empty), $3 = userid, $4 = groupid
    s/^([^:]+):([^:]*):([^:]+):([^:]+)/$1:$2:$4:$3/;
    print;
}
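The same swap can also be written by splitting each line on the colon separator instead of using a substitution.  The sketch below takes that alternative approach; the function name swap_ids and the sample record are made up for illustration.

```perl
use strict;
use warnings;

# Swap the userid and groupid fields (indices 2 and 3) of one passwd line.
sub swap_ids {
    my ($line) = @_;
    chomp $line;
    my @f = split /:/, $line, -1;    # -1 keeps trailing empty fields
    return $line if @f < 4;          # leave malformed lines unchanged
    @f[2, 3] = @f[3, 2];             # exchange the two fields
    return join(":", @f);
}

print swap_ids("alice:x:1001:100:Alice Example:/home/alice:/bin/sh\n"), "\n";
```

Splitting treats every field uniformly, which may be easier to extend than a single long pattern when more fields need to be rearranged.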
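Parse a complex date-time string [Wall and Schwartz, Programming Perl, O'Reilly, 1991].
$s = "Date: 1 Apr 91 12:34:56 GMT";
# In list context, a successful match returns the text captured by each
# pair of parentheses, numbered by the position of its opening parenthesis.
($date, $mday, $month, $year, $time, $hour, $minute, $second, $timezone) =
    ($s =~ /^Date: ((\d+) (\w+) (\d+)) ((\d+):(\d+):(\d+)) (.*)$/);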
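When the string does not match, the list assignment above leaves every variable undefined.  A minimal sketch of one way to guard against that is shown below; the function name parse_date and the hash keys are assumptions for illustration, not part of the original example.

```perl
use strict;
use warnings;

# Parse a "Date: ..." string; returns a hash reference, or undef on no match.
sub parse_date {
    my ($s) = @_;
    # A failed match in list context returns an empty list, so the
    # assignment evaluates to 0 and the function returns early.
    my @parts = $s =~ /^Date: ((\d+) (\w+) (\d+)) ((\d+):(\d+):(\d+)) (.*)$/
        or return;
    my %d;
    @d{qw(date mday month year time hour minute second timezone)} = @parts;
    return \%d;
}

my $d = parse_date("Date: 1 Apr 91 12:34:56 GMT");
print "$d->{mday} $d->{month} $d->{year} at $d->{time} $d->{timezone}\n" if $d;
```

Collecting the fields in a hash keeps the names next to the values and lets the caller test a single return value for success.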