Regular expressions are too large a topic to introduce here, but make sure that you understand these concepts. For tutorials, see perlrequick or perlretut. For the definitive documentation, see perlre.
In scalar context, the m// operator returns true if the match succeeded, and the s/// operator returns the number of replacements it made (or false if it made none).
You can either use the return value directly or check it for truth.
if ( $str =~ /Diggle|Shelley/ ) {
    print "We found Pete or Steve!\n";
}
if ( my $n = ($str =~ s/this/that/g) ) {
    print qq{Replaced $n occurrence(s) of "this"\n};
}
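One wrinkle worth spelling out: in scalar context m// reports only success or failure, even with /g. A common idiom for counting matches, sketched here with made-up sample text and variable names, is to force list context with an empty-list assignment. s///g already returns the count on its own.

```perl
use strict;
use warnings;

# Sample text is made up for illustration.
my $str = 'this and this and this';

# The empty-list assignment "() =" forces list context, so /g returns
# every match; assigning that list to a scalar yields the match count.
my $matches = () = $str =~ /this/g;
print "Matched $matches time(s)\n";      # prints "Matched 3 time(s)"

# s///g returns the number of substitutions directly.
my $replaced = ( $str =~ s/this/that/g );
print "Replaced $replaced time(s)\n";    # prints "Replaced 3 time(s)"
```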
The capture variables ($1, $2, and so on) are not valid unless the match succeeded, and a failed match does not clear them either: they keep whatever the last successful match put there.
# BAD: Not checked, but at least it "works".
my $str = 'Perl 101 rocks.';
$str =~ /(\d+)/;
print "Number: $1"; # Prints "Number: 101";
# WORSE: Not checked, and the result is not what you'd expect
$str =~ /(Python|Ruby)/;
print "Language: $1"; # Prints "Language: 101";
Instead, you must check the return value from the match:
# GOOD: Check the results
my $str = 'Perl 101 rocks.';
if ( $str =~ /(\d+)/ ) {
    print "Number: $1"; # Prints "Number: 101";
}
if ( $str =~ /(Python|Ruby)/ ) {
    print "Language: $1"; # Never gets here
}
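Another way to tie the captured value to the success of the match, shown here as a sketch with hypothetical variable names, is to assign the match in list context. A failed match returns the empty list, so the variable stays undef instead of picking up a stale $1.

```perl
use strict;
use warnings;

my $str = 'Perl 101 rocks.';

# In list context a successful match returns the captured groups;
# a failed match returns the empty list, leaving the variable undef.
my ($number)   = $str =~ /(\d+)/;
my ($language) = $str =~ /(Python|Ruby)/;

print "Number: $number\n"     if defined $number;    # prints "Number: 101"
print "Language: $language\n" if defined $language;  # prints nothing
```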
/i - case-insensitive match
/g - match multiple times
$var = "match match match";
my $count = 0; # avoid $a, which is reserved for sort
while ($var =~ /match/g) { $count++; }
print "$count\n"; # prints 3
$count = 0;
$count++ foreach ($var =~ /match/g);
print "$count\n"; # prints 3
/m - ^ and $ change meaning
Normally, ^ means "start of string" and $ means "end of string".
/m makes them mean start and end of line, respectively.
$str = "one\ntwo\nthree";
@a = $str =~ /^\w+/g; # @a = ("one");
@b = $str =~ /^\w+/gm; # @b = ("one","two","three")
Use \A and \z for start and end of string regardless of /m.
\Z is the same as \z except it will ignore a final newline.
/s - . also matches newline
$str = "one\ntwo\nthree\n";
$str =~ /^(.{8})/s;
print $1; # prints "one\ntwo\n"
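The same newline-terminated string is a convenient way to see the anchors \A, \z and \Z in action. The following is a minimal sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $str = "one\ntwo\nthree\n";

# Under /m, ^ matches at the start of every line...
my @line_starts = $str =~ /^(\w+)/gm;      # ("one", "two", "three")

# ...but \A still matches only at the very start of the string.
my @string_start = $str =~ /\A(\w+)/gm;    # ("one")

# \Z tolerates one trailing newline; \z insists on the true end.
my $ends_Z = $str =~ /three\Z/ ? 1 : 0;    # 1
my $ends_z = $str =~ /three\z/ ? 1 : 0;    # 0: a newline follows "three"

print "line starts: @line_starts\n";
print "string start: @string_start\n";
print "\\Z: $ends_Z, \\z: $ends_z\n";
```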
$1 and friends
my $str = "abc";
$str =~ /(((a)(b))(c))/;
print "1: $1 2: $2 3: $3 4: $4 5: $5\n";
# prints: 1: abc 2: ab 3: a 4: b 5: c
If the opening parenthesis is followed by ?:, the group will not be captured.
my $str = "abc";
$str =~ /(?:a(b)c)/;
print "$1\n"; # prints "b"
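Non-capturing groups are mostly useful when you need grouping for alternation or repetition but do not want the capture numbering to shift. A small sketch, assuming a made-up date format:

```perl
use strict;
use warnings;

# Sample date string; the format is an assumption for illustration.
my $date = '2024/03/15';

my ( $year, $month, $day );

# The separators are grouped with (?:...) so that only the three
# numeric fields are captured and numbered.
if ( $date =~ m{\A(\d{4})(?:-|/)(\d{2})(?:-|/)(\d{2})\z} ) {
    ( $year, $month, $day ) = ( $1, $2, $3 );
    print "year=$year month=$month day=$day\n";
    # prints "year=2024 month=03 day=15"
}
```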
/x - allow whitespace and comments in the regex
This ugly behemoth
my ($num) = $ARGV[0] =~ m/^\+?((?:(?<!\+)-)?(?:\d*\.)?\d+)$/x;
is more readable with whitespace and comments, as allowed by the /x flag.
my ($num) =
    $ARGV[0] =~ m/^ \+?   # An optional plus sign, to be discarded
        (                 # Capture...
        (?:(?<!\+)-)?     # a negative sign, if there's no plus behind it,
        (?:\d*\.)?        # optional digits followed by a decimal point,
        \d+               # then one or more digits.
        )$/x;
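To see what the commented regex accepts, it can be wrapped in a small helper and run over a few inputs. This is only a sketch: parse_number is a hypothetical name, the sample inputs are invented, and the decimal point is written as \. so that it matches only a literal dot.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the commented regex; the decimal point
# is escaped here so that it matches only a literal ".".
sub parse_number {
    my ($text) = @_;
    my ($num) = $text =~ m/^ \+?    # An optional plus sign, to be discarded
        (                           # Capture...
          (?:(?<!\+)-)?             # a minus sign, if no plus precedes it,
          (?:\d*\.)?                # optional digits and a decimal point,
          \d+                       # then one or more digits.
        )$/x;
    return $num;    # undef when the input does not match
}

for my $input ( '42', '+3.14', '-.5', '+-1', 'abc' ) {
    my $num = parse_number($input);
    printf "%-6s => %s\n", $input, defined $num ? $num : '(no match)';
}
```

The first three inputs yield 42, 3.14 and -.5; the last two do not match. Note that comments inside an /x regex must not contain the regex delimiter, here the slash.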
\Q and \E
my $num = '3.1415';
print "ok 1\n" if $num =~ /\Q3.14\E/;
$num = '3X1415';
print "ok 2\n" if $num =~ /\Q3.14\E/;
print "ok 3\n" if $num =~ /3.14/;
prints
ok 1
ok 3
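\Q and \E matter most when the pattern comes from a variable, such as user input, that may contain regex metacharacters. A minimal sketch, with a made-up search term and candidate strings:

```perl
use strict;
use warnings;

# A made-up search term containing the metacharacters "." and "*".
my $term = 'a.b*c';

my @candidates = ( 'a.b*c', 'aXbbbc', 'xx a.b*c yy' );

# Without \Q, "." and "*" keep their special meaning.
my @loose_hits = grep { /$term/ } @candidates;

# With \Q...\E the term is matched literally
# (the quotemeta function does the same escaping).
my @literal_hits = grep { /\Q$term\E/ } @candidates;

print "unquoted: @loose_hits\n";      # prints "unquoted: aXbbbc"
print "quoted:   @literal_hits\n";    # prints "quoted:   a.b*c xx a.b*c yy"
```

The unquoted pattern both misses the literal term and matches a string that does not contain it, which is the kind of surprise \Q is meant to prevent.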
Use the /e flag to s/// to evaluate the replacement as Perl code.
my $str = "AbCdE\n";
$str =~ s/(\w)/lc $1/eg;
print $str; # prints "abcde"
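Because the replacement is ordinary Perl code under /e, it can compute a new value from each capture. A short sketch with invented sample text:

```perl
use strict;
use warnings;

# Sample text; the counts are invented for illustration.
my $str = 'apples 3, pears 12';

# With /e the replacement is evaluated as Perl code for each match,
# and the value of the expression becomes the replacement text.
# Copying into $doubled first leaves the original string untouched.
( my $doubled = $str ) =~ s/(\d+)/$1 * 2/eg;

print "$doubled\n";    # prints "apples 6, pears 24"
```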
$1 and friends if necessary
study
study is not helpful in the vast majority of cases, and since Perl 5.16 it is a no-op. In older releases, all it does is make a table of where the first occurrence of each of the 256 possible bytes is in the string. This means that if you have a 1,000-character string, and you search for lots of strings that begin with a constant character, then the matcher can jump right to it. For example:
"This is a very long [... 900 characters skipped...] string that I have here, ending at position 1000"
Now, if you are matching this against the regex /Icky/, the matcher will try to find the first letter "I" that matches. That may take scanning through the first 900+ characters until you get to it. But what study does is build a table of the 256 possible bytes and where they first appear, so that in this case, the scanner can jump right to that position and start matching.
-Mre=debug

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.