$str = "My name is Reinier!";
$result = ($str =~ m/Reinier/);
print("$result\n"); # Output: 1

So, you can write:

$str = "My name is Reinier!";
if ($str =~ m/Reinier/) {
    print("The name 'Reinier' was found!\n");
}

... in a more concise way, omitting the character 'm' (for 'match') and using the postfix form:

$str = "My name is Reinier!";
print("The name 'Reinier' was found!") if ($str =~ /Reinier/);

... using the special variable $_:

$_ = "My name is Reinier!";
print("The name 'Reinier' was found!\n") if (/Reinier/);

... implicitly in a foreach loop:
@names = qw(Sebastian Daniel Floor Reinier);
foreach (@names) {
    print("The name 'Reinier' was found!\n") if (/Reinier/);
}

The negated variant of the =~ operator is !~:

$str = "My name is Reinier!";
print("The name 'Barbara' was not found!") if ($str !~ /Barbara/);

Alternation with the | (OR) operator is also useful:

$str = "My name is Sebastian!";
print("The name 'Reinier' or 'Sebastian' was found!") if ($str =~ /Reinier|Sebastian/);

After this match, the special variable $& contains the value Sebastian.
$str = "Reinier";
print("The name 'Reinier' or 'Beinier' was found!") if ($str =~ /[BR]einier/); # [BR] is a so-called character class: see 10.1.5

Regular expressions do not have an && equivalent. A workaround is the so-called 'lookahead anchor' (note that the metacharacter \b restricts the match to whole words):

$str = "Sebastian Daniel Florence";
$str =~ /(.*(?=\bSebastian\b).*(?=\bDaniel\b))/ ? (print("True\n")) : (print("False\n"));

(?=regex) is a test. The code .*(?=\bSebastian\b) means "match .* (zero or more characters), but only if this match is followed by the whole word 'Sebastian'".
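Note that the pattern above only succeeds when 'Sebastian' appears before 'Daniel'. The following is a minimal sketch of an order-independent AND, assuming both lookaheads are anchored at the start of the string; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Both lookaheads start at position 0, so each word may appear anywhere.
my $and_re = qr/^(?=.*\bSebastian\b)(?=.*\bDaniel\b)/;

my @results;
for my $str ("Sebastian Daniel Florence", "Daniel Sebastian", "Sebastian Florence") {
    my $found = ($str =~ $and_re) ? "True" : "False";
    push @results, $found;
    print("$str: $found\n");
}
```

Because a lookahead consumes no characters, the second test starts from the same position as the first, which is what makes the order irrelevant.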
$_ = "ball is blue";
($obj, $color) = /(.*)\sis\s(.*)/; # no use of =~

The pattern matches anything (as a group), then whitespace (\s), the word 'is', whitespace and then anything (as a group). The two grouped matches are assigned to the list ($obj, $color), i.e. $obj contains 'ball' and $color contains 'blue'. Notice that when no variable is given, the expression is applied to the special variable $_.
$str = "My name is Reinier!";
print("The name 'reinier' was found!") if ($str =~ /reinier/i);
$str = "My name is Reinier Reinier!";
print("The name 'Reinier' was found " . scalar (@all_matches) . " times!") if (@all_matches = ($str =~ /Reinier/g));
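If only the number of matches is needed, the array can be left out. This is a small sketch of the common "empty list" counting idiom; the variable $count is an illustrative name.

```perl
use strict;
use warnings;

my $str = "My name is Reinier Reinier!";

# The empty list forces list context on the match; assigning that list
# to a scalar yields the number of matches without storing them.
my $count = () = $str =~ /Reinier/g;

print("The name 'Reinier' was found $count times!\n");
```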
@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /ci./g ); # match all words containing the characters ci followed by another character
}
print("@matches\n"); # output: cit cia
__DATA__
acute
city
local
social
twice

The output is probably not very interesting; one would like the complete words. Simply add the '+' character after the '.' character. It means 'one or more of the preceding character or character class', which is the same as {1,}. These additions that specify 'how many things to match' are called quantifiers.

@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /ci.+/g ); # match all words containing the characters ci followed by one or more characters
}
print("@matches\n"); # output: city cial
__DATA__
acute
city
local
social
twice

The result 'city' is OK, but 'cial' is not. To match 'social', one could prepend a '.' character followed by the '*' character. Like '+', '*' is a quantifier; it means 'zero or more of the preceding character or character class', which is the same as {0,}. The '.' character between // matches any character (except a newline).
@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /.*ci.+/g ); # match all words containing the characters ci followed by one or more characters and preceded by zero or more characters
}
print("@matches\n"); # output: city precise social
__DATA__
acute
city
local
precise
social
twice

To match the literal character '.', you have to use '\.': the character '.' needs to be 'escaped'.
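The equivalence between the quantifiers '+' and {1,}, and between '*' and {0,}, can be checked directly. This is a minimal sketch; the helper matches() is a hypothetical name introduced here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the matched text ($&) for every word
# that matches the given pattern.
sub matches {
    my ($re, @words) = @_;
    my @found;
    for my $word (@words) {
        push @found, $& if $word =~ $re;
    }
    return @found;
}

my @words = qw(acute city local social twice);

my @plus  = matches(qr/ci.+/,          @words);
my @brace = matches(qr/ci.{1,}/,       @words);
my @star  = matches(qr/.*ci.*/,        @words);
my @zero  = matches(qr/.{0,}ci.{0,}/,  @words);

print("+    : @plus\n");
print("{1,} : @brace\n");
print("*    : @star\n");
print("{0,} : @zero\n");
```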
@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /^ci.+/g ); # match all words that start with the characters ci followed by one or more characters
}
print("@matches\n"); # output: city
__DATA__
acute
city
local
precise
social
twice

To match all words that end with 'ce', add the $ character at the end of the regular expression:

@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /.*ce$/g ); # match all words that end with the characters ce
}
print("@matches\n"); # output: twice
__DATA__
acute
city
local
precise
social
twice

To match all words that start with 's' and end with 'al', use both the ^ and the $ character in the regular expression:
@matches = ();
while(<DATA>){
chomp;
push (@matches,$&) if ( $_ =~ /^s.*al$/g );# match all words that begin with the character 's' and end with the characters 'al'
}
print("@matches\n");# output: signal social
__DATA__
acute
city
local
precise
signal
social
twice
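When only whole words are wanted, the anchors can also be used as a filter, so that $& is not needed. The following is a small sketch using grep, assuming the same word list as above; the array names are illustrative.

```perl
use strict;
use warnings;

my @words = qw(acute city local precise signal social twice);

# grep keeps every word for which the pattern matches.
my @starts_ci = grep { /^ci/ }     @words;   # anchored at the start
my @ends_ce   = grep { /ce$/ }     @words;   # anchored at the end
my @s_to_al   = grep { /^s.*al$/ } @words;   # anchored at both ends

print("start with ci: @starts_ci\n");
print("end with ce: @ends_ce\n");
print("s...al: @s_to_al\n");
```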
@matches = ();
while(<DATA>){
    chomp;
    push (@matches,$&) if ( $_ =~ /.*c[ae].*/g ); # match all words that contain the characters 'ca' or 'ce'
}
print("@matches\n"); # output: local twice
__DATA__
acute
city
local
precise
signal
social
twice

The character ^ at the start of a character class negates all its characters or ranges. [0-9] matches any digit, [^0-9] matches any character that is not a digit. [ae] matches the character 'a' or 'e', [^ae] matches any character that is not 'a' or 'e'.
class | meaning | metacharacter |
---|---|---|
[0-9] | digit | \d |
[^0-9] | non-digit | \D |
[_a-zA-Z0-9] | word character | \w |
[^_a-zA-Z0-9] | non-word character | \W |
[ \r\t\n\f] | whitespace | \s |
[^ \r\t\n\f] | non-whitespace | \S |
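The correspondence in the table can be verified on a small sample. This is a sketch that assumes plain ASCII input (on Unicode strings the metacharacters may match more than the listed classes); the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $sample = "ab_12 \t-";

# Each entry pairs an explicit character class with its metacharacter.
my %pairs = (
    '\d' => [ qr/[0-9]/,        qr/\d/ ],
    '\w' => [ qr/[_a-zA-Z0-9]/, qr/\w/ ],
    '\s' => [ qr/[ \r\t\n\f]/,  qr/\s/ ],
);

my $all_equal = 1;
for my $name (sort keys %pairs) {
    my ($class, $meta) = @{ $pairs{$name} };
    my @by_class = $sample =~ /$class/g;
    my @by_meta  = $sample =~ /$meta/g;
    $all_equal = 0 if join(",", @by_class) ne join(",", @by_meta);
    print("$name matched ", scalar(@by_meta), " characters\n");
}
print($all_equal ? "All pairs agree\n" : "Some pairs differ\n");
```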
$str = "012.1";
$pattern = "[0-9]{3}";
($str =~ /$pattern/ ) ? (print("Match! $&\n")) : (print("No match...\n"));# output: Match! 012
$str = "012.1";
$pattern = "[0-9]{4}";
($str =~ /$pattern/ ) ? (print("Match! $&\n")) : (print("No match...\n"));# output: No match...
$str = "012.1";
$pattern = "[0-9.]{4}";
($str =~ /$pattern/ ) ? (print("Match! $&\n")) : (print("No match...\n"));# output: Match! 012.
$str = "012.1";
$pattern = "^[0-9]{3}\$";
($str =~ /$pattern/ ) ? (print("Match! $&\n")) : (print("No match...\n"));# output: No match...
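When a pattern is kept in a string variable, as in the examples above, the quoting style affects what the regex engine sees: inside double quotes a backslash before a dot is dropped. The following sketch compares double quotes, single quotes and qr//; the test string and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $str = "012x1";    # deliberately contains no literal dot

my $double = "[0-9]{3}\.[0-9]";    # backslash is dropped: [0-9]{3}.[0-9]
my $single = '[0-9]{3}\.[0-9]';    # backslash is preserved
my $quoted = qr/[0-9]{3}\.[0-9]/;  # qr// compiles the pattern as written

my $double_hit = ($str =~ /$double/) ? 1 : 0;
my $single_hit = ($str =~ /$single/) ? 1 : 0;
my $quoted_hit = ($str =~ $quoted)   ? 1 : 0;

print("double quotes: $double_hit\n");
print("single quotes: $single_hit\n");
print("qr//:          $quoted_hit\n");
```

Only the double-quoted version matches here, because its unescaped dot accepts the 'x'.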
$str = "012.1";
$pattern = '[0-9]{3}\.[0-9]{1}'; # Notice that the dot is escaped! Single quotes keep the backslash intact.
($str =~ /$pattern/ ) ? (print("Match! $&\n")) : (print("No match...\n")); # output: Match! 012.1

Match whole words with the metacharacter \b:

@m_arr = ();
$str = "Wally read the wallpaper on a wall!";
print("@m_arr" . "\n") if (@m_arr = $str =~ m/\bwall\b/gi); # output: wall

@m_arr = ();
$str = "Wall y read the wallpaper on a wall!";
print("@m_arr" . "\n") if (@m_arr = $str =~ m/\bwall\b/gi); # output: Wall wall
$str = "197 197";
$pattern = '(\d{3}) \1'; # match a three-digit pattern and its repetition
if ($str =~ /$pattern/) {
    print("Match! Number: $1 was found twice.\n");
}
else {
    print("No match...\n");
}

Another example:
$str = "19:07:55";
$pattern = '(\d{2}):(\d{2}):(\d{2})';
if ($str =~ /$pattern/ ) {
    ($hour, $minutes, $seconds) = ($1, $2, $3);
    print("Match! Time: $hour:$minutes:$seconds\n"); # the special variable $& contains 19:07:55
}
else {
    print("No match...\n");
}
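The groups can also be assigned in one step, because a match in list context returns its captured groups. This is a minimal sketch of that variant; it relies on the list being empty (and $hour undefined) when the match fails.

```perl
use strict;
use warnings;

my $str = "19:07:55";

# In list context the match returns ($1, $2, $3) directly.
my ($hour, $minutes, $seconds) = $str =~ /(\d{2}):(\d{2}):(\d{2})/;

if (defined $hour) {
    print("Match! Time: $hour:$minutes:$seconds\n");
}
else {
    print("No match...\n");
}
```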
$str = "Explore the versatility of Windows operating system.";
$old = "Windows";
$new = "Linux";
$str =~ s/$old/$new/;
print("$str\n");
$str = "Da, da, da!";
print("$str\n") if ($str =~ s/da/do/);# output: Da, do, da!
$str = "Da, da, da!";
print("$str\n") if ($str =~ s/da/do/g);# output: Da, do, do!
$str = "Da, da, da!";
print(ucfirst($str) . "\n") if ($str =~ s/da/do/gi);# output: Do, do, do!
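The substitution operator returns the number of replacements it made, which is why it can be used as a condition in the examples above. A small sketch that stores this count; $count is an illustrative name.

```perl
use strict;
use warnings;

my $str = "Da, da, da!";

# s///g returns the number of substitutions (an empty string if none).
my $count = ($str =~ s/da/do/gi);

print("$count replacements: $str\n");
```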
$str = " trim_string ";
$str =~ s/^\s+//;
$str =~ s/\s+$//;
print("|$str|\n"); # output: |trim_string|

It can be written a bit more concisely using the | operator:

$str = " trim_string ";
$str =~ s/(^\s+|\s+$)//g;
print("|$str|\n"); # output: |trim_string|

The 'lookahead anchor' mentioned in 10.1.1 is useful for replacing specific characters that are followed by a specific pattern. In the next example, --- is replaced only if it is followed by ### (notice that ### is not included in the match).

$str = "---##---#---###";
$str =~ s/---(?=###)/xxx/g;
print("$str\n"); # output: ---##---#xxx###

There is also a negative 'lookahead anchor'. If you want to replace specific characters that are not followed by a specific pattern, use (?!regex):
$str = "---##---#---###";
$str =~ s/---(?!###)/xxx/g;
print("$str\n");# output: xxx##xxx#---###
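Lookbehind works the same way in the other direction: (?<=regex) requires that the match is preceded by a pattern, and (?<!regex) requires that it is not. The following sketch applies both to the same string; the choice of '#' as the preceding pattern is an assumption made for illustration.

```perl
use strict;
use warnings;

my $str = "---##---#---###";

# Replace --- only when it is directly preceded by '#'.
(my $after_hash = $str) =~ s/(?<=#)---/xxx/g;

# Replace --- only when it is NOT directly preceded by '#'.
(my $not_after_hash = $str) =~ s/(?<!#)---/xxx/g;

print("$after_hash\n");
print("$not_after_hash\n");
```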
How to use a regex modifier dynamically?

Match example:

$mod = "i"; # case insensitive
$pattern = "(?$mod)reinier";
$str = "My name is Reinier reinier!";
print("The name 'reinier' was found! " . scalar(@all) . " times \n") if (@all = $str =~ /$pattern/g); # The name 'reinier' was found! 2 times

$mod = "i"; # case insensitive
$pattern = "rei((?$mod)N)ier"; # case insensitive on a selection, here the character N
$str = "My name is reinier reiNier!";
print("The name 'reinier' was found! " . scalar(@all) . " times \n") if (@all = $str =~ /$pattern/g); # The name 'reinier' was found! 2 times

Substitution example:

$mod = "i"; # case insensitive
$str = "Da, da, da!";
$pattern = "(?$mod:da)";
print("$str\n") if ($str =~ s/$pattern/do/g); # output: do, do, do!
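Since the modifier arrives as data, it can be worth checking it before it is interpolated into a pattern. The following is a hedged sketch, not a definitive approach: build_pattern() is a hypothetical helper, and the accepted set of modifiers (i, m, s, x) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: compile $text with optional inline modifiers.
sub build_pattern {
    my ($text, $mod) = @_;
    die "unsupported modifier '$mod'\n" unless $mod =~ /^[imsx]*$/;
    return length($mod) ? qr/(?$mod)$text/ : qr/$text/;
}

my $str = "My name is Reinier reinier!";

my $re_ci  = build_pattern("reinier", "i");
my @all_ci = $str =~ /$re_ci/g;

my $re_cs  = build_pattern("reinier", "");
my @all_cs = $str =~ /$re_cs/g;

print("case insensitive: ", scalar(@all_ci), " matches\n");
print("case sensitive:   ", scalar(@all_cs), " matches\n");
```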