regular expression perl cheat sheet

2 min read 17-10-2024

Perl Regular Expression Cheat Sheet: Mastering Text Manipulation

Regular expressions (regex) are powerful tools for searching, manipulating, and validating text data. Perl is renowned for its robust regex engine, making it a popular choice for tasks involving text processing. This cheat sheet provides a concise overview of essential Perl regex patterns and their applications.

Fundamental Syntax

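Matching Characters:
- . : Matches any single character except newline (with the /s modifier it matches newline too).
- \d: Matches any digit (0-9). By default it can also match non-ASCII Unicode digits; add the /a modifier to restrict it to 0-9.
- \w: Matches any word character (a-z, A-Z, 0-9, and underscore).
- \s: Matches any whitespace character (space, tab, newline).

Character Classes:
- [abc]: Matches any one of the characters listed within the brackets. Ranges are allowed (e.g., [a-z] matches any lowercase ASCII letter).
- [^abc]: Matches any single character not listed within the brackets.
- \b: Matches a word boundary. Unlike the entries above, it is a zero-width assertion: it matches a position between a word character and a non-word character, not a character itself.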

Example:

my $string = "The quick brown fox jumps over the lazy dog.";
if ($string =~ /quick \w+/) {
    print "Found the word 'quick'!\n";
}

This code snippet uses the regex /quick \w+/ to find the word "quick" followed by a space and one or more word characters. In this string it matches "quick brown".
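
To see how the character classes and \b behave on different inputs, here is a minimal sketch. The describe helper and its labels are hypothetical names chosen for illustration; only the patterns themselves come from the lists above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns labels for the patterns that match
# somewhere in the given text.
sub describe {
    my ($text) = @_;
    my @found;
    push @found, 'digit'      if $text =~ /\d/;        # any digit
    push @found, 'whitespace' if $text =~ /\s/;        # space, tab, newline
    push @found, 'non-vowel'  if $text =~ /[^aeiou]/;  # negated class
    push @found, 'word cat'   if $text =~ /\bcat\b/;   # "cat" as a whole word
    return @found;
}

print join(', ', describe('cat 9')), "\n";        # digit, whitespace, non-vowel, word cat
print join(', ', describe('concatenate')), "\n";  # non-vowel
```

The second call shows why \b matters: "concatenate" contains the letters "cat", but not as a separate word, so /\bcat\b/ does not match.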

Repetition

*: Matches zero or more repetitions of the preceding character or group.
+: Matches one or more repetitions.
?: Matches zero or one repetition.
{n}: Matches exactly n repetitions.
{n,}: Matches at least n repetitions.
{n,m}: Matches between n and m repetitions.

Example:

my $string = "123-456-7890";
if ($string =~ /^\d{3}-\d{3}-\d{4}$/) {
    print "Valid phone number format!\n";
}

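This example uses the regex ^\d{3}-\d{3}-\d{4}$ to validate a phone number in the format XXX-XXX-XXXX. The ^ and $ anchors ensure that the whole string must match the pattern rather than just a substring. Note that $ also tolerates a single trailing newline; use \z instead if that must be rejected.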
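
The remaining quantifiers can be combined in one pattern. The sketch below assumes a made-up ID format (2 to 3 uppercase letters, an optional hyphen, then at least 4 digits); both the format and the is_valid_id name are hypothetical and exist only to exercise *, ?, {n,m}, and {n,}.

```perl
use strict;
use warnings;

# Hypothetical ID format, invented for this example:
#   \s*         zero or more leading whitespace characters
#   [A-Z]{2,3}  between 2 and 3 uppercase letters
#   -?          zero or one hyphen
#   \d{4,}      at least 4 digits
#   \s*         zero or more trailing whitespace characters
sub is_valid_id {
    my ($id) = @_;
    return $id =~ /^\s*[A-Z]{2,3}-?\d{4,}\s*$/ ? 1 : 0;
}

for my $id ('AB-1234', 'ABC98765', 'A-1234', 'AB-123') {
    printf "%-10s %s\n", $id, is_valid_id($id) ? 'ok' : 'rejected';
}
```

The first two IDs are accepted; "A-1234" has too few letters and "AB-123" has too few digits, so both are rejected.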

Grouping and Alternatives

( ): Groups part of the regex, captures the matched text for later reference ($1, $2, and so on), and lets a repetition operator apply to the whole group instead of a single character.
|: Matches either one of the alternatives separated by the pipe character.

Example:

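my $string = "This is an example string.";
# The leading \b keeps "is" from matching inside "This".
if ($string =~ /\b(is|an) \w+/) {
    print "Found 'is' or 'an' followed by a word!\n";
}

This regex matches either the word "is" or the word "an", followed by a space and another word. The leading \b prevents a match on the "is" at the end of "This"; without it the regex would match "is is" starting inside "This". The alternative that matched is available in $1.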
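
Groups are also how matched text is pulled out of a string, and how a quantifier is applied to more than one character. A minimal sketch follows; the parse_flag and is_laugh helpers and the "name=value" line format are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical "name=value" parser: group 1 captures the name,
# group 2 captures one of four allowed alternatives.
sub parse_flag {
    my ($line) = @_;
    if ($line =~ /^(\w+)=(yes|no|on|off)$/) {
        return ($1, $2);
    }
    return;   # empty list when the line does not match
}

# A quantifier after a group repeats the whole group, not just one character.
sub is_laugh {
    my ($text) = @_;
    return $text =~ /^(ha)+$/ ? 1 : 0;
}

my ($name, $value) = parse_flag('verbose=on');
print "$name is set to $value\n";                      # verbose is set to on
print is_laugh('hahaha') ? "laugh\n" : "no laugh\n";   # laugh
print is_laugh('hah')    ? "laugh\n" : "no laugh\n";   # no laugh
```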

Backreferences

\1: Matches the same text that the first capture group matched.
\2: Matches the same text that the second capture group matched, and so on.

Example:

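my $string = "This string contains repeated repeated words.";
# \b on both sides forces whole-word matches. Without them, the "s" ending
# "This" and the "s" starting "string" would be reported as a repeat.
if ($string =~ /\b(\w+)\s+\1\b/) {
    print "The word '$1' is repeated!\n";
}

This regex uses a backreference to find a word that appears twice in a row. The first group (\w+) captures a word, \s+ matches the whitespace after it, and the backreference \1 requires the same text to follow immediately. Here it reports "repeated".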
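
Backreferences are also handy when the closing delimiter must equal the opening one, and \2 works the same way as \1. The sketch below is illustrative only; extract_quoted and has_mirror are hypothetical helper names.

```perl
use strict;
use warnings;

# Group 1 captures the opening quote (single or double); \1 then requires
# the same character as the closing quote. Group 2 is the quoted text.
sub extract_quoted {
    my ($text) = @_;
    return $text =~ /(["'])([^"']*)\1/ ? $2 : undef;
}

# Two captured characters followed by the same two in reverse order,
# as in "abba" or "noon".
sub has_mirror {
    my ($text) = @_;
    return $text =~ /(\w)(\w)\2\1/ ? 1 : 0;
}

print extract_quoted(q{say "hello" now}) // 'none', "\n";   # hello
print extract_quoted(q{say "oops' now})  // 'none', "\n";   # none
print has_mirror('noon'), has_mirror('abab'), "\n";         # 10
```

The second call returns nothing because the string opens with a double quote and closes with a single quote, so \1 never finds its partner.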

Lookarounds

(?= ): Positive lookahead – asserts that the enclosed pattern matches immediately after the current position, without consuming any characters.
(?! ): Negative lookahead – asserts that the enclosed pattern does not match immediately after the current position.
(?<= ): Positive lookbehind – asserts that the enclosed pattern matches immediately before the current position, without consuming any characters.
(?<! ): Negative lookbehind – asserts that the enclosed pattern does not match immediately before the current position.

Example:

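my $string = "This is a test word.";
# No leading \b here: a word boundary before "word" would already rule out
# a preceding digit and make the lookbehind redundant.
if ($string =~ /(?<!\d)word\b/) {
    print "Found 'word' that is not preceded by a digit!\n";
}

This regex uses a negative lookbehind to find "word" only where the character immediately before it is not a digit. It matches in "test word" but would not match in "2word".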
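
The other three lookarounds follow the same idea. Here is a minimal sketch; the helper names and sample strings are made up for illustration. In list context with /g, a match returns every matched substring, and the lookaround text itself is never part of the result.

```perl
use strict;
use warnings;

# Positive lookbehind: digits that come right after a dollar sign.
sub dollar_amounts {
    my ($text) = @_;
    my @found = $text =~ /(?<=\$)\d+/g;
    return @found;
}

# Positive lookahead: digits that are immediately followed by "px".
sub pixel_values {
    my ($text) = @_;
    my @found = $text =~ /\d+(?=px)/g;
    return @found;
}

# Negative lookahead: "foo" that is not immediately followed by "bar".
sub has_foo_without_bar {
    my ($text) = @_;
    return $text =~ /foo(?!bar)/ ? 1 : 0;
}

print join(',', dollar_amounts('Costs $15 or 20 EUR or $7')), "\n";   # 15,7
print join(',', pixel_values('10px 20em 300px')), "\n";               # 10,300
print has_foo_without_bar('foobar'), has_foo_without_bar('foobaz'), "\n";   # 01
```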

Further Learning Resources

This cheat sheet serves as a starting point for understanding Perl regular expressions. To master the art of regex, delve deeper into these resources:

Perl documentation: https://perldoc.perl.org/perlre.html
Regex101: https://regex101.com/
Regular-Expressions.info: https://www.regular-expressions.info/

With practice and exploration, you will be able to use Perl's powerful regex engine to solve complex text manipulation challenges.
