regular expression perl cheat sheet

2 min read 17-10-2024

Perl Regular Expression Cheat Sheet: Mastering Text Manipulation

Regular expressions (regex) are powerful tools for searching, manipulating, and validating text data. Perl is renowned for its robust regex engine, making it a popular choice for tasks involving text processing. This cheat sheet provides a concise overview of essential Perl regex patterns and their applications.

Fundamental Syntax

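  • Matching Characters:
    • . : Matches any single character except newline (with the /s modifier it matches newline as well).
    • \d: Matches any digit. By default this includes digits from other scripts, not only 0-9; add the /a modifier to restrict it to ASCII 0-9.
    • \w: Matches any word character (a-z, A-Z, 0-9, and underscore; under Unicode rules, letters and digits from other scripts too).
    • \s: Matches any whitespace character (space, tab, newline, and similar).
  • Character Classes:
    • [abc]: Matches any one of the characters listed within the brackets. A hyphen gives a range (e.g., [a-z] matches any lowercase ASCII letter).
    • [^abc]: Matches any one character not listed within the brackets.
  • Assertions:
    • \b: Matches a word boundary, the zero-width position between a \w character and a non-\w character (or the start or end of the string).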

Example:

my $string = "The quick brown fox jumps over the lazy dog.";
if ($string =~ /quick \w+/) {
    print "Found 'quick' followed by another word!\n";
}

This code snippet uses the regex /quick \w+/ to find "quick", a single space, and then one or more word characters. Here it matches "quick brown".
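The word boundary and the negated character class from the list above can be sketched in a few lines. The helper name count_whole_word and the sample strings are made up for illustration; \Q...\E is used so that any regex metacharacters in the search word are treated literally.

```perl
use strict;
use warnings;

# Hypothetical helper: counts whole-word occurrences of $word in $text.
sub count_whole_word {
    my ($text, $word) = @_;
    # \b on both sides keeps "cat" from matching inside "concatenate";
    # in list context, /g returns every match.
    my @hits = $text =~ /\b\Q$word\E\b/g;
    return scalar @hits;
}

print count_whole_word("cat concatenate cat, bobcat", "cat"), "\n";  # prints 2

# A negated class: delete everything that is not an ASCII digit.
my $digits = "(123) 456-7890";
$digits =~ s/[^0-9]//g;
print "$digits\n";  # prints 1234567890
```

Without the two \b assertions, the count would also include the "cat" inside "concatenate" and "bobcat".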

Repetition

  • *: Matches zero or more repetitions of the preceding character or group.
  • +: Matches one or more repetitions.
  • ?: Matches zero or one repetition.
  • {n}: Matches exactly n repetitions.
  • {n,}: Matches at least n repetitions.
  • {n,m}: Matches between n and m repetitions.

Example:

my $string = "123-456-7890";
if ($string =~ /^\d{3}-\d{3}-\d{4}$/) {
    print "Valid phone number format!\n";
}

This example uses the regex ^\d{3}-\d{3}-\d{4}$ to validate a phone number in the format XXX-XXX-XXXX. The ^ and $ anchors ensure that the entire string matches the pattern, although $ also tolerates a single trailing newline; use \z instead to forbid it.
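One detail worth seeing in action: the repetition operators are greedy by default, and appending ? to any of them makes it match as little as possible. The following is a minimal sketch with made-up sample data, also showing a {n,m} bound anchored with \A and \z.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs on to the last ">" in the string.
my ($greedy) = $html =~ /<(.+)>/;
# Non-greedy: .+? stops at the first ">".
my ($lazy) = $html =~ /<(.+?)>/;

print "$greedy\n";  # prints b>bold</b> and <i>italic</i
print "$lazy\n";    # prints b

# Hypothetical helper: true if $code consists of 2 to 4 digits and nothing else.
sub is_short_code {
    my ($code) = @_;
    return $code =~ /\A\d{2,4}\z/ ? 1 : 0;
}

for my $code (qw(7 42 1234 12345)) {
    printf "%-5s %s\n", $code, is_short_code($code) ? "ok" : "rejected";
}
```

The loop reports 42 and 1234 as ok and rejects 7 and 12345, since {2,4} allows neither fewer than two nor more than four repetitions.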

Grouping and Alternatives

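  • ( ): Groups part of the regex so that a repetition operator applies to the whole group, and captures the matched text for later reference ($1, $2, ... outside the pattern; \1, \2, ... inside it). Use (?: ) to group without capturing.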
  • |: Matches either one of the alternatives separated by the pipe character.

Example:

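my $string = "This is an example string.";
if ($string =~ /\b(is|an) \w+/) {
    print "Found '$1' followed by a word!\n";
}

This regex matches either the word "is" or the word "an", followed by a space and another word, and prints a message naming the alternative that matched (here "is", from "is an"). The leading \b matters: without it, the pattern would first match the "is" inside "This".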
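Groups do two jobs, grouping and capturing, and the difference is easiest to see side by side. This is a small sketch with hypothetical input strings and a made-up helper name; (?: ) groups an alternation without creating a capture, so the numbering of the remaining groups is unaffected.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name that follows a title, or undef.
sub name_after_title {
    my ($s) = @_;
    # (?:Mr|Ms|Dr) only groups the alternatives; the name is still $1.
    return $s =~ /^(?:Mr|Ms|Dr)\. (\w+)/ ? $1 : undef;
}

print name_after_title("Dr. Wall"), "\n";  # prints Wall

# In list context a match returns its captures, one per capturing group.
my ($key, $value) = "color=red" =~ /^(\w+)=(\w+)$/;
print "$key -> $value\n";  # prints color -> red
```

Because | has the lowest precedence of any regex operator, the alternation needs a group around it: /^Mr|Ms|Dr\. / would mean "Mr at the start, or Ms anywhere, or Dr. anywhere".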

Backreferences

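  • \1: Matches the same text that the first capturing group matched.
  • \2: Matches the same text that the second capturing group matched, and so on. Outside the pattern, the captured text is available as $1, $2, and so on.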

Example:

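my $string = "This string contains contains repeated words.";
if ($string =~ /\b(\w+)\s+\1\b/) {
    print "The word '$1' is repeated!\n";
}

This regex uses a backreference to find a word that appears twice in a row. The group (\w+) captures a word, and the backreference \1 requires the same text to appear again right after the intervening whitespace. The \b anchors keep the match to whole words; without them, the "s" ending "This" and the "s" starting "string" would count as a repeat.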
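Backreferences are also handy in substitutions, where \1 is used inside the pattern and $1 in the replacement. As a rough sketch (the helper name collapse_doubles and the sample sentence are invented for this example), the following collapses any run of a repeated word into a single copy:

```perl
use strict;
use warnings;

# Hypothetical helper: collapses consecutive repeats of a word into one copy.
sub collapse_doubles {
    my ($text) = @_;
    # (\w+) captures a whole word; (?:\s+\1\b)+ matches one or more further
    # copies of it. The /i flag lets "The the" count as a repeat too.
    $text =~ s/\b(\w+)(?:\s+\1\b)+/$1/gi;
    return $text;
}

print collapse_doubles("Paris in the the spring"), "\n";  # prints Paris in the spring
```

The \b after \1 prevents a false positive on input such as "the theory", where the second word merely starts with the first.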

Lookarounds

  • (?= ): Positive lookahead – succeeds only if the enclosed pattern matches immediately after the current position, without consuming any characters.
  • (?! ): Negative lookahead – succeeds only if the enclosed pattern does not match immediately after the current position.
  • (?<= ): Positive lookbehind – succeeds only if the enclosed pattern matches immediately before the current position, without consuming any characters.
  • (?<! ): Negative lookbehind – succeeds only if the enclosed pattern does not match immediately before the current position.

Example:

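my $string = "A 4word is not a word.";
if ($string =~ /(?<!\d)word\b/) {
    print "Found 'word' that is not preceded by a digit!\n";
}

This regex uses a negative lookbehind to find "word" only where the character before it is not a digit, so it skips "4word" and matches the final "word". There is no \b in front of the lookbehind: a digit is a word character, so a leading \b would already reject "4word" and make the lookbehind redundant.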
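Because lookarounds match a position rather than text, they can be combined to insert something between characters. The sketch below is one common way to add thousands separators, followed by a negative lookahead used as a filter; the helper name add_commas and the file names are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: inserts commas into a string of digits.
sub add_commas {
    my ($n) = @_;
    # Match every position that has a digit before it (lookbehind) and a
    # multiple of three digits between it and the end of the number (lookahead).
    $n =~ s/(?<=\d)(?=(?:\d{3})+\b)/,/g;
    return $n;
}

print add_commas("1234567"), "\n";  # prints 1,234,567

# Negative lookahead: keep file names whose extension is not "bak".
my @kept = grep { /\.(?!bak\z)\w+\z/ } qw(a.txt b.bak c.pl);
print "@kept\n";  # prints a.txt c.pl
```

Neither lookaround consumes characters, so the substitution replaces an empty match and only adds commas; no digit is removed or rewritten.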

Further Learning Resources

This cheat sheet serves as a starting point for understanding Perl regular expressions. To go deeper, the documentation that ships with Perl covers the full syntax: see perldoc perlrequick, perldoc perlretut, and perldoc perlre.

With practice and exploration, you will be able to use Perl's powerful regex engine to solve complex text manipulation challenges.
