Python re Module

The re module in Python provides support for working with regular expressions, which are patterns used to match character combinations in strings. Regular expressions are used for searching, matching, and manipulating strings based on specific patterns.

Table of Contents

Introduction
re Module Functions
- re.compile
- re.search
- re.match
- re.fullmatch
- re.split
- re.findall
- re.finditer
- re.sub
- re.subn
Regular Expression Syntax
Examples
- Basic Usage
- Using Groups and Capturing
- Using Flags
- Advanced Substitution
Real-World Use Case
Conclusion

Introduction

A regular expression describes a pattern of text. The functions in the re module take such a pattern and search for it in a string, split the string around it, or replace the text it matches.

re Module Functions

re.compile

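Compiles a regular expression pattern into a pattern object. The object's methods (search, match, findall, sub, and so on) mirror the module-level functions, so a pattern that is used repeatedly only needs to be compiled once.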

import re

pattern = re.compile(r'\d+')
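
To illustrate how a compiled pattern is reused, here is a minimal sketch. The variable names and sample strings are illustrative and not part of the original example.

```python
import re

# Compile once, then call methods on the pattern object.
digits = re.compile(r'\d+')

first = digits.search('Sample123String')     # first match anywhere in the string
all_numbers = digits.findall('a1b22c333')    # every match, as a list of strings

print(first.group())
print(all_numbers)
```

The pattern object's methods take the same arguments as the module-level functions, minus the pattern itself.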

re.search

Scans the string for the first location where the pattern matches. Returns a match object if one is found, otherwise None.

import re

result = re.search(r'\d+', 'Sample123String')
print(result.group())

Output:

123

re.match

Checks for a match only at the beginning of the string. Returns a match object if one is found, otherwise None.

import re

result = re.match(r'\d+', '123Sample')
print(result.group())

Output:

123
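
Both re.search and re.match return None when nothing matches, so calling .group() directly on the result raises AttributeError in that case. The following sketch contrasts the two functions on the same input and checks for None first. The sample string is illustrative.

```python
import re

text = 'Sample123'

# re.match only tries position 0, so the leading letters prevent a match.
at_start = re.match(r'\d+', text)
# re.search scans forward and finds the digits later in the string.
anywhere = re.search(r'\d+', text)

print(at_start)
if anywhere is not None:
    print(anywhere.group())
```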

re.fullmatch

Matches only if the entire string matches the pattern. Returns a match object if it does, otherwise None.

import re

result = re.fullmatch(r'\d+', '123')
print(result.group())

Output:

123
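
The difference from re.match shows up when extra text follows the matching part. A small sketch with an illustrative input:

```python
import re

# re.match accepts a matching prefix; re.fullmatch requires the whole string to match.
prefix_ok = re.match(r'\d+', '123abc')
whole = re.fullmatch(r'\d+', '123abc')

print(prefix_ok.group())
print(whole)
```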

re.split

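Splits the string at each occurrence of the pattern and returns the pieces as a list. If the string ends with a match, the last element is an empty string.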

import re

result = re.split(r'\d+', 'Sample123String456Another789')
print(result)

Output:

['Sample', 'String', 'Another', '']
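
Two options of re.split are often useful: the maxsplit argument and capturing groups in the pattern. The sketch below shows both, along with one way to drop empty pieces. The variable names are illustrative.

```python
import re

text = 'Sample123String456Another789'

# maxsplit limits how many splits are performed.
first_only = re.split(r'\d+', text, maxsplit=1)
# A capturing group keeps the separators in the result.
with_separators = re.split(r'(\d+)', 'a1b2c')
# Filtering drops the empty string produced by a match at the end.
non_empty = [part for part in re.split(r'\d+', text) if part]

print(first_only)
print(with_separators)
print(non_empty)
```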

re.findall

Finds all non-overlapping matches of the pattern in the string. Returns a list of matches.

import re

result = re.findall(r'\d+', 'Sample123String456Another789')
print(result)

Output:

['123', '456', '789']
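
When the pattern contains capturing groups, re.findall returns the group contents rather than the whole match. A brief sketch with illustrative inputs:

```python
import re

# One group: a list of that group's text.
sizes = re.findall(r'(\d+)px', '10px 20px')
# Several groups: a list of tuples, one tuple per match.
pairs = re.findall(r'(\w+)=(\d+)', 'a=1, b=22')

print(sizes)
print(pairs)
```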

re.finditer

Finds all non-overlapping matches of the pattern in the string. Returns an iterator yielding match objects.

import re

result = re.finditer(r'\d+', 'Sample123String456Another789')
for match in result:
    print(match.group())

Output:

123
456
789
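
Because re.finditer yields match objects, each match's position is available as well as its text. A minimal sketch (the input and the positions list are illustrative):

```python
import re

# Collect the matched text together with its start and end offsets.
positions = [
    (m.group(), m.start(), m.end())
    for m in re.finditer(r'\d+', 'Sample123String456')
]

print(positions)
```

The end offset is exclusive, so text[start:end] returns the matched substring.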

re.sub

Replaces every non-overlapping occurrence of the pattern with a replacement string and returns the new string.

import re

result = re.sub(r'\d+', '#', 'Sample123String456Another789')
print(result)

Output:

Sample#String#Another#
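
The replacement string can refer back to captured groups with \1, \2, and so on, and the count argument limits how many replacements are made. A short sketch with illustrative inputs:

```python
import re

# Backreferences reuse captured text in the replacement.
swapped = re.sub(r'(\d+)-(\d+)', r'\2-\1', '10-20')
# count limits the number of replacements (passed by keyword here).
first_only = re.sub(r'\d+', '#', 'a1b2c3', count=1)

print(swapped)
print(first_only)
```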

re.subn

Replaces occurrences of the pattern with a replacement string. Returns a tuple containing the new string and the number of replacements.

import re

result = re.subn(r'\d+', '#', 'Sample123String456Another789')
print(result)

Output:

('Sample#String#Another#', 3)

Regular Expression Syntax

Regular expressions use special characters to define patterns. Here are some commonly used special characters:

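.: Matches any character except a newline (any character at all when the re.DOTALL flag is set).
^: Matches the start of the string (and the start of each line when re.MULTILINE is set).
$: Matches the end of the string or just before a trailing newline (and the end of each line when re.MULTILINE is set).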
*: Matches 0 or more repetitions of the preceding pattern.
+: Matches 1 or more repetitions of the preceding pattern.
?: Matches 0 or 1 repetition of the preceding pattern.
{n}: Matches exactly n repetitions of the preceding pattern.
{n,}: Matches n or more repetitions of the preceding pattern.
{n,m}: Matches between n and m repetitions of the preceding pattern.
[]: Matches any one of the characters inside the brackets.
|: Matches either the pattern before or the pattern after the |.
(): Creates a group for extracting or manipulating the matched text.
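
The special characters above can be tried out one at a time. The following sketch pairs each with a small input; all strings and variable names are illustrative.

```python
import re

# '.' matches any character except a newline.
dot = re.findall(r'c.t', 'cat cot c\nt')
# '*' allows zero repetitions, '+' requires at least one.
star = re.findall(r'ab*', 'a ab abbb')
plus = re.findall(r'ab+', 'a ab abbb')
# '?' makes the preceding character optional.
optional = re.findall(r'colou?r', 'color colour')
# '{n,m}' bounds the repetition count.
bounded = re.findall(r'\d{2,3}', '1 22 333')
# '[]' matches one character from a set; '|' chooses between alternatives.
vowels = re.findall(r'[aeiou]', 'regex')
either = re.findall(r'cat|dog', 'cat, dog, cow')
# '^' and '$' anchor the pattern to the start and end of the string.
anchored = re.search(r'^\d+$', '2024') is not None

print(dot, star, plus, optional, bounded, vowels, either, anchored)
```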

Examples

Basic Usage

Find every run of digits in a string.

import re

pattern = re.compile(r'\d+')
matches = pattern.findall('Sample123String456Another789')
print(matches)

Output:

['123', '456', '789']

Using Groups and Capturing

Use groups to capture parts of the match.

import re

pattern = re.compile(r'(\d+)-(\d+)-(\d+)')
match = pattern.search('Phone number: 123-456-7890')
if match:
    print(match.groups())

Output:

('123', '456', '7890')
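
Groups can also be given names with the (?P<name>...) syntax, which makes the extracted parts easier to read than positional indexes. A sketch using the same phone number; the group names area, prefix, and line are chosen here for illustration.

```python
import re

pattern = re.compile(r'(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})')
match = pattern.search('Phone number: 123-456-7890')

if match:
    area = match.group('area')   # access one group by name
    parts = match.groupdict()    # all named groups as a dict
    print(area)
    print(parts)
```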

Using Flags

Use flags to modify the behavior of the pattern.

import re

pattern = re.compile(r'sample', re.IGNORECASE)
matches = pattern.findall('Sample123String456sample789')
print(matches)

Output:

['Sample', 'sample']
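
Other flags work the same way, and several flags can be combined with the | operator. The sketch below uses re.MULTILINE, which makes ^ match at the start of every line; the sample text is illustrative.

```python
import re

text = 'first line\nSecond line'

# Without flags, ^ matches only at the very start of the string.
default = re.findall(r'^\w+', text)
# With re.MULTILINE, ^ also matches after each newline.
multiline = re.findall(r'^\w+', text, re.MULTILINE)
# Flags are combined with the bitwise OR operator.
combined = re.findall(r'^second', text, re.IGNORECASE | re.MULTILINE)

print(default)
print(multiline)
print(combined)
```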

Advanced Substitution

Use a function as the replacement argument in re.sub.

import re

def replace(match):
    return str(int(match.group()) * 2)

result = re.sub(r'\d+', replace, 'Sample123String456Another789')
print(result)

Output:

Sample246String912Another1578

Real-World Use Case

Validating Email Addresses

Use regular expressions to validate email addresses.

import re

def validate_email(email):
    pattern = re.compile(r'^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$')
    return bool(pattern.match(email))

emails = ['test@example.com', 'invalid-email', 'user@domain.com']
valid_emails = [email for email in emails if validate_email(email)]
print(valid_emails)

Output:

['test@example.com', 'user@domain.com']
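
One possible variation, sketched under the assumption that the same pattern is kept: compile it once at module level and use fullmatch instead of the ^ and $ anchors. The names EMAIL_RE and is_valid_email are hypothetical. The last line shows that this pattern is permissive, so it should be read as a basic format check rather than full validation.

```python
import re

# Same character classes as above, compiled once instead of on every call.
EMAIL_RE = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+')

def is_valid_email(email):
    # fullmatch requires the whole string to match, so ^ and $ are not needed.
    return EMAIL_RE.fullmatch(email) is not None

print(is_valid_email('test@example.com'))
print(is_valid_email('invalid-email'))
# A trailing newline is rejected here; with match() and $ it would be accepted.
print(is_valid_email('test@example.com\n'))
# Consecutive dots in the domain still pass, which shows the pattern's limits.
print(is_valid_email('user@domain..com'))
```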

Conclusion

The re module covers the common regular expression tasks in Python: searching and matching patterns, splitting strings, and performing substitutions. Knowing when to use search, match, and fullmatch, and how groups and flags change a pattern's behavior, makes string processing code shorter and easier to reason about.
