C++ Regular Expressions

Introduction

Regular expressions (regex) are sequences of characters that form search patterns. They are widely used for string matching and manipulation. In C++, the <regex> header provides support for regular expressions, so you can search, validate, split, and rewrite strings without hand-written parsing code.

Basic Concepts

  1. Regex Pattern: The sequence of characters that defines the search pattern.
  2. Regex Match: The result of applying the regex pattern to a string.
  3. Regex Token: A piece of the string picked out by the pattern, either a matched part (or one of its capture groups) or the text between two matches.

Regular Expressions in C++

C++ provides regular expression support through the <regex> header, which was added to the Standard Library in C++11. It supplies the classes and functions for working with regular expressions.

Key Classes and Functions

  • std::regex: Represents the regular expression.
  • std::smatch: Holds the results of a regex match on a std::string.
  • std::regex_match: Checks if a regex matches the entire string.
  • std::regex_search: Searches for a regex match within a string.
  • std::regex_replace: Replaces parts of a string that match a regex pattern.

Example 1: Simple Regex Match

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "Hello, World!";
    regex pattern("World");
    if (regex_search(str, pattern)) {
        cout << "Pattern found!" << endl;
    } else {
        cout << "Pattern not found." << endl;
    }
    return 0;
}

Output

Pattern found!

Explanation

  • std::regex pattern("World"): Defines a regex pattern to search for the word "World".
  • regex_search(str, pattern): Searches for the pattern in the string str.

Example 2: Regex Match and Capture Groups

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "My email is example@example.com";
    regex pattern("(\\w+)@(\\w+\\.\\w+)");
    smatch matches;
    if (regex_search(str, matches, pattern)) {
        cout << "Full match: " << matches[0] << endl;
        cout << "Username: " << matches[1] << endl;
        cout << "Domain: " << matches[2] << endl;
    } else {
        cout << "No match found." << endl;
    }
    return 0;
}

Output

Full match: example@example.com
Username: example
Domain: example.com

Explanation

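  • std::regex pattern("(\\w+)@(\\w+\\.\\w+)"): Defines a simplified pattern for an email address and captures the username and domain. Because \w covers only letters, digits, and underscores, an address with dots, hyphens, or plus signs before the @ is matched only partially.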
  • regex_search(str, matches, pattern): Searches for the pattern in the string str and stores the results in matches.
  • matches[0]: The full match.
  • matches[1]: The username (first capture group).
  • matches[2]: The domain (second capture group).

Example 3: Regex Replace

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "Hello, World!";
    regex pattern("World");
    string replaced = regex_replace(str, pattern, "C++");
    cout << "Original: " << str << endl;
    cout << "Replaced: " << replaced << endl;
    return 0;
}

Output

Original: Hello, World!
Replaced: Hello, C++!

Explanation

  • regex_replace(str, pattern, "C++"): Replaces all occurrences of the pattern "World" with "C++" in the string str.

Example 4: Validating Input with Regex

#include <iostream>
#include <regex>
#include <string>
using namespace std;

bool validatePhoneNumber(const string& phoneNumber) {
    regex pattern("^\\d{3}-\\d{3}-\\d{4}$");
    return regex_match(phoneNumber, pattern);
}

int main() {
    string phone1 = "123-456-7890";
    string phone2 = "123-45-6789";

    cout << phone1 << " is " << (validatePhoneNumber(phone1) ? "valid" : "invalid") << endl;
    cout << phone2 << " is " << (validatePhoneNumber(phone2) ? "valid" : "invalid") << endl;

    return 0;
}

Output

123-456-7890 is valid
123-45-6789 is invalid

Explanation

  • regex pattern("^\\d{3}-\\d{3}-\\d{4}$"): Defines a regex pattern to validate phone numbers in the format "123-456-7890".
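  • regex_match(phoneNumber, pattern): Checks if the entire phone number matches the pattern. Because regex_match always tests the whole string, the ^ and $ anchors are redundant here, though harmless.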

Example 5: Splitting a String with Regex

#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;

vector<string> split(const string& str, const string& regex_str) {
    regex pattern(regex_str);
    sregex_token_iterator it(str.begin(), str.end(), pattern, -1);
    sregex_token_iterator reg_end;
    return {it, reg_end};
}

int main() {
    string str = "one,two,three,four";
    vector<string> tokens = split(str, ",");

    for (const string& token : tokens) {
        cout << token << endl;
    }

    return 0;
}

Output

one
two
three
four

Explanation

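  • sregex_token_iterator walks over the string using the regex pattern. The last constructor argument, -1, selects the text between matches rather than the matches themselves, which is what turns the pattern into a delimiter.
  • The split function returns the pieces of str separated by the delimiter pattern given in regex_str.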

Example 6: Extracting Numbers from a String

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "There are 3 apples, 4 bananas, and 5 cherries.";
    regex pattern("\\d+");
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;

    while (it != reg_end) {
        cout << "Found number: " << it->str() << endl;
        ++it;
    }

    return 0;
}

Output

Found number: 3
Found number: 4
Found number: 5

Explanation

  • regex pattern("\\d+"): Defines a regex pattern to match one or more digits.
  • sregex_iterator is used to iterate through all matches of the pattern in the string str.

Example 7: Matching Multiple Lines

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "First line\nSecond line\nThird line";
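    // (?:^|\n) anchors each match to the start of the text or to the
    // character right after a newline; group 1 captures the word itself.
    regex pattern("(?:^|\\n)(\\w+)");
    smatch matches;
    string::const_iterator searchStart(str.cbegin());

    while (regex_search(searchStart, str.cend(), matches, pattern)) {
        cout << "Found: " << matches[1] << endl;
        searchStart = matches.suffix().first;
    }

    return 0;
}

Output

Found: First
Found: Second
Found: Third

Explanation

  • regex pattern("(?:^|\\n)(\\w+)"): Matches the first word of each line. By default ^ may match only at the very start of the searched text (this varies between standard library implementations), so the pattern also accepts a newline in front of the word and captures the word in group 1.
  • regex_search is called in a loop, and each search resumes from matches.suffix().first, the position just after the previous match.
  • matches[1]: The captured word, without the leading newline.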

Example 8: Using Regex Flags

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "Hello, world! Hello, Universe!";
    regex pattern("hello", regex_constants::icase);
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;

    while (it != reg_end) {
        cout << "Found: " << it->str() << endl;
        ++it;
    }

    return 0;
}

Output

Found: Hello
Found: Hello

Explanation

  • regex pattern("hello", regex_constants::icase): Defines a case-insensitive regex pattern to match "hello".

Example 9: Checking for Valid Email Addresses

#include <iostream>
#include <regex>
#include <string>
using namespace std;

bool isValidEmail(const string& email) {
    regex pattern(R"((\w+)(\.\w+)*@(\w+)(\.\w+)+)");
    return regex_match(email, pattern);
}

int main() {
    string email1 = "test.email@example.com";
    string email2 = "invalid-email@.com";

    cout << email1 << " is " << (isValidEmail(email1) ? "valid" : "invalid") << endl;
    cout << email2 << " is " << (isValidEmail(email2) ? "valid" : "invalid") << endl;

    return 0;
}

Output

test.email@example.com is valid
invalid-email@.com is invalid

Explanation

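  • regex pattern(R"((\w+)(\.\w+)*@(\w+)(\.\w+)+)"): Defines a simplified pattern for email addresses: word characters optionally separated by dots, an @, and a domain with at least one dot. It is stricter than real email syntax; for example, it rejects hyphens and plus signs. The raw string literal R"(...)" avoids doubling the backslashes.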
  • regex_match(email, pattern): Checks if the email matches the pattern.

Example 10: Finding All Words in a String

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str = "This is a sample sentence.";
    regex pattern("\\w+");
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;

    while (it != reg_end) {
        cout << "Found word: " << it->str() << endl;
        ++it;
    }

    return 0;
}

Output

Found word: This
Found word: is
Found word: a
Found word: sample
Found word: sentence

Explanation

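  • regex pattern("\\w+"): Defines a regex pattern to match words, that is, runs of letters, digits, and underscores. Punctuation such as the final period is not part of any match.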
  • sregex_iterator is used to iterate through all matches of the pattern in the string str.

Conclusion

The <regex> library in C++ provides powerful tools for working with regular expressions. These examples demonstrate various functionalities such as matching, searching, replacing, and validating strings using regex patterns. Understanding and using regular expressions effectively can greatly enhance your ability to handle complex string operations in C++.
