Introduction
Regular expressions (regex) are sequences of characters that form search patterns. They are widely used for string matching and manipulation. In C++, the <regex> library provides support for regular expressions, allowing you to perform complex string operations easily.
Basic Concepts
- Regex Pattern: The sequence of characters that defines the search pattern.
- Regex Match: The result of applying the regex pattern to a string.
- Regex Token: A piece of the string selected by the pattern: a match, a capture group within a match, or the text between matches.
Regular Expressions in C++
C++ uses the <regex> library, which was added to the Standard Library in C++11. It provides classes and functions for working with regular expressions.
Key Classes and Functions
- std::regex: Represents the regular expression.
- std::smatch: Holds the results of a regex match on a std::string.
- std::regex_match: Checks if a regex matches the entire string.
- std::regex_search: Searches for a regex match within a string.
- std::regex_replace: Replaces parts of a string that match a regex pattern.
Example 1: Simple Regex Match
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "Hello, World!";
    regex pattern("World");
    // regex_search succeeds if the pattern occurs anywhere in str.
    if (regex_search(str, pattern)) {
        cout << "Pattern found!" << endl;
    } else {
        cout << "Pattern not found." << endl;
    }
    return 0;
}
Output
Pattern found!
Explanation
- std::regex pattern("World"): Defines a regex pattern to search for the text "World".
- regex_search(str, pattern): Searches for the pattern in the string str.
Example 2: Regex Match and Capture Groups
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "My email is example@example.com";
    // Group 1 captures the username, group 2 the domain.
    regex pattern("(\\w+)@(\\w+\\.\\w+)");
    smatch matches;
    if (regex_search(str, matches, pattern)) {
        cout << "Full match: " << matches[0] << endl;
        cout << "Username: " << matches[1] << endl;
        cout << "Domain: " << matches[2] << endl;
    } else {
        cout << "No match found." << endl;
    }
    return 0;
}
Output
Full match: example@example.com
Username: example
Domain: example.com
Explanation
- std::regex pattern("(\\w+)@(\\w+\\.\\w+)"): Defines a regex pattern to match a simple email address and capture the username and domain.
- regex_search(str, matches, pattern): Searches for the pattern in the string str and stores the results in matches.
- matches[0]: The full match.
- matches[1]: The username (first capture group).
- matches[2]: The domain (second capture group).
Example 3: Regex Replace
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "Hello, World!";
    regex pattern("World");
    // regex_replace returns a new string; str is left unchanged.
    string replaced = regex_replace(str, pattern, "C++");
    cout << "Original: " << str << endl;
    cout << "Replaced: " << replaced << endl;
    return 0;
}
Output
Original: Hello, World!
Replaced: Hello, C++!
Explanation
- regex_replace(str, pattern, "C++"): Returns a copy of the string str in which all occurrences of the pattern "World" are replaced with "C++". The original string is not modified.
Example 4: Validating Input with Regex
#include <iostream>
#include <regex>
#include <string>
using namespace std;
// Returns true if phoneNumber has the form ddd-ddd-dddd.
bool validatePhoneNumber(const string& phoneNumber) {
    regex pattern("^\\d{3}-\\d{3}-\\d{4}$");
    return regex_match(phoneNumber, pattern);
}
int main() {
    string phone1 = "123-456-7890";
    string phone2 = "123-45-6789";
    cout << phone1 << " is " << (validatePhoneNumber(phone1) ? "valid" : "invalid") << endl;
    cout << phone2 << " is " << (validatePhoneNumber(phone2) ? "valid" : "invalid") << endl;
    return 0;
}
Output
123-456-7890 is valid
123-45-6789 is invalid
Explanation
- regex pattern("^\\d{3}-\\d{3}-\\d{4}$"): Defines a regex pattern to validate phone numbers in the format "123-456-7890": three digits, a hyphen, three digits, a hyphen, and four digits.
- regex_match(phoneNumber, pattern): Checks if the entire phone number matches the pattern.
Example 5: Splitting a String with Regex
#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;
vector<string> split(const string& str, const string& regex_str) {
    regex pattern(regex_str);
    // -1 selects the text between matches instead of the matches themselves.
    sregex_token_iterator it(str.begin(), str.end(), pattern, -1);
    sregex_token_iterator reg_end;
    return {it, reg_end};
}
int main() {
    string str = "one,two,three,four";
    vector<string> tokens = split(str, ",");
    for (const string& token : tokens) {
        cout << token << endl;
    }
    return 0;
}
Output
one
two
three
four
Explanation
- sregex_token_iterator: Used to split the string based on the regex pattern. The final argument, -1, selects the text between matches rather than the matches themselves.
- split: Splits the input string str by the delimiter specified in regex_str.
Example 6: Extracting Numbers from a String
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "There are 3 apples, 4 bananas, and 5 cherries.";
    regex pattern("\\d+");
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;  // a default-constructed iterator marks the end
    while (it != reg_end) {
        cout << "Found number: " << it->str() << endl;
        ++it;
    }
    return 0;
}
Output
Found number: 3
Found number: 4
Found number: 5
Explanation
- regex pattern("\\d+"): Defines a regex pattern to match one or more digits.
- sregex_iterator: Used to iterate through all matches of the pattern in the string str.
Example 7: Matching Multiple Lines
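#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "First line\nSecond line\nThird line";
    // Match a word at the start of the input or directly after a newline.
    regex pattern("(?:^|\\n)(\\w+)");
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;
    while (it != reg_end) {
        cout << "Found: " << it->str(1) << endl;  // group 1 is the word itself
        ++it;
    }
    return 0;
}
Output
Found: First
Found: Second
Found: Third
Explanation
- regex pattern("(?:^|\\n)(\\w+)"): Defines a regex pattern that matches a word at the start of the input or directly after a newline and captures the word in group 1. A bare ^ is not enough here: by default it is only specified to match at the start of the input, matching after each newline requires the C++17 regex_constants::multiline flag, and standard library implementations have differed on this.
- sregex_iterator: Used to find the matches iteratively. it->str(1) returns the captured word without the leading newline.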
Example 8: Using Regex Flags
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "Hello, world! Hello, Universe!";
    // icase makes the pattern ignore letter case.
    regex pattern("hello", regex_constants::icase);
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;
    while (it != reg_end) {
        cout << "Found: " << it->str() << endl;
        ++it;
    }
    return 0;
}
Output
Found: Hello
Found: Hello
Explanation
- regex pattern("hello", regex_constants::icase): Defines a case-insensitive regex pattern to match "hello".
Example 9: Checking for Valid Email Addresses
#include <iostream>
#include <regex>
#include <string>
using namespace std;
// Simplified check: word characters and dots only, with a dotted domain.
bool isValidEmail(const string& email) {
    regex pattern(R"((\w+)(\.\w+)*@(\w+)(\.\w+)+)");
    return regex_match(email, pattern);
}
int main() {
    string email1 = "test.email@example.com";
    string email2 = "invalid-email@.com";
    cout << email1 << " is " << (isValidEmail(email1) ? "valid" : "invalid") << endl;
    cout << email2 << " is " << (isValidEmail(email2) ? "valid" : "invalid") << endl;
    return 0;
}
Output
test.email@example.com is valid
invalid-email@.com is invalid
Explanation
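- regex pattern(R"((\w+)(\.\w+)*@(\w+)(\.\w+)+)"): Defines a simplified regex pattern for email addresses: dot-separated runs of word characters, an @ sign, and a domain with at least one dot. It is deliberately strict and rejects some addresses that are valid in practice, such as those containing a hyphen or a plus sign.
- regex_match(email, pattern): Checks if the entire email string matches the pattern.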
Example 10: Finding All Words in a String
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string str = "This is a sample sentence.";
    regex pattern("\\w+");
    sregex_iterator it(str.begin(), str.end(), pattern);
    sregex_iterator reg_end;
    while (it != reg_end) {
        cout << "Found word: " << it->str() << endl;
        ++it;
    }
    return 0;
}
Output
Found word: This
Found word: is
Found word: a
Found word: sample
Found word: sentence
Explanation
- regex pattern("\\w+"): Defines a regex pattern to match words, that is, runs of letters, digits, and underscores.
- sregex_iterator: Used to iterate through all matches of the pattern in the string str.
Conclusion
The <regex> library in C++ provides powerful tools for working with regular expressions. These examples demonstrate various functionalities such as matching, searching, replacing, and validating strings using regex patterns. Understanding and using regular expressions effectively can greatly enhance your ability to handle complex string operations in C++.