Problem
Given an input string s
and a pattern p
, implement regular expression matching with support for '.'
and '*'
where:
'.'
Matches any single character.'*'
Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Problem description
Given an input string s and a pattern p, we need to implement a regular expression matching algorithm with support for ‘.’ and ‘‘ characters. The ‘.’ character matches any single character and the ‘‘ character matches zero or more of the preceding element. The goal is to determine if the pattern matches the entire input string (not partial).
Solution
There are multiple ways to solve this problem, but one of the most efficient algorithms is to use dynamic programming. We can create a 2D boolean array dp where dp[i][j] represents whether the substring s[0:i] matches the pattern p[0:j].
Algorithm
Step 1: Initialize the dp array with dp[0][0] = true, as the empty string matches the empty pattern.
Step 2: Iterate through the string and pattern characters. If the current character in the pattern is ‘.’ or the current character in the string and the pattern match, then set dp[i][j] = dp[i-1][j-1].
Step 3: If the current character in the pattern is ‘‘, then we have two cases to consider: a) The character before the ‘‘ matches zero characters, meaning the pattern can match the current character in the string, so dp[i][j] = dp[i][j-2]. b) The character before the ‘*’ matches one or more characters, meaning the pattern can match the current character in the string, so dp[i][j] = dp[i-1][j].
Step 4: Continue to iterate through the string and pattern until we reach the end of both strings.
Step 5: If dp[s.length()][p.length()] is true, then the pattern matches the entire input string.
C# Code:
public bool IsMatch(string s, string p) {
bool[,] dp = new bool[s.Length + 1, p.Length + 1];
dp[0, 0] = true;
for (int i = 0; i < p.Length; i++) {
if (p[i] == '*' && dp[0, i - 1]) {
dp[0, i + 1] = true;
}
}
for (int i = 0; i < s.Length; i++) {
for (int j = 0; j < p.Length; j++) {
if (p[j] == '.' || p[j] == s[i]) {
dp[i + 1, j + 1] = dp[i, j];
} else if (p[j] == '*') {
if (p[j - 1] != s[i] && p[j - 1] != '.') {
dp[i + 1, j + 1] = dp[i + 1, j - 1];
} else {
dp[i + 1, j + 1] = (dp[i + 1, j] || dp[i, j + 1] || dp[i, j - 1]);
}
}
}
}
return dp[s.Length, p.Length];
}
Java
class Solution {
public boolean isMatch(String s, String p) {
int m = s.length();
int n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
for (int i = 0; i < n; i++) {
if (p.charAt(i) == '*' && dp[0][i - 1]) {
dp[0][i + 1] = true;
}
}
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
if (p.charAt(j) == '.') {
dp[i + 1][j + 1] = dp[i][j];
}
if (p.charAt(j) == s.charAt(i)) {
dp[i + 1][j + 1] = dp[i][j];
}
if (p.charAt(j) == '*') {
if (p.charAt(j - 1) != s.charAt(i) && p.charAt(j - 1) != '.') {
dp[i + 1][j + 1] = dp[i + 1][j - 1];
} else {
dp[i + 1][j + 1] = (dp[i + 1][j] || dp[i][j + 1] || dp[i][j - 1]);
}
}
}
}
return dp[m][n];
}
}
C++
class Solution {
public:
bool isMatch(string s, string p) {
int m = s.length();
int n = p.length();
vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
dp[0][0] = true;
for (int i = 1; i <= n; i++) {
if (p[i - 1] == '*') {
dp[0][i] = dp[0][i - 2];
}
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
dp[i][j] = dp[i - 1][j - 1];
} else if (p[j - 1] == '*') {
dp[i][j] = dp[i][j - 2];
if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
dp[i][j] |= dp[i - 1][j];
}
}
}
}
return dp[m][n];
}
};
JS
let isMatch = function(s, p) {
let dp = [];
for (let i = 0; i <= s.length; i++) {
dp[i] = [];
for (let j = 0; j <= p.length; j++) {
dp[i][j] = false;
}
}
dp[0][0] = true;
for (let i = 0; i < p.length; i++) {
if (p[i] === '*' && dp[0][i - 1]) {
dp[0][i + 1] = true;
}
}
for (let i = 0; i < s.length; i++) {
for (let j = 0; j < p.length; j++) {
if (p[j] === '.' || p[j] === s[i]) {
dp[i + 1][j + 1] = dp[i][j];
} else if (p[j] === '*') {
if (p[j - 1] !== s[i] && p[j - 1] !== '.') {
dp[i + 1][j + 1] = dp[i + 1][j - 1];
} else {
dp[i + 1][j + 1] = (dp[i + 1][j] || dp[i][j + 1] || dp[i][j - 1]);
}
}
}
}
return dp[s.length][p.length];
};
Python
class Solution:
def isMatch(self, s: str, p: str) -> bool:
dp = [[False for i in range(len(p) + 1)] for j in range(len(s) + 1)]
dp[0][0] = True
for i in range(1, len(p)):
if p[i] == '*' and dp[0][i - 1]:
dp[0][i + 1] = True
for i in range(len(s)):
for j in range(len(p)):
if p[j] == '.' or p[j] == s[i]:
dp[i + 1][j + 1] = dp[i][j]
elif p[j] == '*':
if p[j - 1] != s[i] and p[j - 1] != '.':
dp[i + 1][j + 1] = dp[i + 1][j - 1]
else:
dp[i + 1][j + 1] = dp[i + 1][j] or dp[i][j + 1] or dp[i][j - 1]
return dp[len(s)][len(p)]
Time & space complexity
In the above solution we’ve used the most common algorithm to solve the Regular Expression which is Dynamic Programming approach. In this approach, the time complexity of the solution is O(n * m), where n is the length of the input string s and m is the length of the pattern p.
The space complexity of this solution is O(n * m) as well, as it requires a 2D array to store the intermediate results.
Also there is another approach to solve this problem. It is called the Backtracking approach, which has a time complexity of O(2^n) in the worst case. This approach is not recommended for large input strings, as it has exponential time complexity.
Conculsion
In conclusion, Regular Expression Matching is a well-studied problem and has many different solutions. The Dynamic Programming approach is the most efficient in terms of time and space complexity, but the implementation of the code depends on the programming language used. It is important to choose the right approach based on the requirements and constraints of the problem.
0 Comments