Implement regular expression matching with support for
'.'
and '*'
.'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
Solution:
1. Bruteforce
(a) if p.charAt(j+1)!='*',if s.charAt(i)!=p.charAt(j) && p.charAt(j)) return false,
else return (s,p,i+1,j+1); means if p.next char!='*', then it is simple, if s.charAt(i)!=p.charAt(j) && p.charAt(j) they are not matched.
(b) if p.charAt(j+1)=='*', we have to check all possible subMatch(i++,j+2) as long as s.charAt(i)==p.charAt(j) || p.charAt(j)=='.', if anyone of them return true, then s,p are matched. Because we don't know "*" represent exactly how many chars, we have to try each possibility. For example, "abbbc" and "ab*bc", we have to check {"bbbc","bc"};{"bbc","bc"};{"bc","bc"}(matched return true); if no one return true, check the rest of string.
Time complexity is very high, should be exponential level.
public boolean isMatch(String s, String p) {
return subMatch(s,p,0,0);
}
public boolean subMatch(String s, String p, int i, int j){
if(j==p.length()) return (i==s.length());
if((j==p.length()-1)||(p.charAt(j+1)!='*')){
if((i==s.length())||((p.charAt(j)!='.')&&(s.charAt(i)!=p.charAt(j)))) return false;
else return subMatch(s,p,i+1,j+1);
}
else {
while ((i<s.length())&&((p.charAt(j)=='.')||(s.charAt(i)==p.charAt(j)))){
if(subMatch(s,p,i,j+2)) return true;
i++;
}
return subMatch(s,p,i,j+2);
}
}
2. Dynamic Programing
We maintain an boolean matrix match[i][j], the matrix represent does s.substring(0,i+1) and p.substring(0,j+1) match.
The recurrence relation is :
(a) if p.charAt(j)!='*',if (s.charAt(i)==p.charAt(j)|| p.charAt(j)=='.') match[i+1][j+1]=match[i][j]
(b) else p.charAt(j)=='*' match[i+1][j+1]=match[i+1][j] ('*' represent one previous char) || match[i+1][j-1]('*'represent zero previous char) || (match[i][j+1] && (s.charAt(i)==p.charAt(j)|| p.charAt(j)=='.')('*' represent multiple previous char till the current i).
The advantage of dynamic programming is previous information is stored and no need to recalculate.
A little trick of this is set the size of match to be [ls+1][lp+1] and set match[0][0]=true. Update match[0][j] when j is iterating. The trick allow us to process the corner case much easier. For example "" and "a*b*" ,match[0][4]=match[0][2]=match[0][0]=true;
Time complexity: O(m*n)
if(s==null || p==null) return false;
int ls=s.length();
int lp=p.length();
boolean[][] match=new boolean[ls+1][lp+1];
match[0][0]=true;
for(int i=-1;i<ls;i++){
for(int j=0;j<lp;j++){
if(i==-1) match[0][j+1]=j>0 && p.charAt(j)=='*' && match[0][j-1];
else{
if(s.charAt(i)==p.charAt(j) || p.charAt(j)=='.') match[i+1][j+1]=match[i][j];
else if(p.charAt(j)=='*'){
match[i+1][j+1] =match[i+1][j-1]||match[i+1][j]||(match[i][j+1] && (s.charAt(i)==p.charAt(j-1) ||p.charAt(j-1)=='.'));
}
else match[i+1][j+1]=false;
}
}
}
return match[ls][lp];
}
No comments:
Post a Comment