Sunday, January 4, 2015

10. Regular Expression Matching Leetcode Java

Implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

Solution:
1. Bruteforce 
(a) if p.charAt(j+1)!='*',if s.charAt(i)!=p.charAt(j) && p.charAt(j)) return false, 
else return (s,p,i+1,j+1); means if p.next char!='*', then it is simple, if s.charAt(i)!=p.charAt(j) && p.charAt(j) they are not matched.
(b) if p.charAt(j+1)=='*', we have to check all possible subMatch(i++,j+2) as long as s.charAt(i)==p.charAt(j) || p.charAt(j)=='.', if anyone of them return true, then s,p are matched. Because we don't know "*" represent exactly how many chars, we have to try each possibility.  For example, "abbbc" and "ab*bc", we have to check {"bbbc","bc"};{"bbc","bc"};{"bc","bc"}(matched return true); if no one return true, check the rest of string. 
Time complexity is very high, should be exponential level.
  public boolean isMatch(String s, String p) {  
    return subMatch(s,p,0,0);  
   }  
  public boolean subMatch(String s, String p, int i, int j){  
   if(j==p.length()) return (i==s.length());  
   if((j==p.length()-1)||(p.charAt(j+1)!='*')){  
    if((i==s.length())||((p.charAt(j)!='.')&&(s.charAt(i)!=p.charAt(j)))) return false;  
    else return subMatch(s,p,i+1,j+1);  
   }  
   else {  
     while ((i<s.length())&&((p.charAt(j)=='.')||(s.charAt(i)==p.charAt(j)))){  
       if(subMatch(s,p,i,j+2)) return true;  
       i++;  
      }  
       return subMatch(s,p,i,j+2);  
     }  
  }  

2. Dynamic Programing
We maintain an boolean matrix match[i][j], the matrix represent does s.substring(0,i+1) and p.substring(0,j+1) match. 
 The recurrence relation is :
(a) if p.charAt(j)!='*',if (s.charAt(i)==p.charAt(j)|| p.charAt(j)=='.') match[i+1][j+1]=match[i][j]
(b) else p.charAt(j)=='*' match[i+1][j+1]=match[i+1][j] ('*' represent one previous char) || match[i+1][j-1]('*'represent zero previous char) || (match[i][j+1] && (s.charAt(i)==p.charAt(j)|| p.charAt(j)=='.')('*' represent multiple previous char till the current i).
The advantage of dynamic programming is previous information is stored and no need to recalculate. 
A little trick of this is set the size of match to be [ls+1][lp+1] and set match[0][0]=true. Update match[0][j] when j is iterating. The trick allow us to process the corner case much easier. For example "" and "a*b*" ,match[0][4]=match[0][2]=match[0][0]=true;  
Time complexity: O(m*n)
   if(s==null || p==null) return false;  
    int ls=s.length();  
    int lp=p.length();  
    boolean[][] match=new boolean[ls+1][lp+1];  
    match[0][0]=true;  
    for(int i=-1;i<ls;i++){  
      for(int j=0;j<lp;j++){  
        if(i==-1) match[0][j+1]=j>0 && p.charAt(j)=='*' && match[0][j-1];  
        else{  
        if(s.charAt(i)==p.charAt(j) || p.charAt(j)=='.') match[i+1][j+1]=match[i][j];  
        else if(p.charAt(j)=='*'){  
           match[i+1][j+1] =match[i+1][j-1]||match[i+1][j]||(match[i][j+1] && (s.charAt(i)==p.charAt(j-1) ||p.charAt(j-1)=='.'));  
        }  
        else match[i+1][j+1]=false;  
        }  
      }  
      }  
    return match[ls][lp];  
   }  

No comments:

Post a Comment