Tuesday, February 17, 2015

28. Implement strStr() LeetCode Java

Implement strStr().
Returns the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
Solution:
1. Brute force
Check every possible substring of haystack and compare with needle
if find a solution return the beginning index.
Time complexity: O(m*n) m is the length of haystack, n is the length of needle
 public int strStr(String haystack, String needle) {  
    if(haystack==null || needle==null || needle.length()>haystack.length()) return -1;  
    int m=0;  
    int ln=needle.length();  
    int lh=haystack.length();  
    while(m+ln<=lh){  
      int i=0;  
      while (i<ln && needle.charAt(i)==haystack.charAt(m+i)) i++;  
      if(i==ln) return m;  
      else m++;  
    }  
    return -1;  
   }  
2. Rolling hash
Since in this problem, needle and haystack both are only consisted with lowercase alphabetic letter, so we define a hash function such that only same string will have the same hash code. For example hashCode(adbe) = 1*27^0+ 4*27^1+ 2* 27^2+5*27^3. 
The advantage of this hash function is that if we know the hashCode of needle, we can calculate the each hashCode of the substring of haystack with the same length of needle in linear time.
for example if haystack="gadbe", so we can easily calculate the hashCode(gadb)=  7*27^0+ 1*27^1+ 4* 27^2+2*27^3, so we can roll the hashCode to next substring(adbe) by hashCode(adbe)= hashCode(gadb)/ 27 + 5*27^3, so we can calculate the hashcode by the hashcode of the last substring. We just need to compare the hashcode with needle's hashcode when we calculating it.
Time Complexity: O(m)
 public int strStr(String haystack, String needle) {  
    if(haystack==null || needle==null || needle.length()>haystack.length()) return -1;  
    long base=1;  
    long hashCode=0;   
    for(int i=0;i<needle.length();i++){  
      hashCode+=(needle.charAt(i)-'a'+1)*base;  
      base*=27;  
    }  
    long newHashCode=0;  
    base=1;  
    for(int i=0; i<needle.length();i++){  
      newHashCode+=(haystack.charAt(i)-'a'+1)*base;  
      base*=27;  
    }  
    if(newHashCode==hashCode) return 0;  
    base=base/27;  
    int i=needle.length();  
    while(i<haystack.length()){  
      newHashCode=newHashCode/27+(haystack.charAt(i)-'a'+1)*base;  
      if(newHashCode==hashCode) return i-needle.length()+1;  
      i++;  
    }  
    return -1;  
  }