Implement strStr function in O(n + m) time.
strStr return the first index of the target string in a source string. The length of the target string is m and the length of the source string is n.
If target does not exist in source, just return -1.
Example
Given source = abcdef, target = bcd, return 1.
思路
O(n + m)的算法有2个: KMP和Rabin-Karp。
还有一个best performance是O(N/M)但是worst case是O(MN)的算法Boyer-Moore。
先写一个Boyer-Moore的,提交也能通过。
每次比的时候是从target的最后一位往前比。比到有mismatch,就把i的位置从source里往后移动,移动多少根据用target来pre-compute的一个表。
具体算法参照Princeton老头的算法课件。老头讲得特别清楚。
Boyer-Moore
Code
public class Solution { /** * @param source a source string * @param target a target string * @return an integer as index */ public int strStr2(String source, String target) { // Write your code here if (source == null || target == null) { return -1; } int N = source.length(); int M = target.length(); if (M == 0) { return 0; } int R = 256; int[] right = new int[R]; for (int c = 0; c < R; c++) { right[c] = -1; } for (int j = 0; j < M; j++) { right[target.charAt(j)] = j; } int skip; for (int i = 0; i < N - M + 1; i += skip) { skip = 0; for (int j = M - 1; j >= 0; j--) { if (source.charAt(i + j) != target.charAt(j)) { skip = Math.max(1, j - right[source.charAt(i + j)]); break; } } if (skip == 0) { return i; } } return -1; } }
Rabin-Karp
Code
public class Solution { /** * @param source a source string * @param target a target string * @return an integer as index */ public int strStr2(String source, String target) { // Write your code here if (source == null || target == null) { return -1; } int N = source.length(); int M = target.length(); if (M == 0) { return 0; } int R = 31; int Q = 997; int RM = 1; for (int i = 0; i < M; i++) { RM = RM * R % Q; } int targetHash = 0; for (int i = 0; i < M; i++) { targetHash = (targetHash * R + target.charAt(i)) % Q; } int sourceHash = 0; for (int i = 0; i < N; i++) { sourceHash = (sourceHash * R + source.charAt(i)) % Q; if (i < M - 1) { continue; } if (i >= M) { sourceHash = sourceHash - (source.charAt(i - M) * RM) % Q; if (sourceHash < 0) { sourceHash += Q; } } if (sourceHash == targetHash) { if (source.substring(i - M + 1, i + 1).equals(target)) { return i - M + 1; } } } return -1; } }
KMP有个问题,因为DFA的方法预处理的表的空间为R * M,所以memory会越界。