Given a string s
, find the maximum length of a substring that appears at least twice and the two occurrences do not overlap.
s
): What is the maximum length of s
?
Given the assumptions, let’s move towards the strategy to solve this problem.
Here’s how we can implement this strategy:
def max_length_substring_no_overlap(s: str) -> int:
def search_substr_with_len(L):
"""
This function returns True if there is a non-overlapping substring of length `L` which appears at least twice
"""
seen = {}
base, modulus = 26, 2**32
# Precompute hash for the first window of length `L`
hash_ = 0
for i in range(L):
hash_ = (hash_ * base + ord(s[i])) % modulus
seen[hash_] = [0] # Initial position of this hash
left_base = pow(base, L, modulus) # leftmost base for rolling hash
# Rolling hash to compute hash values of all substrings of length `L`
for start in range(1, len(s) - L + 1):
hash_ = (hash_ * base - ord(s[start - 1]) * left_base + ord(s[start + L - 1])) % modulus
if hash_ in seen:
for prev_start in seen[hash_]:
if prev_start + L <= start: # Ensure non-overlapping
return True
seen[hash_].append(start)
else:
seen[hash_] = [start]
return False
# Binary search for the maximum length of substring
low, high, result = 1, len(s), 0
while low <= high:
mid = (low + high) // 2
if search_substr_with_len(mid):
result = mid
low = mid + 1
else:
high = mid - 1
return result
# Example usage:
s = "banana"
print(max_length_substring_no_overlap(s)) # Output: 1, for "a"
search_substr_with_len(L)
is O(n), where ‘n’ is the length of the string.search_substr_with_len(L)
function.Thus, the overall time complexity is O(n log n).
This implementation ensures that we efficiently find the maximum length substring that appears at least twice, while ensuring the occurrences do not overlap.
Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?