algoadvance

Leetcode 3090. Maximum Length Substring With Two Occurrences

Problem Statement:

Given a string s, find the length of the maximum substring that appears twice in the original string. These two substrings must not overlap.

Clarifying Questions:

Are the substrings case-sensitive?
- Yes, the comparison is case-sensitive.
Can there be overlapping substrings?
- No, the substrings must not overlap.
What is the maximum length of the string s?
- Let’s assume the string length can be up to (10^5) characters.
Should the solution handle edge cases where the input string is empty or very short?
- Yes, handle cases where the string length is less than 2 since no non-overlapping substrings are possible in such cases.

Strategy:

Binary Search on Length: We will use binary search to find the maximum length L such that some substring of length L appears at least twice in s with no overlap. This allows us to efficiently zoom in on the length rather than a direct brute force approach.
Hashing and Sliding Window: For checking if a substring of length L appears twice, we can use hashing (e.g., rolling hash) along with a sliding window algorithm. This helps in efficiently managing the hash values and quickly identifying repeated substrings.
Hash Set for Storage: Use a hash set to store substrings of the current length and check for repeats. Ensure to manage non-overlapping by keeping track of the previous indices.

Code:

Here is a possible implementation in C++:

#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

using namespace std;

class Solution {
public:
    int maxLengthSubstringWithTwoOccurrences(string s) {
        int n = s.length();
        if (n < 2) return 0;

        // Helper function to determine if there's a non-overlapping substring of length len
        auto hasNonOverlappingSubstring = [&](int len) -> bool {
            unordered_set<string> substr_set;
            for (int i = 0; i <= n - len; ++i) {
                string current_sub = s.substr(i, len);
                if (substr_set.count(current_sub)) {
                    return true;
                }
                substr_set.insert(current_sub);
            }
            return false;
        };

        // Binary search for the maximum possible length
        int left = 1, right = n - 1, result = 0;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (hasNonOverlappingSubstring(mid)) {
                result = mid;  // Possible valid length
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }

        return result;
    }
};

int main() {
    Solution sol;
    string s = "banana";
    cout << "The length of the maximum substring which appears twice: " << sol.maxLengthSubstringWithTwoOccurrences(s) << endl;
    return 0;
}

Time Complexity:

Binary Search: We perform binary search on lengths from 1 to (n-1), which takes (O(\log n)).
Checking Substrings: For each length, the hash set check takes (O(n)) comparisons in the worst-case scenario, assuming efficient hashing and insertion in average case (O(1)).

Thus, the overall complexity is (O(n \log n)).

Recap:

This approach combines sliding window and hashing within a binary search framework, ensuring efficient computation while checking for non-overlapping repeated substrings in linearithmic time.

Cut your prep time in half and DOMINATE your interview with AlgoAdvance AI

Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?