algoadvance

316. Remove Duplicate Letters

Given a string s, remove duplicate letters so that every letter appears once and only once. You must make sure your result is the smallest in lexicographical order among all possible results.

Example:

Input: s = "bcabc"
Output: "abc"

Input: s = "cbacdcbc"
Output: "acdb"

Clarifying Questions

  1. Q: Can the input string contain uppercase letters? A: The problem statement does not explicitly mention restrictions regarding uppercase letters, so we assume the string contains lowercase English letters only (a-z).

  2. Q: How do we define lexicographical order? A: Lexicographical order is similar to dictionary order where “a” < “b” < “c” and so forth.

  3. Q: Can the input string be empty? A: Yes, the input can be empty. The output for an empty string should also be an empty string.

Strategy

To solve this problem efficiently and correctly, we will use a greedy algorithm with a stack:

  1. Count Occurrences: First, count all occurrences of each character.
  2. Stack Initialization: Use a stack to build the result string.
  3. Track Visited: Use a set to track characters that are already in the stack to avoid re-inserting them.
  4. Iterate Through Characters: For each character:
    • Decrease its count in the occurrence dictionary.
    • If the character is already in the stack, skip it.
    • If the character is not in the stack, ensure removing previous characters from the stack won’t violate the lexicographical order and uniqueness requirements. Pop from the stack if the following conditions are met:
      • The current character is smaller than the last character in the stack.
      • The last character could appear later as well (i.e., its count in the dictionary is more than zero).
    • Push the current character into the stack.
  5. Result: The final stack represents the smallest lexicographical order result.

Code

Here’s the code implementing the above strategy:

def removeDuplicateLetters(s: str) -> str:
    # Step 1: Count occurrences of each character in the string
    count = {char: 0 for char in s}
    for char in s:
        count[char] += 1

    # Step 2: Use a stack to build the result string
    stack = []
    in_stack = set()

    for char in s:
        # Decrease the count of the current character
        count[char] -= 1
        
        # If the character is already in the stack, skip it
        if char in in_stack:
            continue
        
        # Pop from stack if necessary
        while stack and char < stack[-1] and count[stack[-1]] > 0:
            popped_char = stack.pop()
            in_stack.remove(popped_char)
        
        # Add current character to the stack and mark it as in stack
        stack.append(char)
        in_stack.add(char)
    
    # Join the stack to form the result string
    return ''.join(stack)

# Example usage:
print(removeDuplicateLetters("bcabc"))      # Output: "abc"
print(removeDuplicateLetters("cbacdcbc"))   # Output: "acdb"

Time Complexity

The time complexity of this algorithm is (O(n)), where (n) is the length of the string s. This is because each character is pushed and popped from the stack at most once, ensuring linear time complexity.

Space Complexity

The space complexity is (O(k)), where (k) is the number of distinct characters in the string s. This accounts for the space used by the stack, set, and the dictionary for storing counts, all of which store up to (k) items. In the worst case, (k) can be (26) (number of lowercase English letters), hence it is considered a constant with respect to the input size.

Try our interview co-pilot at AlgoAdvance.com