algoadvance

Given two strings word1 and word2, return the minimum number of steps required to make word1 and word2 the same. In one step, you can delete exactly one character in either string.

Example 1:

Example 2:

Constraints:

Clarifying Questions

  1. Can we assume that the inputs are always valid non-null strings?
  2. Do we need to consider only lowercase letters, or should the solution be generalized for uppercase letters as well?
  3. Are there any restrictions on the types of operations allowed (only deletions)?
  4. Is the problem case-sensitive?

Strategy

This problem can be approached using dynamic programming. The key insight is to minimize the deletions needed by maximizing the similarity between the two strings through their longest common subsequence (LCS).

Here’s the approach:

  1. Find the length of the longest common subsequence (LCS) between word1 and word2. The LCS is the longest sequence that can be derived from both strings by deleting some characters.
  2. Once the LCS length is known, we can calculate the number of deletions required in both strings. This is because to make the strings equal, any character that is not part of this LCS must be deleted.
  3. The number of deletions required in word1 to make it equal to the LCS would be len(word1) - len(LCS).
  4. The number of deletions required in word2 to make it equal to the LCS would be len(word2) - len(LCS).
  5. Therefore, the total number of deletions required will be (len(word1) - len(LCS)) + (len(word2) - len(LCS)).

Code

def minDistance(word1: str, word2: str) -> int:
    def longest_common_subsequence(text1, text2):
        dp = [[0] * (len(text2) + 1) for _ in range(len(text1) + 1)]
        
        for i in range(1, len(text1) + 1):
            for j in range(1, len(text2) + 1):
                if text1[i - 1] == text2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1] + 1
                else:
                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
        
        return dp[len(text1)][len(text2)]
    
    lcs_length = longest_common_subsequence(word1, word2)
    return (len(word1) - lcs_length) + (len(word2) - lcs_length)

# Example usage
word1 = "sea"
word2 = "eat"
print(minDistance(word1, word2))  # Output: 2

Time Complexity

The time complexity of this approach is O(m * n), where m is the length of word1 and n is the length of word2. This is because we use a 2D DP table of size m+1 by n+1 and we fill each cell in the table exactly once.

The space complexity is also O(m * n) for the DP table.

Try our interview co-pilot at AlgoAdvance.com