Given a string s and an array of strings words, return the number of words[i] that are subsequences of s.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
For example, "ace" is a subsequence of "abcde".
Example 1:
Input: s = "abcde", words = ["a", "bb", "acd", "ace"]
Output: 3
Explanation: There are three strings in words that are subsequences of s: "a", "acd", "ace".
Example 2:
Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
Output: 2
words contain duplicate strings?
s always guaranteed to be non-empty?
s is not empty.s and words are lowercase English letters?
1 <= s.length <= 5 * 10^41 <= words.length <= 50001 <= words[i].length <= 50The naive approach would involve checking each word in words against s to see if it is a subsequence. This involves iterating over both the string s and each word in words, which may not be efficient for larger inputs. Instead, a more efficient approach can involve the following steps:
s: Create a mapping from each character to all its positions in s.words, use the preprocessed mapping to check if the word can be constructed by following the increasing order of indices.s:
pos where each key is a character, and the value is a list of indices where this character appears in s.words, use the pos dictionary to find positions of characters in s and ensure they appear in increasing order.This approach ensures that each character lookup in s is done in logarithmic time relative to the length of s, resulting in a more efficient solution.
s: O(n), where n is the length of s.words, the lookup steps take O(k log n) where k is the length of the word, leading to O(m * k log n) for all words (m being the number of words).Overall, the time complexity is O(n + m * k log n), which is efficient given the constraints.
from collections import defaultdict
import bisect
def numMatchingSubseq(s, words):
# Create a dictionary to map each character to a list of indices where they appear in `s`
pos = defaultdict(list)
for i, char in enumerate(s):
pos[char].append(i)
def is_subsequence(word):
# Current index in `s` to match characters of `word`
prev_index = -1
for char in word:
if char not in pos:
return False
# Use binary search to find the smallest index in pos[char] which is greater than prev_index
indices = pos[char]
idx = bisect.bisect_right(indices, prev_index)
if idx == len(indices):
return False
prev_index = indices[idx]
return True
# Count the number of words that are subsequences of `s`
return sum(is_subsequence(word) for word in words)
This code efficiently counts the number of subsequences in s for the list of words.
Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?