algoadvance

You are given a 0-indexed string array words. Two strings are similar if they consist of the same characters.

For example, “abca” and “cba” are similar since both consist of characters ‘a’, ‘b’, and ‘c’.
However, “ab” and “bbc” are not similar since they do not consist of the same characters.

Return the number of pairs (i, j) (where 0 <= i < j <= word.length - 1) such that the two strings words[i] and words[j] are similar.

Clarifying Questions:

Are words guaranteed to be lowercase English alphabets?
- Yes.
Can the input array contain duplicate words?
- Yes.

Strategy:

Transform each word into a set of characters: This will make it easier to compare words.
Use a hashmap to count occurrences of unique character sets: This allows us to efficiently calculate the number of pairs.
Calculate pairs: If a unique set of characters appears n times, the number of pairs is n * (n - 1) / 2.

Code:

def countPairs(words):
    from collections import defaultdict
    
    # Hashmap to store count of each unique character set
    count_map = defaultdict(int)
    
    # Populate the hashmap
    for word in words:
        char_set = ''.join(sorted(set(word)))  # Convert to a sorted string for use as key
        count_map[char_set] += 1
    
    # Calculate the number of pairs
    num_pairs = 0
    for count in count_map.values():
        if count > 1:
            num_pairs += (count * (count - 1)) // 2  # nC2 formula
    
    return num_pairs

Time Complexity:

Transforming each word into a character set: Sorting takes (O(k \log k)) where (k) is the length of the word.
Counting occurrences of unique character sets: Insert each word into the hashmap takes (O(1)).
Calculating pairs: Iterating through the hashmap takes (O(m)) where (m) is the number of unique character sets in the map.

Overall, the time complexity can be approximated as (O(n \cdot k \log k)) where:

(n) is the number of words.
(k) is the average length of the words.

Example:

words = ["aba", "aaba", "baaa", "aaa", "bc"]
print(countPairs(words))  # Output should be 2

Explanation:

The similar pairs are (“aba”, “baaa”) and (“aaba”, “baaa”).
The pair (“aba”, “aaba”) is not similar because “aaa” is not in both sets.

Try our interview co-pilot at AlgoAdvance.com

Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?