algoadvance

Leetcode 2506. Count Pairs Of Similar Strings

Problem Statement

You are given an array of strings words. A pair of two different strings in the array (words[i], words[j]) is called similar if both strings have the same set of characters.

For example, “abca” and “cba” are similar because they both contain characters [‘a’, ‘b’, ‘c’].

Return the number of pairs of similar strings.

Example 1:

Input: words = [“aba”,”aabb”,”abcd”,”bac”,”aabc”] Output: 2 Explanation: There are 2 pairs that satisfy the condition:

(“aba”, “aabb”) and (“aabb”, “aabc”)

Example 2:

Input: words = [“aabb”,”ab”,”ba”] Output: 3 Explanation: There are 3 pairs that satisfy the condition:

(“aabb”, “ab”), (“aabb”, “ba”), and (“ab”, “ba”)

Clarifying Questions

Can the array have duplicate strings? Yes, but they should not affect our counting of similar pairs.
Are all characters in the strings lowercase English letters? Yes, the problem implicitly assumes so based on the examples.
How should we treat an empty array words? An empty array should return 0 pairs.

Strategy

To solve this problem, we need to identify pairs of strings that have the same set of characters. Here is the strategy to approach this:

Normalize each string: Convert each string to a set of characters and then sort it to get a normalized representation. This helps in easily comparing different strings.
Counting pairs: Use a frequency map or hash map to count the normalized versions of strings.
Calculate pairs: Once we have the frequency of each normalized set, we can calculate how many pairs can be formed using combinations.

Code

Here is the Java code implementing the above strategy:

import java.util.*;

public class Solution {
    public int similarPairs(String[] words) {
        // A map to store the frequency of each normalized set of characters.
        Map<String, Integer> frequencyMap = new HashMap<>();
        
        // Iterate through each word in the array
        for (String word : words) {
            // Convert to a set of characters and then back to a sorted string
            Set<Character> charSet = new TreeSet<>();
            for (char c : word.toCharArray()) {
                charSet.add(c);
            }
            StringBuilder normalizedBuilder = new StringBuilder();
            for (char c : charSet) {
                normalizedBuilder.append(c);
            }
            String normalized = normalizedBuilder.toString();
            
            // Update the frequency map
            frequencyMap.put(normalized, frequencyMap.getOrDefault(normalized, 0) + 1);
        }
        
        // Calculate the number of pairs
        int pairs = 0;
        for (int count : frequencyMap.values()) {
            if (count > 1) {
                pairs += (count * (count - 1)) / 2; // Combination formula: nC2 = n * (n-1) / 2
            }
        }
        
        return pairs;
    }

    public static void main(String[] args) {
        Solution solution = new Solution();
        
        String[] words1 = {"aba", "aabb", "abcd", "bac", "aabc"};
        System.out.println(solution.similarPairs(words1)); // Output: 2
        
        String[] words2 = {"aabb", "ab", "ba"};
        System.out.println(solution.similarPairs(words2)); // Output: 3
    }
}

Time Complexity

Normalization of Each Word: Converting a word to a set and then sorting the set takes (O(n \log n)) where (n) is the length of the word. Given we are doing this for each of the m words, it’s (O(m \cdot n \log n)).
Building the Frequency Map: This takes (O(m)) as we are inserting each normalized word.
Calculating the Pairs: This takes (O(T)), where (T) is the number of unique normalized strings, which in the worst case will be (O(m)).

Overall, the time complexity is dominated by the normalization step, resulting in (O(m \cdot n \log n)).

Cut your prep time in half and DOMINATE your interview with AlgoAdvance AI

Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?