algoadvance

Problem Statement

Given an array of strings words and an integer k, return the k most frequent strings.

The answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Clarifying Questions

Input Constraints:
- What is the range of the length of the words array?
- How long can each word be?
- Is it guaranteed that the input list will not be empty?
Output Format:
- Should the output be returned as a list/array of strings?
Case Sensitivity:
- Are words case-sensitive?

Assuming:

The length of words can be up to (10^4).
Each word can be up to 10 characters long.

Strategy

Count Frequencies: Use a hash map (unordered_map) to count the frequency of each word.
Use a Priority Queue:
- Use a min-heap (priority queue) to keep track of the top k elements.
- The priority queue will store pairs where the first element is the frequency and the second element is the word.
- If two words have the same frequency, the word with the lower alphabetical order should come first. This will be maintained by the comparison logic in the priority queue.
Maintain Top K Elements: Push pairs into the min-heap such that if the heap size exceeds k, the smallest element (i.e., least frequent or lexicographically larger in case of tie) is popped out.
Extract Results: Extract elements from the heap and sort them based on frequency and lexicographical order.

Code

Here’s how you can implement this in C++:

#include <vector>
#include <string>
#include <unordered_map>
#include <queue>
#include <algorithm>

using namespace std;

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        unordered_map<string, int> count;
        for (const string& word : words) {
            count[word]++;
        }

        auto comp = [](pair<int, string>& a, pair<int, string>& b) {
            if (a.first == b.first)
                return a.second > b.second;
            return a.first < b.first;
        };

        priority_queue<pair<int, string>, vector<pair<int, string>>, decltype(comp)> pq(comp);

        for (const auto& entry : count) {
            pq.push({entry.second, entry.first});
            if (pq.size() > k) {
                pq.pop();
            }
        }

        vector<string> result;
        while (!pq.empty()) {
            result.push_back(pq.top().second);
            pq.pop();
        }

        reverse(result.begin(), result.end());
        return result;
    }
};

Time Complexity

Frequency Count: The frequency count will take (O(N)), where (N) is the number of words.
Priority Queue Operations:
- Pushing into the priority queue takes (O(\log k)).
- This operation is repeated for every unique word. In the worst case, if all words are unique, this is done (N) times.
- Thus, the cost of inserting all elements into the priority queue is (O(N \log k)).
Overall Complexity: The total time complexity is (O(N \log k)). The space complexity is (O(N)) to store the word frequencies and the priority queue.

This approach ensures we efficiently maintain the top k frequent words and return them in the desired order.

Cut your prep time in half and DOMINATE your interview with AlgoAdvance AI

Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?