algoadvance

A website domain “visit count” is represented as a string cpdomain, consisting of a count followed by a space, followed by the address. Given a list of such count-paired domains, return the count of total visits for each subdomain.

For example, a count-paired domain “9001 discuss.leetcode.com” indicates that the domain discuss.leetcode.com was visited 9001 times. A subdomain will count visits for every domain it is a part of.

A count for the domain discuss.leetcode.com counts visits to discuss.leetcode.com, leetcode.com, and com.

Example:

Input:

["9001 discuss.leetcode.com"]

Output:

["9001 leetcode.com", "9001 discuss.leetcode.com", "9001 com"]

You may assume the input will be valid.
The order of the output does not matter.

Clarifying Questions:

Can there be multiple domains in the input list?
Will counts always be positive integers?
Will domain strings always be non-empty and valid?
Are subdomains restricted to three levels?
Should the outputs be sorted in any specific way?

Strategy:

Initialize an empty dictionary to store the counts of each subdomain.
Iterate through each domain string in the input list:
- Split the string into count and domain.
- Split the domain by dots to get all subdomains.
- For each subdomain, add its count to the dictionary.
After processing all domains, construct the output list by formatting the counts and domains from the dictionary.
Return the output list.

Code:

def subdomainVisits(cpdomains):
    from collections import defaultdict
    
    # Dictionary to store counts of each subdomain
    subdomain_count = defaultdict(int)
    
    for full_domain in cpdomains:
        count, domain = full_domain.split()
        count = int(count)
        subdomains = domain.split('.')
        
        # Form and count all subdomains
        for i in range(len(subdomains)):
            subdomain = '.'.join(subdomains[i:])
            subdomain_count[subdomain] += count
    
    # Build the output list
    result = [f"{value} {key}" for key, value in subdomain_count.items()]
    
    return result

# Example usage
input_list = ["9001 discuss.leetcode.com"]
print(subdomainVisits(input_list))

Time Complexity:

Parsing each domain string and generating subdomains will take O(N) time, where N is the total number of characters in the input list.
Splitting the domain and aggregating counts in the dictionary is O(D*L) where D is the number of domain strings and L is the average number of subdomain levels.
Constructing the final result involves iterating over the dictionary, which is O(M), where M is the number of unique subdomains.

Thus, the overall time complexity is O(N + D*L) which is generally linear with respect to the input size.

Try our interview co-pilot at AlgoAdvance.com

Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?