algoadvance

A website domain “visit count” is represented as a string cpdomain, consisting of a count followed by a space, followed by the address. Given a list of such count-paired domains, return the count of total visits for each subdomain.

For example, a count-paired domain “9001 discuss.leetcode.com” indicates that the domain discuss.leetcode.com was visited 9001 times. A subdomain will count visits for every domain it is a part of.

Example:

Input:

["9001 discuss.leetcode.com"]

Output:

["9001 leetcode.com", "9001 discuss.leetcode.com", "9001 com"]

Clarifying Questions:

  1. Can there be multiple domains in the input list?
  2. Will counts always be positive integers?
  3. Will domain strings always be non-empty and valid?
  4. Are subdomains restricted to three levels?
  5. Should the outputs be sorted in any specific way?

Strategy:

  1. Initialize an empty dictionary to store the counts of each subdomain.
  2. Iterate through each domain string in the input list:
    • Split the string into count and domain.
    • Split the domain by dots to get all subdomains.
    • For each subdomain, add its count to the dictionary.
  3. After processing all domains, construct the output list by formatting the counts and domains from the dictionary.
  4. Return the output list.

Code:

def subdomainVisits(cpdomains):
    from collections import defaultdict
    
    # Dictionary to store counts of each subdomain
    subdomain_count = defaultdict(int)
    
    for full_domain in cpdomains:
        count, domain = full_domain.split()
        count = int(count)
        subdomains = domain.split('.')
        
        # Form and count all subdomains
        for i in range(len(subdomains)):
            subdomain = '.'.join(subdomains[i:])
            subdomain_count[subdomain] += count
    
    # Build the output list
    result = [f"{value} {key}" for key, value in subdomain_count.items()]
    
    return result

# Example usage
input_list = ["9001 discuss.leetcode.com"]
print(subdomainVisits(input_list))

Time Complexity:

Thus, the overall time complexity is O(N + D*L) which is generally linear with respect to the input size.

Try our interview co-pilot at AlgoAdvance.com