Given a list of accounts where each element accounts[i]
is a list of strings, where the first element accounts[i][0]
is a name, and the rest of the elements are emails representing emails of the account.
Your task is to merge the accounts. Two accounts definitely belong to the same person if there is some common email. Note that even if two accounts have the same name, they may belong to different people, as people could have the same name. A person can have any number of accounts initially, but all their accounts ultimately should be merged into one account. The order in which emails are presented in the returned accounts does not matter.
You need to return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order.
Example:
Input:
accounts = [["John", "johnsmith@mail.com", "john00@mail.com"],
["John", "johnnybravo@mail.com"],
["John", "johnsmith@mail.com", "john_newyork@mail.com"],
["Mary", "mary@mail.com"]]
Output: [["John", "john00@mail.com", "john_newyork@mail.com", "johnsmith@mail.com"],
["John", "johnnybravo@mail.com"],
["Mary", "mary@mail.com"]]
We can use the Union-Find (Disjoint Set Union) data structure to solve this problem efficiently. The steps are as follows:
Here is the Python code that implements the above strategy:
from collections import defaultdict
class UnionFind:
def __init__(self):
self.parent = {}
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # Path compression
return self.parent[x]
def union(self, x, y):
rootX = self.find(x)
rootY = self.find(y)
if rootX != rootY:
self.parent[rootY] = rootX
def accountsMerge(accounts):
uf = UnionFind()
email_to_name = {}
# Initialize the union-find structure and email_to_name map
for account in accounts:
name = account[0]
first_email = account[1]
for email in account[1:]:
if email not in uf.parent:
uf.parent[email] = email
if first_email not in uf.parent:
uf.parent[first_email] = first_email
uf.union(first_email, email)
email_to_name[email] = name
# Find the root email for each connected component
email_clusters = defaultdict(list)
for email in uf.parent:
root_email = uf.find(email)
email_clusters[root_email].append(email)
# Construct the result
result = []
for emails in email_clusters.values():
result.append([email_to_name[emails[0]]] + sorted(emails))
return result
O(A*α(B))
where:
A
is the number of accounts.B
is the total number of emails.α
is the inverse Ackermann function which grows very slowly.O(B log B)
due to sorting emails within each cluster.Overall time complexity is approximately O(A + B log B)
, making the solution efficient given the constraints.
Got blindsided by a question you didn’t expect?
Spend too much time studying?
Or simply don’t have the time to go over all 3000 questions?