algoadvance

LeetCode Problem 652: Find Duplicate Subtrees

Given the root of a binary tree, return all duplicate subtrees. For each kind of duplicate subtrees, you only need to return the root node of any one of them.

Two trees are duplicate if they have the same structure with the same node values.

Example:

Input: root = [1,2,3,4,null,2,4,null,null,4]
Output: [[2,4],[4]]

Clarifying Questions

  1. What is the format of the output?
    • The output should be a list of tree nodes, where each node is the root of a duplicate subtree.
  2. Can the values of the nodes be negative?
    • Yes, node values can be negative as well.
  3. What is the maximum number of nodes in the tree?
    • The problem does not specify, but typically, constraints might involve up to tens of thousands of nodes. Let’s assume up to 10^4 as a safe estimate.
  4. What is the structure of the tree node?
    • Each tree node has a value, and pointers to its left and right children.

Strategy

To identify duplicate subtrees:

  1. Use Depth-First Search (DFS) to traverse the tree.
  2. Represent each subtree by a serialization string (e.g., “2,4,#,#,#”).
  3. Use a dictionary to count occurrences of each serialization.
  4. Collect trees that have duplicate serializations.

To implement this:

  1. Perform a DFS traversal of the tree.
  2. Serialize each subtree recursively.
  3. Store each serialization in a dictionary to count occurrences.
  4. Whenever a serialization count reaches 2 (indicating a duplicate), add the subtree root to the result list.

Code

from collections import defaultdict
from typing import List

# Definition for a binary tree node.
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def findDuplicateSubtrees(root: TreeNode) -> List[TreeNode]:
    # This dictionary will store the serialized subtree and its count
    trees = defaultdict(int)
    duplicates = []
    
    def serialize(node):
        if not node:
            return "#"
        
        # Serialize current subtree
        subtree = f"{node.val},{serialize(node.left)},{serialize(node.right)}"
        
        # Record the occurrence of the subtree
        trees[subtree] += 1
        
        # If it has been seen exactly twice (to avoid adding the same root multiple times)
        if trees[subtree] == 2:
            duplicates.append(node)
        
        return subtree
    
    serialize(root)
    return duplicates

Time Complexity

This approach efficiently captures duplicate subtrees using serialization and tracking their counts, ensuring an optimal solution.

Try our interview co-pilot at AlgoAdvance.com