algoadvance

Your task is to design a service that can encode a URL into a shortened URL and decode the shortened URL to its original form. This is similar to the functionality provided by services like TinyURL.

Requirements

  1. Implement two methods:
    • encode(longUrl: str) -> str: Encodes a long URL to a shortened URL.
    • decode(shortUrl: str) -> str: Decodes a shortened URL to its original URL.
  2. There’s no limit to the number of URLs that can be encoded, so assume infinite URLs can be handled.

  3. The shortened URL can use a fixed domain like “http://tinyurl.com/” followed by a unique identifier that maps to the original URL.

Example

# Example usage
url_service = Codec()
short_url = url_service.encode("https://leetcode.com/problems/design-tinyurl")
# returns "http://tinyurl.com/abc123"

assert url_service.decode(short_url) == "https://leetcode.com/problems/design-tinyurl"

Clarifying Questions

  1. Unique URLs: Can we assume that the same long URL will always get the same short URL if encoded multiple times?
  2. Security: How secure does the encoding need to be? Are simple hash functions acceptable?
  3. Concurrency: Do we need to handle high concurrency in accessing and creating short URLs?

Assuming the answers are:

  1. Yes, encoding the same URL multiple times should yield the same short URL.
  2. Simple hash-based methods are acceptable since there is no mention of security requirements.
  3. We do not have to handle high concurrency considerations explicitly.

Strategy

Data Structures

Time Complexity

Code

import string
import random
import hashlib

class Codec:
    def __init__(self):
        self.url_map = {}
        self.counter = 0
        self.base_url = "http://tinyurl.com/"

    def _generate_hash(self, longUrl: str) -> str:
        """
        Generate a unique hash for the given URL using a combination of a hash function 
        and internal counter to avoid collisions.
        """
        self.counter += 1
        hash_object = hashlib.md5(f"{longUrl}{self.counter}".encode())
        short_hash = hash_object.hexdigest()[:6]
        return short_hash

    def encode(self, longUrl: str) -> str:
        "Encodes a URL to a shortened URL."
        # Generate short hash
        short_hash = self._generate_hash(longUrl)
        
        # Handle potential collisions
        while short_hash in self.url_map:
            short_hash = self._generate_hash(longUrl)
        
        # Store in url_map
        short_url = self.base_url + short_hash
        self.url_map[short_url] = longUrl
        return short_url

    def decode(self, shortUrl: str) -> str:
        "Decodes a shortened URL to its original URL."
        return self.url_map.get(shortUrl, "")

# Example usage
codec = Codec()
long_url = "https://leetcode.com/problems/design-tinyurl"
short_url = codec.encode(long_url)
print("Short URL:", short_url)
assert codec.decode(short_url) == long_url
print("Decoded URL:", codec.decode(short_url))

Explanation

  1. Initialization:
    • An instance of Codec initializes the URL mapping dictionary and the base URL.
  2. Generate Hash:
    • The _generate_hash method uses the MD5 hash function to create a unique hash of the URL concatenated with a counter, ensuring uniqueness.
  3. Encode:
    • Encodes the long URL using the hash and stores the mapping in the dictionary.
  4. Decode:
    • Simply looks up the short URL in the dictionary and returns the corresponding long URL.

This design ensures that the encoder and decoder are both efficient and handle potential hash collisions.

Try our interview co-pilot at AlgoAdvance.com