algoadvance

Leetcode 433. Minimum Genetic Mutation

Problem Statement

LeetCode Problem 433: Minimum Genetic Mutation

A gene string can be represented by an 8-character long string, with characters from the set {'A', 'C', 'G', 'T'}.

Given a gene start string, a gene end string, and a list of bank strings (all of which represent valid genetic sequences), return the minimum number of mutations needed to transform the start gene to the end gene using the following rules:

  1. Each mutation must change exactly one character in the gene string.
  2. The mutation must result in a new gene string that is in the bank.

If it is not possible to mutate the start gene to the end gene, return -1.

Clarifying Questions

  1. What if start or end is not in the bank?
    • start does not need to be in the bank, but end must be in the bank for a valid transformation to exist.
  2. Can there be duplicate strings in the bank?
    • No, the bank contains distinct strings.
  3. Are all strings in bank guaranteed to be 8 characters long and composed of {A, C, G, T}?
    • Yes, all strings are valid and of the correct format.
  4. Is the input guaranteed to be valid?
    • Yes, based on the problem statement.

Strategy

This problem can be formulated as an unweighted graph traversal problem where each gene string is a node and an edge exists between two nodes if they differ by exactly one character (i.e., one mutation).

Steps:

  1. Breadth-First Search (BFS):
    • BFS is suitable because it explores all possibilities level by level (i.e., all one-step mutations from current strings at each level).
    • Perform BFS starting from the start gene string.
    • Each valid mutation (a string present in the bank and not visited yet) is added to the BFS queue.
    • Keep track of the number of mutations (levels) taken to reach the end gene string.
    • If end is reached during BFS, return the number of steps taken.
  2. Edge Case Handling:
    • If end is not in the bank, immediately return -1.

Time Complexity:

Code

#include <iostream>
#include <vector>
#include <unordered_set>
#include <queue>
#include <string>

using namespace std;

// Function to check if two strings differ by exactly one character
bool oneMutationApart(const string &s1, const string &s2) {
    int diffCount = 0;
    for (int i = 0; i < s1.size(); ++i) {
        if (s1[i] != s2[i]) {
            ++diffCount;
            if (diffCount > 1) return false;
        }
    }
    return diffCount == 1;
}

int minMutation(string start, string end, vector<string>& bank) {
    unordered_set<string> bankSet(bank.begin(), bank.end());
    if (!bankSet.count(end)) return -1;

    queue<pair<string, int>> bfsQueue;
    bfsQueue.push({start, 0});
    unordered_set<string> visited;
    visited.insert(start);

    vector<char> genes = {'A', 'C', 'G', 'T'};

    while (!bfsQueue.empty()) {
        auto [current, steps] = bfsQueue.front();
        bfsQueue.pop();

        if (current == end) {
            return steps;
        }

        for (int i = 0; i < current.size(); ++i) {
            char originalChar = current[i];
            for (char gene : genes) {
                if (gene == originalChar) continue;

                current[i] = gene;
                if (bankSet.count(current) && !visited.count(current)) {
                    bfsQueue.push({current, steps + 1});
                    visited.insert(current);
                }
            }
            current[i] = originalChar; // revert back for the next iteration
        }
    }

    return -1;
}

int main() {
    string start = "AACCGGTT";
    string end = "AACCGGTA";
    vector<string> bank = {"AACCGGTA"};

    cout << "Minimum mutations: " << minMutation(start, end, bank) << endl;

    return 0;
}

Explanation of the Code:

  1. oneMutationApart Function: Helper function to check if two strings differ by exactly one character.
  2. minMutation Function: Main function that uses BFS to find the minimum number of mutations needed.
  3. Initialization: Convert the bank to an unordered_set for O(1) look-up times.
  4. BFS Traversal: Use a queue to keep track of the current string and the number of mutations made.
  5. Mutate and Enqueue: For each position in the current string, try mutating it with ‘A’, ‘C’, ‘G’, and ‘T’, and if the mutated string is in the bank and not visited, enqueue it.
  6. Return Result: If the end string is reached, return the number of mutations; otherwise, when the traversal is complete and end is not reached, return -1.

Cut your prep time in half and DOMINATE your interview with AlgoAdvance AI