algoadvance

Given an array of integers citations where citations[i] is the number of citations a researcher received for their i-th paper, and citations is sorted in an ascending order, return the researcher’s h-index.

The h-index is defined as the maximum value h such that the researcher has published h papers that have each been cited at least h times.

Clarifying Questions

  1. Is it guaranteed that the array citations is sorted in ascending order?
    • Yes, the problem statement guarantees that the array is sorted in ascending order.
  2. What should be returned if the input array is empty?
    • Return 0, as no papers imply an h-index of 0.
  3. What is the maximum size of the input array?
    • The problem does not specify, but for the purpose of this problem, let’s assume the array size can be large, potentially up to (10^6).

Strategy

Given that the array is sorted, a binary search approach would work efficiently. The typical strategy for finding the h-index in a sorted array is:

  1. Initialize two pointers: left at the start (0) and right at the end (len(citations) - 1) of the array.
  2. Use binary search to find the maximum h-index:
    • Calculate mid as the average of left and right.
    • Determine the number of papers with citations >= citations[mid] by n - mid (where n is the total number of papers).
    • Check if citations[mid] is a valid h-index by comparing it to n - mid.
    • Adjust the binary search range accordingly until left exceeds right.
  3. Return the point where the h-index condition was last met.

Time Complexity

The time complexity of the binary search approach is (O(\log n)), where n is the number of papers.

Code

def hIndex(citations):
    n = len(citations)
    left, right = 0, n - 1
    while left <= right:
        mid = (left + right) // 2
        if citations[mid] == n - mid:
            return citations[mid]
        elif citations[mid] < n - mid:
            left = mid + 1
        else:
            right = mid - 1
    return n - left

# Example usage:
citations = [0, 1, 3, 5, 6]
print(hIndex(citations))  # Output: 3

In this code, the binary search helps efficiently find the h-index by narrowing down the potential values based on the sorted order of the citations array.

Try our interview co-pilot at AlgoAdvance.com