Question

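In an ordinary binary search, we look for a value that appears in the array. Sometimes, however, we need to find the first element that is greater than or less than a target.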

Here is my ugly, incomplete solution:

// Assume all elements are positive, i.e., greater than zero
int bs (int[] a, int t) {
  int s = 0, e = a.length;
  int firstlarge = 1 << 30;
  int firstlargeindex = -1;
  while (s < e) {
    int m = (s + e) / 2;
    if (a[m] > t) {
      // how can I know a[m] is the first larger than
      if(a[m] < firstlarge) {
        firstlarge = a[m];
        firstlargeindex = m;
      }
      e = m - 1; 
    } else if (a[m] < /* something */) {
      // go to the right part
      // how can i know is the first less than  
    }
  }
}

Is there a more elegant solution to this kind of problem?

Answer 1

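A particularly elegant way to think about this problem is as a binary search over a transformed version of the array, where each element has been replaced by the result of applying the function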

f(x) = 1 if x > target
       0 else

Now the goal is to find the very first position at which this function takes the value 1. We can do that with a binary search as follows:

int low = 0, high = numElems; // numElems is the size of the array, i.e. arr.size()
while (low != high) {
    int mid = (low + high) / 2; // or low + (high - low) / 2 to avoid int overflow
    if (arr[mid] <= target) {
        /* This index, and everything below it, cannot be the first element
         * greater than the target, because this element is no greater
         * than the target.
         */
        low = mid + 1;
    }
    else {
        /* This element is greater than the target, so anything after it can't
         * be the first element that's greater than the target.
         */
        high = mid;
    }
}
/* Now low and high both point to the first element greater than the target,
 * or both equal numElems if there is no such element.
 */
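
The same loop can be written once against an arbitrary monotone predicate, which makes the "first position where f becomes 1" view explicit. This is only a minimal sketch and not part of the original answer: the class name FirstTrue and the method names are hypothetical, and it assumes the predicate is false for a prefix of the indices and true for the rest.

```java
import java.util.function.IntPredicate;

final class FirstTrue {
    // Returns the smallest index i in [0, n) for which p.test(i) is true,
    // or n if the predicate is false everywhere.
    // ASSUMPTION: p is monotone (false, ..., false, true, ..., true).
    static int firstTrue(int n, IntPredicate p) {
        int low = 0, high = n;
        while (low != high) {
            int mid = low + (high - low) / 2; // avoids int overflow
            if (p.test(mid)) {
                high = mid;     // mid may be the answer; nothing after it can be
            } else {
                low = mid + 1;  // mid and everything before it are ruled out
            }
        }
        return low;
    }

    // First index whose element is strictly greater than target.
    static int firstGreater(int[] arr, int target) {
        return firstTrue(arr.length, i -> arr[i] > target);
    }
}
```

For the array {0, 0, 1, 1, 1, 1} and target 0, firstGreater returns 2. A result equal to arr.length means no element is greater than the target.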

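To see that this algorithm is correct, consider each comparison it makes. If we find an element that is no greater than the target, then it and everything below it can't possibly match, so there is no need to search that region, and we continue in the right half. If we find an element that is greater than the target, then everything after it must also be greater, so none of those elements can be the first greater element and we don't need to search them. The middle element is therefore the last possible position of the answer.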

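Note that each iteration removes at least half of the remaining elements from consideration. If the top branch executes, the elements in the range [low, (low + high) / 2] are all discarded, so we lose floor((low + high) / 2) - low + 1 >= (low + high) / 2 - low = (high - low) / 2 elements.

If the bottom branch executes, the elements in the range [(low + high) / 2 + 1, high] are all discarded, so we lose high - floor((low + high) / 2) >= high - (low + high) / 2 = (high - low) / 2 elements.

Consequently, we end up finding the first element greater than the target in O(lg n) iterations of this process.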
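
The halving argument can be checked empirically. The sketch below is an illustration, not part of the original answer: the class IterationCount and its methods are hypothetical names. It counts loop iterations of the half-open search and compares the worst case with floor(log2 n) + 1, the bound that follows from the remaining range shrinking to at most half its size on every step.

```java
final class IterationCount {
    // Runs the half-open search and returns how many loop iterations it took.
    static int iterations(int[] arr, int target) {
        int low = 0, high = arr.length, count = 0;
        while (low != high) {
            int mid = low + (high - low) / 2;
            if (arr[mid] <= target) low = mid + 1; else high = mid;
            count++;
        }
        return count;
    }

    // floor(log2(n)) + 1 for n >= 1, and 0 for n == 0.
    static int bound(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    // Worst iteration count over every possible target for the array 0, 1, ..., n-1.
    static int worstCase(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = i;
        int worst = 0;
        for (int t = -1; t <= n; t++) worst = Math.max(worst, iterations(arr, t));
        return worst;
    }
}
```

For every n, worstCase(n) should never exceed bound(n).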

EDIT: Here is a trace of the algorithm running on the array 0 0 1 1 1 1 with target 0.

Initially, we have

0 0 1 1 1 1
L = 0       H = 6

So we compute mid = (0 + 6) / 2 = 3 and inspect the element at position 3, which has value 1. Since 1 > 0, we set high = mid = 3. We now have

0 0 1
L     H

We compute mid = (0 + 3) / 2 = 1, so we inspect element 1. Since its value is 0 <= 0, we set low = mid + 1 = 2. We are now left with L = 2 and H = 3:

0 0 1
    L H

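Now we compute mid = (2 + 3) / 2 = 2. The element at index 2 is 1, and since 1 > 0, we set H = mid = 2, at which point we stop, and indeed we are looking at the first element greater than 0.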
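
The trace above can be reproduced mechanically. A minimal sketch (the class TraceSearch and its trace method are hypothetical names, not from the original answer) that records low, high and mid on every iteration:

```java
import java.util.ArrayList;
import java.util.List;

final class TraceSearch {
    // Runs the half-open search and records {low, high, mid} for every iteration.
    // The last entry holds the final state, with -1 in place of mid.
    static List<int[]> trace(int[] arr, int target) {
        List<int[]> steps = new ArrayList<>();
        int low = 0, high = arr.length;
        while (low != high) {
            int mid = low + (high - low) / 2;
            steps.add(new int[] {low, high, mid});
            if (arr[mid] <= target) low = mid + 1; else high = mid;
        }
        steps.add(new int[] {low, high, -1});
        return steps;
    }
}
```

For the array {0, 0, 1, 1, 1, 1} and target 0, the recorded states are (0, 6, 3), (0, 3, 1), (2, 3, 2) and the final state low = high = 2.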

Hope this helps!

Answer 2

If the array is sorted, you can use std::upper_bound (assuming n is the size of the array a[]):

int* p = std::upper_bound( a, a + n, x );
if( p == a + n )
     std::cout << "No element greater";
else
     std::cout << "The first element greater is " << *p
               << " at position " << p - a;
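
For readers working in Java, the closest standard-library call I'm aware of is Arrays.binarySearch, which reports an insertion point when the key is absent. The sketch below is an assumption-laden illustration rather than an equivalent of std::upper_bound: the class and method names are hypothetical, and it is only correct when the sorted array contains no duplicates, because binarySearch does not guarantee which of several equal elements it finds.

```java
import java.util.Arrays;

final class UpperBoundDistinct {
    // Index of the first element greater than x, or a.length if there is none.
    // ASSUMPTION: a is sorted ascending and contains no duplicate values.
    static int firstGreater(int[] a, int x) {
        int idx = Arrays.binarySearch(a, x);
        // Found: the next index holds the next larger value (elements are distinct).
        // Not found: binarySearch returned -(insertionPoint) - 1.
        return idx >= 0 ? idx + 1 : -idx - 1;
    }
}
```

With duplicates allowed, a hand-written loop like the one in Answer 1 is the safer choice.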

Answer 3

How about the following recursive approach:

public static int minElementGreaterThanOrEqualToKey(int A[], int key,
        int imin, int imax) {

    // Return -1 if the index range is empty or if the key is greater than
    // the largest element in the range
    if (imax < imin || key > A[imax])
        return -1;

    // Return the first index of the range if that element is already greater
    // than the key.
    if (key < A[imin])
        return imin;

    // When the minimum and maximum indices become equal, we have located the element.
    if (imax == imin)
        return imax;

    else {
        // calculate midpoint to cut set in half, avoiding integer overflow
        int imid = imin + ((imax - imin) / 2);

        // if key is in upper subset, then recursively search in that subset
        if (A[imid] < key)
            return minElementGreaterThanOrEqualToKey(A, key, imid + 1, imax);

        // if key is in lower subset, then recursively search in that subset
        else
            return minElementGreaterThanOrEqualToKey(A, key, imin, imid);
    }
}
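
Note that, as its name says, this method looks for the first element greater than or equal to the key, while the question asks for the first element strictly greater. The difference is a single comparison operator. A minimal iterative sketch of both variants (the class Bounds and its method names are hypothetical, and unlike the recursive method these return a.length rather than -1 when no element qualifies):

```java
final class Bounds {
    // First index with a[i] >= key, or a.length if there is none.
    static int firstGreaterOrEqual(int[] a, int key) {
        int low = 0, high = a.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (a[mid] < key) low = mid + 1; else high = mid;
        }
        return low;
    }

    // First index with a[i] > key, or a.length if there is none.
    static int firstGreater(int[] a, int key) {
        int low = 0, high = a.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (a[mid] <= key) low = mid + 1; else high = mid;
        }
        return low;
    }
}
```

For {1, 3, 3, 3, 7} and key 3, firstGreaterOrEqual returns 1 and firstGreater returns 4.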

Answer 4

My implementation below uses the condition bottom <= top, which differs from @templatetypedef's answer.

int FirstElementGreaterThan(int n, const vector<int>& values) {
  int B = 0, T = values.size() - 1, M = 0;
  while (B <= T) { // B strictly increases, T strictly decreases
    M = B + (T - B) / 2;
    if (values[M] <= n) { // all values at or before M are not the target
      B = M + 1;
    } else {
      T = M - 1;// search for other elements before M
    }
  }
  return T + 1;
}

Answer 5

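Here is modified binary search code in Java, with time complexity O(log n), that:

returns the index of the searched element if it is present in the array
returns the index of the next greater element if the searched element is not present
returns -1 if the searched element is greater than the largest element of the array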

public static int search(int arr[],int key) {
    int low=0,high=arr.length,mid=-1;
    boolean flag=false;

    while(low<high) {
        mid=(low+high)/2;
        if(arr[mid]==key) {
            flag=true;
            break;
        } else if(arr[mid]<key) {
            low=mid+1;
        } else {
            high=mid;
        }
    }
    if(flag) {
        return mid;
    }
    else {
        // low - 1 is the index of the next smaller element (or -1 if there is none)
        if(low>=arr.length)
            return -1;
        else
            return low;
    }
}

public static void main(String args[]) {
    int arr[]={12,15,54,221,712};
    int key=71;
    System.out.println(search(arr,key)); // prints 3, the index of 221
}
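
The question also asks about the nearest smaller element. A minimal sketch, assuming the array is sorted in ascending order (the class Predecessor and the method lastSmaller are hypothetical names): the last element strictly less than the key sits one position before the first element greater than or equal to the key.

```java
final class Predecessor {
    // Index of the last element strictly less than key, or -1 if there is none.
    // ASSUMPTION: arr is sorted ascending.
    static int lastSmaller(int[] arr, int key) {
        int low = 0, high = arr.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (arr[mid] < key) low = mid + 1; else high = mid;
        }
        // low is now the first index with arr[low] >= key (or arr.length).
        return low - 1;
    }
}
```

For {12, 15, 54, 221, 712} and key 71, lastSmaller returns 2, the index of 54.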

Answer 6

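After many years of teaching algorithms, my approach to binary search problems is to place the start and the end on elements of the array, not outside it. That way I can feel what is going on, everything is under control, and the solution involves no magic.

The key to solving binary search problems (and many other loop-based problems) is a good set of invariants. Choosing the right invariant makes the problem a piece of cake. It took me many years to really grasp the concept of invariants, even though I first learned it in college long ago.

Even if you want to solve binary search problems with the start or the end outside the array, you can still do so with a proper invariant. That said, my choice, as stated above, is always to put the start on the first element and the end on the last element of the array.

So, to summarize, so far we have: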

int start = 0; 
int end = a.length - 1;

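Now the invariant. The array we currently have is [start, end]. We don't know anything about these elements yet. All of them might be greater than the target, all of them might be smaller, or some might be smaller and some greater. So we can't make any assumptions about them so far. Our goal is to find the first element greater than the target, so we choose the invariants like this:

Any element to the right of end is greater than the target.
Any element to the left of start is less than or equal to the target.

We can easily see that the invariants hold at the beginning (i.e. before entering the loop). All the elements to the left of start (there are none) are less than or equal to the target, and the same reasoning applies to end.

With these invariants, when the loop finishes, the first element after end is the answer (remember the invariant that everything to the right of end is greater than the target?). So answer = end + 1.

We also need to note that when the loop finishes, start is one more than end, i.e. start = end + 1. So, equivalently, we can say that start is the answer as well (the invariant says everything to the left of start is less than or equal to the target, so start itself is the first element greater than the target).

With all that said, here is the code. You should feel comfortable with every line of it, and there should be no magic at all. If not, please comment on what is unclear and I will be more than happy to answer.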

public static int find(int a[], int target) {
    int st = 0; 
    int end = a.length - 1; 
    while(st <= end) {
        int mid = (st + end) / 2;   // or elegant way of st + (end - st) / 2; 
        if (a[mid] <= target) {
            st = mid + 1; 
        } else { // a[mid] > target
            end = mid - 1; 
        }
    }
    return st; // or return end + 1
}
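
The two invariants can also be checked at run time. The sketch below is an illustration only, not the author's code: the class CheckedFind and the helper checkInvariants are hypothetical, and the check is O(n) per iteration, so it is meant for testing rather than production use.

```java
final class CheckedFind {
    // Same closed-interval search, with the invariants verified on every iteration.
    static int find(int[] a, int target) {
        int st = 0, end = a.length - 1;
        while (st <= end) {
            checkInvariants(a, target, st, end);
            int mid = st + (end - st) / 2;
            if (a[mid] <= target) st = mid + 1; else end = mid - 1;
        }
        checkInvariants(a, target, st, end);
        if (st != end + 1) throw new IllegalStateException("expected st == end + 1");
        return st; // first index with a[i] > target, or a.length if there is none
    }

    // Everything left of st is <= target; everything right of end is > target.
    private static void checkInvariants(int[] a, int target, int st, int end) {
        for (int i = 0; i < st; i++)
            if (a[i] > target) throw new IllegalStateException("left invariant broken at " + i);
        for (int i = end + 1; i < a.length; i++)
            if (a[i] <= target) throw new IllegalStateException("right invariant broken at " + i);
    }
}
```

Calling find on {0, 0, 1, 1, 1, 1} with target 0 returns 2 without tripping either check. An empty array returns 0.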

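A few extra notes about this way of solving binary search problems:

This type of solution always shrinks the subarray by at least 1. That is obvious in the code: the new start or end is mid + 1 or mid - 1. I prefer this to including mid on one or both sides and then having to argue later why the algorithm is correct. This way it is more tangible and less error-prone.

The condition of the while loop is st <= end, not st < end. That means the smallest input that enters the while loop is an array of size 1, which is exactly what we expect. In other ways of solving binary search problems, the smallest size is sometimes an array of size 2 (if st < end).

I hope this clarifies the solution to this problem and to many other binary search problems. Treat it as a way to understand and solve more binary search problems professionally, without ever worrying about whether the algorithm works for edge cases.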

Answer 7

public static int search(int target, int[] arr) {
    if (arr == null || arr.length == 0)
        return -1;
    int lower = 0, higher = arr.length - 1, last = -1;
    while (lower <= higher) {
        int mid = lower + (higher - lower) / 2;
        if (target == arr[mid]) {
            last = mid;
            lower = mid + 1;
        } else if (target < arr[mid]) {
            higher = mid - 1;
        } else {
            lower = mid + 1;
        }
    }
    return (last > -1 && last < arr.length - 1) ? last + 1 : -1;
}

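If we find target == arr[mid], then every earlier element is less than or equal to target, so the lower boundary is set to lower = mid + 1. Also, last is the last index at which target occurs. Finally, we return last + 1, taking the boundary conditions into account (-1 is returned when target does not occur in the array or when its last occurrence is the final element).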

Answer 8

kind = 0: exact match, kind = 1: just greater than x, kind = -1: just smaller than x.

Returns -1 if no match is found.
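
A minimal sketch of a search with the interface described above, assuming a sorted int array. The class KindSearch, the method signature, and the reading of "just greater" as the first element greater than x and "just smaller" as the last element less than x are all assumptions, not taken from this answer.

```java
final class KindSearch {
    // kind ==  0: index of the first element equal to x
    // kind ==  1: index of the first element greater than x
    // kind == -1: index of the last element less than x
    // Returns -1 if no such element exists. ASSUMPTION: a is sorted ascending.
    static int search(int[] a, int x, int kind) {
        if (kind < -1 || kind > 1)
            throw new IllegalArgumentException("kind must be -1, 0 or 1");
        int low = 0, high = a.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            // For kind == 1 skip elements <= x; otherwise skip elements < x.
            boolean goRight = (kind == 1) ? a[mid] <= x : a[mid] < x;
            if (goRight) low = mid + 1; else high = mid;
        }
        if (kind == 1) return low < a.length ? low : -1;
        if (kind == 0) return (low < a.length && a[low] == x) ? low : -1;
        return low - 1; // kind == -1; this is -1 when no element is smaller
    }
}
```

For {1, 3, 3, 7} and x = 3, the three kinds 0, 1 and -1 return 1, 3 and 0 respectively.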


Find the first element in a sorted array that is greater than the target

8 answers: