Median of medians select in Python

Date: 2016-11-02 08:14:17

Tags: python arrays algorithm select deterministic

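I am implementing the selection algorithm (i.e., deterministic select). I have it working for small arrays/lists, but when my array size gets above 26 it fails with the following error: "RuntimeError: maximum recursion depth exceeded". For arrays of size 25 and under, there is no problem.

My end goal is to run it on arrays of size 500 over many iterations. The iterations are not the problem. I have already searched StackOverflow and read the post Python implementation of "median of medians" algorithm, among others. I had a hunch that duplicates in my randomly generated array might be causing the problem, but that does not seem to be the case.

Here is my code: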

import math
import random

# Insertion Sort Khan Academy video: https://www.youtube.com/watch?v=6pyeMmJTefg&list=PL36E7A2B75028A3D6&index=22

def insertion_sort(A):  # Sorting it in place
    for index in range(1, len(A)):# range is up to but not including len(A)
      value = A[index]
      i = index - 1           # index of the item that is directly to the left
      while i >= 0:
        if value < A[i]:
          A[i + 1] = A[i]
          A[i] = value
          i = i - 1
        else:
          break

timeslo = 0  # I think that this is a global variable

def partition(A, p):
  global timeslo
  hi = [] #hold things larger than our pivot
  lo = [] #  "     "   smaller  "   "   "
  for x in A:       # walk through all the elements in the Array A.
    if x <p:
      lo = lo + [x]
      timeslo = timeslo + 1  #keep track no. of comparisons
    else:
      hi = hi + [x]
  return lo,hi,timeslo

def get_chunks(Acopy, n):
                                    # Declare some empty lists to hold our chunks
  chunk = []
  chunks = []
                                    # Step through the array n element at a time
  for x in range(0, len(Acopy), n): # stepping by size n starting at the beginning
                                    # of the array
    chunk = Acopy[x:x+n]            # Extract 5 elements                           
                                    # sort chunk and find its median
    insertion_sort(chunk) # in place sort of chunk of size 5
    # get the median ... (i.e. the middle element)
    # Add them to list
    mindex = (len(chunk)-1)/2  # pick middle index each time
    chunks.append(chunk[mindex]) 
#     chunks.append(chunk)                        # assuming subarrays are size 5 and we want the middle
                                                  # this caused some trouble because not all subarrays were size 5
                            # index which is 2.
  return chunks


def Select(A, k): 

  if (len(A) == 1):  # if the array is size 1 then just return the one and only element
    return A[0]
  elif (len(A) <= 5): # if length is 5 or less, sort it and return the kth smallest element
    insertion_sort(A)
    return A[k-1]
  else:
    M = get_chunks(A, 5)  # this will give you the array of medians,,, don't sort it....WHY ???



    m = len(M)           # m is the size of the array of Medians M.

    x  = Select(M, m/2)# m/2 is the same as len(A)/10  FYI

    lo, hi, timeslo = partition(A, x) 

    rank = len(lo) + 1

    if rank == k: # we're in the middle -- we're done
      return x, timeslo    # return the value of the kth smallest element
    elif k < rank:
      return Select(lo, k) # ???????????????
    else:
      return Select(hi, k-rank)

################### TROUBLESHOOTING   ################################
#   Works with arrays of size 25 and 5000 iterations
#   Doesn't work with     "   26 and 5000    "
#
#  arrays of size 26 and 20 iterations breaks it    ?????????????????

# A = []
Total = 0
n = input('What size of array of random #s do you want?: ')
N = input('number of iterations: ')

# n = 26
# N = 1

for x in range(0, N):
  A = random.sample(range(1,1000), n)  # make an array or list of size n
  result = Select(A, 2)      #p is the median of the medians, 2 means the 3rd smallest element
  Total = Total + timeslo             # the total number of comparisons made
print("the result is"), result
print("timeslo = "), timeslo
print("# of comparisons = "), Total

# A = [7, 1, 3, 5, 9, 2, 83, 8, 4, 13, 17, 21, 16, 11, 77, 33, 55, 44, 66, 88, 111, 222]
# result = Select(A, 2)  
# print("Result = "), result  

Any help would be greatly appreciated.

1 Answer:

Answer 0 (score: 1)

Change this line

return x, timeslo # return the value of the kth smallest element

to

return x # return the value of the kth smallest element

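You can still get timeslo by printing it at the end. Returning timeslo together with x is incorrect, because the returned value is used in partition(A, p) to split the array, where the parameter p should be the median value produced by the previous statement, x = Select(M, m/2).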
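
As a rough illustration of the fix, the sketch below is a Python 3 variant of the selection routine that always returns a plain value. It is not the asker's code: the names select and median_of_five, the use of sorted() in place of insertion_sort, and the three-way split into lo, equal and hi are all assumptions made for this example. A likely reason the failure only shows up from size 26 is that 26 elements produce six medians, so the inner Select(M, m/2) call can reach the branch that returns the tuple (x, timeslo). In Python 2 every integer compares as smaller than a tuple, so partition then puts every element into lo and the recursion never shrinks.

```python
def median_of_five(chunk):
    """Return the middle element of a short list (at most 5 items)."""
    ordered = sorted(chunk)
    return ordered[(len(ordered) - 1) // 2]


def select(values, k):
    """Return the k-th smallest element of values (k is 1-based)."""
    if len(values) <= 5:
        return sorted(values)[k - 1]

    # Median of each chunk of 5, then the median of those medians as pivot.
    medians = [median_of_five(values[i:i + 5])
               for i in range(0, len(values), 5)]
    pivot = select(medians, (len(medians) + 1) // 2)

    # Three-way split (an assumption of this sketch, not the original
    # two-way partition): elements equal to the pivot get their own list.
    lo = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    hi = [v for v in values if v > pivot]

    if k <= len(lo):
        return select(lo, k)
    if k <= len(lo) + len(equal):
        return pivot  # a plain value, never a (value, counter) tuple
    return select(hi, k - len(lo) - len(equal))
```

With this sketch, select(A, 3) gives the 3rd smallest element of A. Keeping the pivot out of hi also makes the rank arithmetic straightforward. In the original code hi still contains the pivot, so Select(hi, k-rank) appears to be off by one.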
