Speeding up my Ruby code

Time: 2013-02-24 16:24:57

Tags: ruby performance

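I've written an algorithm for selecting users. It is used on an infinite-scrolling web page, and it runs slower than I would like.

I don't know Ruby well enough to identify ways to make it more efficient. Does anyone have any ideas?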

def gen_users(list_length)
  new_selection = []

  # get all users
  all_users = User.all

  # shuffle them randomly
  shuffled_users = all_users.shuffle

  # cycle through all users randomly
  shuffled_users.each do |user|
    # check user profile isn't already in current selection
    if !@users.include?(user)
      # check user profile exists
      if user.etkh_profile
        profile_completeness = user.etkh_profile.get_profile_completeness

        # check user profile meets minimum requirements
        if profile_completeness >= MIN_PROFILE_COMPLETENESS && user.avatar? \
          && user.etkh_profile.background.length >= MIN_BACKGROUND_LENGTH

          # insert randomness and bias towards profiles with high completeness
          r = Random.new
          rand = r.rand(1..10)  # random integer between 1 and 10
          product = rand * profile_completeness

          # threshold is defined by the probability that a profile with min profile completeness
          # will be selected

          max_product = MIN_PROFILE_COMPLETENESS * 10
          threshold = (1 - PROBABILITY_MIN_PROFILE) * max_product

          if product >= threshold
            # add to total list
            @users << user

            # add to list of latest selection
            new_selection << user
          end
        end
      end
    end

    # exit loop if enough users have been found
    break if new_selection.length >= list_length
  end

  # return this selection
  return new_selection
end

2 answers:

Answer 0 (score: 3)

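Two things you are doing wrong:

  • threshold is constant. You should not recompute it on every pass through the loop.
  • Random.new should be reused. That is what it is for. You should not create a new instance on every iteration.

My refactoring of your code (with improvements from steenslag) would look like this: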

THRESHOLD = (1 - PROBABILITY_MIN_PROFILE) * MIN_PROFILE_COMPLETENESS * 10
RANDOM_GENERATOR = Random.new

def gen_users(list_length)
  (User.all - @users)
  .select do |user|
    profile = user.etkh_profile and
    profile.background.length >= MIN_BACKGROUND_LENGTH and
    (completeness = profile.get_profile_completeness) >= MIN_PROFILE_COMPLETENESS and
    RANDOM_GENERATOR.rand(1..10) * completeness >= THRESHOLD
  end
  .select(&:avatar?)
  .sample(list_length)
  .tap{|a| @users.concat(a)}
end
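
To try this kind of chain outside a Rails app, here is a minimal sketch that uses plain Struct stand-ins. The Struct definitions, the constant values, and the method name pick_users are assumptions made for illustration only; the original post does not show the models or the values of its constants.

```ruby
# Hypothetical stand-ins for the Rails models; the method names mirror
# the calls used in the answer above.
Profile = Struct.new(:background, :completeness) do
  def get_profile_completeness
    completeness
  end
end

User = Struct.new(:name, :etkh_profile, :has_avatar) do
  def avatar?
    has_avatar
  end
end

# Assumed values; the original post does not state them.
MIN_PROFILE_COMPLETENESS = 50
MIN_BACKGROUND_LENGTH    = 10
PROBABILITY_MIN_PROFILE  = 0.5

# Computed once, not on every iteration.
THRESHOLD = (1 - PROBABILITY_MIN_PROFILE) * MIN_PROFILE_COMPLETENESS * 10
# One shared generator; seeded here only so that runs are repeatable.
RANDOM_GENERATOR = Random.new(42)

# Returns up to list_length users not yet in already_shown and records them there.
def pick_users(candidates, already_shown, list_length)
  (candidates - already_shown)
    .select do |user|
      profile = user.etkh_profile
      next false unless profile && user.avatar?
      next false if profile.background.length < MIN_BACKGROUND_LENGTH

      completeness = profile.get_profile_completeness
      completeness >= MIN_PROFILE_COMPLETENESS &&
        RANDOM_GENERATOR.rand(1..10) * completeness >= THRESHOLD
    end
    .sample(list_length)
    .tap { |picked| already_shown.concat(picked) }
end

users = [
  User.new("a", Profile.new("x" * 20, 100), true),
  User.new("b", Profile.new("x" * 20, 100), false), # no avatar
  User.new("c", nil, true),                         # no profile
  User.new("d", Profile.new("short", 100), true),   # background too short
  User.new("e", Profile.new("x" * 20, 90), true)
]
shown  = []
picked = pick_users(users, shown, 1)
puts picked.map(&:name).inspect
```

One difference from the code in the question is worth keeping in mind: select walks every remaining user before sample picks from the survivors, so there is no early exit once enough users have been found.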

Answer 1 (score: 0)

It is hard to say exactly what is causing the poor performance; it depends on the number of users. Also:

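  1. max_product and threshold are constants; you do not need to compute them on every iteration.
  2. Perhaps user.etkh_profile is a slow method? You call it three times per iteration.
  3. Also, you do not need two local variables holding large amounts of data (all_users and shuffled_users). Instead, you can do something like shuffled_users = User.all.shuffle.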
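
The second point can be illustrated with a small sketch that counts how often the profile lookup runs. CountingUser, StubProfile, and both helper methods are hypothetical names invented for this example. Whether the real user.etkh_profile is actually slow is not known from the post; if it is an ActiveRecord association, Rails normally caches the loaded record, so the gain from this change may be small.

```ruby
# Hypothetical stand-in whose profile lookup counts how often it is called.
class CountingUser
  attr_reader :lookups

  def initialize(profile)
    @profile = profile
    @lookups = 0
  end

  def etkh_profile
    @lookups += 1
    @profile
  end
end

StubProfile = Struct.new(:background, :completeness)

# Mirrors the question's loop: three separate calls to etkh_profile.
def eligible_repeated?(user, min_completeness, min_length)
  user.etkh_profile &&
    user.etkh_profile.completeness >= min_completeness &&
    user.etkh_profile.background.length >= min_length
end

# Looks the profile up once and reuses the local variable.
def eligible_cached?(user, min_completeness, min_length)
  profile = user.etkh_profile
  profile &&
    profile.completeness >= min_completeness &&
    profile.background.length >= min_length
end

a = CountingUser.new(StubProfile.new("x" * 20, 80))
b = CountingUser.new(StubProfile.new("x" * 20, 80))

eligible_repeated?(a, 50, 10)
eligible_cached?(b, 50, 10)

puts a.lookups # 3
puts b.lookups # 1
```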