Why is this Rails association loaded individually after eager loading?

Asked: 2010-01-21 16:40:46

Tags: ruby-on-rails activerecord associations eager-loading

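I'm trying to avoid the N+1 query problem with eager loading, but it isn't working. The associated models are still being loaded individually.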

Here are the relevant ActiveRecord models and their relationships:

class Player < ActiveRecord::Base
  has_one :tableau
end

class Tableau < ActiveRecord::Base
  belongs_to :player
  has_many :tableau_cards
  has_many :deck_cards, :through => :tableau_cards
end

class TableauCard < ActiveRecord::Base
  belongs_to :tableau
  belongs_to :deck_card, :include => :card
end

class DeckCard < ActiveRecord::Base
  belongs_to :card
  has_many :tableaus, :through => :tableau_cards
end

class Card < ActiveRecord::Base
  has_many :deck_cards
end

class Turn < ActiveRecord::Base
  belongs_to :game
end

The query I'm using is in this method on Player:

def tableau_contains(card_id)
  self.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]
  contains = false
  for tableau_card in self.tableau.tableau_cards
    # my logic here, looking at attributes of the Card model, with        
    # tableau_card.deck_card.card;
    # individual loads of related Card models related to tableau_card are done here
  end
  return contains
end

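Is it related to scope? This tableau_contains method sits a few method calls down inside a larger loop. I originally tried doing the eager loading there, because there are several places where these same objects are looped over and checked. Eventually I tried the code as shown above, with the load right before the loop, and I still see individual SELECT queries for Card inside the tableau_cards loop in the log. I can also see the eager-loading query with the IN clause just before the tableau_cards loop.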

EDIT: additional info below, with the larger outer loop

EDIT2: corrected the loop below, following hints from the answers

EDIT3: added more details in the loop

Here is the larger loop. It is inside an observer, in after_save:

def after_save(pa)
  turn = Turn.find(pa.turn_id, :include => :player_actions)
  game = Game.find(turn.game_id, :include => :goals)
  game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
  if turn.phase_complete(pa, players)  # calls player.tableau_contains(card)
    for goal in game.goals
      if goal.checks_on_this_phase(pa)
        if goal.is_available(players, pa, turn)
          for player in game.players
            goal.check_if_player_takes(player, turn, pa)
              ... # loop through player.tableau_cards
            end
          end
        end
      end
    end
  end

Here is the relevant code in the Turn class:

def phase_complete(phase, players)
  all_players_complete = true
  for player in players
    if(!player_completed_phase(player, phase))
      all_players_complete = false
    end
  end
  return all_players_complete
end

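The line "for player in game.players" is executing another query to load the players. It is cached, by which I mean it has the CACHE label in the log, but I would have expected no query at all, because game.players should already be loaded in memory.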

Another snippet, from the Goal model:

class Goal < ActiveRecord::Base
  has_many :game_goals
  has_many :games, :through => :game_goals
  has_many :player_goals
  has_many :players, :through => :player_goals

  def check_if_player_takes(player, turn, phase)
    ...
    for tab_card in player.tableau_cards
    ...
  end
end

3 answers:

Answer 0: (score: 6)

Try this:

class Game
  has_many :players
end

Change the logic of tableau_contains as follows:

class Player < ActiveRecord::Base
  has_one :tableau
  belongs_to :game

  def tableau_contains(card_id)
    tableau.tableau_cards.any?{|tc| tc.deck_card.card.id == card_id}
  end

end
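
As a plain-Ruby illustration of why any? can replace the flag-and-loop version, here is a minimal sketch. The Struct classes are hypothetical stand-ins for the ActiveRecord models (there is no database involved), so this shows only the traversal logic, not the query behaviour.

```ruby
# Hypothetical stand-ins for the ActiveRecord models (plain Structs, no database).
Card        = Struct.new(:id)
DeckCard    = Struct.new(:card)
TableauCard = Struct.new(:deck_card)
Tableau     = Struct.new(:tableau_cards)

class Player
  attr_reader :tableau

  def initialize(tableau)
    @tableau = tableau
  end

  # Returns true as soon as one tableau card points at the given card id;
  # any? stops iterating at the first match.
  def tableau_contains(card_id)
    tableau.tableau_cards.any? { |tc| tc.deck_card.card.id == card_id }
  end
end

tableau_cards = [3, 7, 9].map { |id| TableauCard.new(DeckCard.new(Card.new(id))) }
player = Player.new(Tableau.new(tableau_cards))

puts player.tableau_contains(7)   # true
puts player.tableau_contains(42)  # false
```

With real models the same block would walk tc.deck_card.card on each record, which is why those associations need to be loaded up front.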

Change the logic of after_save as follows:

def after_save(turn)
  game = Game.find(turn.game_id, :include => :goals)
  Rails.logger.info("Begin  eager loading..")                
  players = game.players.all(:include => [:player_goals,
            {:tableau => [:tableau_cards=> [:deck_card => [:card]]]} ])
  Rails.logger.info("End  eager loading..")                
  Rails.logger.info("Begin  tableau_contains check..")                
  if players.any?{|player| player.tableau_contains(turn.card_id)}
    # do something..                
  end
  Rails.logger.info("End  tableau_contains check..")                
end

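The game.players.all(...) statement in the after_save method eager loads the data needed for the tableau_contains check. Calls such as tableau.tableau_cards and tc.deck_card.card should not hit the database.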

Issues in your code:

1) Assigning an array to a has_many association

@game.players = Player.find :all, :include => ...

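The statement above is not a simple assignment. It updates the rows of the players table with the game_id of the given game. I assume that is not what you want. If you check the database table, you will notice that the updated_time of the players rows has changed after the assignment.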

You have to assign the value to a separate variable, as shown in the after_save code sample.

2) Hand-coded association SQL

In many places in your code you are hand-coding the SQL for association data. Rails provides associations for this.

E.g.:

tcards= TableauCard.find :all, :include => [ {:deck_card => (:card)}], 
         :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]

can be rewritten as:

tcards = tableau.tableau_cards.all(:include => [ {:deck_card => (:card)}])

The tableau_cards association on the Tableau model builds the same SQL that you hand-coded.

You can improve the statement above further by adding a has_many :through association to the Player class.

class Player
  has_one :tableau
  has_many :tableau_cards, :through => :tableau
end

tcards = tableau_cards.all(:include => [ {:deck_card => (:card)}])

Edit 1

I created an application to test this code. It works as expected. Rails runs several SQL statements to eager load the data, i.e.:

Begin  eager loading..
SELECT * FROM `players` WHERE (`players`.game_id = 1) 
SELECT `tableau`.* FROM `tableau` WHERE (`tableau`.player_id IN (1,2))
SELECT `tableau_cards`.* FROM `tableau_cards` 
          WHERE (`tableau_cards`.tableau_id IN (1,2))
SELECT * FROM `deck_cards` WHERE (`deck_cards`.`id` IN (6,7,8,1,2,3,4,5))
SELECT * FROM `cards` WHERE (`cards`.`id` IN (6,7,8,1,2,3,4,5))
End  eager loading..
Begin  tableau_contains check..
End  tableau_contains check..

I don't see any SQL being executed after the data is eager loaded.
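
The difference between per-row loading and the batched IN (...) queries in the log above can be mimicked without Rails. The sketch below is a toy in-memory model: FakeDb, find_card and find_cards are hypothetical names invented for illustration, not ActiveRecord API. It only counts how many "queries" each style issues.

```ruby
# Toy in-memory "database" with a query counter. This is NOT ActiveRecord;
# FakeDb and its methods are hypothetical names used only for illustration.
class FakeDb
  attr_reader :query_count

  def initialize(cards)
    @cards = cards        # { id => name }
    @query_count = 0
  end

  # Models one "SELECT ... WHERE id = ?" per call.
  def find_card(id)
    @query_count += 1
    @cards[id]
  end

  # Models one "SELECT ... WHERE id IN (...)" for the whole batch.
  def find_cards(ids)
    @query_count += 1
    ids.each_with_object({}) { |id, found| found[id] = @cards[id] }
  end
end

CARD_ROWS = { 1 => "first", 2 => "second", 3 => "third" }.freeze
card_ids  = [1, 2, 3]

# N+1 style: one lookup per row inside the loop.
lazy_db = FakeDb.new(CARD_ROWS)
card_ids.each { |id| lazy_db.find_card(id) }
puts "lazy queries:  #{lazy_db.query_count}"   # lazy queries:  3

# Eager style: one batched lookup, then the loop only touches memory.
eager_db = FakeDb.new(CARD_ROWS)
loaded = eager_db.find_cards(card_ids)
card_ids.each { |id| loaded[id] }
puts "eager queries: #{eager_db.query_count}"  # eager queries: 1
```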

Edit 2

Make the following changes to your code.

def after_save(pa)
  turn = Turn.find(pa.turn_id, :include => :player_actions)
  game = Game.find(turn.game_id, :include => :goals)
  players = game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
  if turn.phase_complete(pa, game, players)
    for player in players  # iterate the eager-loaded array, not game.players
      if(player.tableau_contains(card))
      ...
      end
    end
  end
end
def phase_complete(phase, game, players)
  all_players_complete = true
  for player in players
    if(!player_completed_phase(player, phase))
      all_players_complete = false
    end
  end
  return all_players_complete
end

Caching works as follows:

game.players # cached in the game object
game.players.all # not cached in the game object

players = game.players.all(:include => [:player_goals])
players.first.player_goals # cached

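The second statement above results in a custom association query, so AR does not cache the result. By contrast, player_goals is cached for every player object in the third statement, because those records are fetched with the standard association SQL.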
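
The caching rule above can be pictured with a small plain-Ruby model. FakeGame, players_all and run_query are hypothetical names, and this is only a sketch of the behaviour described here, not how ActiveRecord is implemented: the association reader memoizes its result, while the finder-style call returns a fresh array every time and leaves the association cache untouched.

```ruby
# Toy model of the caching rule described above. FakeGame is a hypothetical
# class, not ActiveRecord: `players` memoizes, `players_all` never does.
class FakeGame
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Like `game.players`: loaded once, then served from the object.
  def players
    @players ||= run_query
  end

  # Like `game.players.all(...)`: a fresh query and a fresh array each time.
  def players_all
    run_query
  end

  private

  def run_query
    @query_count += 1
    @rows.map(&:dup)
  end
end

game = FakeGame.new(%w[ann bob])

eager = game.players_all          # query 1, result held only by `eager`
game.players                      # query 2: the association cache was still empty
game.players                      # no query: served from the cache
puts game.query_count             # 2
puts eager.equal?(game.players)   # false: two different array objects
```

This is why the eager-loaded result has to be kept in its own variable and passed along, rather than expecting a later game.players call to return it.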

Answer 1: (score: 1)

The first problem is that you are resetting player.tableau.tableau_cards every time:

player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 

If that is supposed to be a temporary array, then you are doing more work than necessary. The following would be better:

temp_tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 

If you are actually trying to set tableau_cards and then do something with them, I would also separate the two operations.

player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 
card.whatever_logic if player.tableau.tableau_cards.include? card

Again, it looks like you are doubling up on the query when you don't need to.

Answer 2: (score: 1)

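What happens if you separate the cards = TableauCard.find... call from the player.tableau.tableau_cards = cards call? Maybe Rails resets the association's cached records at that point in the code and reloads the association afterwards.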

It would also let you make sure that the same array is passed into tableau_contains, by passing the variable explicitly.

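It appears that you are trying to retain eager-loaded associations across multiple calls to the player.cards.tableau_cards association. I'm not sure that is possible with the way Rails works. I believe it caches the raw data returned from the SQL statement, but not the actual array that is returned. So: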

  def test_association_identity
   a = player.tableau.tableau_cards.all(
          :include => {:deck_card => :card}) 
          #=> Array with object_id 12345
          # and all the eager loaded deck and card associations set up
   b = player.tableau.tableau_cards 
          #=> Array 320984230 with no eager loaded associations set up. 
          #But no extra sql query since it should be cached.
   assert_equal a.object_id, b.object_id #probably fails 
   a.each{|card| card.deck_card.card}
   puts("shouldn't have fired any sql queries, 
         unless the b call reloaded the association magically.")
   b.each{|card| card.deck_card.card; puts("should fire a query 
                                        for each deck_card and card")}
  end

The only other thing I can think of is to sprinkle some output throughout the code and see exactly where the lazy loading happens.

Here's what I mean:

#Observer

def after_save(pa)
  @game = Game.find(turn.game_id, :include => :goals)
  @game.players = Player.find( :all, 
                :include => [ {:tableau => (:tableau_cards)},:player_goals ], 
                :conditions => ['players.game_id =?', @game.id])
  for player in @game.players
    cards = TableauCard.find( :all, 
          :include =>{:deck_card => :card}, 
          :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id])
    logger.error("First load")
    player.tableau.tableau_cards =  cards #See above comments as well.
    # Both sides of this ^ line should always be == since: 
    # Given player.tableau => Tableau(n) then Tableau(n).tableau_cards 
    # will all have tableau_id == n. In other words, if there are 
    # `tableau_cards.`tableau_id = n in the db (as in the find call),
    # then they'll already be found in the tableau.tableau_cards call.
    logger.error("Any second loads?")
    if(tableau_contains(cards,card))
       logger.error("There certainly shouldn't be any loads here.") 
       #so that we're not relying on any additional association calls, 
       #this should at least remove one point of confusion.
    ...
    end
  end
end

#Also in the Observer, for just these purposes (it can be moved back out 
#to Player after the subject problem here is understood better)

def tableau_contains(cards,card_id)
  contains = false
          logger.error("Is this for loop loading the cards?")
  for card in cards
           logger.error("Are they being loaded after `card` is set?")
    # my logic here, looking at attributes of the Card model, with        
    # card.deck_card.card;
    logger.error("What about prior to this call?")
  end
  return contains
end
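
One way to tighten the logging approach above is to count how many statements are recorded while each suspect section runs, instead of reading the log by eye. The sketch below is plain Ruby with a hypothetical QueryLog class; in a real application the record calls would have to come from whatever hook captures SQL statements, which is an assumption not shown here.

```ruby
# Hypothetical helper: collects statements and reports which ones were
# recorded while a given block ran. QueryLog stands in for whatever captures
# SQL in a real app; it is not part of Rails.
class QueryLog
  attr_reader :statements

  def initialize
    @statements = []
  end

  def record(sql)
    @statements << sql
  end

  # Runs the block and returns only the statements recorded while it ran.
  def during
    before = @statements.length
    yield
    @statements[before..-1]
  end
end

log = QueryLog.new

loading = log.during do
  log.record("SELECT * FROM tableau_cards WHERE tableau_id IN (1,2)")
  log.record("SELECT * FROM cards WHERE id IN (1,2,3)")
end

looping = log.during do
  # A correctly eager-loaded loop should record nothing in this section.
end

puts "while loading: #{loading.length}"  # while loading: 2
puts "while looping: #{looping.length}"  # while looping: 0
```

A non-zero count for the looping section would point at the exact block where the lazy loads are triggered.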