【Question title】: Why is this Rails association loading individually after an eager load?
【Posted】: 2011-01-07 19:59:00
【Question】:

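I'm trying to avoid the N+1 queries problem by eager loading, but it isn't working. The associated models are still being loaded individually.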

Here are the relevant ActiveRecord models and their relationships:

class Player < ActiveRecord::Base
  has_one :tableau
end

class Tableau < ActiveRecord::Base
  belongs_to :player
  has_many :tableau_cards
  has_many :deck_cards, :through => :tableau_cards
end

class TableauCard < ActiveRecord::Base
  belongs_to :tableau
  belongs_to :deck_card, :include => :card
end

class DeckCard < ActiveRecord::Base
  belongs_to :card
  has_many :tableaus, :through => :tableau_cards
end

class Card < ActiveRecord::Base
  has_many :deck_cards
end

class Turn < ActiveRecord::Base
  belongs_to :game
end

The query I'm using is in this method of Player:

def tableau_contains(card_id)
  self.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]
  contains = false
  for tableau_card in self.tableau.tableau_cards
    # my logic here, looking at attributes of the Card model, with        
    # tableau_card.deck_card.card;
    # individual loads of related Card models related to tableau_card are done here
  end
  return contains
end

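Does it have something to do with scope? This tableau_contains method is a few method calls down inside a larger loop. I originally tried doing the eager loading there, because there are several places where these same objects are looped through and examined. Then I eventually tried the code above, with the load just before the loop, and I'm still seeing individual SELECT queries for Card inside the tableau_cards loop in the log. I can also see the eager-loading query with the IN clause just before the tableau_cards loop.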
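
To make the symptom concrete, here is a minimal plain-Ruby sketch of the difference between per-record loads and a single IN-style load. It does not use ActiveRecord; FakeDb, find_card and find_cards are hypothetical names invented for this illustration, and each call simply counts as one "query".

```ruby
# Plain-Ruby stand-in for a database: every lookup counts as one "query".
# FakeDb, find_card and find_cards are hypothetical, not ActiveRecord API.
class FakeDb
  attr_reader :queries

  def initialize(cards)
    @cards = cards # { id => name }
    @queries = 0
  end

  # One query per card: what a lazily loaded association does inside a loop.
  def find_card(id)
    @queries += 1
    @cards[id]
  end

  # One query for many cards: what an IN (...) eager load does.
  def find_cards(ids)
    @queries += 1
    @cards.select { |id, _| ids.include?(id) }
  end
end

CARDS = { 1 => "Alpha", 2 => "Beta", 3 => "Gamma" }
card_ids = [1, 2, 3]

# N+1 style: the loop itself triggers a lookup for every card.
lazy_db = FakeDb.new(CARDS)
lazy_names = card_ids.map { |id| lazy_db.find_card(id) }

# Eager style: one lookup up front, then the loop only reads from memory.
eager_db = FakeDb.new(CARDS)
preloaded = eager_db.find_cards(card_ids)
eager_names = card_ids.map { |id| preloaded[id] }

puts "lazy: #{lazy_db.queries} queries, eager: #{eager_db.queries} query"
puts lazy_names == eager_names
```

In the log described above, both patterns show up: the IN query runs first, and the per-card SELECTs still follow, which suggests the loop is reading objects other than the ones that were eager loaded.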

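EDIT: Additional info below, with the larger outer loop

EDIT2: Corrected the loop below with tips from the answers

EDIT3: Added more detail to the loop, including the goals

Here is the larger loop. It is inside an observer, in after_save: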

def after_save(pa)
  turn = Turn.find(pa.turn_id, :include => :player_actions)
  game = Game.find(turn.game_id, :include => :goals)
  game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
  if turn.phase_complete(pa, players)  # calls player.tableau_contains(card)
    for goal in game.goals
      if goal.checks_on_this_phase(pa)
        if goal.is_available(players, pa, turn)
          for player in game.players
            goal.check_if_player_takes(player, turn, pa)
            ... # loop through player.tableau_cards
          end
        end
      end
    end
  end
end

Here is the relevant code in the Turn class:

def phase_complete(phase, players)
  all_players_complete = true
  for player in players
    if(!player_completed_phase(player, phase))
      all_players_complete = false
    end
  end
  return all_players_complete
end

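for player in game.players is performing another query to load the players. It is cached, by which I mean it has the CACHE label in the log, but I expected no query at all, because game.players should already be loaded in memory.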

Another snippet, from the Goal model:

class Goal < ActiveRecord::Base
  has_many :game_goals
  has_many :games, :through => :game_goals
  has_many :player_goals
  has_many :players, :through => :player_goals

  def check_if_player_takes(player, turn, phase)
    ...
    for tab_card in player.tableau_cards
      ...
    end
  end
end

【Question comments】:

    Tags: ruby-on-rails activerecord associations eager-loading


    【Answer 1】:

    The first problem is that you are resetting player.tableau.tableau_cards every time:

    player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 
    

    If that is supposed to be a temporary array, then you are doing more work than necessary. The following would be better:

    temp_tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 
    

    If you are actually trying to set the tableau_cards and then do something with them, I would also separate the two operations:

    player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 
    card.whatever_logic if player.tableau.tableau_cards.include? card
    

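    Again, it looks like you are doubling up on the query when you don't need to.

    【Comments】:

    • Re: problem #1: I don't see how I am "resetting" player.tableau.tableau_cards. They weren't loaded before this point, so I am trying to load the tableau_cards along with their associated deck_cards and card models, once for each player. Later on, yes, I load the player's tableau_cards again in the tableau_contains method. At that point I assumed the original player's tableau_cards association had somehow been unloaded or gone out of scope, because I was seeing the individual cards being loaded in the tableau_contains method.
    • That is why I added the second TableauCard.find... in the player.tableau_contains method, to see whether it would eliminate the individual loads of single Card models, but it did not. So the original question remains: in the player.tableau_contains method, why are individual Card models being loaded after they were included in the TableauCard.find with the :include => [ {:deck_card => (:card)}] query?
    • You can eager load the tableau_cards already stored in the DB for player.tableau using player.tableau.tableau_cards(:include => {:deck_card => :card}). What @Sixty means is that by assigning to the association with player.tableau.tableau_cards= you are trying to re-set them in the database, which will usually trigger SQL INSERTs for records that don't match player.tableau.tableau_cards(). That in turn may reload the association's records. "Loaded" is a poor word choice here, because it could mean either "stored in the database" or "doesn't need to be lazy loaded from the database".
    • I think you, @Codeman, are using "load" and "set" to mean "loaded into memory", while @Sixty is using them to mean "stored in the database".
    • Typo: player.tableau.tableau_cards(:include => {:deck_card => :card}) should have .all, like: player.tableau.tableau_cards.all(:include => {:deck_card => :card})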
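    【Answer 2】:

    What happens if you separate the cards = TableauCard.find... call from the player.tableau.tableau_cards = cards call? Perhaps Rails is resetting the association's cached records at that point in the code and reloading the association afterwards.

    This would also let you make sure the same array is passed into tableau_contains, by passing the variable explicitly.

    It appears that you are trying to retain eager-loaded associations across multiple calls to the player.cards.tableau_cards association. I'm not sure this is possible with the way Rails works. I believe it caches the raw data returned from the SQL statement, not the actual array that is returned. So: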

      def test_association_identity
       a = player.tableau.tableau_cards.all(
              :include => {:deck_card => :card}) 
              #=> Array with object_id 12345
              # and all the eager loaded deck and card associations set up
       b = player.tableau.tableau_cards 
              #=> Array 320984230 with no eager loaded associations set up. 
              #But no extra sql query since it should be cached.
       assert_equal a.object_id, b.object_id #probably fails 
       a.each{|card| card.deck_card.card}
       puts("shouldn't have fired any sql queries, 
             unless the b call reloaded the association magically.")
       b.each{|card| card.deck_card.card; puts("should fire a query 
                                            for each deck_card and card")}
      end
    

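    The only other thing I can think of is to sprinkle some output throughout the code and see exactly where the lazy loading is happening.

    Here's what I mean:

    #Observer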

    def after_save(pa)
      @game = Game.find(turn.game_id, :include => :goals)
      @game.players = Player.find( :all, 
                    :include => [ {:tableau => (:tableau_cards)},:player_goals ], 
                    :conditions => ['players.game_id =?', @game.id])
      for player in @game.players
        cards = TableauCard.find( :all, 
              :include =>{:deck_card => :card}, 
              :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id])
        logger.error("First load")
        player.tableau.tableau_cards =  cards #See above comments as well.
        # Both sides of this ^ line should always be == since: 
        # Given player.tableau => Tableau(n) then Tableau(n).tableau_cards 
        # will all have tableau_id == n. In other words, if there are 
        # `tableau_cards.`tableau_id = n in the db (as in the find call),
        # then they'll already be found in the tableau.tableau_cards call.
        logger.error("Any second loads?")
        if(tableau_contains(cards,card))
           logger.error("There certainly shouldn't be any loads here.") 
           #so that we're not relying on any additional association calls, 
           #this should at least remove one point of confusion.
        ...
        end
      end
    end
    
    #Also in the Observer, for just these purposes (it can be moved back out 
    #to Player after the subject problem here is understood better)
    
    def tableau_contains(cards,card_id)
      contains = false
      logger.error("Is this for loop loading the cards?")
      for card in cards
        logger.error("Are they being loaded after `card` is set?")
        # my logic here, looking at attributes of the Card model, with
        # card.deck_card.card;
        logger.error("What about prior to this call?")
      end
      return contains
    end
    

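    【Comments】:

    • Your interpretation of my intent is correct: I am trying to read the existing player's tableau cards from the database into memory. Thanks for figuring that out, and for explaining how @Sixty and I were interpreting things differently in the comments under his answer. It never occurred to me that I was updating the existing association. It looks like I will have to use a temporary array as @Sixty suggested. I think I get it now, but it still seems odd to me that you can't eager load an existing association and then use the player.tableau.tableau_cards reference.
    • I tried eager loading the already-stored tableau_cards with player.tableau.tableau_cards(:include => {:deck_card => :card}), from your comment under @Sixty's answer; it did not load the deck_cards or cards
    • Sorry, I meant player.tableau.tableau_cards.all(:include => {:deck_card => :card})
    • OK, that helped eager load the deck_cards and cards in the outer loop, but in the player.tableau_contains method, when I do for tab_card in self.tableau.tableau_cards, it reloads the individual deck_cards and cards again; I can get by with eager loading into a temporary array in tableau_contains and looping through that wherever it is used, since that caches the query and results; but I still don't understand why the player's tableau.tableau_cards aren't being referenced in player.tableau_contains
    【Answer 3】:

    Try this: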

    class Game
      has_many :players
    end
    

    Change the tableau_contains logic as follows:

    class Player < ActiveRecord::Base
      has_one :tableau
      belongs_to :game
    
      def tableau_contains(card_id)
        tableau.tableau_cards.any?{|tc| tc.deck_card.card.id == card_id}
      end
    
    end
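
The any? form can be compared against the flag-and-loop style from the question with a small plain-Ruby sketch. The Struct classes below are hypothetical stand-ins for the models, and the flag version assumes the elided "my logic here" is a simple id comparison, which the question does not confirm.

```ruby
# Hypothetical stand-ins for Card, DeckCard and TableauCard (not ActiveRecord).
SketchCard = Struct.new(:id)
SketchDeckCard = Struct.new(:card)
SketchTableauCard = Struct.new(:deck_card)

# Flag-and-loop style, as in the question; always walks the whole list.
def contains_with_flag(tableau_cards, card_id)
  contains = false
  for tc in tableau_cards
    contains = true if tc.deck_card.card.id == card_id
  end
  contains
end

# any? style, as in this answer; stops at the first match.
def contains_with_any(tableau_cards, card_id)
  tableau_cards.any? { |tc| tc.deck_card.card.id == card_id }
end

cards = [3, 4].map do |id|
  SketchTableauCard.new(SketchDeckCard.new(SketchCard.new(id)))
end

puts contains_with_flag(cards, 4) == contains_with_any(cards, 4)
puts contains_with_any(cards, 9)
```

Neither form changes how many queries run; that depends only on whether deck_card and card are already loaded on the objects being iterated.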
    

    Change the after_save logic as follows:

    def after_save(turn)
      game = Game.find(turn.game_id, :include => :goals)
      Rails.logger.info("Begin  eager loading..")                
      players = game.players.all(:include => [:player_goals,
                {:tableau => [:tableau_cards=> [:deck_card => [:card]]]} ])
      Rails.logger.info("End  eager loading..")                
      Rails.logger.info("Begin  tableau_contains check..")                
      if players.any?{|player| player.tableau_contains(turn.card_id)}
        # do something..                
      end
      Rails.logger.info("End  tableau_contains check..")                
    end
    

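    The game.players.all(:include => ...) call in the after_save method eager loads the data needed to perform the tableau_contains check. Calls such as tableau.tableau_cards and tc.deck_card.card should not hit the database after that.

    Issues in your code:

    1) Assigning an array to a has_many association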

    @game.players = Player.find :all, :include => ...
    

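    The statement above is not a simple assignment. It updates the players table rows with the game_id of the given game. I am assuming that is not what you want. If you check the DB table, you will notice that the updated_time of the players table rows has changed after the assignment.

    You have to assign the value to a separate variable, as shown in the code sample in the after_save method.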
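
The difference between a plain variable and an association-style writer can be sketched in plain Ruby. This is only a model of the behaviour described above, not ActiveRecord's actual writer: SketchRow and SketchOwner are hypothetical classes, and the rule that rows missing from the current collection get re-pointed and saved is an assumption made for the illustration.

```ruby
# Hypothetical plain-Ruby sketch; SketchRow and SketchOwner are not Rails classes.
class SketchRow
  attr_reader :id, :writes
  attr_accessor :owner_id

  def initialize(id, owner_id)
    @id = id
    @owner_id = owner_id
    @writes = 0 # counts simulated UPDATE statements
  end

  def save
    @writes += 1
  end
end

class SketchOwner
  attr_reader :id, :rows

  def initialize(id)
    @id = id
    @rows = []
  end

  # Assumed writer behaviour: rows missing from the current collection are
  # pointed at this owner and saved; rows already present are left alone.
  def rows=(new_rows)
    (new_rows - @rows).each do |row|
      row.owner_id = id
      row.save
    end
    @rows = new_rows
  end
end

owner = SketchOwner.new(1)
found = [SketchRow.new(10, 1), SketchRow.new(11, 1)]

temp = found # plain local variable: nothing is written
writes_after_temp = found.sum(&:writes)

owner.rows = found # association-style writer: rows are saved
writes_after_assign = found.sum(&:writes)

puts "after temp: #{writes_after_temp}, after assign: #{writes_after_assign}"
```

Under these assumptions the local variable costs nothing, while the writer has side effects, which is why keeping the query result in its own variable is the safer choice.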

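    2) Hand-coding association SQL

    In many places in the code you are hand-coding the SQL for association data. Rails provides associations for this.

    For example: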

    tcards= TableauCard.find :all, :include => [ {:deck_card => (:card)}], 
             :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]
    

    can be rewritten as:

    tcards = tableau.tableau_cards.all(:include => [ {:deck_card => (:card)}])
    

    The tableau_cards association on the Tableau model constructs the same SQL that you hand-coded.

    You can improve the statement above further by adding a has_many :through association to the Player class.

    class Player
      has_one :tableau
      has_many :tableau_cards, :through => :tableau
    end
    
    tcards = tableau_cards.all(:include => [ {:deck_card => (:card)}])
    

    Edit 1

    I created an application to test this code. It works as expected. Rails runs several SQL statements to eager load the data, i.e.:

    Begin  eager loading..
    SELECT * FROM `players` WHERE (`players`.game_id = 1) 
    SELECT `tableau`.* FROM `tableau` WHERE (`tableau`.player_id IN (1,2))
    SELECT `tableau_cards`.* FROM `tableau_cards` 
              WHERE (`tableau_cards`.tableau_id IN (1,2))
    SELECT * FROM `deck_cards` WHERE (`deck_cards`.`id` IN (6,7,8,1,2,3,4,5))
    SELECT * FROM `cards` WHERE (`cards`.`id` IN (6,7,8,1,2,3,4,5))
    End  eager loading..
    Begin  tableau_contains check..
    End  tableau_contains check..
    

    I didn't see any SQL executed after the data was eager loaded.
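
The shape of that log, one IN query per association level followed by no further SQL, can be imitated with a small plain-Ruby sketch. The in-memory TABLES hash, its data, and the select_in and preload_tableau_cards helpers are all hypothetical; this is not how ActiveRecord is implemented, only an illustration of the idea.

```ruby
# Hypothetical in-memory tables; names and data are made up for illustration.
TABLES = {
  tableau_cards: [{ id: 1, tableau_id: 1, deck_card_id: 6 },
                  { id: 2, tableau_id: 2, deck_card_id: 7 }],
  deck_cards:    [{ id: 6, card_id: 3 }, { id: 7, card_id: 4 }],
  cards:         [{ id: 3, name: "Alpha" }, { id: 4, name: "Beta" }]
}

SQL_LOG = []

# One "query" per call: fetch every row whose column value is in `values`.
def select_in(table, column, values)
  SQL_LOG << "SELECT * FROM #{table} WHERE #{column} IN (#{values.join(',')})"
  TABLES[table].select { |row| values.include?(row[column]) }
end

# Preload tableau_cards -> deck_card -> card with one query per level,
# then link the rows together in memory.
def preload_tableau_cards(tableau_ids)
  tcs = select_in(:tableau_cards, :tableau_id, tableau_ids)
  dcs = select_in(:deck_cards, :id, tcs.map { |tc| tc[:deck_card_id] })
  cards = select_in(:cards, :id, dcs.map { |dc| dc[:card_id] })

  cards_by_id = cards.map { |c| [c[:id], c] }.to_h
  dcs_by_id = dcs.map { |dc| [dc[:id], dc.merge(card: cards_by_id[dc[:card_id]])] }.to_h
  tcs.map { |tc| tc.merge(deck_card: dcs_by_id[tc[:deck_card_id]]) }
end

loaded = preload_tableau_cards([1, 2])

# Walking the preloaded structure adds nothing to SQL_LOG.
names = loaded.map { |tc| tc[:deck_card][:card][:name] }
puts names.inspect
puts SQL_LOG
```

The key point is that only the objects returned by the preload carry the linked data; any separately fetched copy of the same rows would have to be looked up again.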

    Edit 2

    Make the following changes to your code.

    def after_save(pa)
      turn = Turn.find(pa.turn_id, :include => :player_actions)
      game = Game.find(turn.game_id, :include => :goals)
      players = game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
      if turn.phase_complete(pa, game, players)
        for player in game.players
          if(player.tableau_contains(card))
          ...
          end
        end
      end
    end
    def phase_complete(phase, game, players)
      all_players_complete = true
      for player in players
        if(!player_completed_phase(player, phase))
          all_players_complete = false
        end
      end
      return all_players_complete
    end
    

    Caching works as follows:

    game.players # cached in the game object
    game.players.all # not cached in the game object
    
    players = game.players.all(:include => [:player_goals])
    players.first.player_goals # cached
    

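    The second statement above results in a custom association query, so AR doesn't cache the results, whereas player_goals are cached for every player object in the third statement, because they are fetched using the standard association SQL.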
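
A minimal plain-Ruby sketch of that caching rule follows. It is not ActiveRecord code: SketchGame, SketchPlayer, players_all_with_goals and QUERY_LOG are hypothetical names, and the sketch simply assumes that a plain association reader memoizes its result while a finder with options returns fresh objects and leaves the memoized array alone.

```ruby
# Hypothetical plain-Ruby model of the caching rule; not ActiveRecord code.
# QUERY_LOG stands in for the SQL log.
QUERY_LOG = []

class SketchPlayer
  attr_reader :name, :goals

  def initialize(name, goals = nil)
    @name = name
    @goals = goals # nil means "not eager loaded"
  end

  def goals_loaded?
    !@goals.nil?
  end
end

class SketchGame
  # Plain association reader: loads once, then returns the memoized array.
  def players
    @players ||= begin
      QUERY_LOG << "SELECT players"
      [SketchPlayer.new("ann"), SketchPlayer.new("bob")]
    end
  end

  # Finder with options: always runs, returns new objects, and never
  # touches the memoized @players array.
  def players_all_with_goals
    QUERY_LOG << "SELECT players" << "SELECT player_goals"
    [SketchPlayer.new("ann", []), SketchPlayer.new("bob", [])]
  end
end

game = SketchGame.new
eager = game.players_all_with_goals # result must be kept in a variable
plain = game.players                # separate load, separate objects

puts eager.all?(&:goals_loaded?)
puts plain.any?(&:goals_loaded?)
puts plain.equal?(game.players)
```

If the rule holds as described, looping over game.players after calling game.players.all(...) iterates objects that were never eager loaded, so the loop should use the players variable instead.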

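    【Comments】:

    • Do I still need to do the query for players? I changed the for loop to just for player in game.players and it seems to work fine.
    • The players query loads the data objects you need. Subsequent calls to game.players will return the cached list. You could combine the two statements into one, but I wrote two for readability.
    • I used game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ]) in after_save, and then your tableau.tableau_cards.any?{|tc| tc.deck_card.card.id == card_id} in tableau_contains, and it is still loading each deck_card and card individually in the loop in tableau_contains; I have double-checked whether the player's tableau or tableau_cards associations are being reset anywhere, and I can't find anything, so I still don't know why the deck_cards and cards are being queried individually after the eager load above it
    • Updated my answer, take a look
    • I see the same queries in the eager loading operation; that is not the issue. The issue is that the queries are repeated later on for objects that I believed were already in memory. I have added more info to the original post, showing one of the places where queries for the same information are being executed.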