【Question title】:Django prefetch_related on a large dataset
【Posted】:2017-05-12 18:28:52
【Question description】:

I am running into a problem with Django's prefetch_related. As an example, consider these models:

from django.db import models

class Client(models.Model):
    name = models.CharField(max_length=255)

class Purchase(models.Model):
    client = models.ForeignKey('Client', on_delete=models.CASCADE)

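Suppose we have only a few clients, say 200, but they buy a lot, so we have millions of purchases.

If I have to build a web page that shows every client along with the number of purchases for each client, I would have to write something like this: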

from django.db.models import Prefetch
from .models import Purchase, Client

purchases = Purchase.objects.all()
clients = Client.objects.prefetch_related(Prefetch('purchase_set', queryset=purchases))

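The problem here is that I will query the entire, very large purchases table, and that query can take more than a minute or, worse, raise a MemoryError on the server.

So I tried selecting only a batch from that table: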

purchases = Purchase.objects.all()[:9]

But, as you might expect, Django does not like that and raises this exception:

Traceback (most recent call last):
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 67, in _wrapper
    return bound_func(*args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/decorators/cache.py", line 57, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 63, in bound_func
    return func.__get__(self, type(self))(*args2, **kwargs2)
****************** login decorators, views, ... 
  File "project/***.py", line ***, in ***
    for client in clients:
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 258, in __iter__
    self._fetch_all()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1076, in _fetch_all
    self._prefetch_related_objects()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
    prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
    obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
    prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
  File "project/venv/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 539, in get_prefetch_queryset
    queryset = queryset.filter(**query)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 790, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 802, in _filter_or_exclude
    "Cannot filter a query once a slice has been taken."
AssertionError: Cannot filter a query once a slice has been taken.
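
The traceback ends in `queryset = queryset.filter(**query)`: the prefetch has to restrict the supplied queryset to the clients being loaded, and a filter applied after a LIMIT does not mean the same thing as a filter applied before it. The following is only an illustration of that difference, using the standard-library sqlite3 module rather than Django; the table name `purchase` and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase (id INTEGER PRIMARY KEY, client_id INTEGER)")
# Hypothetical data: purchases 1-3 belong to client 1, purchases 4-6 to client 2.
conn.executemany(
    "INSERT INTO purchase (id, client_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2)],
)

# Filter first, then slice: the first two purchases of client 2.
filter_then_slice = conn.execute(
    "SELECT id FROM purchase WHERE client_id = 2 ORDER BY id LIMIT 2"
).fetchall()

# Slice first, then filter: the first two purchases overall,
# none of which belongs to client 2.
slice_then_filter = conn.execute(
    "SELECT id FROM (SELECT id, client_id FROM purchase ORDER BY id LIMIT 2) "
    "WHERE client_id = 2 ORDER BY id"
).fetchall()

print(filter_then_slice)  # [(4,), (5,)]
print(slice_then_filter)  # []
```

Because the two orders can return different rows, Django refuses to guess which one was intended and raises the assertion instead.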

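So for now I have no real solution. I am looking at how the __iter__ function at django/db/models/query.py:258 is built, to try to write a function with the same behavior that accepts a limited set in the prefetch, so that the work can be paginated and done in a more parallel way.

Is there a "good way" to run this kind of query?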
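
One possible way to keep memory bounded, sketched under assumptions and not a Django API: walk the purchases table in fixed-size batches keyed on the primary key and aggregate along the way. The example uses sqlite3 so that it runs on its own; the function name `count_purchases_in_batches`, the table layout, and the batch size are all hypothetical. In Django the same loop could be written with `Purchase.objects.filter(pk__gt=last_id).order_by('pk')[:batch_size]`, since slicing after filtering is allowed.

```python
import sqlite3
from collections import Counter


def count_purchases_in_batches(conn, batch_size=1000):
    """Count purchases per client, fetching at most batch_size rows at a time."""
    counts = Counter()
    last_id = 0  # assumes positive integer primary keys
    while True:
        rows = conn.execute(
            "SELECT id, client_id FROM purchase WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for _purchase_id, client_id in rows:
            counts[client_id] += 1
        last_id = rows[-1][0]  # resume after the last row seen
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase (id INTEGER PRIMARY KEY, client_id INTEGER)")
# Hypothetical data: 10 purchases spread over clients 1, 2 and 3.
conn.executemany(
    "INSERT INTO purchase (client_id) VALUES (?)",
    [(1,), (1,), (2,), (1,), (3,), (2,), (1,), (3,), (3,), (1,)],
)

counts = count_purchases_in_batches(conn, batch_size=3)
print(dict(counts))
```

Batching like this is mainly useful when each purchase row has to be processed in Python; when only totals are needed, the database can compute them directly.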

    Tags: python django postgresql django-models query-optimization


    【Solution 1】:

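    Suppose we have only a few clients, say 200, but they buy a lot, so we have millions of purchases.

    If I have to build a web page that shows every client and the number of purchases for each client, ...

    I read your question as asking for exactly this. Have you tried: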

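    from django.db.models import Count
    from .models import Client

    # Annotate each client with its purchase count, computed by the database
    clients = Client.objects.annotate(num_purchases=Count('purchase'))
    clients[0].num_purchases  # purchase count of the first client returned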
    

    If you want to sort and get the clients with the most purchases, you can also do:

    clients = Client.objects.annotate(num_purchases=Count('purchase')).order_by('-num_purchases')[:5]
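
For reference, the annotated queries above correspond roughly to a LEFT OUTER JOIN with GROUP BY, so the database returns one row per client instead of millions of purchase rows. This is a minimal sketch of that SQL, run through sqlite3 so that it is self-contained. The table names and sample data are illustrative, and the exact SQL Django emits will differ; it can be inspected with `print(clients.query)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client (id)
    );
    """
)
# Hypothetical data: alice has 3 purchases, bob has 1, carol has none.
conn.executemany(
    "INSERT INTO client (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
conn.executemany(
    "INSERT INTO purchase (client_id) VALUES (?)",
    [(1,), (1,), (1,), (2,)],
)

# One aggregated row per client, highest purchase count first, top 5 only.
rows = conn.execute(
    """
    SELECT client.id, client.name, COUNT(purchase.id) AS num_purchases
    FROM client
    LEFT OUTER JOIN purchase ON purchase.client_id = client.id
    GROUP BY client.id, client.name
    ORDER BY num_purchases DESC
    LIMIT 5
    """
).fetchall()

print(rows)  # [(1, 'alice', 3), (2, 'bob', 1), (3, 'carol', 0)]
```

The LEFT OUTER JOIN keeps clients that have no purchases, which show up with a count of 0.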
    

    For more features, see https://docs.djangoproject.com/en/1.11/topics/db/aggregation/
