Why is an entity not being fetched from NDB's in-context cache?

【Question title】: Why is an entity not being fetched from NDB's in-context cache?
【Posted】: 2013-02-13 18:55:55
【Question】:

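I have an entity that stores some global application settings. These settings can be edited through an admin HTML page, but they change very rarely. There is only one instance of this entity (a singleton of sorts), and I always refer to that instance when I need access to the settings.

It boils down to this: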

import webapp2
from google.appengine.ext import ndb

class Settings(ndb.Model):
    SINGLETON_DATASTORE_KEY = 'SINGLETON'

    @classmethod
    def singleton(cls):
        # Fetch the single Settings entity, creating it on first use.
        return cls.get_or_insert(cls.SINGLETON_DATASTORE_KEY)

    foo = ndb.IntegerProperty(
        default=100,
        verbose_name="Some setting called 'foo'",
        indexed=False)

@ndb.tasklet
def foo():
    # Even though settings has already been fetched from memcache and
    # should be available in NDB's in-context cache, the following call
    # fetches it from memcache anyways. Why?
    settings = Settings.singleton()

class SomeHandler(webapp2.RequestHandler):
    @ndb.toplevel
    def get(self):
        settings = Settings.singleton()
        # Do some stuff
        yield foo()
        self.response.write("The 'foo' setting value is %d" % settings.foo)

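I assumed that calling Settings.singleton() more than once per request handler would be fast: the first call would most likely retrieve the Settings entity from memcache (since the entity is seldom updated), and all subsequent calls within the same request handler would retrieve it from NDB's in-context cache. From the documentation:

The in-context cache persists only for the duration of a single incoming HTTP request and is "visible" only to the code that handles that request. It's fast; this cache lives in memory.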
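The expectation described above can be sketched with a small toy model. This is only an illustration under stated assumptions: FakeMemcache and FakeContext are hypothetical stand-ins written for this example, not NDB classes, and the real lookup path has more layers (the datastore, cache policies, transactions).

```python
# Toy model only: these classes are hypothetical and are NOT NDB's real API.
class FakeMemcache(object):
    """Stands in for memcache and counts how often it is read."""

    def __init__(self, data):
        self.data = dict(data)
        self.get_calls = 0

    def get(self, key):
        self.get_calls += 1
        return self.data.get(key)


class FakeContext(object):
    """Stands in for one request's context, with its own in-memory cache."""

    def __init__(self, memcache):
        self.memcache = memcache
        self.cache = {}  # plays the role of the in-context cache

    def get(self, key):
        if key in self.cache:           # already seen in this context
            return self.cache[key]
        value = self.memcache.get(key)  # miss: fall back to "memcache"
        self.cache[key] = value
        return value


memcache = FakeMemcache({'SINGLETON': {'foo': 100}})
ctx = FakeContext(memcache)
for _ in range(3):
    settings = ctx.get('SINGLETON')

# Number of simulated memcache reads for three lookups in one context.
print(memcache.get_calls)
```

In this model, repeated lookups through the same context object reach the simulated memcache only once. The guarantee is tied to the context object itself, so anything that swaps in a different context starts with an empty cache.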

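However, Appstats shows that my Settings entity is being retrieved from memcache multiple times within the same request handler. I know this from looking at the request handler's detail page in Appstats, expanding the call trace of each call to memcache.Get, and checking the memcache key being retrieved.

I use a lot of tasklets in my request handlers, and I call Settings.singleton() from within the tasklets that need access to the settings. Could this be why the Settings entity is fetched from memcache again instead of from the in-context cache? If so, what are the exact rules that govern whether and when an entity can be fetched from the in-context cache? I have not been able to find this information in the NDB documentation.

Update 2013/02/15: I am unable to reproduce this in a dummy test application. The test code is: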

import logging

class Foo(ndb.Model):
    prop_a = ndb.DateTimeProperty(auto_now_add=True)

def use_foo():
    foo = Foo.get_or_insert('singleton')
    logging.info("Function using foo: %r", foo.prop_a)

@ndb.tasklet
def use_foo_tasklet():
    foo = Foo.get_or_insert('singleton')
    logging.info("Function using foo: %r", foo.prop_a)

@ndb.tasklet
def use_foo_async_tasklet():
    foo = yield Foo.get_or_insert_async('singleton')
    logging.info("Function using foo: %r", foo.prop_a)

class FuncGetOrInsertHandler(webapp2.RequestHandler):
    def get(self):
        for i in xrange(10):
            logging.info("Iteration %d", i)
            use_foo()

class TaskletGetOrInsertHandler(webapp2.RequestHandler):
    @ndb.toplevel
    def get(self):
        logging.info("Toplevel")
        use_foo()
        for i in xrange(10):
            logging.info("Iteration %d", i)
            use_foo_tasklet()

class AsyncTaskletGetOrInsertHandler(webapp2.RequestHandler):
    @ndb.toplevel
    def get(self):
        logging.info("Toplevel")
        use_foo()
        for i in xrange(10):
            logging.info("Iteration %d", i)
            use_foo_async_tasklet()

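Before running any of the test handlers, I make sure that the Foo entity with the key name singleton exists.

Contrary to what I see in my production app, all of these request handlers show a single call to memcache.Get in Appstats.

Update 2013/02/21: I am finally able to reproduce this in a dummy test application. The test code is: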

class ToplevelAsyncTaskletGetOrInsertHandler(webapp2.RequestHandler):
    @ndb.toplevel
    def get(self):
        logging.info("Toplevel 1")
        use_foo()
        self._toplevel2()

    @ndb.toplevel
    def _toplevel2(self):
        logging.info("Toplevel 2")
        use_foo()
        for i in xrange(10):
            logging.info("Iteration %d", i)
            use_foo_async_tasklet()

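This handler does show 2 calls to memcache.Get in Appstats, just like my production code.

Indeed, in my production request handler code path, I have a toplevel called by another toplevel. It seems that a toplevel creates a new ndb context.

Changing the nested toplevel to a synctasklet fixes the problem.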

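【Comments】:

Is there a difference between get and get_or_insert?
No. Using get_by_id() instead of get_or_insert() still results in multiple fetches from memcache within a single request handler.
get_or_insert() runs in a transaction, which supposedly bypasses memcache. So this doesn't make any sense...
I am unable to reproduce this in a standalone test app. Calling get_or_insert() multiple times from sub-functions and tasklets ends up with a single memcache.Get call in Appstats. So something else in my production code must be causing this singleton to be fetched from memcache multiple times within a single handler. Investigating...
How are you testing this? memcache is only populated the first time you get the data (not when you put it), so to check that you are seeing cache hits you need to write, read, and read again (in a separate request). The second read should use memcache.

Tags: google-app-engine app-engine-ndb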

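【Answer 1】:

It seems that a toplevel creates a new ndb context.

Exactly. Each handler decorated with toplevel gets its own context and therefore a separate cache. You can look at the code for toplevel at the link below; its docstring states that toplevel is "A sync tasklet that sets a fresh default Context".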

https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/ndb/tasklets.py#1033
