How to count all docs, Azure DocumentDB (answers)

【Question title】: How to count all docs, azure DocumentDB
【Posted】: 2016-07-26 07:59:55
【Question description】:

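The following SP is an attempt to count all the documents in a collection and, more generally, to learn how to work through a complete collection.

For some reason the SP returns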

{"count":0,"QueryCount":0}

whereas I expected it to return

{"count":1000, "QueryCount":1}

SP:

    function CountAll(continuationToken) {
    var collection = getContext().getCollection();
    var results =0;
    var queryCount = 0;
    var pageSize = 1000;
    var responseOptionsContinuation;
    var accepted = true;

    var responseOptions = { continuation: continuationToken, pageSize : pageSize};

    if (accepted) {
        accepted = collection.readDocuments(collection.getSelfLink(), responseOptions, onReadDocuments);
        responseOptions.continuation = responseOptionsContinuation;
    }
    setBody();



    function onReadDocuments(err, docFeed, responseOptions) {
        queryCount++;
        if (err) {
            throw 'Error while reading document: ' + err;
        }

        results += docFeed.length;
        responseOptionsContinuation = responseOptions.continuation;
    }

    function setBody() {
        var body = { count: results,  QueryCount: queryCount};
        getContext().getResponse().setBody(body);
    }
}

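【Question discussion】:

Not sure I understand the question, but you can use ORDER BY and LIMIT in SQL/Linq for exact paging, or you can use options = new FeedOptions { MaxItemCount = 10 }; for approximate paging. Does that help?
Paging would help if I could keep asking for the next page until I have all the pages. Otherwise, how would you query 20,000 documents?
You can set MaxItemCount to -1 to specify "fetch as many as possible". I'm still not following. What are you trying to do? What code are you using to try to do it? What output/response do you expect? How does that differ from what you actually get?
I changed the question to one that is easier to answer.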

Tags: .net azure azure-cosmosdb

【Solution 1】:

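Note that DocumentDB now returns the total document count as a response header. You can get it as an O(1) operation by calling GET /colls/collectionName (ReadDocumentCollectionAsync in .NET):

The server returns this information today. Unfortunately, the SDK does not currently expose this property. We will fix this in the next SDK update. Until then, you can try this.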

ResourceResponse<DocumentCollection> collectionReadResponse = await client.ReadDocumentCollectionAsync(…);
String quotaUsage = collectionReadResponse.ResponseHeaders["x-ms-resource-usage"];

// Quota usage is a semicolon (;) delimited list of key=value pairs.
// The key "documentsCount" holds the actual document count.

This is what the header looks like.

"functions=0;storedProcedures=0;triggers=0;documentSize=10178;documentsSize=5781669;documentsCount=17151514;collectionSize=10422760";

In this example, the document count is about 17M (17151514).
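To pull the count out of that header string, a small parser is enough. The following is a minimal sketch in JavaScript, assuming the header keeps the `key=value;key=value` shape shown above; `parseResourceUsage` is a hypothetical helper name, not part of any SDK.

```javascript
// Hypothetical helper: turn the "x-ms-resource-usage" header value into an
// object whose values are numbers. Assumes the "key=value;key=value" shape
// shown in the sample header above.
function parseResourceUsage(headerValue) {
    var usage = {};
    headerValue.split(";").forEach(function (pair) {
        var parts = pair.split("=");
        // Skip empty or malformed segments (for example a trailing ";").
        if (parts.length === 2 && parts[0].trim() !== "") {
            usage[parts[0].trim()] = Number(parts[1]);
        }
    });
    return usage;
}

var header = "functions=0;storedProcedures=0;triggers=0;documentSize=10178;" +
    "documentsSize=5781669;documentsCount=17151514;collectionSize=10422760";
var usage = parseResourceUsage(header);
console.log(usage.documentsCount); // 17151514
```

The same split-on-semicolon, split-on-equals logic carries over directly to C# if you are reading the header from `ResponseHeaders` as in the snippet above.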

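【Discussion】:

Thanks for the explanation and the additional information.

【Solution 2】:

You are on the right track. It just needs a few tweaks. Your trouble seems to be in the way you write async code. It took me a while to get used to writing async code in JavaScript. I'm sure you'll get it. Here is what I noticed: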

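I don't see anything in your callback onReadDocuments() that tries to run another query after a page of 1000 documents comes back. Inside onReadDocuments(), you need to test that the continuation token is not empty and that accepted is still true. If both conditions hold, you should execute this statement again: accepted = collection.readDocuments(collection.getSelfLink(), responseOptions, onReadDocuments);
Also, this line, which sits outside onReadDocuments(), probably doesn't do what you expect: responseOptions.continuation = responseOptionsContinuation; It is unnecessary there, because you already set the continuation just above it, and responseOptionsContinuation won't get its new value until after the callback has been called.
Using responseOptions as the name of the last parameter of onReadDocuments() is confusing, because that parameter holds the response headers rather than the request options you submitted. Rename it to options.
You seem to have three different ways of referring to the continuation token, and you don't consistently pass along the one you set. Suggestion: rename the sproc's parameter from continuationToken to continuationTokenForThisSPROCExecution. You already initialize it into responseOptions, so that's good; just update it to the new name. However, in onReadDocuments(), execute responseOptions.continuation = options.continuation;
Just to make sure you understand: a sproc call can get through many 1,000-document pages before it times out (at least 10,000 in my experience on an unloaded system). So you are fine with the above changes, but if the sproc ever times out, you will need to handle it differently, which involves some work on the client side. You need to pass the latest continuation token back in the body, and the client, if it sees a response with a continuation token, needs to call the sproc again (with that continuation token). You will then either need to pass the current count back into the sproc so it can keep adding to it, or accumulate it on the client side.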
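Putting the points above together, the stored procedure might look like the sketch below. It is one reading of the suggestions rather than the answerer's own code. The `continuation` field in the response body and the inner `readPage` helper are names assumed here for illustration, and the early-exit path relies on readDocuments returning false when the request is not accepted.

```javascript
function CountAll(continuationTokenForThisSPROCExecution) {
    var collection = getContext().getCollection();
    var results = 0;
    var queryCount = 0;
    // Request options: start from the token the caller passed in (if any).
    var responseOptions = {
        continuation: continuationTokenForThisSPROCExecution,
        pageSize: 1000
    };

    readPage();

    // Hypothetical helper: queue a read for the next page.
    function readPage() {
        var accepted = collection.readDocuments(
            collection.getSelfLink(), responseOptions, onReadDocuments);
        if (!accepted) {
            // The read was not queued, so stop here and hand the current
            // token back so the client can resume from this point.
            setBody(responseOptions.continuation);
        }
    }

    function onReadDocuments(err, docFeed, options) {
        if (err) {
            throw new Error('Error while reading documents: ' + err);
        }
        queryCount++;
        results += docFeed.length;
        if (options.continuation) {
            // More pages remain: carry the new token into the next request.
            responseOptions.continuation = options.continuation;
            readPage();
        } else {
            // No token left: the whole collection has been read.
            setBody(null);
        }
    }

    function setBody(continuation) {
        // "continuation" is an assumed field name for the resume token.
        getContext().getResponse().setBody({
            count: results,
            QueryCount: queryCount,
            continuation: continuation
        });
    }
}
```

The body is only set from inside the callback chain (or when a read is refused), which is why the count no longer comes back as 0: the original called setBody() before any callback had run.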

Here is a complete example in CoffeeScript (which compiles to JavaScript). Note that if you use documentdb-utils, it will keep calling the sproc until it is done. Otherwise, you need to do that yourself.
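If you do it yourself, the client-side loop can be sketched as follows. This assumes the sproc returns its latest continuation token in the response body under a field named `continuation` (an assumed name) and accumulates the count on the client. `countAllDocuments` and `executeSproc` are hypothetical: `executeSproc(continuation, callback)` stands for whatever SDK call you use to run the sproc once, and is not a real SDK function.

```javascript
// Hypothetical client-side driver: call the sproc repeatedly until it stops
// returning a continuation token, summing the per-call counts.
function countAllDocuments(executeSproc, done) {
    var total = 0;

    function callOnce(continuation) {
        executeSproc(continuation, function (err, body) {
            if (err) {
                return done(err);
            }
            total += body.count;
            if (body.continuation) {
                callOnce(body.continuation); // sproc stopped early: resume
            } else {
                done(null, total);           // no token left: finished
            }
        });
    }

    callOnce(null);
}

// Stand-in for a real sproc call, for illustration only: two executions.
var fakeBodies = {
    "start": { count: 10000, continuation: "token-1" },
    "token-1": { count: 2500, continuation: null }
};
function fakeExecuteSproc(continuation, callback) {
    callback(null, fakeBodies[continuation || "start"]);
}

countAllDocuments(fakeExecuteSproc, function (err, total) {
    console.log(total); // 12500
});
```

Accumulating on the client keeps the sproc signature simple. The alternative mentioned above, passing the running count back into the sproc, would need a second sproc parameter.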

【Discussion】: