【Question title】: Azure Search .NET SDK: How to use "FindFailedActionsToRetry"?
【Posted】: 2020-04-30 00:42:44
【Question】:

When you index documents with the Azure Search .NET SDK, the call may throw an IndexBatchException.

From the documentation here:

        try
        {
            var batch = IndexBatch.Upload(documents);
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            // Sometimes when your Search service is under load, indexing will fail for some of the documents in
            // the batch. Depending on your application, you can take compensating actions like delaying and
            // retrying. For this simple demo, we just log the failed document keys and continue.
            Console.WriteLine(
                "Failed to index some of the documents: {0}",
                String.Join(", ", e.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key)));
        }

How should e.FindFailedActionsToRetry be used to create a new batch that retries indexing for the failed actions?

I wrote a function like this:

    public void UploadDocuments<T>(SearchIndexClient searchIndexClient, IndexBatch<T> batch, int count) where T : class, IMyAppSearchDocument
    {
        try
        {
            searchIndexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            if (count == 5) //we will try to index 5 times and give up if it still doesn't work.
            {
                throw new Exception("IndexBatchException: Indexing Failed for some documents.");
            }

            Thread.Sleep(5000); // we got an error; wait 5 seconds and try again (in case it's an intermittent or network issue)

            var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());
            UploadDocuments(searchIndexClient, retryBatch, count + 1); // count++ would pass the old value, so the limit of 5 would never be reached
        }
    }

But I think this part is wrong:

var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());

【Question comments】:

    Tags: azure azure-cognitive-search azure-search-.net-sdk


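    【Solution 1】:

    The second parameter of FindFailedActionsToRetry, named keySelector, is a function that should return whichever property of your model type represents the document key. In your example the model type is not known at compile time inside UploadDocuments, so you need to change UploadDocuments to also take a keySelector parameter and pass it through to FindFailedActionsToRetry. The caller of UploadDocuments then has to supply a lambda specific to type T. For example, if T is the sample Hotel class from the sample code in this article, the lambda must be hotel => hotel.HotelId, because HotelId is the property of Hotel that is used as the document key.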
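A minimal sketch of that change, assuming the Microsoft.Azure.Search package used in the question. The class name IndexingRetry, the maxAttempts parameter, and the 2^attempt delay are illustrative choices and not part of the SDK. The empty-batch check is an added assumption, so that nothing is sent once no retriable actions remain.

```csharp
using System;
using System.Linq;
using System.Threading;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class IndexingRetry
{
    // keySelector returns the document key of a T, e.g. hotel => hotel.HotelId.
    public static void UploadDocuments<T>(
        ISearchIndexClient searchIndexClient,
        IndexBatch<T> batch,
        Func<T, string> keySelector,
        int maxAttempts = 5) where T : class
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                searchIndexClient.Documents.Index(batch);
                return; // every action in the batch succeeded
            }
            catch (IndexBatchException e)
            {
                // Give up after maxAttempts and surface the original exception.
                if (attempt >= maxAttempts) throw;

                // Keep only the actions that failed and are worth retrying.
                batch = e.FindFailedActionsToRetry(batch, keySelector);

                // Assumption: if nothing retriable is left, stop silently.
                if (!batch.Actions.Any()) return;

                // Illustrative delay: 2s, 4s, 8s, ... between attempts.
                Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
}
```

A caller would then supply the type-specific lambda, for instance IndexingRetry.UploadDocuments(indexClient, IndexBatch.Upload(hotels), hotel => hotel.HotelId).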

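    Incidentally, the wait in your catch block should not be a fixed amount of time. If your search service is under heavy load, waiting for a constant delay won't really give it time to recover. Instead, we recommend backing off exponentially (for example, the first delay is 2 seconds, then 4 seconds, then 8 seconds, then 16 seconds, up to some maximum).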
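The delay schedule described above could be computed roughly as follows. This is only a sketch: the Backoff class, its Delay method, and the 60-second default cap are hypothetical names and values chosen for illustration, since the answer only says "some maximum".

```csharp
using System;

public static class Backoff
{
    // Delay before retry number retryAttempt (1-based): 2s, 4s, 8s, 16s, ...
    // capped at maxSeconds (the 60-second default is an assumption).
    public static TimeSpan Delay(int retryAttempt, double maxSeconds = 60)
    {
        if (retryAttempt < 1)
            throw new ArgumentOutOfRangeException(nameof(retryAttempt));

        double seconds = Math.Min(Math.Pow(2, retryAttempt), maxSeconds);
        return TimeSpan.FromSeconds(seconds);
    }
}
```

Inside a catch block this would replace a fixed wait with something like Thread.Sleep(Backoff.Delay(attempt)), where attempt counts the retries so far.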

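    【Comments】:

    • Thanks, Bruce. I can see that it works. I changed the code to: var retryBatch = e.FindFailedActionsToRetry(batch, searchDoc => searchDoc.id);
    • Ironically, my code did back off exponentially, but I changed it to a flat 5 seconds for the sake of this post and simplicity. I'll change it back. How many retries with exponentially increasing delays would you recommend? I currently have it set to 5.
    • You can keep retrying as long as you are making progress (the batch has fewer items than in the previous indexing call), and limit the number of retries only when you are not making progress. In that case, the maximum number of retries should be based on how long you are willing to wait, since the delay grows exponentially. Past a certain point you can switch from exponential to constant delay (for example, once the delay reaches a few minutes, or whatever you find works for you).
    • Is there any way to test this? For example, making the index operation fail partway through 1000 records to see whether the exception is thrown?
    • @KevinCohen Have you tried mocking IDocumentsOperations.IndexWithHttpMessagesAsync? docs.microsoft.com/dotnet/api/…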
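    【Solution 2】:

    I took Bruce's recommendations from his answer and comment and implemented them using Polly.

    • Exponential backoff up to one minute, then retry once every minute.
    • Keep retrying as long as progress is made. Time out after 5 requests without any progress.
    • IndexBatchException is also thrown for unknown documents. I chose to ignore such non-transient failures, since they likely indicate that the request is no longer relevant (for example, the document was deleted in a separate request).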
    int curActionCount = work.Actions.Count();
    int noProgressCount = 0;
    
    await Polly.Policy
        .Handle<IndexBatchException>() // One or more of the actions has failed.
        .WaitAndRetryForeverAsync(
            // Exponential backoff (2s, 4s, 8s, 16s, ...) and constant delay after 1 minute.
            retryAttempt => TimeSpan.FromSeconds( Math.Min( Math.Pow( 2, retryAttempt ), 60 ) ),
            (ex, _) =>
            {
                var batchEx = ex as IndexBatchException;
                work = batchEx.FindFailedActionsToRetry( work, d => d.Id );
    
                // Verify whether any progress was made.
                int remainingActionCount = work.Actions.Count();
                if ( remainingActionCount == curActionCount ) ++noProgressCount;
                curActionCount = remainingActionCount;
            } )
        .ExecuteAsync( async () =>
        {
            // Limit retries if no progress is made after multiple requests.
            if ( noProgressCount > 5 )
            {
                throw new TimeoutException( "Updating Azure search index timed out." );
            }
    
            // Only retry if the error is transient (determined by FindFailedActionsToRetry).
            // IndexBatchException is also thrown for unknown document IDs;
            // consider them outdated requests and ignore.
            if ( curActionCount > 0 )
            {
                await _search.Documents.IndexAsync( work );
            }
        } );
    
