【Question title】: How to use batchWriteItem to write more than 25 items into a DynamoDB table using PHP
【Posted】: 2020-07-01 07:00:02
【Question description】:

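I am using the AWS SDK for PHP 3.x.

A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.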

    $result = $dynamodbClient->batchWriteItem([
        'RequestItems' => [
            $tableName => [
                [
                    'PutRequest' => [
                        'Item' => [
                            'Id' => ['N' => '1'],
                            'AlbumTitle' => [
                                'S' => 'Somewhat Famous',
                            ],
                            'Artist' => [
                                'S' => 'No One You Know',
                            ],
                            'SongTitle' => [
                                'S' => 'Call Me Today',
                            ],
                        ],
                    ],
                ],
            ],
        ],
    ]);

It works fine for a single item. How can I write more than 25 items?

【Question discussion】:

    Tags: amazon-dynamodb


    【Solution 1】:

    To write more than 25 items, you have to call BatchWriteItem repeatedly, adding items from your collection 25 at a time.

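    Something like this (a PHP sketch; $sourceCollection is assumed to hold items that are already marshaled):

    $requests = []; // use an array to stage your put item requests
    foreach ($sourceCollection as $item) {
        $requests[] = ['PutRequest' => ['Item' => $item]]; // add this item to the array
        if (count($requests) >= 25) { // when you have 25 ready..
            $batch = array_splice($requests, 0, 25); // take 25, leave the rest staged
            $result = $dynamodbClient->batchWriteItem(['RequestItems' => [$tableName => $batch]]);
            // put the failed items back so they are retried with a later batch
            $requests = array_merge($result['UnprocessedItems'][$tableName] ?? [], $requests);
        }
    }
    // flush what is left; in real code, add a delay and a retry limit to this loop
    while (count($requests) > 0) {
        $batch = array_splice($requests, 0, 25);
        $result = $dynamodbClient->batchWriteItem(['RequestItems' => [$tableName => $batch]]);
        $requests = array_merge($result['UnprocessedItems'][$tableName] ?? [], $requests);
    }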
    

    Make sure you handle failed items: add the UnprocessedItems from each batchWriteItem result back into the requests.
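As an alternative way to form the groups of 25, here is a minimal sketch that uses PHP's built-in array_chunk. The helper name buildBatchRequests, the table name 'Music', and the placeholder items are assumptions made for illustration; none of them come from the AWS SDK. It only builds the request arrays, so it runs without the SDK installed.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the AWS SDK): wraps already-marshaled
 * items in PutRequest entries and groups them into BatchWriteItem requests
 * of at most $batchSize writes each.
 */
function buildBatchRequests(string $tableName, array $items, int $batchSize = 25): array
{
    $requests = [];
    foreach (array_chunk($items, $batchSize) as $chunk) {
        $writes = [];
        foreach ($chunk as $item) {
            $writes[] = ['PutRequest' => ['Item' => $item]];
        }
        $requests[] = ['RequestItems' => [$tableName => $writes]];
    }
    return $requests;
}

// 60 placeholder items in DynamoDB's attribute-value format.
$items = [];
for ($i = 1; $i <= 60; $i++) {
    $items[] = ['Id' => ['N' => (string) $i]];
}

$requests = buildBatchRequests('Music', $items);
foreach ($requests as $n => $request) {
    // With a real client: $result = $dynamodbClient->batchWriteItem($request);
    echo 'request ', $n + 1, ': ', count($request['RequestItems']['Music']), " writes\n";
}
```

For 60 items this yields three requests holding 25, 25, and 10 writes. Each request still needs its UnprocessedItems checked after it is sent.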

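    【Discussion】:

    • How do I handle the failed items from each batchWriteItem result?
    • You can retry them. Just add them all back into the requests array.
    • I tried 25 items per batch. I need to add 2000 items to one table. Each batch returns 75 capacity units. But only 270 items were added to my table, and the results also returned some unprocessed items. Is there any way to fix this?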
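Unprocessed items usually mean the table is being throttled, so retrying immediately tends to fail again. The following is a rough sketch of retrying UnprocessedItems with exponential backoff. The function name writeBatchWithRetry, the $send callable, the attempt limit, and the delays are all assumptions for illustration. With the real SDK, $send could be fn(array $p) => $dynamodbClient->batchWriteItem($p), since the SDK result can be read like an array.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: sends one batch (at most 25 write requests) and
 * resends whatever comes back under UnprocessedItems, waiting longer
 * before each new attempt. Returns the write requests that are still
 * unprocessed once $maxAttempts is reached (an empty array on success).
 */
function writeBatchWithRetry(
    callable $send,
    string $tableName,
    array $writeRequests,
    int $maxAttempts = 5,
    int $baseDelayMs = 100
): array {
    $pending = $writeRequests;
    for ($attempt = 0; $attempt < $maxAttempts && count($pending) > 0; $attempt++) {
        if ($attempt > 0) {
            // Exponential backoff: base, 2x base, 4x base, ...
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
        $result = $send(['RequestItems' => [$tableName => $pending]]);
        $pending = $result['UnprocessedItems'][$tableName] ?? [];
    }
    return $pending;
}

// Stand-in for the real client (assumption): accepts only 10 writes per call
// and reports the rest as unprocessed, the way a throttled table might.
$fakeSend = function (array $params): array {
    $left = array_slice($params['RequestItems']['Music'], 10);
    return ['UnprocessedItems' => $left === [] ? [] : ['Music' => $left]];
};

$batch = array_fill(0, 25, ['PutRequest' => ['Item' => ['Id' => ['N' => '1']]]]);
$leftover = writeBatchWithRetry($fakeSend, 'Music', $batch, 5, 1);
echo count($leftover), " write requests left unprocessed\n";
```

If the function returns a non-empty array, the caller should log or requeue those requests rather than drop them.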
    【Solution 2】:

    Here is how I do it with a Lambda function:

    exports.handler = (event, context, callback) => {
      console.log(`EVENT: ${JSON.stringify(event)}`);
    
      var AWS = require('aws-sdk');
    
      AWS.config.update({ region: process.env.REGION })
    
      var docClient = new AWS.DynamoDB.DocumentClient();
    
      const {data, table, cb} = JSON.parse(event.body);
    
      console.log('{data, table, cb}:', {data, table, cb});
    
      // Build the batches
      var batches = [];
      var current_batch = [];
      var item_count = 0;
    
      for (var i = 0; i < data.length; i++) {
        // Add the item to the current batch
        item_count++
        current_batch.push({
          PutRequest: {
            Item: data[i],
          },
        })
        // If we've added 25 items, add the current batch to the batches array
        // and reset it
        if (item_count % 25 === 0) {
          batches.push(current_batch)
          current_batch = []
        }
      }
    
      // Add the last, partial batch if it has any records
      // (full batches were already pushed and reset inside the loop)
      if (current_batch.length > 0) {
        batches.push(current_batch)
      }
    
      // Handler for the database operations
      var completed_requests = 0
      var errors = false
    
      function requestHandler (request) {
    
        console.log('in the handler: ', request)
    
        return function (err, data) {
          // Increment the completed requests
          completed_requests++;
    
          // Remember the first error we see
          errors = errors || err;
    
          // Log the error if we got one
          if (err) {
            console.error(JSON.stringify(err, null, 2));
            console.error("Request that caused database error:");
            console.error(JSON.stringify(request, null, 2));
          } else {
            // A non-empty data.UnprocessedItems means some writes were skipped and need a retry
            console.log(`success: returned ${JSON.stringify(data)}`);
          }
    
          // Respond exactly once, after all the requests have completed.
          // (cb comes from the JSON body, so it is never a function and must not be called.)
          if (completed_requests === batches.length) {
            if (errors) {
              callback(errors);
              return;
            }
            callback(null, {
              statusCode: 200,
              headers: {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Methods': 'GET,POST,OPTIONS',
                'Access-Control-Allow-Origin': '*',
                'Access-Control-Allow-Credentials': true
              },
              body: JSON.stringify({ batches: completed_requests }),
              isBase64Encoded: false
            });
          }
        }
      }
    
      // Make the requests
      var params;
      for (var j = 0; j < batches.length; j++) {
        // Items go in the params.RequestItems[table] array
        // Format for the items is {PutRequest: {Item: ITEM_OBJECT}}
        params = { RequestItems: { [table]: batches[j] } }
    
        console.log('before db.batchWrite: ', params)
    
        // Perform the batchWrite operation
        docClient.batchWrite(params, requestHandler(params))
      }
    };
    
    
    

    【Discussion】:

    • Love this approach @gildiny, just one question. I tried writing this to a DynamoDB Local image. Any idea how to set the database endpoint? I looked through the docs but couldn't find an API for doing this.
    • @RyanCarville This is how I use it in my React component: const apiName = "api_name"; const apiEndpoint = "/add-batch"; try { await API.post(apiName, apiEndpoint, {body: {data: <your_data>, table: 'table_name', cb: null}}); } catch (error) { console.log('ERROR: ', error) }
    • Thanks @gildiny. I ended up getting this working. var { DynamoDB } = require('aws-sdk'); var db = new DynamoDB.DocumentClient({ region: 'localhost', endpoint: 'http://localhost:8000', });
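For readers using the PHP SDK from the question rather than JavaScript, the client constructor also accepts an endpoint option. Below is a small sketch of the options array. The helper name dynamoDbClientConfig, the region value, and the dummy credentials are assumptions for illustration. The client construction itself is left as a comment so that the snippet runs without the SDK installed.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns constructor options for
 * Aws\DynamoDb\DynamoDbClient. When $localEndpoint is given, the client
 * is pointed at that URL instead of the regional AWS endpoint.
 */
function dynamoDbClientConfig(?string $localEndpoint = null): array
{
    $config = [
        'region'  => 'us-east-1', // placeholder region (assumption)
        'version' => 'latest',
    ];
    if ($localEndpoint !== null) {
        $config['endpoint'] = $localEndpoint;
        // Dummy values (assumption): a local instance does not validate them,
        // but the SDK needs some credentials to sign the request.
        $config['credentials'] = ['key' => 'local', 'secret' => 'local'];
    }
    return $config;
}

$config = dynamoDbClientConfig('http://localhost:8000');
print_r($config);

// With the SDK installed through Composer:
// require 'vendor/autoload.php';
// $dynamodbClient = new Aws\DynamoDb\DynamoDbClient($config);
```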
    【Solution 3】:

    I am using the following code to add data with batchWriteItem. Please suggest a better way if there is one.

    use Aws\DynamoDb\Exception\DynamoDbException;
    use Aws\DynamoDb\Marshaler;

    // Assumes $dynamodb (a DynamoDbClient) and $tableName are already set up
    $marshaler = new Marshaler();

    // Build the batches
    $albums = []; // collection of albums (arrays decoded from JSON)
    $batches = [];
    $current_batch = [];
    foreach ($albums as $album) {
        // Add the item to the current batch
        $json = json_encode($album);
        $current_batch[] = ['PutRequest' => ['Item' => $marshaler->marshalJson($json)]];
        // If we've added 25 items, add the current batch to the batches array
        // and reset it
        if (count($current_batch) == 25) {
            $batches[] = $current_batch;
            $current_batch = [];
        }
    }
    // Add the last batch if it has any records
    if (count($current_batch) > 0) {
        $batches[] = $current_batch;
    }
    // Send the batches
    $batch_count = 0;
    foreach ($batches as $batch) {
        try {
            $batch_count++;
            $params = array('RequestItems' => array($tableName => $batch), 'ReturnConsumedCapacity' => 'TOTAL', 'ReturnItemCollectionMetrics' => 'SIZE');
            $response = $dynamodb->batchWriteItem($params);
            // A non-empty $response['UnprocessedItems'] means some items were not written
            echo "Batch $batch_count sent." . "<br>";
            echo "<pre>";
            print_r($response);
            echo "</pre>";
        }
        catch (DynamoDbException $e) {
            echo "Unable to write batch $batch_count:\n";
            echo $e->getMessage() . "\n";
        }
    }
    

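    【Discussion】:

    • Don't use the two arrays $batches and $current_batch. Just use one array. And, as the array fills up (i.e. if count($current_batch) == 25), perform the write. After writing, you have to look at $response and remove all successfully written items from $current_batch, keeping any that failed. That way, you can keep retrying the ones that failed.
    • What I'm trying to say is: change your code from pre-populating a batch of batches to a single "fill and write" approach. Keep adding items to the batch, up to 25. Try to write them. Check the result and remove any successful items from the batch, leaving any that failed. Keep adding new items until you have added them all.
    • Also, if writing 25 items takes 75 capacity units, that means your items are fairly large. Make sure your table is provisioned with enough capacity (i.e. at least 75 WCU).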
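The "fill and write" approach described in these comments could look roughly like the sketch below. The class name BufferedBatchWriter, its methods, the $send callable, and the table name 'Music' are all hypothetical. It keeps a single buffer, writes when the buffer reaches 25 requests, and puts anything reported under UnprocessedItems back into the buffer. No delay between retries is included here, which a real implementation would likely want.

```php
<?php
declare(strict_types=1);

// Hypothetical class (not part of the AWS SDK): one buffer, filled and flushed.
final class BufferedBatchWriter
{
    private array $buffer = [];
    /** @var callable takes the request array, returns the batchWriteItem result */
    private $send;
    private string $tableName;
    private int $batchSize;

    public function __construct(callable $send, string $tableName, int $batchSize = 25)
    {
        $this->send = $send;
        $this->tableName = $tableName;
        $this->batchSize = $batchSize;
    }

    // Stage one marshaled item; write as soon as a full batch is available.
    public function put(array $item): void
    {
        $this->buffer[] = ['PutRequest' => ['Item' => $item]];
        if (count($this->buffer) >= $this->batchSize) {
            $this->flush();
        }
    }

    // Send up to one batch; failed requests return to the front of the buffer.
    public function flush(): void
    {
        if ($this->buffer === []) {
            return;
        }
        $batch = array_splice($this->buffer, 0, $this->batchSize);
        $result = ($this->send)(['RequestItems' => [$this->tableName => $batch]]);
        $failed = $result['UnprocessedItems'][$this->tableName] ?? [];
        $this->buffer = array_merge($failed, $this->buffer);
    }

    // Drain the buffer, giving up after $maxRounds; returns what is still unwritten.
    public function finish(int $maxRounds = 10): array
    {
        for ($round = 0; $round < $maxRounds && $this->buffer !== []; $round++) {
            $this->flush();
        }
        return $this->buffer;
    }
}

// Stand-in for the real client (assumption): accepts 20 writes per call.
$written = 0;
$fakeSend = function (array $params) use (&$written): array {
    $batch = $params['RequestItems']['Music'];
    $written += count(array_slice($batch, 0, 20));
    $rejected = array_slice($batch, 20);
    return ['UnprocessedItems' => $rejected === [] ? [] : ['Music' => $rejected]];
};

$writer = new BufferedBatchWriter($fakeSend, 'Music');
for ($i = 1; $i <= 60; $i++) {
    $writer->put(['Id' => ['N' => (string) $i]]);
}
$leftover = $writer->finish();
echo "written: $written, left over: ", count($leftover), "\n";
```

With the real SDK, the callable could be fn(array $p) => $dynamodbClient->batchWriteItem($p), again assuming the result is read with array access.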