DynamoDB get_item to read 400 KB of data in milliseconds

【Question title】: DynamoDB get_item to read 400kb data in milliseconds
【Posted】: 2021-03-01 20:43:18
【Question description】:

I have a DynamoDB table named events that stores all user event details, such as product_view, add_to_cart, and product_purchase.

In this events table, some of my items have reached 400 KB in size.

Problem:

        response = self._table.get_item(
            Key={
                PARTITION_KEY: <pk>,
                SORT_KEY: <sk>,
            },
            ConsistentRead=False,
        )

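When I access a 400 KB item with the DynamoDB get_item method, it takes about 5 seconds to return the result.

I have already used DAX.

Goal

I want to read a 400 KB item in under 1 second.

Important information:

The data is stored in DynamoDB in this format: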

{
 "partition_key": "user_id1111",
 "sort_key": "version_1",
 "attributes": {
  "events": [
   {
    "t": "1614712316",  
    "a": "product_view",   
    "i": "1275"
   },
   {
    "t": "1614712316",  
    "a": "product_add",   
    "i": "1275"
   },
   {
    "t": "1614712316",  
    "a": "product_purchase",   
    "i": "1275"
   },
    ...

  ]
 }
}

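t is a timestamp
a can be product_view, product_add, or product_purchase
i is the product_id

As you can see in the item above, events is a list, and new events get appended to it.

I have a 400 KB item whose size comes from the number of events in its events list.

I wrote a script to measure the time; the script and its result are shown below.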

import boto3
import datetime

dynamodb = boto3.resource('dynamodb')

table = dynamodb.Table('events')

pk = "user_id1111"
sk = "version_1"


t_load_start = datetime.datetime.now()


response = table.get_item(
    Key={
        "partition_key": pk,
        "sort_key": sk,
    },
    ReturnConsumedCapacity="TOTAL"
)
capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

t_load_end = datetime.datetime.now()
seconds = (t_load_end - t_load_start).total_seconds()

print(f"Elapsed time is::{seconds}sec and {capacity_units} capacity units")

This is the output I get.

Elapsed time is::5.676799sec and 50.0 capacity units

Can anyone suggest a solution for this?

【Question comments】:

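You won't want to hear this, but I would start by rethinking the data model. Technically, 5 seconds is 5000 milliseconds, so can you be more precise about the performance requirement? ;-)
@maurice, I want to get the item within 1 second.
So, is your local machine an EC2 instance in us-east-1? Or is it the ddb table that is in us-east-1?
5 seconds is not a reasonable measurement regardless of item size or geography; you are most likely being throttled. In the AWS console you can open the table's "Metrics" tab and look at the "Throttled read requests/events" graphs, or check if boto3 is retrying.
Is the Lambda function in the same region as the DynamoDB table? Is the Lambda running in a VPC, and if so, do you have any unusual network routing?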
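
One way to follow the "check if boto3 is retrying" suggestion is to read the retry count that boto3 attaches to every response. The sketch below assumes only the standard `ResponseMetadata` / `RetryAttempts` fields of a boto3 response; the helper name `retry_attempts` is made up for illustration.

```python
def retry_attempts(response: dict) -> int:
    """Return how many times botocore retried the request behind `response`.

    boto3 responses carry a ResponseMetadata dict; its RetryAttempts entry
    counts the retries, so 0 means the first attempt succeeded.
    """
    return response.get("ResponseMetadata", {}).get("RetryAttempts", 0)


# Hypothetical usage with the table from the question:
#   response = table.get_item(Key={"partition_key": pk, "sort_key": sk})
#   print(f"retries: {retry_attempts(response)}")
```

A non-zero value would suggest that part of the elapsed time is spent in backoff between attempts (for example after throttling) rather than in a single slow read.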

Tags: amazon-web-services amazon-dynamodb

【Solution 1】:

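tl;dr: increase the function's memory to at least 1024 MB, see Update 2.

I was curious, so I did some measurements. I wrote a script that creates a big boi item of almost exactly 400 KB in a new table.

Then I tested two reads from Python, one with the resource API and one with the lower-level client, both eventually consistent.

This is what I measured: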

Reading Big Boi from a Table Resource took 0.366508s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.301585s and consumed 50.0 RCUs

Extrapolating from the RCUs, the item it read was about 50 * 2 * 4 KB = 400 KB in size (an eventually consistent read consumes 0.5 RCU per 4 KB).
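
The same arithmetic can be run in the other direction, from item size to expected RCUs. This is a minimal sketch of the standard DynamoDB read-capacity rule (size rounded up to 4 KB blocks, half price for eventually consistent reads); the function name `read_capacity_units` is an illustrative choice, not part of boto3.

```python
def read_capacity_units(item_size_bytes: int, consistent: bool = False) -> float:
    """Estimate the RCUs consumed by reading one item of the given size."""
    # Item size is rounded up to the next 4 KB (4096-byte) block.
    blocks = (item_size_bytes + 4095) // 4096
    # A strongly consistent read costs 1 RCU per block,
    # an eventually consistent read costs 0.5 RCU per block.
    return blocks * (1.0 if consistent else 0.5)


print(read_capacity_units(400 * 1024))                   # 50.0
print(read_capacity_units(400 * 1024, consistent=True))  # 100.0
```

The 50.0 for the eventually consistent case matches the ConsumedCapacity reported in the measurements.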

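I ran this a few times from my local machine in Germany against eu-central-1 (Frankfurt, Germany), and the highest latency I saw was about 900 ms. (This is without DAX.)

So I think you should show us how you are measuring.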

from datetime import datetime

import boto3

TABLE_NAME = "big-boi-test"
BIG_BOI_PK = "f0ba8d6c"

TABLE_RESOURCE = boto3.resource("dynamodb").Table(TABLE_NAME)
DDB_CLIENT = boto3.client("dynamodb")

def create_table():
    DDB_CLIENT.create_table(
        AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
        TableName=TABLE_NAME,
        KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST"
    )

def create_big_boi_item() -> dict:
    # based on calculations here: https://zaccharles.github.io/dynamodb-calculator/
    template = {
        "PK": {
            "S": BIG_BOI_PK
        },
        "bigBoi": {
            "S": ""
        }
    } # This is 16 bytes

    big_boi = "X" * (1024 * 400 - 16)
    template["bigBoi"]["S"] = big_boi
    return template

def store_big_boi():
    big_boi = create_big_boi_item()

    DDB_CLIENT.put_item(
        Item=big_boi,
        TableName=TABLE_NAME
    )

def get_big_boi_with_table_resource():

    start = datetime.now()
    response = TABLE_RESOURCE.get_item(
        Key={"PK": BIG_BOI_PK},
        ReturnConsumedCapacity="TOTAL"
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Table Resource took {seconds}s and consumed {capacity_units} RCUs")

def get_big_boi_with_client():

    start = datetime.now()
    response = DDB_CLIENT.get_item(
        Key={"PK": {"S": BIG_BOI_PK}},
        ReturnConsumedCapacity="TOTAL",
        TableName=TABLE_NAME
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Client took {seconds}s and consumed {capacity_units} RCUs")

if __name__ == "__main__":
    # create_table()
    # store_big_boi()
    get_big_boi_with_table_resource()
    get_big_boi_with_client()

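Update

I ran the same measurement again on an item that looks more like the one you are using, and my average time is still below 1000 ms regardless of which way I request it: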

Reading Big Boi from a Table Resource took 1.492829s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.871583s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.857513s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.769432s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.690172s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.670099s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.633489s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.605999s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.598635s and consumed 50.0 RCUs
Reading Big Boi from a Table Resource took 0.606553s and consumed 50.0 RCUs
Reading Big Boi from a Client took 1.66636s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.921605s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.831735s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.707082s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.668602s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.648401s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.5695s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.592073s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.611436s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.553827s and consumed 50.0 RCUs
Average latency over 10 requests with the table resource: 0.7796304s
Average latency over 10 requests with the client: 0.7770621s

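This is what the item looks like: a map that holds a list of nested event maps, as built by create_big_boi_item below.

Here is the full test script so you can verify: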

import statistics
from datetime import datetime

import boto3

TABLE_NAME = "big-boi-test"
BIG_BOI_PK = "NestedBoi"

TABLE_RESOURCE = boto3.resource("dynamodb").Table(TABLE_NAME)
DDB_CLIENT = boto3.client("dynamodb")

def create_table():
    DDB_CLIENT.create_table(
        AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
        TableName=TABLE_NAME,
        KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST"
    )

def create_big_boi_item() -> dict:
    # based on calculations here: https://zaccharles.github.io/dynamodb-calculator/
    template = {
        "PK": {
            "S": "NestedBoi"
        },
        "bigBoiContainer": {
            "M": {
            "bigBoiList": {
                "L": [
                
                ]
            }
            }
        }
    } # 43 bytes

    item = {
        "M": {
        "t": {
            "S": "1614712316"
        },
        "a": {
            "S": "product_view"
        },
        "i": {
            "S": "1275"
        }
        }
    }  # 36 bytes

    number_of_items = int((1024 * 400 - 43) / 36)

    for _ in range(number_of_items):
        template["bigBoiContainer"]["M"]["bigBoiList"]["L"].append(item)

    return template

def store_big_boi():
    big_boi = create_big_boi_item()

    DDB_CLIENT.put_item(
        Item=big_boi,
        TableName=TABLE_NAME
    )

def get_big_boi_with_table_resource():

    start = datetime.now()
    response = TABLE_RESOURCE.get_item(
        Key={"PK": BIG_BOI_PK},
        ReturnConsumedCapacity="TOTAL"
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Table Resource took {seconds}s and consumed {capacity_units} RCUs")

    return seconds

def get_big_boi_with_client():

    start = datetime.now()
    response = DDB_CLIENT.get_item(
        Key={"PK": {"S": BIG_BOI_PK}},
        ReturnConsumedCapacity="TOTAL",
        TableName=TABLE_NAME
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Client took {seconds}s and consumed {capacity_units} RCUs")

    return seconds

if __name__ == "__main__":
    # create_table()
    # store_big_boi()

    n_experiments = 10
    experiments_with_table_resource = [get_big_boi_with_table_resource() for i in range(n_experiments)]
    experiments_with_client = [get_big_boi_with_client() for i in range(n_experiments)]
    print(f"Average latency over {n_experiments} requests with the table resource: {statistics.mean(experiments_with_table_resource)}s")
    print(f"Average latency over {n_experiments} requests with the client: {statistics.mean(experiments_with_client)}s")

If I increase n_experiments, it tends to get faster, probably because of caching inside DDB.

Still: cannot reproduce.
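
Because the first request in each run above is noticeably slower than the rest, it can help to report it separately from the warm requests. The helper below is a sketch, not part of the original test script; `time_calls` is a hypothetical name, and it uses `time.perf_counter` instead of `datetime.now()` because it is a monotonic clock intended for measuring intervals.

```python
import statistics
import time
from typing import Callable


def time_calls(fn: Callable[[], object], n: int) -> dict:
    """Call fn n times and summarize latency, keeping the first call separate."""
    if n < 2:
        raise ValueError("n must be at least 2")
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    warm = samples[1:]  # everything after the first (cold) request
    return {
        "first": samples[0],
        "warm_mean": statistics.mean(warm),
        "warm_median": statistics.median(warm),
    }


# Hypothetical usage with the resource from the script above:
#   stats = time_calls(lambda: TABLE_RESOURCE.get_item(Key={"PK": BIG_BOI_PK}), 10)
#   print(stats)
```

The median is included because a single slow outlier can pull the mean up considerably with only ten samples.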

Update 2

After learning that you are running this in a Lambda function, I ran the tests again inside Lambda with different memory configurations.

Memory	n_experiments	average time with resource	average time with client
128MB	10	6.28s	5.06s
256MB	10	3.26s	2.61s
512MB	10	1.62s	1.33s
1024MB	10	0.84s	0.68s
2048MB	10	0.52s	0.43s
4096MB	10	0.51s	0.41s

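As mentioned in the comments, CPU and network performance scale with the amount of memory allocated to the function. You can solve your problem by throwing money at it :-)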
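
For reference, the memory setting can be changed from Python as well as from the console. This sketch assumes a boto3 Lambda client is passed in and uses its `update_function_configuration` call; the wrapper name `set_lambda_memory` and the function name in the usage comment are placeholders.

```python
def set_lambda_memory(lambda_client, function_name: str, memory_mb: int) -> int:
    """Set a Lambda function's memory size and return the value Lambda reports back."""
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        MemorySize=memory_mb,  # in MB; CPU and network share scale with this
    )
    return response["MemorySize"]


# Hypothetical usage (requires AWS credentials and an existing function):
#   import boto3
#   set_lambda_memory(boto3.client("lambda"), "my-events-function", 1024)
```

Going by the table above, 1024 MB is the first setting where both averages drop below one second, and the gains flatten out after 2048 MB.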

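【Comments】:

I just added the script I used to measure latency, and a snapshot of the DynamoDB item, to the question. Please check it.
My DynamoDB table is in us-east-1 (N. Virginia).
I updated my answer to test an item that more closely resembles your nested data structure. Still can't reproduce. Have you checked the throttling metrics?
Great stuff. I think the lesson here is that Lambda performance scales with memory. Even when the memory seems sufficient, adjusting that value can improve performance significantly.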

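【Solution 2】:

It sounds like you have a couple of issues. The first is that you are running into the 400 KB item size limit. Although you didn't say this is a problem, it may be worth revisiting your data model so that you can store more event data.

The performance problem is unlikely to be related to your data model. A get_item operation should have single-digit millisecond average latency, especially since you specified an eventually consistent read. Something else is going on here.

How are you testing and measuring the performance of this operation?

The AWS documentation has some advice about troubleshooting high latency DynamoDB operations that may be useful.

【Comments】: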

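I basically have a 400 KB problem in my DynamoDB table. So I access the item, calculate its size, and then truncate the 400 KB item to a 200 KB item. To do this, I first fetch the item with get_item, but get_item takes about 5 seconds to fetch it.
I am using Python's datetime module to calculate the elapsed time.
Are you logging to CloudWatch? I wonder whether the actual request to DDB takes 5 seconds, or whether something else is eating up all the time. Look at the DynamoDB metrics in CloudWatch to see whether it is really DDB that takes this long: docs.aws.amazon.com/amazondynamodb/latest/developerguide/…
I just added a snapshot of my DynamoDB item to the question, please check it.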
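
The CloudWatch check suggested in the comments can also be scripted. The sketch below assumes a boto3 CloudWatch client is passed in and queries the `SuccessfulRequestLatency` metric (reported in milliseconds) that DynamoDB publishes per table and operation; the function name `get_item_latency_ms` and the one-hour default window are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def get_item_latency_ms(cloudwatch_client, table_name: str, minutes: int = 60) -> list:
    """Fetch server-side GetItem latency datapoints for a table, oldest first."""
    end = datetime.now(timezone.utc)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/DynamoDB",
        MetricName="SuccessfulRequestLatency",
        Dimensions=[
            {"Name": "TableName", "Value": table_name},
            {"Name": "Operation", "Value": "GetItem"},
        ],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=60,  # one datapoint per minute
        Statistics=["Average", "Maximum"],
    )
    # CloudWatch does not guarantee datapoint order, so sort by time.
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])


# Hypothetical usage (requires AWS credentials):
#   import boto3
#   for point in get_item_latency_ms(boto3.client("cloudwatch"), "events"):
#       print(point["Timestamp"], point["Average"], point["Maximum"])
```

If these server-side numbers are in the low milliseconds while the client measures seconds, the time is being spent outside DynamoDB, for example in the network or in deserializing the response on the client.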