[Question title]: Python 2.7 vs 3.8 Lambda for sending logfiles from S3 to Elasticsearch
[Posted]: 2020-08-10 08:14:25
[Question]:

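First of all, I am new to Python and have little experience writing code. I have JSON-encoded log files stored in S3 and have built a Lambda function (based on the AWS sample.py) that parses some of those logs and sends them to Elasticsearch. With the Lambda runtime set to Python 2.7, everything works. The code is as follows: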

import boto3
import re
import requests
from requests_aws4auth import AWS4Auth

region = 'us-west-1'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

host = 'https://search-siem-hds-sec-zsn57erua5fu5gdkdgnxhj5rsi.us-west-1.es.amazonaws.com'
index = 'index1'
type = 'lambda-type'
url = host + '/' + index + '/' + type

headers = { "Content-Type": "application/json" }

s3 = boto3.client('s3')

time_pattern = re.compile('(202\d-\d\d-\d\dT\d\d:\d\d:\d\d\.\d\d\dZ)')
message_pattern = re.compile('(.*)')

def lambda_handler(event, context):
    for record in event['Records']:

        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        obj = s3.get_object(Bucket=bucket, Key=key)
        body = obj['Body'].read()
        lines = body.splitlines()

        for line in lines:
            timestamp = time_pattern.search(line).group(1)
            message = message_pattern.search(line).group(1)

            document = { "timestamp": timestamp, "message": message }
            r = requests.post(url, auth=awsauth, json=document, headers=headers)

With the runtime set to Python 3.8, the Lambda fails with the following message:

[ERROR] TypeError: cannot use a string pattern on a bytes-like object

After some reading, I added a "b" prefix to the following two lines to try to fix this:

######################################################
time_pattern = re.compile(b'(202\d-\d\d-\d\dT\d\d:\d\d:\d\d\.\d\d\dZ)')
message_pattern = re.compile(b'(.*)')
######################################################

However, this led to the following error:

[ERROR] TypeError: Object of type bytes is not JSON serializable
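
Both errors can be reproduced locally without AWS. The sketch below is illustrative only: the sample log line is made up (the question does not show real log contents), and `json.dumps` stands in for the JSON encoding that `requests.post(..., json=document)` performs internally. The helper names `first_error` and `second_error` are hypothetical.

```python
import json
import re

# Hypothetical sample line; S3's Body.read() returns bytes like this on Python 3.
raw = b'2020-08-10T08:14:25.123Z {"event": "login"}'

str_pattern = re.compile(r'(202\d-\d\d-\d\dT\d\d:\d\d:\d\d\.\d\d\dZ)')
bytes_pattern = re.compile(rb'(202\d-\d\d-\d\dT\d\d:\d\d:\d\d\.\d\d\dZ)')


def first_error():
    """A str pattern applied to bytes input fails inside the re module."""
    try:
        str_pattern.search(raw)
    except TypeError as exc:
        return str(exc)
    return None


def second_error():
    """A bytes pattern matches, but the captured group is bytes, which JSON encoding rejects."""
    timestamp = bytes_pattern.search(raw).group(1)
    try:
        json.dumps({"timestamp": timestamp})
    except TypeError as exc:
        return str(exc)
    return None


print(first_error())
print(second_error())
```

Adding the "b" prefix only moves the problem: the regex then accepts the input, but the values it captures are still bytes when the document is serialized.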

Could any Python expert help, or point me toward how to get this working on Python 3.8?

Many thanks, Serum

[Comments]:

    Tags: python amazon-web-services elasticsearch aws-lambda


    [Solution 1]:

    If the file you are reading is not binary, and I assume it is not since you are matching its contents against text patterns, then change:

    body = obj['Body'].read()
    

    to this:

    body = obj['Body'].read().decode('utf-8')
    

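    In Python 3, read() returns bytes, while the regex patterns and the JSON document both need str. Decoding once, right after reading, lets the original str patterns work unchanged.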
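
As a rough sketch of how the decoded body flows through the rest of the handler, the parsing step can be pulled into a function that runs without AWS. The function name `build_documents` and the sample bytes are hypothetical, and skipping lines that lack a timestamp is an assumption added here (the code in the question would raise on such a line instead).

```python
import json
import re

# Same patterns as in the question, written as raw str literals.
time_pattern = re.compile(r'(202\d-\d\d-\d\dT\d\d:\d\d:\d\d\.\d\d\dZ)')
message_pattern = re.compile(r'(.*)')


def build_documents(body_bytes):
    """Decode raw object bytes and build one document per log line."""
    text = body_bytes.decode('utf-8')  # bytes -> str, as the answer suggests
    documents = []
    for line in text.splitlines():
        time_match = time_pattern.search(line)
        if time_match is None:
            continue  # assumption: ignore lines without a timestamp
        message = message_pattern.search(line).group(1)
        documents.append({"timestamp": time_match.group(1), "message": message})
    return documents


# Hypothetical stand-in for obj['Body'].read()
sample = (b'2020-08-10T08:14:25.123Z first\n'
          b'no timestamp here\n'
          b'2020-08-10T08:14:26.456Z second\n')
docs = build_documents(sample)

# Every value is str, so JSON encoding succeeds.
payloads = [json.dumps(doc) for doc in docs]
```

In the Lambda itself, each document would then be posted with the existing `requests.post(url, auth=awsauth, json=document, headers=headers)` call.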

    [Comments]:
