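[Posted]: 2021-05-16 22:58:18
[Problem description]:
I am trying to create a Cloud Function that triggers an export from a BigQuery table to Google Cloud Storage. I read the link below and did the same as described there.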
How to schedule an export from a BigQuery table to Cloud Storage?
Please see below what I have in main.py:
    # Imports the BigQuery client library
    from google.cloud import bigquery

    def extract_data(request):
        # Replace these values according to your project
        project_name = "whr-asia-datalake-nonprod"
        bucket_name = "whr-asia-datalake-dev-standard/outbound/Adobe/"
        dataset_name = "WHR_DATALAKE"
        table_name = "CONSUMER_EXTRACT"
        destination_uri = "gs://{}/{}".format(bucket_name, "Consumer_master.csv")

        bq_client = bigquery.Client(project=project_name)
        dataset = bq_client.dataset(dataset_name, project=project_name)
        table_to_export = dataset.table(table_name)

        job_config = bigquery.job.ExtractJobConfig()
        extract_job = bq_client.extract_table(
            table_to_export,
            destination_uri,
            # Location must match that of the source table.
            location="asia-south1",
            job_config=job_config,
        )
        return "Job with ID {} started exporting data from {}.{} to {}".format(extract_job.job_id, dataset_name, table_name, destination_uri)
In requirements.txt I have the following:
google-cloud-bigquery
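My Cloud Function will be triggered from a Pub/Sub topic, which emits an event once the BigQuery table I need to extract data from has been loaded.
However, the function fails to deploy and I get the error below, which I cannot make sense of. Could someone please help me resolve this? I am not sure what is going wrong.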
2021-02-13T11:23:38.518815Z Cloud Functions UpdateFunction asia-south1:xxxx@xxxx.com {@type: type.googleapis.com/google.cloud.audit.AuditLog, authenticationInfo: {…}, methodName: google.cloud.functions.v1.CloudFunctionsService.UpdateFunction, resourceName: projects/whr-asia-datalake-nonprod/locations/asia-south1/functions/Consumer_Extract_Bigquery, serviceName: cloudfunctions.google…
I was able to resolve this by putting the following in main.py:
    # Imports the BigQuery client library
    from google.cloud import bigquery

    def extract_data(event, context):
        # Replace these values according to your project
        project_name = "whr-asia-datalake-nonprod"
        bucket_name = "whr-asia-datalake-dev-standard/outbound/Adobe/"
        dataset_name = "WHR_DATALAKE"
        table_name = "CONSUMER_EXTRACT"
        destination_uri = "gs://{}/{}".format(bucket_name, "Consumer_master.csv")

        bq_client = bigquery.Client(project=project_name)
        dataset = bq_client.dataset(dataset_name, project=project_name)
        table_to_export = dataset.table(table_name)

        job_config = bigquery.job.ExtractJobConfig()
        extract_job = bq_client.extract_table(
            table_to_export,
            destination_uri,
            # Location must match that of the source table.
            location="asia-south1",
            job_config=job_config,
        )
        return "Job with ID {} started exporting data from {}.{} to {}".format(extract_job.job_id, dataset_name, table_name, destination_uri)
and adding the following to requirements.txt:
google-cloud-bigquery==1
google-cloud-bigquery-datatransfer==1
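The change that matters in main.py is the entry point's signature: a function triggered by a Pub/Sub topic is invoked with `(event, context)` rather than a single HTTP `request`. A minimal sketch of that signature on its own, assuming the Python background-function convention in which the message body arrives base64-encoded under `event["data"]`; the log message and the return value are illustrative only and not part of the original code:

```python
import base64


def extract_data(event, context):
    """Entry point for a Pub/Sub-triggered function (sketch only)."""
    # The Pub/Sub message body, if any, is base64-encoded under "data".
    payload = ""
    if event and event.get("data"):
        payload = base64.b64decode(event["data"]).decode("utf-8")

    # Illustrative log line; the real function would start the export here.
    print("Triggered by Pub/Sub message: {!r}".format(payload))
    return payload
```

The export logic itself does not need the message contents, so the payload can simply be ignored if the topic is used only as a signal.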
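[Discussion]:
-
Hi @Maxim, could you help me? I am unable to resolve this. Your input would be a great help, since I did this based on your answer to the other question. Am I doing something wrong?
-
Thanks @Chris32 for editing it. I did not realize I had included account details.
-
The error is related to def extract_data(request): the trigger is not HTTP but a Pub/Sub topic. What should I use in its place?
-
Hi @Maxim, I am trying to create the file pipe-delimited with a header. I can create the file now, but the requirement is a pipe delimiter and a header row.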
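For the pipe-delimited file with a header asked about in the last comment, one option is to set the delimiter and header options on the extract job configuration. A minimal sketch, assuming the `destination_format`, `field_delimiter`, and `print_header` properties of `bigquery.ExtractJobConfig`; `EXPORT_SETTINGS` and `build_extract_config` are hypothetical names introduced here, and the import sits inside the function only so the settings can be inspected without the library installed:

```python
# Hypothetical settings for a pipe-delimited CSV export with a header row.
EXPORT_SETTINGS = {
    "destination_format": "CSV",
    "field_delimiter": "|",
    "print_header": True,
}


def build_extract_config():
    """Return an ExtractJobConfig with the settings above applied."""
    # Imported lazily so this module loads even without the library.
    from google.cloud import bigquery

    job_config = bigquery.ExtractJobConfig()
    for name, value in EXPORT_SETTINGS.items():
        # Each key is assumed to be a settable ExtractJobConfig property.
        setattr(job_config, name, value)
    return job_config
```

The returned object could then be passed as `job_config=` to the `extract_table` call in place of the empty `ExtractJobConfig()`.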
Tags: python google-bigquery google-cloud-functions