【Posted】: 2021-04-10 00:15:12
【Question】:
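I have a partitioned table in which one column is of type DateTime, and the table is partitioned on that same column. According to the spark-bigquery documentation, the corresponding Spark SQL type is String. https://github.com/GoogleCloudDataproc/spark-bigquery-connector
I tried to follow that mapping and write the column as a string, but I ran into a data type mismatch.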
Code snippet:
ZonedDateTime nowPST = ZonedDateTime.ofInstant(Instant.now(), TimeZone.getTimeZone("PST").toZoneId());
df = df.withColumn("createdDate", lit(nowPST.toLocalDateTime().toString()));
Error:
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Failed to load to <PROJECT_ID>:<DATASET_NAME>.<TABLE_NAME> in job JobId{project=<PROJECT_ID>, job=<JOB_ID>, location=US}. BigQuery error was Provided Schema does not match Table <PROJECT_ID>:<DATASET_NAME>.<TABLE_NAME>. Field createdDate has changed type from DATETIME to STRING
at com.google.cloud.spark.bigquery.BigQueryWriteHelper.loadDataToBigQuery(BigQueryWriteHelper.scala:156)
at com.google.cloud.spark.bigquery.BigQueryWriteHelper.writeDataFrameToBigQuery(BigQueryWriteHelper.scala:89)
... 36 more
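To make the mismatch concrete, here is a minimal sketch in plain Java (no Spark involved) of the value the snippet above passes to lit(). The class name CreatedDateValue, the helper createdDateLiteral, the fixed instant, and the zone ID America/Los_Angeles are assumptions chosen for illustration; they are not part of the original code. The result is an ordinary java.lang.String in ISO-8601 local date-time form, which is presumably why the load job reports the field as STRING rather than DATETIME.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class CreatedDateValue {

    // Hypothetical helper: same conversion chain as the snippet above,
    // but with the instant and zone passed in so the result is reproducible.
    static String createdDateLiteral(Instant instant, ZoneId zone) {
        return ZonedDateTime.ofInstant(instant, zone)
                .toLocalDateTime()
                .toString();
    }

    public static void main(String[] args) {
        // Assumed inputs for illustration only.
        Instant fixed = Instant.parse("2021-04-10T07:15:12Z");
        ZoneId zone = ZoneId.of("America/Los_Angeles");

        // The value handed to lit(...) is a plain String, not a date-time object.
        String literal = createdDateLiteral(fixed, zone);
        System.out.println(literal); // prints 2021-04-10T00:15:12
    }
}
```

Because the literal is a String, the resulting createdDate column is a Spark string column, which matches the "changed type from DATETIME to STRING" message in the error.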
【Discussion】:
Tags: apache-spark google-bigquery