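【Posted】: 2021-07-22 19:22:58
【Question】:
I am using pandas `to_gbq` to append a DataFrame to a BigQuery table, as I have done successfully in the past. (I explicitly declare only one field in the schema so that it is recognized as a date; otherwise it is forced to a string.)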
schema = [{'name': 'Week', 'type': 'DATE'}]

def load_to_BQ():
    dataframe.to_gbq(destination_table='Table.my_table',
                     project_id='myprojectid',
                     table_schema=schema,
                     if_exists='append')
Running it produces the following error:
InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.
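I'm confused, because I have previously uploaded and appended DataFrames to this same BQ table with this code. I checked the schema against the DataFrame columns; they all match and are in the right order. I suspect the culprit is the date field named 'Week' in the DataFrame, but 'Week' is listed as DATE in BQ as well. I converted that field to a datetime using the following: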
dataframe['Week'] = pd.to_datetime(dataframe['Week'], format='%m-%d-%y').dt.date
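When I check the schema types with schema.generate_bq_schema(dataframe), the 'Week' field comes back as TIMESTAMP. I have seen suggestions to use 'TIMESTAMP' instead of 'DATE' for BQ, but when I change it in the schema I get the same error. Can anyone point out what I'm doing wrong? Here is the full error message: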
InvalidSchema Traceback (most recent call last)
<ipython-input-117-fb947996ea53> in <module>
30 answer = input("Are you sure you want to load to BigQuery? (y/n)")
31 if answer == "y":
---> 32 load_to_BQ()
33 else:
34 print("Load failed.")
<ipython-input-117-fb947996ea53> in load_to_BQ()
12 # dataframe, table_id, job_config=job_config
13 # )
---> 14 dataframe.to_gbq(destination_table='table.my_table',
15 project_id='myprojectid',
16 table_schema=schema,
~\anaconda3\lib\site-packages\pandas\core\frame.py in to_gbq(self, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
1708 from pandas.io import gbq
1709
-> 1710 gbq.to_gbq(
1711 self,
1712 destination_table,
~\anaconda3\lib\site-packages\pandas\io\gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
209 ) -> None:
210 pandas_gbq = _try_import()
--> 211 pandas_gbq.to_gbq(
212 dataframe,
213 destination_table,
~\anaconda3\lib\site-packages\pandas_gbq\gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials, verbose, private_key)
1074 original_schema, table_schema
1075 ):
-> 1076 raise InvalidSchema(
1077 "Please verify that the structure and "
1078 "data types in the DataFrame match the "
InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.
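For reference, here is a minimal sketch of what the 'Week' conversion above produces on the pandas side, with and without `.dt.date`. The sample values are made up for illustration, and this only inspects pandas dtypes; it does not show how pandas-gbq maps them to BigQuery types, so it may or may not be related to the error.

```python
import datetime

import pandas as pd

# Hypothetical sample data in the same '%m-%d-%y' format used above.
sample = pd.DataFrame({"Week": ["07-05-21", "07-12-21"]})

# Without .dt.date the column keeps a pandas datetime64 dtype.
as_datetime = pd.to_datetime(sample["Week"], format="%m-%d-%y")

# With .dt.date the column becomes object dtype holding datetime.date values.
as_date = as_datetime.dt.date

print(as_datetime.dtype)
print(as_date.dtype)
print(type(as_date.iloc[0]))
```

Printing `dataframe.dtypes` on the real DataFrame just before the `to_gbq` call would show which of the two forms the 'Week' column actually has at load time.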
【Discussion】:
Tags: python pandas dataframe google-bigquery