【Question title】: Unable to determine type for field 'T_DATE'. warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
【Posted】: 2021-10-26 12:09:31
【Question】:

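I want to convert BigQuery query results into a DataFrame using the code below. The same code works fine in a Google JupyterLab notebook, but raises an error on my local machine.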

    from google.cloud import bigquery
    bq_client = bigquery.Client(project=project_id, location=bq_location)
    query_job = bq_client.query(sql, project=project_id)
    result = query_job.result()
    schema = result.schema
    df = result.to_dataframe()

The schema, followed by the warning and traceback, looks like this:

[SchemaField('T_DATE', 'DATE', 'NULLABLE')]
C:Python37\lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:244: UserWarning: Unable to determine type for field 'T_DATE'.
  warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
Traceback (most recent call last):
df = result.to_dataframe()
  File "Python\Python37\lib\site-packages\google\cloud\bigquery\table.py", line 1941, in to_dataframe
    create_bqstorage_client=create_bqstorage_client,
  File "Python\Python37\lib\site-packages\google\cloud\bigquery\table.py", line 1733, in to_arrow
    bqstorage_client=bqstorage_client
  File "Python\Python37\lib\site-packages\google\cloud\bigquery\table.py", line 1630, in _to_page_iterable
    yield from result_pages
  File "Python\Python37\lib\site-packages\google\cloud\bigquery\_pandas_helpers.py", line 628, in download_arrow_row_iterator
    yield _row_iterator_page_to_arrow(page, column_names, arrow_types)
  File "Python\Python37\lib\site-packages\google\cloud\bigquery\_pandas_helpers.py", line 601, in _row_iterator_page_to_arrow
    arrays.append(pyarrow.array(page._columns[column_index], type=arrow_type))
AttributeError: 'NoneType' object has no attribute 'array' 

I am using pandas 1.3.4 and google-cloud-bigquery 2.28.1.

【Question comments】:

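  • It looks like result may be empty or may not contain any values. Can you check whether you can print result, or whether the BQ job history shows any errors?
  • @Prany Row((datetime.date(2015, 5, 12)...) I can see data in the result and the row is not null. The field is also defined as NULLABLE in the schema.
  • I haven't resolved this. For now I'm working around it by converting the RowIterator to a list and then building the DataFrame with pd.DataFrame(). result = query_job.result() schema = result.schema df = result.to_dataframe() header = [] for row in schema: header.append(row.name) ls = [] for row in result: temp_list = [] for data in row: temp_list.append(data) ls.append(temp_list) df = pd.DataFrame(ls, columns=header)
  • Can you paste some sample data so I can try to reproduce this?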
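
The workaround described in the comments can be sketched roughly as follows. This is a minimal illustration, not the commenter's exact code: `rows_to_table` is a hypothetical helper name, and the sample row is a plain tuple standing in for a BigQuery `Row` (the date value is taken from the comment above). It assumes each row can be iterated positionally, as the commenter's loop does.

```python
import datetime
from typing import Any, Iterable, List, Sequence, Tuple


def rows_to_table(
    rows: Iterable[Sequence[Any]], field_names: Iterable[str]
) -> Tuple[List[str], List[List[Any]]]:
    """Collect column names and row values into plain Python lists.

    Hypothetical helper: it avoids to_dataframe() entirely, so it does not
    touch the code path that raised the AttributeError in the question.
    """
    header = list(field_names)
    data = [list(row) for row in rows]
    return header, data


# Stand-in data: one row with a single DATE column, as in the question.
sample_rows = [(datetime.date(2015, 5, 12),)]
header, data = rows_to_table(sample_rows, ["T_DATE"])

# With pandas installed, the DataFrame would then be built as:
#     df = pd.DataFrame(data, columns=header)
```

With a real query, the same helper would presumably be called as `rows_to_table(result, [field.name for field in result.schema])`, where `result` is the object returned by `query_job.result()`. Note that `to_dataframe()` should not be called on `result` first, since that is the call that fails.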

Tags: pandas dataframe google-bigquery


【Solution 1】:
df = bq_client.query(sql, project=project_id).to_dataframe()

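It looks like the call to result() is what causes the problem. Calling to_dataframe() directly on the query job worked for me.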
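
A further thing that may be worth checking: the last traceback frame in the question fails on `pyarrow.array` with `'NoneType' object has no attribute 'array'`, which suggests that the name `pyarrow` inside the BigQuery helper module was `None` at that point. One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that `pyarrow` could not be imported in the local environment while it was available in the notebook environment. A minimal check (`is_importable` is a hypothetical helper name):

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: a missing pyarrow is only one candidate cause of the error.
if is_importable("pyarrow"):
    print("pyarrow is available in this environment")
else:
    print("pyarrow was not found; installing it may be worth trying")
```

If the check reports that pyarrow is missing, installing it into the same Python environment (for example with `pip install pyarrow`) and rerunning the original code would be a reasonable next step.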

【Comments】:
