【Question title】: ContextualVersionConflict using BigQuery in AI-Platform-Notebooks
【Posted】: 2020-12-31 01:02:37
【Question】:

I am trying to use BigQuery in AI-Platform-Notebooks, but I run into a ContextualVersionConflict. In this toy example, I am trying to pull two columns from the BigQuery table bgt_all in the project job2vec.

from google.cloud import bigquery

client = bigquery.Client()

aaa="""

SELECT BGTJobId, soc6 FROM `job2vec.bq_bgt_storage.bgt_all` LIMIT 100
"""
df = client.query(aaa).to_dataframe()
df.head()

This returns:

---------------------------------------------------------------------------
ContextualVersionConflict                 Traceback (most recent call last)
<ipython-input-25-7bdfe216bcc8> in <module>
      7 SELECT BGTJobId, soc6 FROM `job2vec.bq_bgt_storage.bgt_all` LIMIT 100
      8 """
----> 9 df = client.query(aaa).to_dataframe()
     10 df.head()

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/job.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
   3381             progress_bar_type=progress_bar_type,
   3382             create_bqstorage_client=create_bqstorage_client,
-> 3383             date_as_object=date_as_object,
   3384         )
   3385 

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
   1725                 progress_bar_type=progress_bar_type,
   1726                 bqstorage_client=bqstorage_client,
-> 1727                 create_bqstorage_client=create_bqstorage_client,
   1728             )
   1729             df = record_batch.to_pandas(date_as_object=date_as_object)

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_arrow(self, progress_bar_type, bqstorage_client, create_bqstorage_client)
   1535         owns_bqstorage_client = False
   1536         if not bqstorage_client and create_bqstorage_client:
-> 1537             bqstorage_client = self.client._create_bqstorage_client()
   1538             owns_bqstorage_client = bqstorage_client is not None
   1539 

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/client.py in _create_bqstorage_client(self)
    402         """
    403         try:
--> 404             from google.cloud import bigquery_storage_v1
    405         except ImportError:
    406             warnings.warn(

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery_storage_v1/__init__.py in <module>
     20 
     21 __version__ = pkg_resources.get_distribution(
---> 22     "google-cloud-bigquery-storage"
     23 ).version  # noqa
     24 

/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in get_distribution(dist)
    478         dist = Requirement.parse(dist)
    479     if isinstance(dist, Requirement):
--> 480         dist = get_provider(dist)
    481     if not isinstance(dist, Distribution):
    482         raise TypeError("Expected string, Requirement, or Distribution", dist)

/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    354     """Return an IResourceProvider for the named module or requirement"""
    355     if isinstance(moduleOrReq, Requirement):
--> 356         return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    357     try:
    358         module = sys.modules[moduleOrReq]

/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in require(self, *requirements)
    897         included, even if they were already activated in this working set.
    898         """
--> 899         needed = self.resolve(parse_requirements(requirements))
    900 
    901         for dist in needed:

/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    788                 # Oops, the "best" so far conflicts with a dependency
    789                 dependent_req = required_by[req]
--> 790                 raise VersionConflict(dist, req).with_context(dependent_req)
    791 
    792             # push the new requirements onto the stack

ContextualVersionConflict: (google-api-core 1.22.1 (/opt/conda/lib/python3.7/site-packages), Requirement.parse('google-api-core[grpc]<2.0.0dev,>=1.22.2'), {'google-cloud-bigquery-storage'})

This is strange, because when I run !pip install google-api-core --upgrade it reports version 1.24.1, so I don't understand why the conflict occurs.

Edit: running !conda list | grep google gives the following:

google-api-core-grpcio-gcp 1.16.0                        1    conda-forge
google-api-python-client  1.9.1              pyh9f0ad1d_0    conda-forge
google-apitools           0.5.31                   pypi_0    pypi
google-auth               1.24.0                   pypi_0    pypi
google-auth-httplib2      0.0.3                      py_3    conda-forge
google-auth-oauthlib      0.4.1                      py_2    conda-forge
google-cloud-bigquery     1.24.0                   pypi_0    pypi
google-cloud-bigquery-storage 2.1.0                    pypi_0    pypi
google-cloud-bigtable     1.0.0                    pypi_0    pypi
google-cloud-core         1.3.0                    pypi_0    pypi
google-cloud-dataproc     1.1.1                    pypi_0    pypi
google-cloud-datastore    1.7.4                    pypi_0    pypi
google-cloud-dlp          0.13.0                   pypi_0    pypi
google-cloud-firestore    1.8.1                    pypi_0    pypi
google-cloud-kms          1.4.0                    pypi_0    pypi
google-cloud-language     1.3.0                    pypi_0    pypi
google-cloud-logging      1.15.1                   pypi_0    pypi
google-cloud-pubsub       1.0.2                    pypi_0    pypi
google-cloud-scheduler    1.3.0                    pypi_0    pypi
google-cloud-spanner      1.17.1                   pypi_0    pypi
google-cloud-speech       1.3.2                    pypi_0    pypi
google-cloud-storage      1.30.0                   pypi_0    pypi
google-cloud-tasks        1.5.0                    pypi_0    pypi
google-cloud-translate    2.0.2                    pypi_0    pypi
google-cloud-videointelligence 1.13.0                   pypi_0    pypi
google-cloud-vision       0.42.0                   pypi_0    pypi
google-crc32c             0.1.0                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
google-resumable-media    0.7.1                    pypi_0    pypi
googleapis-common-protos  1.51.0           py37hc8dfbb8_2    conda-forge
grpc-google-iam-v1        0.12.3                   pypi_0    pypi

【Comments】:

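  • What do you get when you run !conda list | grep google?
  • I tried the same command with the latest DLVM cloud.google.com/ai-platform/deep-learning-vm/docs/… and it works fine (with a different table); I fixed the error related to google-api-core.
  • @gogasca I have just added the output of !conda list | grep google. Thanks for your help!
  • It seems you are using an older version of AI Platform Notebooks, in which we pinned "google-api-core-grpcio-gcp" to 1.16. That appears to conflict with google-cloud-bigquery-storage, which needs 1.22 or later. googleapis.dev/python/google-api-core/latest/changelog.html In a brand-new notebook I get: google-api-core 1.22.4 pyh9f0ad1d_0 conda-forge google-api-core-grpcio-gcp 1.22.2 hc8dfbb8_0 conda-forge. Would it be possible to migrate to a new notebook? Alternatively, if you use the Notebooks API, we also provide an upgrade endpoint.
  • Can you run the following command and check whether it works afterwards: pip install --upgrade pandas-gbq 'google-cloud-bigquery[bqstorage,pandas]'. Also, I was able to run the following without any error: df = pandas.read_gbq(sql, dialect='standard') \\ project_id = 'project-id' \\ df = pandas.read_gbq(sql, project_id=project_id, dialect='standard') \\ df.head(). Does it work for you? Note that each \\ just stands for a new line, and sql is your query.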

Tags: python google-bigquery jupyter-notebook gcp-ai-platform-notebook


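【Solution 1】:

To contribute further to the community, I am posting an answer based on the comments above.

First, try upgrading the packages with the following command: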

pip install --upgrade pandas-gbq 'google-cloud-bigquery[bqstorage,pandas]'
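
After upgrading, it may be worth confirming which versions the running kernel actually resolves, because the traceback reports google-api-core 1.22.1 even though pip shows a newer release. One possible explanation (an assumption, not confirmed in this thread) is that the kernel was started before the upgrade, in which case it would need a restart. The sketch below uses the standard-library importlib.metadata, which requires Python 3.8 or later; on the Python 3.7 environment shown in the traceback, pkg_resources.get_distribution(name).version plays the same role. The helper name installed_versions is hypothetical.

```python
from importlib import metadata

# Packages involved in the conflict reported by the traceback.
PACKAGES = [
    "google-api-core",
    "google-cloud-bigquery",
    "google-cloud-bigquery-storage",
]


def installed_versions(names):
    """Return {name: version or None} as seen by the current interpreter."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not visible to this interpreter at all.
            found[name] = None
    return found


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

If the versions printed here differ from what pip reports, the notebook kernel and the pip command are probably not looking at the same environment or the kernel has stale state.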

Then, instead of the to_dataframe() method, you can use read_gbq(), which loads data from BigQuery using the environment's default project, as follows:

import pandas

sql = """
    SELECT name
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    WHERE state = 'TX'
    LIMIT 100
"""

# Run a Standard SQL query using the environment's default project
df = pandas.read_gbq(sql, dialect='standard')

# Run a Standard SQL query with the project set explicitly
project_id = 'your-project-id'
df = pandas.read_gbq(sql, project_id=project_id, dialect='standard')

The code above is taken from the documentation.
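
If you would rather keep the original client.query(...).to_dataframe() call, note that the traceback shows the failing import only happens when to_dataframe() is asked to create a BigQuery Storage client. A possible workaround, assuming your installed google-cloud-bigquery exposes the create_bqstorage_client parameter that appears in the traceback signature, is to switch it off so the Storage API client is never imported. This is a sketch rather than a tested fix for this environment, and the helper name query_to_dataframe is hypothetical.

```python
def query_to_dataframe(client, sql):
    """Run `sql` and build a DataFrame without creating a BigQuery Storage client.

    `client` is expected to be a google.cloud.bigquery.Client instance.
    """
    job = client.query(sql)
    # create_bqstorage_client=False skips the code path that imports
    # google.cloud.bigquery_storage_v1, which is where the conflict is raised.
    return job.to_dataframe(create_bqstorage_client=False)


# Usage (needs credentials and a real project, so it is left commented out):
# from google.cloud import bigquery
# df = query_to_dataframe(bigquery.Client(), "SELECT 1 AS x")
# df.head()
```

Downloads may be slower for large results without the Storage API, but for a 100-row query like the one in the question that should not matter.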

【Comments】:

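    【Solution 2】:

    In Deep Learning VM image version 50, the library google-api-core-grpcio-gcp was pinned to version 1.16 because of a library issue.

    That pin later turned out to conflict with google-cloud-bigquery-storage, which requires a newer version (1.22 or later). If you start using Deep Learning VM version 59 or later, where the pin was removed, you should not see this issue: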

    google-api-core           1.22.4             pyh9f0ad1d_0    conda-forge 
    google-api-core-grpcio-gcp 1.22.2               hc8dfbb8_0    conda-forge 
    

    You can create a brand-new notebook or, if you use the Notebooks API, we also provide an upgrade endpoint that you can use to upgrade to the latest DLVM version.

    【Comments】:
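
To make the conflict in the error message concrete: google-cloud-bigquery-storage requires google-api-core >=1.22.2,<2.0.0dev, while the environment resolved 1.22.1. The following is a simplified, standard-library-only sketch of that comparison. It assumes plain dotted numeric versions and approximates the upper bound as <2.0.0; real tools apply full PEP 440 parsing. The helper names parse and satisfies are hypothetical.

```python
def parse(version):
    """Turn a plain dotted version such as '1.22.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, minimum="1.22.2", below="2.0.0"):
    """Check minimum <= version < below using tuple comparison."""
    return parse(minimum) <= parse(version) < parse(below)


# 1.22.1 is the version from the error message; 1.22.4 is the version
# reported for a fresh notebook; 1.24.1 is what pip reported after upgrading.
for candidate in ("1.22.1", "1.22.4", "1.24.1"):
    print(candidate, satisfies(candidate))
```

Only 1.22.1 fails the check, which matches the requirement quoted in the ContextualVersionConflict message.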
