【Question title】: Can no longer install `google-cloud-datastore` dependency for Google Cloud Dataflow jobs in requirements.txt file
【Posted】: 2018-08-01 00:31:56
【Question】:

Something similar to an earlier post has happened again: Google Python cloud-dataflow instances broke without new deployment (failed pubsub import)

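Basically, overnight, all of our Cloud Dataflow jobs broke for seemingly no reason. There was no new deployment, no SDK update, nothing; our team simply woke up to StackDriver notifications saying that our mapping jobs had failed overnight.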

Here is the stack trace from the failed job:

Traceback (most recent call last):
    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 609, in do_work work_executor.execute()
    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 167, in execute op.start()
    File "apache_beam/runners/worker/operations.py", line 340, in apache_beam.runners.worker.operations.DoOperation.start def start(self):
    File "apache_beam/runners/worker/operations.py", line 341, in apache_beam.runners.worker.operations.DoOperation.start with self.scoped_start_state:
    File "apache_beam/runners/worker/operations.py", line 346, in apache_beam.runners.worker.operations.DoOperation.start pickler.loads(self.spec.serialized_fn))
    File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 225, in loads return dill.loads(s)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads return load(file)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load obj = pik.load()
    File "/usr/lib/python2.7/pickle.py", line 864, in load dispatch[key](self)
    File "/usr/lib/python2.7/pickle.py", line 1096, in load_global klass = self.find_class(module, name)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 423, in find_class return StockUnpickler.find_class(self, module, name)
    File "/usr/lib/python2.7/pickle.py", line 1130, in find_class __import__(module)
    File "/usr/local/lib/python2.7/dist-packages/dataflow_pipeline/invoice_overages.py", line 26, in <module> from google.cloud.datastore.helpers import entity_from_protobuf
ImportError: No module named datastore.helpers

I tried to reproduce the error locally, and it appears to be caused by google-cloud-datastore failing to install.

Here are the contents of our current requirements.txt file:

Flask==0.12.2
apache-beam[gcp]
google-cloud-dataflow
gunicorn==19.7.1
google-cloud-datastore==1.3.0
pytz
google-cloud-pubsub
google-gax
grpc-google-iam-v1
googleapis-common-protos
google-cloud==0.32
six==1.10.0
protobuf
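
Several entries in this list carry no version pin, so a fresh install may resolve them to whatever release is newest that day. Whether that is what broke these jobs is an assumption, not something the post confirms. As a minimal sketch, the hypothetical helper `unpinned` below flags requirement lines that are not pinned with `==`:

```python
import re

# Matches a requirement pinned to one exact version, e.g. "six==1.10.0".
PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+==[^=*\s]+$")


def unpinned(requirement_lines):
    """Return the requirement lines that are not pinned with '=='."""
    result = []
    for raw in requirement_lines:
        line = raw.strip()
        # Ignore blank lines and comments; report everything without an exact pin.
        if line and not line.startswith("#") and not PINNED.match(line):
            result.append(line)
    return result


lines = [
    "Flask==0.12.2", "apache-beam[gcp]", "google-cloud-dataflow",
    "gunicorn==19.7.1", "google-cloud-datastore==1.3.0", "pytz",
    "google-cloud-pubsub", "google-gax", "grpc-google-iam-v1",
    "googleapis-common-protos", "google-cloud==0.32", "six==1.10.0", "protobuf",
]
floating = unpinned(lines)
```

Note that even a fully pinned list can still pull in unpinned transitive dependencies, so this check narrows the search rather than ruling the cause in or out.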

I was able to mostly reproduce this locally. I installed these requirements:

httplib2==0.9.1
oauth2client==3.0.0
google-cloud-dataflow==2.5.0

and I get the error shown above:

$ python main.py
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    import dataflow_pipeline.summarize_intervals as summarization_pipeline
  File "/Users/john/camio-mappers/box-counters-pipeline/dataflow_pipeline/summarize_intervals.py", line 31, in <module>
    from google.cloud.datastore.helpers import entity_from_protobuf
ImportError: No module named datastore.helpers
(venv)

But if I then run pip install --ignore-installed google-cloud-datastore

I get this crazy error:

$ python main.py
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    import dataflow_pipeline.summarize_intervals as summarization_pipeline
  File "/Users/john/camio-mappers/box-counters-pipeline/dataflow_pipeline/summarize_intervals.py", line 31, in <module>
    from google.cloud.datastore.helpers import entity_from_protobuf
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/__init__.py", line 61, in <module>
    from google.cloud.datastore.batch import Batch
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>
    from google.cloud.datastore import helpers
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/__init__.py", line 17, in <module>
    from google.cloud.datastore_v1 import types
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/types.py", line 26, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/proto/datastore_pb2.py", line 17, in <module>
    from google.cloud.datastore_v1.proto import entity_pb2 as google_dot_cloud_dot_datastore__v1_dot_proto_dot_entity__pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/proto/entity_pb2.py", line 28, in <module>
    dependencies=[google_dot_api_dot_annotations__pb2.DESCRIPTOR,google_dot_protobuf_dot_struct__pb2.DESCRIPTOR,google_dot_protobuf_dot_timestamp__pb2.DESCRIPTOR,google_dot_type_dot_latlng__pb2.DESCRIPTOR,])
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 878, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "google/cloud/datastore_v1/proto/entity.proto":
  google.datastore.v1.PartitionId.project_id: "google.datastore.v1.PartitionId.project_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.PartitionId.namespace_id: "google.datastore.v1.PartitionId.namespace_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.PartitionId: "google.datastore.v1.PartitionId" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.partition_id: "google.datastore.v1.Key.partition_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.path: "google.datastore.v1.Key.path" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.id_type: "google.datastore.v1.Key.PathElement.id_type" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.kind: "google.datastore.v1.Key.PathElement.kind" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.id: "google.datastore.v1.Key.PathElement.id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.name: "google.datastore.v1.Key.PathElement.name" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement: "google.datastore.v1.Key.PathElement" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key: "google.datastore.v1.Key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.ArrayValue.values: "google.datastore.v1.ArrayValue.values" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.ArrayValue: "google.datastore.v1.ArrayValue" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.value_type: "google.datastore.v1.Value.value_type" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.null_value: "google.datastore.v1.Value.null_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.boolean_value: "google.datastore.v1.Value.boolean_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.integer_value: "google.datastore.v1.Value.integer_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.double_value: "google.datastore.v1.Value.double_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.timestamp_value: "google.datastore.v1.Value.timestamp_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.key_value: "google.datastore.v1.Value.key_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.string_value: "google.datastore.v1.Value.string_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.blob_value: "google.datastore.v1.Value.blob_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.geo_point_value: "google.datastore.v1.Value.geo_point_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.entity_value: "google.datastore.v1.Value.entity_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.array_value: "google.datastore.v1.Value.array_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.meaning: "google.datastore.v1.Value.meaning" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.exclude_from_indexes: "google.datastore.v1.Value.exclude_from_indexes" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value: "google.datastore.v1.Value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.key: "google.datastore.v1.Entity.key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.properties: "google.datastore.v1.Entity.properties" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry.key: "google.datastore.v1.Entity.PropertiesEntry.key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry.value: "google.datastore.v1.Entity.PropertiesEntry.value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry: "google.datastore.v1.Entity.PropertiesEntry" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity: "google.datastore.v1.Entity" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.partition_id: "google.datastore.v1.PartitionId" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Key.path: "google.datastore.v1.Key.PathElement" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.ArrayValue.values: "google.datastore.v1.Value" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.key_value: "google.datastore.v1.Key" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.entity_value: "google.datastore.v1.Entity" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.array_value: "google.datastore.v1.ArrayValue" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.PropertiesEntry.value: "google.datastore.v1.Value" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.key: "google.datastore.v1.Key" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.properties: "google.datastore.v1.Entity.PropertiesEntry" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.

I don't know what this means. It does seem that I'm missing the listed .proto files, but why would they be missing?
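
The messages above say that the same names (`google.datastore.v1.*`) are being registered from two different generated files, `google/cloud/proto/datastore/v1/entity.proto` and `google/cloud/datastore_v1/proto/entity.proto`. One reading, offered as an assumption rather than a confirmed diagnosis, is that two installed packages each ship generated code for the same protos. The sketch below uses a hypothetical helper, `find_duplicate_pb2`, to list `*_pb2.py` files that appear in more than one directory under a site-packages tree:

```python
import os
from collections import defaultdict


def find_duplicate_pb2(site_packages):
    """Map each *_pb2.py basename to the directories that contain a copy."""
    locations = defaultdict(list)
    for root, _dirs, files in os.walk(site_packages):
        for name in files:
            if name.endswith("_pb2.py"):
                locations[name].append(os.path.relpath(root, site_packages))
    # Keep only basenames that occur in more than one directory.
    return {name: sorted(dirs) for name, dirs in locations.items() if len(dirs) > 1}
```

It could be pointed at the active environment with `find_duplicate_pb2(sysconfig.get_paths()["purelib"])`. A shared basename is only a hint: unrelated packages can legitimately reuse a file name, so each hit still needs a manual look.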

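My main question, as with the last time this happened: why is it acceptable for jobs to break when we haven't made any new deployment? The assumption is that if we don't change any code, the code shouldn't break. If we don't change any dependencies, we shouldn't end up with a broken pipeline that we then have to scramble to fix. Cloud Dataflow is out of beta and should be stable. If it were in beta and subject to API changes, this would be expected, but at this point it should be stable. As of now our pipelines are down, and we don't know for how long.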


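I made the suggested changes, and they partially fixed my pipeline, but now it fails at another point. It looks like google-cloud-pubsub is now failing with a similar error about missing .proto files: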

ERROR: (gcloud.app.deploy) Error Response: [9]
Application startup error:
    import dataflow_pipeline.tally_overages as overaging_pipeline
  File "/home/vmagent/app/dataflow_pipeline/tally_overages.py", line 29, in <module>
    from google.cloud import pubsub
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub.py", line 17, in <module>
    from google.cloud.pubsub_v1 import PublisherClient
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/__init__.py", line 17, in <module>
    from google.cloud.pubsub_v1 import types
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/types.py", line 30, in <module>
    from google.cloud.pubsub_v1.proto import pubsub_pb2
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/proto/pubsub_pb2.py", line 29, in <module>
    dependencies=[google_dot_api_dot_annotations__pb2.DESCRIPTOR,google_dot_protobuf_dot_duration__pb2.DESCRIPTOR,google_dot_protobuf_dot_empty__pb2.DESCRIPTOR,google_dot_protobuf_dot_field__mask__pb2.DESCRIPTOR,google_dot_protobuf_dot_timestamp__pb2.DESCRIPTOR,])
  File "/env/local/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 878, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "google/cloud/pubsub_v1/proto/pubsub.proto":
  google.pubsub.v1.Topic.name: "google.pubsub.v1.Topic.name" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Topic: "google.pubsub.v1.Topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.data: "google.pubsub.v1.PubsubMessage.data" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.attributes: "google.pubsub.v1.PubsubMessage.attributes" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.message_id: "google.pubsub.v1.PubsubMessage.message_id" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.publish_time: "google.pubsub.v1.PubsubMessage.publish_time" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry.key: "google.pubsub.v1.PubsubMessage.AttributesEntry.key" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry.value: "google.pubsub.v1.PubsubMessage.AttributesEntry.value" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry: "google.pubsub.v1.PubsubMessage.AttributesEntry" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage: "google.pubsub.v1.PubsubMessage" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.GetTopicRequest.topic: "google.pubsub.v1.GetTopicRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.GetTopicRequest: "google.pubsub.v1.GetTopicRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest.topic: "google.pubsub.v1.PublishRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest.messages: "google.pubsub.v1.PublishRequest.messages" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest: "google.pubsub.v1.PublishRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishResponse.message_ids: "google.pubsub.v1.PublishResponse.message_ids" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishResponse: "google.pubsub.v1.PublishResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.project: "google.pubsub.v1.ListTopicsRequest.project" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.page_size: "google.pubsub.v1.ListTopicsRequest.page_size" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.page_token: "google.pubsub.v1.ListTopicsRequest.page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest: "google.pubsub.v1.ListTopicsRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse.topics: "google.pubsub.v1.ListTopicsResponse.topics" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse.next_page_token: "google.pubsub.v1.ListTopicsResponse.next_page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse: "google.pubsub.v1.ListTopicsResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.topic: "google.pubsub.v1.ListTopicSubscriptionsRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.page_size: "google.pubsub.v1.ListTopicSubscriptionsRequest.page_size" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.page_token: "google.pubsub.v1.ListTopicSubscriptionsRequest.page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest: "google.pubsub.v1.ListTopicSubscriptionsRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse.subscriptions: "google.pubsub.v1.ListTopicSubscriptionsResponse.subscriptions" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse.next_page_token: "google.pubsub.v1.ListTopicSubscriptionsResponse.next_page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse: "google.pubsub.v1.ListTopicSubscriptionsResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.DeleteTopicRequest.topic: "google.pubsub.v1.DeleteTopicRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.DeleteTopicRequest: "google.pubsub.v1.DeleteTopicRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.name: "google.pubsub.v1.Subscription.name" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.topic: "google.pubsub.v1.Subscription.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.push_config: "google.pubsub.v1.Subscription.push_config" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.ack_deadline_seconds: "google.pubsub.v1.Subscription.ack_deadline_seconds" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.retain_acked_messages: "google.pubsub.v1.Subscription.retain_acked_messages" is already defined in 
  google.pubsub.v1.Subscriber.Seek: "google.pubsub.v1.SeekRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Subscriber.Seek: "google.pubsub.v1.SeekResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.CreateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.CreateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.UpdateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.Publish: "google.pubsub.v1.PublishRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.Publish: "google.pubsub.v1.PublishResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.GetTopic: "google.pubsub.v1.GetTopicRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.GetTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopics: "google.pubsub.v1.ListTopicsRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopics: "google.pubsub.v1.ListTopicsResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopicSubscriptions: "google.pubsub.v1.ListTopicSubscriptionsRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopicSubscriptions: "google.pubsub.v1.ListTopicSubscriptionsResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.DeleteTopic: "google.pubsub.v1.DeleteTopicRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.

I really don't understand what these errors mean. Here is my new requirements.txt file:

httplib2==0.9.1
oauth2client==3.0.0
google-cloud-dataflow==2.5.0
dill==0.2.6
Flask
gunicorn
pytz
googledatastore
google-cloud-datastore==1.4.0
google-cloud
google-cloud-pubsub

EDIT - here are the import lines from the code

File 1

from __future__ import absolute_import

# standard imports
import datetime
import json
import logging
import base64
import collections
import traceback
import hashlib
from functools import reduce

# apache beam / dataflow imports
import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

# google cloud datastore imports
from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf

# custom utility imports
from .util import *

File 2

from __future__ import absolute_import

import datetime
import math
import json
import logging
import traceback
import collections
import hashlib
from functools import reduce

import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf
from google.cloud import pubsub

from .util import *

File 3

from __future__ import absolute_import

import math
import json
import logging
import traceback
import uuid
import collections
import hashlib
import datetime
from functools import reduce

import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf
from google.cloud import pubsub

from .util import *

From the util.* file

import re
import os
import math
import datetime
import json
import logging
import base64
import traceback
import itertools
import pytz
import requests
from functools import reduce

# google cloud datastore imports
from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf

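【Comments】:

  • I tried from google.cloud.datastore.helpers import entity_from_protobuf and it worked fine for me. The only difference between us is that my google-cloud-datastore version is 1.4.0; maybe try updating?
  • I've removed the version pin for google-cloud-datastore, but the problem persists. I'll try specifying version 1.4.0.
  • What I really want to know is why these things break when we haven't done anything. We would assume that once we put a working build into production, it will run stably until some underlying SDK/API is deprecated, and that this would only happen on a reasonable timeline. Waking up to all of our batch jobs failing on an import error, with no changes on our side, is unacceptable for a production pipeline.
  • I think you should raise this on GitHub. Also, when I tried to reproduce your error I seemed to hit a different problem. Anyone interested can join chat.stackoverflow.com/rooms/177141/gcp-problem-51623806 to discuss, rather than extending the comment list.
  • I'm trying to reproduce the issue. Could you provide the import lines your script is using?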

Tags: python google-cloud-platform google-cloud-datastore python-import google-cloud-dataflow


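【Answer 1】:

I eventually got it working by creating a minimal set of pipelines and adding dependencies one by one until I found a combination that worked. The dependencies are as follows: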

Flask
gunicorn
apache-beam[gcp]==2.6.0
oauth2client==3.0.0
google-cloud-datastore==1.3.0
google-cloud-pubsub==0.28.0
google-cloud-core==0.27.0
google-cloud==0.34.0

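It's still extremely frustrating that these things break without warning and without any action on our part. Our production pipelines were offline for a week, but they're up now; hopefully this helps someone out there.

EDIT -

Spoke too soon. This combination works locally but still breaks in the cloud >:-(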

EDIT2 -

So.. it seems Google stopped installing from the requirements.txt file. I had this in my setup.py file:

REQUIRED_PACKAGES = ['google-cloud']

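It looks like this was the only dependency being installed! When I added this code:

import os

# Reuse requirements.txt as the package's install requirements.
if os.path.exists('requirements.txt'):
    with open('requirements.txt') as fh:
        REQUIRED_PACKAGES = [line.strip() for line in fh if line.strip()]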

it all worked! Now my pipelines run.
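
A slightly more defensive take on the same idea is sketched below. The helper name `read_requirements` is hypothetical, and it assumes a simple requirements file: blank lines, comments, and pip option lines such as `-r other.txt` are skipped, since `install_requires` expects only requirement specifiers.

```python
import os


def read_requirements(path="requirements.txt"):
    """Return requirement specifiers from a pip requirements file."""
    if not os.path.exists(path):
        return []
    requirements = []
    with open(path) as fh:
        for raw in fh:
            # Drop trailing inline comments, then surrounding whitespace.
            line = raw.split(" #", 1)[0].strip()
            # Skip blanks, full-line comments, and pip options like "-r file".
            if line and not line.startswith(("#", "-")):
                requirements.append(line)
    return requirements
```

In setup.py this could then be used as `REQUIRED_PACKAGES = read_requirements()` and passed to `setuptools.setup(install_requires=REQUIRED_PACKAGES, ...)`.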

So it looks like Google changed something, somewhere, so that it stopped installing from requirements.txt.

What!?!

【Comments】:

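  • I'd also like to point out the third step for PyPI dependencies in the beam.apache documentation: 3.- Run your pipeline with the following command-line option: --requirements_file requirements.txt
  • @Temu I'm a bit confused.. the second link you sent states that requirements.txt dependencies will be installed when running in the GAE Flex environment, but that is exactly what this question is about: it no longer does that. If I deploy to the flex environment, the requirements aren't installed unless they are enumerated in the setup.py file.
  • @Temu Also, regarding your first point, when the pipeline runs in the cloud I don't supply command-line options; I just state the pipeline's entrypoint, entrypoint: gunicorn -b :$PORT main:app, and let 'er rip?
  • @Temu, or do you mean I should set the requirements_file value in the beam.Pipeline(options=pipeline_options) dictionary? I guess that leaves the question of why this worked before but then randomly stopped working? The link you sent already states that the GAE Flex environment installs from the requirements.txt file, so isn't this a redundant step meant for local development?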
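
For the case raised in the last comment, where no command line is typed because the pipeline is launched from application code, the same options can be built as a list of flags. This is a minimal sketch: `dependency_flags` is a hypothetical helper, and nothing in the thread confirms that passing these flags resolves the issue. The `--requirements_file` and `--setup_file` names are Beam's dependency-staging options.

```python
def dependency_flags(requirements_file=None, setup_file=None):
    """Build Beam command-line flags for staging Python dependencies."""
    flags = []
    if requirements_file:
        flags += ["--requirements_file", requirements_file]
    if setup_file:
        flags += ["--setup_file", setup_file]
    return flags


flags = dependency_flags(requirements_file="requirements.txt")
# With apache_beam installed, the list could be passed as:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   pipeline_options = PipelineOptions(flags)
#   pipeline = beam.Pipeline(options=pipeline_options)
```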