How do you fetch files from a Google Cloud Storage bucket after a certain file has been fetched, using Python? Answers

【Question title】: How do you fetch files from a Google Cloud Storage bucket after a certain file is fetched, using Python?
【Posted】: 2021-03-15 21:32:24
【Question description】:

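Suppose my Google Cloud Storage bucket contains about 10k files, and when fetching them with Python I set the limit to max_results=100. I save the timestamp and name of the last file using blob.updated and blob.name. How can I make sure that the next time I run my Python program, it fetches the files that come after the 100th (already saved) file? Basically, I want to fetch the files after max_results=100, i.e. starting from file 101.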

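I went through the documentation but could not find anything related to what I am trying to do. I also know that the max_results parameter returns results only up to the number it is given, which is 100 in my case. Here is the code: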

from google.cloud import storage

storage_client = storage.Client()
bucket_name = 'json_file.json'
bucket = storage_client.get_bucket(bucket_name)
# List at most 100 blobs and record each name and last-updated timestamp
blobs = bucket.list_blobs(max_results=100)
last_file_timestamp = []
name_list = []
for blob in blobs:
    name_list.append(blob.name)
    last_file_timestamp.append(blob.updated)
print(name_list)
print(last_file_timestamp)

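In short: how do I make sure that the second time my Python script runs, it fetches the files from the bucket that come after the first 100? Is there a way? Please help.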

【Question discussion】:

Tags: python python-3.x google-cloud-platform

【Solution 1】:

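When you run a query against a Google API, you get a set of results and, if there are more results, a next-page token. In that case, use this token to request the next page from the API.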
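
The token pattern itself can be sketched without any cloud dependency. In this illustration, `fake_list_page` and `list_everything` are hypothetical names, and `fake_list_page` is only a stand-in for a paged API call (it is not part of any Google library): it returns one page of items plus a token for the next page, or `None` when nothing is left.

```python
ITEMS = [f"file_{i:03d}.json" for i in range(250)]  # made-up object names


def fake_list_page(page_token=None, max_results=100):
    """Hypothetical stand-in for a paged API call.

    Returns (items, next_page_token); the token is None on the last page.
    """
    start = int(page_token) if page_token else 0
    end = start + max_results
    next_token = str(end) if end < len(ITEMS) else None
    return ITEMS[start:end], next_token


def list_everything(max_results=100):
    """Keep requesting pages until the API stops returning a token."""
    collected, token = [], None
    while True:
        items, token = fake_list_page(page_token=token, max_results=max_results)
        collected.extend(items)
        if token is None:
            return collected
```

Each call hands back the position to resume from, so the caller never has to count how many items it has already seen.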

Here is an example based on your code:

from google.cloud import storage

storage_client = storage.Client()
bucket_name = 'json_file.json'
bucket = storage_client.get_bucket(bucket_name)

# First 100 results
blobs = bucket.list_blobs(max_results=100)
for blob in blobs:
    print(blob.name)

# Next 100 results: next_page_token is populated once the first listing has been iterated
blobs = bucket.list_blobs(page_token=blobs.next_page_token, max_results=100)
for blob in blobs:
    print(blob.name)
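
To carry this over to a later run of the script, as the question asks, the token could be written to disk at the end of one run and read back at the start of the next. The following is a minimal sketch rather than a definitive implementation: the state file `page_token.txt` and the helpers `load_token`, `save_token`, and `fetch_next_batch` are hypothetical names introduced here, and it assumes that a saved token is still accepted by the API on the next run, which this thread does not confirm.

```python
import os

STATE_FILE = "page_token.txt"  # hypothetical location for the saved token


def load_token(path=STATE_FILE):
    """Return the token saved by a previous run, or None on the first run."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        return fh.read().strip() or None


def save_token(token, path=STATE_FILE):
    """Persist the token; delete the file once the listing is exhausted."""
    if token:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(token)
    elif os.path.exists(path):
        os.remove(path)


def fetch_next_batch(bucket, batch_size=100, path=STATE_FILE):
    """List the next batch of blob names, resuming from the saved token."""
    blobs = bucket.list_blobs(max_results=batch_size, page_token=load_token(path))
    names = [blob.name for blob in blobs]  # iterating triggers the API request
    save_token(blobs.next_page_token, path)  # token is None when nothing is left
    return names


def main():
    # Imported here so the helpers above do not need the library installed.
    from google.cloud import storage

    bucket = storage.Client().get_bucket("json_file.json")  # name from the question
    print(fetch_next_batch(bucket))
```

Calling `main()` on each run prints the next 100 names. Once the listing is exhausted the state file is removed, so the following run starts again from the first file; keep the file instead if that is not the behavior you want.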

【Discussion】:

This is exactly what I was looking for. You have been a great help. Thank you, Guillaume!!