【Question Title】: Write csv to google cloud storage
【Posted】: 2017-04-25 03:54:10
【Question】:

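I am trying to understand how to write a multi-line csv file to Google Cloud Storage. I just can't follow the documentation.

A closely related question: Unable to read csv file uploaded on google cloud storage bucket

Example: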

from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"

a=[1,2,3]

b=['a','b','c']

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")

blob=bucket.blob("Hummingbirds/trainingdata.csv")

for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))

This leaves you with a single line in the object on Google Cloud Storage:

3,c

Apparently it opens a new file on every call and writes just that line.

OK, how about adding a newline delimiter?

for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")

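That adds the newline, but each call still writes from the beginning again.

Can someone show what the right approach is? I could combine all my rows into one string, or write a temporary file, but both seem ugly.

Perhaps open it as a file?

【Question Comments】:

Tags: python csv google-cloud-storage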


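【Solution 1】:

The blob.upload_from_string(data) method creates a new object whose contents are exactly the contents of the string data. It overwrites any existing object rather than appending to it.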

The simplest solution is to write the whole CSV to a temporary file and then upload that file to GCS with the blob.upload_from_filename(filename) function.
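A minimal sketch of the temporary-file route described above, assuming the rows are already available as Python sequences. The helper name upload_rows_via_tempfile is hypothetical and not part of the library; it relies only on blob.upload_from_filename, and passing content_type='text/csv' is an optional extra.

```python
import csv
import os
import tempfile


def upload_rows_via_tempfile(blob, rows):
    """Write rows to a temporary CSV file, then upload that file in one call.

    `blob` is expected to be a google.cloud.storage Blob (or any object with a
    compatible upload_from_filename method). The helper name is hypothetical.
    """
    # delete=False lets us close the file before uploading it by name.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", newline="", delete=False
    ) as tmp:
        csv.writer(tmp).writerows(rows)
        tmp_path = tmp.name
    try:
        blob.upload_from_filename(tmp_path, content_type="text/csv")
    finally:
        # Clean up the temporary file whether or not the upload succeeded.
        os.remove(tmp_path)
```

With the data from the question this would be called as upload_rows_via_tempfile(bucket.blob("Hummingbirds/trainingdata.csv"), zip(a, b)), which produces one object containing all three rows.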

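【Comments】:

  • What if you are restricted from creating files, e.g. your application runs on App Engine and you simply cannot? Is there a way to write to the object directly?
  • upload_from_string stores it as a text file. Is it possible to convert it to .csv?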
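Regarding the two comments above: one option that avoids the local file system is to stream the rows straight into the object. The sketch below assumes a google-cloud-storage release that provides Blob.open (to my knowledge added in 1.38) and that it accepts newline and content_type as shown; the helper name stream_rows_to_blob is hypothetical. As for the second comment, upload_from_string defaults to content_type='text/plain', so passing content_type='text/csv' (and giving the object a .csv name) should be enough to have it treated as CSV.

```python
import csv


def stream_rows_to_blob(blob, rows):
    """Stream rows into a GCS object without creating a local file.

    `blob` is expected to be a google.cloud.storage Blob; the helper name is
    hypothetical. The object is finalized when the writer is closed.
    """
    # newline="" leaves line endings to the csv module, as with local files.
    with blob.open("w", newline="", content_type="text/csv") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)
```

This is close to the "open it as a file" idea from the question: the rows are written one by one, but they all end up in a single object.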
【Solution 2】:

Please refer to the answer below; I hope it helps.

import pandas as pd

data = [['Alex', 'Feb', 10], ['Bob', 'jan', 12]]
df = pd.DataFrame(data, columns=['Name', 'Month', 'Age'])
print(df)

Output

   Name Month  Age
0  Alex   Feb   10
1   Bob   jan   12

Add a row

row = ['Sally', 'Oct', 15]
df.loc[len(df)] = row  # append at the next integer index
print(df)

Output

    Name Month  Age
0   Alex   Feb   10
1    Bob   jan   12
2  Sally   Oct   15

Write/copy to a GCP bucket using gsutil (the leading ! runs a shell command from a notebook cell)

df.to_csv('text.csv', index=False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'

Python code (documentation: https://googleapis.dev/python/storage/latest/index.html)

from google.cloud import storage


def upload_to_bucket(bucket_name, blob_path, local_path):
    """Upload a local file to the given object path and return its public URL."""
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    # The URL only resolves for anonymous users if the object is publicly readable.
    return blob.public_url


# method call
bucket_name = 'bucket-name'  # do not include gs://, just the bucket name
blob_path = 'path/folder name inside bucket'  # object path inside the bucket
local_path = 'local_machine_path_where_file_resides'  # local file path
upload_to_bucket(bucket_name, blob_path, local_path)

【Comments】:

【Solution 3】:

from google.cloud import storage
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"

a = [1, 2, 3]
b = ['a', 'b', 'c']

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")

blob = bucket.blob("Hummingbirds/trainingdata.csv")

# build up the complete csv string
csv_string_to_upload = ''

for eachrow in range(3):
    # add the lines
    csv_string_to_upload = csv_string_to_upload + str(a[eachrow]) + ',' + b[eachrow] + '\n'

# upload the complete csv string
blob.upload_from_string(
    data=csv_string_to_upload,
    content_type='text/csv'
)
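Plain string concatenation as above works for simple values, but it does not quote fields that themselves contain commas, quotes, or newlines. A hedged variant that builds the same string in memory with the standard csv module is sketched here; rows_to_csv_text and upload_rows are hypothetical helper names, and the only library call assumed is blob.upload_from_string with a content_type argument.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows to one CSV string, quoting fields where needed."""
    buffer = io.StringIO(newline="")
    # lineterminator="\n" matches the newline used in the answer above.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def upload_rows(blob, rows):
    """Upload all rows as a single object in one call.

    `blob` is expected to be a google.cloud.storage Blob.
    """
    blob.upload_from_string(rows_to_csv_text(rows), content_type="text/csv")
```

With the lists from the question, upload_rows(blob, zip(a, b)) uploads the text "1,a\n2,b\n3,c\n" in a single request.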
    

【Comments】:
