Pass file to Active Job / background job: Answers

【Question title】: Pass file to Active Job / background job
【Posted】: 2018-03-24 02:55:37
【Question】:

I'm receiving a file in the request params via a standard file input:

def create
  file = params[:file]
  upload = Upload.create(file: file, filename: "img.png")
end

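However, for large uploads I'd like to do this in a background job. Popular background job options such as Sidekiq or Resque depend on Redis to store job arguments, so I can't just pass a file object through Redis.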
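To make the constraint concrete, here is a minimal sketch of what happens to job arguments. It assumes the queue stores arguments as JSON (as Sidekiq and Resque do); the `enqueue_payload` helper is hypothetical and only mimics the serialize/deserialize round trip a worker would see.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper: mimics a Redis-backed queue by round-tripping
# the job arguments through JSON.
def enqueue_payload(*args)
  JSON.parse(JSON.generate(args))
end

# Without a block, Tempfile.create returns a plain File that we clean up ourselves.
file = Tempfile.create('upload')
file.write('binary image data')
file.flush

# A plain string (a path, URL, or record id) survives the round trip unchanged.
path_arg = enqueue_payload(file.path).first

# The file object does not: the worker receives a string description, not an IO.
file_arg = enqueue_payload(file).first

puts path_arg == file.path
puts file_arg.class
puts file_arg.respond_to?(:read)

file.close
File.unlink(file.path)
```

So whatever reaches the job has to be something that identifies the file (a key, URL, or record id), not the file itself.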

I could use a Tempfile, but on some platforms, such as Heroku, local storage isn't reliable.

What options do I have to make this reliable on "any" platform?

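【Comments】:

Tags: ruby-on-rails redis background-process sidekiq resque

【Solution 1】:

No Tempfile

It sounds like you want to either speed up image uploading or push it into the background. Here are my suggestions from another post. Maybe they will help you if that's what you're looking for.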

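The reason I found this question is that I wanted to save a CSV file and have my background job add the information in that file to the database.

I have a solution.

Because your question is a bit unclear and I'm too lazy to post my own question and answer it myself, I'll just post the answer here. lol

Like the others said, save the file on some cloud storage service. For Amazon, you need: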

# Gemfile
gem 'aws-sdk', '~> 2.0' # for storing images on AWS S3
gem 'paperclip', '~> 5.0.0' # image processor if you want to use images

You also need this. Use the same code in production.rb, but with a different bucket name.

# config/environments/development.rb
Rails.application.configure do
  config.paperclip_defaults = {
    storage: :s3,
    s3_host_name: 's3-us-west-2.amazonaws.com',
    s3_credentials: {
      bucket: 'my-bucket-development',
      s3_region: 'us-west-2',
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    }
  }
end

You also need a migration

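# db/migrate/20000000000000_create_companies.rb
# The table is named after the Company model; a `files` table would map to a
# `File` model, which collides with Ruby's built-in File class.
class CreateCompanies < ActiveRecord::Migration[5.0]
  def change
    create_table :companies do |t|
      t.attachment :import_file # Paperclip helper that adds the import_file_* columns
    end
  end
end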

And a model

class Company < ApplicationRecord
  after_save :start_file_import

  has_attached_file :import_file, default_url: '/missing.png'
  validates_attachment_content_type :import_file, content_type: %r{\Atext\/.*\Z}

  def start_file_import
    return unless import_file_updated_at_changed?
    FileImportJob.perform_later id
  end
end

And a job

require 'csv'

class FileImportJob < ApplicationJob
  queue_as :default

  # company_id is the id enqueued by Company#start_file_import
  def perform(company_id)
    company = Company.find company_id
    filepath = company.import_file.url

    # fetch file
    response = HTTParty.get filepath
    # we only need the contents of the response
    csv_text = response.body
    # use the csv gem to create csv table
    csv = CSV.parse csv_text, headers: true
    p "csv class: #{csv.class}" # => "csv class: CSV::Table"
    # loop through each table row and do something with the data
    csv.each_with_index do |row, index|
      if index == 0
        p "row class: #{row.class}" # => "row class: CSV::Row"
        p row.to_hash # hash of all the keys and values from the csv file
      end
    end
  end
end
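The parsing half of the job can be tried on its own, without S3 or HTTParty. This is a minimal sketch, assuming the downloaded body is CSV text with a header row; the sample data and the `rows_from_csv` helper name are made up for illustration.

```ruby
require 'csv'

# Hypothetical helper: turns CSV text with a header row into an array of
# hashes, a convenient shape for creating records afterwards.
def rows_from_csv(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

# Sample data standing in for response.body in the job.
csv_text = <<~CSV
  name,city
  Acme,Portland
  Globex,Seattle
CSV

rows = rows_from_csv(csv_text)
rows.each { |row| puts "#{row['name']} (#{row['city']})" }
```

Each hash is keyed by the header names, so in a real job you could pass a row to something like `create!` after whitelisting the keys you expect.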

In your controller

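def create
  # Saving the record triggers Company#start_file_import via after_save
  @company = Company.create company_params
end

private

def company_params
  params.require(:company).permit(:import_file)
end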

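【Comments】:

【Solution 2】:

I would suggest uploading directly to a service like Amazon S3 and then processing the file as you see fit in a background job.

When the user uploads the file, you can rest assured that it is safely stored in S3. You can use a private bucket to prohibit public access. Then, in your background task, you can process the upload by passing the file's S3 URI to the job and letting your background worker download the file.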
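As a rough illustration of passing an S3 URI to the worker, the sketch below splits an `s3://bucket/key` string into the two values an S3 client needs. The `parse_s3_uri` helper and the bucket and key names are hypothetical, and it assumes the key contains only URI-safe characters.

```ruby
require 'uri'

# Hypothetical helper: splits an "s3://bucket/key" URI into bucket and key.
# Only this string crosses the queue, never the file contents.
def parse_s3_uri(s3_uri)
  uri = URI.parse(s3_uri)
  raise ArgumentError, "not an S3 URI: #{s3_uri}" unless uri.scheme == 's3'

  { bucket: uri.host, key: uri.path.delete_prefix('/') }
end

location = parse_s3_uri('s3://my-private-bucket/uploads/img.png')
puts location[:bucket]
puts location[:key]

# In the worker, with the AWS SDK gem loaded, the download could then be:
#   Aws::S3::Client.new.get_object(bucket: location[:bucket],
#                                  key: location[:key],
#                                  response_target: '/tmp/img.png')
```

Because the worker authenticates with its own credentials, the bucket can stay private and no public URL is needed.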

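I don't know what your background worker does with the file, but it goes without saying that downloading it again might not be necessary. After all, it is already stored somewhere.

I have used the carrierwave-direct gem with success in the past. Since you mentioned Heroku, they have a detailed guide for uploading files directly to S3.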

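【Comments】:

【Solution 3】:

First, you should save the file to storage (either local or AWS S3). Then pass the filepath or uuid as a parameter to the background job.

I strongly recommend against passing a Tempfile as a job argument. That keeps the object in memory, where it can expire and leave you with stale data.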
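A minimal sketch of that flow, using only the standard library: a temp directory stands in for real storage, and `store_upload`, `process_upload`, and the `UploadJob` name are hypothetical. On a platform like Heroku you would swap the directory for S3 or similar, since local disk is not durable there.

```ruby
require 'fileutils'
require 'json'
require 'securerandom'
require 'stringio'
require 'tmpdir'

# Stand-in for durable storage; used here only so the sketch runs anywhere.
STORAGE_DIR = Dir.mktmpdir('uploads')

# Hypothetical helper: copies the upload into storage under a fresh UUID
# and returns that UUID, the only value handed to the job.
def store_upload(io)
  uuid = SecureRandom.uuid
  File.open(File.join(STORAGE_DIR, uuid), 'wb') { |out| IO.copy_stream(io, out) }
  uuid
end

# Hypothetical worker body: fetches the file again by UUID and processes it
# (here it just reports the size in bytes).
def process_upload(uuid)
  File.binread(File.join(STORAGE_DIR, uuid)).bytesize
end

upload = StringIO.new('pretend this is a large image')
uuid = store_upload(upload)

# The job payload is plain JSON, so it fits in Redis.
payload = JSON.generate('class' => 'UploadJob', 'args' => [uuid])
job = JSON.parse(payload)

bytes = process_upload(job['args'].first)
puts bytes

FileUtils.remove_entry(STORAGE_DIR)
```

The controller only pays for the copy into storage; everything slow happens in the worker, which needs nothing but the UUID.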

【Comments】: