[Posted]: 2022-02-08 21:45:33
[Question]:
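I have trained a Fastai (v1) model and exported it as a .pkl file. Now I want to deploy this model for inference in Amazon SageMaker.
I am following the SageMaker documentation for PyTorch models: https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#write-an-inference-script
Steps taken
Folder structure
The command I used to create the archive: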
cd sage/
tar -czvf /tmp/model.tar.gz ./export.pkl ./code
This generates a model.tar.gz file, which I then uploaded to an S3 bucket.
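For reference, the same packaging step can be sketched in Python with the standard `tarfile` module. This is only an illustration under assumptions: the helper names `build_model_archive` and `list_archive` are made up here, and the demo runs on a throwaway directory with placeholder files rather than the real model. Listing the archive afterwards is a quick way to confirm that `export.pkl` and the `code` directory sit at the archive root.

```python
import os
import tarfile
import tempfile


def build_model_archive(src_dir, out_path, members=("export.pkl", "code")):
    """Pack the listed members of src_dir into a gzipped tar at out_path."""
    with tarfile.open(out_path, "w:gz") as tar:
        for name in members:
            # arcname keeps entries relative to the archive root,
            # mirroring `tar -czvf ... ./export.pkl ./code`
            tar.add(os.path.join(src_dir, name), arcname=name)
    return out_path


def list_archive(path):
    """Return the sorted entry names stored in a gzipped tar."""
    with tarfile.open(path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Demo on a temporary directory with empty placeholder files
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "code"))
        open(os.path.join(src, "export.pkl"), "wb").close()
        open(os.path.join(src, "code", "inference.py"), "w").close()
        archive = build_model_archive(src, os.path.join(tmp, "model.tar.gz"))
        print(list_archive(archive))
```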
To deploy it, I used the SageMaker Python SDK:
from sagemaker.pytorch import PyTorchModel
role = "sagemaker-role-arn"
model_path = "s3 key for the model.tar.gz file that i created above"
pytorch_model = PyTorchModel(model_data=model_path, role=role, entry_point='inference.py', framework_version="1.4.0", py_version="py3")
predictor = pytorch_model.deploy(instance_type='ml.c5.large', initial_instance_count=1)
After running the code above, I can see that the model is created and deployed in SageMaker, but I get the following error when running inference:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "No module named 'fastai'
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 110, in transform
self.validate_and_initialize(model_dir=model_dir)
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 157, in validate_and_initialize
self._validate_user_module_and_set_functions()
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 170, in _validate_user_module_and_set_functions
user_module = importlib.import_module(user_module_name)
File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/opt/ml/model/code/inference.py", line 2, in <module>
from fastai.basic_train import load_learner, DatasetType, Path
ModuleNotFoundError: No module named 'fastai'
Clearly the fastai module is not being installed in the container. What is the reason for this, and what am I doing wrong?
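The traceback shows the failure happening while the container imports `inference.py` (line 2), before any model-loading code runs. A minimal sketch, assuming only the standard library, for checking which top-level modules a given Python environment can find; `missing_modules` is a hypothetical helper name, and it is meant for top-level names such as `fastai`, not dotted submodule paths:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # "fastai" is the module named in the traceback; "json" is a stdlib control
    print(missing_modules(["json", "fastai"]))
```

Running a check like this inside the same image as the endpoint would show whether `fastai` is absent from the environment itself rather than something specific to the inference script.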
[Discussion]:
Tags: python machine-learning amazon-sagemaker fast-ai