Creating a model for use in a pipeline from a hyperparameter tuning job: answers

【Question title】: Creating a model for use in a pipeline from a hyperparameter tuning job
【Posted】: 2019-05-25 19:44:04
【Question】:

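I am trying to put the best estimator from a hyperparameter tuning job into a pipeline object so that I can deploy an endpoint.

I have read the documentation as carefully as I can to work out how to include the results of the tuning job in the pipeline, but I am having trouble creating the Model() class object.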

# This is the hyperparameter tuning job
tuner.fit({'train': s3_train, 'validation': s3_val},
          include_cls_metadata=False)


#With a standard Model (Not from the tuner) the process was as follows:
scikit_learn_inferencee_model_name = sklearn_preprocessor.create_model()
xgb_model_name = Model(model_data=xgb_model.model_data, image=xgb_image)


model_name = 'xgb-inference-pipeline-' + timestamp_prefix
endpoint_name = 'xgb-inference-pipeline-ep-' + timestamp_prefix
sm_model = PipelineModel(
    name=model_name, 
    role=role, 
    models=[
        scikit_learn_inferencee_model_name, 
        xgb_model_name])

sm_model.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge',
                endpoint_name=endpoint_name)

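I would like to be able to cleanly instantiate a model object from the results of the tuning job and pass it to the PipelineModel object. Any guidance is appreciated.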

【Question comments】:

Tags: python amazon-web-services pipeline amazon-sagemaker

【Solution 1】:

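I think you are on the right track. Are you getting any errors? See this notebook for instantiating a model from a tuner and using it in an inference pipeline.

Editing my earlier reply based on the comments. To create a model from the best training job of a hyperparameter tuning job, you can use the snippet below: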

import sagemaker  # needed for sagemaker.get_execution_role() below
from sagemaker.tuner import HyperparameterTuner
from sagemaker.estimator import Estimator
from sagemaker.model import Model

# Attach to an existing hyperparameter tuning job.
xgb_tuning_job_name = 'my_xgb_hpo_tuning_job_name'
xgb_tuner = HyperparameterTuner.attach(xgb_tuning_job_name)

# Get the name of the best XGBoost training job from the HPO job.
xgb_best_training_job = xgb_tuner.best_training_job()
print(xgb_best_training_job)

# Attach an estimator to the best training job, looked up by name.
xgb_best_estimator = Estimator.attach(xgb_best_training_job)

# Create the model to be passed to the inference pipeline.
# Note: `image` and `image_name` are the SDK v1 names used at the time of
# this answer; SDK v2 renamed both to `image_uri`.
xgb_model = Model(model_data=xgb_best_estimator.model_data,
                  role=sagemaker.get_execution_role(),
                  image=xgb_best_estimator.image_name)
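
The tuned xgb_model can then take the place of the untuned XGBoost model in the question's PipelineModel. The following is a minimal sketch of that wiring, not a tested deployment recipe. The helper name build_tuned_pipeline and its pipeline_cls parameter are hypothetical, added only so the wiring can be tried without AWS access; the naming scheme follows the question, and the timestamp format is an assumption.

```python
from time import gmtime, strftime


def build_tuned_pipeline(preprocessor_model, tuned_xgb_model, role, pipeline_cls=None):
    """Chain the preprocessor and the tuned XGBoost model into one pipeline model.

    pipeline_cls is a hypothetical injection point for trying the wiring
    without AWS; by default the SDK's PipelineModel is used.
    """
    if pipeline_cls is None:
        # Imported lazily so the sketch loads even without the SDK installed.
        from sagemaker.pipeline import PipelineModel as pipeline_cls

    # Assumed timestamp format; the question does not show how
    # timestamp_prefix was built.
    timestamp_prefix = strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    return pipeline_cls(
        name="xgb-inference-pipeline-" + timestamp_prefix,
        role=role,
        # Order matters: requests pass through the containers in list order,
        # so the preprocessor comes before the XGBoost model.
        models=[preprocessor_model, tuned_xgb_model],
    )
```

With the objects from the question and the snippet above, this would be called as build_tuned_pipeline(scikit_learn_inferencee_model_name, xgb_model, role), and the result deployed with the same deploy(...) call the question already uses.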

【Discussion】:

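I was able to build my endpoint using xgb_model = Model(model_data = tuner.estimator.model_data)...., but that estimator and model do not reflect the best hyperparameters from my tuning job. I cannot find documentation anywhere on deploying a tuner object in a pipeline. I have been trying for almost a week. The notebook you linked does not use a hyperparameter tuning job, which creates a tuner object that has no .model_data field.
I have edited my answer based on your feedback. I do not see an example in the sample notebooks that creates an inference pipeline from an HPO job. It should not be too hard. See whether the code snippet above helps you create an inference pipeline from the best training job of the HPO job.
Thank you so much! This works, and the model builds correctly. I had to go through the docs to understand the multiple attach calls... could you quickly explain them so I can figure this out myself in the future? Many thanks. My account is too new to upvote your solution, but I have marked it as the accepted answer.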
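
On the attach calls asked about above: HyperparameterTuner.attach binds a tuner object to an existing tuning job by name, best_training_job() returns the name of the training job with the best objective metric, and Estimator.attach rebuilds an estimator from that one training job, so its model_data points at that job's artifact. The sketch below mimics that chain in plain Python. It is not the SDK's implementation (in practice the service reports the best job itself); the dictionaries are made-up data modeled on the SageMaker API's training job summaries, and every job name, metric value, and S3 path is hypothetical.

```python
# Made-up data shaped like the training job summaries of a tuning job.
SUMMARIES = [
    {"TrainingJobName": "xgb-hpo-001", "TrainingJobStatus": "Completed",
     "FinalHyperParameterTuningJobObjectiveMetric":
         {"MetricName": "validation:auc", "Value": 0.91}},
    {"TrainingJobName": "xgb-hpo-002", "TrainingJobStatus": "Completed",
     "FinalHyperParameterTuningJobObjectiveMetric":
         {"MetricName": "validation:auc", "Value": 0.94}},
    # A failed job has no final metric and must be skipped.
    {"TrainingJobName": "xgb-hpo-003", "TrainingJobStatus": "Failed"},
]

# Hypothetical mapping from training job name to its model artifact.
ARTIFACTS = {
    "xgb-hpo-001": "s3://example-bucket/xgb-hpo-001/output/model.tar.gz",
    "xgb-hpo-002": "s3://example-bucket/xgb-hpo-002/output/model.tar.gz",
}


def best_training_job(summaries, objective_type="Maximize"):
    """Return the name of the completed job with the best final metric."""
    completed = [s for s in summaries if s["TrainingJobStatus"] == "Completed"]
    if not completed:
        raise ValueError("no completed training jobs to choose from")
    pick = max if objective_type == "Maximize" else min
    best = pick(
        completed,
        key=lambda s: s["FinalHyperParameterTuningJobObjectiveMetric"]["Value"],
    )
    return best["TrainingJobName"]


def model_data_for(job_name, artifacts):
    """Resolve a training job name to that job's model artifact location."""
    return artifacts[job_name]


best_job = best_training_job(SUMMARIES)
best_model_data = model_data_for(best_job, ARTIFACTS)
print(best_job, best_model_data)
```

The point of the chain is that the artifact is looked up per training job, which is consistent with the earlier comment that tuner.estimator.model_data did not reflect the best hyperparameters.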