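[Posted]: 2021-03-16 15:36:17
[Question]:
I am using Databricks with the 7.4 ML runtime.
The following code logs the parameters and metrics successfully, and I can see ROCcurve.png in the MLflow GUI (it is the item in the artifact tree just below the model). But the actual plot is blank. Why?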
with mlflow.start_run(run_name="logistic-regression") as run:
    pipeModel = pipe.fit(trainDF)
    mlflow.spark.log_model(pipeModel, "model")
    predTest = pipeModel.transform(testDF)
    predTrain = pipeModel.transform(trainDF)
    evaluator = BinaryClassificationEvaluator(labelCol="arrivedLate")
    trainROC = evaluator.evaluate(predTrain)
    testROC = evaluator.evaluate(predTest)
    print(f"Train ROC: {trainROC}")
    print(f"Test ROC: {testROC}")
    mlflow.log_param("Dataset Name", "Flights " + datasetName)
    mlflow.log_metric(key="Train ROC", value=trainROC)
    mlflow.log_metric(key="Test ROC", value=testROC)
    lrModel = pipeModel.stages[3]
    trainingSummary = lrModel.summary
    roc = trainingSummary.roc.toPandas()
    plt.plot(roc['FPR'], roc['TPR'])
    plt.ylabel('False Positive Rate')
    plt.xlabel('True Positive Rate')
    plt.title('ROC Curve')
    plt.show()
    plt.savefig("ROCcurve.png")
    mlflow.log_artifact("ROCcurve.png")
    plt.close()

display(predTest.select(stringCols + ["arrivedLate", "prediction"]))
What the notebook shows:
What MLflow shows:
[Discussion]:
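A likely cause of the blank image in the question is the order of the calls: plt.show() runs before plt.savefig(). In a notebook, show() typically renders the current figure and then discards it, so savefig() ends up writing a new, empty figure. The axis labels in the question also look swapped, since FPR is plotted on the x-axis. The sketch below is one way to avoid both problems, not a confirmed fix for this exact environment. It draws on an explicit Figure and saves from that object. The helper name save_roc_plot and the sample points are hypothetical stand-ins for trainingSummary.roc.toPandas(), and the mlflow call is left as a comment so the snippet runs without a tracking server.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for running as a script; drop this line in a notebook
import matplotlib.pyplot as plt


def save_roc_plot(fpr, tpr, path):
    """Draw an ROC curve on an explicit Figure and write it to `path`."""
    fig, ax = plt.subplots()
    ax.plot(fpr, tpr)
    ax.set_xlabel("False Positive Rate")  # FPR is the x-axis
    ax.set_ylabel("True Positive Rate")   # TPR is the y-axis
    ax.set_title("ROC Curve")
    fig.savefig(path)  # save from the Figure itself, before any show() or close()
    plt.close(fig)     # release the figure once it is on disk
    return path


# Made-up points standing in for roc['FPR'] and roc['TPR'] from the question.
fpr = [0.0, 0.1, 0.4, 1.0]
tpr = [0.0, 0.6, 0.9, 1.0]
png_path = save_roc_plot(fpr, tpr, os.path.join(tempfile.mkdtemp(), "ROCcurve.png"))

# Inside the active run, the file would then be logged as in the question:
# mlflow.log_artifact(png_path)
```

If the plot should still appear in the notebook, calling plt.show() after the savefig() call (and before closing the figure) keeps the saved file intact.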
Tags: apache-spark matplotlib pyspark databricks mlflow