【Posted】: 2021-04-06 11:23:16
【Question】:
The official doc only states
>>> import torch
>>> from pytorch_lightning.metrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
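For reference, the documented call above tallies (target, prediction) pairs into a 2x2 grid. A minimal pure-Python sketch of that tally, assuming rows index the true class and columns the predicted class; `confusion_matrix` is a hypothetical helper written for illustration, not part of pytorch_lightning:

```python
def confusion_matrix(preds, target, num_classes):
    """Count (true class, predicted class) pairs; rows = target, columns = preds."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, target):
        matrix[t][p] += 1
    return matrix


target = [1, 1, 0, 0]
preds = [0, 1, 0, 0]
result = confusion_matrix(preds, target, num_classes=2)
print(result)  # [[2, 0], [1, 1]]
```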
This does not show how to use the metric within the framework.
My attempt (the methods are incomplete and show only the relevant parts):
def __init__(...):
    self.val_confusion = pl.metrics.classification.ConfusionMatrix(num_classes=self._config.n_clusters)

def validation_step(self, batch, batch_index):
    ...
    log_probs = self.forward(orig_batch)
    loss = self._criterion(log_probs, label_batch)
    self.val_confusion.update(log_probs, label_batch)
    self.log('validation_confusion_step', self.val_confusion, on_step=True, on_epoch=False)

def validation_step_end(self, outputs):
    return outputs

def validation_epoch_end(self, outs):
    self.log('validation_confusion_epoch', self.val_confusion.compute())
After epoch 0, this gives
Traceback (most recent call last):
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 521, in train
    self.train_loop.run_training_epoch()
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 588, in run_training_epoch
    self.trainer.run_evaluation(test_mode=False)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 613, in run_evaluation
    self.evaluation_loop.log_evaluation_step_metrics(output, batch_idx)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\evaluation_loop.py", line 346, in log_evaluation_step_metrics
    self.__log_result_step_metrics(step_log_metrics, step_pbar_metrics, batch_idx)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\evaluation_loop.py", line 350, in __log_result_step_metrics
    cached_batch_pbar_metrics, cached_batch_log_metrics = cached_results.update_logger_connector()
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 378, in update_logger_connector
    batch_log_metrics = self.get_latest_batch_log_metrics()
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 418, in get_latest_batch_log_metrics
    batch_log_metrics = self.run_batch_from_func_name("get_batch_log_metrics")
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 414, in run_batch_from_func_name
    results = [func(include_forked_originals=False) for func in results]
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 414, in <listcomp>
    results = [func(include_forked_originals=False) for func in results]
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 122, in get_batch_log_metrics
    return self.run_latest_batch_metrics_with_func_name("get_batch_log_metrics", *args, **kwargs)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 115, in run_latest_batch_metrics_with_func_name
    for dl_idx in range(self.num_dataloaders)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 115, in <listcomp>
    for dl_idx in range(self.num_dataloaders)
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 100, in get_latest_from_func_name
    results.update(func(*args, add_dataloader_idx=add_dataloader_idx, **kwargs))
  File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\core\step_result.py", line 298, in get_batch_log_metrics
    result[dl_key] = self[k]._forward_cache.detach()
AttributeError: 'NoneType' object has no attribute 'detach'
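The last frame reads `_forward_cache` from the logged metric object and finds `None`. One plausible reading, assuming the metric only fills that cache when it is called directly (`metric(preds, target)`) and not when `update()` is called on its own, can be modeled with a toy class. `ToyMetric` below is a hypothetical stand-in written for illustration, not the library's implementation:

```python
class ToyMetric:
    """Hypothetical stand-in for a stateful metric; not pytorch_lightning code."""

    def __init__(self):
        self.correct = 0
        self.total = 0
        self._forward_cache = None  # assumed to hold the latest per-batch value

    def update(self, preds, target):
        # Accumulates state only; the per-batch cache is left untouched.
        self.correct += sum(p == t for p, t in zip(preds, target))
        self.total += len(target)

    def compute(self):
        # Value over everything accumulated so far.
        return self.correct / self.total

    def __call__(self, preds, target):
        # Accumulates state and also caches the value for this batch alone.
        self.update(preds, target)
        batch_correct = sum(p == t for p, t in zip(preds, target))
        self._forward_cache = batch_correct / len(target)
        return self._forward_cache


updated_only = ToyMetric()
updated_only.update([0, 1, 0, 0], [1, 1, 0, 0])
print(updated_only._forward_cache)  # None: nothing for a step-level logger to read

called = ToyMetric()
called([0, 1, 0, 0], [1, 1, 0, 0])
print(called._forward_cache)  # 0.75
```

Under that assumption, logging a metric object at step level after only calling `update()` on it would fail in the way the traceback shows, because there is no cached per-batch value to detach.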
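It did pass the validation sanity check before training.
The failure happens on the return from validation_step_end, which makes little sense to me.
Using metrics in exactly the same way works fine with Accuracy.
How do I get a correct confusion matrix?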
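One pattern that fits a confusion matrix, sketched here in plain Python under the assumption that only the epoch-level matrix is needed: accumulate counts per validation batch, read the full matrix once at the end of the epoch, then reset. `RunningConfusion` and the sample batches are hypothetical and only mirror the shape of the hooks in the attempt above:

```python
class RunningConfusion:
    """Hypothetical accumulator; rows = true class, columns = predicted class."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.reset()

    def reset(self):
        n = self.num_classes
        self.matrix = [[0] * n for _ in range(n)]

    def update(self, preds, target):
        for p, t in zip(preds, target):
            self.matrix[t][p] += 1

    def compute(self):
        # Return a copy so a later reset() does not alter the reported result.
        return [row[:] for row in self.matrix]


val_confusion = RunningConfusion(num_classes=2)
batches = [([0, 1], [1, 1]), ([0, 0], [0, 0])]  # made-up (preds, target) pairs

for preds, target in batches:
    val_confusion.update(preds, target)  # the role validation_step would play

epoch_matrix = val_confusion.compute()   # the role validation_epoch_end would play
val_confusion.reset()                    # start the next epoch from zero
print(epoch_matrix)  # [[2, 0], [1, 1]]
```

Because the result is a matrix rather than a single number, it may need to be written to the logger as an image or a table instead of through a scalar logging call; that part depends on the logger in use.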
【Discussion】:
-
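Please provide the expected MRE. Show where the intermediate results deviate from the ones you expect. We should be able to paste your code block into a file, run it, and reproduce your problem. This also lets us test any suggestions in your context.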
-
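The documentation link you gave provides more information than your question does, as well as more complete examples. As I see it, the update in validation_step assumes an implementation that is inconsistent with the structure of a ConfusionMatrix object. Since you have omitted so much code, we cannot tell; you are asking us to visually inspect code fragments you have not traced, rather than test them.
-
@Prune An MRE is not feasible; running machine-learning code requires at least a dataset and a configuration. This is purely a question of missing documentation, and my reproducible example would be practically useless anyway; I just want to see the correct usage. Please tell me which part of the documentation I am missing. Clearly my implementation is not what is expected, but I also do not understand what is expected, since I am using it exactly as in the more complete Accuracy example.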
-
The Accuracy example in the documentation is not itself an MRE, because that would make it less readable... pytorch-lightning.readthedocs.io/en/stable/metrics.html
Tags: python deep-learning pytorch tensorboard pytorch-lightning