Check the total number of parameters in a PyTorch model: answers

【Question title】: Check the total number of parameters in a PyTorch model
【Posted】: 2018-08-18 11:11:51
【Question】:

How can I count the total number of parameters in a PyTorch model? Something similar to model.count_params() in Keras.

【Comments】:

Tags: deep-learning pytorch

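【Solution 1】:

To get a per-layer parameter count like Keras does, PyTorch has model.named_parameters(), which returns an iterator over both the parameter names and the parameters themselves.

Here is an example: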

from prettytable import PrettyTable

def count_parameters(model):
    table = PrettyTable(["Modules", "Parameters"])
    total_params = 0
    for name, parameter in model.named_parameters():
        if not parameter.requires_grad: continue
        params = parameter.numel()
        table.add_row([name, params])
        total_params+=params
    print(table)
    print(f"Total Trainable Params: {total_params}")
    return total_params
    
count_parameters(net)

The output looks like this:

+-------------------+------------+
|      Modules      | Parameters |
+-------------------+------------+
| embeddings.weight |   922866   |
|    conv1.weight   |  1048576   |
|     conv1.bias    |    1024    |
|     bn1.weight    |    1024    |
|      bn1.bias     |    1024    |
|    conv2.weight   |  2097152   |
|     conv2.bias    |    1024    |
|     bn2.weight    |    1024    |
|      bn2.bias     |    1024    |
|    conv3.weight   |  2097152   |
|     conv3.bias    |    1024    |
|     bn3.weight    |    1024    |
|      bn3.bias     |    1024    |
|    lin1.weight    |  50331648  |
|     lin1.bias     |    512     |
|    lin2.weight    |   265728   |
|     lin2.bias     |    519     |
+-------------------+------------+
Total Trainable Params: 56773369

【Discussion】:

【Solution 2】:

Simple and direct:

print(sum(p.numel() for p in model.parameters()))
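
As a sanity check, the one-liner can be compared against counts worked out by hand. The sketch below assumes a hypothetical toy model (the layer sizes are made up for illustration). A Conv2d layer holds out_channels * in_channels * kernel_height * kernel_width weights plus out_channels biases.

```python
from torch import nn

# Hypothetical toy model, used only to check the arithmetic.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),  # 8*3*3*3 weights + 8 biases = 224
    nn.ReLU(),                       # no parameters
    nn.Conv2d(8, 4, kernel_size=1),  # 4*8*1*1 weights + 4 biases = 36
)

total = sum(p.numel() for p in model.parameters())
expected = (8 * 3 * 3 * 3 + 8) + (4 * 8 * 1 * 1 + 4)
print(total, expected)
```

Both numbers should agree, which is a quick way to confirm the count covers biases as well as weights.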

【Discussion】:

【Solution 3】:

As @fábio-perez mentioned, PyTorch has no such built-in function.

However, I found this to be a compact and neat way of achieving the same result:

num_of_parameters = sum(map(torch.numel, model.parameters()))

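【Discussion】:

【Solution 4】:

There is a built-in utility function that converts an iterable of tensors into a single tensor, torch.nn.utils.parameters_to_vector, which can then be combined with torch.numel: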

torch.nn.utils.parameters_to_vector(model.parameters()).numel()

Or, shorter with a named import (from torch.nn.utils import parameters_to_vector):

parameters_to_vector(model.parameters()).numel()
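
One thing to keep in mind: parameters_to_vector concatenates every parameter into a new flat tensor, so it allocates a copy of all the weights just to read its length, whereas summing numel() does not. A small sketch comparing the two on a hypothetical model (the layer sizes are assumptions for illustration):

```python
from torch import nn
from torch.nn.utils import parameters_to_vector

# Hypothetical model: (10*5 + 5) + (5*2 + 2) = 67 parameters.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))

flat = parameters_to_vector(model.parameters())  # new 1-D tensor with every value
via_vector = flat.numel()
via_sum = sum(p.numel() for p in model.parameters())
print(flat.shape, via_vector, via_sum)
```

For large models the sum-based version is probably the safer default, since it never materializes the concatenated vector.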

【Discussion】:

【Solution 5】:

If you want to avoid double-counting shared parameters, you can use torch.Tensor.data_ptr. For example:

sum(dict((p.data_ptr(), p.numel()) for p in model.parameters()).values())

Here is a more detailed implementation that includes an option to filter out non-trainable parameters:

import torch

def numel(m: torch.nn.Module, only_trainable: bool = False):
    """
    returns the total number of parameters used by `m` (only counting
    shared parameters once); if `only_trainable` is True, then only
    includes parameters with `requires_grad = True`
    """
    parameters = list(m.parameters())
    if only_trainable:
        parameters = [p for p in parameters if p.requires_grad]
    unique = {p.data_ptr(): p for p in parameters}.values()
    return sum(p.numel() for p in unique)
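
To see when the data_ptr approach changes the result, here is a sketch with a hypothetical module in which two distinct Parameter objects wrap the same memory. As far as I can tell, model.parameters() already skips the same Parameter object registered under two names, so data_ptr mainly matters in cases like this one. The class name and layer sizes are made up for illustration.

```python
from torch import nn

class SharedStorage(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(4, 4, bias=False)
        self.b = nn.Linear(4, 4, bias=False)
        # A second Parameter object that wraps the same memory as a.weight.
        self.b.weight = nn.Parameter(self.a.weight.data)

model = SharedStorage()

# Plain sum sees two different Parameter objects and counts both.
naive = sum(p.numel() for p in model.parameters())
# Keying by data_ptr() keeps one entry per underlying memory address.
unique = sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())
print(naive, unique)
```

Here the plain sum reports 32 elements while the data_ptr version reports 16, because both weights point at the same 4x4 block of memory.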

【Discussion】:

【Solution 6】:

You can use torchsummary to do the same thing. It takes just two lines of code.

from torchsummary import summary

summary(model, input_shape)  # prints the table itself; input_shape is a tuple without the batch dimension

【Discussion】:

【Solution 7】:

Another possible solution:

def model_summary(model):
  print("model_summary")
  print()
  print("Layer_name"+"\t"*7+"Number of Parameters")
  print("="*100)
  model_parameters = [layer for layer in model.parameters() if layer.requires_grad]
  layer_name = [child for child in model.children()]
  j = 0
  total_params = 0
  print("\t"*10)
  for i in layer_name:
    print()
    param = 0
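    # A layer with a bias owns two consecutive tensors (weight, then bias);
    # a layer without one owns only its weight.
    bias = getattr(i, "bias", None) is not None
    if bias: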
      param =model_parameters[j].numel()+model_parameters[j+1].numel()
      j = j+2
    else:
      param =model_parameters[j].numel()
      j = j+1
    print(str(i)+"\t"*3+str(param))
    total_params+=param
  print("="*100)
  print(f"Total Params:{total_params}")       

model_summary(net)

This produces output similar to the following:

model_summary

Layer_name                          Number of Parameters
====================================================================================================

Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))             60
Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))            880
Linear(in_features=576, out_features=120, bias=True)        69240
Linear(in_features=120, out_features=84, bias=True)         10164
Linear(in_features=84, out_features=10, bias=True)          850
====================================================================================================
Total Params:81194
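
The index bookkeeping above relies on every layer owning exactly one weight and at most one bias. A sketch of an alternative that sums each child's own parameters instead, assuming a hypothetical network with the same five layers as in the output above (the class name Net and the attribute names are made up):

```python
from torch import nn

class Net(nn.Module):  # hypothetical network matching the layers listed above
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        self.fc1 = nn.Linear(576, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

def params_per_child(model):
    """Map each direct child's name to its number of trainable parameters."""
    return {
        name: sum(p.numel() for p in child.parameters() if p.requires_grad)
        for name, child in model.named_children()
    }

counts = params_per_child(Net())
for name, n in counts.items():
    print(f"{name:<10}{n:>10}")
print("Total:", sum(counts.values()))
```

This also works for children that are containers or that have more than two parameter tensors. Parameters shared between two children would still be counted once per child.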

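【Discussion】:

【Solution 8】:

If you want to count the weights and biases of each layer without instantiating the model, you can simply load the saved file and iterate over the resulting collections.OrderedDict, like this: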

import torch


tensor_dict = torch.load('model.dat', map_location='cpu') # OrderedDict
tensor_list = list(tensor_dict.items())
for layer_tensor_name, tensor in tensor_list:
    print('{}: {}'.format(layer_tensor_name, torch.numel(tensor)))

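You will get something like the following. Note that the saved dict also contains buffers such as the batch-norm running statistics, which are not learnable parameters: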

conv1.weight: 312
conv1.bias: 26
batch_norm1.weight: 26
batch_norm1.bias: 26
batch_norm1.running_mean: 26
batch_norm1.running_var: 26
conv2.weight: 2340
conv2.bias: 10
batch_norm2.weight: 10
batch_norm2.bias: 10
batch_norm2.running_mean: 10
batch_norm2.running_var: 10
fcs.layers.0.weight: 135200
fcs.layers.0.bias: 260
fcs.layers.1.weight: 33800
fcs.layers.1.bias: 130
fcs.batch_norm_layers.0.weight: 260
fcs.batch_norm_layers.0.bias: 260
fcs.batch_norm_layers.0.running_mean: 260
fcs.batch_norm_layers.0.running_var: 260
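
Because a state dict mixes parameters and buffers, its element total can exceed the parameter count. The sketch below separates the two. It assumes the model object is available, which the approach above deliberately avoids, and it uses a hypothetical two-layer model for illustration.

```python
from torch import nn

# Hypothetical model: one conv layer followed by batch norm.
model = nn.Sequential(nn.Conv2d(1, 4, 3), nn.BatchNorm2d(4))

state = model.state_dict()
param_names = {name for name, _ in model.named_parameters()}

# Entries whose key matches a named parameter are learnable; the rest are buffers.
param_total = sum(t.numel() for name, t in state.items() if name in param_names)
buffer_total = sum(t.numel() for name, t in state.items() if name not in param_names)
print("parameters:", param_total)
print("buffers:   ", buffer_total)
```

Without the model, one rough heuristic is to skip keys ending in running_mean, running_var, or num_batches_tracked, though that only covers the standard batch-norm buffers.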

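【Discussion】:

【Solution 9】:

PyTorch has no function that computes the total number of parameters the way Keras does, but you can sum the number of elements of every parameter tensor: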

pytorch_total_params = sum(p.numel() for p in model.parameters())

If you want to count only the trainable parameters:

pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
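
The two counts differ only when some parameters are frozen. A minimal sketch, assuming a hypothetical model whose first layer is frozen with requires_grad_(False):

```python
from torch import nn

# Hypothetical model: (10*5 + 5) + (5*2 + 2) = 67 parameters in total.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
model[0].requires_grad_(False)  # freeze the first layer (55 parameters)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)
```

Only the 12 parameters of the second Linear layer remain trainable, while the total stays at 67.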

This answer was inspired by an answer on the PyTorch forums.

Note: I'm answering my own question. If anyone has a better solution, please share it with us.

【Discussion】: