How do I count the total number of parameters in a PyTorch model? Something similar to model.count_params() in Keras.
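PyTorch doesn't have a function that calculates the total number of parameters the way Keras does, but you can sum the number of elements of each parameter tensor: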
pytorch_total_params = sum(p.numel() for p in model.parameters())
If you want to count only the trainable parameters:
pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
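As a quick sanity check, here is a minimal sketch of both one-liners on a toy model. The `nn.Sequential` network and the frozen first layer are made up for illustration and are not part of the question:

```python
from torch import nn

# Hypothetical toy model, used only to illustrate the two one-liners above.
model = nn.Sequential(
    nn.Linear(10, 5),   # 10*5 weights + 5 biases = 55
    nn.ReLU(),          # no parameters
    nn.Linear(5, 2),    # 5*2 weights + 2 biases = 12
)

# Freeze the first layer so that it no longer counts as trainable.
for p in model[0].parameters():
    p.requires_grad = False

total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(total_params)      # 67
print(trainable_params)  # 12
```

Freezing a layer changes only the second number, which is the difference the `requires_grad` filter is meant to capture.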
This answer was inspired by this answer on the PyTorch forums.
Note: I'm answering my own question. If anyone has a better solution, please share it with us.
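To get the parameter count of each layer like Keras does, PyTorch has model.named_parameters(), which returns an iterator over both the parameter names and the parameters themselves.
Here is an example: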
from prettytable import PrettyTable

def count_parameters(model):
    table = PrettyTable(["Modules", "Parameters"])
    total_params = 0
    for name, parameter in model.named_parameters():
        # Skip frozen (non-trainable) parameters.
        if not parameter.requires_grad:
            continue
        params = parameter.numel()
        table.add_row([name, params])
        total_params += params
    print(table)
    print(f"Total Trainable Params: {total_params}")
    return total_params

count_parameters(net)
The output would look something like this:
+-------------------+------------+
|      Modules      | Parameters |
+-------------------+------------+
| embeddings.weight |   922866   |
|    conv1.weight   |  1048576   |
|     conv1.bias    |    1024    |
|     bn1.weight    |    1024    |
|      bn1.bias     |    1024    |
|    conv2.weight   |  2097152   |
|     conv2.bias    |    1024    |
|     bn2.weight    |    1024    |
|      bn2.bias     |    1024    |
|    conv3.weight   |  2097152   |
|     conv3.bias    |    1024    |
|     bn3.weight    |    1024    |
|      bn3.bias     |    1024    |
|    lin1.weight    |  50331648  |
|     lin1.bias     |    512     |
|    lin2.weight    |   265728   |
|     lin2.bias     |    519     |
+-------------------+------------+
Total Trainable Params: 56773369
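If you would rather have one number per layer than one per tensor, the names returned by named_parameters() can be grouped by their first component. This is a minimal sketch without prettytable; `count_by_module` and the small `Net` class are hypothetical names introduced only for this illustration:

```python
from torch import nn

def count_by_module(model: nn.Module) -> dict:
    """Sum trainable parameter counts per top-level submodule name."""
    counts = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        top = name.split(".")[0]  # "conv1.weight" -> "conv1"
        counts[top] = counts.get(top, 0) + p.numel()
    return counts

class Net(nn.Module):
    """Hypothetical model; no forward() is needed just to count parameters."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv1d(8, 16, kernel_size=3)  # 16*8*3 + 16 = 400
        self.bn1 = nn.BatchNorm1d(16)                  # 16 + 16 = 32
        self.lin1 = nn.Linear(16, 4)                   # 16*4 + 4 = 68

per_module = count_by_module(Net())
for name, n in per_module.items():
    print(f"{name}: {n}")
```

Grouping on the first name component is a simplification: nested containers such as `fcs.layers.0` would all be folded into `fcs`.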
If you want to avoid double counting shared parameters, you can use torch.Tensor.data_ptr. For example:
sum(dict((p.data_ptr(), p.numel()) for p in model.parameters()).values())
Here is a more verbose implementation that includes an option to filter out non-trainable parameters:
import torch

def numel(m: torch.nn.Module, only_trainable: bool = False):
    """
    returns the total number of parameters used by `m` (only counting
    shared parameters once); if `only_trainable` is True, then only
    includes parameters with `requires_grad = True`
    """
    parameters = list(m.parameters())
    if only_trainable:
        parameters = [p for p in parameters if p.requires_grad]
    # Key by memory address so tensors sharing storage are counted once.
    unique = {p.data_ptr(): p for p in parameters}.values()
    return sum(p.numel() for p in unique)
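To see when the data_ptr approach makes a difference, here is a small sketch. As far as I know, model.parameters() already yields a given Parameter object only once even if it is registered under several names, so the trick matters mainly when two distinct Parameter objects wrap the same memory. The `Tied` module and `count_unique` helper below are hypothetical and exist only to construct that situation:

```python
from torch import nn

def count_unique(model: nn.Module) -> int:
    # Keep one entry per underlying memory address.
    return sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())

class Tied(nn.Module):
    """Hypothetical module whose two layers share one weight buffer."""
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(4, 4, bias=False)
        self.b = nn.Linear(4, 4, bias=False)
        # A *new* Parameter object that wraps the same storage as a.weight.
        self.b.weight = nn.Parameter(self.a.weight.data)

model = Tied()
naive = sum(p.numel() for p in model.parameters())
unique = count_unique(model)
print(naive, unique)  # 32 16
```

The naive sum sees two Parameter objects of 16 elements each, while the address-based count sees a single 16-element buffer.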
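If you want to calculate the number of weights and biases in each layer without instantiating the model, you can simply load the raw file and iterate over the resulting collections.OrderedDict, like so: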
import torch

# Loading a saved state dict yields an OrderedDict of name -> tensor.
tensor_dict = torch.load('model.dat', map_location='cpu')
tensor_list = list(tensor_dict.items())
for layer_tensor_name, tensor in tensor_list:
    print('{}: {}'.format(layer_tensor_name, torch.numel(tensor)))
You will get something like this:
conv1.weight: 312
conv1.bias: 26
batch_norm1.weight: 26
batch_norm1.bias: 26
batch_norm1.running_mean: 26
batch_norm1.running_var: 26
conv2.weight: 2340
conv2.bias: 10
batch_norm2.weight: 10
batch_norm2.bias: 10
batch_norm2.running_mean: 10
batch_norm2.running_var: 10
fcs.layers.0.weight: 135200
fcs.layers.0.bias: 260
fcs.layers.1.weight: 33800
fcs.layers.1.bias: 130
fcs.batch_norm_layers.0.weight: 260
fcs.batch_norm_layers.0.bias: 260
fcs.batch_norm_layers.0.running_mean: 260
fcs.batch_norm_layers.0.running_var: 260
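Note that a state dict also contains buffers such as running_mean and running_var, which are not learnable parameters, so summing every entry overcounts. Here is a minimal sketch of separating the two. The model is built only to produce a checkpoint to load, the in-memory buffer stands in for a file, and the suffix filter is a heuristic assumption that covers BatchNorm-style buffers only:

```python
import io

import torch
from torch import nn

# Build a tiny model only to have a state dict to save; not needed when a file exists.
model = nn.Sequential(nn.Conv2d(1, 2, kernel_size=3), nn.BatchNorm2d(2))

# Round-trip the state dict through an in-memory buffer instead of a file.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
tensor_dict = torch.load(buffer, map_location="cpu")

# Heuristic (assumption): these suffixes mark BatchNorm buffers, not parameters.
BUFFER_SUFFIXES = ("running_mean", "running_var", "num_batches_tracked")

counts = {name: t.numel() for name, t in tensor_dict.items()}
with_buffers = sum(counts.values())
params_only = sum(n for name, n in counts.items()
                  if not name.endswith(BUFFER_SUFFIXES))
print(params_only, with_buffers)  # 24 29
```

Custom modules can register buffers under any name, so for anything beyond BatchNorm the suffix list would need to be adapted.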
You can use torchsummary to do the same thing. It takes just two lines of code.
from torchsummary import summary
# summary() prints the table itself; input_shape excludes the batch dimension, e.g. (3, 224, 224)
summary(model, input_shape)
Another possible solution:
def model_summary(model):
    print("model_summary")
    print()
    print("Layer_name" + "\t" * 7 + "Number of Parameters")
    print("=" * 100)
    # Trainable parameters in registration order; consumed layer by layer below.
    model_parameters = [p for p in model.parameters() if p.requires_grad]
    layers = list(model.children())
    j = 0
    total_params = 0
    print("\t" * 10)
    for layer in layers:
        print()
        # Assumes every child layer owns a weight and, optionally, a bias.
        has_bias = getattr(layer, "bias", None) is not None
        if has_bias:
            # weight + bias
            param = model_parameters[j].numel() + model_parameters[j + 1].numel()
            j += 2
        else:
            # weight only
            param = model_parameters[j].numel()
            j += 1
        print(str(layer) + "\t" * 3 + str(param))
        total_params += param
    print("=" * 100)
    print(f"Total Params:{total_params}")

model_summary(net)
This will give output similar to the following:
model_summary
Layer_name Number of Parameters
====================================================================================================
Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1)) 60
Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1)) 880
Linear(in_features=576, out_features=120, bias=True) 69240
Linear(in_features=120, out_features=84, bias=True) 10164
Linear(in_features=84, out_features=10, bias=True) 850
====================================================================================================
Total Params:81194
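The per-layer numbers above can be checked by hand: a Conv2d layer has out_channels * in_channels * kernel_height * kernel_width weights plus one bias per output channel, and a Linear layer has in_features * out_features weights plus one bias per output feature. Here is a small sketch of that arithmetic in plain Python. The helper names are made up, and the convolution formula assumes groups=1:

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Parameter count of a Conv2d layer, assuming groups=1."""
    kh, kw = kernel
    return out_ch * in_ch * kh * kw + (out_ch if bias else 0)

def linear_params(in_f, out_f, bias=True):
    """Parameter count of a Linear layer."""
    return in_f * out_f + (out_f if bias else 0)

expected = [
    conv2d_params(1, 6, (3, 3)),    # 54 + 6      = 60
    conv2d_params(6, 16, (3, 3)),   # 864 + 16    = 880
    linear_params(576, 120),        # 69120 + 120 = 69240
    linear_params(120, 84),         # 10080 + 84  = 10164
    linear_params(84, 10),          # 840 + 10    = 850
]
print(expected, sum(expected))  # [60, 880, 69240, 10164, 850] 81194
```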
Straight and simple:
print(sum(p.numel() for p in model.parameters()))
There is a built-in utility function that converts an iterable of tensors into a single flat tensor, torch.nn.utils.parameters_to_vector, which you can combine with torch.numel:
torch.nn.utils.parameters_to_vector(model.parameters()).numel()
Or, shorter, with a named import (from torch.nn.utils import parameters_to_vector):
parameters_to_vector(model.parameters()).numel()
As @fábio-perez mentioned, there is no such built-in function in PyTorch.
However, I found this to be a compact and neat way of achieving the same result:
num_of_parameters = sum(map(torch.numel, model.parameters()))
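The one-liners in this thread should all agree. Here is a minimal sketch that compares them on a toy model (the `nn.Sequential` network is made up for illustration):

```python
import torch
from torch import nn
from torch.nn.utils import parameters_to_vector

# Hypothetical toy model: (3*4 + 4) + (4*1 + 1) = 21 parameters.
model = nn.Sequential(nn.Linear(3, 4), nn.Tanh(), nn.Linear(4, 1))

a = sum(p.numel() for p in model.parameters())
b = sum(map(torch.numel, model.parameters()))
c = parameters_to_vector(model.parameters()).numel()
print(a, b, c)  # 21 21 21
```

One design consideration: parameters_to_vector concatenates all parameters into a new flat tensor, so for very large models the two sum-based variants avoid that extra allocation.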