
Unexpected result of convolution operation

pytorch computer-vision deep-learning machine-learning

blue-sky · 6 years ago

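Here is code I wrote to perform a single convolution and output the shape.

I am using the formula from http://cs231n.github.io/convolutional-networks/ to calculate the output size:

"You can convince yourself that the correct formula for calculating how many neurons 'fit' is given by (W - F + 2P)/S + 1."

The function that computes the output size is implemented as follows: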

def output_size(w, f, stride, padding):
    return (((w - f) + (2 * padding)) / stride) + 1

The problem is that output_size computes a size of 2690.5, which differs from the convolution result of 1350:

%reset -f

import torch
import torch.nn.functional as F
import numpy as np
from PIL import Image
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from pylab import plt
plt.style.use('seaborn')
%matplotlib inline

width = 60
height = 30
kernel_size_param = 5
stride_param = 2
padding_param = 2

img = Image.new('RGB', (width, height), color = 'red')

in_channels = 3
out_channels = 3

class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels, 
                      out_channels, 
                      kernel_size=kernel_size_param, 
                      stride=stride_param, 
                      padding=padding_param))

    def forward(self, x):
        out = self.layer1(x)

        return out

# w : input volume size
# f : receptive field size of the Conv Layer neurons
# output_size computes spatial size of output volume - spatial dimensions are (width, height)
def output_size(w , f , stride , padding) : 
    return (((w - f) + (2 * padding)) / stride) + 1

w = width * height * in_channels
f = kernel_size_param * kernel_size_param

print('output size :' , output_size(w , f , stride_param , padding_param))

model = ConvNet()

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=.001)

img_a = np.array(img)
img_pt = torch.tensor(img_a).float()
result = model(img_pt.view(3, width , height).unsqueeze_(0))
an = result.view(30 , 15 , out_channels).data.numpy()

# print(result.shape)
# print(an.shape)

# print(np.amin(an.flatten('F')))

print(30 * 15 * out_channels)

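Have I implemented output_size correctly? How can I modify this model so that the shape of the Conv2d result matches the value returned by output_size?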

1 reply | 6 years ago

01axel01christian · 6 years ago

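The problem is that the input image is not square, so the formula has to be applied separately to the width and to the height of the input image. You also should not multiply in nb_channels (the channel count), because the number of output channels is something we set explicitly. Finally, use f = kernel_size rather than f = kernel_size * kernel_size, as the formula states.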

w = width 
h = height
f = kernel_size_param
output_w =  int(output_size(w , f , stride_param , padding_param))
output_h =  int(output_size(h , f , stride_param , padding_param))
print("Output_size", [1, out_channels, output_w, output_h])  # --> [1, 3, 30, 15]

And the actual output size of the convolution:

print("Output size", result.shape)  # --> torch.Size([1, 3, 30, 15])
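
To see where the two numbers in the question come from, here is a small sketch in plain Python (no PyTorch needed). It reuses the question's output_size formula and parameter values; the names flattened_guess and per_axis are only illustrative and do not appear in the original code.

```python
def output_size(w, f, stride, padding):
    # Same formula as in the question: (W - F + 2P) / S + 1
    return (((w - f) + (2 * padding)) / stride) + 1

width, height = 60, 30
in_channels = out_channels = 3
kernel, stride, padding = 5, 2, 2

# What the question computed: the whole flattened input and the kernel area.
flattened_guess = output_size(width * height * in_channels, kernel * kernel, stride, padding)

# What the formula expects: one spatial axis at a time, with the kernel side length.
per_axis = [int(output_size(n, kernel, stride, padding)) for n in (width, height)]

print(flattened_guess)                           # 2690.5
print(per_axis)                                  # [30, 15]
print(out_channels * per_axis[0] * per_axis[1])  # 1350
```

So the 1350 printed in the question is just the element count of a 3 x 30 x 15 output, while 2690.5 comes from feeding the formula a flattened size it was never meant to take.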

Formula source: http://cs231n.github.io/convolutional-networks/
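
With these values the stride does not divide evenly, so the formula yields 30.5 and 15.5, and int() truncates them to the right answer. A slightly more explicit variant, offered as a sketch rather than as the code used above, relies on floor division instead. The helper names conv_output_size and conv_output_shape are hypothetical, and flooring is an assumption chosen because it reproduces the 30 x 15 result reported in this thread.

```python
def conv_output_size(n, kernel, stride, padding):
    """Output length along one spatial axis: floor((N - F + 2P) / S) + 1."""
    return (n - kernel + 2 * padding) // stride + 1


def conv_output_shape(spatial, out_channels, kernel, stride, padding, batch=1):
    """Hypothetical helper: apply the formula to each spatial axis separately."""
    sizes = [conv_output_size(n, kernel, stride, padding) for n in spatial]
    return [batch, out_channels, *sizes]


print(conv_output_shape((60, 30), out_channels=3, kernel=5, stride=2, padding=2))
# [1, 3, 30, 15]
```

The channel count never enters the per-axis formula; it is simply whatever out_channels the layer was constructed with.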
