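Reference posts:

1. Smart Object Detection 26: Building a YOLOv3 object detection platform with PyTorch, Bubbliiiing's blog, CSDN

2. Detailed explanation of the parameters of each Darknet53 layer, Jianshu (jianshu.com)

3. Darknet53 network structure and code implementation, Tc.小浩's blog, CSDN

4. YOLOv3 algorithm explained in detail, 奥辰, cnblogs.com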

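Darknet53 is the backbone of the YOLOv3 network. To understand its structure in more detail, this post lists the input and output shapes of each Darknet53 layer for easier analysis.

The Darknet53 network structure is shown in Figure 1. The blue boxes labeled ×1, ×2, and ×8 indicate that the module is repeated 1, 2, and 8 times respectively, and the yellow boxes give the module names: Conv Block is an ordinary convolution module, and Residual Block is a residual module. Reading Figure 1 alongside Figure 2 makes the three output positions of Darknet53 easier to understand.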

[Figure 1: Darknet53 network structure]

The implementation code is as follows:

import math
from collections import OrderedDict

import torch.nn as nn


# ---------------------------------------------------------------------#
#   Residual block
#   A 1x1 convolution reduces the channel count, then a 3x3 convolution
#   extracts features and restores the channel count.
#   Finally the residual (skip) connection is added.
# ---------------------------------------------------------------------#
class BasicBlock(nn.Module):
    def __init__(self, inplanes, planes):
        super(BasicBlock, self).__init__()
        # (kernel_size, stride, padding) = (1, 1, 0) keeps width and height unchanged
        self.conv1 = nn.Conv2d(
            inplanes, planes[0], kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(planes[0])
        self.relu1 = nn.LeakyReLU(0.1)

        # (kernel_size, stride, padding) = (3, 1, 1) also keeps width and height unchanged
        self.conv2 = nn.Conv2d(
            planes[0], planes[1], kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes[1])
        self.relu2 = nn.LeakyReLU(0.1)

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu1(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu2(out)

        out += residual
        return out


class DarkNet(nn.Module):
    def __init__(self, layers):
        super(DarkNet, self).__init__()
        self.inplanes = 32
        # 416,416,3 -> 416,416,32
        self.conv1 = nn.Conv2d(
            3, self.inplanes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(self.inplanes)
        self.relu1 = nn.LeakyReLU(0.1)

        # 416,416,32 -> 208,208,64
        self.layer1 = self._make_layer([32, 64], layers[0])
        # 208,208,64 -> 104,104,128
        self.layer2 = self._make_layer([64, 128], layers[1])
        # 104,104,128 -> 52,52,256
        self.layer3 = self._make_layer([128, 256], layers[2])
        # 52,52,256 -> 26,26,512
        self.layer4 = self._make_layer([256, 512], layers[3])
        # 26,26,512 -> 13,13,1024
        self.layer5 = self._make_layer([512, 1024], layers[4])

        self.layers_out_filters = [64, 128, 256, 512, 1024]

        # Initialize the weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    # ---------------------------------------------------------------------#
    #   Each layer first downsamples with a 3x3 convolution of stride 2,
    #   then stacks residual blocks using the BasicBlock module above.
    #   planes = [channels after the 1x1 conv inside each residual block
    #             (equal to the previous layer's output channels),
    #             output channels of this layer]
    # ---------------------------------------------------------------------#
    def _make_layer(self, planes, blocks):
        layers = []
        # Downsampling: stride 2, kernel size 3
        layers.append(("ds_conv", nn.Conv2d(
            self.inplanes, planes[1], kernel_size=3, stride=2, padding=1, bias=False)))
        layers.append(("ds_bn", nn.BatchNorm2d(planes[1])))
        layers.append(("ds_relu", nn.LeakyReLU(0.1)))
        # Add the residual blocks
        self.inplanes = planes[1]
        # Repeat the Residual Block `blocks` times
        for i in range(0, blocks):
            layers.append(("residual_{}".format(
                i), BasicBlock(self.inplanes, planes)))
        return nn.Sequential(OrderedDict(layers))

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu1(x)

        x = self.layer1(x)
        x = self.layer2(x)
        out3 = self.layer3(x)
        out4 = self.layer4(out3)
        out5 = self.layer5(out4)
        # Return the three feature maps for further processing downstream
        return out3, out4, out5


def darknet53():
    model = DarkNet([1, 2, 8, 8, 4])
    return model
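
To double-check the shape comments in the code without running the network, the sizes can be traced by hand. The sketch below is not part of the original implementation: it assumes the standard convolution output-size formula, floor((in + 2*padding - kernel) / stride) + 1, and the helper names `conv_out_size` and `trace_darknet53_shapes` are made up for this illustration.

```python
def conv_out_size(size, kernel_size, stride, padding):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def trace_darknet53_shapes(input_size=416, blocks=(1, 2, 8, 8, 4)):
    """Return (stage name, height/width, channels) after each stage."""
    shapes = []
    # Stem: 3x3 conv, stride 1, padding 1, 3 -> 32 channels.
    size = conv_out_size(input_size, 3, 1, 1)
    channels = 32
    shapes.append(("conv1", size, channels))
    for index, num_blocks in enumerate(blocks, start=1):
        # Downsampling: 3x3 conv, stride 2, padding 1, channels doubled.
        size = conv_out_size(size, 3, 2, 1)
        channels *= 2
        # Each residual block is a 1x1 conv (padding 0) followed by a
        # 3x3 conv (padding 1), both stride 1, so the size is unchanged.
        for _ in range(num_blocks):
            size = conv_out_size(conv_out_size(size, 1, 1, 0), 3, 1, 1)
        shapes.append(("layer{}".format(index), size, channels))
    return shapes


for name, size, channels in trace_darknet53_shapes():
    print("{}: {}x{}x{}".format(name, size, size, channels))
```

For a 416x416 input, the last three stages come out as 52x52x256, 26x26x512, and 13x13x1024, which are the shapes of out3, out4, and out5 returned by forward.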

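Notes:

The 53 in Darknet53 is the total number of convolutional and fully connected layers: 53 = 1 + (1+2+8+8+4)*2 + 5 + 1. The first 1 is the initial convolution, (1+2+8+8+4)*2 counts the two convolutions in each of the 23 residual blocks, and the 5 counts the downsampling convolutions.

The structure diagram makes this clear. The final 1 is the fully connected layer, which the diagram does not show; see the original paper for it. Because the code here uses the network only as a backbone to extract features for further processing, it has no fully connected layer. I would rather call it Darknet52, but by convention it is still called Darknet53.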
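
The layer count can also be reproduced with a few lines of arithmetic. This is only a sketch of the counting rule described above; the function name `count_darknet53_layers` and its `include_fc` flag are hypothetical and do not appear in the original code.

```python
def count_darknet53_layers(blocks=(1, 2, 8, 8, 4), include_fc=True):
    """Count convolutional (and optionally fully connected) layers."""
    stem_convs = 1                     # the initial 3x3 convolution
    downsample_convs = len(blocks)     # one stride-2 convolution per layer
    residual_convs = 2 * sum(blocks)   # a 1x1 and a 3x3 conv per residual block
    fc_layers = 1 if include_fc else 0
    return stem_convs + downsample_convs + residual_convs + fc_layers


print(count_darknet53_layers())                  # 53
print(count_darknet53_layers(include_fc=False))  # 52
```

With the fully connected layer the total is 53; without it, as in the backbone code above, the count is 52.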