Hey, here we are again. In this lecture, we will continue building our real-world motion detection program. Before I go and write the code, I'd first like to explain to you the architecture of the program we're building, so how this program is able to detect motion in the video. I assume you already know how to build the script, since we built it in the previous lectures, but let me quickly recap what it does in case you missed it. The script triggers the video, which as you saw can come from the webcam, and we're processing frames here in this while loop, and so on. You already know this part, so I won't go through it again right now. What we need to do next is process those frames that are being iterated in this while loop, and I've got some pictures here to illustrate my ideas, my concepts. Let me go to the directory where they are located.
And so what this motion detection program will do is trigger the webcam, just like our current script does. One condition for the program to work well is that once you trigger the webcam, the first frame of the video should be the static background. So if you're planning to use this program, let's say you set up a webcam on a laptop or a Raspberry Pi server, and you want to detect the movement of a certain animal in an area. First, you'd want to trigger the camera while the background is static, and then use this background as a base image, so that you can compare the other images against it, and Python can detect whether there is a change between the first frame and the next frames. So that's one thing you need to do. This is an example of a background, and then at some point something, the animal, will appear on your camera.
Okay, and then what you'll have to write in your script is this: first, we want to gray out the images, both the background image and the current frame of the camera. So we'll store the first frame of the video capture in a variable and convert that frame to a grayscale image, and then the while loop will go through the current frames. You do the same for the current frame, converting it to grayscale as well. Then, for these two grayscale images of the current loop iteration, you'll compute the difference between them. So this is the difference: an example of a difference frame, or a delta frame, if you can call it that. In this particular image, you'll notice that behind me there is the lamp of the room. Normally, you wouldn't be able to see the lamp, but Python is actually computing the difference between the frame where I appear and the background frame where the lamp is visible. So it comes up with this gray image where each pixel has a certain intensity value. All right.
So that means, for instance, that the high-intensity values, like this one where I am, indicate there is potential motion in this area here, while the black areas imply there is no motion. But you also see some light pixels here, because when I appear on the camera, there is a shadow behind me, and so on. Later on, we will apply something else, called a threshold. We'll basically say that if you see a difference of more than, say, 100 in intensity in the delta frame, in the frame I just showed you, convert those pixels to white pixels, and for pixels that are below the threshold, convert them to black. That way, you come up with the outline of the object that is moving in the camera, or in the frame. We are doing all of this processing inside the while loop, and once we have calculated the threshold frame inside the loop, what we'll do with the current frame is find the contours of the white objects in the frame. Okay.
So for this particular image, we would have contours around this object here, and contours around this one and around this one as well. Then we will write a for loop that iterates through all the contours of the current frame: it will go to this contour, then to this one, this one, and this one. Inside that loop, we'll check the area of each contour. Say this one, for example, has an area of 500 pixels. If the area of the contour is more than 500 pixels, for example, then we consider it a moving object. If the area is less than 500, like this one here, which is probably 20 pixels, it will not be considered a moving object. Okay, I hope that makes sense. What we'll do next is draw a rectangle around the contours that are greater than the minimum area, and then show those rectangles on the original image, so on the color version of the current frame. That means we will see a rectangle in the video.
While the video is playing, we'll see a rectangle around the object. Later on, we will detect the times that the moving object entered the video frame and the times that it exited. But for now, let's simply focus on detecting the moving object in the video. Okay, let's go back to the script and close this, and this, and this. And let's remove some unnecessary lines here. We had this a variable, which we created because we wanted to see how many frames we had in the video. We don't need that anymore, and we don't need the script to stop for three seconds, so I'll remove the time.sleep call and this line as well. Okay, and now this first part is the trickiest one: we need to figure out a way to store the very first frame of the video. As soon as the video starts, we want to store that NumPy array in a variable and keep that variable static, so we don't want the value of that variable to change while the while loop runs in the script.
And the way to do that is to first create a variable for the frame, for the first_frame, and assign it a None value. None is a special Python value that allows you to create a variable with nothing assigned to it, but the variable is still there, so if you reference this variable later, Python will not say the variable is not defined. Okay, if you don't understand it yet, please hold on and you'll get it in just a while. What we need to do now is write a conditional and apply a continue statement. Let me write the conditional first, and then I'll explain to you what it does. So, we want to check if first_frame is None, and if it is None, we will assign the gray frame to first_frame.
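The None-and-continue pattern being described works like this in plain Python. The list of arrays below is only a stand-in for the live capture loop, so this is a sketch of the pattern, not the script itself:

```python
import numpy as np

# Illustrative stand-in for the capture loop: a list of "frames"
# instead of a live cv2.VideoCapture object.
frames = [np.zeros((4, 4), np.uint8),
          np.ones((4, 4), np.uint8),
          np.full((4, 4), 2, np.uint8)]

first_frame = None  # None marks "background not captured yet"
processed = []

for gray in frames:
    if first_frame is None:
        # First iteration only: remember the background frame...
        first_frame = gray
        # ...and skip the rest of the loop body.
        continue
    # From the second iteration on, compare against first_frame.
    processed.append(int(gray.max()) - int(first_frame.max()))
```

After the loop, first_frame still holds the very first array, and only the later frames were compared against it.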
So, what this does is: the script will run, the video will be triggered, and then the while loop will start to run. It will get the first frame of the video and store it in this frame variable, and this frame variable will be converted into a gray frame. Then we ask if first_frame is None, which is true in the first iteration of the loop; first_frame is actually None because we assigned None up here. So we assign the gray NumPy array to first_frame, and first_frame gets the grayscale image that represents the very first frame of the video. This happens in the very first iteration of the loop. Okay. But then what would happen is that Python would execute these other lines of code and go to the second iteration, where it will grab the second frame of the video. Let's say the first frame was a background image, and then suddenly an object appears in front of the camera; Python will grab the NumPy array that contains that object.
So the second frame will be converted to gray as well, and here we say: if first_frame is None, the first_frame variable gets the first frame of the video. But once we have grabbed the first frame, we don't want these other lines of code to be executed, because down here we'll apply the difference between frames, blur the frames, and so on. We don't want those to run yet. Instead, we want Python to go back to the beginning of the loop and continue with the second frame. To do that, we need to write continue here. This means: continue to the beginning of the loop, and don't go on and run the rest of the code.
Okay, so in the first iteration, first_frame is None, and first_frame gets the value of the first image of the video, and then the loop goes to the next iteration. In the next iteration, the loop grabs the second frame of the video, calculates the gray version of that frame, and then reaches the conditional again. In this case, is first_frame None? No, it's not, because first_frame got the value of the gray image in the first iteration of the while loop. So these lines here will not be executed in the second iteration of the loop. Okay. Great. That means we can now compute a delta frame: we can calculate the difference between the first frame and the current frame of the video. The first frame is the first_frame variable and the current frame is the gray variable. But before that, we'd like to do something to the current frame of the image.
We want to apply a GaussianBlur to the image. The reason we want to apply GaussianBlur is that we want to blur the image, to smooth it, because that removes noise and increases accuracy in the calculation of the difference. This method takes as its first parameter the image you want to blur, so we are passing the gray image here, and we are storing the blurred version of the image back in the gray variable. Then we have another parameter, which comes as a tuple: here we need to pass the width and the height of the Gaussian kernel, which essentially control the amount of blurriness; 21 and 21 are commonly accepted numbers. You also need one last parameter, which is the standard deviation, and I'll pass 0, which is also commonly used. If you want to learn more about these, you can go through the documentation, but these values should be good.
So we're making the gray image blurry here, and then down here we need to compare the first frame of the video, so the background, with the current frame. Let's call the result delta_frame, and that will be equal to cv2.absdiff, the absolute difference between first_frame and the current frame, which is gray. Note that first_frame will also be a gray version, a blurred gray version actually, so we are comparing two blurred grayscale images here. Okay. What this gives us is another image, and actually, I'd like to show that image on the screen with cv2.imshow, passing delta_frame. Let's see what we get out of this. Before running the script, I'll disappear from the view first and then reappear. Okay, let's see. Okay, nothing happened, because I forgot to enter the name of the window. Let's call this one Delta Frame, and let's say Gray Frame for the other. And let me run it again.
And here I am. So this is the blurred grayscale version, this one here, and we have the difference. Okay, so you can see the lamp behind me in the Delta Frame. Great. I'll press the Q key and quit the video. Now, if I want to print the delta frame, just to check, that will allow us to see the differences between the intensities of the corresponding pixels. So let me see. And if I quit this now, here we go. What we have here: 5 means there is almost no difference, so in this area there's probably no motion. But then you have 174, which is quite high, so that means Python will classify this as motion. Okay, that was just to show you the values of the delta_frame. What we need now is to classify these values, so let's say we want to assign a threshold.
So let's say that if there are values of more than 30, so if the difference between the first frame and the current frame is more than 30, we will classify that as white. We'll say there's probably motion in those pixels, so there's an object in those pixels. And if the difference is less than 30, we'll assign it a black pixel. Okay, so we can do that using the threshold method of the cv2 library. Let's say thresh_delta equals cv2.threshold, and what this expects is the image that you want to threshold, and then you specify the threshold limit, which we said is 30. Then, what color do you want to assign to the values that are more than 30? Well, we want to assign white, which corresponds to the value 255. Okay, and you also need one more argument here, which is the threshold method.
There are quite a few methods out there, but we are using this method here, THRESH_BINARY. You can experiment with others if you like. Great, and now let's see what this thresh_delta frame will look like: cv2.imshow, Threshold Frame, thresh_delta. Great. Let's see. Apparently, I wrote this the wrong way: it's delta_frame, not frame_delta. And one more typo in line 19, which is here: cv2. Okay, let's hope it works this time. And yet another error; this time, I've missed something here. Okay, this method here, the threshold method, actually returns a tuple with two values, and the first value is needed when you use other threshold methods: the first item of the tuple basically suggests a value for the threshold if you're using certain other methods. But for THRESH_BINARY, you only need to access the second item of the tuple, which is the actual frame returned from the threshold method. So you want to access the second item of the tuple. Okay, I promise this is the last error.
Great, finally, and this is the threshold frame. So you can see my outline there, but you also see some white areas here because of the shadows. So I am being detected as an object, but my shadows are being detected as objects too. Okay, so you get the idea. Now, we could go right away and use this thresh_delta frame, which I called thresh_delta; I should call it thresh_frame, you know, just for naming consistency. Then we can go ahead and find the contours of the white objects in thresh_frame. But before that, I'd like to do something else: I'd like to dilate those areas, so I want to remove the black holes from those big white areas in the image. Basically, I want to smooth my threshold frame, and to do that, you need to use the dilate method of the cv2 library. So let's say we want to reassign thresh_frame to cv2.dilate, and again, you want to pass the thresh_frame there.
Now, if you had a kernel array and you wanted this process to be very sophisticated, you'd pass that array in here. We don't have one and we don't need one, so you pass None for this parameter. And there's yet another parameter here, iterations; let's say 2. This defines how many times you want to go over the image to remove those holes: normally, the bigger this is, the smoother the image will be. Okay, let me check it quickly. Oh yes, I changed the name earlier, but I didn't change it here in the imshow call: thresh_frame, okay. So you can probably notice that the white areas are smoother now. If I go away, you'll only notice a few areas that are there because of my shadow. So it seems to be working so far. Let's quit it now. Great. So we've got these three frames. And what's next? Well, next we need to find the contours of this dilated threshold frame.
Regarding contour detection with OpenCV, you have two methods: findContours and drawContours. With the findContours method, you find the contours in your image and store them in a tuple. The drawContours method, on the other hand, draws contours on an image. In this case, what we want to do is find the contours, and then check the area of each contour. Let's say you have a contour like a circle; you want to find the area that this contour encloses. So you want to store those contours in a tuple. You write cnts, a comma, and an underscore, and that will be equal to cv2.findContours. Then you pass the frame that you want to find the contours for, and it's good to actually use a copy of the frame, because you don't want to modify the threshold frame.
So use copy here, 466 00:22:40,031 --> 00:22:42,210 and so this is the first parameter, 467 00:22:42,210 --> 00:22:44,721 the frame you want to find the contours from, 468 00:22:44,745 --> 00:22:46,745 [No audio] 469 00:22:46,774 --> 00:22:48,240 and then you have the mode, 470 00:22:49,242 --> 00:22:54,903 RETR_EXTERNAL, so you want to 471 00:22:55,080 --> 00:22:57,690 retrieve the external contours of the objects 472 00:22:57,690 --> 00:23:00,330 that you'll be finding in the image. 473 00:23:00,690 --> 00:23:03,150 And you've got yet another argument, 474 00:23:03,785 --> 00:23:09,125 CHAIN_APPROX_SIMPLE, and this is the 475 00:23:09,150 --> 00:23:13,110 approximation method that OpenCV will 476 00:23:13,230 --> 00:23:15,887 apply for retrieving the contours. 477 00:23:17,160 --> 00:23:20,400 Great. So what we have is we are 478 00:23:20,400 --> 00:23:23,100 iterating through the current frame, so 479 00:23:23,100 --> 00:23:25,410 we are blurring it, converting it to 480 00:23:25,410 --> 00:23:29,910 grayscale, finding the delta_frame, and 481 00:23:30,089 --> 00:23:32,654 applying the threshold, so the black and white 482 00:23:32,679 --> 00:23:35,010 image, and then we find 483 00:23:35,045 --> 00:23:38,315 all the contours of the 484 00:23:38,340 --> 00:23:41,280 distinct objects in this image. So if 485 00:23:41,280 --> 00:23:43,950 you've got two white continuous areas in 486 00:23:43,950 --> 00:23:46,621 your image, but they are distinct, you'll get 487 00:23:46,645 --> 00:23:48,360 two contours, one contour for 488 00:23:48,360 --> 00:23:50,910 each of the areas, and these contours will be 489 00:23:50,910 --> 00:23:55,890 stored in this cnts variable. So we're 490 00:23:55,890 --> 00:23:57,990 talking about the current frame, and next 491 00:23:57,990 --> 00:24:00,450 what we want to do is we want to filter 492 00:24:00,450 --> 00:24:03,000 out these contours.
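The contour-finding call might look like the sketch below (with the correct spelling "findContours"). The synthetic two-blob frame is invented for illustration. One hedge: OpenCV 2.x and 4.x return two values from findContours, while 3.x returns three, so the two-value unpacking used in the lecture assumes one of the former:

```python
import cv2
import numpy as np

# Synthetic dilated threshold frame with two distinct white areas.
dilated_frame = np.zeros((60, 60), dtype=np.uint8)
dilated_frame[5:20, 5:20] = 255     # first white area
dilated_frame[35:55, 35:55] = 255   # second, separate white area

# Work on a copy so the threshold frame itself is not modified;
# RETR_EXTERNAL retrieves only the outer contours of each object,
# and CHAIN_APPROX_SIMPLE compresses the contour points.
(cnts, _) = cv2.findContours(dilated_frame.copy(),
                             cv2.RETR_EXTERNAL,
                             cv2.CHAIN_APPROX_SIMPLE)

print(len(cnts))  # two distinct white areas -> two contours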
So we want to keep 493 00:24:03,000 --> 00:24:05,070 only the contours 494 00:24:05,070 --> 00:24:08,070 that have an area 495 00:24:08,070 --> 00:24:10,410 bigger than, let's say, 1000 pixels. 496 00:24:10,775 --> 00:24:13,775 For that you need to iterate, let's say 497 00:24:13,800 --> 00:24:17,842 for countour in cnts, 498 00:24:17,867 --> 00:24:19,890 [No audio] 499 00:24:19,915 --> 00:24:21,388 and if 500 00:24:22,168 --> 00:24:27,721 cv2.countourArea 501 00:24:28,293 --> 00:24:30,810 of the contour, so the 502 00:24:30,810 --> 00:24:33,720 contour that we are iterating through, so 503 00:24:33,745 --> 00:24:36,423 if this is less than 1000, 504 00:24:36,452 --> 00:24:39,123 [No audio] 505 00:24:39,148 --> 00:24:43,440 continue to the beginning of the for loop again. So 506 00:24:43,470 --> 00:24:46,050 what this means is, let's say Python 507 00:24:46,050 --> 00:24:48,330 found three contours. It will go 508 00:24:48,330 --> 00:24:50,340 through the first one and it'll say: if 509 00:24:50,340 --> 00:24:52,470 the area of this contour, so we use the 510 00:24:52,500 --> 00:24:54,540 countourArea method of the cv2 library, if 511 00:24:54,540 --> 00:24:59,100 the area is less than 1000 pixels, go 512 00:24:59,100 --> 00:25:01,200 to the next contour, so go to the 513 00:25:01,200 --> 00:25:03,180 second contour and check again and again 514 00:25:03,180 --> 00:25:06,390 and again. Otherwise, if the area is 515 00:25:06,390 --> 00:25:09,870 bigger than or equal to 1000, the next 516 00:25:09,870 --> 00:25:12,421 lines here inside the for loop will be executed. 517 00:25:14,056 --> 00:25:16,535 So what do you want to do if a 518 00:25:16,560 --> 00:25:19,020 contour is bigger than 1000 pixels? 519 00:25:20,400 --> 00:25:23,190 Well, we want to draw a rectangle 520 00:25:23,220 --> 00:25:26,109 surrounding that contour on the current frame.
521 00:25:26,134 --> 00:25:28,410 [No audio] 522 00:25:28,435 --> 00:25:30,660 So make sure you are inside the 523 00:25:30,660 --> 00:25:34,230 for loop. So we are still iterating, and 524 00:25:34,255 --> 00:25:38,945 [No audio] 525 00:25:38,970 --> 00:25:42,150 so these are the parameters that define 526 00:25:42,150 --> 00:25:44,874 the rectangle, and that will be equal to 527 00:25:45,606 --> 00:25:51,711 cv2.boundRectangle on the current contour. 528 00:25:51,736 --> 00:25:54,047 [No audio] 529 00:25:54,072 --> 00:25:57,990 So if the contour is equal to or 530 00:25:57,990 --> 00:26:01,590 greater than 1000 pixels, so if it 531 00:26:01,590 --> 00:26:03,750 has an area equal to or greater than 532 00:26:03,750 --> 00:26:07,590 1000 pixels, this will be executed. So 533 00:26:07,590 --> 00:26:10,830 we are creating a rectangle, and then we 534 00:26:10,830 --> 00:26:13,380 want to draw that rectangle on our frame, 535 00:26:13,440 --> 00:26:17,321 on our current frame. So cv2.rectangle, 536 00:26:18,380 --> 00:26:20,165 so we already used this 537 00:26:20,190 --> 00:26:22,620 method in our face detection lectures. 538 00:26:23,700 --> 00:26:25,350 And here, we want to pass the 539 00:26:25,350 --> 00:26:30,660 color frame. Okay, so that is the frame, and 540 00:26:31,650 --> 00:26:34,590 you want to specify x and y here. So 541 00:26:34,590 --> 00:26:36,390 these are the coordinates of the upper 542 00:26:36,530 --> 00:26:40,370 left corner of the rectangle, x, y, and 543 00:26:41,742 --> 00:26:43,830 you want to specify the coordinates of 544 00:26:43,830 --> 00:26:46,830 the lower right corner of the rectangle 545 00:26:46,860 --> 00:26:50,955 as well, so x+w and y+h, 546 00:26:51,888 --> 00:26:54,720 just like that, and 547 00:26:54,720 --> 00:26:58,290 also the color of your rectangle, let's 548 00:26:58,315 --> 00:27:02,910 say, green, and the width, let's say, 549 00:27:02,910 --> 00:27:06,180 3.
So what we did in these two lines 550 00:27:06,180 --> 00:27:10,170 is that we created this tuple with these 551 00:27:10,200 --> 00:27:14,310 four variables, and their values 552 00:27:14,310 --> 00:27:17,100 will be assigned automatically. So x and 553 00:27:17,100 --> 00:27:19,890 y will get their values from the 554 00:27:19,920 --> 00:27:23,040 rectangle bounding this 555 00:27:23,040 --> 00:27:25,560 current contour of the for loop. Then 556 00:27:25,560 --> 00:27:27,060 these values will be used to draw a 557 00:27:27,060 --> 00:27:29,100 rectangle in the frame, in the current 558 00:27:29,100 --> 00:27:31,080 frame, and then we want to show that 559 00:27:31,110 --> 00:27:34,260 current frame. So let me add it here, 560 00:27:34,950 --> 00:27:36,255 imshow. 561 00:27:36,279 --> 00:27:38,329 [No audio] 562 00:27:38,354 --> 00:27:40,200 Let's call this Color Frame 563 00:27:41,135 --> 00:27:45,480 and frame. And actually, this method here is 564 00:27:45,690 --> 00:27:50,190 boundingRect. All right, let me try the script now. 565 00:27:50,214 --> 00:27:55,655 [No audio] 566 00:27:55,680 --> 00:28:01,054 And cv2 has no attribute findCountours. 567 00:28:01,596 --> 00:28:04,445 So here I've got a 'u' 568 00:28:04,470 --> 00:28:05,988 that shouldn't be there. 569 00:28:07,508 --> 00:28:08,949 Let's try it again. 570 00:28:08,973 --> 00:28:14,826 [No audio] 571 00:28:14,851 --> 00:28:18,870 And that's funny. I tend to mistype 572 00:28:18,902 --> 00:28:24,320 this word, countourArea, line 24, so 573 00:28:24,695 --> 00:28:26,906 remove the 'u', and 574 00:28:26,930 --> 00:28:28,948 [No audio] 575 00:28:28,973 --> 00:28:31,456 it should be contour here as well. 576 00:28:31,480 --> 00:28:34,477 [No audio] 577 00:28:34,502 --> 00:28:35,539 Okay. 578 00:28:35,563 --> 00:28:42,485 [No audio] 579 00:28:42,525 --> 00:28:44,850 countour is not defined again, I've got 580 00:28:44,850 --> 00:28:50,010 another one here. So bear with me.
Try again. 581 00:28:50,034 --> 00:28:55,445 [No audio] 582 00:28:55,470 --> 00:28:57,570 And yeah, this time it seems to be working. 583 00:28:57,594 --> 00:29:01,775 [No audio] 584 00:29:01,800 --> 00:29:06,690 So no objects, objects, no objects, 585 00:29:07,170 --> 00:29:11,460 objects. Great. So that's what I 586 00:29:11,460 --> 00:29:14,190 wanted to teach you in this lecture, and 587 00:29:14,293 --> 00:29:15,815 we'll continue in the next lectures, 588 00:29:15,840 --> 00:29:17,910 because what we've done so far 589 00:29:17,910 --> 00:29:20,550 is that we can detect an object and we 590 00:29:20,550 --> 00:29:22,590 can draw a rectangle around that object. 591 00:29:22,650 --> 00:29:24,750 But this is not very practical. I mean, 592 00:29:24,750 --> 00:29:27,690 in the real world, it's not enough to just 593 00:29:27,720 --> 00:29:29,910 draw a rectangle around your object, and 594 00:29:29,910 --> 00:29:33,150 then that's it. So what we'll be doing 595 00:29:33,150 --> 00:29:35,430 in the next lectures is we'll be 596 00:29:35,430 --> 00:29:37,980 storing the times when the object enters 597 00:29:37,980 --> 00:29:40,800 the frame and when it exits the 598 00:29:40,800 --> 00:29:43,500 frame. So we've got some more lines to 599 00:29:43,500 --> 00:29:46,950 add to this code, and I know this code 600 00:29:46,950 --> 00:29:49,139 was quite a lot to consume, 601 00:29:49,164 --> 00:29:53,340 but I hope it's clear now, and I'll 602 00:29:53,340 --> 00:29:54,593 see you in the next lecture. 603 00:29:54,617 --> 00:29:56,617 [No audio]