Hey, welcome to this new lecture. Here you will learn how to detect faces. We will be using OpenCV with Python to detect one or more faces in an image. So how does face detection work, anyway? Well, the idea is that someone has created cascades, which are basically XML files, such as this one. This XML file contains information about the features that an image of a face contains. We're talking about the ratios of the shadows, the eyes, nose and lips; all of these features, these pixel intensity numbers, are recorded in the XML file, which was created by using images of faces as training samples. So basically, you tell your computer "all of these are faces", and then you use software like OpenCV to create such XML files. These are called Haar cascades, and this one is a Haar cascade for frontal faces. If you want other objects, you can find them at the link here: there is fullbody, lefteye, lowerbody, and so on.
Alternatively, you can use the resource section of this lecture and download a zip file with all the XML Haar cascades. In this lecture, though, we will focus on the frontal face cascade, and we will use it to detect faces. The way it works is that we load the image in Python, and then we tell Python that this XML model is what we want to look for in the image. What Python will do, with the help of OpenCV, is search the whole image using a window; then it will resize the image, so it will decrease the image size, and using the same window it will detect larger faces, and so on. We will go through that step by step. So let's write the script that detects faces. import cv2 is the first thing you want to do, and the next thing is to read the cascade into Python. Let's call the variable face_cascade, and in cv2 we have a method called CascadeClassifier.
This will create a face_cascade object in Python, and all we have to pass here is the path of the Haar cascade. That's it. That will create a CascadeClassifier object, and now you can use this CascadeClassifier object of the face feature to search for a face in your image. The next thing you want to do is load the image in Python, the image in which you want to search for a face. Let's say image equals. You know that we can load images in Python via the imread method of OpenCV, and you can just pass the photo.jpg name here. So I'm passing the file path of this image. I'm not passing any second parameter here, which means I'm reading the image as a color image. However, a good idea is to use grayscale images when searching for a face. So I loaded the image here, but I'll be using the grayscale version of the image when searching for a face in it. That is thought to produce higher accuracy when searching for faces.
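These two steps can be sketched as follows. This is only a minimal sketch: the cascade file name and photo.jpg are placeholders for wherever you saved your own files, and cv2 is imported inside the function so the helper can be defined even on a machine where OpenCV is not installed.

```python
def load_cascade_and_image(cascade_path="haarcascade_frontalface_default.xml",
                           image_path="photo.jpg"):
    # Both paths are placeholders -- point them at your own files.
    import cv2  # imported lazily so the sketch can be defined without OpenCV
    face_cascade = cv2.CascadeClassifier(cascade_path)
    image = cv2.imread(image_path)  # no flag -> loaded as a colour (BGR) image
    return face_cascade, image
```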
Because, you know, you may notice that when you have very busy images with lots of features, OpenCV will not be 100% accurate. You may miss faces that are obviously there, or you may get features that are wrongly classified as faces. Using the grayscale image, though, increases the accuracy. I could go ahead and pass a 0 flag here so that the image is read as grayscale, but I'd actually like to keep the original image as a color version, because I want to show the color version at the end, and pass the gray_image to the methods that will be searching for the face. So I'll create a gray_image variable here, which will store a grayscale version of the image. In cv2 we have a method called cvtColor, and that takes as arguments the original image, of course, and a flag, cv2.COLOR_BGR2GRAY. What this means is that it will convert the BGR image, so the blue, green, red bands, to a grayscale image. That's it.
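Under the hood, cv2.COLOR_BGR2GRAY is roughly a weighted sum of the three bands. The NumPy sketch below uses the standard luma weights and is only an approximation of what OpenCV computes (OpenCV also rounds slightly differently), but it shows the idea:

```python
import numpy as np

def bgr_to_gray(image):
    # Rough equivalent of cv2.cvtColor(image, cv2.COLOR_BGR2GRAY):
    # gray = 0.114*B + 0.587*G + 0.299*R (OpenCV stores channels as B, G, R)
    b, g, r = image[..., 0], image[..., 1], image[..., 2]
    return (0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)
```

Notice that green contributes the most, matching how bright the eye perceives each band.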
If you want to quickly show this gray_image, you can: cv2.imshow, we give a name for the window, Gray, and the image we want to show. And you need to call waitKey here, let's say with 0, so that pressing any key will close the window, and then cv2.destroyAllWindows. You learned these methods in the previous lectures. Let me execute this. So this is the grayscale version of the image. Now we will use a method called detectMultiScale, and what this method will do is search for the CascadeClassifier, so it will search for this frontalface XML model in our image, and it will return the coordinates of the face in the image. For instance, this is the image, and once this method finds the face, it will give you the number of the row and the column of the upper-left point of the face, so it starts here, and it will also give you the height of the face and the width of the face.
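The display pattern from the previous lectures can be kept as one small helper. Again this is only a sketch, with cv2 imported inside the function:

```python
def show_and_wait(window_name, img):
    # Open a window, block until any key is pressed, then close all windows.
    import cv2
    cv2.imshow(window_name, img)
    cv2.waitKey(0)  # 0 -> wait indefinitely for a key press
    cv2.destroyAllWindows()
```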
So we get a rectangle, and then we will draw that rectangle on the image. You'll understand it better as we go. We need to create, let's say, a faces variable, where we'll store these x, y, width and height values. Let's quickly do that. We need to refer to the CascadeClassifier object, which is this one here, and what we want to call on this object is detectMultiScale, and we want to run the detection on the gray_image. Then you want to pass a scale factor. You know that when you have lots of parameters and they are too many to fit on one line, you can just press Enter after a comma, and Python will read your line as if it were a single line; just be sure to press Enter right after the comma. So we have scaleFactor, and a good value to give for this would be 1.05. Now what does this mean, anyway? Well, let's consider this image.
What Python will do is start from the original size of the image, and it will create a window that searches for faces in the image: search in this area, in this area, in this area. Once it has done that, by giving a scaleFactor of 1.05, you're telling Python to decrease the scale by 5% for the next face search. So Python will downscale the image by 5% and search again, this time effectively for bigger faces in the image. Search again, search again, then decrease the image by 5% again and search for bigger faces, and so on, until it reaches a final size. That means a smaller value means higher accuracy. If you give, for example, 1.5 (note that the scale factor must be greater than 1), Python will decrease the scale by 50% at each pass. It will start from the original size and then jump down 50% at a time, and you don't get much accuracy with that. The benefit, normally, is that the script will run quicker, since you'll have fewer passes over the image when searching for a face. 1.05 is a good value.
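The pyramid of search sizes can be sketched in plain Python. The helper below (a hypothetical name, not part of OpenCV) lists the sizes at which one image dimension gets searched, which shows why 1.05 gives many more passes, and therefore more chances to hit a face, than a large factor like 1.5:

```python
def pyramid_scales(image_size, window_size, scale_factor=1.05):
    # Each pass shrinks the image by scale_factor; detection stops once the
    # image is smaller than the (fixed-size) detection window.
    sizes = []
    size = float(image_size)
    while size >= window_size:
        sizes.append(int(size))
        size /= scale_factor
    return sizes
```

For a 100-pixel dimension and a 50-pixel window, a factor of 1.05 yields 15 passes while 1.5 yields only 2, which is exactly the accuracy-versus-speed trade-off described above.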
Okay, and then you have another parameter, called minNeighbors, and that is usually set to 5. What this basically does is tell Python how many neighboring detections to require around each candidate window. You may want to experiment with these numbers a little bit and see which get the better results, but these two are well-accepted values. So let's do something now. Let's print out faces and see what this is about, what kind of object it is. I could also print the type of faces, just like that. I'll run the script now, and what the script will do is read the XML file, load the image, make the grayscale version of the image, then detect the coordinates of the upper-left corner of the face in the image and the width and the height of the rectangle defining the face, and then print the type of faces and the actual faces object. Okay, press any key to exit this.
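Put together, the call described above looks like this. The wrapper function is just a sketch so the two tuning parameters sit in one visible place:

```python
def detect_faces(face_cascade, gray_image, scale_factor=1.05, min_neighbors=5):
    # scaleFactor=1.05: shrink the image 5% per pass (more passes, more accuracy).
    # minNeighbors=5: a candidate window needs ~5 overlapping neighbouring
    # detections before it is accepted as a face, filtering out stray hits.
    return face_cascade.detectMultiScale(gray_image,
                                         scaleFactor=scale_factor,
                                         minNeighbors=min_neighbors)
```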
And so faces is a NumPy array, an n-dimensional array object, and it is an array with four values. So we have detected our face, and these are the values defining the face in the image. What we have here is 155, which is the 155th column, so this is the x; the rectangle should start somewhere here, on the forehead. And this should be 83, so row 83, column 155. Then we have the width, which is 382, and the height, which is the same, and so we have a rectangle around the face. Now let's go ahead and draw that rectangle on the face, on the image. We created the faces array, and what we want to do next is access all of the values of this array. To do that, we can use a for loop: for x, y, width and height in faces. Then image: we are updating the image object that we loaded here. So we will update it by drawing a rectangle on the image.
Again, you need to pass the image object in here, and then you need to pass four more arguments. The first argument is the starting point of the rectangle, so that would be x and y, the coordinates from the faces array. Great, and then the next parameter is another tuple defining the coordinates of the other corner of the rectangle. We've got the top-left corner here, and we've got the bottom-right corner down here, and that would be x plus the width and y plus the height. That's it. Yet another parameter, and this is the color that you want to give to this rectangle, and that comes in BGR format. So you pass a value for the blue color, let's say 0 for blue, and let's have a green rectangle, so 255 for green, a full green there, and then 0 for red. You can also pass one more parameter, which is the thickness of the rectangle border, let's say 3.
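The corner arithmetic is worth seeing on its own. Given the (x, y, width, height) rows from faces, the two points passed to cv2.rectangle come out as follows (face_corners is a hypothetical helper, used only for illustration):

```python
def face_corners(faces):
    # cv2.rectangle wants the top-left and bottom-right points:
    # top-left is (x, y); bottom-right is (x + width, y + height).
    return [((x, y), (x + w, y + h)) for (x, y, w, h) in faces]
```

For the face detected earlier, (155, 83, 382, 382), the rectangle runs from (155, 83) to (537, 465).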
Okay, and once you've done that, once you have updated your image, you may want to show the image window on the screen. So we use the cv2.imshow method, but here we have gray_image, and we want to pass the updated image object instead, and we should be good to go. Save the script and try to run it... and my system is not able to find face_detector.py. Yeah, I've messed up something with the name here, so let me fix the '.py', Enter; nothing to do with the script itself. Let me try again. Great. So I hope this is what you were expecting: the rectangle starts here, it has a width and a height, and it ends up here. Sometimes, though, you may have images that are bigger than your screen resolution. In that case, your image would not fit on your screen, so what you could do is resize the image before showing it.
So let me create a resized variable here, and that would be equal to cv2.resize, and we pass image, which now contains the rectangle, and you want to resize the image to these two values. So you need to set the resolution of the image here, and as you know, you could just put some fixed values, let's say 500 by 500. But fixed values might stretch out your image, so a better solution is to access the shape of the image, so the resolution of the image. This should be the width of your image; you already know this, I explained it before, so I'm just going through it quickly. To make sure you get a good size, you could divide it by 3, and then again image.shape to get the height, which is the first value of the tuple, divided by 3 again. You also want to convert these to integers, because you might get float numbers, and Python will complain that it cannot resize an image to float dimensions. So that should be okay.
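The width/height juggling here is easy to get backwards: image.shape is (rows, columns, channels), that is (height, width, channels), while cv2.resize expects its target size as (width, height). A small sketch of the computation, with a hypothetical helper name:

```python
def fit_to_screen(shape, factor=3):
    # shape comes from image.shape = (height, width, channels);
    # cv2.resize wants (width, height), and both must be integers.
    height, width = shape[0], shape[1]
    return (int(width / factor), int(height / factor))
```

Dividing both dimensions by the same factor keeps the aspect ratio, so the image shrinks without stretching.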
And then you want to show the resized image. Quickly testing it... and I'm missing a bracket here: all of these should form a tuple, but here I'm not opening it as a tuple. So bracket here, and it closes here. That should work now. Yeah, so this was my picture, but let's try a more challenging picture now, this one here. We've got two faces which are not very clear; this one is not a real frontal face, and this guy here has his eyes closed and his chin is not visible. We've also got these two faces here in the newspapers, which, let me open the original image from here, have a very low resolution, and I think Python will not be able to detect those two faces. So let me check this. Here's our face_detector, and we want to pass the name of the new image file. Save the script and go ahead and run it.
So we've got a back-to-reality moment here. As you see, this time Python and OpenCV were able to detect this face, even though it's not in a very frontal position, but they also "detected" the hand of this guy: Python read it as a face. What you can do in this case is tweak these values here, the scaleFactor and minNeighbors. Something you may also be interested to know: the faces array here has two rows in this case. This is the first face, the coordinates of the first face, and then the second face, which in this case happens to be a hand. Anyway, you get the idea. Probably by using, say, a 1.1 scale factor, you may be able to get rid of that hand. So yeah, that's it. You can also try to detect the face of this guy, but I don't think you'll be able to do it; just so you know, these techniques have limitations. It's a computer, not a human being, so it will always have some downsides in accuracy. I hope you've found this useful, though, and I'll see you in the next lecture.