Hey, welcome to this new lecture. Here you will learn how to detect faces. We will be using OpenCV with Python to detect one or more faces in an image. So how does face detection work, anyway? Well, the idea is that someone has created cascades, which are basically XML files, such as this one. This XML file contains information about the features that an image of a face contains. We're talking about the ratios of the shadows, the eyes, nose and lips; all of these features, these pixel intensity numbers, are recorded in the XML file, which was created by using images of faces as training samples. So basically, you tell your computer "all of these are faces", and then you use software like OpenCV to create such XML files. These are called Haar cascades, and this one is a Haar cascade for frontal faces. If you want other objects, you can find them at the link here: there is fullbody, lefteye, lowerbody, and so on.
Alternatively, you can use the resource section of this lecture and download a zip file with all the XML Haar cascades. In this lecture, though, we will focus on the frontal face cascade, and we will use it to detect faces. The way it works is that we load the image in Python, and then we tell Python that this XML model is what we want to look for in the image. What Python will do, with the help of OpenCV, is search the whole image using a window; then it will resize the image, so it will decrease the image size, and using the same window it will detect larger faces, and so on. We will go through that step by step. So let's write the script that detects faces. import cv2 is the first thing you want to do, and the next thing is to read the cascade into Python. Let's call the variable face_cascade, and in cv2 we have a method called CascadeClassifier.
This will create a face_cascade object in Python, and all we have to pass here is the path of the Haar cascade. That's it. That will create a CascadeClassifier object, and now you can use this CascadeClassifier object of the face feature to search for a face in your image. The next thing you want to do is load the image in Python, the image in which you want to search for a face. Let's say image equals. You know that we can load images in Python via the imread method of OpenCV, and you can just pass the photo.jpg name here. So I'm passing the file path of this image. I'm not passing any second parameter here, which means I'm reading the image as a color image. However, a good idea is to use grayscale images when searching for a face. So I loaded the image here, but I'll be using the grayscale version of the image when searching for a face in it. That is thought to produce higher accuracy when searching for faces.
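These two steps can be sketched as follows. This is only a minimal sketch: the cascade file name and photo.jpg are placeholders for wherever you saved your own files, and cv2 is imported inside the function so the helper can be defined even on a machine where OpenCV is not installed.

```python
def load_cascade_and_image(cascade_path="haarcascade_frontalface_default.xml",
                           image_path="photo.jpg"):
    # Both paths are placeholders -- point them at your own files.
    import cv2  # imported lazily so the sketch can be defined without OpenCV
    face_cascade = cv2.CascadeClassifier(cascade_path)
    image = cv2.imread(image_path)  # no flag -> loaded as a colour (BGR) image
    return face_cascade, image
```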
Because, you know, you may notice that when you have very busy images with lots of features, OpenCV will not be 100% accurate. You may miss faces that are obviously there, or you may get features that are wrongly classified as faces. Using the grayscale image, though, increases the accuracy. I could go ahead and pass a 0 flag here so that the image is read as grayscale, but I'd actually like to keep the original image as a color version, because I want to show the color version at the end, and pass the gray_image to the methods that will be searching for the face. So I'll create a gray_image variable here, which will store a grayscale version of the image. In cv2 we have a method called cvtColor, and that takes as arguments the original image, of course, and a flag, cv2.COLOR_BGR2GRAY. What this means is that it will convert the BGR image, so the blue, green, red bands, to a grayscale image. That's it.
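Under the hood, cv2.COLOR_BGR2GRAY is roughly a weighted sum of the three bands. The NumPy sketch below uses the standard luma weights and is only an approximation of what OpenCV computes (OpenCV also rounds slightly differently), but it shows the idea:

```python
import numpy as np

def bgr_to_gray(image):
    # Rough equivalent of cv2.cvtColor(image, cv2.COLOR_BGR2GRAY):
    # gray = 0.114*B + 0.587*G + 0.299*R (OpenCV stores channels as B, G, R)
    b, g, r = image[..., 0], image[..., 1], image[..., 2]
    return (0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)
```

Notice that green contributes the most, matching how bright the eye perceives each band.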
If you want to quickly show this gray_image, you can: cv2.imshow, we give a name for the window, Gray, and the image we want to show. And you need to call waitKey here, let's say with 0, so that pressing any key will close the window, and then cv2.destroyAllWindows. You learned these methods in the previous lectures. Let me execute this. So this is the grayscale version of the image. Now we will use a method called detectMultiScale, and what this method will do is search for the CascadeClassifier, so it will search for this frontalface XML model in our image, and it will return the coordinates of the face in the image. For instance, this is the image, and once this method finds the face, it will give you the number of the row and the column of the upper-left point of the face, so it starts here, and it will also give you the height of the face and the width of the face.
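The display pattern from the previous lectures can be kept as one small helper. Again this is only a sketch, with cv2 imported inside the function:

```python
def show_and_wait(window_name, img):
    # Open a window, block until any key is pressed, then close all windows.
    import cv2
    cv2.imshow(window_name, img)
    cv2.waitKey(0)  # 0 -> wait indefinitely for a key press
    cv2.destroyAllWindows()
```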
So we get a rectangle, and then we will draw that rectangle on the image. You'll understand it better as we go. We need to create, let's say, a faces variable, where we'll store these x, y, width and height values. Let's quickly do that. We need to refer to the CascadeClassifier object, which is this one here, and what we want to call on this object is detectMultiScale, and we want to run the detection on the gray_image. Then you want to pass a scale factor. You know that when you have lots of parameters and they are too many to fit on one line, you can just press Enter after a comma, and Python will read your line as if it were a single line; just be sure to press Enter right after the comma. So we have scaleFactor, and a good value to give for this would be 1.05. Now what does this mean, anyway? Well, let's consider this image.
What Python will do is start from the original size of the image, and it will create a window that searches for faces in the image: search in this area, in this area, in this area. Once it has done that, by giving a scaleFactor of 1.05, you're telling Python to decrease the scale by 5% for the next face search. So Python will downscale the image by 5% and search again, this time effectively for bigger faces in the image. Search again, search again, then decrease the image by 5% again and search for bigger faces, and so on, until it reaches a final size. That means a smaller value means higher accuracy. If you give, for example, 1.5 (note that the scale factor must be greater than 1), Python will decrease the scale by 50% at each pass. It will start from the original size and then jump down 50% at a time, and you don't get much accuracy with that. The benefit, normally, is that the script will run quicker, since you'll have fewer passes over the image when searching for a face. 1.05 is a good value.
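The pyramid of search sizes can be sketched in plain Python. The helper below (a hypothetical name, not part of OpenCV) lists the sizes at which one image dimension gets searched, which shows why 1.05 gives many more passes, and therefore more chances to hit a face, than a large factor like 1.5:

```python
def pyramid_scales(image_size, window_size, scale_factor=1.05):
    # Each pass shrinks the image by scale_factor; detection stops once the
    # image is smaller than the (fixed-size) detection window.
    sizes = []
    size = float(image_size)
    while size >= window_size:
        sizes.append(int(size))
        size /= scale_factor
    return sizes
```

For a 100-pixel dimension and a 50-pixel window, a factor of 1.05 yields 15 passes while 1.5 yields only 2, which is exactly the accuracy-versus-speed trade-off described above.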
Okay, and then you have another parameter, called minNeighbors, and that is usually set to 5. What this basically does is tell Python how many neighboring detections to require around each candidate window. You may want to experiment with these numbers a little bit and see which get the better results, but these two are well-accepted values. So let's do something now. Let's print out faces and see what this is about, what kind of object it is. I could also print the type of faces, just like that. I'll run the script now, and what the script will do is read the XML file, load the image, make the grayscale version of the image, then detect the coordinates of the upper-left corner of the face in the image and the width and the height of the rectangle defining the face, and then print the type of faces and the actual faces object. Okay, press any key to exit this.
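Put together, the call described above looks like this. The wrapper function is just a sketch so the two tuning parameters sit in one visible place:

```python
def detect_faces(face_cascade, gray_image, scale_factor=1.05, min_neighbors=5):
    # scaleFactor=1.05: shrink the image 5% per pass (more passes, more accuracy).
    # minNeighbors=5: a candidate window needs ~5 overlapping neighbouring
    # detections before it is accepted as a face, filtering out stray hits.
    return face_cascade.detectMultiScale(gray_image,
                                         scaleFactor=scale_factor,
                                         minNeighbors=min_neighbors)
```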
And so faces is a NumPy array, an n-dimensional array object, and it is an array with four values. So we have detected our face, and these are the values defining the face in the image. What we have here is 155, which is the 155th column, so this is the x; the rectangle should start somewhere here, on the forehead. And this should be 83, so row 83, column 155. Then we have the width, which is 382, and the height, which is the same, and so we have a rectangle around the face. Now let's go ahead and draw that rectangle on the face, on the image. We created the faces array, and what we want to do next is access all of the values of this array. To do that, we can use a for loop: for x, y, width and height in faces. Then image: we are updating the image object that we loaded here. So we will update it by drawing a rectangle on the image.
Again, you need to pass the image object in here, and then you need to pass four more arguments. The first argument is the starting point of the rectangle, so that would be x and y, the coordinates from the faces array. Great, and then the next parameter is another tuple defining the coordinates of the other corner of the rectangle. We've got the top-left corner here, and we've got the bottom-right corner down here, and that would be x plus the width and y plus the height. That's it. Yet another parameter, and this is the color that you want to give to this rectangle, and that comes in BGR format. So you pass a value for the blue color, let's say 0 for blue, and let's have a green rectangle, so 255 for green, a full green there, and then 0 for red. You can also pass one more parameter, which is the thickness of the rectangle border, let's say 3.
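The corner arithmetic is worth seeing on its own. Given the (x, y, width, height) rows from faces, the two points passed to cv2.rectangle come out as follows (face_corners is a hypothetical helper, used only for illustration):

```python
def face_corners(faces):
    # cv2.rectangle wants the top-left and bottom-right points:
    # top-left is (x, y); bottom-right is (x + width, y + height).
    return [((x, y), (x + w, y + h)) for (x, y, w, h) in faces]
```

For the face detected earlier, (155, 83, 382, 382), the rectangle runs from (155, 83) to (537, 465).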
Okay, and once you've done that, once you have updated your image, you may want to show the image window on the screen. So we use the cv2.imshow method, but here we have gray_image, and we want to pass the updated image object instead, and we should be good to go. Save the script and try to run it... and my system is not able to find face_detector.py. Yeah, I've messed up something with the name here, so let me fix the '.py', Enter; nothing to do with the script itself. Let me try again. Great. So I hope this is what you were expecting: the rectangle starts here, it has a width and a height, and it ends up here. Sometimes, though, you may have images that are bigger than your screen resolution. In that case, your image would not fit on your screen, so what you could do is resize the image before showing it.
So let me create a resized variable here, and that would be equal to cv2.resize, and we pass image, which now contains the rectangle, and you want to resize the image to these two values. So you need to set the resolution of the image here, and as you know, you could just put some fixed values, let's say 500 by 500. But fixed values might stretch out your image, so a better solution is to access the shape of the image, so the resolution of the image. This should be the width of your image; you already know this, I explained it before, so I'm just going through it quickly. To make sure you get a good size, you could divide it by 3, and then again image.shape to get the height, which is the first value of the tuple, divided by 3 again. You also want to convert these to integers, because you might get float numbers, and Python will complain that it cannot resize an image to float dimensions. So that should be okay.
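The width/height juggling here is easy to get backwards: image.shape is (rows, columns, channels), that is (height, width, channels), while cv2.resize expects its target size as (width, height). A small sketch of the computation, with a hypothetical helper name:

```python
def fit_to_screen(shape, factor=3):
    # shape comes from image.shape = (height, width, channels);
    # cv2.resize wants (width, height), and both must be integers.
    height, width = shape[0], shape[1]
    return (int(width / factor), int(height / factor))
```

Dividing both dimensions by the same factor keeps the aspect ratio, so the image shrinks without stretching.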
And then you want to show the resized image. Quickly testing it... and I'm missing a bracket here: all of these should form a tuple, but here I'm not opening it as a tuple. So bracket here, and it closes here. That should work now. Yeah, so this was my picture, but let's try a more challenging picture now, this one here. We've got two faces which are not very clear; this one is not a real frontal face, and this guy here has his eyes closed and his chin is not visible. We've also got these two faces here in the newspapers, which, let me open the original image from here, have a very low resolution, and I think Python will not be able to detect those two faces. So let me check this. Here's our face_detector, and we want to pass the name of the new image file. Save the script and go ahead and run it.
So we've got a back-to-reality moment here. As you see, this time Python and OpenCV were able to detect this face, even though it's not in a very frontal position, but they also "detected" the hand of this guy: Python read it as a face. What you can do in this case is tweak these values here, the scaleFactor and minNeighbors. Something you may also be interested to know: the faces array here has two rows in this case. This is the first face, the coordinates of the first face, and then the second face, which in this case happens to be a hand. Anyway, you get the idea. Probably by using, say, a 1.1 scale factor, you may be able to get rid of that hand. So yeah, that's it. You can also try to detect the face of this guy, but I don't think you'll be able to do it; just so you know, these techniques have limitations. It's a computer, not a human being, so it will always have some downsides in accuracy. I hope you've found this useful, though, and I'll see you in the next lecture.