Hey, here we are again. In this lecture, we will continue building our real-world motion detection program. Before I go and write the code, I'd first like to explain to you the architecture of the program we're building, so how this program is able to detect motion in the video. I assume you already know how to build the script, since we built it in the previous lectures, but let me quickly recap what it does in case you missed it. The script triggers the video, which as you saw can come from the webcam, and we're processing frames here in this while loop, and so on. You already know this part, so I won't go through it again right now. What we need to do next is process those frames that are being iterated in this while loop, and I've got some pictures here to illustrate my ideas, my concepts. Let me go to the directory where they are located.
And so what this motion detection program will do is trigger the webcam, just like our current script does. One condition for the program to work well is that once you trigger the webcam, the first frame of the video should be the static background. So if you're planning to use this program, let's say you set up a webcam on a laptop or a Raspberry Pi server, and you want to detect the movement of a certain animal in an area. First, you'd want to trigger the camera while the background is static, and then use this background as a base image, so that you can compare the other images against it, and Python can detect whether there is a change between the first frame and the next frames. So that's one thing you need to do. This is an example of a background, and then at some point something, the animal, will appear on your camera.
Okay, and then what you'll have to write in your script is this: first, we want to gray out the images, both the background image and the current frame of the camera. So we'll store the first frame of the video capture in a variable and convert that frame to a grayscale image, and then the while loop will go through the current frames. You do the same for the current frame, converting it to grayscale as well. Then, for these two grayscale images of the current loop iteration, you'll compute the difference between them. So this is the difference: an example of a difference frame, or a delta frame, if you can call it that. In this particular image, you'll notice that behind me there is the lamp of the room. Normally, you wouldn't be able to see the lamp, but Python is actually computing the difference between the frame where I appear and the background frame where the lamp is visible. So it comes up with this gray image where each pixel has a certain intensity value. All right.
So that means, for instance, that the high-intensity values, like this one where I am, indicate there is potential motion in this area here, while the black areas imply there is no motion. But you also see some light pixels here, because when I appear on the camera, there is a shadow behind me, and so on. Later on, we will apply something else, called a threshold. We'll basically say that if you see a difference of more than, say, 100 in intensity in the delta frame, in the frame I just showed you, convert those pixels to white pixels, and for pixels that are below the threshold, convert them to black. That way, you come up with the outline of the object that is moving in the camera, or in the frame. We are doing all of this processing inside the while loop, and once we have calculated the threshold frame inside the loop, what we'll do with the current frame is find the contours of the white objects in the frame. Okay.
So for this particular image, we would have contours around this object here, and contours around this one and around this one as well. Then we will write a for loop that iterates through all the contours of the current frame: it will go to this contour, then to this one, this one, and this one. Inside that loop, we'll check the area of each contour. Say this one, for example, has an area of 500 pixels. If the area of the contour is more than 500 pixels, for example, then we consider it a moving object. If the area is less than 500, like this one here, which is probably 20 pixels, it will not be considered a moving object. Okay, I hope that makes sense. What we'll do next is draw a rectangle around the contours that are greater than the minimum area, and then show those rectangles on the original image, so on the color version of the current frame. That means we will see a rectangle in the video.
While the video is playing, we'll see a rectangle around the object. Later on, we will detect the times that the moving object entered the video frame and the times that it exited. But for now, let's simply focus on detecting the moving object in the video. Okay, let's go back to the script and close this, and this, and this. And let's remove some unnecessary lines here. We had this a variable, which we created because we wanted to see how many frames we had in the video. We don't need that anymore, and we don't need the script to stop for three seconds, so I'll remove the time.sleep call and this line as well. Okay, and now this first part is the trickiest one: we need to figure out a way to store the very first frame of the video. As soon as the video starts, we want to store that NumPy array in a variable and keep that variable static, so we don't want the value of that variable to change while the while loop runs in the script.
And the way to do that is to first create a variable for the frame, for the first_frame, and assign it a None value. None is a special Python value that allows you to create a variable with nothing assigned to it, but the variable is still there, so if you reference this variable later, Python will not say the variable is not defined. Okay, if you don't understand it yet, please hold on and you'll get it in just a while. What we need to do now is write a conditional and apply a continue statement. Let me write the conditional first, and then I'll explain to you what it does. So, we want to check if first_frame is None, and if it is None, we will assign the gray frame to first_frame.
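The None-and-continue pattern being described works like this in plain Python. The list of arrays below is only a stand-in for the live capture loop, so this is a sketch of the pattern, not the script itself:

```python
import numpy as np

# Illustrative stand-in for the capture loop: a list of "frames"
# instead of a live cv2.VideoCapture object.
frames = [np.zeros((4, 4), np.uint8),
          np.ones((4, 4), np.uint8),
          np.full((4, 4), 2, np.uint8)]

first_frame = None  # None marks "background not captured yet"
processed = []

for gray in frames:
    if first_frame is None:
        # First iteration only: remember the background frame...
        first_frame = gray
        # ...and skip the rest of the loop body.
        continue
    # From the second iteration on, compare against first_frame.
    processed.append(int(gray.max()) - int(first_frame.max()))
```

After the loop, first_frame still holds the very first array, and only the later frames were compared against it.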
So, what this does is: the script will run, the video will be triggered, and then the while loop will start to run. It will get the first frame of the video and store it in this frame variable, and this frame variable will be converted into a gray frame. Then we ask if first_frame is None, which is true in the first iteration of the loop; first_frame is actually None because we assigned None up here. So we assign the gray NumPy array to first_frame, and first_frame gets the grayscale image that represents the very first frame of the video. This happens in the very first iteration of the loop. Okay. But then what would happen is that Python would execute these other lines of code and go to the second iteration, where it will grab the second frame of the video. Let's say the first frame was a background image, and then suddenly an object appears in front of the camera; Python will grab the NumPy array that contains that object.
So the second frame will be converted to gray as well, and here we say: if first_frame is None, the first_frame variable gets the first frame of the video. But once we have grabbed the first frame, we don't want these other lines of code to be executed, because down here we'll apply the difference between frames, blur the frames, and so on. We don't want those to run yet. Instead, we want Python to go back to the beginning of the loop and continue with the second frame. To do that, we need to write continue here. This means: continue to the beginning of the loop, and don't go on and run the rest of the code.
Okay, so in the first iteration, first_frame is None, and first_frame gets the value of the first image of the video, and then the loop goes to the next iteration. In the next iteration, the loop grabs the second frame of the video, calculates the gray version of that frame, and then reaches the conditional again. In this case, is first_frame None? No, it's not, because first_frame got the value of the gray image in the first iteration of the while loop. So these lines here will not be executed in the second iteration of the loop. Okay. Great. That means we can now compute a delta frame: we can calculate the difference between the first frame and the current frame of the video. The first frame is the first_frame variable and the current frame is the gray variable. But before that, we'd like to do something to the current frame of the image.
We want to apply a GaussianBlur to the image. The reason we want to apply GaussianBlur is that we want to blur the image, to smooth it, because that removes noise and increases accuracy in the calculation of the difference. This method takes as its first parameter the image you want to blur, so we are passing the gray image here, and we are storing the blurred version of the image back in the gray variable. Then we have another parameter, which comes as a tuple: here we need to pass the width and the height of the Gaussian kernel, which essentially control the amount of blurriness; 21 and 21 are commonly accepted numbers. You also need one last parameter, which is the standard deviation, and I'll pass 0, which is also commonly used. If you want to learn more about these, you can go through the documentation, but these values should be good.
So we're making the gray image blurry here, and then down here we need to compare the first frame of the video, so the background, with the current frame. Let's call the result delta_frame, and that will be equal to cv2.absdiff, the absolute difference between first_frame and the current frame, which is gray. Note that first_frame will also be a gray version, a blurred gray version actually, so we are comparing two blurred grayscale images here. Okay. What this gives us is another image, and actually, I'd like to show that image on the screen with cv2.imshow, passing delta_frame. Let's see what we get out of this. Before running the script, I'll disappear from the view first and then reappear. Okay, let's see. Okay, nothing happened, because I forgot to enter the name of the window. Let's call this one Delta Frame, and let's say Gray Frame for the other. And let me run it again.
And here I am. So this is the blurred grayscale version, this one here, and we have the difference. Okay, so you can see the lamp behind me in the Delta Frame. Great. I'll press the Q key and quit the video. Now, if I want to print the delta frame, just to check, that will allow us to see the differences between the intensities of the corresponding pixels. So let me see. And if I quit this now, here we go. What we have here: 5 means there is almost no difference, so in this area there's probably no motion. But then you have 174, which is quite high, so that means Python will classify this as motion. Okay, that was just to show you the values of the delta_frame. What we need now is to classify these values, so let's say we want to assign a threshold.
So let's say that if there are values of more than 30, so if the difference between the first frame and the current frame is more than 30, we will classify that as white. We'll say there's probably motion in those pixels, so there's an object in those pixels. And if the difference is less than 30, we'll assign it a black pixel. Okay, so we can do that using the threshold method of the cv2 library. Let's say thresh_delta equals cv2.threshold, and what this expects is the image that you want to threshold, and then you specify the threshold limit, which we said is 30. Then, what color do you want to assign to the values that are more than 30? Well, we want to assign white, which corresponds to the value 255. Okay, and you also need one more argument here, which is the threshold method.
There are quite a few methods out there, but we are using this method here, THRESH_BINARY. You can experiment with others if you like. Great, and now let's see what this thresh_delta frame will look like: cv2.imshow, Threshold Frame, thresh_delta. Great. Let's see. Apparently, I wrote this the wrong way: it's delta_frame, not frame_delta. And one more typo in line 19, which is here: cv2. Okay, let's hope it works this time. And yet another error; this time, I've missed something here. Okay, this method here, the threshold method, actually returns a tuple with two values, and the first value is needed when you use other threshold methods: the first item of the tuple basically suggests a value for the threshold if you're using certain other methods. But for THRESH_BINARY, you only need to access the second item of the tuple, which is the actual frame returned from the threshold method. So you want to access the second item of the tuple. Okay, I promise this is the last error.
Great, finally, and this is the threshold frame. So you can see my outline there, but you also see some white areas here because of the shadows. So I am being detected as an object, but my shadows are being detected as objects too. Okay, so you get the idea. Now, we could go right away and use this thresh_delta frame, which I called thresh_delta; I should call it thresh_frame, you know, just for naming consistency. Then we can go ahead and find the contours of the white objects in thresh_frame. But before that, I'd like to do something else: I'd like to dilate those areas, so I want to remove the black holes from those big white areas in the image. Basically, I want to smooth my threshold frame, and to do that, you need to use the dilate method of the cv2 library. So let's say we want to reassign thresh_frame to cv2.dilate, and again, you want to pass the thresh_frame there.
Now, if you had a kernel array and you wanted this process to be very sophisticated, you'd pass that array in here. We don't have one and we don't need one, so you pass None for this parameter. And there's yet another parameter here, iterations; let's say 2. This defines how many times you want to go over the image to remove those holes: normally, the bigger this is, the smoother the image will be. Okay, let me check it quickly. Oh yes, I changed the name earlier, but I didn't change it here in the imshow call: thresh_frame, okay. So you can probably notice that the white areas are smoother now. If I go away, you'll only notice a few areas that are there because of my shadow. So it seems to be working so far. Let's quit it now. Great. So we've got these three frames. And what's next? Well, next we need to find the contours of this dilated threshold frame.
Regarding contour detection with OpenCV, you have two methods: findContours and drawContours. With the findContours method, you find the contours in your image and store them in a tuple. The drawContours method, on the other hand, draws contours on an image. In this case, what we want to do is find the contours, and then check the area of each contour. Let's say you have a contour like a circle; you want to find the area that this contour encloses. So you want to store those contours in a tuple. You write cnts, a comma, and an underscore, and that will be equal to cv2.findContours. Then you pass the frame that you want to find the contours for, and it's good to actually use a copy of the frame, because you don't want to modify the threshold frame.
So use copy here, 466 00:22:40,031 --> 00:22:42,210 and so this is the first parameter, 467 00:22:42,210 --> 00:22:44,721 the frame you want to find the contours from, 468 00:22:44,745 --> 00:22:46,745 [No audio] 469 00:22:46,774 --> 00:22:48,240 and then you have the mode, 470 00:22:49,242 --> 00:22:54,903 RETR_EXTERNAL, so you want to 471 00:22:55,080 --> 00:22:57,690 retrieve the external contours of the objects 472 00:22:57,690 --> 00:23:00,330 that you'll be finding in the image. 473 00:23:00,690 --> 00:23:03,150 And you've got yet another argument, 474 00:23:03,785 --> 00:23:09,125 CHAIN_APPROX_SIMPLE, and this is the 475 00:23:09,150 --> 00:23:13,110 approximation method that OpenCV will 476 00:23:13,230 --> 00:23:15,887 apply for retrieving the contours. 477 00:23:17,160 --> 00:23:20,400 Great. So what we have is we are 478 00:23:20,400 --> 00:23:23,100 iterating through the current frame, so 479 00:23:23,100 --> 00:23:25,410 we are blurring it, converting it to 480 00:23:25,410 --> 00:23:29,910 grayscale, finding the delta_frame, and 481 00:23:30,089 --> 00:23:32,654 applying the threshold, so the black and white 482 00:23:32,679 --> 00:23:35,010 image, and then we find 483 00:23:35,045 --> 00:23:38,315 all the contours of the 484 00:23:38,340 --> 00:23:41,280 distinct objects in this image. So if 485 00:23:41,280 --> 00:23:43,950 you've got two white continuous areas in 486 00:23:43,950 --> 00:23:46,621 your image, but they are distinct, you'll get 487 00:23:46,645 --> 00:23:48,360 two contours, one contour for 488 00:23:48,360 --> 00:23:50,910 each of the areas, and these contours will be 489 00:23:50,910 --> 00:23:55,890 stored in this cnts variable. So we're 490 00:23:55,890 --> 00:23:57,990 talking about the current frame, and next 491 00:23:57,990 --> 00:24:00,450 what we want to do is we want to filter 492 00:24:00,450 --> 00:24:03,000 out these contours.
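The contour-finding call might look like the sketch below (with the correct spelling "findContours"). The synthetic two-blob frame is invented for illustration. One hedge: OpenCV 2.x and 4.x return two values from findContours, while 3.x returns three, so the two-value unpacking used in the lecture assumes one of the former:

```python
import cv2
import numpy as np

# Synthetic dilated threshold frame with two distinct white areas.
dilated_frame = np.zeros((60, 60), dtype=np.uint8)
dilated_frame[5:20, 5:20] = 255     # first white area
dilated_frame[35:55, 35:55] = 255   # second, separate white area

# Work on a copy so the threshold frame itself is not modified;
# RETR_EXTERNAL retrieves only the outer contours of each object,
# and CHAIN_APPROX_SIMPLE compresses the contour points.
(cnts, _) = cv2.findContours(dilated_frame.copy(),
                             cv2.RETR_EXTERNAL,
                             cv2.CHAIN_APPROX_SIMPLE)

print(len(cnts))  # two distinct white areas -> two contours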
So we want to keep 493 00:24:03,000 --> 00:24:05,070 only the contours 494 00:24:05,070 --> 00:24:08,070 that have an area 495 00:24:08,070 --> 00:24:10,410 bigger than, let's say, 1000 pixels. 496 00:24:10,775 --> 00:24:13,775 For that you need to iterate, let's say 497 00:24:13,800 --> 00:24:17,842 for countour in cnts, 498 00:24:17,867 --> 00:24:19,890 [No audio] 499 00:24:19,915 --> 00:24:21,388 and if 500 00:24:22,168 --> 00:24:27,721 cv2.countourArea 501 00:24:28,293 --> 00:24:30,810 of the contour, so the 502 00:24:30,810 --> 00:24:33,720 contour that we are iterating through, so 503 00:24:33,745 --> 00:24:36,423 if this is less than 1000, 504 00:24:36,452 --> 00:24:39,123 [No audio] 505 00:24:39,148 --> 00:24:43,440 continue to the beginning of the for loop again. So 506 00:24:43,470 --> 00:24:46,050 what this means is, let's say Python 507 00:24:46,050 --> 00:24:48,330 found three contours. It will go 508 00:24:48,330 --> 00:24:50,340 through the first one and it'll say: if 509 00:24:50,340 --> 00:24:52,470 the area of this contour, so we use the 510 00:24:52,500 --> 00:24:54,540 countourArea method of the cv2 library, if 511 00:24:54,540 --> 00:24:59,100 the area is less than 1000 pixels, go 512 00:24:59,100 --> 00:25:01,200 to the next contour, so go to the 513 00:25:01,200 --> 00:25:03,180 second contour and check again and again 514 00:25:03,180 --> 00:25:06,390 and again. Otherwise, if the area is 515 00:25:06,390 --> 00:25:09,870 bigger than or equal to 1000, the next 516 00:25:09,870 --> 00:25:12,421 lines here inside the for loop will be executed. 517 00:25:14,056 --> 00:25:16,535 So what do you want to do if a 518 00:25:16,560 --> 00:25:19,020 contour is bigger than 1000 pixels? 519 00:25:20,400 --> 00:25:23,190 Well, we want to draw a rectangle 520 00:25:23,220 --> 00:25:26,109 surrounding that contour on the current frame.
521 00:25:26,134 --> 00:25:28,410 [No audio] 522 00:25:28,435 --> 00:25:30,660 So make sure you are inside the 523 00:25:30,660 --> 00:25:34,230 for loop. So we are still iterating, and 524 00:25:34,255 --> 00:25:38,945 [No audio] 525 00:25:38,970 --> 00:25:42,150 so these are the parameters that define 526 00:25:42,150 --> 00:25:44,874 the rectangle, and that will be equal to 527 00:25:45,606 --> 00:25:51,711 cv2.boundRectangle on the current contour. 528 00:25:51,736 --> 00:25:54,047 [No audio] 529 00:25:54,072 --> 00:25:57,990 So if the contour is equal to or 530 00:25:57,990 --> 00:26:01,590 greater than 1000 pixels, so if it 531 00:26:01,590 --> 00:26:03,750 has an area equal to or greater than 532 00:26:03,750 --> 00:26:07,590 1000 pixels, this will be executed. So 533 00:26:07,590 --> 00:26:10,830 we are creating a rectangle, and then we 534 00:26:10,830 --> 00:26:13,380 want to draw that rectangle on our frame, 535 00:26:13,440 --> 00:26:17,321 on our current frame. So cv2.rectangle, 536 00:26:18,380 --> 00:26:20,165 so we already used this 537 00:26:20,190 --> 00:26:22,620 method in our face detection lectures. 538 00:26:23,700 --> 00:26:25,350 And here, we want to pass the 539 00:26:25,350 --> 00:26:30,660 color frame. Okay, so that is the frame, and 540 00:26:31,650 --> 00:26:34,590 you want to specify x and y here. So 541 00:26:34,590 --> 00:26:36,390 these are the coordinates of the upper 542 00:26:36,530 --> 00:26:40,370 left corner of the rectangle, x, y, and 543 00:26:41,742 --> 00:26:43,830 you want to specify the coordinates of 544 00:26:43,830 --> 00:26:46,830 the lower right corner of the rectangle 545 00:26:46,860 --> 00:26:50,955 as well, so x+w and y+h, 546 00:26:51,888 --> 00:26:54,720 just like that, and 547 00:26:54,720 --> 00:26:58,290 also the color of your rectangle, let's 548 00:26:58,315 --> 00:27:02,910 say, green, and the width, let's say, 549 00:27:02,910 --> 00:27:06,180 3.
So what we did in these two lines 550 00:27:06,180 --> 00:27:10,170 is that we created this tuple with these 551 00:27:10,200 --> 00:27:14,310 four variables, and their values 552 00:27:14,310 --> 00:27:17,100 will be assigned automatically. So x and 553 00:27:17,100 --> 00:27:19,890 y will get their values from the 554 00:27:19,920 --> 00:27:23,040 rectangle bounding this 555 00:27:23,040 --> 00:27:25,560 current contour of the for loop. Then 556 00:27:25,560 --> 00:27:27,060 these values will be used to draw a 557 00:27:27,060 --> 00:27:29,100 rectangle in the frame, in the current 558 00:27:29,100 --> 00:27:31,080 frame, and then we want to show that 559 00:27:31,110 --> 00:27:34,260 current frame. So let me add it here, 560 00:27:34,950 --> 00:27:36,255 imshow. 561 00:27:36,279 --> 00:27:38,329 [No audio] 562 00:27:38,354 --> 00:27:40,200 Let's call this Color Frame 563 00:27:41,135 --> 00:27:45,480 and frame. And actually, this method here is 564 00:27:45,690 --> 00:27:50,190 boundingRect. All right, let me try the script now. 565 00:27:50,214 --> 00:27:55,655 [No audio] 566 00:27:55,680 --> 00:28:01,054 And cv2 has no attribute findCountours. 567 00:28:01,596 --> 00:28:04,445 So here I've got a 'u' 568 00:28:04,470 --> 00:28:05,988 that shouldn't be there. 569 00:28:07,508 --> 00:28:08,949 Let's try it again. 570 00:28:08,973 --> 00:28:14,826 [No audio] 571 00:28:14,851 --> 00:28:18,870 And that's funny. I tend to mistype 572 00:28:18,902 --> 00:28:24,320 this word, countourArea, line 24, so 573 00:28:24,695 --> 00:28:26,906 remove the 'u', and 574 00:28:26,930 --> 00:28:28,948 [No audio] 575 00:28:28,973 --> 00:28:31,456 it should be contour here as well. 576 00:28:31,480 --> 00:28:34,477 [No audio] 577 00:28:34,502 --> 00:28:35,539 Okay. 578 00:28:35,563 --> 00:28:42,485 [No audio] 579 00:28:42,525 --> 00:28:44,850 countour is not defined again, I've got 580 00:28:44,850 --> 00:28:50,010 another one here. So bear with me.
Try again. 581 00:28:50,034 --> 00:28:55,445 [No audio] 582 00:28:55,470 --> 00:28:57,570 And yeah, this time it seems to be working. 583 00:28:57,594 --> 00:29:01,775 [No audio] 584 00:29:01,800 --> 00:29:06,690 So no objects, objects, no objects, 585 00:29:07,170 --> 00:29:11,460 objects. Great. So that's what I 586 00:29:11,460 --> 00:29:14,190 wanted to teach you in this lecture, and 587 00:29:14,293 --> 00:29:15,815 we'll continue in the next lectures, 588 00:29:15,840 --> 00:29:17,910 because what we've done so far 589 00:29:17,910 --> 00:29:20,550 is that we can detect an object and we 590 00:29:20,550 --> 00:29:22,590 can draw a rectangle around that object. 591 00:29:22,650 --> 00:29:24,750 But this is not very practical. I mean, 592 00:29:24,750 --> 00:29:27,690 in the real world, it's not enough to just 593 00:29:27,720 --> 00:29:29,910 draw a rectangle around your object, and 594 00:29:29,910 --> 00:29:33,150 then that's it. So what we'll be doing 595 00:29:33,150 --> 00:29:35,430 in the next lectures is we'll be 596 00:29:35,430 --> 00:29:37,980 storing the times when the object enters 597 00:29:37,980 --> 00:29:40,800 the frame and when it exits the 598 00:29:40,800 --> 00:29:43,500 frame. So we've got some more lines to 599 00:29:43,500 --> 00:29:46,950 add to this code, and I know this code 600 00:29:46,950 --> 00:29:49,139 was quite a lot to consume, 601 00:29:49,164 --> 00:29:53,340 but I hope it's clear now, and I'll 602 00:29:53,340 --> 00:29:54,593 see you in the next lecture. 603 00:29:54,617 --> 00:29:56,617 [No audio]