1 00:00:00,000 --> 00:00:02,548 Great. So let's go ahead and play around 2 00:00:02,549 --> 00:00:05,819 with OpenCV a little bit, and specifically, 3 00:00:05,819 --> 00:00:07,769 what you'll learn in this lecture is 4 00:00:07,799 --> 00:00:10,499 you will learn how to load images in 5 00:00:10,499 --> 00:00:13,259 Python using OpenCV, and you will learn 6 00:00:13,259 --> 00:00:16,199 how to display them, resize them, and then 7 00:00:16,199 --> 00:00:20,121 save the resized images in new image files. 8 00:00:20,541 --> 00:00:22,384 So load, display, resize, and 9 00:00:22,409 --> 00:00:26,789 write images, and I've got a nice image 10 00:00:26,789 --> 00:00:30,299 here of a galaxy. So I'll play around 11 00:00:30,299 --> 00:00:32,969 with this. The first thing you want to 12 00:00:32,969 --> 00:00:36,779 do is import the library, and the second 13 00:00:36,779 --> 00:00:38,669 thing is you want to load the image in 14 00:00:38,669 --> 00:00:42,642 Python. So image would be equal to the 15 00:00:42,667 --> 00:00:48,214 cv2.imread method. So the 16 00:00:48,239 --> 00:00:50,879 method now expects the path to the image 17 00:00:50,879 --> 00:00:53,159 that you want to load in Python, and 18 00:00:53,159 --> 00:00:57,839 that would be galaxy.jpg. So my 19 00:00:57,839 --> 00:01:00,449 script1.py file is in the 20 00:01:00,449 --> 00:01:04,109 same directory as galaxy.jpg, 21 00:01:04,349 --> 00:01:06,569 so I just need to pass the file name 22 00:01:06,594 --> 00:01:10,164 here, and then there is yet another parameter 23 00:01:10,384 --> 00:01:12,754 for imread, for the imread 24 00:01:12,784 --> 00:01:16,984 method, and this parameter takes one of three 25 00:01:17,009 --> 00:01:19,949 values. Now this parameter specifies 26 00:01:19,949 --> 00:01:21,929 how you want to read the image in 27 00:01:21,964 --> 00:01:24,904 Python. 
So do you want to read it as an 28 00:01:24,932 --> 00:01:27,689 RGB image, which means do you want three 29 00:01:27,689 --> 00:01:30,539 bands in your image, so you want a color 30 00:01:30,539 --> 00:01:33,779 image with a red band, a green, and 31 00:01:33,779 --> 00:01:36,209 a blue band, so if you want to read the 32 00:01:36,209 --> 00:01:38,999 image as it is, so with colors, you'd 33 00:01:38,999 --> 00:01:40,197 want to pass 1 here. 34 00:01:40,222 --> 00:01:42,227 [No audio] 35 00:01:42,251 --> 00:01:45,659 If you want to read the image as a black and white 36 00:01:45,659 --> 00:01:48,479 image, in grayscale, you'd want to pass 37 00:01:48,479 --> 00:01:51,449 0, and having a grayscale image 38 00:01:51,449 --> 00:01:53,849 implies that your image will have one 39 00:01:53,849 --> 00:01:56,159 band, and I'll get back to bands and 40 00:01:56,159 --> 00:01:59,429 explain them in just a moment. And we 41 00:01:59,429 --> 00:02:01,649 also have -1, and this means the 42 00:02:01,649 --> 00:02:05,369 color image, but you also have an alpha 43 00:02:05,369 --> 00:02:07,769 channel, which means your image will 44 00:02:07,769 --> 00:02:10,709 have transparency capabilities. So if 45 00:02:10,709 --> 00:02:12,629 you apply operations that require 46 00:02:12,629 --> 00:02:16,319 transparency, you can do that when you 47 00:02:16,319 --> 00:02:18,119 read, when you load the image with a 48 00:02:18,119 --> 00:02:20,699 -1 argument here. Okay, so 49 00:02:20,699 --> 00:02:22,729 I would like to try out 0. 50 00:02:22,753 --> 00:02:24,924 [No audio] 51 00:02:24,949 --> 00:02:27,539 Great. Now, before I show the image, before I 52 00:02:27,539 --> 00:02:29,669 display the image on the screen, I'd 53 00:02:29,669 --> 00:02:33,569 like you to understand what this image 54 00:02:33,599 --> 00:02:37,618 object is about. 
So I'd want to print the type of it 55 00:02:37,642 --> 00:02:39,777 [No audio] 56 00:02:39,802 --> 00:02:40,811 just like that, 57 00:02:40,836 --> 00:02:45,232 [No audio] 58 00:02:45,316 --> 00:02:49,355 execute the script, script1. Yeah, 59 00:02:50,031 --> 00:02:52,899 close and then try again. 60 00:02:52,924 --> 00:02:54,934 [No audio] 61 00:02:54,959 --> 00:02:58,732 So this is a NumPy n-dimensional array, 62 00:02:58,756 --> 00:03:00,814 [No audio] 63 00:03:00,839 --> 00:03:02,688 and if you want, you can print that out, 64 00:03:02,712 --> 00:03:06,868 [No audio] 65 00:03:06,893 --> 00:03:09,103 and you'll see the actual NumPy array. 66 00:03:09,127 --> 00:03:11,298 [No audio] 67 00:03:11,323 --> 00:03:15,749 So this is a 2-dimensional array with values along the 68 00:03:15,790 --> 00:03:18,659 horizontal axis, and the vertical axis 69 00:03:18,659 --> 00:03:21,809 as well. So think of the image now, and 70 00:03:21,839 --> 00:03:24,959 this would be the very first value, so the 71 00:03:24,959 --> 00:03:28,019 intensity value of the top left pixel. 
72 00:03:28,289 --> 00:03:31,199 So 14, for example, would be the 73 00:03:31,199 --> 00:03:34,079 intensity value in grayscale for 74 00:03:34,079 --> 00:03:36,539 the first pixel of the image, so for the 75 00:03:36,569 --> 00:03:39,929 top left pixel of the image, and then 76 00:03:39,929 --> 00:03:42,509 for the second pixel, and so on, and 77 00:03:42,509 --> 00:03:44,129 then you have these dots, which means 78 00:03:44,129 --> 00:03:46,679 Python does not display the whole list, the 79 00:03:46,679 --> 00:03:48,749 long list in here, because you have 80 00:03:48,749 --> 00:03:51,179 nearly a thousand values in the first 81 00:03:51,179 --> 00:03:52,949 row, and then you have the second row of 82 00:03:52,949 --> 00:03:55,799 pixels in the image, and the third, and so 83 00:03:55,799 --> 00:03:58,829 on, and this makes the matrix of the 84 00:03:58,829 --> 00:04:00,929 pixels of the image, and if you want to 85 00:04:00,929 --> 00:04:03,179 know how many numbers, how many values 86 00:04:03,179 --> 00:04:05,009 you have in the horizontal direction, 87 00:04:05,009 --> 00:04:06,779 how many values you have in the vertical 88 00:04:06,804 --> 00:04:09,504 direction, you can go ahead and print 89 00:04:10,114 --> 00:04:15,844 image.shape. Okay. So this says that the 90 00:04:15,869 --> 00:04:20,855 image resolution is 1485 by 990. 91 00:04:22,322 --> 00:04:26,669 So Python stores the image as a NumPy array, 92 00:04:27,599 --> 00:04:30,359 as a matrix of numbers. As easy as that. 93 00:04:31,084 --> 00:04:33,574 If you want to check the number of dimensions of 94 00:04:33,599 --> 00:04:37,319 your array, of your image, you can do that 95 00:04:37,319 --> 00:04:40,619 with this expression, and you see that 96 00:04:40,619 --> 00:04:43,679 you have two dimensions. 
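The inspection steps above can be reproduced without the galaxy file. A minimal sketch, assuming a blank NumPy array of the same size stands in for the loaded grayscale image; the transcript does not name the expression used to count dimensions, so `ndim` here is an assumption.

```python
import numpy as np

# Stand-in for the grayscale image: 1485 rows (height) by 990 columns (width).
img = np.zeros((1485, 990), dtype=np.uint8)
img[0, 0] = 14  # intensity of the top-left pixel, as in the lecture

print(type(img))  # the NumPy n-dimensional array type
print(img)        # long rows and columns are abbreviated with "..."
print(img.shape)  # rows first, then columns
print(img.ndim)   # number of dimensions (assumed expression)
```

The shape tuple lists the vertical count before the horizontal one, which matters later when a size is passed to the resize call.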
Now, if this 97 00:04:43,679 --> 00:04:47,549 was a color image, so with three bands, 98 00:04:47,699 --> 00:04:49,196 red, green, and blue, 99 00:04:49,220 --> 00:04:51,491 [No audio] 100 00:04:51,516 --> 00:04:52,872 things would change a little bit. 101 00:04:53,272 --> 00:04:55,732 So you've got 3 dimensions, 102 00:04:55,756 --> 00:04:57,987 [No audio] 103 00:04:58,012 --> 00:04:59,669 and you also see that the 104 00:04:59,669 --> 00:05:03,299 new array is a bit different. So here 105 00:05:03,299 --> 00:05:05,819 you've got values for each of the bands. 106 00:05:05,819 --> 00:05:08,519 So for red, green, and blue. 107 00:05:10,079 --> 00:05:13,199 So I'd like to stick with a gray image. 108 00:05:13,889 --> 00:05:16,589 And what I can do now is, I can display 109 00:05:16,589 --> 00:05:19,619 the image on the screen, and for that, you 110 00:05:19,619 --> 00:05:22,289 want to use the imshow 111 00:05:22,319 --> 00:05:25,589 method, and this will display a window 112 00:05:25,589 --> 00:05:27,449 and you want to name that window. So you 113 00:05:27,449 --> 00:05:29,279 want to put a title for that window, 114 00:05:29,999 --> 00:05:31,121 let's say Galaxy. 115 00:05:31,145 --> 00:05:33,574 [No audio] 116 00:05:33,599 --> 00:05:35,189 And what you pass here is 117 00:05:35,189 --> 00:05:39,449 the image object. So this one. Great, 118 00:05:39,474 --> 00:05:42,388 and then what you want to do is 119 00:05:42,412 --> 00:05:44,802 [No audio] 120 00:05:44,827 --> 00:05:48,449 you want to specify a time for the 121 00:05:48,449 --> 00:05:51,959 window to be closed. Because this will 122 00:05:51,959 --> 00:05:54,239 show the window, but you also want to 123 00:05:54,239 --> 00:05:56,759 define some functionality, so that the 124 00:05:56,759 --> 00:05:59,849 user can close the window. If you put 0 125 00:05:59,849 --> 00:06:02,129 here, then when the user presses any 126 00:06:02,129 --> 00:06:05,369 key, the window will close. 
Let me 127 00:06:05,369 --> 00:06:08,189 change this to cv2. So if you put 128 00:06:08,189 --> 00:06:10,289 0 here, the user can close the window by 129 00:06:10,289 --> 00:06:13,679 pressing any key. If you want to put 130 00:06:13,709 --> 00:06:16,859 a time, you could say 2000, and that 131 00:06:16,859 --> 00:06:21,449 simply is 2000 milliseconds. So that 132 00:06:21,449 --> 00:06:24,299 means 2 seconds. So we set how the 133 00:06:24,299 --> 00:06:26,729 user closes the window, and then 134 00:06:26,729 --> 00:06:29,849 you want to specify what to do when the 135 00:06:29,849 --> 00:06:33,119 user presses a key or waits for 2 136 00:06:33,119 --> 00:06:35,459 seconds. So you want to destroy all 137 00:06:35,489 --> 00:06:39,509 windows. That's the method that closes 138 00:06:39,509 --> 00:06:43,889 the window. Good. Let's see what 139 00:06:43,889 --> 00:06:48,188 will happen. Okay, so we got the image displayed, 140 00:06:48,212 --> 00:06:51,020 and it waited for 2 seconds, and then it closed. 141 00:06:51,044 --> 00:06:53,480 [No audio] 142 00:06:53,505 --> 00:06:55,439 If you put it at 0, 143 00:06:55,463 --> 00:06:57,484 [No audio] 144 00:06:57,509 --> 00:07:01,049 the image will stay there, and if you 145 00:07:01,049 --> 00:07:04,739 press a key, it will close, and let me 146 00:07:04,739 --> 00:07:07,319 show the image again. Now, the reason 147 00:07:07,319 --> 00:07:10,199 that you don't see the image fit on my 148 00:07:10,199 --> 00:07:13,559 screen is that the image, as you see 149 00:07:13,559 --> 00:07:18,749 here in these values, is 1485 pixels 150 00:07:18,779 --> 00:07:20,729 high. So this is the 151 00:07:20,729 --> 00:07:24,869 height, and it's 990 pixels wide. 152 00:07:25,504 --> 00:07:27,304 And my screen resolution is set 153 00:07:27,329 --> 00:07:33,839 at 1280 by 720. 
So that means this image 154 00:07:33,839 --> 00:07:35,789 with this size does not fit on my 155 00:07:35,789 --> 00:07:38,055 screen, because my screen is too small for this. 156 00:07:38,080 --> 00:07:40,480 [No audio] 157 00:07:40,505 --> 00:07:42,288 So in this case, let me close this. 158 00:07:43,672 --> 00:07:45,604 What you want to do is 159 00:07:45,629 --> 00:07:48,239 resize the image, and then show the 160 00:07:48,239 --> 00:07:51,269 resized image. So we load the image, 161 00:07:51,299 --> 00:07:53,399 and before showing it, before passing it 162 00:07:53,431 --> 00:07:56,851 to the imshow method, you want to say 163 00:07:57,831 --> 00:07:59,526 let's say resized_image, 164 00:07:59,558 --> 00:08:01,558 [No audio] 165 00:08:01,574 --> 00:08:03,509 and that will be equal 166 00:08:03,509 --> 00:08:09,089 to cv2.resize, and this would get 167 00:08:09,119 --> 00:08:11,339 two parameters. So the first, of 168 00:08:11,339 --> 00:08:13,979 course, is the image object that you want 169 00:08:13,979 --> 00:08:17,969 to resize. So img is our variable, and 170 00:08:17,969 --> 00:08:20,069 then you want to specify a tuple with the 171 00:08:20,069 --> 00:08:25,049 new dimensions, I'll say 1000 by 500. 172 00:08:26,434 --> 00:08:29,552 And then you want to pass the new image here. 173 00:08:29,576 --> 00:08:33,538 [No audio] 174 00:08:33,563 --> 00:08:35,159 So what's happening here is that 175 00:08:35,969 --> 00:08:38,489 Python is actually resizing the NumPy 176 00:08:38,489 --> 00:08:41,429 array. So it will take the array with 177 00:08:41,429 --> 00:08:44,939 this number of pixels, number of values, 178 00:08:45,299 --> 00:08:47,609 and it will create an array with these 179 00:08:47,669 --> 00:08:50,399 new dimensions. So what will happen 180 00:08:50,399 --> 00:08:52,409 there is that Python will interpolate 181 00:08:52,409 --> 00:08:55,019 those values. So it has quite a lot of 182 00:08:55,019 --> 00:08:57,959 values here. 
But then it goes from this 183 00:08:58,859 --> 00:09:02,819 to this. So let's say it 184 00:09:02,819 --> 00:09:06,179 has 4 for one value and 6 185 00:09:06,179 --> 00:09:09,689 for the neighboring value, and what Python 186 00:09:09,689 --> 00:09:13,636 will do is that it will take the 4 and the 6 187 00:09:13,661 --> 00:09:16,379 and it will just make one 188 00:09:16,379 --> 00:09:18,419 value out of them, so let's say 5. 189 00:09:19,789 --> 00:09:22,220 That's basically the idea. So it interpolates 190 00:09:22,245 --> 00:09:24,449 the values and then it shows 191 00:09:24,449 --> 00:09:26,369 the interpolated image on the screen, 192 00:09:27,064 --> 00:09:29,494 which looks pretty nice to our 193 00:09:29,519 --> 00:09:33,319 eyes. Okay, let's see this, 194 00:09:33,343 --> 00:09:36,519 [No audio] 195 00:09:36,544 --> 00:09:39,989 and this is the image, and here you see that this is 196 00:09:39,989 --> 00:09:42,929 stretched quite a bit. So it was a 197 00:09:42,954 --> 00:09:45,324 tall image but now it's quite wide. 198 00:09:46,420 --> 00:09:50,310 And that's because this is actually the width 199 00:09:50,310 --> 00:09:52,320 of the image and this is the height, so 200 00:09:52,371 --> 00:09:57,501 if you want you can say 500 and 1000, so run it 201 00:09:58,590 --> 00:10:02,670 again, and now you see more or less the true 202 00:10:02,700 --> 00:10:06,270 ratio of the image. But if you want 203 00:10:06,270 --> 00:10:09,480 to keep the ratio of the image, you'd 204 00:10:09,480 --> 00:10:13,200 want to go more advanced here. So let's 205 00:10:13,200 --> 00:10:16,980 say from this, so let's say we would 206 00:10:16,980 --> 00:10:19,650 want to show a size that is half of 207 00:10:19,650 --> 00:10:22,920 this. So that would keep the ratio 208 00:10:22,920 --> 00:10:25,320 of the image. 
So what we can do is we 209 00:10:25,320 --> 00:10:28,740 need access to these values, and we can 210 00:10:28,740 --> 00:10:31,230 grab those values from the shape 211 00:10:31,230 --> 00:10:33,450 attribute. So this produces a tuple with 212 00:10:33,450 --> 00:10:36,510 these two values, and then we go here 213 00:10:36,510 --> 00:10:41,622 and say, image.shape, 214 00:10:41,663 --> 00:10:43,663 [No audio] 215 00:10:43,688 --> 00:10:45,360 and this would be 216 00:10:45,385 --> 00:10:49,763 this value, so 990, which has an index of 1, 217 00:10:51,522 --> 00:10:54,035 and then we have again image.shape, 218 00:10:55,935 --> 00:10:58,021 index of 0 for this number, 219 00:10:58,045 --> 00:11:00,045 [No audio] 220 00:11:01,408 --> 00:11:04,221 and then we want to divide this by 2. 221 00:11:04,245 --> 00:11:06,398 [No audio] 222 00:11:06,423 --> 00:11:08,700 Okay, and I expect to get an 223 00:11:08,700 --> 00:11:12,000 error from this. Let's see. So 224 00:11:12,030 --> 00:11:14,655 yeah, we've got an error. It says TypeError: 225 00:11:14,679 --> 00:11:16,890 integer argument expected, got 226 00:11:16,890 --> 00:11:20,580 float. What we're doing here 227 00:11:20,580 --> 00:11:24,570 is that when we divide this number by 228 00:11:24,570 --> 00:11:27,000 2, we get a float. So we would 229 00:11:27,000 --> 00:11:31,288 get something like 742.5. 230 00:11:31,312 --> 00:11:34,242 [No audio] 231 00:11:34,267 --> 00:11:36,821 In that case, what we want to do is 232 00:11:36,845 --> 00:11:38,321 convert this to an integer. 233 00:11:38,345 --> 00:11:41,027 [No audio] 234 00:11:41,078 --> 00:11:46,916 So 742.5 becomes 742. 235 00:11:47,945 --> 00:11:51,155 Okay, and let's keep the consistency, so 236 00:11:51,180 --> 00:11:53,455 integers for these as well. 237 00:11:53,479 --> 00:11:57,087 [No audio] 238 00:11:57,112 --> 00:11:58,318 And let's see. 239 00:11:59,498 --> 00:12:02,520 And we've got an invalid syntax error here. 
So it 240 00:12:02,520 --> 00:12:06,330 points us to line 11, somewhere at the 241 00:12:06,360 --> 00:12:08,250 beginning, which can be quite 242 00:12:08,250 --> 00:12:12,600 misleading. So you want to look before 243 00:12:12,600 --> 00:12:15,360 that line, which is here, and you can 244 00:12:15,360 --> 00:12:17,730 see that this bracket here closes 245 00:12:17,730 --> 00:12:20,910 here, so we need another bracket, which 246 00:12:20,910 --> 00:12:23,070 closes the first bracket here. So 247 00:12:23,070 --> 00:12:24,943 save that, try again. 248 00:12:24,967 --> 00:12:26,967 [No audio] 249 00:12:26,990 --> 00:12:29,255 And this time the image looks good. 250 00:12:30,502 --> 00:12:31,770 So you learned how 251 00:12:31,770 --> 00:12:34,680 to load an image in Python, and you 252 00:12:34,680 --> 00:12:37,170 learned how to resize an image, how to show 253 00:12:37,170 --> 00:12:40,350 an image on the screen, and now let's go 254 00:12:40,350 --> 00:12:43,230 ahead and write the resized_image to 255 00:12:43,230 --> 00:12:46,200 a new file. For that, you'd want to use 256 00:12:46,235 --> 00:12:49,521 the imwrite method. So imwrite, 257 00:12:50,837 --> 00:12:52,260 and you want to give a name to 258 00:12:52,290 --> 00:12:57,763 the new image, let's say, Galaxy_resized.jpg, 259 00:12:57,787 --> 00:12:59,787 [No audio] 260 00:12:59,812 --> 00:13:01,290 and then you pass the image 261 00:13:01,290 --> 00:13:03,630 object that you want to store in this 262 00:13:04,200 --> 00:13:07,650 file. So the comma goes here as well, 263 00:13:08,160 --> 00:13:11,488 and the image you want to store is 264 00:13:11,512 --> 00:13:14,420 resized_image. That's it. 265 00:13:14,444 --> 00:13:16,882 [No audio] 266 00:13:16,907 --> 00:13:19,680 Execute, and we got 267 00:13:19,710 --> 00:13:23,880 the old image resized on the fly. 
So 268 00:13:23,910 --> 00:13:26,550 Python gets a NumPy array, it 269 00:13:26,550 --> 00:13:29,550 interpolates it, so it resizes it and then 270 00:13:29,550 --> 00:13:31,800 it shows it on the screen. So this is 271 00:13:31,800 --> 00:13:35,490 galaxy, this one here, and then we can close 272 00:13:35,490 --> 00:13:39,150 this, and here we've got our new image. 273 00:13:39,390 --> 00:13:40,736 So Galaxy_resized. 274 00:13:40,761 --> 00:13:43,830 [No audio] 275 00:13:43,855 --> 00:13:44,940 So if you go to the 276 00:13:44,940 --> 00:13:48,270 folder where these files are, you'll see 277 00:13:48,270 --> 00:13:51,150 that this image has new dimensions. So 278 00:13:51,150 --> 00:13:56,010 495 by 742, and that's what I wanted to 279 00:13:56,010 --> 00:13:58,230 teach you in this lecture. Hope you 280 00:13:58,230 --> 00:13:59,940 enjoyed it and talk to you later.