1 00:00:00,000 --> 00:00:02,474 Dimension reduction. 2 00:00:02,475 --> 00:00:03,840 Okay, what is it? 3 00:00:04,600 --> 00:00:13,300 In supervised learning, when using some classification or prediction algorithm, one 4 00:00:13,301 --> 00:00:16,708 big challenge to handle is the number of 5 00:00:16,709 --> 00:00:20,986 input features that the algorithm needs to analyze. 6 00:00:20,987 --> 00:00:24,970 Let's say that those features are dimensions. 7 00:00:24,971 --> 00:00:29,356 Now think about a high resolution image that 8 00:00:29,357 --> 00:00:35,770 is a piece of data with a million pixels. 9 00:00:35,771 --> 00:00:41,110 Each pixel in that image is actually three dimensions. 10 00:00:41,111 --> 00:00:44,758 It's described using red, green, and blue. 11 00:00:44,759 --> 00:00:47,360 So we will have 3 million 12 00:00:47,361 --> 00:00:50,410 dimensions to describe a single image. 13 00:00:51,330 --> 00:00:53,520 So what's the problem here? 14 00:00:54,050 --> 00:00:57,476 More features, more dimensions, will require much 15 00:00:57,477 --> 00:01:00,820 more processing time and more computing resources like 16 00:01:00,821 --> 00:01:06,584 memory, storage, and networking. And sometimes many of 17 00:01:06,585 --> 00:01:10,664 those features are correlated with each other 18 00:01:10,665 --> 00:01:13,662 and therefore redundant for the algorithm. 19 00:01:13,663 --> 00:01:17,356 In other cases, some of those features will have 20 00:01:17,357 --> 00:01:22,950 a very weak influence on the machine learning outcome. 21 00:01:23,530 --> 00:01:28,368 So what if we can perform some preprocessing on 22 00:01:28,369 --> 00:01:31,568 the data, as a step before feeding it into 23 00:01:31,569 --> 00:01:36,032 a supervised learning algorithm, and reduce the data size by 24 00:01:36,033 --> 00:01:40,628 reducing the dimension of the data? This is 25 00:01:40,629 --> 00:01:45,124 where dimension reduction algorithms come into play.
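To make the feature count concrete, here is a minimal sketch of how an RGB image turns into a flat list of input features. The 1000 x 1000 size and the helper names `count_features` and `flatten` are assumptions made for illustration; they are not from the lecture.

```python
def count_features(width, height, channels=3):
    """Number of input features (dimensions) for one image."""
    return width * height * channels


def flatten(image):
    """Flatten rows of (r, g, b) pixels into one flat feature list."""
    return [value for row in image for pixel in row for value in pixel]


# A tiny hypothetical 2 x 2 image, just to show the flattening.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (10, 20, 30)]]
features = flatten(tiny)

print(len(features))                # 2 * 2 * 3 = 12 features
print(count_features(1000, 1000))   # one million pixels -> 3000000 features
```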
26 00:01:45,125 --> 00:01:47,620 These algorithms can be used for reducing the number 27 00:01:47,621 --> 00:01:51,988 of variables under consideration, helping to simplify the 28 00:01:51,989 --> 00:01:55,220 data without losing too much information. 29 00:01:55,910 --> 00:01:59,160 This is a common preprocessing step 30 00:01:59,161 --> 00:02:02,590 for prediction and classification tasks. 31 00:02:02,591 --> 00:02:05,176 As a simple way to visualize that process, 32 00:02:05,177 --> 00:02:08,882 take a look at this three dimensional pipe. 33 00:02:08,883 --> 00:02:14,018 We can reduce one dimension and describe 34 00:02:14,019 --> 00:02:17,186 this pipe in a two dimensional plane. 35 00:02:17,187 --> 00:02:22,496 For example, as this circle when looking at the pipe from 36 00:02:22,497 --> 00:02:27,264 above, in the X and Y plane, and also as a 37 00:02:27,265 --> 00:02:32,010 rectangle when looking from the side, in the X and Z plane. 38 00:02:32,011 --> 00:02:35,790 This is a very simple example of dimension reduction. 39 00:02:37,010 --> 00:02:39,092 It will be useful to talk about a 40 00:02:39,093 --> 00:02:42,826 very common use case for this approach. 41 00:02:42,827 --> 00:02:46,318 Let's say I would like to build an image 42 00:02:46,319 --> 00:02:51,880 classifier as part of some object detection system using 43 00:02:51,881 --> 00:02:56,546 supervised learning, and for performing that training task, 44 00:02:56,547 --> 00:03:06,470 I have 10k images with 640 x 640 resolution for each image. 45 00:03:07,050 --> 00:03:12,598 An image with colors is basically a large group of pixels. 46 00:03:12,599 --> 00:03:16,486 Each pixel can be represented by the combination 47 00:03:16,487 --> 00:03:22,554 of three basic colors: red, green, blue, RGB. 48 00:03:22,555 --> 00:03:28,290 The red, green and blue use 8 bits each, 49 00:03:28,291 --> 00:03:34,408 so each has an integer value between 0 and 255.
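The pipe example mentioned earlier can be sketched in a few lines: sample points on the surface of a cylinder, then drop one coordinate. The radius, height, and function names below are assumptions chosen for illustration. Dropping Z leaves points on a circle (the view from above), and dropping Y leaves points inside a rectangle (the view from the side).

```python
import math


def cylinder_points(radius=1.0, height=3.0, n_angles=12, n_levels=4):
    """Sample (x, y, z) points on the side surface of a pipe along the Z axis."""
    points = []
    for level in range(n_levels):
        z = height * level / (n_levels - 1)
        for k in range(n_angles):
            angle = 2 * math.pi * k / n_angles
            points.append((radius * math.cos(angle),
                           radius * math.sin(angle),
                           z))
    return points


def project_xy(points):
    """View from above: drop the Z coordinate."""
    return [(x, y) for x, y, _ in points]


def project_xz(points):
    """View from the side: drop the Y coordinate."""
    return [(x, z) for x, _, z in points]


pipe = cylinder_points()
top_view = project_xy(pipe)    # every point lies on a circle of radius 1
side_view = project_xz(pipe)   # every point lies in a 2 x 3 rectangle

print(len(pipe[0]), "->", len(top_view[0]))   # 3 coordinates reduced to 2
```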
50 00:03:34,409 --> 00:03:40,030 With 8 bits per color, this makes around 16 million possible colors, 51 00:03:40,031 --> 00:03:44,040 okay, this is the space of options. In our case, 52 00:03:44,041 --> 00:03:46,872 if we have this resolution for each 53 00:03:46,873 --> 00:03:50,892 image, then the raw uncompressed size of 54 00:03:50,893 --> 00:03:55,990 that image will be around 1.2 megabytes. 55 00:03:57,290 --> 00:04:01,760 If we need to process 10k images with 56 00:04:01,761 --> 00:04:05,952 1.2 megabytes each, it's a lot of data. 57 00:04:05,953 --> 00:04:08,982 In many cases, the objective of an ML 58 00:04:08,983 --> 00:04:12,042 system does not require such a level of detail. 59 00:04:12,043 --> 00:04:14,874 Think about the situation of identifying 60 00:04:14,875 --> 00:04:17,082 a car in an image. 61 00:04:17,083 --> 00:04:20,995 The machine does not care about the full color range of 62 00:04:20,996 --> 00:04:25,190 a car to understand that this object is a car. 63 00:04:25,191 --> 00:04:29,214 So it makes sense to transform and compress 64 00:04:29,215 --> 00:04:32,686 the data in that picture as a preprocessing 65 00:04:32,687 --> 00:04:36,956 step. In our example, what if we can 66 00:04:36,957 --> 00:04:41,370 reduce the space of possible color options? 67 00:04:41,371 --> 00:04:46,892 So instead of using a 16 million color space, we can 68 00:04:46,893 --> 00:04:52,406 reduce the image quality to a space of just 256 colors. 69 00:04:52,407 --> 00:04:58,890 In that case the image size will be just around 0.4 megabytes. 70 00:04:59,470 --> 00:05:07,546 Then I will take all my compressed 10k images and feed them into my supervised 71 00:05:07,547 --> 00:05:11,124 learning algorithm, which will be a much faster process 72 00:05:11,125 --> 00:05:13,572 because there is less data to process.
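The sizes in this example can be checked with simple arithmetic. This is a rough sketch that assumes uncompressed storage with one byte per color channel, and, for the reduced image, one byte per pixel as an index into a 256-entry color table. Real image file formats add headers and compression, so actual file sizes will differ.

```python
BITS_PER_CHANNEL = 8
CHANNELS = 3               # red, green, blue
WIDTH = HEIGHT = 640
N_IMAGES = 10_000

levels = 2 ** BITS_PER_CHANNEL           # 256 values per channel (0 to 255)
full_color_space = levels ** CHANNELS    # 16,777,216 possible colors

# Raw image: 3 bytes per pixel.
raw_bytes = WIDTH * HEIGHT * CHANNELS    # 1,228,800 bytes, about 1.2 MB

# Reduced image: 1 byte per pixel plus a table of 256 RGB colors.
palette_bytes = 256 * CHANNELS           # 768 bytes
indexed_bytes = WIDTH * HEIGHT + palette_bytes   # 410,368 bytes, about 0.4 MB

print(full_color_space)
print(raw_bytes / 1e6, "MB per raw image")
print(indexed_bytes / 1e6, "MB per reduced image")
print(N_IMAGES * raw_bytes / 1e9, "GB for the raw data set")
print(N_IMAGES * indexed_bytes / 1e9, "GB for the reduced data set")
```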
73 00:05:13,573 --> 00:05:16,308 This reduction is done by performing something that 74 00:05:16,309 --> 00:05:19,092 is called image segmentation as part of the 75 00:05:19,093 --> 00:05:24,372 dimension reduction algorithm: clustering pixels according to 76 00:05:24,373 --> 00:05:28,546 their color, and then replacing each pixel's 77 00:05:28,547 --> 00:05:32,690 color with the mean color of its cluster. 78 00:05:32,691 --> 00:05:38,092 Like replacing a group of shades of red with one single 79 00:05:38,093 --> 00:05:42,332 red color, which is a sufficient and more manageable amount of information 80 00:05:42,333 --> 00:05:46,680 for the algorithm to identify a specific object. 81 00:05:46,681 --> 00:05:48,043 [No audio]
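The clustering step described above (often also called color quantization) can be sketched with a small k-means loop in plain Python. The lecture does not name a specific clustering algorithm, so k-means is an assumption here, as are the function names, the simple deterministic choice of starting colors, and the toy pixel values. A real pipeline would more likely use a library implementation.

```python
def closest(pixel, centers):
    """Index of the center nearest to pixel (squared RGB distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2
                                 for p, c in zip(pixel, centers[i])))


def kmeans_colors(pixels, k, iterations=10):
    """Cluster RGB pixels into k groups and return the k mean colors."""
    unique = sorted(set(pixels))
    step = len(unique) // k
    # Assumed simple start: k evenly spaced colors from the sorted unique list.
    centers = [unique[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            clusters[closest(pixel, centers)].append(pixel)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))
    return centers


def quantize(pixels, centers):
    """Replace each pixel with the (rounded) mean color of its cluster."""
    return [tuple(round(c) for c in centers[closest(p, centers)])
            for p in pixels]


# Toy data: three shades of red and three shades of blue.
pixels = [(250, 10, 10), (240, 20, 15), (230, 5, 5),
          (10, 10, 250), (20, 15, 240), (5, 5, 230)]
centers = kmeans_colors(pixels, k=2)
reduced = quantize(pixels, centers)

print(len(set(pixels)), "colors ->", len(set(reduced)), "colors")
```

In this toy run, all shades of red collapse into one red and all shades of blue into one blue. For the 256-color case in the lecture, the same idea applies with `k=256` on the pixels of a real image.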