We talked about supervised learning. With this method, we must provide labeled data, also called examples, as part of the training phase. If I'm building an image classifier that should identify the type of animal in a given image, then I need a large number of example images, each labeled with the type of animal it contains. This information is used during the training phase.

Unfortunately, the vast majority of available data in many industry use cases is unlabeled. We know the input features x, but we don't have the labels y to train our model.

If we still want to use supervised learning, we can consider several options, like searching for labeled data from other sources, which may be freely available on the internet, or purchasing a labeled dataset from a third-party company. The next option would be to label the data manually: have a group of people, experts, go over some portion of the dataset and label it. Okay, this can be an expensive and very slow process, and in some cases the amount of manually labeled data will not be enough to train a good model.
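To make the "features x plus labels y" idea concrete, here is a minimal sketch in NumPy. The tiny dataset and the nearest-centroid classifier are my own illustration, not something from the lecture; they just show how the labels y are what the training phase consumes.

```python
import numpy as np

# Toy labeled dataset: each row of X is a feature vector x,
# each entry of y is the label an expert would have provided.
X = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],   # class 0
              [4.0, 4.2], [4.1, 3.9], [3.8, 4.1]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# "Training": use the labels to compute the mean feature
# vector (centroid) of each class.
centroids = np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict(x):
    """Assign x to the class whose centroid is nearest (Euclidean)."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

print(predict(np.array([1.0, 1.0])))  # -> 0
print(predict(np.array([4.0, 4.0])))  # -> 1
```

Without the labels y, the training step above has nothing to average per class, which is exactly the problem the rest of this lecture addresses.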
Still, it is a practical option for using supervised learning.

As you may guess, the next option to consider is unsupervised learning, which is not as widespread and frequently used as supervised learning. Unsupervised learning is learning without a teacher supervising the learning process. The goal is to automatically identify meaningful patterns in unlabeled data. We don't need to provide the algorithm with a labeled dataset, which makes it a very attractive option for some use cases.

Unsupervised learning is used for two main, fundamental tasks. The first one is called clustering and the second one is called dimensionality reduction.

Clustering is about summarizing and grouping similar instances together into clusters. It helps find a small number of groups that represent the patterns in the data and, by doing that, uncover the underlying structure of the dataset. Clustering as a method is widely used for search engines, customer segmentation, image segmentation, simple data analysis, and more. We'll talk about it in the next lecture.
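As a sketch of the clustering idea, here is a minimal k-means implementation in NumPy. The data and the implementation details (fixed initialization on the first k points, a guard against empty clusters) are my own illustrative choices; the point is only that no labels y appear anywhere.

```python
import numpy as np

# Unlabeled data: two obvious groups, but the algorithm is never told that.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.9, 5.1]])

def kmeans(X, k, iters=10):
    """Group X into k clusters by alternating assign/update steps."""
    centroids = X[:k].copy()  # initialize on the first k points
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: move each centroid to the mean of its cluster
        # (keep the old centroid if a cluster is momentarily empty).
        centroids = np.stack([X[labels == j].mean(axis=0)
                              if np.any(labels == j) else centroids[j]
                              for j in range(k)])
    return labels, centroids

labels, centroids = kmeans(X, k=2)
print(labels)  # the two groups receive two different cluster ids
```

The algorithm discovers the two groups purely from the geometry of the instances, which is the "meaningful patterns in unlabeled data" the lecture describes.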
The second type of task is called dimensionality reduction, which is about reducing the complexity of the input data. This unsupervised learning method is sometimes used to preprocess the input data and compress it before feeding it into a supervised learning algorithm. Okay, the idea is to compress the data while maintaining its structure and usefulness. Let's review each one of them.
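The compress-but-keep-the-structure idea can be sketched with classic PCA via a singular value decomposition. The synthetic dataset below is my own construction, assumed for illustration: 3-D points that in fact lie near a 2-D plane, so two components retain almost everything.

```python
import numpy as np

rng = np.random.default_rng(1)
# 3-D data that really lives near a 2-D plane: the third feature
# is almost a linear combination of the first two, plus tiny noise.
A = rng.normal(size=(100, 2))
X = np.column_stack([A, A @ np.array([0.7, -0.3])
                        + 0.01 * rng.normal(size=100)])

def pca_compress(X, k):
    """Project X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, S                       # compressed data, singular values

Z, S = pca_compress(X, k=2)
print(Z.shape)  # -> (100, 2): same instances, fewer features
# S[2] is tiny relative to S[0]: the dropped direction carried
# almost no variance, so little structure is lost.
```

A supervised model trained on the compressed Z instead of X sees a simpler input, which is exactly the preprocessing use case mentioned above.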