1 00:00:00,820 --> 00:00:02,970 - [Speaker] Next, let's take a look at a demo of 2 00:00:02,970 --> 00:00:06,410 Watson's visual recognition capabilities. 3 00:00:06,410 --> 00:00:07,950 And as you're going to see, 4 00:00:07,950 --> 00:00:10,410 they provide both pretrained models 5 00:00:10,410 --> 00:00:12,450 and you can train your own models 6 00:00:12,450 --> 00:00:16,470 for recognizing things in images and video. 7 00:00:16,470 --> 00:00:19,010 So the pretrained models are able to do things like 8 00:00:19,010 --> 00:00:22,030 colors and objects and faces, text, food, 9 00:00:22,030 --> 00:00:24,090 inappropriate content and more. 10 00:00:24,090 --> 00:00:27,220 And the training of your own models 11 00:00:27,220 --> 00:00:31,230 would be based on whatever applications you need to develop. 12 00:00:31,230 --> 00:00:33,230 So I'll show you the demo in just a moment. 13 00:00:33,230 --> 00:00:35,850 But the reason this is interesting is that 14 00:00:35,850 --> 00:00:39,720 one of the key areas in which machine learning 15 00:00:39,720 --> 00:00:44,720 and deep learning are used is computer vision applications. 16 00:00:45,520 --> 00:00:48,510 So for instance, in both of the next couple of lessons, 17 00:00:48,510 --> 00:00:50,640 one of the examples that we'll look at 18 00:00:50,640 --> 00:00:52,250 in each of those lessons 19 00:00:52,250 --> 00:00:55,140 is going to be where we analyze 20 00:00:55,140 --> 00:00:58,040 images of handwritten digits 21 00:00:58,040 --> 00:01:00,550 to try to figure out whether a given image 22 00:01:00,550 --> 00:01:03,090 is a zero, a one, a two, etc., 23 00:01:03,090 --> 00:01:04,700 all the way up through nine.
24 00:01:04,700 --> 00:01:06,480 This is something for instance 25 00:01:06,480 --> 00:01:09,050 that the post office needs to do 26 00:01:09,050 --> 00:01:11,060 in order to look at zip codes 27 00:01:11,060 --> 00:01:14,590 and in a computerized and mechanical way 28 00:01:14,590 --> 00:01:17,380 route mail to the appropriate locations 29 00:01:17,380 --> 00:01:19,580 for example throughout the United States. 30 00:01:19,580 --> 00:01:22,380 So here we have a prebundled service 31 00:01:22,380 --> 00:01:24,300 that knows how to do things like this 32 00:01:24,300 --> 00:01:28,130 but we're also going to demonstrate using Python 33 00:01:28,130 --> 00:01:31,060 and machine learning and deep learning libraries 34 00:01:31,060 --> 00:01:33,170 how you can create your own models 35 00:01:33,170 --> 00:01:35,810 that are actually able to predict things 36 00:01:35,810 --> 00:01:38,170 like what digit an image is 37 00:01:38,170 --> 00:01:42,150 with over 99% accuracy, which is just phenomenal. 38 00:01:42,150 --> 00:01:44,330 And as you'll see, it's only going to take 39 00:01:44,330 --> 00:01:47,840 a couple of lines of code to build those models. 40 00:01:47,840 --> 00:01:50,523 So let me switch over to this demo for a moment here. 41 00:01:51,490 --> 00:01:54,890 And by the way, when you first come up to this page, 42 00:01:54,890 --> 00:01:57,290 it's going to show at the top of the page like this, 43 00:01:57,290 --> 00:01:58,910 I had already scrolled down. 44 00:01:58,910 --> 00:02:01,030 But in the demo they start out 45 00:02:01,030 --> 00:02:03,290 with a custom model here 46 00:02:03,290 --> 00:02:08,140 and this particular model is for an insurance company. 47 00:02:08,140 --> 00:02:10,850 So they're looking at insurance claims 48 00:02:10,850 --> 00:02:12,130 and they have images of things 49 00:02:12,130 --> 00:02:14,880 and they want to be able to analyze those images 50 00:02:14,880 --> 00:02:17,790 quickly to figure out what type of problem it is. 
51 00:02:17,790 --> 00:02:20,710 So clearly this is an image of a flat tire 52 00:02:20,710 --> 00:02:23,330 and when they ran this image through 53 00:02:23,330 --> 00:02:25,490 the custom insurance classifier, 54 00:02:25,490 --> 00:02:28,900 it classified this with 91% certainty 55 00:02:28,900 --> 00:02:30,800 as being a flat tire. 56 00:02:30,800 --> 00:02:31,970 Now down at the bottom here, 57 00:02:31,970 --> 00:02:34,130 they have other images that you can try. 58 00:02:34,130 --> 00:02:36,350 So for instance if I click this one, 59 00:02:36,350 --> 00:02:38,630 it looks like somebody vandalized this van 60 00:02:38,630 --> 00:02:41,990 so there's a 64% likelihood based on this image 61 00:02:41,990 --> 00:02:43,770 that the issue is vandalism. 62 00:02:43,770 --> 00:02:46,530 But if you look closely at the bottom left here, 63 00:02:46,530 --> 00:02:49,700 because this van is parked at a curb, 64 00:02:49,700 --> 00:02:53,230 it kind of looks like that wheel or that tire 65 00:02:53,230 --> 00:02:56,820 may be flat as well so it's also giving a 53% 66 00:02:56,820 --> 00:03:01,590 likelihood that there is a flat tire in this case. 67 00:03:01,590 --> 00:03:04,890 And clearly this one, once again, is a flat tire. 68 00:03:04,890 --> 00:03:06,250 This one, if I click on it, 69 00:03:06,250 --> 00:03:08,150 looks like there was a motorcycle accident. 70 00:03:08,150 --> 00:03:10,300 So 90% certainty on that. 71 00:03:10,300 --> 00:03:12,820 And there are additional images that you can play with 72 00:03:12,820 --> 00:03:14,740 down here for this custom model. 73 00:03:14,740 --> 00:03:18,053 And then separately, if I go to the pretrained models area, 74 00:03:19,470 --> 00:03:21,920 you'll notice in this image it's giving back 75 00:03:21,920 --> 00:03:23,610 a lot of information here. 76 00:03:23,610 --> 00:03:28,110 So with 96% certainty, there's fabric in this image.
77 00:03:28,110 --> 00:03:30,500 And in this case it's recognizing 78 00:03:30,500 --> 00:03:34,630 that this is a Harris Tweed jacket with 87% likelihood 79 00:03:34,630 --> 00:03:37,530 and that there's clothing in here with 80% likelihood. 80 00:03:37,530 --> 00:03:41,230 And lots of other possibilities as well. 81 00:03:41,230 --> 00:03:44,960 This is the general model, by the way. 82 00:03:44,960 --> 00:03:48,080 Other pretrained models that they have are for faces, 83 00:03:48,080 --> 00:03:52,530 food, explicit content and text as well. 84 00:03:52,530 --> 00:03:55,340 So depending on what image you're looking at, 85 00:03:55,340 --> 00:03:58,170 you may want to look at the results from different models. 86 00:03:58,170 --> 00:04:00,930 So for this, clearly there's a face here, 87 00:04:00,930 --> 00:04:03,390 it's recognizing that there's a person in here. 88 00:04:03,390 --> 00:04:05,960 It's saying with 75% likelihood. 89 00:04:05,960 --> 00:04:07,833 If I go to the face model, 90 00:04:09,230 --> 00:04:11,970 it's saying not only that there's a face 91 00:04:11,970 --> 00:04:15,000 but specifically that this person 92 00:04:15,000 --> 00:04:18,340 is probably between the ages of 19 and 22 93 00:04:18,340 --> 00:04:20,520 with 93% certainty. 94 00:04:20,520 --> 00:04:23,470 And they're saying with 100% certainty 95 00:04:23,470 --> 00:04:25,200 that this is a female. 96 00:04:25,200 --> 00:04:28,250 And of course if you use a model that's not intended for the image, 97 00:04:28,250 --> 00:04:31,890 you're not necessarily going to get the results you expect. 98 00:04:31,890 --> 00:04:34,730 It is clearly recognizing that this is not food 99 00:04:34,730 --> 00:04:38,470 with 100% certainty in this case. 100 00:04:38,470 --> 00:04:42,220 So again, pretrained models for lots of existing cases. 101 00:04:42,220 --> 00:04:44,670 Which may, in fact, be all you need 102 00:04:44,670 --> 00:04:46,210 for the apps that you're building.
103 00:04:46,210 --> 00:04:48,130 But you also have the ability to build 104 00:04:48,130 --> 00:04:49,570 your own custom models. 105 00:04:49,570 --> 00:04:52,520 So if I'm working for an insurance company 106 00:04:52,520 --> 00:04:56,150 and I have tens of millions of photos 107 00:04:56,150 --> 00:04:59,380 of insurance claim issues over the years, 108 00:04:59,380 --> 00:05:02,540 I can train models based on those images 109 00:05:02,540 --> 00:05:04,840 and then hopefully make it easier 110 00:05:04,840 --> 00:05:08,520 for my insurance adjusters to categorize 111 00:05:08,520 --> 00:05:10,563 future claims as they come in.
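The scores read out in the demo (64% vandalism and 53% flat tire for the same van image) show that one image can carry more than one plausible label. Here is a minimal Python sketch of how an app might act on scores like these. The dictionary layout, the function names `labels_above` and `top_label`, and the 0.5 threshold are all assumptions made for illustration; they are not the Watson service's response format or API.

```python
# Hypothetical class scores, using the figures read out in the demo for the
# vandalized-van image. The dictionary layout is an assumption for
# illustration, not the service's actual response format.
van_scores = {"vandalism": 0.64, "flat tire": 0.53}


def labels_above(scores, threshold=0.5):
    """Return (label, score) pairs at or above threshold, highest score first."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def top_label(scores):
    """Return the single highest-scoring label, or None if there are no scores."""
    if not scores:
        return None
    return max(scores, key=scores.get)


print(labels_above(van_scores))       # [('vandalism', 0.64), ('flat tire', 0.53)]
print(labels_above(van_scores, 0.6))  # [('vandalism', 0.64)]
print(top_label(van_scores))          # vandalism
```

Keeping every label above a threshold, rather than only the top one, matches what the demo shows for the van: an adjuster would probably want to see both the vandalism and the possible flat tire. Where to set the threshold depends on the application, so the 0.5 here is only a placeholder.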