- Welcome. In this demonstration, we're going to take a look at Kinesis Data Firehose delivery streams, and we're going to create one using the console with a destination of S3.

There are a couple of different ways to reach the Firehose dashboard. We could search here for Kinesis, and it will take us to the Kinesis console, or if we type Firehose instead, it'll take us to that specific dashboard within the Kinesis service. But let's go to Kinesis first, and we'll see that we have a bright orange button with a couple of different options; Data Streams and Firehose are two of those. We can also use the left-hand navigation and pick Delivery streams. Notice that it doesn't say Firehose, just Delivery streams, because those are the specific resources created within the Firehose offering.

So if we go over here, click on Data Firehose and then Create, it takes us into a short wizard that we have to fill out, along with an explanation of how Firehose works.
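For reference, everything the console wizard collects ultimately maps to a single CreateDeliveryStream API call. Below is a minimal sketch of that request shape for a Direct PUT source with an S3 destination; the bucket ARN and role ARN are hypothetical placeholders, not values from the demo account.

```python
# Sketch only: the shape of a Firehose CreateDeliveryStream request for
# a Direct PUT source delivering to S3. ARNs below are placeholders.
def build_create_request(stream_name, bucket_arn, role_arn):
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # no separate Kinesis data stream needed
        "ExtendedS3DestinationConfiguration": {
            "BucketARN": bucket_arn,
            "RoleARN": role_arn,  # IAM role Firehose assumes to write objects
        },
    }

params = build_create_request(
    "test-stream",
    "arn:aws:s3:::livelessons-firehosetest-01",
    "arn:aws:iam::123456789012:role/firehose-s3-role",  # hypothetical role
)
print(params["DeliveryStreamType"])
```

You would pass this dictionary to a Firehose client's create call; the point here is just how the wizard's choices translate into one request.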
We'll make that diagram smaller so we can fill out our form. First, we have a source. We can allow for Direct PUT, so we don't have to create a Kinesis data stream as well. And our destination is going to be S3, but you can see that there are other possibilities, third-party offerings, other AWS services, and so on. We're going to pick S3.

Now that we've done this, we can scroll down a little bit further, as our choices expanded the form. It's going to give us a random delivery stream name. We can change that; we could call it test-stream if we want.

We have the option of transforming or converting records. If we enable data transformation, it's going to ask us for a Lambda function that will perform that work, so we'll leave that disabled. If we want to do a record format conversion, we have a couple of options here, either Apache Parquet or ORC files. We'll leave that disabled as well, although we could pick a schema here; we're going to talk about AWS Glue in another lesson in this course.
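The transformation and conversion choices above correspond to two optional sections of the S3 destination configuration. Here is a hedged sketch of both: the demo leaves them disabled, and the Lambda ARN used for the enabled case is a hypothetical placeholder.

```python
# Sketch of the optional record-transformation and format-conversion
# sections of the destination config. Both are disabled in the demo.
def destination_extras(transform_lambda_arn=None, convert_to=None):
    extras = {
        "ProcessingConfiguration": {"Enabled": False},
        "DataFormatConversionConfiguration": {"Enabled": False},
    }
    if transform_lambda_arn:  # enable Lambda-based data transformation
        extras["ProcessingConfiguration"] = {
            "Enabled": True,
            "Processors": [{
                "Type": "Lambda",
                "Parameters": [{
                    "ParameterName": "LambdaArn",
                    "ParameterValue": transform_lambda_arn,
                }],
            }],
        }
    if convert_to in ("PARQUET", "ORC"):
        # A real conversion also needs a Glue schema configuration,
        # covered in the Glue lesson; omitted in this sketch.
        extras["DataFormatConversionConfiguration"]["Enabled"] = True
    return extras

demo = destination_extras()  # the demo leaves both features disabled
```

Passing a Lambda ARN or a target format flips the corresponding section on; with no arguments, you get the pass-through configuration used in this demonstration.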
So if we just want to pass the data through in essentially its raw format, all we have to do now is specify the destination S3 bucket. If we didn't have one, we could go ahead and create one here. So we click on the Create button, and it takes us to the S3 service, where we can create the bucket. Let's call it livelessons-firehosetest-01.

We'll leave it in us-east-1, because we're going to match the region of the delivery stream itself; keeping them in the same region is one way to help optimize for cost. We really don't have to change anything else. The only thing I would add here, because it's not enabled by default, is object encryption. Other than that, we just click on Create bucket.

Now we go back to our Firehose dashboard, and we can browse our S3 buckets here, and it will pull up a list. So we scroll down, keep going here. Let's see, there we go. Let's refresh, find livelessons-firehosetest-01, and choose that.
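One detail worth knowing if you script the bucket-creation step instead of using the console: the S3 CreateBucket API expects a LocationConstraint for every region except us-east-1, where the configuration block must be omitted. A small sketch of building those parameters:

```python
# Sketch of S3 CreateBucket parameters. S3 quirk: for us-east-1 you must
# omit CreateBucketConfiguration; other regions require a LocationConstraint.
def create_bucket_params(name, region):
    params = {"Bucket": name}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params

p = create_bucket_params("livelessons-firehosetest-01", "us-east-1")
```

The console handles this for you; it only matters once you automate the workflow.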
If we wanted to pick a prefix, we can do that here. We can enable dynamic partitioning, which we're not going to do, and we don't need any error output prefixes.

Then we can start to configure the actual delivery stream itself in terms of its buffer hints: the buffer size for batching, as well as the buffer interval. Regardless of how much data has been collected, whether we've reached the buffer size or not, every 300 seconds it will automatically perform a delivery to the destination.

We can pick different types of encryption for the data records themselves, and then we can come down here to advanced settings, where there are a few more options, like error logging. We can enable server-side encryption on the delivery stream itself, and it will ask us about permissions. This is an important thing to note: when we do this using the AWS console, a number of tasks are being accomplished for us behind the scenes that would normally have to be done as individual pieces of work.
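The buffering rule just described means Firehose delivers as soon as either threshold is hit, size or interval, whichever comes first. A tiny sketch of that decision, assuming 5 MiB and 300 seconds as the defaults (the interval default matches the 300 seconds mentioned above; the size value is our assumption here):

```python
# Sketch of the Firehose buffering rule: deliver when EITHER the buffer
# size OR the buffer interval is reached, whichever comes first.
def should_deliver(buffered_bytes, elapsed_seconds,
                   size_bytes=5 * 1024 * 1024, interval_seconds=300):
    return buffered_bytes >= size_bytes or elapsed_seconds >= interval_seconds

# Only 1 MiB collected, but 300 s elapsed: delivered on the interval.
print(should_deliver(1 * 1024 * 1024, 300))  # True
```

Tuning these two knobs trades delivery latency against the number and size of objects written to S3.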
Instead, the dashboard combines all of this into one form, and the creation of permissions is one of those tasks. For a delivery stream to deliver to an S3 bucket, there has to be an IAM role in place that grants that delivery stream permission to place objects in the bucket, and that's what this step is doing for us.

So we can add a tag; we'll give it a cost center tag, and click on Create delivery stream. This process usually takes a handful of minutes to complete. So if you are planning on automating this type of workflow to create delivery streams, just be aware that there's going to be a delay after you request provisioning while it creates that delivery stream for you.

And that completes this demonstration.
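To make the permissions step concrete, here is an illustrative sketch of the two IAM documents the console creates behind the scenes: a trust policy letting the Firehose service assume the role, and a permissions policy allowing it to write into the demo bucket. The exact actions the console grants may differ; this wording is an assumption, not the console's literal output.

```python
import json

# Illustrative IAM documents for the console-created role (not the
# console's exact output). Trust policy: who may assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "firehose.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: what the role may do, scoped to the demo bucket.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:GetBucketLocation", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::livelessons-firehosetest-01",
            "arn:aws:s3:::livelessons-firehosetest-01/*",
        ],
    }],
}

print(json.dumps(trust_policy, indent=2))
```

If you automate this flow, you create the role and attach these documents yourself, then poll the delivery stream's status until it becomes active before sending records, which is the delay mentioned above.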