Now let's review a use case where we perform business intelligence on a video subscription service.

In this example, we have a video subscription service, something similar to Netflix, where our customers are out there watching videos on their laptops, their desktops, and their mobile devices. As they're watching videos, they're pressing Play, pressing Pause, scrubbing forward or backward, replaying or re-watching, favoriting a particular section of a video, or leaving comments, and all of these events are coming into our Kinesis stream.

From that stream we have a number of different applications doing things like sliding window analysis: how many videos are actively being played right now, what videos are being played right now, what categories are being watched right now. As new users sign up, we want to be able to see how many users became new users in the last minute or the last hour. We're running these sliding window analyses, and the results are being stored in DynamoDB.
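A sliding window count like "videos actively being played in the last minute" can be sketched in plain Python. In practice this logic would run inside a Kinesis consumer application and persist its results to DynamoDB; the event fields and the 60-second window here are illustrative assumptions, not details from the original:

```python
from collections import deque

WINDOW_SECONDS = 60  # assumed window length for "right now"

class SlidingWindowCounter:
    """Count distinct videos with a recent Play event (illustrative sketch)."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.events = deque()  # (timestamp, video_id) pairs, oldest first

    def add(self, timestamp, video_id):
        # Called once per Play event arriving from the stream.
        self.events.append((timestamp, video_id))

    def active_videos(self, now):
        # Evict events that have aged out of the window...
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        # ...then count the distinct videos still inside it.
        return len({vid for _, vid in self.events})

counter = SlidingWindowCounter()
counter.add(0, "video-a")
counter.add(30, "video-b")
counter.add(55, "video-a")
print(counter.active_videos(70))  # the event at t=0 aged out -> 2 distinct videos
```

A real consumer would read these events from the Kinesis shard iterator rather than from direct `add` calls, and would write each window's result to a DynamoDB table.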
We have another application that is taking the raw event messages, aggregating them per hour or per day, and storing those files in Amazon S3.

From there, we have our user account information and our video metadata being stored in Amazon RDS, using the Amazon Aurora database, which again is a MySQL-compatible database but gives us much greater performance than stock MySQL.

We also have our Finance and Accounting team using SQL Server on Amazon RDS to track how much we spend on marketing campaigns and how much we spend to produce video content.

You can see that we have data stored in a lot of different areas. Each area, each particular database, whether it be Amazon Aurora on RDS, SQL Server, or DynamoDB, is serving a very particular function, but it still means that we have data in a lot of different places. We're going to use AWS Data Pipeline to run queries on these various databases. We're going to pull in information about user accounts, information about videos, and information about all of our financials.
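Aggregating the raw event messages per hour before writing to S3 might look like the following sketch. The event shape and the S3 key layout are assumptions for illustration; the actual upload would go through an S3 client such as boto3:

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_s3_key(epoch_seconds):
    """Map an event timestamp to an hourly S3 object key (assumed layout)."""
    t = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return t.strftime("raw-events/%Y/%m/%d/%H/events.log")

def group_events_by_hour(events):
    """Bucket raw events into per-hour batches keyed by their S3 object key."""
    batches = defaultdict(list)
    for event in events:
        batches[hourly_s3_key(event["ts"])].append(event)
    return dict(batches)

events = [
    {"ts": 0, "action": "play"},     # hour 00
    {"ts": 1800, "action": "pause"}, # same hour as above
    {"ts": 3700, "action": "play"},  # falls into the next hour
]
batches = group_events_by_hour(events)
print(len(batches))  # 2 hourly batches
```

Each batch would then be serialized and uploaded as one object, giving Elastic MapReduce a natural per-hour partitioning to read from later.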
We're going to aggregate all of that data and store it over here in Amazon Redshift. We're also going to take the data from these sliding windows: we'll leverage AWS Data Pipeline to pull data from DynamoDB and transfer it over to Redshift as well.

From here, we can use our business intelligence software to run much more complex queries across the entire dataset, queries that would normally have been very difficult because the data lived in all of these various disparate databases. Now that all of that data is in one cluster, we can leverage the massively parallel processing power of Amazon Redshift to answer some very complex questions.

We can also, if we wanted to, retroactively go back and ask questions that we weren't asking at the time the data was coming in. We can pull the raw log files out of S3, load them into Amazon Elastic MapReduce, and leverage the power of thousands of nodes running in parallel to further process those logs, storing those results in Redshift as well.
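Reprocessing the raw S3 logs on Elastic MapReduce boils down to a map-and-reduce over log lines. The single-process sketch below shows the shape of such a job; the tab-separated log format is an assumption, and a real job would distribute this work across EMR nodes (for example with Hadoop streaming or Spark) before loading the results into Redshift:

```python
from collections import Counter

def parse_line(line):
    # Assumed log format: "<timestamp>\t<user_id>\t<video_id>\t<action>"
    ts, user_id, video_id, action = line.rstrip("\n").split("\t")
    return {"ts": ts, "user": user_id, "video": video_id, "action": action}

def plays_per_video(lines):
    """Count Play events per video across all log lines (the reduce step)."""
    counts = Counter()
    for line in lines:
        event = parse_line(line)
        if event["action"] == "play":
            counts[event["video"]] += 1
    return counts

log = [
    "100\tu1\tvideo-a\tplay",
    "101\tu2\tvideo-a\tplay",
    "102\tu1\tvideo-b\tpause",
    "103\tu3\tvideo-b\tplay",
]
print(plays_per_video(log))  # Counter({'video-a': 2, 'video-b': 1})
```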
With all of this data being pulled in and aggregated into Redshift, we can start to get a sense of what's happening in the big picture. We want to start to understand things like user engagement per video. We want to understand how well a particular series is doing versus other series, how much people are interacting with a video, and how much of a video people are actually watching. We want to see user engagement per video, per series, per category, and so on.

We want to understand our conversion rate as we're spending money on marketing campaigns. People are coming back to the site as a result, but how many of those people come back? Of the people that come back, how many actually become customers? What are our conversion rates? Given all of this data from our user accounts, and given the data on what we spend on marketing, we can answer those questions. We can also figure out how much we spend per customer acquisition: how much does it actually cost us to acquire new customers?
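The conversion-rate and acquisition-cost questions reduce to simple ratios once the marketing spend and account data sit side by side in Redshift. A minimal sketch, with made-up numbers purely for illustration:

```python
def conversion_rate(visitors, new_customers):
    """Fraction of returning visitors who became paying customers."""
    return new_customers / visitors if visitors else 0.0

def cost_per_acquisition(marketing_spend, new_customers):
    """Marketing dollars spent per customer acquired."""
    return marketing_spend / new_customers if new_customers else 0.0

# Hypothetical campaign numbers, not from the original.
visitors, new_customers, spend = 20_000, 500, 75_000.0
print(conversion_rate(visitors, new_customers))    # 0.025, i.e. 2.5%
print(cost_per_acquisition(spend, new_customers))  # 150.0 dollars per customer
```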
We can also look at our user accounts and the money that we're spending and determine our user churn rates: this month versus last month, how many users signed up versus how many users canceled their accounts.

Again, you can see this is just one more example of the many, many ways that we can possibly put all of these various services to use to analyze a very large amount of data, very easily, within Amazon Web Services. That is just one example of video subscription service business intelligence leveraging multiple services in Amazon Web Services.
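Churn can be sketched the same way: compare cancellations against the subscriber base at the start of each month. The formula and the month-over-month numbers below are illustrative assumptions:

```python
def monthly_churn_rate(subscribers_at_start, cancellations):
    """Fraction of the month's starting subscribers who canceled."""
    return cancellations / subscribers_at_start if subscribers_at_start else 0.0

# Hypothetical month-over-month comparison.
last_month = monthly_churn_rate(subscribers_at_start=10_000, cancellations=400)
this_month = monthly_churn_rate(subscribers_at_start=10_600, cancellations=318)
print(last_month)  # 0.04 -> 4% churn last month
print(this_month)  # 0.03 -> 3% churn this month
```

In Redshift, the inputs to this calculation would come from a query joining sign-up and cancellation dates in the user accounts table.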