1 00:00:07,220 --> 00:00:09,070 Now let's take a look at an overview 2 00:00:09,070 --> 00:00:11,780 of AWS database options. 3 00:00:11,780 --> 00:00:16,200 So starting with Amazon Relational Database Service, or RDS, 4 00:00:16,200 --> 00:00:20,150 we get a fully managed relational database system 5 00:00:20,150 --> 00:00:21,900 and by fully managed, we mean that 6 00:00:21,900 --> 00:00:24,950 Amazon handles pretty much everything, 7 00:00:24,950 --> 00:00:27,823 everything except writing SQL statements for you. 8 00:00:27,823 --> 00:00:31,380 And so we also get our choice of popular engines 9 00:00:31,380 --> 00:00:34,580 such as MySQL, SQL Server, 10 00:00:34,580 --> 00:00:37,410 PostgreSQL, Oracle and so on. 11 00:00:37,410 --> 00:00:39,960 Another great thing about the Relational Database Service 12 00:00:39,960 --> 00:00:42,970 is that maintenance is fully automated. 13 00:00:42,970 --> 00:00:45,840 Things like operating system patches, 14 00:00:45,840 --> 00:00:48,248 database engine patching and backups 15 00:00:48,248 --> 00:00:50,810 are all fully automated. 16 00:00:50,810 --> 00:00:54,650 It's also highly customizable; we get access to options 17 00:00:54,650 --> 00:00:58,210 and parameters so that we can tune the database. 18 00:00:58,210 --> 00:00:59,560 Just because it's fully managed 19 00:00:59,560 --> 00:01:01,670 doesn't mean that we're locked out of those things. 20 00:01:01,670 --> 00:01:03,550 We do get access to parameters 21 00:01:03,550 --> 00:01:06,370 so that we can, like I said, 22 00:01:06,370 --> 00:01:08,490 help tune that relational database 23 00:01:08,490 --> 00:01:10,750 to perform the way we want it to perform 24 00:01:10,750 --> 00:01:13,430 for a particular workload. 25 00:01:13,430 --> 00:01:14,720 The Relational Database Service 26 00:01:14,720 --> 00:01:18,630 also supports high availability with automatic failover. 
27 00:01:18,630 --> 00:01:22,020 We will go into more details about this service later on 28 00:01:22,020 --> 00:01:24,763 and explain exactly how that happens. 29 00:01:26,410 --> 00:01:29,210 We also have Amazon ElastiCache. 30 00:01:29,210 --> 00:01:32,281 Amazon ElastiCache, very much like RDS, 31 00:01:32,281 --> 00:01:36,780 is also fully managed, but this gives us access 32 00:01:36,780 --> 00:01:40,020 to an in-memory cache and we get our choice 33 00:01:40,020 --> 00:01:41,725 of Redis or Memcached. 34 00:01:41,725 --> 00:01:45,719 So with Memcached we get a simple key-value store. 35 00:01:45,719 --> 00:01:48,350 With Redis we actually get much more 36 00:01:48,350 --> 00:01:50,003 of an in-memory database. 37 00:01:50,990 --> 00:01:55,180 We have the ability to use more advanced data structures 38 00:01:55,180 --> 00:01:59,550 like hashes and sorted sets, and in-memory operations. 39 00:01:59,550 --> 00:02:02,270 Now with Amazon DynamoDB, 40 00:02:02,270 --> 00:02:06,195 we gain access to a fully managed NoSQL datastore. 41 00:02:06,195 --> 00:02:08,778 This is, of course, in the same family 42 00:02:08,778 --> 00:02:12,445 of data stores as Cassandra and MongoDB. 43 00:02:13,660 --> 00:02:16,990 But with DynamoDB there's no 44 00:02:16,990 --> 00:02:19,720 infrastructure whatsoever for us to manage. 45 00:02:19,720 --> 00:02:22,540 Amazon handles all of the underlying infrastructure 46 00:02:22,540 --> 00:02:25,260 as well as the database engine. 47 00:02:25,260 --> 00:02:28,770 One of the benefits of DynamoDB, one of the many benefits, 48 00:02:28,770 --> 00:02:30,450 is that it's highly scalable. 49 00:02:30,450 --> 00:02:34,180 Your data set can grow into the petabytes and beyond, 50 00:02:34,180 --> 00:02:37,160 without any additional effort on your part. 
51 00:02:37,160 --> 00:02:39,650 You simply throw data at your table 52 00:02:39,650 --> 00:02:41,264 and it automatically grows, 53 00:02:41,264 --> 00:02:45,010 and it grows not only in the storage layer, 54 00:02:45,010 --> 00:02:47,730 but also in the infrastructure 55 00:02:47,730 --> 00:02:50,810 that's managing the actual throughput. 56 00:02:50,810 --> 00:02:54,853 DynamoDB, like most managed services within AWS, 57 00:02:54,853 --> 00:02:58,590 is also fault-tolerant; it's inherently fault-tolerant. 58 00:02:58,590 --> 00:03:02,420 All of those good things are built into that service. 59 00:03:02,420 --> 00:03:04,880 So the benefit of these managed services, 60 00:03:04,880 --> 00:03:07,476 like RDS and ElastiCache and DynamoDB, 61 00:03:07,476 --> 00:03:10,870 is that we can offload operational burdens, 62 00:03:10,870 --> 00:03:12,690 so that we as an organization 63 00:03:12,690 --> 00:03:14,520 don't have to worry about what it means 64 00:03:14,520 --> 00:03:17,230 to make that service fault-tolerant. 65 00:03:17,230 --> 00:03:21,610 DynamoDB is also internally, automatically replicated. 66 00:03:21,610 --> 00:03:23,850 Our data is written to numerous devices 67 00:03:23,850 --> 00:03:25,600 across numerous data centers. 68 00:03:25,600 --> 00:03:27,690 And again, we will go into more detail 69 00:03:27,690 --> 00:03:31,203 on the service later and talk about how that happens. 70 00:03:32,660 --> 00:03:34,710 DynamoDB is also event-driven, 71 00:03:34,710 --> 00:03:38,080 meaning that as data is being written, 72 00:03:38,080 --> 00:03:41,120 updated or deleted in a table, 73 00:03:41,120 --> 00:03:43,520 we can tie into those events 74 00:03:43,520 --> 00:03:46,760 and analyze those changes in near real-time 75 00:03:46,760 --> 00:03:48,980 as they occur. 
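The event-driven behavior just described works through a stream of ordered change records (in the real service, DynamoDB Streams feeding something like a Lambda function). A minimal sketch of that idea in plain Python — an illustration only, not the actual AWS SDK or stream format:

```python
# Simplified sketch of DynamoDB-style change capture: every write to the
# table also appends a change record to an ordered stream that a
# consumer can process in near real-time. Illustrative only.

class Table:
    def __init__(self):
        self.items = {}    # key -> item
        self.stream = []   # ordered change records

    def put_item(self, key, item):
        # A put is an INSERT for a new key, a MODIFY for an existing one.
        event = "MODIFY" if key in self.items else "INSERT"
        old = self.items.get(key)
        self.items[key] = item
        self.stream.append({"eventName": event, "old": old, "new": item})

    def delete_item(self, key):
        old = self.items.pop(key, None)
        self.stream.append({"eventName": "REMOVE", "old": old, "new": None})

orders = Table()
orders.put_item("order-1", {"status": "placed"})
orders.put_item("order-1", {"status": "shipped"})
orders.delete_item("order-1")

# A downstream consumer sees every change, in order, with before/after images:
for record in orders.stream:
    print(record["eventName"])   # INSERT, MODIFY, REMOVE
```

The before/after images on each record are what make near real-time analysis possible without re-querying the table.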
76 00:03:48,980 --> 00:03:51,880 With Amazon Redshift we gain access 77 00:03:51,880 --> 00:03:56,220 to Amazon's petabyte-scale data warehouse 78 00:03:56,220 --> 00:03:58,837 and again, this is also fully managed. 79 00:03:58,837 --> 00:04:02,160 AWS takes care of everything from the infrastructure 80 00:04:02,160 --> 00:04:05,140 to the operating system to the database engine. 81 00:04:05,140 --> 00:04:07,300 And the largest cluster within Redshift 82 00:04:07,300 --> 00:04:12,300 gives us access to 1.6 petabytes of storage and 83 00:04:12,879 --> 00:04:16,850 we can use this to run queries - SQL queries - 84 00:04:16,850 --> 00:04:20,053 across a vast dataset. 85 00:04:22,451 --> 00:04:26,005 Of course Redshift, being a fork of PostgreSQL 8, 86 00:04:26,005 --> 00:04:28,663 means that it's fully SQL compliant. 87 00:04:30,160 --> 00:04:33,840 And again, just like other systems like DynamoDB, 88 00:04:33,840 --> 00:04:36,426 Redshift is also inherently fault-tolerant, 89 00:04:36,426 --> 00:04:39,750 highly durable and also replicated. 90 00:04:39,750 --> 00:04:42,660 It also supports server-side encryption 91 00:04:42,660 --> 00:04:44,290 and disk-based encryption. 92 00:04:44,290 --> 00:04:47,230 With Amazon Neptune, we gain access 93 00:04:47,230 --> 00:04:50,550 to a fully managed graph database. 94 00:04:50,550 --> 00:04:53,910 With a graph database, we can 95 00:04:55,760 --> 00:04:58,840 run queries and analytics based on relationships. 96 00:04:58,840 --> 00:05:00,720 This database is optimized 97 00:05:00,720 --> 00:05:03,080 for relationships between entities. 98 00:05:03,080 --> 00:05:06,320 So something like a recommendation engine 99 00:05:06,320 --> 00:05:09,420 would be really well suited to run 100 00:05:09,420 --> 00:05:11,763 on a graph database such as Neptune. 101 00:05:13,650 --> 00:05:18,650 Again, it's also inherently, internally replicated for durability. 
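To make the recommendation-engine idea concrete: a graph query walks relationships ("users who bought what I bought also bought...") rather than joining tables. A toy sketch in plain Python — Neptune itself is queried with Gremlin or SPARQL, and the data below is made up for illustration:

```python
# Toy relationship-based recommendation: suggest items bought by users
# who share at least one purchase with the target user. Illustrative
# stand-in for a graph traversal, not a Neptune query.

purchases = {
    "alice": {"book", "lamp"},
    "bob":   {"book", "desk"},
    "carol": {"lamp", "chair"},
}

def recommend(user):
    """Items bought by users connected to `user` through a shared purchase."""
    mine = purchases[user]
    suggestions = set()
    for other, items in purchases.items():
        if other != user and mine & items:   # shared item = relationship edge
            suggestions |= items - mine      # their items I don't have yet
    return suggestions

print(recommend("alice"))   # bob shares "book", carol shares "lamp"
```

In a real graph database this two-hop traversal (user → item → other user → item) stays fast even across millions of entities, which is exactly the workload a relational join struggles with.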
102 00:05:18,890 --> 00:05:21,780 Neptune also performs continuous backups. 103 00:05:21,780 --> 00:05:24,920 Whereas something like the Relational Database Service 104 00:05:25,774 --> 00:05:28,460 performs backups every so often, 105 00:05:28,460 --> 00:05:31,270 with services like Neptune 106 00:05:31,270 --> 00:05:33,113 the backups are continuous. 107 00:05:35,130 --> 00:05:37,410 We also get the very easy ability 108 00:05:37,410 --> 00:05:42,410 to create read replicas with Neptune, and RDS as well, 109 00:05:43,270 --> 00:05:47,530 so that we can offload our reads to one node 110 00:05:47,530 --> 00:05:49,290 and send our writes to another node. 111 00:05:49,290 --> 00:05:53,393 That can help increase performance of the system as a whole. 112 00:05:55,820 --> 00:05:59,100 Now, a couple of other databases 113 00:05:59,100 --> 00:06:04,100 that were released in November of 2018, during AWS re:Invent, 114 00:06:05,990 --> 00:06:10,070 were Amazon Timestream and the Quantum Ledger Database. 115 00:06:10,070 --> 00:06:14,430 So these are very new database services 116 00:06:15,330 --> 00:06:18,240 that were only recently released. 117 00:06:18,240 --> 00:06:20,078 So with Amazon Timestream, 118 00:06:20,078 --> 00:06:25,078 we gain access to a fully managed time series database. 119 00:06:25,790 --> 00:06:28,370 So if you've been in the industry for, you know, 120 00:06:28,370 --> 00:06:32,200 more than a year or so, you've probably seen 121 00:06:32,200 --> 00:06:36,430 a scenario where you are recording time series data. 
122 00:06:36,430 --> 00:06:37,970 Time series data is very common, 123 00:06:37,970 --> 00:06:42,970 but it's also a very specific type of data 124 00:06:43,220 --> 00:06:45,680 that needs to be ingested and stored and accessed 125 00:06:45,680 --> 00:06:49,260 in a particular way, and if you've used 126 00:06:49,260 --> 00:06:50,770 relational databases long enough, 127 00:06:50,770 --> 00:06:53,010 you'll know that they're great for many things. 128 00:06:53,010 --> 00:06:55,870 But when you're recording many events 129 00:06:55,870 --> 00:06:58,970 every second or even sub-second, 130 00:06:58,970 --> 00:07:00,600 you can very quickly, 131 00:07:00,600 --> 00:07:03,579 over a relatively short period of time, 132 00:07:03,579 --> 00:07:07,230 end up with potentially billions of events. 133 00:07:07,230 --> 00:07:09,690 Then if you're wanting to access those quickly 134 00:07:09,690 --> 00:07:13,840 or perform time-series-specific types of analytics, 135 00:07:13,840 --> 00:07:16,270 relational databases tend to fall down 136 00:07:16,270 --> 00:07:18,380 and tend to not perform very well 137 00:07:18,380 --> 00:07:20,480 for that specific type of workload. 138 00:07:20,480 --> 00:07:23,970 So the Amazon Timestream database is specifically geared 139 00:07:23,970 --> 00:07:26,810 towards that time series type of data 140 00:07:26,810 --> 00:07:30,610 and it's capable of recording trillions of events per day. 141 00:07:30,610 --> 00:07:34,530 So if you have scenarios where you want to record 142 00:07:34,530 --> 00:07:39,530 events at sub-second intervals, 143 00:07:40,240 --> 00:07:43,190 then Amazon Timestream may very well 144 00:07:43,190 --> 00:07:46,163 be the right solution for that problem. 
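The kind of query that hurts in a relational store is exactly what a time series database optimizes: rolling up billions of sub-second samples into coarser buckets. A tiny sketch of that rollup in plain Python — the timestamps and values are made up, and this only illustrates the access pattern, not Timestream itself:

```python
# Bucket sub-second samples into one-minute averages -- the classic
# time-series rollup. Illustrative only; a time series database does
# this kind of aggregation natively over billions of points.

from collections import defaultdict

# (unix_timestamp_seconds, cpu_percent) samples arriving sub-second
samples = [
    (0.1, 40.0),
    (0.6, 50.0),
    (59.9, 60.0),
    (61.2, 90.0),
]

def per_minute_average(samples):
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // 60)].append(value)   # which minute this sample falls in
    # key each result by the minute's start time in seconds
    return {minute * 60: sum(v) / len(v) for minute, v in buckets.items()}

print(per_minute_average(samples))   # {0: 50.0, 60: 90.0}
```

In a relational database this means scanning and grouping every raw row; a purpose-built time series store keeps data ordered by time so these windowed aggregates stay cheap as the dataset grows.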
145 00:07:47,780 --> 00:07:50,123 It also has built-in analytic functions, 146 00:07:50,123 --> 00:07:53,550 so you don't necessarily have to write your own code 147 00:07:53,550 --> 00:07:56,920 to perform analytics on that time series data. 148 00:07:56,920 --> 00:07:59,036 You can leverage the built-in analytic functions 149 00:07:59,036 --> 00:08:02,340 of Amazon Timestream itself. 150 00:08:02,340 --> 00:08:05,400 Now, at the time of this video recording, 151 00:08:05,400 --> 00:08:08,490 Amazon Timestream is currently in preview. 152 00:08:08,490 --> 00:08:10,270 So by the time you're watching this, 153 00:08:10,270 --> 00:08:12,780 it may or may not be out of preview. 154 00:08:12,780 --> 00:08:16,060 But keep an eye out for Amazon Timestream 155 00:08:16,060 --> 00:08:19,853 if you are looking to do something with time series data. 156 00:08:21,630 --> 00:08:23,500 And lastly, that brings us to the 157 00:08:23,500 --> 00:08:26,870 Amazon Quantum Ledger Database, or QLDB. 158 00:08:26,870 --> 00:08:31,770 With QLDB we gain access to a fully managed ledger database. 159 00:08:31,770 --> 00:08:34,900 Now when I say ledger database, what I mean is, 160 00:08:34,900 --> 00:08:36,650 when you think about a ledger, 161 00:08:36,650 --> 00:08:38,900 a ledger is most typically used in something 162 00:08:38,900 --> 00:08:40,840 like financial transactions, if you want 163 00:08:40,840 --> 00:08:44,420 to record the history of debits and credits 164 00:08:44,420 --> 00:08:46,270 to and from an account 165 00:08:46,270 --> 00:08:49,770 and you want that history to be accurate and immutable. 166 00:08:49,770 --> 00:08:51,830 Right, so the Quantum Ledger Database 167 00:08:51,830 --> 00:08:54,980 uses an append-only, immutable journal. 
168 00:08:54,980 --> 00:08:57,020 So every transaction is simply added 169 00:08:57,020 --> 00:08:58,410 to the end of that ledger 170 00:08:58,410 --> 00:09:01,030 and there's no ability to go back 171 00:09:01,030 --> 00:09:03,860 and modify a previous transaction. 172 00:09:03,860 --> 00:09:05,200 So the nice thing about a ledger 173 00:09:05,200 --> 00:09:07,380 is that it gives you a very accurate 174 00:09:07,380 --> 00:09:10,580 historical journal of transactions. 175 00:09:10,580 --> 00:09:12,870 They don't necessarily have to be financial; 176 00:09:12,870 --> 00:09:14,270 even though that's the most common, 177 00:09:14,270 --> 00:09:16,221 there are all kinds of use cases for 178 00:09:16,221 --> 00:09:19,580 recording the history of certain things. 179 00:09:19,580 --> 00:09:23,630 And so, with the ledger, this takes a page 180 00:09:23,630 --> 00:09:26,950 from blockchain networks such as 181 00:09:26,950 --> 00:09:29,280 Bitcoin or Ethereum or others. 182 00:09:29,280 --> 00:09:32,020 With 183 00:09:32,020 --> 00:09:34,890 those types of distributed ledgers, 184 00:09:34,890 --> 00:09:37,860 you're able to cryptographically prove that 185 00:09:37,860 --> 00:09:41,440 the history of that ledger is immutable, 186 00:09:41,440 --> 00:09:43,320 that it hasn't changed. 187 00:09:43,320 --> 00:09:46,030 And so here, we get the same ability, 188 00:09:46,030 --> 00:09:49,210 except that the Quantum Ledger Database is centralized, 189 00:09:49,210 --> 00:09:52,414 and the benefit of it being centralized rather than distributed, 190 00:09:52,414 --> 00:09:55,330 like we see with Bitcoin or Ethereum 191 00:09:55,330 --> 00:09:57,030 or other networks like that, 192 00:09:57,030 --> 00:09:59,150 is that it's going to operate much faster; 193 00:09:59,150 --> 00:10:01,476 there's no need to transfer the entire ledger 194 00:10:01,476 --> 00:10:04,830 to all of these nodes all over the internet. 
195 00:10:04,830 --> 00:10:07,393 It's all centralized in one place. 196 00:10:08,280 --> 00:10:12,575 And so again, we can validate the entire history 197 00:10:12,575 --> 00:10:16,170 of that ledger with a cryptographic hash, 198 00:10:16,170 --> 00:10:19,235 which means if anyone were to somehow 199 00:10:19,235 --> 00:10:22,460 modify a previous transaction, 200 00:10:22,460 --> 00:10:25,590 then the ultimate hash would be invalidated; 201 00:10:25,590 --> 00:10:27,163 it wouldn't match. 202 00:10:27,163 --> 00:10:30,090 So again, we have the ability to verify 203 00:10:30,090 --> 00:10:32,550 the entire history of transactions 204 00:10:32,550 --> 00:10:34,570 with the Quantum Ledger Database. 205 00:10:34,570 --> 00:10:38,360 And so those are our database options within AWS. 206 00:10:38,360 --> 00:10:40,956 You can see we have a number of really amazing options 207 00:10:40,956 --> 00:10:43,684 to choose from, according to the types of data 208 00:10:43,684 --> 00:10:46,910 and the way that we're using that data. 209 00:10:46,910 --> 00:10:50,170 And so we will go into many more details 210 00:10:50,170 --> 00:10:52,990 on each one of these, or most of these, services 211 00:10:52,990 --> 00:10:54,453 as the course progresses.
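The hash verification idea behind a ledger database can be sketched in a few lines: each transaction's hash folds in the hash of everything before it, so one final digest vouches for the whole history. This is a simplified illustration of the concept, not QLDB's actual journal format:

```python
# Append-only hash chain: the digest of each transaction incorporates
# the digest of all prior transactions, so a single stored hash can
# verify the entire history. Simplified sketch, not QLDB's journal.

import hashlib
import json

def chain_digest(transactions):
    """Fold every transaction, in order, into one running SHA-256 digest."""
    digest = b""
    for tx in transactions:
        payload = digest + json.dumps(tx, sort_keys=True).encode()
        digest = hashlib.sha256(payload).digest()
    return digest.hex()

ledger = [
    {"account": "a-1", "credit": 100},
    {"account": "a-1", "debit": 30},
]
original = chain_digest(ledger)

# Tampering with any earlier transaction changes every digest after it,
# so the recomputed final hash no longer matches the stored one.
tampered = list(ledger)
tampered[0] = {"account": "a-1", "credit": 999}
assert chain_digest(tampered) != original
```

Recomputing the chain and comparing it to the stored digest is the centralized analogue of the blockchain-style proof mentioned above, without distributing the ledger to many nodes.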