1 00:00:06,451 --> 00:00:08,670 - Now let's take a look at an overview 2 00:00:08,670 --> 00:00:11,120 of all the options we have for storage 3 00:00:11,120 --> 00:00:13,370 within Amazon Web Services. 4 00:00:14,829 --> 00:00:16,778 We have a number of different storage options 5 00:00:16,778 --> 00:00:18,550 in Amazon Web Services, 6 00:00:18,550 --> 00:00:19,989 and let's just start here on the left 7 00:00:19,989 --> 00:00:23,067 with what we call the instance store. 8 00:00:23,067 --> 00:00:25,938 The instance store refers to 9 00:00:25,938 --> 00:00:30,488 block storage that are, essentially what we could say is, 10 00:00:30,488 --> 00:00:33,890 are built in to the EC2 instance. 11 00:00:33,890 --> 00:00:36,417 A number of different EC2 instance types 12 00:00:36,417 --> 00:00:39,884 come with a different type of storage. 13 00:00:39,884 --> 00:00:42,336 It could be a small 30 gig volume, 14 00:00:42,336 --> 00:00:45,840 it could be several terabytes worth of volumes, 15 00:00:45,840 --> 00:00:50,459 and these are, now we say, a free, sort of highlighted here 16 00:00:50,459 --> 00:00:53,378 because the cost of that storage is included 17 00:00:53,378 --> 00:00:54,798 with the price of the machine, 18 00:00:54,798 --> 00:00:57,378 so you're not paying anything additional 19 00:00:57,378 --> 00:00:59,377 to the price of the machine for that storage, 20 00:00:59,377 --> 00:01:01,147 it just comes with it. 21 00:01:01,147 --> 00:01:05,307 Now, these drives within the instance store 22 00:01:05,307 --> 00:01:07,138 are considered ephemeral, 23 00:01:07,138 --> 00:01:12,036 meaning that when that EC2 instance terminates, or stops, 24 00:01:12,036 --> 00:01:15,470 then whatever data is on that particular instance store 25 00:01:15,470 --> 00:01:16,719 will be gone. 
26 00:01:16,719 --> 00:01:19,410 And that's because the instance store 27 00:01:19,410 --> 00:01:22,637 is backed by the volumes that are present 28 00:01:22,637 --> 00:01:25,949 on the physical host of that virtual machine. 29 00:01:25,949 --> 00:01:27,705 So, when you stop a virtual machine, 30 00:01:27,705 --> 00:01:29,255 or terminate a virtual machine, 31 00:01:29,255 --> 00:01:31,022 those stores will go away. 32 00:01:31,022 --> 00:01:33,851 Even if you were to stop a machine and start it again, 33 00:01:33,851 --> 00:01:37,852 it's most likely, probably almost even guaranteed 34 00:01:37,852 --> 00:01:40,441 that that instance will come back 35 00:01:40,441 --> 00:01:42,444 on a different physical host 36 00:01:42,444 --> 00:01:44,899 and so that underlying storage subsystem 37 00:01:44,899 --> 00:01:46,449 is no longer available. 38 00:01:46,449 --> 00:01:50,159 And so, that's why we consider the instance store ephemeral. 39 00:01:50,159 --> 00:01:51,507 It's really important to know that 40 00:01:51,507 --> 00:01:53,148 so that if you have applications 41 00:01:53,148 --> 00:01:54,842 that are writing to that, 42 00:01:54,842 --> 00:01:57,763 then if you lose that machine for whatever reason, 43 00:01:57,763 --> 00:01:59,863 if it fails, if you terminate it, 44 00:01:59,863 --> 00:02:02,404 if auto-scaling pulls that machine back in 45 00:02:02,404 --> 00:02:03,564 and gets rid of it, 46 00:02:03,564 --> 00:02:05,535 then that storage will go away. 47 00:02:05,535 --> 00:02:08,525 We do not have the ability to take snapshots, 48 00:02:08,525 --> 00:02:12,314 so there's no snapshot capabilities of the instance store. 49 00:02:12,314 --> 00:02:16,245 If you want to use that for its high IOPS capability, 50 00:02:16,245 --> 00:02:17,955 then backups will be something 51 00:02:17,955 --> 00:02:20,374 that you will have to do on your own. 
52 00:02:20,374 --> 00:02:23,414 Now, that brings us to another form of block storage, 53 00:02:23,414 --> 00:02:27,219 which is Amazon EBS, the Elastic Block Store service. 54 00:02:27,219 --> 00:02:29,674 Now, the nice thing about the service is that 55 00:02:29,674 --> 00:02:31,772 these volumes, these disks, 56 00:02:31,772 --> 00:02:34,886 are independent from the machine itself. 57 00:02:34,886 --> 00:02:38,047 Typically what we would do is launch an EC2 instance 58 00:02:38,047 --> 00:02:42,214 and then we would attach an EBS volume to that instance 59 00:02:43,355 --> 00:02:46,543 so that we can have a place to store our data. 60 00:02:46,543 --> 00:02:50,710 But, because this device is independent from the machine, 61 00:02:51,947 --> 00:02:54,527 if we configure things properly, 62 00:02:54,527 --> 00:02:57,790 we could potentially, you know, lose this EC2 instance, 63 00:02:57,790 --> 00:03:01,177 but the data that's on this EBS volume 64 00:03:01,177 --> 00:03:02,695 will still be there 65 00:03:02,695 --> 00:03:06,256 because it's independent from the life of the EC2 instance. 66 00:03:06,256 --> 00:03:08,923 Now, it's important to know that 67 00:03:10,296 --> 00:03:12,494 as far as EBS goes, 68 00:03:12,494 --> 00:03:16,577 what we pay for is allocated storage per month, 69 00:03:18,035 --> 00:03:21,952 so if we allocate one terabyte for that volume, 70 00:03:24,090 --> 00:03:26,655 but we only use one meg, 71 00:03:26,655 --> 00:03:28,262 we're still paying for the one terabyte 72 00:03:28,262 --> 00:03:32,979 because that's how much we've allocated for the month, 73 00:03:32,979 --> 00:03:36,668 or that's how much we've allocated for that drive. 
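The point about paying for allocated rather than used storage can be worked through with a little arithmetic. This is only a sketch: the per-gigabyte rate below is a made-up placeholder, not a price from the video or from AWS, and the function name is hypothetical.

```python
# Hypothetical rate in USD per GB-month, chosen only to make the
# arithmetic concrete. It is NOT an actual AWS price.
HYPOTHETICAL_RATE_PER_GB_MONTH = 0.10


def ebs_monthly_cost(allocated_gb: float, rate_per_gb_month: float) -> float:
    """Monthly cost driven by the allocated size, not by bytes written."""
    return allocated_gb * rate_per_gb_month


allocated_gb = 1024      # one terabyte allocated for the volume
used_gb = 1 / 1024       # roughly one megabyte actually written

# Usage never enters the calculation: only the allocation is billed.
cost = ebs_monthly_cost(allocated_gb, HYPOTHETICAL_RATE_PER_GB_MONTH)
print(f"Allocated {allocated_gb} GB, used {used_gb:.6f} GB, billed ${cost:.2f}")
```

Whatever the real rate is, the shape of the calculation is the same: sizing the volume closer to what you actually need is what lowers the bill.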
74 00:03:36,668 --> 00:03:39,597 Now, with EBS volumes we get 75 00:03:39,597 --> 00:03:42,847 a maximum of 16 terabytes per volume, 76 00:03:43,850 --> 00:03:46,028 and when we say that they're durable, 77 00:03:46,028 --> 00:03:47,111 it means that 78 00:03:47,949 --> 00:03:51,699 the EBS system, inherently behind the scenes will 79 00:03:51,699 --> 00:03:55,968 replicate the data to numerous devices within a cluster 80 00:03:55,968 --> 00:03:57,908 in one availability zone. 81 00:03:57,908 --> 00:04:01,152 So EBS is, to some degree, already redundant 82 00:04:01,152 --> 00:04:03,500 and it's resilient to the loss of a device 83 00:04:03,500 --> 00:04:04,920 within that cluster. 84 00:04:04,920 --> 00:04:09,087 EBS is not resilient to the loss of a data center, 85 00:04:09,996 --> 00:04:13,818 EBS volumes are created per availability zone, 86 00:04:13,818 --> 00:04:16,394 they don't span the entire region. 87 00:04:16,394 --> 00:04:20,927 So, we do have the ability to create snapshots, 88 00:04:20,927 --> 00:04:25,094 and those snapshots are then stored over here in S3. 89 00:04:27,103 --> 00:04:30,939 So, S3 is what we would consider object storage, 90 00:04:30,939 --> 00:04:33,375 and we're gonna talk about the difference between those two 91 00:04:33,375 --> 00:04:35,556 here in just a little bit. 92 00:04:35,556 --> 00:04:38,139 So, with S3 we can get storage, 93 00:04:39,021 --> 00:04:40,830 at least at the time of this video, 94 00:04:40,830 --> 00:04:44,223 storage was somewhere around 3 cents per gig per month. 95 00:04:44,223 --> 00:04:47,269 That's an incredibly low price for storage, 96 00:04:47,269 --> 00:04:49,498 and we might describe 97 00:04:49,498 --> 00:04:53,665 Amazon's Simple Storage Service as write once, read many, 98 00:04:55,280 --> 00:05:00,211 we might also describe S3 as storage for the internet. 
99 00:05:00,211 --> 00:05:04,048 So, we say that because S3 is a really great place 100 00:05:04,048 --> 00:05:06,837 to store things that need to be retrieved 101 00:05:06,837 --> 00:05:10,745 from millions of users out there across the internet. 102 00:05:10,745 --> 00:05:13,512 So we could put images that are meant for a website, 103 00:05:13,512 --> 00:05:16,012 we could put software patches, 104 00:05:16,985 --> 00:05:19,463 PDF documents, videos, 105 00:05:19,463 --> 00:05:21,843 all kinds of things can be stored in S3, 106 00:05:21,843 --> 00:05:25,143 and then we can send our users directly to S3, 107 00:05:25,143 --> 00:05:26,800 bypassing our own systems 108 00:05:26,800 --> 00:05:28,640 and reducing the load on our systems 109 00:05:28,640 --> 00:05:31,290 and just letting S3 handle the delivery 110 00:05:31,290 --> 00:05:33,928 of that content to our users. 111 00:05:33,928 --> 00:05:37,469 It's also important to know that all transfers use HTTP, 112 00:05:37,469 --> 00:05:42,136 that is the protocol for Amazon Simple Storage Service. 113 00:05:42,136 --> 00:05:44,156 Now, the last option that I wanna talk about 114 00:05:44,156 --> 00:05:46,180 with AWS storage options 115 00:05:46,180 --> 00:05:48,898 is what's called the AWS Storage Gateway. 116 00:05:48,898 --> 00:05:51,912 This is a really interesting service in that 117 00:05:51,912 --> 00:05:56,307 ultimately what it is is a virtual machine, 118 00:05:56,307 --> 00:05:58,687 this is published by Amazon Web Services, 119 00:05:58,687 --> 00:06:02,128 and it's designed to be run on-premises, 120 00:06:02,128 --> 00:06:04,029 on an on-premises appliance 121 00:06:04,029 --> 00:06:06,112 using VMware or Hyper-V, 122 00:06:07,068 --> 00:06:08,557 and so you run this in your own 123 00:06:08,557 --> 00:06:12,786 local on-premises data center or co-location, what have you. 124 00:06:12,786 --> 00:06:14,703 And it exposes a device 125 00:06:16,356 --> 00:06:19,427 that you can use as local storage. 
126 00:06:19,427 --> 00:06:22,106 All of the data that is stored there 127 00:06:22,106 --> 00:06:26,845 will be also backed up, encrypted in S3 or Glacier. 128 00:06:26,845 --> 00:06:28,724 You can configure it 129 00:06:28,724 --> 00:06:30,945 to use either one of those services. 130 00:06:30,945 --> 00:06:32,733 If you use S3, 131 00:06:32,733 --> 00:06:36,933 we can use some other mechanisms that we'll talk about, 132 00:06:36,933 --> 00:06:40,213 like lifecycle rules to get those files 133 00:06:40,213 --> 00:06:42,482 from S3 to Glacier in another way. 134 00:06:42,482 --> 00:06:46,804 The storage gateway comes in three possible configurations. 135 00:06:46,804 --> 00:06:50,014 One of them that we call the gateway-cached volume, 136 00:06:50,014 --> 00:06:52,383 means that the great majority of your data 137 00:06:52,383 --> 00:06:54,684 is stored directly in S3, 138 00:06:54,684 --> 00:06:57,706 but frequently-accessed data is stored locally, 139 00:06:57,706 --> 00:07:00,306 that's the gateway-cached volume. 140 00:07:00,306 --> 00:07:02,099 The gateway-stored volume 141 00:07:02,099 --> 00:07:06,107 means that we're storing all of our data locally, 142 00:07:06,107 --> 00:07:08,889 but it takes regular point-in-time snapshots 143 00:07:08,889 --> 00:07:11,989 and uploads those snapshots to S3, 144 00:07:11,989 --> 00:07:13,910 and from there, if we wanted to, 145 00:07:13,910 --> 00:07:17,827 we could rebuild those snapshots as EBS volumes 146 00:07:19,158 --> 00:07:23,055 and attach those to virtual machines within EC2. 147 00:07:23,055 --> 00:07:26,774 The last one is what's called a virtual tape library 148 00:07:26,774 --> 00:07:29,143 that exposes an iSCSI interface, 149 00:07:29,143 --> 00:07:31,692 and it essentially looks just like tape, 150 00:07:31,692 --> 00:07:34,822 but each virtual tape is stored 151 00:07:34,822 --> 00:07:37,213 directly in S3, or Glacier. 
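To keep the three Storage Gateway configurations straight, it can help to lay them out side by side as data. The sketch below is purely illustrative: the class and field names are invented here and are not part of any AWS SDK, and the descriptions simply restate what was said about each configuration.

```python
from dataclasses import dataclass
from enum import Enum


class PrimaryCopy(Enum):
    """Where the main copy of the data lives (illustrative labels only)."""
    LOCAL = "on-premises"
    S3 = "Amazon S3"
    S3_OR_GLACIER = "Amazon S3 or Glacier"


@dataclass(frozen=True)
class GatewayConfiguration:
    name: str
    primary_copy: PrimaryCopy
    kept_locally: str
    sent_to_aws: str


# Hypothetical summary table of the three configurations described above.
CONFIGURATIONS = [
    GatewayConfiguration(
        "gateway-cached volume", PrimaryCopy.S3,
        kept_locally="frequently accessed data",
        sent_to_aws="the great majority of the data",
    ),
    GatewayConfiguration(
        "gateway-stored volume", PrimaryCopy.LOCAL,
        kept_locally="all of the data",
        sent_to_aws="regular point-in-time snapshots",
    ),
    GatewayConfiguration(
        "virtual tape library", PrimaryCopy.S3_OR_GLACIER,
        kept_locally="an iSCSI tape interface",
        sent_to_aws="each virtual tape",
    ),
]


def configurations_with_full_local_copy(configs):
    """Names of configurations that keep the complete data set on-premises."""
    return [c.name for c in configs if c.primary_copy is PrimaryCopy.LOCAL]


for c in CONFIGURATIONS:
    print(f"{c.name}: primary copy in {c.primary_copy.value}; "
          f"local: {c.kept_locally}; to AWS: {c.sent_to_aws}")
```

Seen this way, the main question when choosing a configuration is where the primary copy should live: on-premises with snapshots going to S3, or in S3 with a local cache.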
152 00:07:37,213 --> 00:07:39,929 So, for those instances where you have 153 00:07:39,929 --> 00:07:41,779 a lot of data on premises 154 00:07:41,779 --> 00:07:43,449 and you have a lot of heavy workload, 155 00:07:43,449 --> 00:07:47,139 you're not ready to move all of that stuff to the Cloud yet, 156 00:07:47,139 --> 00:07:49,130 or perhaps there are things that you 157 00:07:49,130 --> 00:07:50,741 never plan to migrate, 158 00:07:50,741 --> 00:07:52,530 you're gonna leave it back on premises, 159 00:07:52,530 --> 00:07:54,410 but you want a good DR solution, 160 00:07:54,410 --> 00:07:56,387 then the AWS Storage Gateway 161 00:07:56,387 --> 00:07:58,636 is a really good option to consider. 162 00:07:58,636 --> 00:08:02,519 So, those at a high level, those are our storage options 163 00:08:02,519 --> 00:08:04,769 within Amazon Web Services.