1 00:00:06,600 --> 00:00:09,310 - Now let's talk about storage class management 2 00:00:09,310 --> 00:00:11,240 with lifecycle rules. 3 00:00:11,240 --> 00:00:13,420 When we're storing data in S3, 4 00:00:13,420 --> 00:00:16,170 we need to keep in mind that there are several different 5 00:00:16,170 --> 00:00:18,130 storage classes to choose from 6 00:00:18,130 --> 00:00:20,280 and those storage classes will affect 7 00:00:21,500 --> 00:00:25,770 what we end up paying for that storage over time. 8 00:00:25,770 --> 00:00:27,610 The default storage class 9 00:00:27,610 --> 00:00:29,590 is called the Standard storage class 10 00:00:29,590 --> 00:00:33,680 and this will store upwards of six copies of our data 11 00:00:34,599 --> 00:00:37,680 redundantly across numerous availability zones. 12 00:00:37,680 --> 00:00:42,043 So we get a very high degree of durability in that data. 13 00:00:43,070 --> 00:00:46,580 And it's stored redundantly but it's stored in a way 14 00:00:46,580 --> 00:00:49,160 that enables very fast access 15 00:00:49,160 --> 00:00:51,530 so that when we do want to read that data, 16 00:00:51,530 --> 00:00:54,313 it's available to us to get very quickly. 17 00:00:55,500 --> 00:00:56,930 We can lower our costs 18 00:00:56,930 --> 00:00:59,730 by looking at the other storage classes. 19 00:00:59,730 --> 00:01:02,100 Now another storage class 20 00:01:02,100 --> 00:01:05,220 would be Standard-Infrequent Access. 21 00:01:05,220 --> 00:01:07,410 Now with Standard-Infrequent Access, 22 00:01:07,410 --> 00:01:12,410 we can lower our costs for data that we just don't look at 23 00:01:12,640 --> 00:01:14,270 quite as often. 24 00:01:14,270 --> 00:01:16,180 Now, the thing about Infrequent Access 25 00:01:16,180 --> 00:01:18,530 is that we still get the same amount of durability 26 00:01:18,530 --> 00:01:20,700 but it's just stored in a different way. 
27 00:01:20,700 --> 00:01:22,730 And it's written to storage 28 00:01:22,730 --> 00:01:26,700 that is optimized for infrequent access. 29 00:01:26,700 --> 00:01:30,160 So, there may be data that we don't read very often 30 00:01:30,160 --> 00:01:31,630 but when we do read it, 31 00:01:31,630 --> 00:01:34,763 we want to be able to get it synchronously at will. 32 00:01:35,870 --> 00:01:37,160 Another storage class 33 00:01:37,160 --> 00:01:41,080 would be the One Zone-Infrequent Access. 34 00:01:41,080 --> 00:01:42,370 And in this regard, 35 00:01:42,370 --> 00:01:44,453 we get a lower degree of durability. 36 00:01:45,830 --> 00:01:48,760 Remember that with Standard and Standard-Infrequent Access, 37 00:01:48,760 --> 00:01:52,190 the data is stored across numerous availability zones. 38 00:01:52,190 --> 00:01:56,900 With One Zone-Infrequent Access, as the name suggests, 39 00:01:56,900 --> 00:02:00,130 the data is being stored in one zone. 40 00:02:00,130 --> 00:02:02,360 And so we get less resiliency 41 00:02:02,360 --> 00:02:04,850 but we also get a much lower cost. 42 00:02:04,850 --> 00:02:07,290 And then lastly, as we've already talked about, 43 00:02:07,290 --> 00:02:08,780 we have Glacier. 44 00:02:08,780 --> 00:02:13,780 And when data is transitioned to the Glacier storage class, 45 00:02:14,130 --> 00:02:16,760 the data is moved into the Glacier service. 46 00:02:16,760 --> 00:02:18,260 But keep in mind 47 00:02:18,260 --> 00:02:22,520 that if you write data to S3 48 00:02:22,520 --> 00:02:26,060 and then you transition that data to Glacier, 49 00:02:26,060 --> 00:02:29,539 you have to go to S3 to get it out. 50 00:02:29,539 --> 00:02:32,320 You can't get the data directly from Glacier. 51 00:02:32,320 --> 00:02:35,880 You have to go to S3 and manage that data from within S3 52 00:02:35,880 --> 00:02:39,870 because that's where the data essentially originated from. 
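The point about going through S3 to get archived data back can be sketched in code. This is a minimal illustration, not something shown in the talk: it builds the keyword arguments that boto3's S3 `restore_object` call accepts. The helper name `build_restore_request`, the bucket name, and the object key are hypothetical, and the sketch only constructs the request so that it runs without AWS credentials.

```python
from typing import Any, Dict

# Retrieval tiers accepted in GlacierJobParameters.
ALLOWED_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(bucket: str, key: str, days: int = 7,
                          tier: str = "Standard") -> Dict[str, Any]:
    """Build keyword arguments for restoring an archived S3 object.

    The request is addressed to the S3 bucket and key, not to a Glacier
    vault, which mirrors the point made in the narration.
    """
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Bucket": bucket,
        "Key": key,
        "RestoreRequest": {
            # How long the temporary restored copy stays readable in S3.
            "Days": days,
            "GlacierJobParameters": {"Tier": tier},
        },
    }


# Hypothetical bucket and key, for illustration only.
restore_kwargs = build_restore_request("example-log-bucket", "logs/app.log.gz")
print(restore_kwargs)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("s3").restore_object(**restore_kwargs)`. The restore is asynchronous, so the object becomes readable some time after the call returns.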
53 00:02:39,870 --> 00:02:42,800 Right, so keep these storage classes in mind 54 00:02:42,800 --> 00:02:46,090 so that as you move forward using S3, 55 00:02:46,090 --> 00:02:50,990 we can lower our bill by optimizing the storage class 56 00:02:50,990 --> 00:02:54,550 for the particular types of data. 57 00:02:54,550 --> 00:02:58,100 Now another thing that we can take advantage of 58 00:02:58,100 --> 00:03:00,680 is what we call Lifecycle Rules. 59 00:03:00,680 --> 00:03:04,293 And so within S3, 60 00:03:05,870 --> 00:03:07,460 remember that within S3, 61 00:03:07,460 --> 00:03:09,610 we're writing objects into buckets. 62 00:03:09,610 --> 00:03:12,810 And so, we might have certain types of data 63 00:03:12,810 --> 00:03:15,220 and we might say for a particular type of data 64 00:03:15,220 --> 00:03:17,330 after seven days, 65 00:03:17,330 --> 00:03:20,540 then let's automatically transition that 66 00:03:20,540 --> 00:03:23,210 to Standard-Infrequent Access. 67 00:03:23,210 --> 00:03:28,210 So it could be maybe a log file, for example. 68 00:03:28,240 --> 00:03:32,140 A log file gets wrapped up and rotated 69 00:03:32,140 --> 00:03:34,820 and that log file gets shipped to S3. 70 00:03:34,820 --> 00:03:37,750 And perhaps some time during that first seven days, 71 00:03:37,750 --> 00:03:41,270 you may do a lot of analysis on that log file. 72 00:03:41,270 --> 00:03:43,710 And then after seven days, 73 00:03:43,710 --> 00:03:45,943 you may continue to do some analysis 74 00:03:45,943 --> 00:03:48,690 but maybe not as often or not as much. 75 00:03:48,690 --> 00:03:50,650 And so you want to lower your cost 76 00:03:50,650 --> 00:03:54,950 by transitioning it to Standard-Infrequent Access, 77 00:03:54,950 --> 00:03:56,780 but still have it available 78 00:03:56,780 --> 00:03:58,960 for when you do actually need it. 
79 00:03:58,960 --> 00:04:03,410 And so we may put another lifecycle rule in place 80 00:04:03,410 --> 00:04:05,780 and say after 30 days, 81 00:04:05,780 --> 00:04:08,660 transition to Glacier. 82 00:04:08,660 --> 00:04:10,740 And so some time in that first month, 83 00:04:10,740 --> 00:04:12,820 you may perform different types of analysis 84 00:04:12,820 --> 00:04:14,090 and pull that data in 85 00:04:14,090 --> 00:04:17,230 and do some type of processing on it. 86 00:04:17,230 --> 00:04:20,180 But then after those 30 days, 87 00:04:20,180 --> 00:04:24,360 you still need to keep it due to perhaps regulatory needs. 88 00:04:24,360 --> 00:04:27,530 But you're probably not going to do anything with it. 89 00:04:27,530 --> 00:04:30,900 And so by transitioning to Glacier, 90 00:04:30,900 --> 00:04:34,840 this particular object then essentially becomes 91 00:04:34,840 --> 00:04:39,840 an archive in a vault within Glacier, automatically. 92 00:04:40,490 --> 00:04:44,140 Just by putting these lifecycle rules in place, 93 00:04:44,140 --> 00:04:47,140 the S3 and Glacier systems work together 94 00:04:47,140 --> 00:04:51,780 to move that data around in order to satisfy these rules. 95 00:04:51,780 --> 00:04:54,040 And so the really great thing about that 96 00:04:54,040 --> 00:04:57,500 is that we don't have to write scripts 97 00:04:57,500 --> 00:04:59,980 and worry about the management on our own. 98 00:04:59,980 --> 00:05:02,540 All we have to do is put the rules in place 99 00:05:02,540 --> 00:05:07,443 and rely on AWS to transition that data for us. 100 00:05:08,600 --> 00:05:10,540 And then of course, 101 00:05:10,540 --> 00:05:11,920 we could do one more thing here 102 00:05:11,920 --> 00:05:16,920 and say perhaps after 2600 days or about seven years, 103 00:05:18,200 --> 00:05:22,050 permanently delete that data. 
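The three rules just described (Standard-Infrequent Access, then Glacier, then deletion) can be written down as a single lifecycle configuration. The following is a sketch under stated assumptions rather than the exact setup from the talk: it builds the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts. The function name, the rule ID `log-archival`, and the `logs/` prefix are hypothetical. The day counts are the ones from the narration.

```python
import json
from typing import Any, Dict


def build_lifecycle_configuration(prefix: str, ia_days: int,
                                  glacier_days: int,
                                  expire_days: int) -> Dict[str, Any]:
    """Build one lifecycle rule: Standard-IA, then Glacier, then delete."""
    # Each stage must come strictly later than the one before it.
    if not (0 < ia_days < glacier_days < expire_days):
        raise ValueError("expected 0 < ia_days < glacier_days < expire_days")
    rule = {
        "ID": "log-archival",            # hypothetical rule name
        "Filter": {"Prefix": prefix},    # apply only to keys under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_days},  # permanent deletion
    }
    return {"Rules": [rule]}


# Day counts taken from the narration: 7, 30, and 2600.
config = build_lifecycle_configuration("logs/", 7, 30, 2600)
print(json.dumps(config, indent=2))
```

With boto3, this could be applied as `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="example-log-bucket", LifecycleConfiguration=config)`, where the bucket name is a placeholder. One caveat: the S3 documentation lists a 30-day minimum before an object can transition to Standard-IA, so a real bucket may reject the 7-day value. Check the current limits before applying these numbers.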
104 00:05:22,050 --> 00:05:24,820 And so once we put these lifecycle rules in place, 105 00:05:24,820 --> 00:05:25,990 as time goes on, 106 00:05:25,990 --> 00:05:29,150 as objects are written and transitioned and deleted, 107 00:05:29,150 --> 00:05:33,050 we can rest assured knowing that we've optimized 108 00:05:33,050 --> 00:05:35,600 our usage of these different storage classes 109 00:05:35,600 --> 00:05:39,010 in order to optimize our costs 110 00:05:39,010 --> 00:05:42,467 for storing large amounts of data within AWS.
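To make the timeline concrete, here is a small illustrative model of where an object would sit at a given age under the rules from the narration. It is a plain Python sketch and does not call AWS. The function and constant names are hypothetical. Real lifecycle transitions run asynchronously, so the exact day an object moves may lag behind these thresholds.

```python
from typing import Optional

# Thresholds from the narration, in days since the object was written.
IA_AFTER_DAYS = 7
GLACIER_AFTER_DAYS = 30
DELETE_AFTER_DAYS = 2600


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class an object of this age would be in.

    Returns None once the object has passed the deletion threshold.
    """
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if age_days >= DELETE_AFTER_DAYS:
        return None
    if age_days >= GLACIER_AFTER_DAYS:
        return "GLACIER"
    if age_days >= IA_AFTER_DAYS:
        return "STANDARD_IA"
    return "STANDARD"


for age in (0, 7, 30, 2600):
    print(age, storage_class_at(age))
```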