1 00:00:07,120 --> 00:00:12,120 - While AWS is built of many individual building blocks 2 00:00:13,410 --> 00:00:17,980 and a lot of them can be used atomically on their own, 3 00:00:17,980 --> 00:00:20,829 it's really designed as an ecosystem 4 00:00:20,829 --> 00:00:25,360 and meant to be used as that ecosystem. 5 00:00:25,360 --> 00:00:27,700 And there's a great example 6 00:00:27,700 --> 00:00:31,651 of how these services synergize together 7 00:00:31,651 --> 00:00:35,440 using an elastic load balancer, 8 00:00:35,440 --> 00:00:36,990 EC2 instances 9 00:00:38,513 --> 00:00:39,410 and then the auto scaling service 10 00:00:39,410 --> 00:00:42,010 that we're gonna be talking about now. 11 00:00:42,010 --> 00:00:46,400 So the auto scaling for EC2 is AZ scoped 12 00:00:46,400 --> 00:00:48,500 just like the EC2 instances 13 00:00:48,500 --> 00:00:51,570 that it's going to launch and destroy on your behalf. 14 00:00:51,570 --> 00:00:55,305 However, it also supports multiple AZs 15 00:00:55,305 --> 00:00:58,280 similar to load balancers 16 00:00:58,280 --> 00:01:02,313 that you might be associating your EC2 instances with. 17 00:01:03,440 --> 00:01:08,270 And the purpose of auto scaling is horizontal scaling 18 00:01:08,270 --> 00:01:10,425 for your EC2 instances. 19 00:01:10,425 --> 00:01:13,200 So what is horizontal scaling? 20 00:01:13,200 --> 00:01:14,470 Well, in order to understand that 21 00:01:14,470 --> 00:01:17,320 let's first explain vertical scaling. 22 00:01:17,320 --> 00:01:22,320 Vertical scaling is when you add or remove resources 23 00:01:22,360 --> 00:01:26,302 from a single EC2 instance, like resizing it 24 00:01:26,302 --> 00:01:31,302 or adding an extra EBS volume or upsizing a volume. 25 00:01:32,323 --> 00:01:34,800 Many of those types of operations 26 00:01:34,800 --> 00:01:38,453 are going to generate some sort of outage. 27 00:01:39,580 --> 00:01:41,850 Horizontal scaling's a little bit different. 
28 00:01:41,850 --> 00:01:45,120 This is where you add and remove discrete instances 29 00:01:45,120 --> 00:01:48,400 because they're all performing similar tasks. 30 00:01:48,400 --> 00:01:51,430 And so, from an infrastructure perspective, 31 00:01:51,430 --> 00:01:55,290 say we have a two-tier application: a load balancer, 32 00:01:55,290 --> 00:01:57,883 and then an auto scaled group of EC2 instances. 33 00:01:59,010 --> 00:02:04,010 When the inbound traffic or load or latency 34 00:02:04,520 --> 00:02:07,860 or whatever metric we're using as a KPI 35 00:02:07,860 --> 00:02:10,994 increases beyond a certain threshold, 36 00:02:10,994 --> 00:02:15,450 we can then add EC2 instances to address that 37 00:02:15,450 --> 00:02:18,163 and reduce it back to a tolerable level. 38 00:02:19,540 --> 00:02:23,850 But auto scaling is an elastic feature. 39 00:02:23,850 --> 00:02:26,380 It's not just for scaling out. 40 00:02:26,380 --> 00:02:28,310 It is also for scaling in. 41 00:02:28,310 --> 00:02:33,150 And so when that traffic load decreases 42 00:02:33,150 --> 00:02:35,852 below a lower threshold, 43 00:02:35,852 --> 00:02:37,890 you can remove instances. 44 00:02:37,890 --> 00:02:42,383 And so it is a great cost optimization feature, 45 00:02:43,220 --> 00:02:47,343 in addition to being performant and efficient. 46 00:02:49,468 --> 00:02:53,569 In order to design an auto scaled infrastructure, 47 00:02:53,569 --> 00:02:56,428 we need to talk a little bit about a scaling plan. 48 00:02:56,428 --> 00:03:00,850 And a scaling plan starts with a scaling strategy. 49 00:03:00,850 --> 00:03:03,430 A scaling strategy is a spectrum. 50 00:03:03,430 --> 00:03:06,060 On one end, we have availability. 51 00:03:06,060 --> 00:03:08,670 If you're optimizing for availability, 52 00:03:08,670 --> 00:03:13,670 you are going to want to have as many instances as possible. 53 00:03:13,807 --> 00:03:17,649 On the other end of the spectrum, we have cost.
54 00:03:17,649 --> 00:03:20,470 And that's the trade-off between the two. 55 00:03:20,470 --> 00:03:22,030 If you wanna have more instances 56 00:03:22,030 --> 00:03:24,680 to handle all that traffic all the time, 57 00:03:24,680 --> 00:03:26,760 you're gonna have to pay more for it. 58 00:03:26,760 --> 00:03:29,068 And so if you are trying to optimize for cost, 59 00:03:29,068 --> 00:03:33,926 you are gonna want to deploy the fewest instances required 60 00:03:33,926 --> 00:03:36,010 to get the job done. 61 00:03:36,010 --> 00:03:40,300 And many organizations end up somewhere in the middle 62 00:03:40,300 --> 00:03:41,810 on this spectrum, 63 00:03:41,810 --> 00:03:44,490 optimizing for cost, 64 00:03:44,490 --> 00:03:47,780 but recognizing that the auto scaling service 65 00:03:47,780 --> 00:03:50,790 does not scale instantaneously, 66 00:03:50,790 --> 00:03:52,110 and so it might be a good idea 67 00:03:52,110 --> 00:03:55,228 to have a few extra resources on hand. 68 00:03:55,228 --> 00:03:59,686 Now, this scaling plan is what's defined by the customer, 69 00:03:59,686 --> 00:04:02,926 and this can be translated directly 70 00:04:02,926 --> 00:04:07,065 into parameters and configuration for auto scaling: 71 00:04:07,065 --> 00:04:10,950 rules and limits for the minimum number of instances, 72 00:04:10,950 --> 00:04:14,449 the maximum number of instances, and so forth. 73 00:04:14,449 --> 00:04:19,449 And then AWS will take that configuration and those parameters 74 00:04:20,370 --> 00:04:22,680 and turn that into a combination 75 00:04:22,680 --> 00:04:25,660 of dynamic and predictive scaling. 76 00:04:25,660 --> 00:04:28,373 And we're gonna talk about those two terms coming up. 77 00:04:29,366 --> 00:04:31,960 The auto scaling architecture 78 00:04:31,960 --> 00:04:34,400 is broken up into several pieces. 79 00:04:34,400 --> 00:04:37,830 The first of these is not specifically part 80 00:04:37,830 --> 00:04:39,080 of the auto scaling service.
81 00:04:39,080 --> 00:04:41,080 It's actually part of the EC2 service 82 00:04:41,080 --> 00:04:43,270 and it's called a launch template. 83 00:04:43,270 --> 00:04:48,166 The launch template is basically a set of default values 84 00:04:48,166 --> 00:04:51,526 used to launch an EC2 instance. 85 00:04:51,526 --> 00:04:55,420 And so the launch template is gonna answer the question of: 86 00:04:55,420 --> 00:04:56,653 what am I gonna launch? 87 00:04:58,170 --> 00:05:00,550 The auto scaling group itself 88 00:05:00,550 --> 00:05:03,780 is the sticky glue that holds the whole service together. 89 00:05:03,780 --> 00:05:07,173 This is what defines limits and associations: 90 00:05:07,173 --> 00:05:10,210 minimum instances, maximum instances, 91 00:05:10,210 --> 00:05:13,373 associations with target groups, and so on. 92 00:05:15,850 --> 00:05:20,010 Scaling policies are gonna determine when 93 00:05:20,010 --> 00:05:24,180 or under which conditions you launch or destroy resources 94 00:05:24,180 --> 00:05:26,333 according to CloudWatch metrics. 95 00:05:27,990 --> 00:05:32,990 And finally, scheduled actions also define when you scale, 96 00:05:33,560 --> 00:05:36,110 but according to a calendar or clock 97 00:05:36,110 --> 00:05:38,140 instead of specific metrics, 98 00:05:38,140 --> 00:05:40,888 and they can be used in conjunction 99 00:05:40,888 --> 00:05:43,583 with the auto scaling policies.
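
The pieces described above (group limits, a metric-driven scaling policy, and a clock-driven scheduled action) can be sketched as a small, self-contained Python model. This is only an illustration under stated assumptions, not the AWS API: the class names `GroupLimits`, `ThresholdPolicy`, and `ScheduledAction`, the function `next_capacity`, and the threshold values are all hypothetical. Real scaling policies and scheduled actions have more options than this sketch shows.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class GroupLimits:
    """Hypothetical stand-in for the group's minimum and maximum instance counts."""
    min_size: int
    max_size: int

    def clamp(self, desired: int) -> int:
        # Never go below the minimum or above the maximum.
        return max(self.min_size, min(self.max_size, desired))


@dataclass(frozen=True)
class ThresholdPolicy:
    """Hypothetical policy: scale out above `high`, scale in below `low`."""
    high: float
    low: float
    step: int = 1

    def desired(self, current: int, metric: float) -> int:
        if metric > self.high:
            return current + self.step  # scale out
        if metric < self.low:
            return current - self.step  # scale in
        return current                  # metric is in the tolerable band


@dataclass(frozen=True)
class ScheduledAction:
    """Hypothetical clock-driven rule: a capacity floor during a daily window."""
    start: time
    end: time
    min_capacity: int

    def applies(self, now: time) -> bool:
        return self.start <= now < self.end


def next_capacity(current: int, metric: float, now: time,
                  limits: GroupLimits, policy: ThresholdPolicy,
                  schedule: ScheduledAction) -> int:
    """Combine the metric policy and the schedule, then apply the group limits."""
    desired = policy.desired(current, metric)
    if schedule.applies(now):
        desired = max(desired, schedule.min_capacity)
    return limits.clamp(desired)


# Assumed example values: 2 to 6 instances, thresholds at 30 and 70
# (for example, average CPU percent), and a floor of 4 during business hours.
limits = GroupLimits(min_size=2, max_size=6)
policy = ThresholdPolicy(high=70.0, low=30.0)
schedule = ScheduledAction(start=time(9), end=time(17), min_capacity=4)

print(next_capacity(2, 85.0, time(3), limits, policy, schedule))
print(next_capacity(2, 50.0, time(10), limits, policy, schedule))
```

In this sketch the scheduled action and the metric policy work together: the schedule raises the floor during a known busy window, the policy reacts to the metric, and the group limits bound the result in both directions.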