1
00:00:07,120 --> 00:00:12,120
- While AWS is built of many
individual building blocks

2
00:00:13,410 --> 00:00:17,980
and a lot of them can be
used atomically on their own,

3
00:00:17,980 --> 00:00:20,829
it's really designed as an ecosystem

4
00:00:20,829 --> 00:00:25,360
and meant to be used as that ecosystem.

5
00:00:25,360 --> 00:00:27,700
And there's a great example

6
00:00:27,700 --> 00:00:31,651
of how these services synergize together

7
00:00:31,651 --> 00:00:35,440
using an Elastic Load Balancer,

8
00:00:35,440 --> 00:00:36,990
EC2 instances

9
00:00:38,513 --> 00:00:39,410
and then the auto scaling service

10
00:00:39,410 --> 00:00:42,010
that we're gonna be talking about now.

11
00:00:42,010 --> 00:00:46,400
So the auto scaling for EC2 is AZ scoped

12
00:00:46,400 --> 00:00:48,500
just like the EC2 instances

13
00:00:48,500 --> 00:00:51,570
that it's going to launch
and destroy on your behalf.

14
00:00:51,570 --> 00:00:55,305
However, it also supports multiple AZs

15
00:00:55,305 --> 00:00:58,280
similar to load balancers

16
00:00:58,280 --> 00:01:02,313
that you might be associating
your EC2 instances with.

17
00:01:03,440 --> 00:01:08,270
And the purpose of auto
scaling is horizontal scaling

18
00:01:08,270 --> 00:01:10,425
for your EC2 instances.

19
00:01:10,425 --> 00:01:13,200
So what is horizontal scaling?

20
00:01:13,200 --> 00:01:14,470
Well, in order to understand that,

21
00:01:14,470 --> 00:01:17,320
let's first explain vertical scaling.

22
00:01:17,320 --> 00:01:22,320
Vertical scaling is when
you add or remove resources

23
00:01:22,360 --> 00:01:26,302
from a single EC2
instance, like resizing it

24
00:01:26,302 --> 00:01:31,302
or adding an extra EBS
volume or upsizing a volume.

25
00:01:32,323 --> 00:01:34,800
Many of those types of operations

26
00:01:34,800 --> 00:01:38,453
are going to generate some sort of outage.

27
00:01:39,580 --> 00:01:41,850
Horizontal scaling's a
little bit different.

28
00:01:41,850 --> 00:01:45,120
This is where you add and
remove discrete instances

29
00:01:45,120 --> 00:01:48,400
because they're all
performing similar tasks.

30
00:01:48,400 --> 00:01:51,430
And so from an infrastructure perspective

31
00:01:51,430 --> 00:01:55,290
where we have a two-tier
application, load balancer,

32
00:01:55,290 --> 00:01:57,883
and then an auto scaled
group of EC2 instances.

33
00:01:59,010 --> 00:02:04,010
When the inbound traffic
or load or latency

34
00:02:04,520 --> 00:02:07,860
or whatever metric we're using as a KPI,

35
00:02:07,860 --> 00:02:10,994
when that increases
beyond a certain threshold

36
00:02:10,994 --> 00:02:15,450
we can then add EC2
instances to address that

37
00:02:15,450 --> 00:02:18,163
and reduce it back to a tolerable level.

38
00:02:19,540 --> 00:02:23,850
But auto scaling is an elastic feature.

39
00:02:23,850 --> 00:02:26,380
It's not just for scaling out.

40
00:02:26,380 --> 00:02:28,310
It is also for scaling in.

41
00:02:28,310 --> 00:02:33,150
And so when that traffic load decreases

42
00:02:33,150 --> 00:02:35,852
down below other levels,

43
00:02:35,852 --> 00:02:37,890
you can remove instances.

44
00:02:37,890 --> 00:02:42,383
And so it is a great cost
optimization feature,

45
00:02:43,220 --> 00:02:47,343
in addition to being
performant and efficient.

46
00:02:49,468 --> 00:02:53,569
In order to design an
auto scaled infrastructure

47
00:02:53,569 --> 00:02:56,428
we need to talk a little
bit about a scaling plan.

48
00:02:56,428 --> 00:03:00,850
And a scaling plan starts
with a scaling strategy.

49
00:03:00,850 --> 00:03:03,430
A scaling strategy is a spectrum.

50
00:03:03,430 --> 00:03:06,060
On one end, we have availability.

51
00:03:06,060 --> 00:03:08,670
If you're optimizing for availability

52
00:03:08,670 --> 00:03:13,670
you are going to want to have
as many instances as possible.

53
00:03:13,807 --> 00:03:17,649
On the other end of the
spectrum, we have cost.

54
00:03:17,649 --> 00:03:20,470
And that's the trade-off between the two.

55
00:03:20,470 --> 00:03:22,030
If you wanna have more instances

56
00:03:22,030 --> 00:03:24,680
to handle all that traffic all the time,

57
00:03:24,680 --> 00:03:26,760
you're gonna have to pay more for it.

58
00:03:26,760 --> 00:03:29,068
And so if you are trying
to optimize for cost

59
00:03:29,068 --> 00:03:33,926
you are gonna want to deploy
the fewest instances required

60
00:03:33,926 --> 00:03:36,010
to get the job done.

61
00:03:36,010 --> 00:03:40,300
And many organizations end
up somewhere in the middle

62
00:03:40,300 --> 00:03:41,810
on this spectrum

63
00:03:41,810 --> 00:03:44,490
with optimizing for cost,

64
00:03:44,490 --> 00:03:47,780
but recognizing that
the auto scaling service

65
00:03:47,780 --> 00:03:50,790
does not scale instantaneously

66
00:03:50,790 --> 00:03:52,110
and so it might be a good idea

67
00:03:52,110 --> 00:03:55,228
to have a few extra resources on hand.

68
00:03:55,228 --> 00:03:59,686
Now, this scaling plan is
what's defined by the customer

69
00:03:59,686 --> 00:04:02,926
and this can be translated directly

70
00:04:02,926 --> 00:04:07,065
into parameters and
configuration for auto scaling.

71
00:04:07,065 --> 00:04:10,950
Rules and limits for the
minimum number of instances,

72
00:04:10,950 --> 00:04:14,449
the maximum number of
instances, and so forth.

73
00:04:14,449 --> 00:04:19,449
And then AWS will take that
configuration and parameters

74
00:04:20,370 --> 00:04:22,680
and turn that into a combination

75
00:04:22,680 --> 00:04:25,660
of dynamic and predictive scaling.

76
00:04:25,660 --> 00:04:28,373
And we're gonna talk about
those two terms coming up.

77
00:04:29,366 --> 00:04:31,960
The auto scaling architecture

78
00:04:31,960 --> 00:04:34,400
is broken up into several pieces.

79
00:04:34,400 --> 00:04:37,830
The first of these is
not specifically part

80
00:04:37,830 --> 00:04:39,080
of the auto scaling service.

81
00:04:39,080 --> 00:04:41,080
It's actually part of the EC2 service

82
00:04:41,080 --> 00:04:43,270
and it's called a launch template.

83
00:04:43,270 --> 00:04:48,166
The launch template is basically
a set of default values

84
00:04:48,166 --> 00:04:51,526
in order to launch an EC2 instance.

85
00:04:51,526 --> 00:04:55,420
And so the launch template is
gonna answer the question of:

86
00:04:55,420 --> 00:04:56,653
what am I gonna launch?

87
00:04:58,170 --> 00:05:00,550
The auto scaling group itself

88
00:05:00,550 --> 00:05:03,780
is the sticky glue that holds
the whole service together.

89
00:05:03,780 --> 00:05:07,173
This is what defines
limits and associations.

90
00:05:07,173 --> 00:05:10,210
Minimum instances, maximum instances,

91
00:05:10,210 --> 00:05:13,373
associations with target
groups, and so on.

92
00:05:15,850 --> 00:05:20,010
Scaling policies are gonna determine when

93
00:05:20,010 --> 00:05:24,180
or under which conditions you
launch or destroy resources

94
00:05:24,180 --> 00:05:26,333
according to CloudWatch metrics.

95
00:05:27,990 --> 00:05:32,990
And finally, scheduled actions
also define when you scale

96
00:05:33,560 --> 00:05:36,110
but according to a calendar or clock

97
00:05:36,110 --> 00:05:38,140
instead of specific metrics,

98
00:05:38,140 --> 00:05:40,888
and they can be used in conjunction

99
00:05:40,888 --> 00:05:43,583
with the auto scaling policies.