- So now let's review a use case of running container-based microservices.

Okay, so in this particular example we have a single region, us-east-2, the Ohio region, with a couple of availability zones, and we will be leveraging the Amazon Elastic Container Service (ECS).

The first thing we do is go to the ECS service and create a cluster. Within ECS, a cluster is really a logical grouping of applications that will be deployed together and run alongside one another, and you can have multiple clusters.

Now, it is possible to use Fargate, a service that allows us to deploy containers to that cluster without managing any EC2 instances whatsoever, and for many applications that might be perfectly fine. There are times, though, when you want access to technologies that are only available on EC2 instances, such as high-performance CPUs, GPUs, FPGAs, high-performance networking, or more than 30 GB of memory.
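To make that EC2-versus-Fargate decision concrete, here is a hedged sketch of what a task definition payload might look like for a workload that needs a GPU, one of the EC2-only capabilities just mentioned. The family name, image URI, and sizes are made-up placeholders, not values from the video.

```python
# Sketch of a task definition payload in the shape ECS's
# RegisterTaskDefinition API expects; a GPU requirement rules out Fargate.

def gpu_task_definition(family, image):
    """Build a payload for a GPU-backed task (placeholder names)."""
    return {
        "family": family,
        "requiresCompatibilities": ["EC2"],  # not Fargate: GPUs need EC2
        "containerDefinitions": [{
            "name": family,
            "image": image,
            "cpu": 1024,      # CPU units (1024 = one vCPU)
            "memory": 4096,   # hard memory limit, in MiB
            "resourceRequirements": [
                {"type": "GPU", "value": "1"},  # reserve one GPU per task
            ],
        }],
    }

payload = gpu_task_definition("ml-inference", "example.com/ml-inference:latest")
# A real deployment would then pass this along, e.g.:
#   boto3.client("ecs").register_task_definition(**payload)
```

The `requiresCompatibilities` field is what pins this workload to the EC2 launch type rather than Fargate.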
And so if we decided to run EC2 instances within the cluster, we could deploy them, and you'll see here that we're deploying them across multiple availability zones for the sake of high availability and fault tolerance, so that we're resilient to the loss of an availability zone. We can configure these instances to join the cluster, and then all of these instances, however many we have, whether it's two or 20 or 200, essentially become, through the cluster, a single pool of resources. We don't need to worry about all of the fine-grained logic and steps necessary to actually get a container running on an individual instance. All of that is abstracted away by the ECS service, which is essentially a scheduler. We deploy to the cluster, and ECS handles the fine details of determining which instance is currently capable of supporting the resources we need for that container, be it CPU, memory, volumes, and so on. And then of course, if we're running EC2 instances, we would probably want to wrap those in an Auto Scaling group.
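To make the "single pool of resources" idea concrete, here is a toy sketch, not the real ECS algorithm, of the kind of decision the scheduler makes on our behalf: pick an instance that still has enough spare CPU and memory for the container. Instance ids and capacities are invented.

```python
# Toy model of scheduler placement: each container instance reports its
# remaining CPU units and memory, and the scheduler finds one that fits.

def place_task(instances, cpu_needed, mem_needed):
    """Return the id of the first instance with enough spare capacity."""
    for inst in instances:
        if inst["cpu_free"] >= cpu_needed and inst["mem_free"] >= mem_needed:
            inst["cpu_free"] -= cpu_needed  # reserve the resources
            inst["mem_free"] -= mem_needed
            return inst["id"]
    return None  # no capacity anywhere: the task would stay pending

cluster = [
    {"id": "i-aaa", "cpu_free": 512,  "mem_free": 1024},
    {"id": "i-bbb", "cpu_free": 2048, "mem_free": 8192},
]
print(place_task(cluster, 1024, 4096))  # prints i-bbb
```

The real service also applies placement strategies and constraints, but this is the essence of what we no longer have to hand-roll per instance.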
We will talk much more about Auto Scaling later on in lesson seven, but for now, suffice it to say that Auto Scaling is, at the least, a way for us to maintain a fixed number of machines. We can do much more with it, but in this particular scenario we might say we only want two EC2 instances in this cluster, and if one were to fail, Auto Scaling could terminate it and replace it with a new one.

Then, of course, as we deploy containers to the cluster, the scheduler determines which instance those containers run on. So our containers will be deployed to these instances. We would go to ECS and configure a service or a task and say, hey, we want this container to be running, and ECS will do all of the underlying work to deploy it to a particular instance. We as developers worry about deploying to the cluster; ECS handles what it takes to actually get that container running on an instance. In this particular scenario, as it looks right now, we have one container running, so we have one instance of one application.
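The "maintain a fixed number of machines" behavior comes from setting the group's minimum, maximum, and desired capacity to the same value. Here is a hedged sketch of such a configuration; the group name, launch template name, and subnet ids are placeholders.

```python
# Sketch of an Auto Scaling group that maintains a fixed fleet of two
# instances: if one fails its health check, the group terminates it and
# launches a replacement to get back to the desired capacity.

def fixed_fleet_asg(name, launch_template, subnets, size):
    """Build a create-auto-scaling-group style payload (placeholder names)."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": launch_template},
        "MinSize": size,
        "MaxSize": size,          # min == max == desired: a fixed-size fleet
        "DesiredCapacity": size,
        "VPCZoneIdentifier": ",".join(subnets),  # one subnet per AZ
    }

asg = fixed_fleet_asg("ecs-fleet", "ecs-node", ["subnet-az1", "subnet-az2"], 2)
```

Spanning the subnets of two availability zones is what lets the replacement land in a healthy zone if an entire zone is lost.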
And there are plenty of times when that's appropriate. You may have a background task that does something very quickly and then goes away. In other situations, though, you may want to run replicas of that application, so you're running the exact same application in two different containers. We can configure ECS to make that happen and say, hey, I want two or four or eight containers, or replicas, of this application running. ECS is also aware of availability zones, so it will make its best attempt to maintain a balance of those containers across the availability zones, while also being aware of the capabilities and available resources of our EC2 instances. Again, this could be two replicas of an application doing background processing, or it could be two replicas of a web API, some kind of web application meant to run for a long period of time, listening for and handling web requests.
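As a hedged sketch, the replica count and the zone-balancing behavior map to two fields of an ECS service definition: a desired count and a placement strategy. The cluster and service names here are placeholders.

```python
# Sketch of a service definition that asks ECS to keep N replicas running
# and to spread them across availability zones before bin-packing.

def replicated_service(cluster, name, task_def, replicas):
    """Build a create-service style payload (placeholder names)."""
    return {
        "cluster": cluster,
        "serviceName": name,
        "taskDefinition": task_def,
        "desiredCount": replicas,   # two, four, eight... replicas
        "placementStrategy": [
            # balance replicas across AZs first,
            {"type": "spread", "field": "attribute:ecs.availability-zone"},
            # then favor instances with the least free memory (dense packing)
            {"type": "binpack", "field": "memory"},
        ],
    }

svc = replicated_service("demo-cluster", "web-api", "web-api:1", 2)
```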
And so in that regard, if these containers were part of some kind of API or web application, then we would probably want to put them behind a load balancer, so that we can point our users at the load balancer and it will distribute traffic among those backend containers. We will talk much more about load balancing later on as well. And then of course, we need a way for the load balancer to be aware of these containers. By leveraging load balancing in combination with ECS, we can configure the Elastic Container Service to coordinate not only the placement of containers, but also the registration of those containers with the load balancer. So ECS will tell the load balancer, hey, you have these two containers available on this port for this particular application. Once that registration has happened, those containers become what we would call targets for the load balancer, and incoming requests can then be distributed among those two containers.
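In practice, that registration is configured by pointing the ECS service at a target group. Here is a hedged sketch of the extra fields a service definition carries for that; the target group ARN (deliberately elided), names, and port are placeholders.

```python
# Sketch: wiring a service definition to a load balancer target group so
# ECS registers each running container as a target automatically.

def attach_target_group(service, target_group_arn, container, port):
    """Return a copy of a service definition wired to a target group."""
    wired = dict(service)
    wired["loadBalancers"] = [{
        "targetGroupArn": target_group_arn,
        "containerName": container,
        "containerPort": port,   # the port ECS registers for each task
    }]
    return wired

svc = attach_target_group(
    {"serviceName": "web-api", "desiredCount": 2},
    "arn:aws:elasticloadbalancing:...",  # placeholder ARN
    "web-api",
    8080,
)
```

With this in place, when ECS starts or stops a container it adds or removes the matching target, so the load balancer only ever routes to live containers.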
And of course, we could keep going and add other applications that are also registered with the load balancer. We can run just about any number of applications in this same cluster, so we have multiple replicas of multiple applications all running side by side within the same cluster.

And then of course, at some point we may experience a failure, some kind of error within our application. It could be a race condition that caused the application to fail and the process to exit. It could be that the process used more memory than we allocated, and that process was killed. Either way, regardless of how it happens, if our application were to exit, then the container essentially goes away, and now we have a discrepancy. We have configured the Elastic Container Service to maintain, say, two containers, but now we only have one. So ECS will notice that one has failed and replace it with a new one.
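The replacement behavior is a reconciliation loop: compare the desired count with what is actually running, then start or stop tasks to close the gap. Here is a toy sketch of that shape; the real ECS service scheduler does far more, and the task ids are invented.

```python
# Toy reconciliation: desired state vs. observed state for one service.

def reconcile(desired, running):
    """Return (how_many_to_start, which_tasks_to_stop)."""
    to_start = max(0, desired - len(running))
    to_stop = list(running[desired:])  # surplus tasks, if any
    return to_start, to_stop

# One of two tasks crashed: start one replacement.
print(reconcile(2, ["task-a"]))             # prints (1, [])
# The service was scaled down to one: stop the extra task.
print(reconcile(1, ["task-a", "task-b"]))   # prints (0, ['task-b'])
```

The same desired-versus-actual loop is what lets scaling and failure recovery be the same mechanism: both are just a gap between the two numbers.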
And so ECS is maintaining the containers: if one fails, it replaces it. We can also scale the number of containers by updating the service within ECS. And Auto Scaling is doing that same thing for our EC2 instances.

And then, of course, as far as EC2 instances go, they don't all have to be the same instance type. They could be a blend of several different types: you could have a cluster comprised of a c5.large, an m5.xlarge, and an r5.3xlarge. So you don't necessarily have to have the same kind of EC2 instances, and there are a lot of advantages to that. We can talk more about billing later on, but one effect on billing is that you can consistently get the cheapest instance available. We'll talk more about ways of saving money on EC2 later on as well.

And then of course we have developers writing code, and we need to get that code deployed as a container into this environment. That would probably start with some kind of version control system.
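One way a blended fleet like that is expressed is through a mixed instances policy on the Auto Scaling group. Here is a hedged sketch of that fragment; the launch template name is a placeholder, and pairing it with a Spot allocation strategy is one way, under these assumptions, to keep landing on cheap capacity.

```python
# Sketch of a MixedInstancesPolicy fragment: one launch template with
# per-instance-type overrides, so the group may launch any of the types.

def mixed_fleet_policy(template, types):
    """Build the policy fragment for a heterogeneous fleet (placeholders)."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {"LaunchTemplateName": template},
            # each override is an instance type the group is allowed to use
            "Overrides": [{"InstanceType": t} for t in types],
        },
        "InstancesDistribution": {"SpotAllocationStrategy": "lowest-price"},
    }

policy = mixed_fleet_policy("ecs-node", ["c5.large", "m5.xlarge", "r5.3xlarge"])
```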
We will talk more about CodeCommit later on as an Amazon service that solves this. A push to that version control system, or, say, the merging of a pull request, could then trigger a CI/CD pipeline. We have a lot of options here: we could use Jenkins, or we could use third-party subscription services like Travis CI, CircleCI, or Bamboo; the list goes on. We also have an Amazon-native service called CodePipeline, which we will talk about in detail later on as well.

And then of course, the end result of that CI/CD pipeline would be deploying a container into this cluster. Essentially, the pipeline goes to ECS and updates an existing service with a new image. So we're saying, hey ECS, here's a new image: replace the existing containers with new containers running my new version based on this new image. And one benefit of containers is that we get very fast deployments.
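The "update the service with a new image" step usually means registering a new revision of the task definition that differs only in the image tag, then pointing the service at it. Here is a hedged sketch of that copy-and-swap; the family name and image tags are placeholders.

```python
import copy

# Sketch of the deploy step a CI/CD pipeline performs: clone the current
# task definition, swap in the freshly built image, and hand the copy
# back to ECS (registering it would assign the next revision number).

def deploy_new_image(task_def, new_image):
    """Return a copy of a task definition pointing at a new image."""
    new_def = copy.deepcopy(task_def)   # deep copy: don't mutate the original
    new_def.pop("revision", None)       # the registry assigns the next one
    for c in new_def["containerDefinitions"]:
        c["image"] = new_image
    return new_def

current = {
    "family": "web-api",
    "revision": 7,
    "containerDefinitions": [{"name": "web-api", "image": "web-api:1.3.0"}],
}
rollout = deploy_new_image(current, "web-api:1.4.0")
# The pipeline would then register `rollout` and update the service to it,
# and ECS would roll the running containers onto the new version.
```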
Another benefit is that not only can we run multiple applications side by side, we can run multiple versions of the same application side by side. So if we wanted to support older browsers or older clients for a longer period of time, we could very easily run multiple versions of the same application.

And then, of course, our CI/CD pipeline is generating events; at life cycle points along the way, things are happening. We might have tests that run and pass or fail, builds that happen and pass or fail, and deployments that fail or succeed. And so we want to know exactly what's happening along the way. The same goes for ECS: once the CI/CD pipeline passes and ECS attempts the deployment, that process could fail or it could succeed. So there are all of these life cycle events happening within these two services, and we would want some kind of way of getting those notifications back to our developers.
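As a hedged sketch of that notification side, here is roughly what publishing one of these life cycle events to an SNS topic might look like. The topic ARN and account number are made-up placeholders, and this is one possible shape for the message, not a prescribed format.

```python
import json

# Sketch: a pipeline stage publishes its pass/fail outcome to an SNS
# topic, and subscribers (developer email, HTTP endpoints, etc.) receive
# it in near real time.

def pipeline_notification(stage, state, detail):
    """Build the payload you might pass to sns.publish(**payload)."""
    return {
        "TopicArn": "arn:aws:sns:us-east-2:111111111111:deploy-events",  # placeholder
        "Subject": f"{stage}: {state}",
        "Message": json.dumps({"stage": stage, "state": state, "detail": detail}),
    }

msg = pipeline_notification("build", "FAILED", "unit tests did not pass")
```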
So if our tests failed, get that information back to the developer so they can fix their code, get it back into version control, and push it through the process again. If tests pass but the build failed, notify the developer so they can start it all over again. And then, of course, even if the CI/CD pipeline was a success, it would still be possible for the ECS deployment to fail. Either way, by piping in all of these notifications, we enable our developers to see, in pretty much real time, exactly what's happening along the way.

And so, as far as these notifications go, we will talk more later about the Simple Notification Service as a way of collecting these kinds of notifications from different Amazon services and getting them to a single endpoint, or even multiple endpoints, whether that's a developer or an HTTP endpoint that might be running at a third party or on premises.

And so we have a lot of options for running container-based microservices within AWS. We don't have to use ECS. Ultimately, ECS is a scheduler, and there are other schedulers.
We could run Apache Mesos on EC2, Docker Swarm on EC2, or Kubernetes on EC2. What I like about Amazon ECS is that it's a fully managed service. But then we also have the Elastic Container Service for Kubernetes (EKS), which is also a fully managed service. So you have a lot of options. If you would prefer Kubernetes as opposed to ECS, we can very easily do that as well, and this process is basically the same. Not a whole lot would change; the only thing that changes is essentially the syntax by which we tell ECS versus EKS how to do what we're trying to do.

And so I've been in situations and production scenarios where we could go from a developer merging a pull request, through testing, building, and publishing the Docker image to a repository, to deploying a container into production, and I've seen that whole thing take less than 10 minutes. It's really a beautiful thing when you get to a point like this, where you can deploy to production at will, at any time, and get those notifications in real time, allowing your developers to iterate very quickly.
So for anyone looking to deploy microservices within AWS, and especially if you're looking at containers, I would highly encourage you to look at all of the options around containers, such as ECS and the Elastic Container Service for Kubernetes (EKS).