1
00:00:06,560 --> 00:00:09,350
- Let's get into a
little bit now of design,

2
00:00:09,350 --> 00:00:12,260
because one of the things
I find lacking in

3
00:00:12,260 --> 00:00:14,850
a lot of Go code is logging consistency.

4
00:00:14,850 --> 00:00:16,990
I spent 30 years trying to figure out

5
00:00:16,990 --> 00:00:20,450
how to combine error handling with logging

6
00:00:20,450 --> 00:00:22,960
and I finally came to
the realization that,

7
00:00:22,960 --> 00:00:26,640
actually, I tried for 30
years not to combine it,

8
00:00:26,640 --> 00:00:28,010
but separate it.

9
00:00:28,010 --> 00:00:30,910
I always wanted to try
to get the error handling

10
00:00:30,910 --> 00:00:34,080
kind of separated in terms
of frameworks and logic,

11
00:00:34,080 --> 00:00:37,700
around how we log things, this separation,

12
00:00:37,700 --> 00:00:39,050
and I really came to the realization

13
00:00:39,050 --> 00:00:40,850
that you can't separate these two things.

14
00:00:40,850 --> 00:00:44,480
That error handling and logging
are just this one thing,

15
00:00:44,480 --> 00:00:46,250
and we've gotta bring
them together if we want

16
00:00:46,250 --> 00:00:47,230
any consistency in them.

17
00:00:47,230 --> 00:00:49,310
I'm always worried about consistency.

18
00:00:49,310 --> 00:00:53,120
And so I always love asking
this question to everybody

19
00:00:53,120 --> 00:00:56,930
in the trainings that
we do live, and that is,

20
00:00:56,930 --> 00:00:59,793
if, let's say it's three
o'clock in the morning,

21
00:01:00,800 --> 00:01:03,200
and an error occurs in
your software right?

22
00:01:03,200 --> 00:01:04,230
An error occurs.

23
00:01:04,230 --> 00:01:07,520
But the error is handled by the software,

24
00:01:07,520 --> 00:01:09,290
and the program just continues to run.

25
00:01:09,290 --> 00:01:10,850
There's no real hiccup.

26
00:01:10,850 --> 00:01:13,940
Maybe there was for a second or two,

27
00:01:13,940 --> 00:01:16,530
but it recovered, we keep going.

28
00:01:16,530 --> 00:01:18,460
My first question is do you wanna wake up,

29
00:01:18,460 --> 00:01:21,260
or do you wanna be woken
up at three in the morning

30
00:01:21,260 --> 00:01:23,020
because that just happened?

31
00:01:23,020 --> 00:01:25,530
Remember nobody needs you,

32
00:01:25,530 --> 00:01:27,450
the software just kept running.

33
00:01:27,450 --> 00:01:30,134
The reality is that I don't wanna wake up.

34
00:01:30,134 --> 00:01:31,690
I mean the software did
what it's supposed to do.

35
00:01:31,690 --> 00:01:34,840
It identified failure, it
recovered, it keeps moving.

36
00:01:34,840 --> 00:01:37,060
So I don't want to,

37
00:01:37,060 --> 00:01:38,010
don't wake me up.

38
00:01:38,010 --> 00:01:39,230
Let me sleep.

39
00:01:39,230 --> 00:01:41,340
Now when I wake up in the
morning and I get to work,

40
00:01:41,340 --> 00:01:45,010
I wanna see some indication
that there was a problem right?

41
00:01:45,010 --> 00:01:46,690
I don't wanna be blind to it,

42
00:01:46,690 --> 00:01:48,130
but I do wanna see it.

43
00:01:48,130 --> 00:01:50,190
But the real question now becomes,

44
00:01:50,190 --> 00:01:54,480
am I gonna spend time looking
at the logs when I get in,

45
00:01:54,480 --> 00:01:56,360
because right now again, it's not needed.

46
00:01:56,360 --> 00:01:57,470
It was recovered.

47
00:01:57,470 --> 00:02:00,260
Maybe on my dashboard for metrics,

48
00:02:00,260 --> 00:02:01,580
I see that there was an error,

49
00:02:01,580 --> 00:02:02,930
I don't know much more.

50
00:02:02,930 --> 00:02:04,840
But the program kept going.

51
00:02:04,840 --> 00:02:06,150
Here's the reality.

52
00:02:06,150 --> 00:02:09,220
The reality is, I'm probably
so busy with meetings,

53
00:02:09,220 --> 00:02:10,410
and other priorities,

54
00:02:10,410 --> 00:02:12,390
I never go back to look at the logs

55
00:02:12,390 --> 00:02:14,580
because it's no longer an issue.

56
00:02:14,580 --> 00:02:16,100
It's not that I don't want to,

57
00:02:16,100 --> 00:02:18,460
it's not that I'm not necessarily curious,

58
00:02:18,460 --> 00:02:20,700
but it's just human nature to say,

59
00:02:20,700 --> 00:02:23,340
I've got other priorities.

60
00:02:23,340 --> 00:02:25,650
What I'm trying to stress here is that

61
00:02:25,650 --> 00:02:29,170
we write applications
that log a lot of things,

62
00:02:29,170 --> 00:02:33,560
and most of the time we're
logging as an insurance policy

63
00:02:33,560 --> 00:02:36,850
to be able to find bugs when errors occur.

64
00:02:36,850 --> 00:02:39,990
And I did that for a long
time but the reality is,

65
00:02:39,990 --> 00:02:43,310
there's too much
activity on our systems today.

66
00:02:43,310 --> 00:02:46,910
Our user bases can grow to a
million people almost overnight

67
00:02:46,910 --> 00:02:50,630
and so logging that much
has a huge, significant cost.

68
00:02:50,630 --> 00:02:52,670
A lot of times logging is going to create

69
00:02:52,670 --> 00:02:54,940
a large amount of allocations,

70
00:02:54,940 --> 00:02:57,440
which is going to put a lot
of pressure on your heap.

71
00:02:57,440 --> 00:03:01,060
Now that's not unique to Go,
but we are talking about Go.

72
00:03:01,060 --> 00:03:04,610
So I want you to consider
that logging is important,

73
00:03:04,610 --> 00:03:09,610
but we've got to constantly
balance signal to noise

74
00:03:10,260 --> 00:03:12,880
in the log because if you're writing logs,

75
00:03:12,880 --> 00:03:14,150
writing data to your logs,

76
00:03:14,150 --> 00:03:18,360
that you end up never,
ever reading or using,

77
00:03:18,360 --> 00:03:20,830
you're wasting CPU cycles on something

78
00:03:21,670 --> 00:03:24,100
when you could've been
doing actual real work.

79
00:03:24,100 --> 00:03:27,300
And it goes beyond just the
CPU cycles of your process.

80
00:03:27,300 --> 00:03:30,360
You're eating network
bandwidth, disk I/O bandwidth,

81
00:03:30,360 --> 00:03:33,280
other complexities that go
through the entire system.

82
00:03:33,280 --> 00:03:36,530
So during development I
really wanna make sure

83
00:03:36,530 --> 00:03:41,310
that we always have a good
level of signal in our logs

84
00:03:41,310 --> 00:03:43,500
and we're logging from a trace perspective

85
00:03:43,500 --> 00:03:45,180
the bare minimum we need,

86
00:03:45,180 --> 00:03:47,440
but then we're logging the errors in a way

87
00:03:47,440 --> 00:03:50,550
that there's always enough
context if we wanna take the time

88
00:03:50,550 --> 00:03:52,970
or need to take the time to look at it.

89
00:03:52,970 --> 00:03:56,370
And that's also been a big
problem for me for 30 years.

90
00:03:56,370 --> 00:03:59,610
How do you make sure there's
enough context in the log,

91
00:03:59,610 --> 00:04:02,000
both from a tracing
perspective, bare minimum,

92
00:04:02,000 --> 00:04:05,500
and then an error perspective
and not duplicate errors

93
00:04:05,500 --> 00:04:07,970
throughout a log and at the same time,

94
00:04:07,970 --> 00:04:11,550
minimize those log writes and
then, let's add more to it,

95
00:04:11,550 --> 00:04:16,550
have a consistent pattern
that we all can follow

96
00:04:16,720 --> 00:04:19,150
and review during code reviews,

97
00:04:19,150 --> 00:04:22,800
where we're doing logging the
same way and it's not random?

98
00:04:22,800 --> 00:04:24,670
Oh my God my head wants to hurt right?

99
00:04:24,670 --> 00:04:25,913
All of these things.

100
00:04:26,990 --> 00:04:28,820
You might find it hard to
believe that I don't believe

101
00:04:28,820 --> 00:04:30,080
in logging levels.

102
00:04:30,080 --> 00:04:33,170
I've never been able to turn
a logging level up in time

103
00:04:33,170 --> 00:04:35,070
to get more detailed information.

104
00:04:35,070 --> 00:04:37,160
So I'm not a believer in logging levels.

105
00:04:37,160 --> 00:04:40,540
I tend to use the standard
library log package quite a bit.

106
00:04:40,540 --> 00:04:42,940
I will create my own logger and pass

107
00:04:42,940 --> 00:04:45,340
that around the application but,

108
00:04:45,340 --> 00:04:48,310
either I need this information or I don't,

109
00:04:48,310 --> 00:04:51,260
and I make sure throughout
my unit testing and any other

110
00:04:51,260 --> 00:04:55,420
integration testing that I
do that my logs have signal.
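
Sketch (not the code shown on screen): one way to create a standard library logger and pass it around, with no logging levels. The `service` type, the `process` method, the `newLogger` and `demo` helpers, and the "APP : " prefix are all hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
)

// service is a hypothetical component that is handed its logger
// rather than reaching for a package-level one.
type service struct {
	log *log.Logger
}

// process writes a single trace line. There are no levels:
// the line is either worth writing or it is not.
func (s *service) process(id int) {
	s.log.Printf("process: started: id[%d]", id)
}

// newLogger builds a standard library logger that writes to w.
// The "APP : " prefix is an arbitrary choice for this sketch.
func newLogger(w io.Writer) *log.Logger {
	return log.New(w, "APP : ", 0)
}

// demo runs the service against an in-memory buffer and returns
// what was logged, which is also how a test can check for signal.
func demo() string {
	var buf bytes.Buffer
	s := service{log: newLogger(&buf)}
	s.process(7)
	return buf.String()
}

func main() {
	fmt.Print(demo())
}
```

Because the logger writes to any `io.Writer`, a unit test can capture the output in a buffer and check that each line carries the context it is supposed to.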

111
00:04:55,420 --> 00:04:57,470
Now I wanna show you a pattern using

112
00:04:57,470 --> 00:04:59,750
Dave Cheney's errors package.

113
00:04:59,750 --> 00:05:03,170
Dave Cheney's errors package
is really nice,

114
00:05:03,170 --> 00:05:07,680
and it gives us a consistent
way to apply error handling,

115
00:05:07,680 --> 00:05:11,160
apply logging and have code
consistency throughout,

116
00:05:11,160 --> 00:05:13,400
minimizing again a lot of pain.

117
00:05:13,400 --> 00:05:16,250
But remember when I talked
about handling an error?

118
00:05:16,250 --> 00:05:17,500
That's a big thing.

119
00:05:17,500 --> 00:05:21,200
Now that I can't separate
logging from error handling,

120
00:05:21,200 --> 00:05:24,810
what do I mean when I talk
about handling an error?

121
00:05:24,810 --> 00:05:26,770
I mean really a couple of things.

122
00:05:26,770 --> 00:05:29,660
When we talk about error handling
what we're gonna mean is,

123
00:05:29,660 --> 00:05:32,190
that when a piece of code
decides to handle the error,

124
00:05:32,190 --> 00:05:34,640
it means that, that code is
responsible for logging it

125
00:05:34,640 --> 00:05:36,930
and logging the full context of it.

126
00:05:36,930 --> 00:05:39,520
It also means that code
has to make a decision.

127
00:05:39,520 --> 00:05:42,040
Can we recover or not?

128
00:05:42,040 --> 00:05:43,640
If the answer is no,

129
00:05:43,640 --> 00:05:46,070
then error handling means
we shut down the app.

130
00:05:46,070 --> 00:05:49,590
Either with the stack trace
from a panic call or os.Exit.

131
00:05:49,590 --> 00:05:51,674
If we can recover,

132
00:05:51,674 --> 00:05:54,240
then that code has to
recover the application back

133
00:05:54,240 --> 00:05:57,420
to its correct state and keep it going.

134
00:05:57,420 --> 00:06:00,130
At the end of the day when
that piece of code returns

135
00:06:00,130 --> 00:06:02,890
it never, ever returns
another error back up.

136
00:06:02,890 --> 00:06:04,470
The error stops there.

137
00:06:04,470 --> 00:06:07,430
Either the app is
recovered or it shuts down.

138
00:06:07,430 --> 00:06:09,700
But we've also logged the error.

139
00:06:09,700 --> 00:06:11,400
That is error handling to me.
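
Sketch (not the code shown on screen): what "handling" an error could look like under that definition, using only the standard library. The names `errTransient`, `doWork`, and `canRecover` are hypothetical, and the recovery step is left as a comment because it depends on the application.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
)

// errTransient is a hypothetical error this program knows how to recover from.
var errTransient = errors.New("transient failure")

// doWork is a hypothetical call that fails with context wrapped around the cause.
func doWork() error {
	return fmt.Errorf("doWork: refreshing cache: %w", errTransient)
}

// canRecover is the decision the handling code has to make.
func canRecover(err error) bool {
	return errors.Is(err, errTransient)
}

func main() {
	l := log.New(os.Stderr, "", log.LstdFlags)

	if err := doWork(); err != nil {
		// Handling the error starts with logging it, once, in full.
		l.Printf("main: %v", err)

		// No recovery possible: shut down.
		if !canRecover(err) {
			os.Exit(1)
		}

		// Recovery: put the program back into a correct state here.
		// The error stops at this point and is not returned further up.
	}

	fmt.Println("still running")
}
```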

140
00:06:11,400 --> 00:06:13,980
And this is what this pattern
is going to be perfect

141
00:06:13,980 --> 00:06:16,470
for with Dave Cheney's errors package.

142
00:06:16,470 --> 00:06:18,523
So imagine this piece of code right here.

143
00:06:19,460 --> 00:06:22,030
We're at the bottom of a call chain

144
00:06:22,030 --> 00:06:24,040
and we make this call to thirdCall.

145
00:06:24,040 --> 00:06:26,863
Some function above us
makes a call to thirdCall.

146
00:06:27,821 --> 00:06:30,690
And you can see here that
thirdCall always fails.

147
00:06:30,690 --> 00:06:34,270
It's failing with this custom
error type called AppError.

148
00:06:34,270 --> 00:06:35,520
We're using pointer semantics,

149
00:06:35,520 --> 00:06:37,830
we're gonna store the
address of AppError in there,

150
00:06:37,830 --> 00:06:40,490
and we've set the error code to 99.

151
00:06:40,490 --> 00:06:41,910
This is very similar to maybe like

152
00:06:41,910 --> 00:06:46,660
a standard library function
that is gonna return these raw,

153
00:06:46,660 --> 00:06:48,600
or these true error values.
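
Sketch (not the code shown on screen): a function that always fails with a custom error type stored by address. The field name `State` and the message text are assumptions; the talk only says the type is called AppError and the code is set to 99.

```go
package main

import "fmt"

// AppError is a custom error type. The State field name is an assumption.
type AppError struct {
	State int
}

// Error implements the error interface using pointer semantics,
// so it is *AppError, not AppError, that satisfies error.
func (e *AppError) Error() string {
	return fmt.Sprintf("app error, state: %d", e.State)
}

// thirdCall always fails, returning the address of an AppError value
// stored inside the error interface.
func thirdCall() error {
	return &AppError{State: 99}
}

func main() {
	err := thirdCall()
	fmt.Println(err)
}
```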

154
00:06:48,600 --> 00:06:52,250
Now secondCall was the
one that called thirdCall,

155
00:06:52,250 --> 00:06:54,730
and you see our
traditional mechanics in Go

156
00:06:54,730 --> 00:06:56,170
where we're gonna use the if statement

157
00:06:56,170 --> 00:06:57,820
to do the error handling right?

158
00:06:57,820 --> 00:06:59,620
We're staying away from the else clauses.

159
00:06:59,620 --> 00:07:01,600
Happy path in the first tab.

160
00:07:01,600 --> 00:07:04,070
And so here we go, we call thirdCall,

161
00:07:04,070 --> 00:07:06,090
we get back an error interface value.

162
00:07:06,090 --> 00:07:08,420
We ask is there a concrete
value stored inside

163
00:07:08,420 --> 00:07:09,253
the error interface?

164
00:07:09,253 --> 00:07:10,750
The answer is yes.

165
00:07:10,750 --> 00:07:11,800
When the answer is yes,

166
00:07:11,800 --> 00:07:15,260
now the developer writing this
code has to make a choice.

167
00:07:15,260 --> 00:07:16,770
It's really boolean.

168
00:07:16,770 --> 00:07:19,930
Am I gonna handle the error here, or not?

169
00:07:19,930 --> 00:07:22,300
If the answer is handling the error here,

170
00:07:22,300 --> 00:07:24,150
we deal with it like I said before.

171
00:07:24,150 --> 00:07:27,070
But if the answer is no, I'm
not gonna handle the error,

172
00:07:27,070 --> 00:07:29,370
then there's only one
thing you're allowed to do,

173
00:07:29,370 --> 00:07:33,470
and that is wrap the
error with context right?

174
00:07:33,470 --> 00:07:35,300
We would prefer to handle the error,

175
00:07:35,300 --> 00:07:37,700
the lower in the call
stack we handle the error,

176
00:07:37,700 --> 00:07:40,320
the better opportunity you're
gonna have for recovery.

177
00:07:40,320 --> 00:07:42,750
But in this case the
developer has decided,

178
00:07:42,750 --> 00:07:44,440
no we're not gonna handle the error.

179
00:07:44,440 --> 00:07:45,273
Think about this.

180
00:07:45,273 --> 00:07:46,860
I don't have to worry
about logging anymore

181
00:07:46,860 --> 00:07:48,700
and I don't have to worry
about recovery anymore.

182
00:07:48,700 --> 00:07:51,320
All I worry about is the wrap call.

183
00:07:51,320 --> 00:07:54,740
Now the wrap call is interesting
because it does two things.

184
00:07:54,740 --> 00:07:57,650
At this point remember that
we've gotten an AppError.

185
00:07:57,650 --> 00:07:59,780
Here it is, this is our AppError,

186
00:07:59,780 --> 00:08:02,980
and our AppError is
already wrapped inside

187
00:08:02,980 --> 00:08:05,410
this error interface value.

188
00:08:05,410 --> 00:08:06,960
Now this is error, right?

189
00:08:06,960 --> 00:08:09,090
Wrapping the AppError.

190
00:08:09,090 --> 00:08:11,910
But what wrap is going to do now,

191
00:08:11,910 --> 00:08:15,920
is wrap context around what we have.

192
00:08:15,920 --> 00:08:18,140
And there's two types of context.

193
00:08:18,140 --> 00:08:21,900
There's gonna be call stack context

194
00:08:21,900 --> 00:08:25,690
and there's going to be user context.

195
00:08:25,690 --> 00:08:28,920
The call stack context is going to capture

196
00:08:28,920 --> 00:08:31,600
where we are in the code
at that line of code

197
00:08:31,600 --> 00:08:32,730
where we're doing the wrap.

198
00:08:32,730 --> 00:08:36,770
We now know exactly where we
are when this error occurred.

199
00:08:36,770 --> 00:08:39,900
And we're now also
gonna be able to add

200
00:08:39,900 --> 00:08:41,090
some user context.

201
00:08:41,090 --> 00:08:43,760
In this case I'm just
indicating that secondCall

202
00:08:43,760 --> 00:08:45,250
was calling thirdCall.

203
00:08:45,250 --> 00:08:48,090
So we're gonna get call
stack and user context

204
00:08:48,090 --> 00:08:49,090
built in there.

205
00:08:49,090 --> 00:08:51,020
Now we take this error, we wrap it,

206
00:08:51,020 --> 00:08:53,160
and we send it back up the call stack.

207
00:08:53,160 --> 00:08:56,240
And when we do, firstCall now is involved.

208
00:08:56,240 --> 00:08:58,960
Remember firstCall is calling secondCall

209
00:08:58,960 --> 00:09:01,070
with the parameter of i.

210
00:09:01,070 --> 00:09:03,360
Now an error occurred, we know that right?

211
00:09:03,360 --> 00:09:06,410
We know that secondCall
returned this error.

212
00:09:06,410 --> 00:09:08,154
Well guess what?

213
00:09:08,154 --> 00:09:10,350
firstCall now has to decide
am I gonna handle the error?

214
00:09:10,350 --> 00:09:14,080
If the answer is no, then
you only have one choice.

215
00:09:14,080 --> 00:09:18,250
We wrap the error again with that context,

216
00:09:18,250 --> 00:09:22,480
our user context and our call stack context.

217
00:09:22,480 --> 00:09:27,390
Notice this time our user context
is showing the parameters,

218
00:09:27,390 --> 00:09:30,960
or the values we passed into secondCall.
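
Sketch (not the code shown on screen): the "don't handle it, just wrap it" rule at two levels of the call chain. To stay self-contained, this swaps in the standard library's `fmt.Errorf` with `%w` where the talk uses the wrap call from Dave Cheney's errors package, so it only adds user context and records no call stack. The message strings and the `State` field name are assumptions.

```go
package main

import "fmt"

// AppError is the root error type. The State field name is an assumption.
type AppError struct {
	State int
}

func (e *AppError) Error() string {
	return fmt.Sprintf("app error, state: %d", e.State)
}

// thirdCall always fails with the root error value.
func thirdCall() error {
	return &AppError{State: 99}
}

// secondCall decides not to handle the error, so the only thing
// it does is wrap it with user context and send it back up.
func secondCall(i int) error {
	if err := thirdCall(); err != nil {
		return fmt.Errorf("secondCall->thirdCall(): %w", err)
	}
	return nil
}

// firstCall makes the same decision and adds the input value
// it passed down as its user context.
func firstCall(i int) error {
	if err := secondCall(i); err != nil {
		return fmt.Errorf("firstCall->secondCall(%d): %w", i, err)
	}
	return nil
}

func main() {
	fmt.Println(firstCall(10))
}
```

Neither function logs or tries to recover. Each one only adds the piece of context it alone knows about.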

219
00:09:30,960 --> 00:09:32,410
This is brilliant stuff.

220
00:09:32,410 --> 00:09:36,300
You can share any context
you need and if you get bugs

221
00:09:36,300 --> 00:09:37,960
and the context isn't enough,

222
00:09:37,960 --> 00:09:40,580
this is what you're gonna
be improving, the context.

223
00:09:40,580 --> 00:09:43,450
If I'm writing a database
type of app and I don't have

224
00:09:43,450 --> 00:09:45,860
a security issue writing
queries to the logs,

225
00:09:45,860 --> 00:09:48,520
that context would be the query, so
I can copy and paste it

226
00:09:48,520 --> 00:09:50,340
out of the log and run it directly.

227
00:09:50,340 --> 00:09:53,690
Here I love showing what the input was,

228
00:09:53,690 --> 00:09:56,590
because I'm not gonna get that
in the call stack context,

229
00:09:56,590 --> 00:09:58,650
but at least I can see
what values were passed.

230
00:09:58,650 --> 00:09:59,800
That could really help.

231
00:10:00,887 --> 00:10:04,500
So I've got, this is the
original error right here.

232
00:10:04,500 --> 00:10:06,690
This is one wrap right here.

233
00:10:06,690 --> 00:10:09,450
This is another wrapping right there.

234
00:10:09,450 --> 00:10:11,760
And we go to main.

235
00:10:11,760 --> 00:10:14,050
Now main made a call to firstCall,

236
00:10:14,050 --> 00:10:15,000
it gets back the error.

237
00:10:15,000 --> 00:10:16,870
It says, is there a
concrete value stored inside

238
00:10:16,870 --> 00:10:17,703
the error interface?

239
00:10:17,703 --> 00:10:19,620
There absolutely is.

240
00:10:19,620 --> 00:10:22,020
Therefore we now come into this.

241
00:10:22,020 --> 00:10:25,060
Now the mechanics that I
showed you on the case,

242
00:10:25,060 --> 00:10:27,740
doing the generic type
assertions, guess what?

243
00:10:27,740 --> 00:10:31,230
We get to continue to do that
because of the Cause function.

244
00:10:31,230 --> 00:10:33,440
What's brilliant about the Cause function,

245
00:10:33,440 --> 00:10:37,870
is it knows how to unwind
this down and get us back

246
00:10:37,870 --> 00:10:40,360
to the root error value.

247
00:10:40,360 --> 00:10:42,470
And now we're gonna type assert against

248
00:10:42,470 --> 00:10:44,230
the root error value,

249
00:10:44,230 --> 00:10:47,580
and when we do that we can
do the case on line 30.

250
00:10:47,580 --> 00:10:50,640
But I think what's even
more powerful is this.

251
00:10:50,640 --> 00:10:53,930
If you're using the standard
library fmt or log packages,

252
00:10:53,930 --> 00:10:58,930
like I do, if you put a %+v
in the formatting, guess what?

253
00:10:59,510 --> 00:11:03,650
The full stack trace and
context of this whole thing

254
00:11:03,650 --> 00:11:05,760
gets logged to, in this case,

255
00:11:05,760 --> 00:11:07,120
standard out or standard error,

256
00:11:07,120 --> 00:11:08,050
wherever you're logging.

257
00:11:08,050 --> 00:11:10,270
And if you just do %v,

258
00:11:10,270 --> 00:11:12,870
then you're just gonna
get your user context.

259
00:11:12,870 --> 00:11:15,780
%v just gives you the user context.

260
00:11:15,780 --> 00:11:18,320
%+v gives you both.
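
Sketch (not the code shown on screen, and not how the errors package is implemented): a hand-rolled wrapper built on the standard library, to illustrate the idea behind `%v` versus `%+v` and behind unwinding to the root error. The `wrapped` type and the `wrap`, `cause`, and `detail` helpers are hypothetical. Unlike a full stack trace, this version records only the single file and line where each wrap happened.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// AppError is the root error type. The State field name is an assumption.
type AppError struct{ State int }

func (e *AppError) Error() string { return fmt.Sprintf("app error, state: %d", e.State) }

// wrapped is a hypothetical wrapper holding both kinds of context.
type wrapped struct {
	msg  string // user context
	file string // file where wrap was called
	line int    // line where wrap was called
	err  error  // the error being wrapped
}

func (w *wrapped) Error() string { return w.msg + ": " + w.err.Error() }
func (w *wrapped) Unwrap() error { return w.err }

// Format prints only the user context for %v and adds the
// recorded call sites, innermost first, for %+v.
func (w *wrapped) Format(s fmt.State, verb rune) {
	if verb == 'v' && s.Flag('+') {
		fmt.Fprintf(s, "%+v\n%s\n\t%s:%d", w.err, w.msg, w.file, w.line)
		return
	}
	fmt.Fprint(s, w.Error())
}

// wrap records the caller's file and line plus a user message.
func wrap(err error, msg string) error {
	_, file, line, _ := runtime.Caller(1)
	return &wrapped{msg: msg, file: file, line: line, err: err}
}

// cause unwinds the chain of wrappers down to the root error value.
func cause(err error) error {
	for {
		next := errors.Unwrap(err)
		if next == nil {
			return err
		}
		err = next
	}
}

func thirdCall() error { return &AppError{State: 99} }

func secondCall(i int) error {
	if err := thirdCall(); err != nil {
		return wrap(err, "secondCall->thirdCall()")
	}
	return nil
}

func firstCall(i int) error {
	if err := secondCall(i); err != nil {
		return wrap(err, fmt.Sprintf("firstCall->secondCall(%d)", i))
	}
	return nil
}

// detail renders err with %+v: user context plus call sites.
func detail(err error) string { return fmt.Sprintf("%+v", err) }

func main() {
	err := firstCall(10)

	// Type switch against the root error value, not the wrappers.
	switch v := cause(err).(type) {
	case *AppError:
		fmt.Println("root cause is an AppError, state:", v.State)
	}

	fmt.Printf("%v\n", err)  // user context only
	fmt.Println(detail(err)) // user context plus call sites
}
```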

261
00:11:18,320 --> 00:11:21,960
Now I wanna show you this
running so you can visualize

262
00:11:21,960 --> 00:11:22,800
it right here.

263
00:11:22,800 --> 00:11:25,943
Let me copy the path and get
us over into our terminal.

264
00:11:27,260 --> 00:11:29,410
And what I'm gonna do
is build this program

265
00:11:30,288 --> 00:11:31,230
and then I'm gonna run it.

266
00:11:31,230 --> 00:11:33,390
And I want you to notice the output.

267
00:11:33,390 --> 00:11:35,840
Now the first output you're seeing,

268
00:11:35,840 --> 00:11:39,230
is when we did the %+v.

269
00:11:39,230 --> 00:11:41,280
We're doing the %+v.

270
00:11:41,280 --> 00:11:43,950
And what you see here
is a full stack trace,

271
00:11:43,950 --> 00:11:46,680
here's the AppError context, state 99.

272
00:11:46,680 --> 00:11:50,840
And we can see stack traces
from the different levels

273
00:11:50,840 --> 00:11:53,290
of calls that we were making in the app.

274
00:11:53,290 --> 00:11:54,770
Here's firstCall right?

275
00:11:54,770 --> 00:11:56,700
So that's runtime, main,

276
00:11:56,700 --> 00:11:58,880
main all the way to firstCall.

277
00:11:58,880 --> 00:11:59,840
There it is.

278
00:11:59,840 --> 00:12:03,050
And then we've got our context
of what we've passed in,

279
00:12:03,050 --> 00:12:05,500
and here what we're
seeing is the full context

280
00:12:05,500 --> 00:12:08,590
from main to firstCall to secondCall,

281
00:12:08,590 --> 00:12:10,900
and then the context to thirdCall.

282
00:12:10,900 --> 00:12:15,820
So we can see individually
how these different layers

283
00:12:15,820 --> 00:12:18,540
that we got from our wrapping, came in.

284
00:12:18,540 --> 00:12:22,220
And then I can use just
the v without the +

285
00:12:22,220 --> 00:12:27,220
and what you see now is just
the context, the user context.

286
00:12:27,640 --> 00:12:30,260
So I can look at a stack
trace from any point

287
00:12:30,260 --> 00:12:33,800
in the call chain and
then I can also look at

288
00:12:33,800 --> 00:12:35,750
just the user context right?

289
00:12:35,750 --> 00:12:39,400
So we see here firstCall line 51.

290
00:12:39,400 --> 00:12:41,270
Let's go to line 51.

291
00:12:41,270 --> 00:12:42,103
Look at that.

292
00:12:42,103 --> 00:12:45,150
We know that there was failure
on that call to secondCall,

293
00:12:45,150 --> 00:12:47,820
that's where we were, inside of there.

294
00:12:47,820 --> 00:12:51,390
Again if we look at firstCall 52,

295
00:12:51,390 --> 00:12:53,540
we know that that's where
the wrap call happened.

296
00:12:53,540 --> 00:12:56,750
We got a tremendous amount of context here

297
00:12:56,750 --> 00:12:59,310
about where we were and
we don't have to worry

298
00:12:59,310 --> 00:13:01,210
about logging all the way up right?

299
00:13:01,210 --> 00:13:03,110
We're not gonna separate
logging from error handling.

300
00:13:03,110 --> 00:13:04,550
It's just one thing.

301
00:13:04,550 --> 00:13:06,770
So from a user perspective remember

302
00:13:06,770 --> 00:13:08,800
this is all we're gonna
have to now remember.

303
00:13:08,800 --> 00:13:10,640
But from a developer perspective,

304
00:13:10,640 --> 00:13:12,300
what we're gonna just say is,

305
00:13:12,300 --> 00:13:14,290
oh my goodness, an error occurred.

306
00:13:14,290 --> 00:13:15,820
Are we gonna handle it or not?

307
00:13:15,820 --> 00:13:18,260
If we're not gonna handle it, we wrap it.

308
00:13:18,260 --> 00:13:19,730
If we are gonna handle it,

309
00:13:19,730 --> 00:13:21,490
then we deal with it.

310
00:13:21,490 --> 00:13:24,050
We log it and then we make decisions

311
00:13:24,050 --> 00:13:25,940
about whether we can recover or not.

312
00:13:25,940 --> 00:13:29,240
This is a fantastic
pattern and I really again,

313
00:13:29,240 --> 00:13:30,073
want you to,

314
00:13:30,073 --> 00:13:32,220
I want you to not feel like
you've gotta log everything

315
00:13:32,220 --> 00:13:33,633
as an insurance policy.

316
00:13:34,684 --> 00:13:37,240
Signal is everything in
logging because logging

317
00:13:37,240 --> 00:13:39,110
costs you a lot.

318
00:13:39,110 --> 00:13:42,350
It's going to cost you
lots of allocations,

319
00:13:42,350 --> 00:13:45,870
so the logs that you're writing
better be worth the cost.

320
00:13:45,870 --> 00:13:48,490
They better be truly information you need

321
00:13:48,490 --> 00:13:49,890
if there is a problem,

322
00:13:49,890 --> 00:13:52,240
and information from a tracing perspective

323
00:13:52,240 --> 00:13:54,570
to allow you to know that
the system is healthy.

324
00:13:54,570 --> 00:13:56,620
Don't forget that we have dashboards.

325
00:13:56,620 --> 00:13:57,480
We have metrics.

326
00:13:57,480 --> 00:13:59,520
I don't like writing data into the logs.

327
00:13:59,520 --> 00:14:03,160
I'm not a big fan of structured
logging because for me,

328
00:14:03,160 --> 00:14:07,580
logs serve the purpose of being
able to find and fix bugs,

329
00:14:07,580 --> 00:14:09,190
but that's me.

330
00:14:09,190 --> 00:14:12,430
And I rather use my metric
systems and my dashboards

331
00:14:12,430 --> 00:14:14,000
for those data points,

332
00:14:14,000 --> 00:14:16,020
and try to tie that stuff together.

333
00:14:16,020 --> 00:14:18,360
So I really want you to look
at Dave Cheney's package.

334
00:14:18,360 --> 00:14:22,080
You'll find it on GitHub under pkg/errors

335
00:14:22,080 --> 00:14:25,220
and this pattern is something
that we use at Ardan,

336
00:14:25,220 --> 00:14:27,950
and a lot of people that are
my clients that I teach use,

337
00:14:27,950 --> 00:14:30,050
and it's been very, very effective for us.