1 00:00:06,560 --> 00:00:09,350 - Let's get into a little bit now of design, 2 00:00:09,350 --> 00:00:12,260 because one of the things I find that is lacking in 3 00:00:12,260 --> 00:00:14,850 a lot of Go code is logging consistency. 4 00:00:14,850 --> 00:00:16,990 I spent 30 years trying to figure out 5 00:00:16,990 --> 00:00:20,450 how to combine error handling with logging 6 00:00:20,450 --> 00:00:22,960 and I finally came to the realization that, 7 00:00:22,960 --> 00:00:26,640 actually I tried to for 30 years, not try to combine it, 8 00:00:26,640 --> 00:00:28,010 but separate it. 9 00:00:28,010 --> 00:00:30,910 I always wanted to try to get the error handling 10 00:00:30,910 --> 00:00:34,080 kind of separated in terms of frameworks and logic, 11 00:00:34,080 --> 00:00:37,700 around how we log things, this separation, 12 00:00:37,700 --> 00:00:39,050 and I really came to the realization 13 00:00:39,050 --> 00:00:40,850 that you can't separate these two things. 14 00:00:40,850 --> 00:00:44,480 That error handling and logging are just this one thing, 15 00:00:44,480 --> 00:00:46,250 and we've gotta bring them together if we want 16 00:00:46,250 --> 00:00:47,230 any consistency in them. 17 00:00:47,230 --> 00:00:49,310 I'm always worried about consistency. 18 00:00:49,310 --> 00:00:53,120 And so I always love asking this question to everybody 19 00:00:53,120 --> 00:00:56,930 in the trainings that we do live, and that is, 20 00:00:56,930 --> 00:00:59,793 if, let's say it's three o'clock in the morning, 21 00:01:00,800 --> 00:01:03,200 and an error occurs in your software right? 22 00:01:03,200 --> 00:01:04,230 An error occurs. 23 00:01:04,230 --> 00:01:07,520 But the error is handled by the software, 24 00:01:07,520 --> 00:01:09,290 and the program just continues to run. 25 00:01:09,290 --> 00:01:10,850 There's no real hiccup. 26 00:01:10,850 --> 00:01:13,940 Maybe there was for a second or two, 27 00:01:13,940 --> 00:01:16,530 but it recovered, we keep going. 
28 00:01:16,530 --> 00:01:18,460 My first question is do you wanna wake up, 29 00:01:18,460 --> 00:01:21,260 or do you wanna be woken up at three in the morning 30 00:01:21,260 --> 00:01:23,020 because that just happened? 31 00:01:23,020 --> 00:01:25,530 Remember nobody needs you, 32 00:01:25,530 --> 00:01:27,450 the software just kept running. 33 00:01:27,450 --> 00:01:30,134 The reality is that I don't wanna wake up. 34 00:01:30,134 --> 00:01:31,690 I mean the software did what it's supposed to do. 35 00:01:31,690 --> 00:01:34,840 It identified failure, it recovered, it keeps moving. 36 00:01:34,840 --> 00:01:37,060 So I don't want to, 37 00:01:37,060 --> 00:01:38,010 don't wake me up. 38 00:01:38,010 --> 00:01:39,230 Let me sleep. 39 00:01:39,230 --> 00:01:41,340 Now when I wake up in the morning and I get to work, 40 00:01:41,340 --> 00:01:45,010 I wanna see some indication that there was a problem right? 41 00:01:45,010 --> 00:01:46,690 I don't wanna be blind to it, 42 00:01:46,690 --> 00:01:48,130 but I do wanna see it. 43 00:01:48,130 --> 00:01:50,190 But the real question now becomes, 44 00:01:50,190 --> 00:01:54,480 am I gonna spend time looking at the logs when I get in, 45 00:01:54,480 --> 00:01:56,360 because right now again, it's not needed. 46 00:01:56,360 --> 00:01:57,470 It was recovered. 47 00:01:57,470 --> 00:02:00,260 Maybe on my dashboard for metrics, 48 00:02:00,260 --> 00:02:01,580 I see that there was an error, 49 00:02:01,580 --> 00:02:02,930 I don't know much more. 50 00:02:02,930 --> 00:02:04,840 But the program kept going. 51 00:02:04,840 --> 00:02:06,150 Here's the reality. 52 00:02:06,150 --> 00:02:09,220 The reality is, is probably I'm so busy with meetings, 53 00:02:09,220 --> 00:02:10,410 and other priorities, 54 00:02:10,410 --> 00:02:12,390 I never go back to look at the logs 55 00:02:12,390 --> 00:02:14,580 because it's no longer an issue. 
56 00:02:14,580 --> 00:02:16,100 It's not that I don't want to, 57 00:02:16,100 --> 00:02:18,460 it's not that I'm not necessarily curious, 58 00:02:18,460 --> 00:02:20,700 but it's just human nature to say, 59 00:02:20,700 --> 00:02:23,340 I've got other priorities. 60 00:02:23,340 --> 00:02:25,650 What I'm trying to stress here is that 61 00:02:25,650 --> 00:02:29,170 we write applications that log a lot of things, 62 00:02:29,170 --> 00:02:33,560 and most of the time we're logging as an insurance policy 63 00:02:33,560 --> 00:02:36,850 to be able to find bugs when errors occur. 64 00:02:36,850 --> 00:02:39,990 And I did that for a long time but the reality is, 65 00:02:39,990 --> 00:02:43,310 is that there's too much activity on our systems today. 66 00:02:43,310 --> 00:02:46,910 Our user bases can grow to a million people almost overnight 67 00:02:46,910 --> 00:02:50,630 and so logging that much has a huge significant cost. 68 00:02:50,630 --> 00:02:52,670 A lot of times logging is going to create 69 00:02:52,670 --> 00:02:54,940 a large amount of allocations, 70 00:02:54,940 --> 00:02:57,440 which is going to put a lot of pressure on your heap. 71 00:02:57,440 --> 00:03:01,060 Now that's not unique to Go, but we are talking about Go. 72 00:03:01,060 --> 00:03:04,610 So I want you to consider that logging is important, 73 00:03:04,610 --> 00:03:09,610 but we've got to constantly balance signal to noise 74 00:03:10,260 --> 00:03:12,880 in the log because if you're writing logs, 75 00:03:12,880 --> 00:03:14,150 writing data to your logs, 76 00:03:14,150 --> 00:03:18,360 that you end up never, ever reading or using, 77 00:03:18,360 --> 00:03:20,830 you're wasting CPU cycles on something 78 00:03:21,670 --> 00:03:24,100 that you could've been doing actual real work. 79 00:03:24,100 --> 00:03:27,300 And it goes beyond just the CPU cycles of your process. 
80 00:03:27,300 --> 00:03:30,360 You're eating network bandwidth, disk I/O bandwidth, 81 00:03:30,360 --> 00:03:33,280 other complexities that go through the entire system. 82 00:03:33,280 --> 00:03:36,530 So during development I really wanna make sure 83 00:03:36,530 --> 00:03:41,310 that we always have a good level of signal in our logs 84 00:03:41,310 --> 00:03:43,500 and we're logging from a trace perspective 85 00:03:43,500 --> 00:03:45,180 the bare minimum we need, 86 00:03:45,180 --> 00:03:47,440 but then we're logging the errors in a way 87 00:03:47,440 --> 00:03:50,550 that there's always enough context if we wanna take the time 88 00:03:50,550 --> 00:03:52,970 or need to take the time to look at it. 89 00:03:52,970 --> 00:03:56,370 And that's also been a big problem for me for 30 years. 90 00:03:56,370 --> 00:03:59,610 How do you make sure there's enough context in the log, 91 00:03:59,610 --> 00:04:02,000 both from a tracing perspective, bare minimum, 92 00:04:02,000 --> 00:04:05,500 and then an error perspective and not duplicate errors 93 00:04:05,500 --> 00:04:07,970 throughout a log and at the same time, 94 00:04:07,970 --> 00:04:11,550 minimize those log writes and then, let's add more to it, 95 00:04:11,550 --> 00:04:16,550 have a consistent pattern that we all can follow 96 00:04:16,720 --> 00:04:19,150 and review during code reviews, 97 00:04:19,150 --> 00:04:22,800 where we're doing logging the same way and it's not random? 98 00:04:22,800 --> 00:04:24,670 Oh my God my head wants to hurt right? 99 00:04:24,670 --> 00:04:25,913 All of these things. 100 00:04:26,990 --> 00:04:28,820 You might find it hard to believe that I don't believe 101 00:04:28,820 --> 00:04:30,080 in logging levels. 102 00:04:30,080 --> 00:04:33,170 I've never been able to turn a logging level up in time 103 00:04:33,170 --> 00:04:35,070 to get more detailed information. 104 00:04:35,070 --> 00:04:37,160 So I'm not a believer in logging levels.
105 00:04:37,160 --> 00:04:40,540 I tend to use the standard library log package quite a bit. 106 00:04:40,540 --> 00:04:42,940 I will create my own logger and pass 107 00:04:42,940 --> 00:04:45,340 that around the application but, 108 00:04:45,340 --> 00:04:48,310 either I need this information or I don't, 109 00:04:48,310 --> 00:04:51,260 and I make sure throughout my unit testing and any other 110 00:04:51,260 --> 00:04:55,420 integration testing that I do that my logs have signal. 111 00:04:55,420 --> 00:04:57,470 Now I wanna show you a pattern using 112 00:04:57,470 --> 00:04:59,750 Dave Cheney's errors package. 113 00:04:59,750 --> 00:05:03,170 Dave Cheney's error package is very, really nice, 114 00:05:03,170 --> 00:05:07,680 and it gives us a consistent way to apply error handling, 115 00:05:07,680 --> 00:05:11,160 apply logging and have code consistency throughout, 116 00:05:11,160 --> 00:05:13,400 minimizing again a lot of pain. 117 00:05:13,400 --> 00:05:16,250 But remember when I talked about handling an error? 118 00:05:16,250 --> 00:05:17,500 That's a big thing. 119 00:05:17,500 --> 00:05:21,200 Now that I can't separate logging from error handling, 120 00:05:21,200 --> 00:05:24,810 what do I mean when I talk about error handling an error? 121 00:05:24,810 --> 00:05:26,770 I mean really a couple of things. 122 00:05:26,770 --> 00:05:29,660 When we talk about error handling what we're gonna mean is, 123 00:05:29,660 --> 00:05:32,190 is that when a piece of code decides to handle the error, 124 00:05:32,190 --> 00:05:34,640 it means that, that code is responsible for logging it 125 00:05:34,640 --> 00:05:36,930 and logging in the full context of it. 126 00:05:36,930 --> 00:05:39,520 It also means that code has to make a decision. 127 00:05:39,520 --> 00:05:42,040 Can we recover or not? 128 00:05:42,040 --> 00:05:43,640 If the answer is no, 129 00:05:43,640 --> 00:05:46,070 then error handling means we shut down the app. 
130 00:05:46,070 --> 00:05:49,590 Either with the stack trace on the panic call or os.Exit. 131 00:05:49,590 --> 00:05:51,674 If we can recover, 132 00:05:51,674 --> 00:05:54,240 then that code has to recover the application back 133 00:05:54,240 --> 00:05:57,420 to its correct state and keep it going. 134 00:05:57,420 --> 00:06:00,130 At the end of the day when that piece of code returns 135 00:06:00,130 --> 00:06:02,890 it never, ever returns another error back up. 136 00:06:02,890 --> 00:06:04,470 The error stops there. 137 00:06:04,470 --> 00:06:07,430 Either the app is recovered or it shuts down. 138 00:06:07,430 --> 00:06:09,700 But we've also logged the error. 139 00:06:09,700 --> 00:06:11,400 That is error handling to me. 140 00:06:11,400 --> 00:06:13,980 And this is what this pattern is going to be perfect 141 00:06:13,980 --> 00:06:16,470 for with Dave Cheney's errors package. 142 00:06:16,470 --> 00:06:18,523 So imagine this piece of code right here. 143 00:06:19,460 --> 00:06:22,030 We're at the bottom of a call chain 144 00:06:22,030 --> 00:06:24,040 and we make this call to thirdCall. 145 00:06:24,040 --> 00:06:26,863 Some function above us makes a call to thirdCall. 146 00:06:27,821 --> 00:06:30,690 And you can see here that thirdCall always fails. 147 00:06:30,690 --> 00:06:34,270 It's failing with this custom error type called AppError. 148 00:06:34,270 --> 00:06:35,520 We're using pointer semantics, 149 00:06:35,520 --> 00:06:37,830 we're gonna store the address of AppError in there, 150 00:06:37,830 --> 00:06:40,490 and we've set the error code to 99. 151 00:06:40,490 --> 00:06:41,910 This is very similar to maybe like 152 00:06:41,910 --> 00:06:46,660 a standard library function that is gonna return these raw, 153 00:06:46,660 --> 00:06:48,600 or these true error values.
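The thirdCall function described above could be sketched roughly as follows. This is a reconstruction from the spoken description, not the code shown on screen: the field name State and the message text are assumptions.

```go
package main

import "fmt"

// AppError is a custom error type. The field name State is an assumption;
// the talk only says the error carries a code of 99.
type AppError struct {
	State int
}

// Error implements the error interface using pointer semantics.
func (e *AppError) Error() string {
	return fmt.Sprintf("App Error, State: %d", e.State)
}

// thirdCall always fails. It stores the address of an AppError value
// inside the returned error interface value.
func thirdCall() error {
	return &AppError{State: 99}
}

func main() {
	err := thirdCall()
	fmt.Println(err)
}
```

Because the method uses a pointer receiver, only *AppError satisfies the error interface, which is why thirdCall returns the address of the value.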
154 00:06:48,600 --> 00:06:52,250 Now secondCall was the one that called thirdCall, 155 00:06:52,250 --> 00:06:54,730 and you see our traditional mechanics in Go 156 00:06:54,730 --> 00:06:56,170 where we're gonna use the if statement 157 00:06:56,170 --> 00:06:57,820 to do the error handling right? 158 00:06:57,820 --> 00:06:59,620 We're staying away from the else clauses. 159 00:06:59,620 --> 00:07:01,600 Happy path in the first tab. 160 00:07:01,600 --> 00:07:04,070 And so here we go, we call thirdCall, 161 00:07:04,070 --> 00:07:06,090 we get back an error, interface value. 162 00:07:06,090 --> 00:07:08,420 We ask is there a concrete value stored inside 163 00:07:08,420 --> 00:07:09,253 the error interface? 164 00:07:09,253 --> 00:07:10,750 The answer is yes. 165 00:07:10,750 --> 00:07:11,800 When the answer is yes, 166 00:07:11,800 --> 00:07:15,260 now the developer writing this code has to make a choice. 167 00:07:15,260 --> 00:07:16,770 It's really boolean. 168 00:07:16,770 --> 00:07:19,930 Am I gonna handle the error here, or not? 169 00:07:19,930 --> 00:07:22,300 If the answer is handling the error here, 170 00:07:22,300 --> 00:07:24,150 we deal with it like I said before. 171 00:07:24,150 --> 00:07:27,070 But if the answer is no, I'm not gonna handle the error, 172 00:07:27,070 --> 00:07:29,370 then there's only one thing you're allowed to do, 173 00:07:29,370 --> 00:07:33,470 and that is wrap the error with context right? 174 00:07:33,470 --> 00:07:35,300 We would prefer to handle the error, 175 00:07:35,300 --> 00:07:37,700 the lower in the call stack we handle the error, 176 00:07:37,700 --> 00:07:40,320 the better opportunity you're gonna have for recovery. 177 00:07:40,320 --> 00:07:42,750 But in this case the developer has decided, 178 00:07:42,750 --> 00:07:44,440 no we're not gonna handle the error. 179 00:07:44,440 --> 00:07:45,273 Think about this. 
180 00:07:45,273 --> 00:07:46,860 I don't have to worry about logging anymore 181 00:07:46,860 --> 00:07:48,700 and I don't have to worry about recovery anymore. 182 00:07:48,700 --> 00:07:51,320 All I worry about is the wrap call. 183 00:07:51,320 --> 00:07:54,740 Now the wrap call is interesting because it does two things. 184 00:07:54,740 --> 00:07:57,650 At this point remember that we've gotten an AppError. 185 00:07:57,650 --> 00:07:59,780 Here it is, this is our AppError, 186 00:07:59,780 --> 00:08:02,980 and our AppError is already wrapped against 187 00:08:02,980 --> 00:08:05,410 this error interface value. 188 00:08:05,410 --> 00:08:06,960 Now this is error, right? 189 00:08:06,960 --> 00:08:09,090 Wrapping the AppError. 190 00:08:09,090 --> 00:08:11,910 But what wrap is going to do now, 191 00:08:11,910 --> 00:08:15,920 is wrap context around what we have. 192 00:08:15,920 --> 00:08:18,140 And there's two types of context. 193 00:08:18,140 --> 00:08:21,900 There's gonna be call stack context 194 00:08:21,900 --> 00:08:25,690 and there's going to be user context. 195 00:08:25,690 --> 00:08:28,920 The call stack context is going to take a, 196 00:08:28,920 --> 00:08:31,600 where we are in the code at that line of code 197 00:08:31,600 --> 00:08:32,730 that we're doing the wrap. 198 00:08:32,730 --> 00:08:36,770 We now know exactly where we are when this error occurred. 199 00:08:36,770 --> 00:08:39,900 And we're also now also gonna be able to add 200 00:08:39,900 --> 00:08:41,090 some user context. 201 00:08:41,090 --> 00:08:43,760 In this case I'm just indicating that secondCall 202 00:08:43,760 --> 00:08:45,250 was calling thirdCall. 203 00:08:45,250 --> 00:08:48,090 So we're gonna get call stack and user context 204 00:08:48,090 --> 00:08:49,090 built in there. 205 00:08:49,090 --> 00:08:51,020 Now we take this error, we wrap it, 206 00:08:51,020 --> 00:08:53,160 and we send it back up the call stack. 
207 00:08:53,160 --> 00:08:56,240 And when we do, firstCall now is involved. 208 00:08:56,240 --> 00:08:58,960 Remember firstCall is calling secondCall 209 00:08:58,960 --> 00:09:01,070 with the parameter of i. 210 00:09:01,070 --> 00:09:03,360 Now an error occurred, we know that right? 211 00:09:03,360 --> 00:09:06,410 We know that secondCall returned this error. 212 00:09:06,410 --> 00:09:08,154 Well guess what? 213 00:09:08,154 --> 00:09:10,350 firstCall now has to decide am I gonna handle the error? 214 00:09:10,350 --> 00:09:14,080 If the answer is no, then you only have one choice. 215 00:09:14,080 --> 00:09:18,250 We wrap the error again with that context, 216 00:09:18,250 --> 00:09:22,480 our user context and our call context. 217 00:09:22,480 --> 00:09:27,390 Notice this time our user context is showing the parameters, 218 00:09:27,390 --> 00:09:30,960 or the values we passed into secondCall. 219 00:09:30,960 --> 00:09:32,410 This is brilliant stuff. 220 00:09:32,410 --> 00:09:36,300 You can share any context you need and if you get bugs 221 00:09:36,300 --> 00:09:37,960 and the context isn't enough, 222 00:09:37,960 --> 00:09:40,580 this is what you're gonna be improving, the context. 223 00:09:40,580 --> 00:09:43,450 If I'm writing a database type of app and I don't have 224 00:09:43,450 --> 00:09:45,860 a security issue writing queries to the logs, 225 00:09:45,860 --> 00:09:48,520 that would be a query so I can copy and paste it 226 00:09:48,520 --> 00:09:50,340 out of the log and run it directly. 227 00:09:50,340 --> 00:09:53,690 Here I love showing what the input was, 228 00:09:53,690 --> 00:09:56,590 because I'm not gonna get that in the call stack context, 229 00:09:56,590 --> 00:09:58,650 but at least I can see what values were passed. 230 00:09:58,650 --> 00:09:59,800 That could really help. 231 00:10:00,887 --> 00:10:04,500 So I've got, this is the original error right here. 232 00:10:04,500 --> 00:10:06,690 This is one wrap right here. 
233 00:10:06,690 --> 00:10:09,450 This is another wrapping right there. 234 00:10:09,450 --> 00:10:11,760 And we go to main. 235 00:10:11,760 --> 00:10:14,050 Now main made a call to firstCall, 236 00:10:14,050 --> 00:10:15,000 it gets back the error. 237 00:10:15,000 --> 00:10:16,870 It says, is there a concrete value stored inside 238 00:10:16,870 --> 00:10:17,703 the error interface? 239 00:10:17,703 --> 00:10:19,620 There absolutely is. 240 00:10:19,620 --> 00:10:22,020 Therefore we now come into this. 241 00:10:22,020 --> 00:10:25,060 Now the mechanics that I showed you on the case, 242 00:10:25,060 --> 00:10:27,740 doing the generic type assertions, guess what? 243 00:10:27,740 --> 00:10:31,230 We get to continue to do that because of the cause function. 244 00:10:31,230 --> 00:10:33,440 What's brilliant about the cause function, 245 00:10:33,440 --> 00:10:37,870 is it knows how to unwind this down and get us back 246 00:10:37,870 --> 00:10:40,360 to the root error value. 247 00:10:40,360 --> 00:10:42,470 And now we're gonna type assert against 248 00:10:42,470 --> 00:10:44,230 the root error value, 249 00:10:44,230 --> 00:10:47,580 and when we do that we can do the case on line 30. 250 00:10:47,580 --> 00:10:50,640 But I think what's even more powerful is this. 251 00:10:50,640 --> 00:10:53,930 If you're using the standard library fmt or log packages, 252 00:10:53,930 --> 00:10:58,930 like I do, if you put a %+v in the formatting, guess what? 253 00:10:59,510 --> 00:11:03,650 The full stack trace and context of this whole thing 254 00:11:03,650 --> 00:11:05,760 gets logged to, in this case, 255 00:11:05,760 --> 00:11:07,120 standard out or standard error, 256 00:11:07,120 --> 00:11:08,050 wherever you're logging. 257 00:11:08,050 --> 00:11:10,270 And if you just do %v, 258 00:11:10,270 --> 00:11:12,870 then you're just gonna get your user context. 259 00:11:12,870 --> 00:11:15,780 %v just gives you the user context. 
260 00:11:15,780 --> 00:11:18,320 %+v gives you both. 261 00:11:18,320 --> 00:11:21,960 Now I wanna show you this running so you see it visualizing 262 00:11:21,960 --> 00:11:22,800 it right here. 263 00:11:22,800 --> 00:11:25,943 Let me copy the path and get us over into our terminal. 264 00:11:27,260 --> 00:11:29,410 And what I'm gonna do is build this program 265 00:11:30,288 --> 00:11:31,230 and then I'm gonna run it. 266 00:11:31,230 --> 00:11:33,390 And I want you to notice the output. 267 00:11:33,390 --> 00:11:35,840 Now the first output you're seeing, 268 00:11:35,840 --> 00:11:39,230 is when we did the, 269 00:11:39,230 --> 00:11:41,280 we're doing the %+v. 270 00:11:41,280 --> 00:11:43,950 And what you see here is a full stack trace, 271 00:11:43,950 --> 00:11:46,680 here's the AppError context state 99. 272 00:11:46,680 --> 00:11:50,840 And we can see stack traces from the different levels 273 00:11:50,840 --> 00:11:53,290 of calls that we were making in the app. 274 00:11:53,290 --> 00:11:54,770 Here's firstCall right? 275 00:11:54,770 --> 00:11:56,700 So that's runtime, main, 276 00:11:56,700 --> 00:11:58,880 main all the way to firstCall. 277 00:11:58,880 --> 00:11:59,840 There it is. 278 00:11:59,840 --> 00:12:03,050 And then we've got our context of what we've passed in, 279 00:12:03,050 --> 00:12:05,500 and here what we're seeing is the full context 280 00:12:05,500 --> 00:12:08,590 from main to firstCall to secondCall, 281 00:12:08,590 --> 00:12:10,900 and then the context to thirdCall. 282 00:12:10,900 --> 00:12:15,820 So we can see individually how these different layers 283 00:12:15,820 --> 00:12:18,540 that we got from our wrapping, came in. 284 00:12:18,540 --> 00:12:22,220 And then I can use just the v without the + 285 00:12:22,220 --> 00:12:27,220 and what you see now is just the context, the user context.
286 00:12:27,640 --> 00:12:30,260 So I can look at a stack trace from any point 287 00:12:30,260 --> 00:12:33,800 in the call chain and then I can also look at 288 00:12:33,800 --> 00:12:35,750 just the user context right? 289 00:12:35,750 --> 00:12:39,400 So we see here firstCall line 51. 290 00:12:39,400 --> 00:12:41,270 Let's go to line 51. 291 00:12:41,270 --> 00:12:42,103 Look at that. 292 00:12:42,103 --> 00:12:45,150 We know that there was failure on that call to secondCall, 293 00:12:45,150 --> 00:12:47,820 that's where we were, inside of there. 294 00:12:47,820 --> 00:12:51,390 Again if we look at firstCall 52, 295 00:12:51,390 --> 00:12:53,540 we know that that's where the wrap call happened. 296 00:12:53,540 --> 00:12:56,750 We got a tremendous amount of context here 297 00:12:56,750 --> 00:12:59,310 about where we were and we don't have to worry 298 00:12:59,310 --> 00:13:01,210 about logging all the way up right? 299 00:13:01,210 --> 00:13:03,110 We're not gonna separate logging from error handling. 300 00:13:03,110 --> 00:13:04,550 It's just one thing. 301 00:13:04,550 --> 00:13:06,770 So from a user perspective remember 302 00:13:06,770 --> 00:13:08,800 this is all we're gonna have to now remember. 303 00:13:08,800 --> 00:13:10,640 But from a developer perspective, 304 00:13:10,640 --> 00:13:12,300 what we're gonna just say is, 305 00:13:12,300 --> 00:13:14,290 oh my goodness, an error occurred. 306 00:13:14,290 --> 00:13:15,820 Are we gonna handle it or not? 307 00:13:15,820 --> 00:13:18,260 If we're not gonna handle it, we wrap it. 308 00:13:18,260 --> 00:13:19,730 If we are gonna handle it, 309 00:13:19,730 --> 00:13:21,490 then we deal with it. 310 00:13:21,490 --> 00:13:24,050 We log it and then we make decisions 311 00:13:24,050 --> 00:13:25,940 about whether we can recover or not.
312 00:13:25,940 --> 00:13:29,240 This is a fantastic pattern and I really again, 313 00:13:29,240 --> 00:13:30,073 want you to, 314 00:13:30,073 --> 00:13:32,220 I want you to not feel like you've gotta log everything 315 00:13:32,220 --> 00:13:33,633 as an insurance policy. 316 00:13:34,684 --> 00:13:37,240 Signal is everything in logging because logging 317 00:13:37,240 --> 00:13:39,110 costs you a lot. 318 00:13:39,110 --> 00:13:42,350 It's going to cost you lots of allocations, 319 00:13:42,350 --> 00:13:45,870 so the logs that you're writing better be worth the cost. 320 00:13:45,870 --> 00:13:48,490 They better be truly information you need 321 00:13:48,490 --> 00:13:49,890 if there is a problem, 322 00:13:49,890 --> 00:13:52,240 and information from a tracing perspective 323 00:13:52,240 --> 00:13:54,570 to allow you to know that the system is healthy. 324 00:13:54,570 --> 00:13:56,620 Don't forget that we have dashboards. 325 00:13:56,620 --> 00:13:57,480 We have metrics. 326 00:13:57,480 --> 00:13:59,520 I don't like writing data into the logs. 327 00:13:59,520 --> 00:14:03,160 I'm not a big fan of structured logging because for me, 328 00:14:03,160 --> 00:14:07,580 logs serve the purpose of being able to find and fix bugs, 329 00:14:07,580 --> 00:14:09,190 but that's me. 330 00:14:09,190 --> 00:14:12,430 And I'd rather use my metric systems and my dashboards 331 00:14:12,430 --> 00:14:14,000 for those data points, 332 00:14:14,000 --> 00:14:16,020 and try to tie that stuff together. 333 00:14:16,020 --> 00:14:18,360 So I really want you to look at Dave Cheney's package. 334 00:14:18,360 --> 00:14:22,080 You'll find it on GitHub under pkg/errors 335 00:14:22,080 --> 00:14:25,220 and this pattern is something that we use at Ardan, 336 00:14:25,220 --> 00:14:27,950 and a lot of people that are my clients that I teach use, 337 00:14:27,950 --> 00:14:30,050 and it's been very, very effective for us.