1 00:00:00,000 --> 00:00:06,666 [No Audio] 2 00:00:06,667 --> 00:00:09,133 In this tutorial, we will be talking about 3 00:00:09,134 --> 00:00:11,266 performance improvements that are possible 4 00:00:11,267 --> 00:00:13,433 with the help of Clippy-provided lints. 5 00:00:14,166 --> 00:00:16,366 Lints are warnings or suggestions provided 6 00:00:16,367 --> 00:00:18,533 by the Rust compiler to help you write more 7 00:00:18,534 --> 00:00:21,866 idiomatic and correct code. Lints can catch 8 00:00:21,867 --> 00:00:24,100 common mistakes or code patterns that may 9 00:00:24,101 --> 00:00:26,600 lead to bugs or suboptimal performance. 10 00:00:26,966 --> 00:00:29,666 There are many built-in lints in Rust, and 11 00:00:29,667 --> 00:00:32,333 you can also define your own custom lints. 12 00:00:32,700 --> 00:00:35,466 Let us look at some of the available lints. 13 00:00:35,467 --> 00:00:39,566 [No Audio] 14 00:00:39,567 --> 00:00:42,266 There are many lints available. Each lint is 15 00:00:42,300 --> 00:00:45,200 associated with a level, which can be either 16 00:00:45,266 --> 00:00:49,166 allow, warn, or deny. A lint with the level 17 00:00:49,167 --> 00:00:52,366 allow will do nothing by default. These lints 18 00:00:52,367 --> 00:00:55,766 exist mostly to be manually turned on 19 00:00:55,767 --> 00:00:59,033 via configuration. The warn lint level will 20 00:00:59,034 --> 00:01:02,066 produce a warning if you violate the lint. A 21 00:01:02,067 --> 00:01:04,566 deny lint produces an error if you violate 22 00:01:04,567 --> 00:01:07,033 it. Please note that None is not a lint 23 00:01:07,034 --> 00:01:09,733 level, but it somehow appears here in the 24 00:01:09,734 --> 00:01:12,266 interface. The lints are grouped into various 25 00:01:12,267 --> 00:01:14,433 categories. We are interested in the 26 00:01:14,434 --> 00:01:17,466 performance-related lints, so let me select them.
27 00:01:17,467 --> 00:01:20,666 [No Audio] 28 00:01:20,667 --> 00:01:22,833 All these lints have the level of 29 00:01:22,834 --> 00:01:25,666 warn, which means that violating these will 30 00:01:25,667 --> 00:01:28,533 cause a warning message. If you scroll down, 31 00:01:28,566 --> 00:01:30,766 there are a whole bunch of these. 32 00:01:31,200 --> 00:01:33,700 Clicking a specific lint will show its full 33 00:01:33,701 --> 00:01:36,833 details along with examples and explanations. 34 00:01:38,600 --> 00:01:40,466 In this tutorial, we will be going through 35 00:01:40,467 --> 00:01:43,033 some of these which are more common and 36 00:01:43,034 --> 00:01:45,833 frequently encountered. Of course, we cannot 37 00:01:45,834 --> 00:01:48,233 go through all of them, as it would require a 38 00:01:48,234 --> 00:01:51,266 lot of time. The first lint is related to the 39 00:01:51,267 --> 00:01:54,800 use of Box. Let us look at an example. I 40 00:01:54,801 --> 00:01:56,933 will define a struct containing a single field. 41 00:01:56,934 --> 00:01:59,666 [No Audio] 42 00:01:59,667 --> 00:02:03,066 Let us run Clippy to see if there are any issues. 43 00:02:03,067 --> 00:02:05,533 [No Audio] 44 00:02:05,534 --> 00:02:07,200 It gives out a warning 45 00:02:07,201 --> 00:02:09,833 message, saying that you seem to be trying to 46 00:02:09,834 --> 00:02:14,500 use Box<Vec<..>>, consider using just Vec<..>. If we 47 00:02:14,501 --> 00:02:16,666 look at the next line, it has a very good 48 00:02:16,667 --> 00:02:19,833 explanation. It says Vec is already on the 49 00:02:19,834 --> 00:02:23,233 heap. Box<Vec<..>> makes an extra allocation, 50 00:02:23,234 --> 00:02:26,066 since collections already keep their contents 51 00:02:26,067 --> 00:02:29,066 in a separate area on the heap.
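The struct typed on screen is not captured in the transcript, so the following is only a minimal sketch of the two shapes being compared. The names `Boxed`, `Plain`, `values`, `sum_boxed` and `sum_plain` are hypothetical and chosen purely for illustration.

```rust
/// Shape that Clippy is expected to warn about: the Box adds a second
/// pointer hop in front of a Vec that already stores its items on the heap.
pub struct Boxed {
    pub values: Box<Vec<i32>>,
}

/// Simpler shape: the Vec alone already owns a heap buffer.
pub struct Plain {
    pub values: Vec<i32>,
}

/// Both shapes are used the same way; only the indirection differs.
pub fn sum_boxed(s: &Boxed) -> i32 {
    s.values.iter().sum()
}

pub fn sum_plain(s: &Plain) -> i32 {
    s.values.iter().sum()
}
```

Both structs hold the same data and support the same operations, which is why the extra Box buys nothing here.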
So if we Box 52 00:02:29,067 --> 00:02:31,666 them, we are just unnecessarily adding 53 00:02:31,700 --> 00:02:33,833 another level of indirection without any 54 00:02:33,834 --> 00:02:36,500 benefit whatsoever. So a more efficient 55 00:02:36,501 --> 00:02:38,866 syntax would be to keep the field as a simple 56 00:02:38,867 --> 00:02:43,166 Vec of i32 values. In general, for better 57 00:02:43,167 --> 00:02:45,333 performance, we should avoid allocation on 58 00:02:45,334 --> 00:02:47,433 the heap, since heap management is typically 59 00:02:47,434 --> 00:02:50,466 done through the operating system. In the 60 00:02:50,467 --> 00:02:53,566 same way, unnecessarily boxing a value is not 61 00:02:53,567 --> 00:02:55,766 a good practice from a performance perspective. 62 00:02:56,000 --> 00:02:58,133 For instance, let me define a u32 63 00:02:58,134 --> 00:02:59,966 primitive which is being boxed. 64 00:02:59,967 --> 00:03:01,966 [No Audio] 65 00:03:01,967 --> 00:03:04,866 This should be avoided. The lint which checks for such 66 00:03:04,867 --> 00:03:08,933 issues is the boxed_local lint. In 67 00:03:08,934 --> 00:03:11,366 general, we should only Box a type when 68 00:03:11,367 --> 00:03:14,100 there is a compelling reason to do so. We 69 00:03:14,101 --> 00:03:16,433 have seen some use cases of Box, such as 70 00:03:16,434 --> 00:03:18,600 undetermined size at compile time due to 71 00:03:18,601 --> 00:03:21,833 recursive types. Let us look at one more use 72 00:03:21,834 --> 00:03:24,966 case where we should be using Box. This 73 00:03:24,967 --> 00:03:27,200 use case is explained in the lint 74 00:03:27,201 --> 00:03:30,466 large_enum_variant. First, I 75 00:03:30,467 --> 00:03:33,566 will comment out the previous code.
Consider 76 00:03:33,567 --> 00:03:35,433 an enum with two variants. 77 00:03:35,434 --> 00:03:38,600 [No Audio] 78 00:03:38,601 --> 00:03:39,900 The first variant takes 79 00:03:39,901 --> 00:03:41,666 only a few bytes, but the second 80 00:03:41,667 --> 00:03:43,966 variant will occupy quite a large space in 81 00:03:43,967 --> 00:03:46,666 memory. Now, when we create an instance of 82 00:03:46,667 --> 00:03:49,933 the enum, it will be given memory 83 00:03:49,934 --> 00:03:52,800 equal to the largest variant size. If the 84 00:03:52,801 --> 00:03:55,566 large variant occurs very infrequently, 85 00:03:55,567 --> 00:03:58,400 then we will be wasting quite a lot of memory 86 00:03:58,401 --> 00:04:01,233 space, making unnecessarily large allocations. This 87 00:04:01,234 --> 00:04:04,133 can be improved by making use of the Box 88 00:04:04,134 --> 00:04:07,633 smart pointer. Let us Box the large variant. 89 00:04:07,900 --> 00:04:10,666 The Box in this case will take space equal 90 00:04:10,667 --> 00:04:12,600 to the size of the pointer which is pointing 91 00:04:12,601 --> 00:04:15,566 to some resource on the heap, which is far 92 00:04:15,567 --> 00:04:18,800 less than the previous size. Please note that 93 00:04:18,801 --> 00:04:20,666 this implementation may not always be 94 00:04:20,667 --> 00:04:23,433 feasible and depends on the usage of the enum 95 00:04:23,434 --> 00:04:25,533 inside the code. For instance, if the 96 00:04:25,534 --> 00:04:28,166 frequency of variant_2 is 97 00:04:28,167 --> 00:04:31,700 99%, then we will be making quite frequent 98 00:04:31,701 --> 00:04:34,333 heap allocations, which are again expensive. 99 00:04:34,500 --> 00:04:37,033 So it all depends on the specific situation 100 00:04:37,034 --> 00:04:40,600 and scenario. A similar lint to this one is 101 00:04:40,601 --> 00:04:43,766 called result_large_err.
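The size effect of boxing the large variant, as described above, can be sketched as follows. The enum names, variant names and the 4096-byte payload are assumptions made for illustration; the enum actually typed in the video is not part of the transcript.

```rust
use std::mem::size_of;

/// Every value of this enum is at least as large as its biggest variant.
pub enum Unboxed {
    Small(u8),
    Large([u8; 4096]),
}

/// Here the large payload lives on the heap, so the enum itself only
/// needs room for a pointer plus a discriminant.
pub enum Boxed {
    Small(u8),
    Large(Box<[u8; 4096]>),
}

/// Returns (size of the unboxed enum, size of the boxed enum) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Unboxed>(), size_of::<Boxed>())
}
```

The trade-off mentioned in the narration still applies: each `Boxed::Large` value costs one heap allocation, so this only pays off when the large variant is rare.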
102 00:04:43,767 --> 00:04:45,000 [No Audio] 103 00:04:45,001 --> 00:04:47,666 As we are aware, Result is also an 104 00:04:47,667 --> 00:04:50,133 enum, so an error type for the 105 00:04:50,134 --> 00:04:52,500 Result which occupies a large amount of space 106 00:04:52,501 --> 00:04:54,866 can also be boxed. Since errors are 107 00:04:54,867 --> 00:04:57,366 generally few in number, 108 00:04:57,367 --> 00:04:59,666 boxing the error will most of the time be 109 00:04:59,667 --> 00:05:01,766 beneficial and will provide useful 110 00:05:01,767 --> 00:05:04,966 performance benefits. The rest of the 111 00:05:04,967 --> 00:05:07,200 details of that lint are very similar, so 112 00:05:07,201 --> 00:05:10,100 we will skip them. The key 113 00:05:10,101 --> 00:05:12,600 idea is that Box should be used when we have 114 00:05:12,601 --> 00:05:15,900 a compelling reason to use it. Let us look at 115 00:05:15,901 --> 00:05:18,400 another lint related to function calls. 116 00:05:18,533 --> 00:05:20,800 Consider a case where I would like to create 117 00:05:20,801 --> 00:05:24,900 a heap-allocated i32 integer with a default value of zero. 118 00:05:24,901 --> 00:05:28,500 [No Audio] 119 00:05:28,501 --> 00:05:29,900 In this line of code we are 120 00:05:29,901 --> 00:05:32,533 making two function calls. The first one is 121 00:05:32,534 --> 00:05:35,233 new, which will make a new Box by 122 00:05:35,234 --> 00:05:37,833 making a new heap allocation, and then 123 00:05:37,834 --> 00:05:40,033 we are calling default, which will set the 124 00:05:40,034 --> 00:05:43,533 default value. This syntax can be improved so 125 00:05:43,534 --> 00:05:45,733 that it makes just a single function call. 126 00:05:45,734 --> 00:05:47,066 Let us revise this. 127 00:05:47,067 --> 00:05:50,200 [No Audio] 128 00:05:50,201 --> 00:05:51,833 This syntax
is now more 129 00:05:51,834 --> 00:05:53,600 efficient from a performance perspective, 130 00:05:53,601 --> 00:05:56,100 because it is only making a single call 131 00:05:56,101 --> 00:05:58,100 instead of making two function calls, and 132 00:05:58,101 --> 00:06:00,566 secondly, it is simpler and contains 133 00:06:00,567 --> 00:06:03,800 less code. The general idea is to reduce 134 00:06:03,833 --> 00:06:07,066 unnecessary function calls. A side note in 135 00:06:07,067 --> 00:06:09,266 this regard is that when we make a function 136 00:06:09,267 --> 00:06:11,766 call, the program has to start execution 137 00:06:11,767 --> 00:06:14,000 from a different place in the code, which 138 00:06:14,001 --> 00:06:16,700 will require saving the local variables 139 00:06:16,701 --> 00:06:19,000 and registers and keeping track of the place 140 00:06:19,001 --> 00:06:22,166 to go back to after completion. All this 141 00:06:22,167 --> 00:06:24,633 is expensive compared to sequential execution 142 00:06:24,634 --> 00:06:27,200 of a program. This does not, however, mean 143 00:06:27,201 --> 00:06:30,133 that functions are always bad. 144 00:06:30,266 --> 00:06:31,866 We need to make a trade-off between the 145 00:06:31,867 --> 00:06:34,500 clarity or understanding of our code and its 146 00:06:34,501 --> 00:06:37,133 performance. The clearer and more 147 00:06:37,134 --> 00:06:39,566 understandable the code is, the easier it is 148 00:06:39,567 --> 00:06:42,366 to update and maintain. However, in some 149 00:06:42,367 --> 00:06:44,533 situations where performance is very 150 00:06:44,534 --> 00:06:46,666 critical, we may compromise to a certain 151 00:06:46,667 --> 00:06:48,333 level on the number of functions in a 152 00:06:48,334 --> 00:06:50,833 program. In this case, however, the point we 153 00:06:50,834 --> 00:06:53,466 want to emphasize is to avoid unnecessary 154 00:06:53,467 --> 00:06:56,133 function calls.
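The exact lines typed in the video are not in the transcript, so this is a hedged sketch of what the two-call and one-call forms for a boxed default i32 could look like. The function names are hypothetical.

```rust
/// Two calls: build the default i32 first, then move it into a new Box.
pub fn boxed_zero_two_calls() -> Box<i32> {
    Box::new(i32::default())
}

/// One call: Box<T> implements Default whenever T does,
/// so the default value can be created directly inside the Box.
pub fn boxed_zero_one_call() -> Box<i32> {
    Box::default()
}
```

Both functions return a Box holding 0; the second simply states the intent in one step.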
Sometimes we may 155 00:06:56,134 --> 00:06:59,533 unnecessarily make an owned copy for doing some 156 00:06:59,534 --> 00:07:02,133 operation, which is never used later on 157 00:07:02,134 --> 00:07:04,466 in the program. Let us consider a String 158 00:07:04,467 --> 00:07:06,700 variable and a string slice variable. 159 00:07:06,701 --> 00:07:08,866 [No Audio] 160 00:07:08,867 --> 00:07:12,033 We want to compare if the two values are equal using 161 00:07:12,034 --> 00:07:15,166 the comparison operator. We may note that the 162 00:07:15,167 --> 00:07:17,366 two variables are of different types. So let 163 00:07:17,367 --> 00:07:20,033 us convert the variable to the owned String type and 164 00:07:20,034 --> 00:07:21,666 then make the comparison. 165 00:07:21,667 --> 00:07:25,866 [No Audio] 166 00:07:25,867 --> 00:07:29,533 The comparison operator can operate on a reference, 167 00:07:29,534 --> 00:07:32,333 so creating an owned value effectively throws it 168 00:07:32,334 --> 00:07:35,300 away directly afterwards, which is needlessly 169 00:07:35,301 --> 00:07:38,266 consuming code and heap space. This means 170 00:07:38,267 --> 00:07:40,433 that the allocation is never assigned 171 00:07:40,434 --> 00:07:43,466 to anything, is only used within 172 00:07:43,467 --> 00:07:46,166 this line, and will be thrown away after this 173 00:07:46,167 --> 00:07:49,133 line. We may more effectively do the 174 00:07:49,134 --> 00:07:50,933 comparison on references. 175 00:07:50,934 --> 00:07:53,600 [No Audio] 176 00:07:53,601 --> 00:07:55,166 This revised syntax 177 00:07:55,167 --> 00:07:57,633 is more efficient and more effective. 178 00:07:57,966 --> 00:08:01,733 The next lint is extend_with_drain.
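Going back to the String and string slice comparison just discussed, a minimal sketch of the two versions might look like this. The function and parameter names are hypothetical, and the functions take `&String` only to mirror the owned String variable from the narration.

```rust
/// Allocates a temporary String from the slice, compares it,
/// and drops it again at the end of the expression.
pub fn equal_with_allocation(owned: &String, slice: &str) -> bool {
    *owned == slice.to_owned()
}

/// Compares through references only; no allocation takes place.
pub fn equal_by_reference(owned: &String, slice: &str) -> bool {
    owned == slice
}
```

The two functions always agree on the result; the difference is only the temporary heap allocation in the first one.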
179 00:08:01,734 --> 00:08:03,333 This lint relates to the 180 00:08:03,334 --> 00:08:06,100 two methods extend and append on vectors, 181 00:08:06,200 --> 00:08:08,666 which can both be used for adding items of 182 00:08:08,667 --> 00:08:11,700 one vector to another vector. Let us define 183 00:08:11,701 --> 00:08:13,166 a couple of vectors first. 184 00:08:13,167 --> 00:08:16,700 [No Audio] 185 00:08:16,701 --> 00:08:18,066 The extend method 186 00:08:18,067 --> 00:08:20,400 extends a collection with the contents of 187 00:08:20,401 --> 00:08:23,133 an iterator. In this case, we would like to 188 00:08:23,134 --> 00:08:26,966 extend a vector a with an iterator over 189 00:08:26,967 --> 00:08:30,100 the elements of b, so let us use this. 190 00:08:30,101 --> 00:08:34,232 [No Audio] 191 00:08:34,233 --> 00:08:37,765 The drain method returns an iterator that keeps a 192 00:08:37,766 --> 00:08:40,433 mutable borrow on the vector to optimize its 193 00:08:40,466 --> 00:08:43,765 implementation. The extend function adds 194 00:08:43,766 --> 00:08:46,766 items one after another, and is therefore 195 00:08:46,767 --> 00:08:49,100 not optimal, especially in situations where we 196 00:08:49,101 --> 00:08:51,166 would like the operation to be done at once 197 00:08:51,167 --> 00:08:53,433 and not for each item individually. 198 00:08:53,666 --> 00:08:56,266 The lint therefore suggests using the append 199 00:08:56,267 --> 00:08:59,633 function, which will add all the elements at once.
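The two ways of moving the items of b into a, as described above, can be sketched like this. The helper function names are hypothetical; `Vec::extend`, `Vec::drain` and `Vec::append` are the standard library methods the narration refers to.

```rust
/// Moves every item of `b` into `a` by draining `b` through an iterator.
pub fn move_with_extend(a: &mut Vec<i32>, b: &mut Vec<i32>) {
    a.extend(b.drain(..));
}

/// Moves every item of `b` into `a` in a single append call.
pub fn move_with_append(a: &mut Vec<i32>, b: &mut Vec<i32>) {
    a.append(b);
}
```

In both cases `b` is left empty and `a` ends up holding the items of both vectors in order.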
200 00:08:59,634 --> 00:09:02,633 [No Audio] 201 00:09:02,634 --> 00:09:04,600 You may note that append is 202 00:09:04,601 --> 00:09:07,766 simpler and requires only a mutable reference, 203 00:09:07,767 --> 00:09:09,600 while extend requires you to first 204 00:09:09,601 --> 00:09:12,566 provide an iterator, and then it will go through the 205 00:09:12,567 --> 00:09:15,000 items of the iterator to 206 00:09:15,001 --> 00:09:18,133 add them to the current collection. Please note 207 00:09:18,134 --> 00:09:20,933 that if your specific case demands performing 208 00:09:20,934 --> 00:09:23,600 any actions on the items before adding them, 209 00:09:23,900 --> 00:09:26,433 then it's usually better to use extend. 210 00:09:26,434 --> 00:09:29,266 However, if you only want to add the items 211 00:09:29,267 --> 00:09:31,366 directly, then it's better to use append 212 00:09:31,367 --> 00:09:34,200 for speed and efficiency. Let us cover a few 213 00:09:34,201 --> 00:09:37,166 more. The next lint is collapsible_str_replace. 214 00:09:37,167 --> 00:09:40,800 Consider a scenario in which we want 215 00:09:40,801 --> 00:09:42,700 to replace some characters in an input 216 00:09:42,701 --> 00:09:46,400 string. We can use the replace function for this purpose. 217 00:09:46,401 --> 00:09:54,800 [No Audio] 218 00:09:54,801 --> 00:09:57,500 These consecutive calls to replace can 219 00:09:57,501 --> 00:10:00,133 be collapsed into a single call. This can 220 00:10:00,134 --> 00:10:02,433 be done by calling replace once and 221 00:10:02,434 --> 00:10:05,033 giving it an array of characters instead 222 00:10:05,034 --> 00:10:06,566 of individual characters. 223 00:10:06,567 --> 00:10:14,233 [No Audio] 224 00:10:14,234 --> 00:10:15,900 This is more efficient for a couple of 225 00:10:15,901 --> 00:10:19,000 reasons.
First, it involves a smaller number 226 00:10:19,001 --> 00:10:21,300 of function calls, which as pointed out 227 00:10:21,301 --> 00:10:24,200 earlier are expensive. Secondly, consecutive 228 00:10:24,201 --> 00:10:26,700 calls to replace will scan the string 229 00:10:26,701 --> 00:10:29,633 multiple times, which will also lead to extra 230 00:10:29,634 --> 00:10:32,833 cost. There are many other useful lints which 231 00:10:32,834 --> 00:10:34,333 you may consider for improving the 232 00:10:34,334 --> 00:10:36,900 performance of your Rust code. The complete 233 00:10:36,901 --> 00:10:39,400 list is available at this link, 234 00:10:39,433 --> 00:10:45,733 that is, rust-lang.github.io/rust-clippy/master. 235 00:10:46,033 --> 00:10:47,766 I would strongly encourage you 236 00:10:47,767 --> 00:10:50,000 to go through it. With this we end this 237 00:10:50,001 --> 00:10:52,100 tutorial. See you again with more performance-related 238 00:10:52,101 --> 00:10:54,266 tutorials, and until the next tutorial, 239 00:10:54,300 --> 00:10:56,200 enjoy Rust programming.
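For reference, the chained and collapsed replace calls discussed in the last lint can be sketched as below. The characters 's' and 'u', the replacement "l", and the function names are assumptions for illustration, since the on-screen code is not in the transcript. Passing an array of chars as the pattern assumes a reasonably recent Rust toolchain.

```rust
/// Two passes over the text and one intermediate String.
pub fn replace_chained(input: &str) -> String {
    input.replace('s', "l").replace('u', "l")
}

/// One pass: an array of chars matches any of the listed characters.
pub fn replace_collapsed(input: &str) -> String {
    input.replace(['s', 'u'], "l")
}
```

The two forms give the same result here because the replacement text does not itself contain any of the characters being replaced.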