1 00:00:00,000 --> 00:00:06,766 [No Audio] 2 00:00:06,767 --> 00:00:09,100 A regular expression is a search pattern that 3 00:00:09,101 --> 00:00:11,533 describes a certain text that we want to find 4 00:00:11,534 --> 00:00:14,000 in the input text. This means that we will 5 00:00:14,001 --> 00:00:16,400 have two inputs to the system, the 6 00:00:16,401 --> 00:00:18,900 input text and the regex, and the output of 7 00:00:18,901 --> 00:00:20,700 the system will be the instances of the 8 00:00:20,701 --> 00:00:23,066 search pattern that the system is able to 9 00:00:23,067 --> 00:00:25,833 find in the input text. Please note that these 10 00:00:25,834 --> 00:00:27,666 are going to be very concise and short 11 00:00:27,667 --> 00:00:30,466 tutorials on regexes or regular expressions. 12 00:00:30,700 --> 00:00:33,033 If you want to learn more in depth about 13 00:00:33,034 --> 00:00:35,700 regexes, then you may reach out to me, and I 14 00:00:35,701 --> 00:00:38,066 will give you a free coupon for one of my 15 00:00:38,067 --> 00:00:40,066 other courses on regular expressions, which has 16 00:00:40,067 --> 00:00:43,500 an in-depth tutorial on regexes. Since 17 00:00:43,501 --> 00:00:45,900 this course is about Rust, these 18 00:00:45,901 --> 00:00:48,300 tutorials will not be very elaborate, 19 00:00:48,301 --> 00:00:50,766 and will therefore be short and more concise. 20 00:00:51,500 --> 00:00:53,833 The search pattern may be used for text 21 00:00:53,834 --> 00:00:56,266 searching to perform operations such as 22 00:00:56,267 --> 00:00:58,700 find and replace, or finding some particular piece 23 00:00:58,701 --> 00:01:00,866 of information in the text, such as email 24 00:01:00,867 --> 00:01:03,966 addresses or phone numbers.
Due to their high 25 00:01:03,967 --> 00:01:06,366 importance in solving problems encountered in 26 00:01:06,400 --> 00:01:08,700 programming tasks, regexes in many 27 00:01:08,701 --> 00:01:11,200 programming languages are implemented as part 28 00:01:11,201 --> 00:01:13,733 of the standard library. Rust does not 29 00:01:13,734 --> 00:01:15,900 provide them in the standard library, since the 30 00:01:15,901 --> 00:01:18,200 intention in Rust is to keep the standard 31 00:01:18,201 --> 00:01:21,400 library as small as possible. However, 32 00:01:21,401 --> 00:01:23,833 there are some useful crates which can be 33 00:01:23,834 --> 00:01:27,600 used in this regard. In this and some of the 34 00:01:27,601 --> 00:01:29,500 upcoming tutorials, we will be looking at the 35 00:01:29,501 --> 00:01:32,100 crate called regex, which is typically used 36 00:01:32,101 --> 00:01:34,000 for regular expressions in 37 00:01:34,001 --> 00:01:37,333 Rust. I will first add it to the Cargo.toml file. 38 00:01:37,334 --> 00:01:41,500 [No Audio] 39 00:01:41,501 --> 00:01:43,000 Let us first bring the relevant 40 00:01:43,001 --> 00:01:44,533 items of the crate into scope. 41 00:01:44,534 --> 00:01:50,100 [No Audio] 42 00:01:50,101 --> 00:01:52,833 Let us create a simple regex using the new function. 43 00:01:52,834 --> 00:01:57,166 [No Audio] 44 00:01:57,167 --> 00:01:59,466 The new function returns a Result enum, 45 00:01:59,467 --> 00:02:01,600 so we have unwrapped it to 46 00:02:01,601 --> 00:02:04,500 obtain the regex. If the regex is not 47 00:02:04,501 --> 00:02:07,400 syntactically correct, then it will return an error. 48 00:02:07,533 --> 00:02:09,600 The r before the regex is used to 49 00:02:09,601 --> 00:02:11,933 indicate to Rust that it is going to be a raw 50 00:02:11,934 --> 00:02:14,833 string. The text inside the double quotes is 51 00:02:14,866 --> 00:02:17,500 our regular expression.
There are many 52 00:02:17,501 --> 00:02:19,466 constructs in a regular expression, and the 53 00:02:19,467 --> 00:02:22,033 most fundamental one is called a character 54 00:02:22,034 --> 00:02:24,666 class. Character classes are indicated 55 00:02:24,667 --> 00:02:28,333 by square brackets. A character class tells the regex engine 56 00:02:28,334 --> 00:02:30,733 to match only one out of the several characters 57 00:02:30,734 --> 00:02:32,566 that are mentioned inside the square 58 00:02:32,567 --> 00:02:35,666 brackets. In this example, we are telling it to 59 00:02:35,667 --> 00:02:38,666 match, in the input text, first one 60 00:02:38,667 --> 00:02:41,366 of the three letters, either letter p, or 61 00:02:41,367 --> 00:02:44,366 letter r, or letter t, followed by the mandatory 62 00:02:44,367 --> 00:02:48,300 letters ain, which must be matched. Now, 63 00:02:48,301 --> 00:02:50,266 let us define a string which we want to 64 00:02:50,267 --> 00:02:52,933 investigate for possible matches. I will 65 00:02:52,934 --> 00:02:56,100 define a string reference containing some text. 66 00:02:56,101 --> 00:03:02,500 [No Audio] 67 00:03:02,501 --> 00:03:04,500 Next, to determine if the regular expression 68 00:03:04,501 --> 00:03:07,200 has a match in the input text or not, there 69 00:03:07,201 --> 00:03:09,600 is a function called is_match, 70 00:03:09,601 --> 00:03:12,500 which returns either true or false. Let us 71 00:03:12,501 --> 00:03:14,100 use it inside the print statement. 72 00:03:14,101 --> 00:03:19,566 [No Audio] 73 00:03:19,567 --> 00:03:21,100 Let us cargo run this now. 74 00:03:21,101 --> 00:03:25,500 [No Audio] 75 00:03:25,501 --> 00:03:28,800 There is a match.
To know where exactly the 76 00:03:28,801 --> 00:03:31,033 first match is in the input string, the crate 77 00:03:31,034 --> 00:03:33,333 provides another function called find, which 78 00:03:33,334 --> 00:03:35,700 returns the starting and ending index of the 79 00:03:35,701 --> 00:03:38,866 first match in the input string. Let us call 80 00:03:38,867 --> 00:03:41,300 it inside the print statement and display the result. 81 00:03:41,301 --> 00:03:48,333 [No Audio] 82 00:03:48,334 --> 00:03:51,133 These functions are useful; however, we have 83 00:03:51,134 --> 00:03:53,566 not yet seen the specific text which has been 84 00:03:53,567 --> 00:03:56,433 found as a capture. The crate provides 85 00:03:56,434 --> 00:03:58,933 convenient iterators for matching an expression 86 00:03:58,934 --> 00:04:01,300 repeatedly against a search string to find 87 00:04:01,301 --> 00:04:03,966 successive non-overlapping matches. 88 00:04:03,967 --> 00:04:06,033 Let us use one in a for loop. 89 00:04:06,034 --> 00:04:12,300 [No Audio] 90 00:04:12,301 --> 00:04:14,333 Let us print the individual captures 91 00:04:14,334 --> 00:04:15,933 inside a print statement now. 92 00:04:15,934 --> 00:04:22,800 [No Audio] 93 00:04:22,801 --> 00:04:25,533 The captures_iter function returns a Captures 94 00:04:25,534 --> 00:04:28,000 structure for each match, which has the individual captures 95 00:04:28,001 --> 00:04:31,500 stored inside it, which we can access using 96 00:04:31,501 --> 00:04:34,233 the index of zero. There are some regexes 97 00:04:34,234 --> 00:04:36,333 which have different parts to match, and the 98 00:04:36,334 --> 00:04:39,000 indexes are used in those cases to refer to 99 00:04:39,001 --> 00:04:41,400 the different parts of a single regex 100 00:04:41,401 --> 00:04:43,600 which have been matched. We will look into the 101 00:04:43,601 --> 00:04:46,000 details of that later on.
For now, it is 102 00:04:46,001 --> 00:04:48,433 sufficient to know that the individual matched 103 00:04:48,466 --> 00:04:51,600 texts are accessed using this syntax. 104 00:04:51,601 --> 00:04:53,800 The result in this case would be rain and 105 00:04:53,801 --> 00:04:56,866 pain. Let us execute and then we will explain. 106 00:04:56,867 --> 00:05:03,900 [No Audio] 107 00:05:03,901 --> 00:05:06,100 The first two letters r in the first word 108 00:05:06,133 --> 00:05:08,533 do not match, because after the r the 109 00:05:08,534 --> 00:05:11,333 regex demands the letters ain, which are 110 00:05:11,334 --> 00:05:13,800 absent after the first two letters r. 111 00:05:13,900 --> 00:05:16,766 The last r matches, because we have the letters 112 00:05:16,767 --> 00:05:19,800 ain afterwards. The first letter in the 113 00:05:19,801 --> 00:05:22,133 second word does not match any of the 114 00:05:22,134 --> 00:05:24,766 characters in our character class, and is 115 00:05:24,767 --> 00:05:28,166 therefore not part of the match. The letter p 116 00:05:28,167 --> 00:05:30,066 following the letter s matches, because the 117 00:05:30,067 --> 00:05:32,466 letter p is in the character class, and is 118 00:05:32,467 --> 00:05:35,433 followed by the mandatory letters ain. 119 00:05:35,966 --> 00:05:38,166 The dot in a regex will match any single 120 00:05:38,167 --> 00:05:41,166 character except a newline, including letters and digits. Inside a 121 00:05:41,167 --> 00:05:44,100 character class, the dot is only a literal dot. 122 00:05:44,466 --> 00:05:46,900 Let us modify the regex and put a dot after 123 00:05:46,901 --> 00:05:50,300 the character class.
This means now that in 124 00:05:50,301 --> 00:05:53,233 order to have a match, we need to have 125 00:05:53,234 --> 00:05:57,000 one of the letters p, r, or t at the 126 00:05:57,001 --> 00:05:59,266 start, followed by any character and then the 127 00:05:59,267 --> 00:06:03,400 mandatory letters a, i, and n. Let us execute. 128 00:06:03,401 --> 00:06:10,800 [No Audio] 129 00:06:10,801 --> 00:06:13,733 You may note that it also matches the extra r 130 00:06:13,734 --> 00:06:16,500 in the rain due to the dot. The character 131 00:06:16,501 --> 00:06:18,933 class is quite useful, and can be used for 132 00:06:18,934 --> 00:06:21,166 spell checking. Let me add a new regular 133 00:06:21,167 --> 00:06:23,533 expression for checking the spelling of gray. 134 00:06:23,534 --> 00:06:33,233 [No Audio] 135 00:06:33,234 --> 00:06:35,066 The spelling will be correct if we have the 136 00:06:35,067 --> 00:06:38,600 letters g and r, followed by 137 00:06:38,633 --> 00:06:41,633 either a or e, followed by the mandatory letter 138 00:06:41,634 --> 00:06:44,166 y. Let us add some text for checking. 139 00:06:44,167 --> 00:06:49,933 [No Audio] 140 00:06:49,934 --> 00:06:51,900 Let us paste the code for the possible 141 00:06:51,933 --> 00:06:53,800 matches now and cargo run. 142 00:06:53,801 --> 00:06:56,033 [No Audio] 143 00:06:56,034 --> 00:06:57,400 You may note that only the 144 00:06:57,401 --> 00:06:58,900 valid spellings are 145 00:06:58,901 --> 00:07:01,200 being returned. Similar to character 146 00:07:01,201 --> 00:07:03,500 classes are character ranges, which are used 147 00:07:03,501 --> 00:07:05,833 to check if characters in a certain range are 148 00:07:05,834 --> 00:07:08,266 part of the text or not. For instance, to 149 00:07:08,267 --> 00:07:10,433 check for all the lowercase characters, we 150 00:07:10,434 --> 00:07:12,800 will use the syntax which will look like this.
151 00:07:12,801 --> 00:07:17,366 [No Audio] 152 00:07:17,367 --> 00:07:20,200 The small a followed by a dash and then z 153 00:07:20,201 --> 00:07:22,733 means all the characters from a to z. 154 00:07:23,033 --> 00:07:25,600 The overall meaning of the regex is now that we need 155 00:07:25,601 --> 00:07:27,800 to have a lowercase letter at the start, 156 00:07:27,801 --> 00:07:31,100 followed by the mandatory letters a, i, and n. 157 00:07:31,101 --> 00:07:33,533 Let us use some example text. 158 00:07:33,534 --> 00:07:38,566 [No Audio] 159 00:07:38,567 --> 00:07:40,666 Finally, I will add the code for printing 160 00:07:40,667 --> 00:07:42,266 out the relevant matches. 161 00:07:42,267 --> 00:07:44,466 [No Audio] 162 00:07:44,467 --> 00:07:45,733 Let us cargo run this now. 163 00:07:45,734 --> 00:07:49,900 [No Audio] 164 00:07:49,901 --> 00:07:52,500 It returns the possible matches. The last 165 00:07:52,501 --> 00:07:54,400 word did not match, because at the start of 166 00:07:54,401 --> 00:07:56,500 the word we do not have a lowercase 167 00:07:56,501 --> 00:07:59,866 letter. We can mention multiple ranges inside 168 00:07:59,867 --> 00:08:02,633 the square brackets also. For instance, I will 169 00:08:02,634 --> 00:08:05,500 include all the uppercase letters by mentioning 170 00:08:05,501 --> 00:08:09,900 capital A-Z. This will now mean that at 171 00:08:09,901 --> 00:08:12,266 the start we need to have one uppercase 172 00:08:12,267 --> 00:08:14,733 or lowercase letter. Inside the square 173 00:08:14,734 --> 00:08:17,733 brackets, we can exclude certain ranges also. 174 00:08:17,866 --> 00:08:20,133 The exclusion is done with the help of the 175 00:08:20,134 --> 00:08:22,433 caret symbol at the start of the character class. 176 00:08:22,633 --> 00:08:25,066 For instance, if I want to exclude all the 177 00:08:25,067 --> 00:08:26,933 lowercase characters from the start of the 178 00:08:26,934 --> 00:08:29,566 match, I will mention something like this.
179 00:08:29,567 --> 00:08:34,765 [No Audio] 180 00:08:34,766 --> 00:08:36,100 Let us execute to see what 181 00:08:36,101 --> 00:08:37,566 happens to the matches. 182 00:08:37,567 --> 00:08:43,200 [No Audio] 183 00:08:43,201 --> 00:08:45,300 You may note that it has just returned a 184 00:08:45,301 --> 00:08:49,200 match for the word 0ain. This is 185 00:08:49,201 --> 00:08:51,766 because all the remaining words have a 186 00:08:51,767 --> 00:08:54,400 lowercase letter at the beginning and are 187 00:08:54,401 --> 00:08:58,100 therefore not matched. There are some 188 00:08:58,101 --> 00:09:00,333 shorthands for the frequently used character 189 00:09:00,334 --> 00:09:03,700 classes, and they include backslash small w 190 00:09:03,833 --> 00:09:08,200 and backslash small d. \w is a shorthand for the 191 00:09:08,201 --> 00:09:11,066 character class including all uppercase and lowercase 192 00:09:11,067 --> 00:09:13,933 characters, digits, and underscore. On the 193 00:09:13,934 --> 00:09:16,500 other hand, \d is shorthand for all 194 00:09:16,501 --> 00:09:19,166 digits from 0 up to 9. Let us have 195 00:09:19,167 --> 00:09:22,400 some examples. I want to match a local 196 00:09:22,401 --> 00:09:24,900 telephone number, which is six digits long in 197 00:09:24,901 --> 00:09:27,433 our area, so I will write a regex which 198 00:09:27,434 --> 00:09:29,400 contains only six digits. 199 00:09:29,401 --> 00:09:34,800 [No Audio] 200 00:09:34,801 --> 00:09:37,300 Next, I will define a simple text containing 201 00:09:37,301 --> 00:09:39,600 some text and one telephone number. 202 00:09:39,601 --> 00:09:44,400 [No Audio] 203 00:09:44,401 --> 00:09:47,300 Let us also include the code for printing the matches. 204 00:09:48,633 --> 00:09:50,600 This will now return the match for 205 00:09:50,601 --> 00:09:52,533 the phone number from the input text. 206 00:09:52,534 --> 00:09:53,900 Let us cargo run this.
207 00:09:53,901 --> 00:09:59,100 [No Audio] 208 00:09:59,101 --> 00:10:01,666 Okay, let us take a short quiz. I will write 209 00:10:01,667 --> 00:10:04,033 a simple regex and some input text. 210 00:10:04,034 --> 00:10:10,700 [No Audio] 211 00:10:10,701 --> 00:10:13,033 Can you guess what this regex will return? 212 00:10:13,034 --> 00:10:14,933 I will give you five seconds to think. 213 00:10:14,934 --> 00:10:21,466 [No Audio] 214 00:10:21,467 --> 00:10:23,500 In this case, the regex will return the two 215 00:10:23,501 --> 00:10:25,833 phone numbers. This is because the regex 216 00:10:25,834 --> 00:10:28,200 means that at the start we need to have a 217 00:10:28,201 --> 00:10:30,966 digit followed by any five characters. A caret 218 00:10:30,967 --> 00:10:33,800 before the shorthand, inside square brackets, will mean a negation. 219 00:10:33,933 --> 00:10:36,233 For instance, if we use a 220 00:10:36,234 --> 00:10:39,833 caret followed by \d inside square brackets in this regex, then it 221 00:10:39,834 --> 00:10:42,000 would mean to match anything which is not a 222 00:10:42,001 --> 00:10:45,300 digit. The small \s is another very 223 00:10:45,301 --> 00:10:47,366 useful shorthand that is used for matching a 224 00:10:47,367 --> 00:10:51,266 whitespace character. Next, let us talk about the 225 00:10:51,267 --> 00:10:55,300 starting and ending anchors. The caret, when 226 00:10:55,301 --> 00:10:57,533 used outside the square brackets, is used to 227 00:10:57,534 --> 00:11:00,033 indicate the start of the input text. 228 00:11:00,034 --> 00:11:02,100 For instance, let me write a regex.
229 00:11:02,101 --> 00:11:06,466 [No Audio] 230 00:11:06,467 --> 00:11:08,433 The caret, when used at the start inside 231 00:11:08,434 --> 00:11:10,166 square brackets, including 232 00:11:10,167 --> 00:11:12,566 with shorthands there, has a meaning of 233 00:11:12,567 --> 00:11:15,033 negation, but in this case, it will match 234 00:11:15,034 --> 00:11:17,166 at the start of the input text before any 235 00:11:17,167 --> 00:11:18,700 character in the input text. 236 00:11:19,133 --> 00:11:21,066 So, let us add some input text. 237 00:11:21,067 --> 00:11:26,233 [No Audio] 238 00:11:26,234 --> 00:11:29,333 I will next include the code for printing the matches. 239 00:11:29,334 --> 00:11:32,066 [No Audio] 240 00:11:32,067 --> 00:11:34,133 In this case, the caret will match at 241 00:11:34,134 --> 00:11:36,133 the start of the input text, and then the 242 00:11:36,134 --> 00:11:38,733 regex demands the letters a followed by 243 00:11:38,734 --> 00:11:42,433 b and then a. Since the input text has this, 244 00:11:42,434 --> 00:11:45,333 it will match. Let us cargo run to confirm. 245 00:11:45,334 --> 00:11:52,166 [No Audio] 246 00:11:52,167 --> 00:11:54,500 If I remove the starting a in the text, then 247 00:11:54,501 --> 00:11:57,000 it will not match, because the caret matches at 248 00:11:57,001 --> 00:11:58,933 the start of the text, and then the regex 249 00:11:58,934 --> 00:12:01,566 demands the letter a, which is not there. 250 00:12:01,567 --> 00:12:03,366 Let us cargo run again. 251 00:12:04,900 --> 00:12:06,533 In summary, the caret is 252 00:12:06,534 --> 00:12:08,733 used if you want a regex to match at the 253 00:12:08,734 --> 00:12:12,000 start of the input text only. Similar to the 254 00:12:12,033 --> 00:12:14,266 starting anchor, we have the ending anchor, 255 00:12:14,267 --> 00:12:16,766 which will match at the end of the input text. 256 00:12:16,833 --> 00:12:19,166 Let us define a regex and some input text.
257 00:12:19,167 --> 00:12:25,366 [No Audio] 258 00:12:25,367 --> 00:12:28,066 Let me also include the code for matching. 259 00:12:28,833 --> 00:12:31,200 In this case now, the regex will match 260 00:12:31,201 --> 00:12:33,866 the dollar at the ending of the text, and then, 261 00:12:33,900 --> 00:12:36,300 counting back from the ending of the text, the last character 262 00:12:36,301 --> 00:12:38,866 should be c, and the second last character 263 00:12:38,867 --> 00:12:41,566 should be b. Since the input text contains 264 00:12:41,567 --> 00:12:43,500 such a match, the match will be 265 00:12:43,501 --> 00:12:46,500 successful. A key point to note is that the 266 00:12:46,501 --> 00:12:49,233 starting and ending anchors do not match any 267 00:12:49,234 --> 00:12:51,200 characters themselves; they are just telling 268 00:12:51,201 --> 00:12:53,966 the regex engine to match at the start and at 269 00:12:53,967 --> 00:12:55,466 the end of the input text. 270 00:12:56,900 --> 00:12:58,466 We can use the starting and 271 00:12:58,467 --> 00:13:00,200 ending anchors at the same time. 272 00:13:00,201 --> 00:13:01,600 Let us define a regex. 273 00:13:01,601 --> 00:13:07,366 [No Audio] 274 00:13:07,367 --> 00:13:09,633 In this case, the caret will force the 275 00:13:09,634 --> 00:13:12,733 match to begin at the start, followed by bc, and the 276 00:13:12,734 --> 00:13:15,233 ending dollar will force the match to finish at the 277 00:13:15,234 --> 00:13:18,866 end. The dollar sign requires that the last 278 00:13:18,867 --> 00:13:21,266 letter should be c, and the second last letter 279 00:13:21,267 --> 00:13:24,000 should be b. Since all these requirements are 280 00:13:24,001 --> 00:13:26,066 fulfilled, there will be a 281 00:13:26,067 --> 00:13:29,233 successful match. Let us take another quiz. I 282 00:13:29,234 --> 00:13:31,200 will add a regex, and you will have to guess 283 00:13:31,201 --> 00:13:32,533 what it will match.
284 00:13:32,534 --> 00:13:38,800 [No Audio] 285 00:13:38,801 --> 00:13:40,833 I will give you five seconds again. 286 00:13:40,834 --> 00:13:46,900 [No Audio] 287 00:13:46,901 --> 00:13:49,600 This will match any input text which consists of 288 00:13:49,601 --> 00:13:52,633 exactly two digits. Let us now talk about 289 00:13:52,666 --> 00:13:54,900 word boundaries, which are matched with the help 290 00:13:54,901 --> 00:13:58,166 of a special metacharacter called \b. 291 00:13:58,167 --> 00:14:01,000 It will match at three places, that is, 292 00:14:01,001 --> 00:14:03,500 at the start of the input text before the first 293 00:14:03,501 --> 00:14:06,533 character if it is a word character, or at the end of the input text 294 00:14:06,566 --> 00:14:09,433 after the last character if it is a word character, and between two 295 00:14:09,434 --> 00:14:11,333 characters, if one is a word 296 00:14:11,334 --> 00:14:13,866 character and the other one is not a word 297 00:14:13,867 --> 00:14:17,066 character. Again I will repeat that it will 298 00:14:17,067 --> 00:14:20,066 match either at the start or at the end, and 299 00:14:20,067 --> 00:14:21,933 between two characters, if one is a word 300 00:14:21,934 --> 00:14:24,266 character and the other one is not a word 301 00:14:24,267 --> 00:14:26,600 character. Let us use it in some 302 00:14:26,601 --> 00:14:29,533 examples. I will include a simple regex. 303 00:14:29,534 --> 00:14:35,733 [No Audio] 304 00:14:35,734 --> 00:14:38,166 Let me also include a simple text. 305 00:14:38,167 --> 00:14:42,100 [No Audio] 306 00:14:42,101 --> 00:14:44,333 Now I will include the code for matching. 307 00:14:46,066 --> 00:14:48,866 In this case now, the \b will match at the start 308 00:14:48,867 --> 00:14:52,000 before any character.
Like the caret and the dollar, 309 00:14:52,001 --> 00:14:54,300 it does not match any character, but rather 310 00:14:54,301 --> 00:14:56,766 matches a position, which is the start or the end, 311 00:14:57,000 --> 00:14:58,966 or any other position where the character on one 312 00:14:58,967 --> 00:15:02,233 side is a word character 313 00:15:02,234 --> 00:15:04,900 and the character on the other side is a non-word 314 00:15:04,901 --> 00:15:07,600 character. In this case, the \b will 315 00:15:07,601 --> 00:15:09,866 first match at the start position. After the 316 00:15:09,867 --> 00:15:12,300 start, the regex requires any word 317 00:15:12,301 --> 00:15:14,933 character, so it will match the letter H 318 00:15:14,934 --> 00:15:18,066 in the word Hi. Next, at the position between 319 00:15:18,067 --> 00:15:21,300 the two words Hi and my, it will match the 320 00:15:21,301 --> 00:15:24,300 position after the letter i, because before 321 00:15:24,301 --> 00:15:26,400 this position there is a word character, and 322 00:15:26,401 --> 00:15:28,966 after this position there is a space, that is, a 323 00:15:29,000 --> 00:15:31,700 non-word character. However, the pattern will 324 00:15:31,701 --> 00:15:34,833 not match, because after this position we do 325 00:15:34,834 --> 00:15:37,300 not have a word character; rather, we have a 326 00:15:37,301 --> 00:15:40,833 space. Next, it will match the position just 327 00:15:40,834 --> 00:15:43,800 before the letter m, because it also satisfies 328 00:15:43,801 --> 00:15:46,566 the requirements. After this position, 329 00:15:46,567 --> 00:15:49,066 there is the word character m, so it will 330 00:15:49,100 --> 00:15:52,400 return a match there also. In summary, it 331 00:15:52,401 --> 00:15:54,166 will return all the starting letters of the 332 00:15:54,167 --> 00:15:56,966 words. Let us cargo run to confirm.
333 00:15:56,967 --> 00:16:03,633 [No Audio] 334 00:16:03,634 --> 00:16:05,933 A star in the regex means zero or more 335 00:16:05,934 --> 00:16:08,733 repetitions of a certain pattern. So, if I put a 336 00:16:08,734 --> 00:16:12,266 star on the \w, it will mean zero or more 337 00:16:12,267 --> 00:16:14,900 word characters. In this case now, it 338 00:16:14,901 --> 00:16:17,066 will match the complete words in the input 339 00:16:17,067 --> 00:16:19,200 text. Let us cargo run this to confirm. 340 00:16:19,201 --> 00:16:26,000 [No Audio] 341 00:16:26,001 --> 00:16:27,300 We end this tutorial here, 342 00:16:27,301 --> 00:16:28,900 as it is getting a bit lengthy. 343 00:16:29,066 --> 00:16:31,166 In the next tutorial, we will try to wind up 344 00:16:31,167 --> 00:16:33,033 the topic, and we'll cover some more material 345 00:16:33,034 --> 00:16:35,400 related to regexes. Do come back for 346 00:16:35,401 --> 00:16:37,866 that, and until then, happy Rust programming. 347 00:16:37,867 --> 00:16:42,300 [No Audio]