1
00:00:00,000 --> 00:00:06,766
[No Audio]

2
00:00:06,767 --> 00:00:09,100
A regular expression is a search pattern that

3
00:00:09,101 --> 00:00:11,533
describes a certain text that we want to find

4
00:00:11,534 --> 00:00:14,000
in the input text. This means that we will

5
00:00:14,001 --> 00:00:16,400
have two inputs to the system: the

6
00:00:16,401 --> 00:00:18,900
input text and the regex, and the output of

7
00:00:18,901 --> 00:00:20,700
the system will be the instances of the

8
00:00:20,701 --> 00:00:23,066
search pattern that the system is able to

9
00:00:23,067 --> 00:00:25,833
find in the input text. Please note that these

10
00:00:25,834 --> 00:00:27,666
are going to be very precise and short

11
00:00:27,667 --> 00:00:30,466
tutorials on regexes or regular expressions.

12
00:00:30,700 --> 00:00:33,033
And if you want to learn more in depth about

13
00:00:33,034 --> 00:00:35,700
regexes, then you may reach out to me, and I

14
00:00:35,701 --> 00:00:38,066
will give you a free coupon for one of my

15
00:00:38,067 --> 00:00:40,066
other courses on regular expressions for

16
00:00:40,067 --> 00:00:43,500
having an in depth tutorial on regexes. Since

17
00:00:43,501 --> 00:00:45,900
this course is about Rust, these

18
00:00:45,901 --> 00:00:48,300
tutorials will not be very elaborate,

19
00:00:48,301 --> 00:00:50,766
and will therefore be short and precise.

20
00:00:51,500 --> 00:00:53,833
The search pattern may be used for text

21
00:00:53,834 --> 00:00:56,266
searching to perform operations such as

22
00:00:56,267 --> 00:00:58,700
find and replace, or finding some particular piece

23
00:00:58,701 --> 00:01:00,866
of information in the text, such as email

24
00:01:00,867 --> 00:01:03,966
addresses or phone numbers. Due to their high

25
00:01:03,967 --> 00:01:06,366
importance in solving problems encountered in

26
00:01:06,400 --> 00:01:08,700
programming tasks, regexes in many

27
00:01:08,701 --> 00:01:11,200
programming languages are implemented as part

28
00:01:11,201 --> 00:01:13,733
of the standard library. Rust does not

29
00:01:13,734 --> 00:01:15,900
provide it in the standard library, since the

30
00:01:15,901 --> 00:01:18,200
intention in Rust is to keep the standard

31
00:01:18,201 --> 00:01:21,400
library as small as possible. However,

32
00:01:21,401 --> 00:01:23,833
there are some useful crates which can be

33
00:01:23,834 --> 00:01:27,600
used in this regard. In this and some of the

34
00:01:27,601 --> 00:01:29,500
upcoming tutorials, we will be looking at the

35
00:01:29,501 --> 00:01:32,100
crate called regex, which is typically used

36
00:01:32,101 --> 00:01:34,000
for regular expression implementation in

37
00:01:34,001 --> 00:01:37,333
Rust. I will first add it to the Cargo.toml file.

38
00:01:37,334 --> 00:01:41,500
[No Audio]

39
00:01:41,501 --> 00:01:43,000
Let us first bring the relevant

40
00:01:43,001 --> 00:01:44,533
modules in the crate into scope.

41
00:01:44,534 --> 00:01:50,100
[No Audio]

42
00:01:50,101 --> 00:01:52,833
Let us create a simple regex using the new function.

43
00:01:52,834 --> 00:01:57,166
[No Audio]

44
00:01:57,167 --> 00:01:59,466
The new function returns a Result enum,

45
00:01:59,467 --> 00:02:01,600
so we have unwrapped it to

46
00:02:01,601 --> 00:02:04,500
obtain the regex. If the regex is not

47
00:02:04,501 --> 00:02:07,400
syntactically correct, then it will return an error.

48
00:02:07,533 --> 00:02:09,600
The r before the regex is used to

49
00:02:09,601 --> 00:02:11,933
indicate to Rust that it is going to be a raw

50
00:02:11,934 --> 00:02:14,833
string. The text inside the double quotes is

51
00:02:14,866 --> 00:02:17,500
our regular expression. There are many

52
00:02:17,501 --> 00:02:19,466
constructs in a regular expression, and the

53
00:02:19,467 --> 00:02:22,033
most fundamental one is called character

54
00:02:22,034 --> 00:02:24,666
class. A character class is indicated

55
00:02:24,667 --> 00:02:28,333
by square brackets. It tells the regex engine

56
00:02:28,334 --> 00:02:30,733
to match only one out of several characters

57
00:02:30,734 --> 00:02:32,566
that are mentioned inside the square

58
00:02:32,567 --> 00:02:35,666
brackets. In this example, we are telling it to

59
00:02:35,667 --> 00:02:38,666
match, at the start of the match, one

60
00:02:38,667 --> 00:02:41,366
of the three letters, either letter p, or

61
00:02:41,367 --> 00:02:44,366
letter r, or letter t, followed by the mandatory

62
00:02:44,367 --> 00:02:48,300
letters ain, which must be matched. Now,

63
00:02:48,301 --> 00:02:50,266
let us define a string which we want to

64
00:02:50,267 --> 00:02:52,933
investigate for possible matches. I will

65
00:02:52,934 --> 00:02:56,100
define a reference string containing some text.

66
00:02:56,101 --> 00:03:02,500
[No Audio]

67
00:03:02,501 --> 00:03:04,500
Next, to determine if the regular expression

68
00:03:04,501 --> 00:03:07,200
has a match in the input text or not, there

69
00:03:07,201 --> 00:03:09,600
is a function called is_match,

70
00:03:09,601 --> 00:03:12,500
which returns either true or false. Let us

71
00:03:12,501 --> 00:03:14,100
use it inside the print statement.

72
00:03:14,101 --> 00:03:19,566
[No Audio]

73
00:03:19,567 --> 00:03:21,100
Let us cargo run this now.

74
00:03:21,101 --> 00:03:25,500
[No Audio]

75
00:03:25,501 --> 00:03:28,800
There is a match. To know exactly where the

76
00:03:28,801 --> 00:03:31,033
first match is in the input string, the crate

77
00:03:31,034 --> 00:03:33,333
provides another function called find which

78
00:03:33,334 --> 00:03:35,700
returns the starting and ending index of the

79
00:03:35,701 --> 00:03:38,866
first match in the input string. Let us call

80
00:03:38,867 --> 00:03:41,300
it inside the print statement and display the result.

81
00:03:41,301 --> 00:03:48,333
[No Audio]

82
00:03:48,334 --> 00:03:51,133
These functions are useful; however, they do

83
00:03:51,134 --> 00:03:53,566
not provide the specific text which has been

84
00:03:53,567 --> 00:03:56,433
found as a capture. The crate provides

85
00:03:56,434 --> 00:03:58,933
convenient iterators for matching an expression

86
00:03:58,934 --> 00:04:01,300
repeatedly against a search string to find

87
00:04:01,301 --> 00:04:03,966
successive non overlapping matches.

88
00:04:03,967 --> 00:04:06,033
Let us use one in a for loop.

89
00:04:06,034 --> 00:04:12,300
[No Audio]

90
00:04:12,301 --> 00:04:14,333
Let us print the individual captures,

91
00:04:14,334 --> 00:04:15,933
inside a print statement now.

92
00:04:15,934 --> 00:04:22,800
[No Audio]

93
00:04:22,801 --> 00:04:25,533
The captures_iter function returns an iterator of Captures

94
00:04:25,534 --> 00:04:28,000
structures, each of which has the individual captures

95
00:04:28,001 --> 00:04:31,500
stored inside, which we can access using

96
00:04:31,501 --> 00:04:34,233
the index zero. There are some regexes

97
00:04:34,234 --> 00:04:36,333
which have different parts to match, and the

98
00:04:36,334 --> 00:04:39,000
indexes are used in those cases to refer to

99
00:04:39,001 --> 00:04:41,400
the different parts of a single regex

100
00:04:41,401 --> 00:04:43,600
which has been matched. We will look into the

101
00:04:43,601 --> 00:04:46,000
details of that later on. For now, it is

102
00:04:46,001 --> 00:04:48,433
sufficient to know that the individual match

103
00:04:48,466 --> 00:04:51,600
texts are accessed using this syntax.

104
00:04:51,601 --> 00:04:53,800
The result in this case would be rain and

105
00:04:53,801 --> 00:04:56,866
pain. Let us execute and then we will explain.

106
00:04:56,867 --> 00:05:03,900
[No Audio]

107
00:05:03,901 --> 00:05:06,100
The first two letters r in the first word

108
00:05:06,133 --> 00:05:08,533
do not match, because after the r the

109
00:05:08,534 --> 00:05:11,333
regex demands the letters ain, which are

110
00:05:11,334 --> 00:05:13,800
absent after the first two letters r.

111
00:05:13,900 --> 00:05:16,766
The last r matched, because we have the letters

112
00:05:16,767 --> 00:05:19,800
of ain afterwards. The first letter in the

113
00:05:19,801 --> 00:05:22,133
second word does not match any of the

114
00:05:22,134 --> 00:05:24,766
characters in our character class, and is

115
00:05:24,767 --> 00:05:28,166
therefore not part of the match. The letter p

116
00:05:28,167 --> 00:05:30,066
following the letter s matched, because the

117
00:05:30,067 --> 00:05:32,466
letter p is in the character class, and is

118
00:05:32,467 --> 00:05:35,433
being followed by the mandatory letters of ain.

119
00:05:35,966 --> 00:05:38,166
The dot in a regex will match any single

120
00:05:38,167 --> 00:05:41,166
character, including letters and digits. A dot

121
00:05:41,167 --> 00:05:44,100
inside a character class matches only a literal dot.

122
00:05:44,466 --> 00:05:46,900
Let us modify the regex and put a dot after

123
00:05:46,901 --> 00:05:50,300
the character class. This means now that in

124
00:05:50,301 --> 00:05:53,233
order to have a match, we need to have either

125
00:05:53,234 --> 00:05:57,000
one of the letters p, r, or t at the

126
00:05:57,001 --> 00:05:59,266
start, followed by any character, and then the

127
00:05:59,267 --> 00:06:03,400
mandatory letters a, i, and n. Let us execute.

128
00:06:03,401 --> 00:06:10,800
[No Audio]

129
00:06:10,801 --> 00:06:13,733
You may note that it also matches the extra r

130
00:06:13,734 --> 00:06:16,500
in front of rain due to the dot. The character

131
00:06:16,501 --> 00:06:18,933
class is quite useful, and can be used for

132
00:06:18,934 --> 00:06:21,166
spell checking. Let me add a new regular

133
00:06:21,167 --> 00:06:23,533
expression for checking the spelling of gray.

134
00:06:23,534 --> 00:06:33,233
[No Audio]

135
00:06:33,234 --> 00:06:35,066
The spelling will be correct if we have the

136
00:06:35,067 --> 00:06:38,600
letters g and r, followed by the letter

137
00:06:38,633 --> 00:06:41,633
either a or e, followed by a mandatory letter

138
00:06:41,634 --> 00:06:44,166
of y. Let us add some text for checking.

139
00:06:44,167 --> 00:06:49,933
[No Audio]

140
00:06:49,934 --> 00:06:51,900
Let us paste the code for the possible

141
00:06:51,933 --> 00:06:53,800
matches now and cargo run.

142
00:06:53,801 --> 00:06:56,033
[No Audio]

143
00:06:56,034 --> 00:06:57,400
You may note that only the

144
00:06:57,401 --> 00:06:58,900
valid spellings are

145
00:06:58,901 --> 00:07:01,200
being returned. Similar to the character

146
00:07:01,201 --> 00:07:03,500
classes are character ranges, which are used

147
00:07:03,501 --> 00:07:05,833
to check if characters in a certain range are

148
00:07:05,834 --> 00:07:08,266
part of the text or not. For instance, to

149
00:07:08,267 --> 00:07:10,433
check for all the lowercase characters, we

150
00:07:10,434 --> 00:07:12,800
will use the syntax which will look like this.

151
00:07:12,801 --> 00:07:17,366
[No Audio]

152
00:07:17,367 --> 00:07:20,200
The small a followed by a dash and then z

153
00:07:20,201 --> 00:07:22,733
means all the lowercase characters from a to z.

154
00:07:23,033 --> 00:07:25,600
The overall meaning of the regex is now that we need

155
00:07:25,601 --> 00:07:27,800
to have a lowercase letter at the start,

156
00:07:27,801 --> 00:07:31,100
followed by the mandatory letters of a, i, and n.

157
00:07:31,101 --> 00:07:33,533
Let us use some example text.

158
00:07:33,534 --> 00:07:38,566
[No Audio]

159
00:07:38,567 --> 00:07:40,666
Finally, I will add the code for printing

160
00:07:40,667 --> 00:07:42,266
out the relevant matches.

161
00:07:42,267 --> 00:07:44,466
[No Audio]

162
00:07:44,467 --> 00:07:45,733
Let us cargo run this now.

163
00:07:45,734 --> 00:07:49,900
[No Audio]

164
00:07:49,901 --> 00:07:52,500
It returns the possible matches. The last

165
00:07:52,501 --> 00:07:54,400
word did not match, because at the start of

166
00:07:54,401 --> 00:07:56,500
the word we do not have a lowercase

167
00:07:56,501 --> 00:07:59,866
letter. We can mention multiple ranges inside

168
00:07:59,867 --> 00:08:02,633
the square brackets also. For instance, I will

169
00:08:02,634 --> 00:08:05,500
include all the uppercase letters by mentioning

170
00:08:05,501 --> 00:08:09,900
capital A-Z. This will now mean that at

171
00:08:09,901 --> 00:08:12,266
the start we need to have exactly one upper

172
00:08:12,267 --> 00:08:14,733
or lowercase character. Inside the square

173
00:08:14,734 --> 00:08:17,733
brackets, we can exclude certain ranges also.

174
00:08:17,866 --> 00:08:20,133
The exclusion is done with the help of the

175
00:08:20,134 --> 00:08:22,433
caret symbol (^) at the start of the range.

176
00:08:22,633 --> 00:08:25,066
For instance, if I want to exclude all the

177
00:08:25,067 --> 00:08:26,933
lowercase characters from the start of the

178
00:08:26,934 --> 00:08:29,566
match, I will mention something like this.

179
00:08:29,567 --> 00:08:34,765
[No Audio]

180
00:08:34,766 --> 00:08:36,100
Let us execute to see what

181
00:08:36,101 --> 00:08:37,566
happens to the matches.

182
00:08:37,567 --> 00:08:43,200
[No Audio]

183
00:08:43,201 --> 00:08:45,300
You may note that it has just returned a

184
00:08:45,301 --> 00:08:49,200
match for the word 0ain. This is

185
00:08:49,201 --> 00:08:51,766
because all the remaining words have a

186
00:08:51,767 --> 00:08:54,400
lowercase letter at the beginning and are

187
00:08:54,401 --> 00:08:58,100
therefore not being matched. There are some

188
00:08:58,101 --> 00:09:00,333
shorthands for the frequently used character

189
00:09:00,334 --> 00:09:03,700
classes, and they include \w

190
00:09:03,833 --> 00:09:08,200
and \d. The \w is a shorthand for the

191
00:09:08,201 --> 00:09:11,066
character class including all uppercase and lowercase

192
00:09:11,067 --> 00:09:13,933
characters, digits, and underscore. On the

193
00:09:13,934 --> 00:09:16,500
other hand, \d is shorthand for all

194
00:09:16,501 --> 00:09:19,166
digits from 0 up to 9. Let us have

195
00:09:19,167 --> 00:09:22,400
some examples. I want to match a local

196
00:09:22,401 --> 00:09:24,900
telephone number which is six digits long in

197
00:09:24,901 --> 00:09:27,433
our area, so I will write a regex which

198
00:09:27,434 --> 00:09:29,400
contains only six digits.

199
00:09:29,401 --> 00:09:34,800
[No Audio]

200
00:09:34,801 --> 00:09:37,300
Next, I will define a simple text containing

201
00:09:37,301 --> 00:09:39,600
some text and one telephone number.

202
00:09:39,601 --> 00:09:44,400
[No Audio]

203
00:09:44,401 --> 00:09:47,300
Let us also include the code for printing the matches.

204
00:09:48,633 --> 00:09:50,600
This will now return the match for

205
00:09:50,601 --> 00:09:52,533
the phone number from the input text.

206
00:09:52,534 --> 00:09:53,900
Let us cargo run this.

207
00:09:53,901 --> 00:09:59,100
[No Audio]

208
00:09:59,101 --> 00:10:01,666
Okay, let us take a short quiz. I will write a

209
00:10:01,667 --> 00:10:04,033
simple regex and some input text.

210
00:10:04,034 --> 00:10:10,700
[No Audio]

211
00:10:10,701 --> 00:10:13,033
Can you guess what this regex will return?

212
00:10:13,034 --> 00:10:14,933
I will give you five seconds to think.

213
00:10:14,934 --> 00:10:21,466
[No Audio]

214
00:10:21,467 --> 00:10:23,500
In this case, the regex will return the two

215
00:10:23,501 --> 00:10:25,833
phone numbers. This is because the regex

216
00:10:25,834 --> 00:10:28,200
means that at the start we need to have a

217
00:10:28,201 --> 00:10:30,966
digit followed by any five characters. A caret

218
00:10:30,967 --> 00:10:33,800
before the shorthand, inside square brackets, means negation.

219
00:10:33,933 --> 00:10:36,233
For instance, if we use [^\d], that is, a

220
00:10:36,234 --> 00:10:39,833
caret followed by \d, in this regex, then it

221
00:10:39,834 --> 00:10:42,000
would mean to match anything which is not a

222
00:10:42,001 --> 00:10:45,300
digit. The small \s is another very

223
00:10:45,301 --> 00:10:47,366
useful shorthand that is used for matching a

224
00:10:47,367 --> 00:10:51,266
whitespace character. Next, let us talk about the

225
00:10:51,267 --> 00:10:55,300
starting and ending anchors. The caret when

226
00:10:55,301 --> 00:10:57,533
used outside the square brackets is used to

227
00:10:57,534 --> 00:11:00,033
indicate the start of the input text.

228
00:11:00,034 --> 00:11:02,100
For instance, let me write a regex.

229
00:11:02,101 --> 00:11:06,466
[No Audio]

230
00:11:06,467 --> 00:11:08,433
The caret when used inside

231
00:11:08,434 --> 00:11:10,166
square brackets, including when used in

232
00:11:10,167 --> 00:11:12,566
connection with shorthands there, has a meaning of

233
00:11:12,567 --> 00:11:15,033
negation, but in this case, it will match

234
00:11:15,034 --> 00:11:17,166
at the start of the input text before any

235
00:11:17,167 --> 00:11:18,700
character in the input text.

236
00:11:19,133 --> 00:11:21,066
So, let us add some input text.

237
00:11:21,067 --> 00:11:26,233
[No Audio]

238
00:11:26,234 --> 00:11:29,333
I will next include the code for printing the matches.

239
00:11:29,334 --> 00:11:32,066
[No Audio]

240
00:11:32,067 --> 00:11:34,133
In this case, the caret will match at

241
00:11:34,134 --> 00:11:36,133
the start of the input text, and then the

242
00:11:36,134 --> 00:11:38,733
regex demands the letter a followed by

243
00:11:38,734 --> 00:11:42,433
b and then a. Since the input text has this,

244
00:11:42,434 --> 00:11:45,333
it will match. Let us cargo run to confirm.

245
00:11:45,334 --> 00:11:52,166
[No Audio]

246
00:11:52,167 --> 00:11:54,500
If I remove the starting a in the text, then

247
00:11:54,501 --> 00:11:57,000
it will not match, because the caret matches at

248
00:11:57,001 --> 00:11:58,933
the start of the text, and then the regex

249
00:11:58,934 --> 00:12:01,566
demands the letter a, which is not there.

250
00:12:01,567 --> 00:12:03,366
Let us cargo run again.

251
00:12:04,900 --> 00:12:06,533
In summary, the caret is

252
00:12:06,534 --> 00:12:08,733
used if you want a regex to match at the

253
00:12:08,734 --> 00:12:12,000
start of the input text only. Similar to the

254
00:12:12,033 --> 00:12:14,266
starting anchor, we have the ending anchor

255
00:12:14,267 --> 00:12:16,766
which will match at the end of the input text.

256
00:12:16,833 --> 00:12:19,166
Let us define a regex and some input text.

257
00:12:19,167 --> 00:12:25,366
[No Audio]

258
00:12:25,367 --> 00:12:28,066
Let me also include the code for matching.

259
00:12:28,833 --> 00:12:31,200
In this case now, the regex will first match

260
00:12:31,201 --> 00:12:33,866
the dollar with the end of the text, and then

261
00:12:33,900 --> 00:12:36,300
from the end of the text, the last character

262
00:12:36,301 --> 00:12:38,866
should be c, and then the second-last character

263
00:12:38,867 --> 00:12:41,566
should be b. Since the input text contains

264
00:12:41,567 --> 00:12:43,500
such a match, the match will be

265
00:12:43,501 --> 00:12:46,500
successful. A key point to note is that the

266
00:12:46,501 --> 00:12:49,233
starting and ending anchors do not match any

267
00:12:49,234 --> 00:12:51,200
characters themselves; they are just telling

268
00:12:51,201 --> 00:12:53,966
the regex engine to match at the start and at

269
00:12:53,967 --> 00:12:55,466
the end of the input text.

270
00:12:56,900 --> 00:12:58,466
We can use the starting and

271
00:12:58,467 --> 00:13:00,200
ending anchors at the same time.

272
00:13:00,201 --> 00:13:01,600
Let us define a regex.

273
00:13:01,601 --> 00:13:07,366
[No Audio]

274
00:13:07,367 --> 00:13:09,633
In this case, the caret will force the

275
00:13:09,634 --> 00:13:12,733
match to begin at the start, followed by bc, and the

276
00:13:12,734 --> 00:13:15,233
ending dollar will force the match to finish at the

277
00:13:15,234 --> 00:13:18,866
end. The dollar sign requires that the last

278
00:13:18,867 --> 00:13:21,266
letter should be c, and the second-last letter

279
00:13:21,267 --> 00:13:24,000
should be b. Since all these requirements are

280
00:13:24,001 --> 00:13:26,066
being fulfilled, there will be a

281
00:13:26,067 --> 00:13:29,233
successful match. Let us take another quiz. I

282
00:13:29,234 --> 00:13:31,200
will add a regex, and you will have to guess

283
00:13:31,201 --> 00:13:32,533
what it will match.

284
00:13:32,534 --> 00:13:38,800
[No Audio]

285
00:13:38,801 --> 00:13:40,833
I will give you five seconds again.

286
00:13:40,834 --> 00:13:46,900
[No Audio]

287
00:13:46,901 --> 00:13:49,600
This will match any input text which consists of

288
00:13:49,601 --> 00:13:52,633
exactly two digits. Let us now talk about

289
00:13:52,666 --> 00:13:54,900
word boundaries, which are matched with the help

290
00:13:54,901 --> 00:13:58,166
of a special metacharacter called \b.

291
00:13:58,167 --> 00:14:01,000
It will match at three places, that is, before

292
00:14:01,001 --> 00:14:03,500
the first character in the input text, if it is a word

293
00:14:03,501 --> 00:14:06,533
character, or at the end of the input text

294
00:14:06,566 --> 00:14:09,433
after the last character, if it is a word character, and between two

295
00:14:09,434 --> 00:14:11,333
characters, if one is a word

296
00:14:11,334 --> 00:14:13,866
character and the other one is not a word

297
00:14:13,867 --> 00:14:17,066
character. Again I will repeat that, it will

298
00:14:17,067 --> 00:14:20,066
match either at the start or at the end, and

299
00:14:20,067 --> 00:14:21,933
between two characters, if one is a word

300
00:14:21,934 --> 00:14:24,266
character and the other one is not a word

301
00:14:24,267 --> 00:14:26,600
character. Let us use it inside some

302
00:14:26,601 --> 00:14:29,533
examples. I will include a simple regex.

303
00:14:29,534 --> 00:14:35,733
[No Audio]

304
00:14:35,734 --> 00:14:38,166
Let me also include a simple text.

305
00:14:38,167 --> 00:14:42,100
[No Audio]

306
00:14:42,101 --> 00:14:44,333
Now I will include the code for matching.

307
00:14:46,066 --> 00:14:48,866
In this case now, the \b will match at the start

308
00:14:48,867 --> 00:14:52,000
before any character. Like the caret and the dollar,

309
00:14:52,001 --> 00:14:54,300
it does not match any character, but it rather

310
00:14:54,301 --> 00:14:56,766
matches a position, which is the start or end,

311
00:14:57,000 --> 00:14:58,966
or any other position where one of the

312
00:14:58,967 --> 00:15:02,233
characters after or before it is a word character,

313
00:15:02,234 --> 00:15:04,900
and the other one after or before it is a non-word

314
00:15:04,901 --> 00:15:07,600
character. In this case, the \b will

315
00:15:07,601 --> 00:15:09,866
first match at the start position. After the

316
00:15:09,867 --> 00:15:12,300
start, the regex requires any word

317
00:15:12,301 --> 00:15:14,933
character, so it will match the letter H

318
00:15:14,934 --> 00:15:18,066
in the word Hi. Next, in the position between

319
00:15:18,067 --> 00:15:21,300
the two words Hi and my, it will match the

320
00:15:21,301 --> 00:15:24,300
position after the letter i, because before

321
00:15:24,301 --> 00:15:26,400
this position, there is a word character, and

322
00:15:26,401 --> 00:15:28,966
after this position there is a space, that is a

323
00:15:29,000 --> 00:15:31,700
non-word character. However, the pattern will

324
00:15:31,701 --> 00:15:34,833
not match, because after this position, we do

325
00:15:34,834 --> 00:15:37,300
not have any word character, rather we have a

326
00:15:37,301 --> 00:15:40,833
space. Next, it will match the position just

327
00:15:40,834 --> 00:15:43,800
before the letter m, because it also satisfies

328
00:15:43,801 --> 00:15:46,566
the requirements. After this position,

329
00:15:46,567 --> 00:15:49,066
there is the word character m, so it will

330
00:15:49,100 --> 00:15:52,400
return a match there also. In summary, it

331
00:15:52,401 --> 00:15:54,166
will return all the starting letters of the

332
00:15:54,167 --> 00:15:56,966
words. Let us cargo run to confirm.

333
00:15:56,967 --> 00:16:03,633
[No Audio]

334
00:16:03,634 --> 00:16:05,933
A star in the regex means zero or more

335
00:16:05,934 --> 00:16:08,733
repetitions of a certain pattern. So, if I put a

336
00:16:08,734 --> 00:16:12,266
star on the \w, it will mean zero or more

337
00:16:12,267 --> 00:16:14,900
repetitions of any word character. In this case, it

338
00:16:14,901 --> 00:16:17,066
will match the complete words in the input

339
00:16:17,067 --> 00:16:19,200
text. Let us cargo run this to confirm.

340
00:16:19,201 --> 00:16:26,000
[No Audio]

341
00:16:26,001 --> 00:16:27,300
We end this tutorial here

342
00:16:27,301 --> 00:16:28,900
as it is getting a bit lengthy.

343
00:16:29,066 --> 00:16:31,166
In the next tutorial, we will try to wind up

344
00:16:31,167 --> 00:16:33,033
the topic, and we'll cover some more stuff

345
00:16:33,034 --> 00:16:35,400
related to regexes. Do come back for covering

346
00:16:35,401 --> 00:16:37,866
that, and until then, happy Rust programming.

347
00:16:37,867 --> 00:16:42,300
[No Audio]