1 00:00:00,000 --> 00:00:08,066 [No Audio] 2 00:00:08,067 --> 00:00:10,230 In this section, we will be looking at some 3 00:00:10,231 --> 00:00:12,420 practical and real life applications, and 4 00:00:12,421 --> 00:00:15,480 problem solving using Rust. The idea is to 5 00:00:15,481 --> 00:00:17,610 have some hands on experience in real life 6 00:00:17,611 --> 00:00:19,950 problem solving using simple as well as 7 00:00:19,951 --> 00:00:22,560 complicated data types such as Linked lists, 8 00:00:22,590 --> 00:00:26,366 Trees, HashMaps, HashSet, and binary search trees. 9 00:00:27,200 --> 00:00:29,190 The first application we will 10 00:00:29,191 --> 00:00:31,560 implement is that of correct search results 11 00:00:31,590 --> 00:00:34,500 through word grouping. Consider an online 12 00:00:34,501 --> 00:00:37,170 store, the store has many items to sell, 13 00:00:37,171 --> 00:00:40,470 which are searchable. They want to implement 14 00:00:40,500 --> 00:00:43,320 a functionality that, if a user misspelled a 15 00:00:43,321 --> 00:00:46,380 word, they are still being displayed the 16 00:00:46,381 --> 00:00:48,990 correct search results. The business 17 00:00:48,991 --> 00:00:51,720 considers this function to be important from 18 00:00:51,721 --> 00:00:53,970 the user satisfaction and usability 19 00:00:53,971 --> 00:00:56,880 perspectives. The solution to this will 20 00:00:56,881 --> 00:00:59,550 include to first split the words describing 21 00:00:59,551 --> 00:01:01,860 the items that are present in the store into 22 00:01:01,861 --> 00:01:05,340 sets, so that all words in a set are 23 00:01:05,370 --> 00:01:08,970 anagrams. Anagrams are words having exactly 24 00:01:08,971 --> 00:01:11,850 the same letters. The assumption in this 25 00:01:11,851 --> 00:01:15,060 problem is that, the misspelled word will most 26 00:01:15,061 --> 00:01:17,760 probably contain the same letters as in the 27 00:01:17,790 --> 00:01:21,540 original word, but in different order. Let us 28 00:01:21,541 --> 00:01:24,266 start looking at the implementation details. 29 00:01:24,267 --> 00:01:26,910 We will assume that we are given a list of 30 00:01:26,911 --> 00:01:30,120 words describing some items, and we will need 31 00:01:30,121 --> 00:01:33,533 to group the words that are anagrams of each other. 32 00:01:34,533 --> 00:01:37,260 We will be using HashMaps and Loops to 33 00:01:37,261 --> 00:01:40,590 implement this problem. I will first declare 34 00:01:40,591 --> 00:01:42,810 a vector of strings containing some words 35 00:01:42,811 --> 00:01:45,300 have incorrect and incorrect spellings. 36 00:01:45,301 --> 00:02:03,030 [No Audio] 37 00:02:03,031 --> 00:02:04,433 Next, we will implement the 38 00:02:04,434 --> 00:02:06,060 required functionality using a 39 00:02:06,061 --> 00:02:09,166 function. I will name the function as 40 00:02:09,167 --> 00:02:11,133 word_grouping. 41 00:02:11,134 --> 00:02:14,166 [No Audio] 42 00:02:14,167 --> 00:02:15,500 The input to the function 43 00:02:15,501 --> 00:02:16,920 will be the vector of strings 44 00:02:16,921 --> 00:02:19,140 containing the words, and the output will be 45 00:02:19,141 --> 00:02:22,710 grouping of words. So we will need a vector 46 00:02:22,711 --> 00:02:26,133 which will contain string of factors as elements. 47 00:02:26,134 --> 00:02:29,600 [No Audio] 48 00:02:29,601 --> 00:02:31,933 First, I will create an empty HashMap. 49 00:02:31,934 --> 00:02:34,133 [No Audio] 50 00:02:34,134 --> 00:02:36,570 The key part of this HashMap will 51 00:02:36,571 --> 00:02:39,420 be the frequencies of the letters in the word, 52 00:02:39,750 --> 00:02:42,600 and the value part will contain all the words 53 00:02:42,601 --> 00:02:46,170 having the same frequencies of letters. This 54 00:02:46,171 --> 00:02:48,930 way, all the words corresponding to the same 55 00:02:48,931 --> 00:02:51,480 key will be anagrams, and are effectively 56 00:02:51,481 --> 00:02:55,020 stored together inside a HashMap. To compute 57 00:02:55,021 --> 00:02:57,360 the frequencies, we will first create a 58 00:02:57,361 --> 00:03:02,100 vector containing 26 entries, all initialized from 0. 59 00:03:02,101 --> 00:03:06,433 [No Audio] 60 00:03:06,434 --> 00:03:08,933 Each entry will correspond to one letter. 61 00:03:09,133 --> 00:03:11,130 Next, we will iterate through all the 62 00:03:11,131 --> 00:03:13,133 words in the input word list. 63 00:03:13,134 --> 00:03:17,066 [No Audio] 64 00:03:17,067 --> 00:03:18,766 For each word, we will compute 65 00:03:18,767 --> 00:03:20,160 the frequency and the 66 00:03:20,161 --> 00:03:22,650 frequencies of individual characters, so we 67 00:03:22,651 --> 00:03:24,960 will add another loop for iterating through 68 00:03:24,961 --> 00:03:27,033 all the characters in a given word. 69 00:03:27,034 --> 00:03:31,333 [No Audio] 70 00:03:31,334 --> 00:03:34,500 The lower_case function will convert to 71 00:03:34,501 --> 00:03:37,740 lowercase and the .char function will 72 00:03:37,741 --> 00:03:39,720 grab the individual characters from the 73 00:03:39,721 --> 00:03:42,030 current_word. Next, I will update the 74 00:03:42,031 --> 00:03:44,190 frequencies for the current character. 75 00:03:44,191 --> 00:03:55,620 [No Audio] 76 00:03:55,621 --> 00:03:59,766 The c as u32 will essentially grade the ASCII 77 00:03:59,767 --> 00:04:02,130 corresponding to the current character, 78 00:04:02,131 --> 00:04:04,560 and then we will subtract it from the ASCII 79 00:04:04,561 --> 00:04:07,680 corresponding to letter a, and then we'll use 80 00:04:07,681 --> 00:04:10,260 it as an index to update the frequency of the 81 00:04:10,261 --> 00:04:13,440 corresponding letter. For instance, if the 82 00:04:13,441 --> 00:04:17,279 current letter or character is, let's say d, 83 00:04:17,280 --> 00:04:20,579 the ASCII for d is 100. We will subtract it 84 00:04:20,580 --> 00:04:24,000 from the ASCII of letter a which is 97, so we 85 00:04:24,001 --> 00:04:27,840 will have a value of 3. This means that 86 00:04:27,841 --> 00:04:30,600 for the index of 3, we will have its value 87 00:04:30,601 --> 00:04:34,020 incremented by a value of 1. Please note 88 00:04:34,021 --> 00:04:37,080 again that indexes in Rust starts from 0, 89 00:04:37,081 --> 00:04:39,510 and therefore the index for d will be 3. 90 00:04:41,010 --> 00:04:43,920 Now once the inner loop is over, we will have 91 00:04:43,921 --> 00:04:46,320 the frequencies of all the letters in the 92 00:04:46,321 --> 00:04:49,866 word stored in the array of char_freq. 93 00:04:50,400 --> 00:04:52,710 We will use this array as the key to 94 00:04:52,711 --> 00:04:55,600 insert the corresponding word in the HashMap. 95 00:04:56,340 --> 00:04:59,430 To use it as a single entry for the key part, 96 00:04:59,700 --> 00:05:02,730 I have converted into string using iterators. 97 00:05:02,731 --> 00:05:09,510 [No Audio] 98 00:05:09,511 --> 00:05:12,870 The map function will convert each char to 99 00:05:12,871 --> 00:05:15,570 string. Next we use the collect function to 100 00:05:15,571 --> 00:05:19,050 have the result in a string format. The key 101 00:05:19,051 --> 00:05:21,450 variable now contains the frequencies of the 102 00:05:21,451 --> 00:05:24,810 characters in the string format. The key 103 00:05:24,811 --> 00:05:27,510 variable will now be used as a key for the 104 00:05:27,511 --> 00:05:30,633 hash, and we will insert the current_word 105 00:05:30,634 --> 00:05:33,630 into it as a value. The 106 00:05:33,631 --> 00:05:37,566 current_word is the word from the word_list, 107 00:05:37,567 --> 00:05:39,330 that is under processing in 108 00:05:39,331 --> 00:05:41,040 the current iteration of the loop. 109 00:05:41,041 --> 00:05:47,100 [No Audio] 110 00:05:47,101 --> 00:05:50,190 We use the .or_insert function, 111 00:05:50,460 --> 00:05:53,430 which will ensure that if there is no already 112 00:05:53,431 --> 00:05:55,920 existing entry for the value part, so by 113 00:05:55,921 --> 00:05:58,533 default, an empty vector will be inserted. 114 00:05:58,534 --> 00:06:01,170 And then we push the current_word 115 00:06:01,171 --> 00:06:04,650 for the value part. If more than two words 116 00:06:04,651 --> 00:06:06,690 have the frequency of characters, then 117 00:06:06,691 --> 00:06:08,550 they will correspond to the same key in the 118 00:06:08,551 --> 00:06:11,130 HashMap. In fact, they will correspond to the 119 00:06:11,131 --> 00:06:13,440 value part of the HashMap corresponding to 120 00:06:13,441 --> 00:06:16,710 the same key. Next, we will make sure that 121 00:06:16,711 --> 00:06:20,166 before processing the next word in the word_list 122 00:06:20,167 --> 00:06:21,966 in the next iteration of the loop, 123 00:06:21,967 --> 00:06:24,533 we re initialize the char_freq 124 00:06:24,534 --> 00:06:27,660 variable from all zeros. This is 125 00:06:27,661 --> 00:06:30,990 important, because we do not want to use the 126 00:06:30,991 --> 00:06:34,110 values or frequencies of characters stored 127 00:06:34,111 --> 00:06:35,433 for the previous word. 128 00:06:35,434 --> 00:06:37,766 [No Audio] 129 00:06:37,767 --> 00:06:39,120 Outside the loop, I 130 00:06:39,121 --> 00:06:42,210 will display the HashMap using a loop. Let us 131 00:06:42,211 --> 00:06:44,100 add the code and a suitable print 132 00:06:44,101 --> 00:06:45,733 statement for this purpose. 133 00:06:45,734 --> 00:06:56,220 [No Audio] 134 00:06:56,221 --> 00:06:58,920 Finally, I will return the grouping of the 135 00:06:58,921 --> 00:07:02,280 words in a string vector. You may note that 136 00:07:02,281 --> 00:07:04,350 the individual groups are basically the 137 00:07:04,351 --> 00:07:06,840 entries corresponding to the same keys in the 138 00:07:06,841 --> 00:07:09,840 HashMap, since they have the same frequencies 139 00:07:09,841 --> 00:07:13,350 of characters. To return the grouping, I will 140 00:07:13,351 --> 00:07:15,690 just return the value part corresponding to 141 00:07:15,691 --> 00:07:19,233 keys, I will use iterators again for this purpose. 142 00:07:19,234 --> 00:07:22,900 [No Audio] 143 00:07:22,901 --> 00:07:24,870 I will use the map function to 144 00:07:24,871 --> 00:07:27,133 transform the HashMap entries from 145 00:07:27,134 --> 00:07:30,960 key,value format to only value format. This is 146 00:07:30,961 --> 00:07:33,210 because, I am only interested in the value 147 00:07:33,211 --> 00:07:35,970 part, which contains the individual groups. 148 00:07:36,690 --> 00:07:39,510 First I will mention the inputs. The first 149 00:07:39,511 --> 00:07:42,750 input corresponds to the keys, which I will 150 00:07:42,751 --> 00:07:45,870 ignore by using underscore. The second one is 151 00:07:45,871 --> 00:07:48,090 the value part, which contains the groups of 152 00:07:48,091 --> 00:07:51,240 words corresponding to the same key. This is 153 00:07:51,241 --> 00:07:55,200 the part which I am interested in. For the 154 00:07:55,201 --> 00:07:57,600 returning or transformation or mapping part, 155 00:07:57,810 --> 00:08:01,200 I will just mention v. This will now take 156 00:08:01,230 --> 00:08:04,260 each and every entry of the HashMap, and will 157 00:08:04,261 --> 00:08:07,860 return only the value part. Finally, I will 158 00:08:07,861 --> 00:08:09,930 use the collect function to have all the 159 00:08:09,931 --> 00:08:12,660 information in a single collection. This 160 00:08:12,661 --> 00:08:15,766 statement will be the returning value. 161 00:08:17,633 --> 00:08:20,220 I will use this function now in the main. I 162 00:08:20,221 --> 00:08:22,110 will call the function and we'll store the 163 00:08:22,111 --> 00:08:23,633 result in a variable. 164 00:08:23,634 --> 00:08:28,566 [No Audio] 165 00:08:28,567 --> 00:08:30,090 Now let us assume that 166 00:08:30,091 --> 00:08:33,133 the user writes some word in the search bar. 167 00:08:33,466 --> 00:08:35,909 I would like to return the group that query 168 00:08:35,910 --> 00:08:38,610 word belongs to. Let us first declare a 169 00:08:38,611 --> 00:08:41,633 variable which will store the user entered word. 170 00:08:41,634 --> 00:08:45,400 [No Audio] 171 00:08:45,401 --> 00:08:47,310 Next, I will iterate through all the 172 00:08:47,311 --> 00:08:50,733 groups to see the groups to which the word belongs. 173 00:08:50,734 --> 00:08:55,620 [No Audio] 174 00:08:55,621 --> 00:08:56,820 They can contents function 175 00:08:56,821 --> 00:08:58,800 can be used to check if the input word is 176 00:08:58,801 --> 00:09:00,466 contained within the group. 177 00:09:00,467 --> 00:09:03,366 [No Audio] 178 00:09:03,367 --> 00:09:05,600 I'm allowed a suitable print statement in this case. 179 00:09:05,601 --> 00:09:09,333 [No Audio] 180 00:09:09,334 --> 00:09:11,566 Okay, it's time to see the action. 181 00:09:11,567 --> 00:09:13,066 Let me execute the code. 182 00:09:13,067 --> 00:09:16,566 [No Audio] 183 00:09:16,567 --> 00:09:18,570 First, we have the HashMap being 184 00:09:18,571 --> 00:09:20,370 displayed showing its keys and the 185 00:09:20,371 --> 00:09:22,830 corresponding values. The keys are strings of 186 00:09:22,831 --> 00:09:25,320 zeros and ones reflecting the frequencies of 187 00:09:25,321 --> 00:09:27,833 the characters, and the values are the words 188 00:09:27,840 --> 00:09:31,410 that are anagrams of each other. Finally, for 189 00:09:31,411 --> 00:09:33,866 the input word, we have the corresponding group. 190 00:09:34,133 --> 00:09:36,450 We end this tutorial here, I hope you 191 00:09:36,451 --> 00:09:38,250 would have enjoyed problem solving using 192 00:09:38,251 --> 00:09:40,590 HashMap and loops. Do come back again for 193 00:09:40,591 --> 00:09:44,300 more and until next tutorial, happy Rust programming. 194 00:09:44,301 --> 00:09:49,400 [No Audio]