1 00:00:06,660 --> 00:00:09,750 - In the previous section, I introduced the box structure 2 00:00:09,750 --> 00:00:10,950 and I described the fact that it's 3 00:00:10,950 --> 00:00:12,870 like a pointer to an object on heap. 4 00:00:12,870 --> 00:00:14,280 It's a smart pointer. 5 00:00:14,280 --> 00:00:15,113 In this section, 6 00:00:15,113 --> 00:00:18,000 we're going to look at the realistic use of boxes 7 00:00:18,000 --> 00:00:22,140 to define a recursive data structure, like a link list, 8 00:00:22,140 --> 00:00:25,710 I've called it chain here, but it's like a link list. 9 00:00:25,710 --> 00:00:28,860 So imagine you wanted to define a structure 10 00:00:28,860 --> 00:00:32,343 to represent a node in a link list, or a chain. 11 00:00:33,180 --> 00:00:36,060 Each node would have a pointer to the next node. 12 00:00:36,060 --> 00:00:37,563 Because it's like a link list. 13 00:00:38,430 --> 00:00:40,650 The final node points to null 14 00:00:40,650 --> 00:00:42,390 to indicate that's the end of the list. 15 00:00:42,390 --> 00:00:45,900 Okay, so it's a very common kinda data structure. 16 00:00:45,900 --> 00:00:47,070 I'm sure you've seen that kind of thing before. 17 00:00:47,070 --> 00:00:49,680 Each node has a pointer to the next one, 18 00:00:49,680 --> 00:00:51,810 and then eventually the last pointer 19 00:00:51,810 --> 00:00:53,970 points to null, that kind of arrangement. 20 00:00:53,970 --> 00:00:55,350 How would you do that in Rust? 21 00:00:55,350 --> 00:00:57,480 Well, the answer is you have to use box 22 00:00:57,480 --> 00:00:58,630 to achieve that effect. 23 00:00:59,490 --> 00:01:03,510 So this would be our first approach, maybe. 24 00:01:03,510 --> 00:01:07,230 This approach doesn't use box, it's not going to work. 25 00:01:07,230 --> 00:01:08,280 I'll draw a picture 26 00:01:08,280 --> 00:01:10,020 and then you can see what the problem is. 27 00:01:10,020 --> 00:01:13,080 I'll ignore the bit about option for now, okay? 28 00:01:13,080 --> 00:01:15,600 That's to do with whether it's a null pointer or not. 29 00:01:15,600 --> 00:01:17,040 This is how you represent something 30 00:01:17,040 --> 00:01:20,580 that may be a null pointer in Rust using Option. 31 00:01:20,580 --> 00:01:23,220 Let me come back to that point a bit later. 32 00:01:23,220 --> 00:01:24,660 The key point at the moment 33 00:01:24,660 --> 00:01:27,150 is whenever you define a structure, 34 00:01:27,150 --> 00:01:30,810 the compiler needs to know how big it is in bytes, 35 00:01:30,810 --> 00:01:34,080 so we can know how much space to allocate on the stack. 36 00:01:34,080 --> 00:01:36,780 But when the compiler sees this data structure, 37 00:01:36,780 --> 00:01:40,050 it won't compile because NodeBad, as I've called it, 38 00:01:40,050 --> 00:01:42,150 is infinitely nested. 39 00:01:42,150 --> 00:01:43,740 You can imagine if the compiler 40 00:01:43,740 --> 00:01:46,290 is trying to figure out how many bytes it is in size, 41 00:01:46,290 --> 00:01:48,720 it would say, okay then so let's say 42 00:01:48,720 --> 00:01:51,960 I've got one NodeBad instance. 43 00:01:51,960 --> 00:01:56,730 It's got a piece of data which is an integer, like that. 44 00:01:56,730 --> 00:01:58,590 Probably the value 42. 45 00:01:58,590 --> 00:02:01,473 And then it's got another field called Next. 46 00:02:02,910 --> 00:02:06,540 And that is another, let's ignore the option bit for now. 47 00:02:06,540 --> 00:02:07,740 We will come back to that. 48 00:02:07,740 --> 00:02:10,860 It's basically another instance of NodeBad, okay? 49 00:02:10,860 --> 00:02:15,033 So that box there is another instance of NodeBad, 50 00:02:15,960 --> 00:02:19,830 and inside that NodeBad, it's recursive, okay? 51 00:02:19,830 --> 00:02:21,330 Kind of infinitely nested. 52 00:02:21,330 --> 00:02:24,342 It will say, how big is this inner NodeBad? 53 00:02:24,342 --> 00:02:26,910 Well, it's got some data, okay? 54 00:02:26,910 --> 00:02:30,630 That might be the value 43, controversially. 55 00:02:30,630 --> 00:02:33,240 Oh, and then it's got another field called Next, 56 00:02:33,240 --> 00:02:35,823 which is another instance of NodeBad. 57 00:02:36,720 --> 00:02:39,150 Okay, so this thing is going to be infinitely nested 58 00:02:39,150 --> 00:02:42,210 like dolls inside dolls, inside dolls. 59 00:02:42,210 --> 00:02:43,830 The compiler, it'll never end. 60 00:02:43,830 --> 00:02:45,420 The compiler can't possibly figure out 61 00:02:45,420 --> 00:02:47,100 how big it's gonna be in memory. 62 00:02:47,100 --> 00:02:49,320 Basically, you can't nest objects together 63 00:02:49,320 --> 00:02:50,160 like I've done here. 64 00:02:50,160 --> 00:02:53,340 You need pointers to point to something external. 65 00:02:53,340 --> 00:02:56,730 And that's basically where box comes in. 66 00:02:56,730 --> 00:02:59,490 So Rust can't determine the size of this data structure here 67 00:02:59,490 --> 00:03:00,633 and it won't compile. 68 00:03:01,530 --> 00:03:03,300 So this is a better approach. 69 00:03:03,300 --> 00:03:05,310 Again, ignore the option bit for now. 70 00:03:05,310 --> 00:03:07,950 Okay, and I'll come back to that in a moment. 71 00:03:07,950 --> 00:03:10,680 Focusing on the kind of box aspect of it, 72 00:03:10,680 --> 00:03:13,260 a node would look like this. 73 00:03:13,260 --> 00:03:17,253 A node object would have a piece of data, 74 00:03:19,020 --> 00:03:21,870 an I32 like that. 75 00:03:21,870 --> 00:03:24,960 So it knows how big that is. It's 32 bits. 76 00:03:24,960 --> 00:03:29,250 And then it's got a field called Next, okay? 77 00:03:29,250 --> 00:03:31,530 And ignoring the option bit for now, 78 00:03:31,530 --> 00:03:35,820 it is effectively a box, which is like a pointer. 79 00:03:35,820 --> 00:03:39,720 When you see box, think pointer or smart pointer. 80 00:03:39,720 --> 00:03:43,050 So it is a pointer to something externally allocated. 81 00:03:43,050 --> 00:03:44,613 It's a pointer to another node. 82 00:03:45,540 --> 00:03:47,070 It is a pointer to another node. 83 00:03:47,070 --> 00:03:49,590 That's the way to think about box. 84 00:03:49,590 --> 00:03:50,423 All right? 85 00:03:50,423 --> 00:03:52,683 And then, there's another node here, 86 00:03:53,520 --> 00:03:55,230 but we've already solved the problem. 87 00:03:55,230 --> 00:03:58,860 The compiler can ascertain how big this thing is. 88 00:03:58,860 --> 00:04:01,980 It's 32 bits for the integer. 89 00:04:01,980 --> 00:04:05,340 And a box is probably also four bytes 90 00:04:05,340 --> 00:04:06,930 because it's just a pointer really. 91 00:04:06,930 --> 00:04:08,880 So the compiler can work out the size 92 00:04:08,880 --> 00:04:09,840 of this data structure. 93 00:04:09,840 --> 00:04:13,470 Let's say it's eight bytes, problem solved, okay? 94 00:04:13,470 --> 00:04:17,910 So by using a box, it effectively means it's a pointer 95 00:04:17,910 --> 00:04:20,550 to something in external memory. 96 00:04:20,550 --> 00:04:22,230 So we don't need to worry about that 97 00:04:22,230 --> 00:04:24,680 when we're trying to figure out the size of this. 98 00:04:26,130 --> 00:04:28,680 Okay, great. So box to the rescue. 99 00:04:28,680 --> 00:04:30,360 When you go to recursive data structure, 100 00:04:30,360 --> 00:04:31,530 you've got to basically use box 101 00:04:31,530 --> 00:04:34,623 to give you a pointer to another instance. 102 00:04:35,670 --> 00:04:36,783 What about option? 103 00:04:38,010 --> 00:04:40,413 Rust doesn't really do null pointers, 104 00:04:42,291 --> 00:04:44,250 it kind of does pointers to a fashion. 105 00:04:44,250 --> 00:04:46,413 Box is like a smart pointer, 106 00:04:47,400 --> 00:04:49,170 but it always contains a box, 107 00:04:49,170 --> 00:04:50,880 always contains a real... 108 00:04:50,880 --> 00:04:52,620 It always points to a real object. 109 00:04:52,620 --> 00:04:56,490 A box object will never have a null pointer 110 00:04:56,490 --> 00:04:57,600 because that would be dangerous, 111 00:04:57,600 --> 00:04:59,400 and Rust doesn't like danger. 112 00:04:59,400 --> 00:05:03,510 So a box object will always point to something. 113 00:05:03,510 --> 00:05:04,410 So what I've done 114 00:05:04,410 --> 00:05:08,130 is I've decided to put the box inside an option. 115 00:05:08,130 --> 00:05:12,060 So if we think of this more carefully, 116 00:05:12,060 --> 00:05:15,800 let's say we've got two nodes in our data structure, okay? 117 00:05:15,800 --> 00:05:20,233 In the first node, it'll have some data and integer 42. 118 00:05:22,230 --> 00:05:24,090 And then it has an option. 119 00:05:24,090 --> 00:05:28,290 An option can either be some value, or none. 120 00:05:28,290 --> 00:05:31,980 And that's basically where the null concept comes in. 121 00:05:31,980 --> 00:05:35,250 So let's say that this object will have a next pointer. 122 00:05:35,250 --> 00:05:38,073 So the option, if I kind of draw next to it. 123 00:05:39,090 --> 00:05:43,890 The option will be in the sum state. 124 00:05:43,890 --> 00:05:46,300 In other words, the option will contain a box 125 00:05:49,740 --> 00:05:54,740 and that box will point to a node, like this. 126 00:05:54,840 --> 00:05:56,910 It points to the next node. 127 00:05:56,910 --> 00:05:58,560 The next node, let's say this next node 128 00:05:58,560 --> 00:06:00,930 is actually the end of the list, like that. 129 00:06:00,930 --> 00:06:05,670 So it'll have some data, 43, like that, 130 00:06:05,670 --> 00:06:07,083 and it'll have a Next field. 131 00:06:08,130 --> 00:06:09,900 And this Next field is an option. 132 00:06:09,900 --> 00:06:12,600 It can either be some value or none, 133 00:06:12,600 --> 00:06:14,610 and it would contain the value, none. 134 00:06:14,610 --> 00:06:17,160 So instead of containing a box, 135 00:06:17,160 --> 00:06:19,410 which is the value it could contain, 136 00:06:19,410 --> 00:06:22,083 it'll either contain some box, or it'll contain none. 137 00:06:23,220 --> 00:06:26,190 None, which you can think of as effectively being 138 00:06:26,190 --> 00:06:29,160 the Rust equivalent of a safe null pointer. 139 00:06:29,160 --> 00:06:32,400 This is the way you achieve null pointers in Rust. 140 00:06:32,400 --> 00:06:37,050 You have an option which may contain a real object, 141 00:06:37,050 --> 00:06:39,810 or may contain none, like that. 142 00:06:39,810 --> 00:06:42,690 And that's basically how you do the concept of null pointers 143 00:06:42,690 --> 00:06:44,940 in Rust, in a safe way. 144 00:06:44,940 --> 00:06:47,253 Because whenever you access an option, 145 00:06:48,157 --> 00:06:49,747 you've gotta use pattern marking to say, 146 00:06:49,747 --> 00:06:53,040 "Do you contain some value, or do you contain none?" 147 00:06:53,040 --> 00:06:55,040 You can't just directly de-reference it, 148 00:06:56,040 --> 00:06:57,840 unless you call the Unwrap function, of course. 149 00:06:57,840 --> 00:06:59,700 And then good luck with that. 150 00:06:59,700 --> 00:07:02,160 So this is fine now. 151 00:07:02,160 --> 00:07:06,570 The node contains a box, which is effectively a pointer. 152 00:07:06,570 --> 00:07:11,570 So the memory for this thing is known in size 153 00:07:11,850 --> 00:07:13,080 and the pointer points 154 00:07:13,080 --> 00:07:18,000 to an externally allocated node living elsewhere, okay? 155 00:07:18,000 --> 00:07:21,300 So Rust can determine the size of this data structure, 156 00:07:21,300 --> 00:07:23,973 which is the first hurdle that it had to overcome. 157 00:07:25,020 --> 00:07:28,170 Right, so that was a quick example of the theory 158 00:07:28,170 --> 00:07:31,500 of how boxes would be used in a recursive data structure. 159 00:07:31,500 --> 00:07:33,450 I've got quite a detailed example 160 00:07:33,450 --> 00:07:35,100 which I'm gonna step you through now, 161 00:07:35,100 --> 00:07:37,530 which shows how to actually implement this link list. 162 00:07:37,530 --> 00:07:40,170 I've called it chain in the demo. 163 00:07:40,170 --> 00:07:41,340 So we'll have a look at main 164 00:07:41,340 --> 00:07:43,290 and then we'll have a look at demo_box3, 165 00:07:43,290 --> 00:07:44,390 and then we'll run it.