1 00:00:07,100 --> 00:00:11,800 The next section that I want to talk about is jails. Jails was one of the 2 00:00:11,800 --> 00:00:15,400 very early mechanisms in any of the Unix systems 3 00:00:15,400 --> 00:00:19,600 for creating virtual machines that are 4 00:00:19,600 --> 00:00:23,000 really just instances of the underlying operating system. 5 00:00:23,000 --> 00:00:27,700 So we've had virtual machines for a long time; that's 6 00:00:27,700 --> 00:00:31,600 something that actually came about from the days of IBM machines, 7 00:00:31,600 --> 00:00:35,900 wherein you could create essentially what looked like 8 00:00:35,900 --> 00:00:36,700 a clone of 9 00:00:36,800 --> 00:00:40,400 the bare hardware, and then you could just boot on that 10 00:00:40,400 --> 00:00:44,800 virtual machine any operating system you wanted. And we 11 00:00:44,800 --> 00:00:48,300 have those sorts of things today. VMware is a good example of that. 12 00:00:48,300 --> 00:00:52,800 And this would allow you to, for example, boot a Linux kernel and then 13 00:00:52,800 --> 00:00:56,300 run instances of FreeBSD or Solaris or 14 00:00:56,300 --> 00:01:00,500 whatever else you wanted: two different versions of Linux, 15 00:01:00,500 --> 00:01:04,500 a Debian version, maybe, and a Red Hat, and another one. And 16 00:01:04,500 --> 00:01:06,600 each of them sort of has the appearance 17 00:01:06,800 --> 00:01:10,700 as if they're running actually on the bare hardware. The drawback to virtual 18 00:01:10,700 --> 00:01:14,700 machines is that they are very, very resource-intensive. It's not 19 00:01:14,700 --> 00:01:18,700 uncommon to need a gigabyte or two of memory per virtual machine that you're 20 00:01:18,700 --> 00:01:22,900 running on the system. And so unless you have a machine with 21 00:01:22,900 --> 00:01:26,600 just a colossal amount of memory and other resources on it, you really 22 00:01:26,600 --> 00:01:30,300 can't run
very many virtual machines before you 23 00:01:30,300 --> 00:01:33,900 start really running up against resource limitations. 24 00:01:35,000 --> 00:01:39,900 So the idea of jails was to provide the sort of equivalent of a virtual 25 00:01:39,900 --> 00:01:43,900 machine, but you don't get to boot any operating system you want. What you get 26 00:01:44,000 --> 00:01:48,600 is an instance of FreeBSD. So it's an instance of the underlying 27 00:01:48,800 --> 00:01:50,300 operating system that you're running. 28 00:01:51,500 --> 00:01:55,900 It was first implemented in the late 90s because there 29 00:01:55,900 --> 00:01:59,800 were a number of people that were 30 00:01:59,800 --> 00:02:03,500 trying to use BSD to provide web 31 00:02:03,500 --> 00:02:07,400 hosting or mail hosting or essentially 32 00:02:07,400 --> 00:02:11,900 various services for clients, and absent the ability to 33 00:02:11,900 --> 00:02:15,700 have jails or something like them, their only choice was 34 00:02:15,700 --> 00:02:19,800 to essentially have bare hardware, because most 35 00:02:19,800 --> 00:02:21,400 people that were running something 36 00:02:21,500 --> 00:02:25,700 like a web hosting site expected to be able to 37 00:02:25,700 --> 00:02:29,900 have root privilege and be able to do whatever they wanted as root. And that would 38 00:02:29,900 --> 00:02:33,800 obviously give them the ability to impact other users that were running on the system. 39 00:02:34,800 --> 00:02:38,800 And so the idea of jails was to create this little sort of instance 40 00:02:39,200 --> 00:02:43,800 of FreeBSD that would look like you had your own bare 41 00:02:43,800 --> 00:02:47,900 hardware, but in fact you were running on the operating system, so you didn't have 42 00:02:47,900 --> 00:02:51,300 to actually have extra copies of the operating system. There was just 43 00:02:51,400 --> 00:02:55,400 the one real copy of the operating system that was running.
And then it would 44 00:02:55,400 --> 00:02:59,800 simply provide these various instances. So where it might be possible to run 45 00:02:59,800 --> 00:03:03,900 five or maybe even 10 virtual machines on a really large machine, a really 46 00:03:03,900 --> 00:03:07,200 large set of hardware, in the case of jails, 47 00:03:07,800 --> 00:03:11,700 as we quickly learned due to feedback about things that were running slowly when there got to be too 48 00:03:11,700 --> 00:03:15,900 many of them, you can have hundreds if not thousands of jails, 49 00:03:16,600 --> 00:03:20,700 because the actual memory footprint to create a new jail is a 50 00:03:20,700 --> 00:03:21,300 few megabytes 51 00:03:21,500 --> 00:03:25,400 at most. And so the actual 52 00:03:25,800 --> 00:03:29,800 memory requirements are low. The resources are already there because you're 53 00:03:29,800 --> 00:03:33,900 just running on the one true copy of the operating system that's running. And so really 54 00:03:33,900 --> 00:03:37,200 the cost of a jail is the cost of what's running in it. I mean, 55 00:03:37,200 --> 00:03:41,700 obviously if you're running a thousand jails, you can't have every jail running 56 00:03:41,700 --> 00:03:45,500 flat out on a CPU because you likely don't have a thousand 57 00:03:45,500 --> 00:03:49,500 CPUs, but for things like web servers, low-volume web 58 00:03:49,500 --> 00:03:50,900 servers in particular, 59 00:03:51,400 --> 00:03:55,600 it's a very inexpensive way because, you know, you have the 60 00:03:55,600 --> 00:03:59,800 appearance of running on bare hardware, but in fact you're just using 61 00:03:59,800 --> 00:04:03,000 part of a very small slice of another machine.
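As a rough back-of-the-envelope illustration of the footprint comparison above, the sketch below divides a host's memory by a per-instance cost. The 16 GB host size and the 4 MB per-jail figure are assumptions chosen for illustration (the talk only says "a gigabyte or two" per virtual machine and "a few megabytes at most" per jail), and the function name is hypothetical. It counts only the fixed per-instance overhead and ignores whatever memory the workloads inside each instance use.

```python
def max_instances(host_memory_mb, per_instance_mb):
    """How many instances fit if each one costs a fixed amount of memory."""
    return host_memory_mb // per_instance_mb


HOST_MEMORY_MB = 16 * 1024   # assumed host size: 16 GB
VM_COST_MB = 2 * 1024        # "a gigabyte or two" per virtual machine
JAIL_COST_MB = 4             # assumed value for "a few megabytes at most"

vm_count = max_instances(HOST_MEMORY_MB, VM_COST_MB)
jail_count = max_instances(HOST_MEMORY_MB, JAIL_COST_MB)
print(vm_count, jail_count)
```

Under these assumed numbers the host fits a handful of virtual machines but thousands of jails, which matches the orders of magnitude quoted in the talk.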
62 00:04:04,200 --> 00:04:08,700 This essentially allowed the people that were selling web hosting services to 63 00:04:08,800 --> 00:04:12,600 dramatically cut their prices, because instead of having to provide a piece of hardware 64 00:04:12,800 --> 00:04:16,600 for every client that they had, they could simply 65 00:04:16,600 --> 00:04:20,900 create jails, and as clients would ramp up the amount they were using, they 66 00:04:21,500 --> 00:04:25,800 could then put them on machines with fewer and fewer jails. And eventually, if they got to a volume where it 67 00:04:25,800 --> 00:04:29,900 made sense, they could have their own hardware, but at that point they would presumably be going 68 00:04:29,900 --> 00:04:33,600 to pay the multiple hundreds of dollars a month to get that kind of service. 69 00:04:35,400 --> 00:04:39,700 So this picture that you see here shows you sort of the contents of a 70 00:04:39,700 --> 00:04:43,900 jail. And so we have the 71 00:04:43,900 --> 00:04:47,700 host at the top there. So that's the instance where we 72 00:04:47,700 --> 00:04:51,300 originally booted the machine on the bare hardware. And 73 00:04:51,400 --> 00:04:55,900 then what we've created inside that: we have the 74 00:04:56,200 --> 00:05:00,700 bin and dev and etc and usr and so on that you would expect to see on the 75 00:05:00,700 --> 00:05:04,600 regular system, but then you'll see that we have usr/jails, and then 76 00:05:04,600 --> 00:05:08,800 down in usr/jails we create the various jails that we want to 77 00:05:08,800 --> 00:05:12,600 implement. So we've got one jail in this picture that's being used for web hosting 78 00:05:13,000 --> 00:05:17,800 and another one that's being used for mail hosting. It doesn't mean that you couldn't have a single jail and do 79 00:05:17,800 --> 00:05:21,300 both web and mail in that jail, but that's just for purposes of this example.
80 00:05:21,700 --> 00:05:25,900 You see there that each of those jails has its own set 81 00:05:25,900 --> 00:05:29,800 of binaries, which is usually a subset of those that are 82 00:05:29,800 --> 00:05:33,000 from the host. You can either make copies of the 83 00:05:33,000 --> 00:05:37,900 host's binaries into each of the jails or, if you're willing 84 00:05:37,900 --> 00:05:41,500 to make them read-only, you can actually just create a hard link 85 00:05:41,500 --> 00:05:45,800 from the jail's binary to the host's binary. What 86 00:05:45,800 --> 00:05:49,900 this means is that if, on that particular jail, you want to replace 87 00:05:49,900 --> 00:05:51,600 a particular binary, so 88 00:05:51,800 --> 00:05:55,900 for example you've got /bin/ls that you want to replace, you have 89 00:05:55,900 --> 00:05:59,800 to remove the link and then you can put in a copy of the ls that you actually want to 90 00:05:59,800 --> 00:06:03,800 have within your jail, because if you were able to overwrite the ls, then 91 00:06:04,300 --> 00:06:08,900 the host and everybody else that was sharing that would see your changes. But this does allow 92 00:06:08,900 --> 00:06:12,800 you to predominantly borrow copies of things from the host and 93 00:06:12,800 --> 00:06:16,900 only put in private copies of things where you want to make changes to them. 94 00:06:17,200 --> 00:06:21,500 Now, one of the other issues of course with jails is 95 00:06:21,700 --> 00:06:25,900 that you want to be able to get access to networking. And when 96 00:06:25,900 --> 00:06:29,500 jails were first done, what we gave you was 97 00:06:29,500 --> 00:06:33,600 one IP address on whatever the actual 98 00:06:33,700 --> 00:06:37,300 network interface was; in this picture, that would be em0. 99 00:06:38,600 --> 00:06:42,800
You were then allowed to create ports, any 100 00:06:42,800 --> 00:06:46,200 ports you wanted, for your address, but you weren't allowed to 101 00:06:47,600 --> 00:06:51,800 create ports for any other address, and other jails, of course, weren't allowed to create 102 00:06:51,800 --> 00:06:55,500 ports for your address. And so that's the way that we bifurcated the traffic. 103 00:06:56,000 --> 00:07:00,800 The problem with this is that we did not allow a lot of the tools that people are 104 00:07:00,800 --> 00:07:04,800 generally used to using that require essentially 105 00:07:05,400 --> 00:07:07,900 getting raw access to the interface, because as soon as you have 106 00:07:08,000 --> 00:07:12,800 access to the interface, you can then snoop all the traffic that's coming across it, and you 107 00:07:12,800 --> 00:07:16,900 would be able to see traffic going to other hosts and potentially 108 00:07:16,900 --> 00:07:20,500 inject traffic that might go to the other hosts. And this is obviously not 109 00:07:20,500 --> 00:07:24,900 desirable. So a few years ago in FreeBSD we created 110 00:07:24,900 --> 00:07:28,900 what we call the virtual network stack, and the idea of the virtual 111 00:07:28,900 --> 00:07:32,800 network stack is that we create instances of 112 00:07:32,800 --> 00:07:36,900 the network stack. In essence, it's like you have 113 00:07:36,900 --> 00:07:37,900 your very own copy 114 00:07:38,000 --> 00:07:42,900 of the stack. And so all the things that you sort of think of as global variables, like backoff counters 115 00:07:42,900 --> 00:07:46,700 and retransmission times and all that sort of thing, 116 00:07:46,800 --> 00:07:50,900 you can individually control, which historically would have changed them 117 00:07:50,900 --> 00:07:54,600 for everybody. But with the virtual network stack, you're only changing them for your 118 00:07:54,600 --> 00:07:58,800 instance of the network stack.
So, once we have this capability, what we 119 00:07:58,800 --> 00:08:02,800 can do is what is shown in this picture, and that is, we take the actual original 120 00:08:02,800 --> 00:08:06,800 hardware network interface, in this case em0, and on 121 00:08:06,800 --> 00:08:07,800 top of that we put a 122 00:08:08,000 --> 00:08:12,900 virtual network stack, vnet0, and all that 123 00:08:12,900 --> 00:08:16,800 vnet0 is really going to do is IP routing. So 124 00:08:16,800 --> 00:08:20,700 it's going to take any packets that are for IP addresses of the 125 00:08:20,700 --> 00:08:24,600 web jail, and they're going to be sent over to 126 00:08:24,600 --> 00:08:28,800 virtual em0a, and anything that's for the 127 00:08:28,800 --> 00:08:32,900 virtual mail stack is going to be sent over to virtual em1a. 128 00:08:32,900 --> 00:08:36,800 And then whatever virtual em0a 129 00:08:36,800 --> 00:08:37,800 takes as input, 130 00:08:38,000 --> 00:08:42,300 it passes up to virtual em0b, which is the interface that appears 131 00:08:42,300 --> 00:08:46,600 within the web jail, and 132 00:08:46,600 --> 00:08:50,600 em1b appears within the mail jail. 133 00:08:50,800 --> 00:08:54,900 So the upshot of this then is that on 134 00:08:54,900 --> 00:08:58,400 the web server side, you can do whatever you want with 135 00:08:58,400 --> 00:09:02,900 virtual em0b. You can put it in promiscuous mode, you know, look at 136 00:09:02,900 --> 00:09:06,500 all the packets that are coming in, ifconfig it up, ifconfig it down, 137 00:09:06,500 --> 00:09:07,200 whatever. 138 00:09:08,100 --> 00:09:12,700 You're not going to actually be able to see any packets other than the ones that are being sent to you by 139 00:09:12,700 --> 00:09:16,900 vnet0. So you don't have to worry that somehow, by snooping, you're going to be able to see 140 00:09:16,900 --> 00:09:20,800 other people's packets.
You're only going to be able to see those that are destined for your IP 141 00:09:20,800 --> 00:09:24,700 address. Similarly, over on the mail side, it's only going to be able to 142 00:09:24,700 --> 00:09:28,700 see things that are being sent over through its virtual network 143 00:09:28,700 --> 00:09:32,800 interface. They of course each have their own virtual nets, vnet1 and vnet2, 144 00:09:32,800 --> 00:09:36,500 which allows them to configure their networks in whatever way they view as 145 00:09:36,500 --> 00:09:37,900 desirable without 146 00:09:38,000 --> 00:09:42,400 affecting the networks that other jails are using. And again, 147 00:09:42,700 --> 00:09:46,900 creating a virtual network really just requires that you create an instance 148 00:09:46,900 --> 00:09:50,600 of the global variables associated with the network, which is a few kilobytes' 149 00:09:50,600 --> 00:09:54,900 worth of variables. And so the upshot is 150 00:09:54,900 --> 00:09:58,200 that it's very, very cheap to create these virtual nets and virtual 151 00:09:58,200 --> 00:10:02,700 interfaces, and through this kind of a mechanism we allow you to 152 00:10:03,300 --> 00:10:07,900 have the appearance as if you've got your own network interface, so that you can do whatever you want. You can, in fact, 153 00:10:08,000 --> 00:10:12,800 configure it to receive packets for IP addresses that are not the ones that have been assigned to 154 00:10:12,800 --> 00:10:16,800 you, but you won't ever get any traffic on that, because vnet0 is never going to route 155 00:10:16,800 --> 00:10:20,800 those through to you. So although you can do things that might 156 00:10:20,800 --> 00:10:24,500 seem like they would be inappropriate, the fact of the matter is that nothing 157 00:10:24,500 --> 00:10:28,700 inappropriate is going to actually happen. So, let's look at 158 00:10:28,900 --> 00:10:30,900 some of the rules that we have for jails.
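Before moving on to the rules, the forwarding behavior just described for vnet0 can be pictured with a small toy model. This is only a sketch under stated assumptions, not how the FreeBSD kernel implements virtual network stacks: the class names, the `claim` method, and the 192.0.2.x addresses are all made up for illustration. The point it shows is that delivery is decided by the host-side table alone, so a jail that configures someone else's address on its own interface still receives nothing for it.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualInterface:
    """Toy stand-in for the jail-side end of a virtual interface (e.g. em0b)."""
    name: str
    claimed: set = field(default_factory=set)     # addresses the jail configured itself
    received: list = field(default_factory=list)  # packets that actually arrived

    def claim(self, ip):
        # A jail may configure any address it likes on its own interface...
        self.claimed.add(ip)


class HostRouter:
    """Toy stand-in for vnet0: forwards by destination address only."""

    def __init__(self):
        self.routes = {}  # destination IP -> VirtualInterface, set by the host

    def assign(self, ip, iface):
        self.routes[ip] = iface

    def deliver(self, dst_ip, payload):
        # ...but delivery consults only the host's table, never iface.claimed.
        iface = self.routes.get(dst_ip)
        if iface is None:
            return None  # nobody owns this address; dropped in this model
        iface.received.append((dst_ip, payload))
        return iface.name


web = VirtualInterface("em0b")
mail = VirtualInterface("em1b")
vnet0 = HostRouter()
vnet0.assign("192.0.2.10", web)   # assumed address for the web jail
vnet0.assign("192.0.2.20", mail)  # assumed address for the mail jail

web.claim("192.0.2.20")  # the web jail tries to listen on the mail jail's address
vnet0.deliver("192.0.2.10", "GET /")
vnet0.deliver("192.0.2.20", "EHLO client")
```

After these calls, `web.received` holds only the packet addressed to the web jail, even though the web jail claimed the mail jail's address.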
159 00:10:31,800 --> 00:10:35,800 Although we do in fact hand out the root password to the 160 00:10:35,800 --> 00:10:39,900 jail, we don't allow all the things that root can do on the host machine to 161 00:10:39,900 --> 00:10:43,600 be done in a jail. So the sorts of things that we do allow 162 00:10:43,800 --> 00:10:47,800 are running or signaling processes within the jail. So for any process that's running 163 00:10:47,800 --> 00:10:51,700 within your jail, as root you have the right to start it, stop it, kill it, 164 00:10:51,700 --> 00:10:55,800 whatever. If you do something like ps within a jail, the 165 00:10:55,800 --> 00:10:59,800 only processes that will be returned to you will be those that are within your 166 00:10:59,800 --> 00:11:01,200 jail. So 167 00:11:01,600 --> 00:11:05,800 the other processes running in the system are simply not visible once you're in the jail, and 168 00:11:05,800 --> 00:11:09,700 hence, you know, you can't even see the other processes. And even if you 169 00:11:09,700 --> 00:11:13,500 were to somehow get what their process IDs were, if you were to try and send some 170 00:11:13,600 --> 00:11:17,900 signal to one of them, it would simply be rejected, although instead of telling you that you don't have 171 00:11:17,900 --> 00:11:21,900 permission, it would simply say it doesn't exist. You are allowed to make arbitrary changes to 172 00:11:21,900 --> 00:11:25,900 files within a jail, just as you would for arbitrary files anywhere in 173 00:11:25,900 --> 00:11:29,500 the file system. The issue is that you can't get out of your jail. So the root of your 174 00:11:29,500 --> 00:11:31,400 jail, even though it's 175 00:11:31,500 --> 00:11:35,900 a subdirectory of the host machine, you're not allowed out of that.
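The process-visibility rule above (listing shows only your jail's processes, and signaling a process outside the jail reports "doesn't exist" rather than "no permission") can be sketched as a toy model. This is an illustration under assumptions, not kernel code: the class and method names are hypothetical, and the use of the standard errno value ESRCH ("no such process") for the rejection is my reading of "it would simply say it doesn't exist".

```python
import errno


class JailedProcessTable:
    """Toy process table in which every process belongs to exactly one jail."""

    def __init__(self):
        self._owner = {}  # pid -> name of the jail that owns the process

    def spawn(self, pid, jail):
        self._owner[pid] = jail

    def list_for(self, jail):
        # What a ps-style listing would return inside the given jail.
        return sorted(pid for pid, owner in self._owner.items() if owner == jail)

    def signal(self, caller_jail, pid):
        """Return 0 on success, or an errno value as seen by root in caller_jail."""
        owner = self._owner.get(pid)
        if owner != caller_jail:
            # Same answer for "belongs to another jail" and "really does not
            # exist", so the caller cannot probe for foreign process IDs.
            return errno.ESRCH
        return 0


table = JailedProcessTable()
table.spawn(100, "web")
table.spawn(101, "web")
table.spawn(200, "mail")
```

The design choice worth noting is that an existing foreign PID and a nonexistent PID are indistinguishable from inside the jail; returning a permission error instead would leak the fact that the process exists.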
176 00:11:36,000 --> 00:11:40,300 We've essentially done a chroot to the root of your jail, and 177 00:11:40,600 --> 00:11:44,700 so dot-dot, as far as you're concerned, just leads you right to where you 178 00:11:44,700 --> 00:11:48,900 already are. As I've already said, you can bind ports to the jail's IP addresses. 179 00:11:49,300 --> 00:11:53,400 We no longer say that you can only have one IP address. In fact, you can have an 180 00:11:53,400 --> 00:11:57,900 IPv4 address and an IPv6 address. You can have multiple IPv4 addresses, multiple 181 00:11:57,900 --> 00:12:01,400 IPv6 addresses. It's really just a matter of 182 00:12:01,500 --> 00:12:05,800 vnet0 being configured to pass packets for the addresses that have been 183 00:12:05,800 --> 00:12:09,900 assigned to you through to you. There's an arbitrary number of addresses that can 184 00:12:09,900 --> 00:12:13,900 be assigned to you. And, as I said earlier, if you pick an address that's not one 185 00:12:13,900 --> 00:12:17,500 of the ones assigned to you, you simply won't see any of the packets that are sent to those 186 00:12:17,500 --> 00:12:20,700 addresses, because they won't be forwarded to your virtual interface. 187 00:12:21,500 --> 00:12:25,400 It does mean that you can access raw, divert, or routing sockets. 188 00:12:25,400 --> 00:12:29,800 You have a whole routing table that you can totally control, because that's all part of the 189 00:12:29,800 --> 00:12:33,600 virtual networking stack. So you can do completely crazy routing 190 00:12:33,600 --> 00:12:37,700 things, and the only processes that are going to be affected 191 00:12:37,700 --> 00:12:41,700 if you make bad choices there are the ones running in your jail, because the other 192 00:12:41,700 --> 00:12:45,800 jails have their own routing tables, which are presumably being managed in a more coherent 193 00:12:45,800 --> 00:12:46,100 way.
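The chroot point made at the start of this passage, that `..` at the jail's root just leads back to where you already are, can be sketched as a path-resolution toy. This is a simplified illustration, not the kernel's lookup code: the function name is hypothetical, the `/usr/jails/web` location is taken from the picture described earlier, and symbolic links are ignored entirely.

```python
import posixpath


def resolve_in_jail(jail_root, cwd, path):
    """Resolve `path` the way a jailed process would see it.

    `cwd` is the current directory as seen inside the jail. Returns a pair
    (path as seen inside the jail, corresponding path on the host).
    """
    start = path if path.startswith("/") else posixpath.join(cwd, path)
    parts = []
    for comp in start.split("/"):
        if comp in ("", "."):
            continue
        if comp == "..":
            if parts:
                parts.pop()
            # With nothing left to pop we are at the jail's root, and ".."
            # stays there instead of climbing into the host's tree.
            continue
        parts.append(comp)
    jail_view = "/" + "/".join(parts)
    host_view = posixpath.join(jail_root, *parts)
    return jail_view, host_view


print(resolve_in_jail("/usr/jails/web", "/", ".."))
print(resolve_in_jail("/usr/jails/web", "/etc", "../../../bin/ls"))
```

However many `..` components a path contains, the host-side result in this model always stays at or below the jail's root directory.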
194 00:12:46,900 --> 00:12:50,800 And you can do sort of arbitrary things with your 195 00:12:50,800 --> 00:12:54,800 virtual network interface. What you're not permitted to 196 00:12:54,800 --> 00:12:58,700 do is to get information on processes outside of the jail. So if you 197 00:12:58,700 --> 00:13:02,700 do a system call requesting information on, quote, 198 00:13:02,700 --> 00:13:06,800 all the processes in the system, it will simply report back those that are in your 199 00:13:06,800 --> 00:13:10,900 jail, and it won't tell you about anything that's outside your jail. And indeed, if you 200 00:13:10,900 --> 00:13:14,700 through some out-of-band way find out the 201 00:13:14,700 --> 00:13:16,700 process ID of some process that's 202 00:13:16,900 --> 00:13:20,800 not in your jail and try to signal it, it won't even acknowledge the fact that that process 203 00:13:20,800 --> 00:13:24,700 exists by telling you that you don't have permission. It will simply say there is no such 204 00:13:24,700 --> 00:13:25,300 process. 205 00:13:26,500 --> 00:13:30,700 We do not allow you to change kernel variables. You're allowed to change 206 00:13:30,700 --> 00:13:34,600 what was previously thought of as global state within your network 207 00:13:34,600 --> 00:13:38,700 stack, but you're not allowed to change variables like the maximum number of 208 00:13:38,700 --> 00:13:42,800 processes or the maximum number of open descriptors, because those variables would, in 209 00:13:42,800 --> 00:13:46,600 fact, affect every jail and the host running on that system. 210 00:13:47,000 --> 00:13:51,900 We do not allow mounting or unmounting of file systems, because if 211 00:13:51,900 --> 00:13:55,600 you can reach out and get other file systems, it would potentially give you access to 212 00:13:55,600 --> 00:13:59,700 files that you should not be able to get access to. There is
There is 213 00:13:59,700 --> 00:14:03,600 some relaxation of this if you're running with ZFS, in 214 00:14:03,600 --> 00:14:07,900 ZFS, you can be given the permission to create file systems. So if you have created that, filesystem, 215 00:14:07,900 --> 00:14:11,700 that filesystem will be associated with your jail and you will be allowed to 216 00:14:11,700 --> 00:14:15,900 mount and unmount that filesystem within your jail. We will not allow you to modify the 217 00:14:15,900 --> 00:14:16,700 physical Network. 218 00:14:17,000 --> 00:14:21,600 Cases or configurations, IE. You cannot go to the actual Hardware network interface 219 00:14:21,600 --> 00:14:25,700 and Mark it up or down, or put it into raw, or 220 00:14:25,700 --> 00:14:29,900 divert mode or any of those things. Because, obviously, again, it would give you access to 221 00:14:29,900 --> 00:14:33,700 things that you should not be able to see. And finally, we do not allow 222 00:14:33,700 --> 00:14:37,800 you to reboot the system. You can effectively reboot your 223 00:14:37,800 --> 00:14:41,300 jail. You can essentially shut your jail down to the equivalent of single user mode 224 00:14:41,300 --> 00:14:45,900 and then start it back up again. So it will run through all of its 225 00:14:45,900 --> 00:14:46,700 RC Scripts. 226 00:14:47,000 --> 00:14:51,600 This can be convenient if your web server is housed and you've changed some configurations and rather than 227 00:14:51,600 --> 00:14:55,400 trying to get it to learn about the new configurations. You just take it down to the 228 00:14:55,400 --> 00:14:59,500 equivalent of single user and then bring it back up again, a little, well. 
It's 229 00:14:59,500 --> 00:15:03,100 distinctly quicker to take a jail down to single user and bring it back up again 230 00:15:03,100 --> 00:15:07,900 than it is to do so with the actual system, because a lot 231 00:15:07,900 --> 00:15:11,900 of the syncing and other things that you need to do when you take a real system down don't need to 232 00:15:11,900 --> 00:15:15,700 be done when you take a jail down. You just need to shut down all the 233 00:15:15,700 --> 00:15:16,600 processes and closer. 234 00:15:17,000 --> 00:15:17,600 Features and so on.