1 00:00:06,750 --> 00:00:07,583 - In this video, 2 00:00:07,583 --> 00:00:10,440 I will show you how to manage processes using top. 3 00:00:10,440 --> 00:00:11,370 And in order to do so, 4 00:00:11,370 --> 00:00:14,910 I'm starting top with elevated privileges 5 00:00:14,910 --> 00:00:17,610 to ensure that I have access to everything. 6 00:00:17,610 --> 00:00:18,720 So what do we see? 7 00:00:18,720 --> 00:00:20,160 Well, the top interface 8 00:00:20,160 --> 00:00:21,960 and the top interface is called top 9 00:00:21,960 --> 00:00:25,380 because it sorts processes by CPU activity. 10 00:00:25,380 --> 00:00:28,320 So you will always see the busiest process 11 00:00:28,320 --> 00:00:30,540 listed as number one. 12 00:00:30,540 --> 00:00:32,070 In the five header lines, 13 00:00:32,070 --> 00:00:35,340 you get some additional information about the system. 14 00:00:35,340 --> 00:00:37,620 It starts with some uptime information, 15 00:00:37,620 --> 00:00:40,290 number of users that are logged in. 16 00:00:40,290 --> 00:00:43,170 And then it's giving you a load average. 17 00:00:43,170 --> 00:00:45,030 The load average is giving information 18 00:00:45,030 --> 00:00:48,420 about the current workload of the system. 19 00:00:48,420 --> 00:00:50,940 And it's expressed in three numbers 20 00:00:50,940 --> 00:00:53,493 for the last one, five and 15 minutes. 21 00:00:54,600 --> 00:00:57,960 Now, the number reflects the average number of processes 22 00:00:57,960 --> 00:01:00,930 that have been asking for attention. 23 00:01:00,930 --> 00:01:02,010 And you know what? 24 00:01:02,010 --> 00:01:04,218 Let's just start a couple of processes. 25 00:01:04,218 --> 00:01:06,801 (quiet typing) 26 00:01:10,860 --> 00:01:13,740 I am going to run three background processes 27 00:01:13,740 --> 00:01:15,390 because it's more interesting to see 28 00:01:15,390 --> 00:01:18,180 that top is actually showing something. 
29 00:01:18,180 --> 00:01:19,020 So as you can see, 30 00:01:19,020 --> 00:01:21,617 we have the three dd processes 31 00:01:21,617 --> 00:01:24,117 that are listed as the most active processes. 32 00:01:24,117 --> 00:01:27,510 And you can see the load average slowly climbing up. 33 00:01:27,510 --> 00:01:30,090 It's an average for the last one minute. 34 00:01:30,090 --> 00:01:31,800 So it will take about one minute 35 00:01:31,800 --> 00:01:35,253 until we will see a three right here. 36 00:01:36,270 --> 00:01:38,310 Now, the load average is an indicator 37 00:01:38,310 --> 00:01:40,740 of the performance health of your system. 38 00:01:40,740 --> 00:01:41,970 Big question is, 39 00:01:41,970 --> 00:01:43,860 what is good and what is bad? 40 00:01:43,860 --> 00:01:47,190 Well, it all depends on the number of CPUs in your system. 41 00:01:47,190 --> 00:01:48,270 And from this interface, 42 00:01:48,270 --> 00:01:51,390 we don't see directly how many CPUs there are. 43 00:01:51,390 --> 00:01:53,820 The third line is just a summary of all the CPUs 44 00:01:53,820 --> 00:01:56,373 that are available, but if you press 1, 45 00:01:56,373 --> 00:01:59,610 then you get one line for every single CPU. 46 00:01:59,610 --> 00:02:01,533 So this is a two CPU system. 47 00:02:02,850 --> 00:02:03,840 So what does that mean? 48 00:02:03,840 --> 00:02:05,232 Well, two CPUs means 49 00:02:05,232 --> 00:02:08,130 that if we have processes 50 00:02:08,130 --> 00:02:11,549 that are asking 100% of CPU attention, 51 00:02:11,549 --> 00:02:16,470 the two CPUs can only run two processes at the same time. 52 00:02:16,470 --> 00:02:20,460 So if the load average is higher than the number of CPUs 53 00:02:20,460 --> 00:02:22,530 that you have in your system, 54 00:02:22,530 --> 00:02:25,080 then you need to investigate what is going on. 55 00:02:25,080 --> 00:02:27,660 It's not necessarily a bad situation 56 00:02:27,660 --> 00:02:30,123 but you should investigate what is going on. 
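The rule of thumb above (compare the load average with the number of CPUs) could be sketched in Python roughly as follows. This is only an illustration, not something top does itself: the helper names `load_per_cpu` and `needs_investigation` are hypothetical, and the cut-off of exactly one unit of load per CPU is an assumption taken from the explanation above. `os.getloadavg()` and `os.cpu_count()` are standard library calls.

```python
import os


def load_per_cpu(load_averages, cpu_count):
    """Divide each load average by the CPU count (hypothetical helper)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


def needs_investigation(load_averages, cpu_count):
    """Return True if any load average is higher than the number of CPUs."""
    return any(ratio > 1.0 for ratio in load_per_cpu(load_averages, cpu_count))


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages,
    # the same three numbers that top shows in its first line.
    loads = os.getloadavg()
    cpus = os.cpu_count() or 1
    print("load averages:", loads, "CPUs:", cpus)
    if needs_investigation(loads, cpus):
        print("Load is above the CPU count; worth a closer look.")
```

On the two-CPU system from the video, a one-minute load average of 3 would give a ratio of 1.5, which is above the assumed cut-off and therefore a reason to look closer, not proof of a problem.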
57 00:02:32,220 --> 00:02:33,180 So the second line, 58 00:02:33,180 --> 00:02:35,760 on the second line we can see the number of tasks 59 00:02:35,760 --> 00:02:37,050 that is running. 60 00:02:37,050 --> 00:02:38,100 The interesting part. 61 00:02:38,100 --> 00:02:39,540 The interesting part here 62 00:02:39,540 --> 00:02:42,750 is that we can see four of them are actively running. 63 00:02:42,750 --> 00:02:44,281 Running is for processes 64 00:02:44,281 --> 00:02:47,550 that recently have asked for CPU attention. 65 00:02:47,550 --> 00:02:49,800 And we have 244 sleeping, 66 00:02:49,800 --> 00:02:53,070 they are not creating any significant workload. 67 00:02:53,070 --> 00:02:55,830 No stopped ones, no zombies. 68 00:02:55,830 --> 00:02:57,030 A zombie is a process 69 00:02:57,030 --> 00:02:59,580 that has gone into an unmanageable state 70 00:02:59,580 --> 00:03:01,680 and you shouldn't see them at all. 71 00:03:01,680 --> 00:03:02,730 And if you see them, 72 00:03:02,730 --> 00:03:04,140 well, probably the best thing 73 00:03:04,140 --> 00:03:06,270 is either to wait until they go away 74 00:03:06,270 --> 00:03:09,513 or to reboot your system to force them to go away. 75 00:03:10,530 --> 00:03:12,090 Then we have the CPUs line. 76 00:03:12,090 --> 00:03:13,380 Remember if I press 1, 77 00:03:13,380 --> 00:03:16,140 then I see a summary for all CPUs 78 00:03:16,140 --> 00:03:19,280 and I can press 1 again to toggle 79 00:03:19,280 --> 00:03:22,380 for a display that is showing one line 80 00:03:22,380 --> 00:03:24,330 for every single CPU. 81 00:03:24,330 --> 00:03:25,740 In the CPUs line, 82 00:03:25,740 --> 00:03:29,940 we can see information about what exactly the CPU is doing. 83 00:03:29,940 --> 00:03:32,910 So we can see about 16% in us, 84 00:03:32,910 --> 00:03:35,190 us is user space. 85 00:03:35,190 --> 00:03:37,860 User space is for non-privileged processes. 86 00:03:37,860 --> 00:03:41,670 It's typically tasks that are executed by an ordinary user. 
87 00:03:41,670 --> 00:03:45,300 We have 80% in sy, sy is system space. 88 00:03:45,300 --> 00:03:47,940 That is where drivers are involved. 89 00:03:47,940 --> 00:03:50,670 And I can tell you given the workload of dd, 90 00:03:50,670 --> 00:03:54,900 where we use dd from one interface to another interface, 91 00:03:54,900 --> 00:03:58,320 it's understandable that the activity in system space 92 00:03:58,320 --> 00:03:59,910 is quite high, but normally 93 00:03:59,910 --> 00:04:03,303 you should see your activity more in the user space. 94 00:04:04,230 --> 00:04:08,880 A very important indicator is id, id is for the idle loop. 95 00:04:08,880 --> 00:04:12,510 It will tell you how much time your CPU is spending 96 00:04:12,510 --> 00:04:14,130 in doing nothing. 97 00:04:14,130 --> 00:04:17,239 And the other important parameter is wa, 98 00:04:17,239 --> 00:04:19,560 wa means waiting for I/O 99 00:04:19,560 --> 00:04:22,500 and waiting for I/O means that your system is waiting 100 00:04:22,500 --> 00:04:26,460 for your disk channel to take care of processing the data. 101 00:04:26,460 --> 00:04:28,590 If you have a high number in wa 102 00:04:28,590 --> 00:04:30,840 then there is definitely something not alright 103 00:04:30,840 --> 00:04:33,000 in your storage channel. 104 00:04:33,000 --> 00:04:35,820 And that's a candidate for optimization. 105 00:04:35,820 --> 00:04:38,310 The other parameters are not so very important. 106 00:04:38,310 --> 00:04:42,000 So keep an eye on us, sy, id and wa 107 00:04:42,000 --> 00:04:43,800 and you can get a pretty good impression 108 00:04:43,800 --> 00:04:44,943 of what is going on. 109 00:04:45,840 --> 00:04:48,120 Then we have a line for the memory usage. 110 00:04:48,120 --> 00:04:50,730 So this system has four gigabytes in total 111 00:04:50,730 --> 00:04:54,420 and about 1.2 gigabytes free. 
112 00:04:54,420 --> 00:04:58,000 Also, we can see about 1.3 gigabytes used 113 00:04:58,000 --> 00:05:01,800 and 1,400 megabytes in buffer and cache. 114 00:05:01,800 --> 00:05:03,510 Now, what is this buffer and cache? 115 00:05:03,510 --> 00:05:04,680 Simple. 116 00:05:04,680 --> 00:05:06,120 It's all about memory 117 00:05:06,120 --> 00:05:08,727 and this system has four gigabytes of memory. 118 00:05:08,727 --> 00:05:12,240 Now, Linux is using memory for cache, 119 00:05:12,240 --> 00:05:14,670 if it is not needed by anything else. 120 00:05:14,670 --> 00:05:17,160 Cache is used for recently requested files. 121 00:05:17,160 --> 00:05:19,560 And the thing is, if the file is stored in cache, 122 00:05:19,560 --> 00:05:21,630 the next time that somebody needs it, 123 00:05:21,630 --> 00:05:23,280 it can be serviced from cache. 124 00:05:23,280 --> 00:05:25,410 And that makes your system fast. 125 00:05:25,410 --> 00:05:27,750 If the memory is not needed for anything else, 126 00:05:27,750 --> 00:05:30,090 your data will be kept in cache forever. 127 00:05:30,090 --> 00:05:32,850 At some point, your system will reach the state 128 00:05:32,850 --> 00:05:35,400 where the memory is needed for something else. 129 00:05:35,400 --> 00:05:38,550 And then the Linux kernel is going to clear data from cache, 130 00:05:38,550 --> 00:05:40,773 if it has not recently been used. 131 00:05:41,670 --> 00:05:43,830 Last, we can see the Swap parameter. 132 00:05:43,830 --> 00:05:47,400 The Swap parameter indicates the usage of Swap space. 133 00:05:47,400 --> 00:05:51,870 Swap space is emulated RAM on disk. 134 00:05:51,870 --> 00:05:54,420 So it's kind of overflow and it is useful 135 00:05:54,420 --> 00:05:57,120 in cases where you are completely running out of RAM 136 00:05:57,120 --> 00:06:00,150 because then memory pages can be moved to Swap 137 00:06:00,150 --> 00:06:03,183 and your system can still continue functioning. 
138 00:06:04,350 --> 00:06:06,060 Swap is a little bit slower 139 00:06:06,060 --> 00:06:07,530 but the Linux kernel is dealing 140 00:06:07,530 --> 00:06:09,660 with Swap in a very smart way. 141 00:06:09,660 --> 00:06:13,680 It'll only move inactive memory pages to Swap, 142 00:06:13,680 --> 00:06:15,090 if that is possible. 143 00:06:15,090 --> 00:06:17,910 And if you only have inactive memory pages in Swap 144 00:06:17,910 --> 00:06:19,980 then that just cleans up your RAM 145 00:06:19,980 --> 00:06:22,710 and you can use your RAM for more useful stuff. 146 00:06:22,710 --> 00:06:25,410 And that, in general, is considered to be good. 147 00:06:25,410 --> 00:06:27,570 As you can see, we have two gigabytes of Swap 148 00:06:27,570 --> 00:06:29,610 and two gigabytes of Swap are free 149 00:06:29,610 --> 00:06:31,170 and nothing is used. 150 00:06:31,170 --> 00:06:33,570 The last parameter is not about Swap. 151 00:06:33,570 --> 00:06:35,520 This is about total memory. 152 00:06:35,520 --> 00:06:38,040 So the total memory is set to 153 00:06:38,040 --> 00:06:40,803 more than two gigabytes of available memory. 154 00:06:41,700 --> 00:06:43,860 Now, the thing that is a little bit confusing here 155 00:06:43,860 --> 00:06:48,210 is the relation between free memory and available memory. 156 00:06:48,210 --> 00:06:49,680 Now, what is the relation? 157 00:06:49,680 --> 00:06:51,000 Free memory is memory 158 00:06:51,000 --> 00:06:54,060 that's currently not used for anything at all. 159 00:06:54,060 --> 00:06:56,940 You won't always see so much free memory 160 00:06:56,940 --> 00:07:01,260 because if your system has been up for a longer period 161 00:07:01,260 --> 00:07:03,843 then you'll have less free memory available. 162 00:07:04,680 --> 00:07:07,200 That is because memory will always be used 163 00:07:07,200 --> 00:07:08,970 for caching files. 
164 00:07:08,970 --> 00:07:11,460 Now in cache, the Linux kernel keeps track 165 00:07:11,460 --> 00:07:14,880 of memory pages that are allocated to cache 166 00:07:14,880 --> 00:07:17,190 that have not recently been used. 167 00:07:17,190 --> 00:07:19,710 That's what we call inactive cache. 168 00:07:19,710 --> 00:07:23,190 And available memory is the sum of inactive cache 169 00:07:23,190 --> 00:07:24,510 and free memory. 170 00:07:24,510 --> 00:07:26,070 The thing about inactive cache 171 00:07:26,070 --> 00:07:28,800 is that if the memory is needed by something else, 172 00:07:28,800 --> 00:07:32,310 well, you can just clear the inactive cache. 173 00:07:32,310 --> 00:07:33,143 And that is why 174 00:07:33,143 --> 00:07:35,943 we have 2.3 gigabytes of available memory. 175 00:07:36,780 --> 00:07:39,420 Right in the lowest part of the top interface, 176 00:07:39,420 --> 00:07:41,370 we can see these processes. 177 00:07:41,370 --> 00:07:43,320 For every process, the PID, 178 00:07:43,320 --> 00:07:45,570 the user that has started the process 179 00:07:45,570 --> 00:07:47,460 and some more information. 180 00:07:47,460 --> 00:07:50,490 The interesting information right here is the CPU percentage 181 00:07:50,490 --> 00:07:51,600 and the memory. 182 00:07:51,600 --> 00:07:55,440 That's the percentage of CPU that the process is using 183 00:07:55,440 --> 00:07:57,990 and the percentage of memory. 184 00:07:57,990 --> 00:08:00,757 And now we can see the names of these processes. 185 00:08:00,757 --> 00:08:03,090 Let me go out of top for now. 186 00:08:03,090 --> 00:08:04,143 We have seen enough.
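The relation between free and available memory described earlier can also be checked outside top. The following is a minimal Python sketch that parses text in the format of `/proc/meminfo`, where the Linux kernel reports `MemFree` and `MemAvailable` in kB. The helper names `parse_meminfo` and `reclaimable_kb` are hypothetical, and the sample values are round numbers loosely modelled on the figures mentioned in the video, not real output. The kernel's `MemAvailable` is an estimate, so the difference computed here is only an approximation of the cache that can be given back.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def reclaimable_kb(meminfo):
    """Available minus free: roughly the memory that could be reclaimed."""
    return meminfo["MemAvailable"] - meminfo["MemFree"]


# Illustrative sample only (an assumption, not output from a real system).
SAMPLE = """MemTotal:        4000000 kB
MemFree:         1200000 kB
MemAvailable:    2300000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("free kB:", info["MemFree"])
    print("available kB:", info["MemAvailable"])
    print("reclaimable kB (approx.):", reclaimable_kb(info))
```

On a real Linux system, the sample string could be replaced by the contents of `/proc/meminfo`, for example `open("/proc/meminfo").read()`.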