1 00:00:06,429 --> 00:00:07,262 - In this video 2 00:00:07,262 --> 00:00:10,710 we are going to talk about compression utilities. 3 00:00:10,710 --> 00:00:12,750 There are a couple of compression utilities 4 00:00:12,750 --> 00:00:16,890 that are commonly used in a Linux environment. 5 00:00:16,890 --> 00:00:19,480 To start with, there is gzip. 6 00:00:19,480 --> 00:00:23,970 If you want to use gzip in tar, you use the lowercase -z option. 7 00:00:23,970 --> 00:00:26,305 This is the most common compression utility 8 00:00:26,305 --> 00:00:30,300 because it gives a nice balance between compression ratio 9 00:00:30,300 --> 00:00:32,613 and the time that it takes to create the archive. 10 00:00:33,480 --> 00:00:37,950 There is bzip2, or lowercase -j if you use it in tar; 11 00:00:37,950 --> 00:00:39,690 that's an alternative utility 12 00:00:39,690 --> 00:00:43,350 which offers better compression, but it takes longer. 13 00:00:43,350 --> 00:00:45,401 And there is xz, 14 00:00:45,401 --> 00:00:50,401 or -J, uppercase J, in tar, a relatively 15 00:00:50,910 --> 00:00:53,190 new utility that offers the best compression of the three 16 00:00:53,190 --> 00:00:55,501 but is significantly slower. 17 00:00:55,501 --> 00:00:57,510 And also, for those of you who know it 18 00:00:57,510 --> 00:01:00,300 from other environments, there is the zip utility, 19 00:01:00,300 --> 00:01:03,570 and the zip utility has a Windows-compatible syntax. 20 00:01:03,570 --> 00:01:05,910 The zip utility mainly is something from history. 21 00:01:05,910 --> 00:01:08,160 You don't really need it anymore nowadays. 22 00:01:08,160 --> 00:01:11,272 And that is because Windows nowadays also 23 00:01:11,272 --> 00:01:16,020 reads Linux-native compression formats like gzip. 24 00:01:16,020 --> 00:01:17,310 If you have a gzip file 25 00:01:17,310 --> 00:01:19,563 you can easily read that on Windows as well.
26 00:01:20,970 --> 00:01:21,990 Now, the main difference 27 00:01:21,990 --> 00:01:24,810 between these compression utilities is the ratio 28 00:01:24,810 --> 00:01:27,330 and the time it takes to compress, 29 00:01:27,330 --> 00:01:30,780 and you can use the time command to measure how long it takes. 30 00:01:30,780 --> 00:01:33,873 And well, let me show you how to use it. 31 00:01:36,120 --> 00:01:39,296 So to start with, I want to get back to 32 00:01:39,296 --> 00:01:42,330 this first tar command that we have used. 33 00:01:42,330 --> 00:01:44,400 I'm going to modify it a little bit. 34 00:01:44,400 --> 00:01:46,080 The first thing that I'm going to modify: 35 00:01:46,080 --> 00:01:47,580 I am going to compress. 36 00:01:47,580 --> 00:01:51,368 So I want that to be easily seen from 37 00:01:51,368 --> 00:01:56,368 the extension. You can go two ways for the extension. 38 00:01:56,460 --> 00:02:00,000 I'm using .tar.gz, which is totally acceptable. 39 00:02:00,000 --> 00:02:03,300 You could also use .tgz as the extension, 40 00:02:03,300 --> 00:02:04,400 which would also work. 41 00:02:05,250 --> 00:02:07,530 The next thing that we need to change is the 42 00:02:07,530 --> 00:02:10,530 compression option, and this compression option is the lowercase z 43 00:02:10,530 --> 00:02:11,880 that I'm going to add. 44 00:02:11,880 --> 00:02:13,770 And then the final thing that I wanna do: 45 00:02:13,770 --> 00:02:16,390 I want to put the time command in front of it, 46 00:02:16,390 --> 00:02:19,170 because there are two things that really 47 00:02:19,170 --> 00:02:22,320 matter regarding compression, 48 00:02:22,320 --> 00:02:24,300 and those are the compression ratio 49 00:02:24,300 --> 00:02:26,070 but also the time that it takes. 50 00:02:26,070 --> 00:02:28,560 So let's record the time so 51 00:02:28,560 --> 00:02:31,203 that we can actually see how much time it is taking. 52 00:02:34,560 --> 00:02:35,393 There we go.
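As a rough sketch, the command built up in this part could look like the following. The earlier tar command is not shown in this part of the video, so the source directory and archive path here are placeholders, and the script creates a small throwaway tree (bash is assumed) just to be runnable on its own. Note that the gzip option is the lowercase -z.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the video archives a different, much larger tree.
demo=$(mktemp -d)
mkdir "$demo/src"
seq 1 200000 > "$demo/src/numbers.txt"

# time -> report how long the command that follows takes (real/user/sys)
# -c   -> create an archive
# -z   -> compress it with gzip
# -f   -> archive file name; .tar.gz (or .tgz) is only a naming convention
time tar -czf "$demo/files.tar.gz" -C "$demo" src

ls -l "$demo/files.tar.gz"
```

The timing is printed by the shell after tar finishes; the "real" value is the wall-clock time that the video compares between runs.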
53 00:02:35,393 --> 00:02:38,040 As you can see, that was a reasonable eight seconds. 54 00:02:38,040 --> 00:02:39,930 Okay. That was take one. 55 00:02:39,930 --> 00:02:40,800 Let's do take two. 56 00:02:40,800 --> 00:02:41,670 And in take two, 57 00:02:41,670 --> 00:02:46,670 we are creating a bz2-compressed file. 58 00:02:47,228 --> 00:02:49,740 The extension here really doesn't matter 59 00:02:49,740 --> 00:02:51,660 because it's just to inform your users. 60 00:02:51,660 --> 00:02:54,450 So use any extension that you want. 61 00:02:54,450 --> 00:02:58,751 I am going to use lowercase j; j in tar is the option for bz2 62 00:02:58,751 --> 00:03:02,569 compression, and for the rest of it 63 00:03:02,569 --> 00:03:04,500 we don't need to change anything. 64 00:03:04,500 --> 00:03:07,170 I want to create an archive of exactly the same files. 65 00:03:07,170 --> 00:03:09,873 And again, we are going to record the time. 66 00:03:10,800 --> 00:03:12,990 We have just seen eight seconds in total. 67 00:03:12,990 --> 00:03:15,903 So I'm wondering, how much time is this going to take? 68 00:03:19,920 --> 00:03:24,093 And as you can see, more than double: 22 seconds in total. 69 00:03:25,295 --> 00:03:26,370 All right, of course you wanna know 70 00:03:26,370 --> 00:03:28,920 about the compression ratio as well. 71 00:03:28,920 --> 00:03:31,680 We are going to find out in just a little bit, 72 00:03:31,680 --> 00:03:36,680 but first I want to do my third run, the xz utility. 73 00:03:37,260 --> 00:03:39,343 So again, I'm changing the extension 74 00:03:39,343 --> 00:03:42,870 and we are using uppercase J to make sure 75 00:03:42,870 --> 00:03:45,570 that we are using the xz utility. 76 00:03:45,570 --> 00:03:48,210 And there we go, running time again, 77 00:03:48,210 --> 00:03:50,760 and let's figure out how long it takes. 78 00:03:50,760 --> 00:03:53,103 As you can see, compressing 79 00:03:54,056 --> 00:03:55,560 this file really takes a lot longer.
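For takes two and three, only the option letter and the extension change. A minimal sketch, again on made-up sample data and assuming bash, with bzip2 and xz installed (runs are skipped if the tool is missing):

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for the files archived in the video.
demo=$(mktemp -d)
mkdir "$demo/src"
seq 1 200000 > "$demo/src/numbers.txt"

# Each entry: tar option letter, extension, tool that tar relies on.
#   lowercase j -> bzip2 (.tar.bz2), uppercase J -> xz (.tar.xz)
for run in "j bz2 bzip2" "J xz xz"; do
  read -r opt ext tool <<< "$run"
  if command -v "$tool" > /dev/null; then
    echo "== tar -c${opt}f files.tar.$ext =="
    time tar "-c${opt}f" "$demo/files.tar.$ext" -C "$demo" src
  else
    echo "skipping .$ext: $tool is not installed"
  fi
done

ls -l "$demo"
```

On a sample this small the timings are close together; the gap seen in the video only shows up on larger input.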
80 00:03:55,560 --> 00:03:57,145 And that is because 81 00:03:57,145 --> 00:03:59,490 of the compression algorithm that is used. 82 00:03:59,490 --> 00:04:01,650 But the thing that matters, of course, 83 00:04:01,650 --> 00:04:04,140 is what the compression ratio is. 84 00:04:04,140 --> 00:04:06,440 And we are going to find out in just a second. 85 00:04:13,290 --> 00:04:14,370 So what do we see? 86 00:04:14,370 --> 00:04:15,203 Oh, wow. 87 00:04:15,203 --> 00:04:17,065 One minute and 10 seconds. 88 00:04:17,065 --> 00:04:20,040 That is, again, more than three times 89 00:04:20,040 --> 00:04:23,429 the amount of time that the bz2 utility took. 90 00:04:23,429 --> 00:04:27,150 Another thing that we really wanna know is: what 91 00:04:27,150 --> 00:04:28,620 about the compression ratio? 92 00:04:28,620 --> 00:04:33,360 So ls -l on /tmp/files.tar* 93 00:04:34,597 --> 00:04:36,030 is showing us 94 00:04:36,030 --> 00:04:41,030 that the original archive was 339 megabytes. 95 00:04:41,970 --> 00:04:44,709 Then we have the gzip archive, which is, well, 96 00:04:44,709 --> 00:04:46,710 shall we say a hundred? 97 00:04:46,710 --> 00:04:49,740 The bz2 one is about 90. 98 00:04:49,740 --> 00:04:54,740 And the xz-compressed archive is only 78 megabytes. 99 00:04:57,750 --> 00:05:02,750 So for about 20% more effective compression, you need to wait, 100 00:05:03,108 --> 00:05:06,000 well, about 10 times as long; 101 00:05:06,000 --> 00:05:08,943 that is the difference between gzip and xz. 102 00:05:10,320 --> 00:05:11,153 It's up to you 103 00:05:11,153 --> 00:05:13,170 what best fits your needs. 104 00:05:13,170 --> 00:05:15,244 These are the different utilities. 105 00:05:15,244 --> 00:05:17,220 Now, you can use these utilities 106 00:05:17,220 --> 00:05:18,540 on the command line as well, 107 00:05:18,540 --> 00:05:21,843 to compress as well as to extract.
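To put the figures from this run side by side, here is a small sketch that turns the sizes quoted in the video into percentages of the original archive. The numbers are the approximate ones read out on screen (the gzip size in particular was rounded to 100), and the helper name pct is made up for this example.

```shell
#!/usr/bin/env bash
# pct PART WHOLE -> PART as a percentage of WHOLE, one decimal place.
# (pct is a hypothetical helper, not a standard command.)
pct() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

original=339   # MB, size of the uncompressed tar archive

# Columns: name, size in MB, seconds (rounded figures from the video)
while read -r name size secs; do
  echo "$name: $size MB is $(pct "$size" "$original")% of the original, took ${secs}s"
done <<'EOF'
gzip 100 8
bzip2 90 22
xz 78 70
EOF
```

With these rounded figures the archives come out at roughly 29.5%, 26.5% and 23.0% of the original size. Compared with gzip, xz is about 22% smaller (78 versus 100 MB) and takes 70 / 8 = 8.75 times as long.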
108 00:05:23,070 --> 00:05:26,100 So if I use gzip, for example, 109 00:05:26,100 --> 00:05:30,040 on /tmp/files.tar, 110 00:05:30,040 --> 00:05:34,099 then we can see that it automatically wants 111 00:05:34,099 --> 00:05:36,900 to create a file with the .gz extension. 112 00:05:36,900 --> 00:05:40,950 And it's kindly asking if I want to overwrite, and well, 113 00:05:40,950 --> 00:05:43,740 I don't want to overwrite. You get the idea. 114 00:05:43,740 --> 00:05:48,740 Gzip will compress and gunzip will do the decompression. 115 00:05:49,740 --> 00:05:52,740 And likewise, you can use bzip2 and xz 116 00:05:52,740 --> 00:05:54,510 from the command line as well. 117 00:05:54,510 --> 00:05:56,580 But most of the time you will find yourself 118 00:05:56,580 --> 00:05:58,500 using the compression utilities 119 00:05:58,500 --> 00:06:01,053 as an option to the tar command.
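A short sketch of using gzip and gunzip directly, outside of tar. The file names are placeholders in a temporary directory rather than the /tmp/files.tar from the video, so nothing already exists and gzip has no reason to ask about overwriting.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the uncompressed archive used in the video.
demo=$(mktemp -d)
seq 1 200000 > "$demo/numbers.txt"
tar -cf "$demo/files.tar" -C "$demo" numbers.txt

gzip "$demo/files.tar"        # replaces files.tar with files.tar.gz
ls -l "$demo"

gunzip "$demo/files.tar.gz"   # restores files.tar and removes the .gz file
ls -l "$demo"

# To keep the original, write the compressed stream to a separate file.
gzip -c "$demo/files.tar" > "$demo/copy.tar.gz"
ls -l "$demo"
```

If the target .gz file already exists, as it did in the video, gzip asks before overwriting it. The bzip2/bunzip2 and xz/unxz pairs work the same way with their own extensions.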