1 00:00:06,360 --> 00:00:07,900 - So what a mutation-based fuzzer does, 2 00:00:07,900 --> 00:00:09,050 I think it's fairly obvious, 3 00:00:09,050 --> 00:00:10,970 is it takes well-formatted input 4 00:00:10,970 --> 00:00:13,040 like a message, a string, maybe a file, 5 00:00:13,040 --> 00:00:14,720 and it mutates the data. 6 00:00:14,720 --> 00:00:17,550 And you don't need to know the structure of the input data 7 00:00:17,550 --> 00:00:20,230 for a mutation-based fuzzer, at least for very simple ones. 8 00:00:20,230 --> 00:00:21,840 For good code coverage you need 9 00:00:21,840 --> 00:00:24,410 a large collection of well-formed inputs. 10 00:00:24,410 --> 00:00:25,910 This is called a corpus. 11 00:00:25,910 --> 00:00:28,870 And you can get this information from the internet. 12 00:00:28,870 --> 00:00:31,860 You're fuzzing against a file, or maybe you're doing fuzzing 13 00:00:31,860 --> 00:00:33,650 against a certain packet format. 14 00:00:33,650 --> 00:00:36,610 You can just get PCAPs from the internet 15 00:00:36,610 --> 00:00:38,360 that are basically representations 16 00:00:38,360 --> 00:00:41,160 of the data that the process would use. 17 00:00:41,160 --> 00:00:43,340 And then you could mutate that data 18 00:00:43,340 --> 00:00:45,440 and use it as fuzzing test cases. 19 00:00:45,440 --> 00:00:48,100 And those mutations can include flipping bits. 20 00:00:48,100 --> 00:00:50,720 You can duplicate certain fields. 21 00:00:50,720 --> 00:00:52,910 You can randomize data that's in the message. 22 00:00:52,910 --> 00:00:54,730 And then you can use your imagination. 23 00:00:54,730 --> 00:00:57,490 There are many different ways you can mutate the data. 24 00:00:57,490 --> 00:00:59,100 And then the result of the mutation 25 00:00:59,100 --> 00:01:00,520 is to be fed into the target, 26 00:01:00,520 --> 00:01:02,720 and that is your anomalous message.
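The three mutations the speaker names (flipping bits, duplicating fields, randomizing data) can be sketched roughly as below. This is a minimal illustration and not any particular fuzzer's implementation: the function names and the `MUTATORS` tuple are made up for this example, and because a simple mutation fuzzer has no knowledge of field boundaries, "duplicating a field" is approximated by duplicating a random byte slice.

```python
import random


def flip_bits(data: bytes, rng: random.Random, count: int = 1) -> bytes:
    """Flip `count` randomly chosen bits (the same bit may be picked twice)."""
    buf = bytearray(data)
    if not buf:
        return bytes(buf)
    for _ in range(count):
        pos = rng.randrange(len(buf) * 8)
        buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)


def duplicate_chunk(data: bytes, rng: random.Random) -> bytes:
    """Copy a random slice and insert the copy right after the original."""
    if not data:
        return data
    start = rng.randrange(len(data))
    end = rng.randrange(start, len(data)) + 1
    return data[:end] + data[start:end] + data[end:]


def randomize_chunk(data: bytes, rng: random.Random, max_len: int = 4) -> bytes:
    """Overwrite a short random slice with random bytes, keeping the length."""
    if not data:
        return data
    start = rng.randrange(len(data))
    length = min(max_len, len(data) - start)
    noise = bytes(rng.randrange(256) for _ in range(length))
    return data[:start] + noise + data[start + length:]


# Hypothetical registry of mutation strategies for this sketch.
MUTATORS = (flip_bits, duplicate_chunk, randomize_chunk)


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Turn one well-formed corpus entry into one anomalous test case."""
    return rng.choice(MUTATORS)(seed, rng)
```

In use, each call such as `mutate(corpus_entry, random.Random())` produces one anomalous message to feed into the target; seeding the random generator makes a run reproducible.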
27 00:01:02,720 --> 00:01:04,420 And then, when you have a failure 28 00:01:04,420 --> 00:01:06,800 you have to have some way of detecting that failure, 29 00:01:06,800 --> 00:01:09,550 because sometimes you won't have a hard crash. 30 00:01:09,550 --> 00:01:11,230 Or maybe the process is alive. 31 00:01:11,230 --> 00:01:12,420 Maybe it's stuck, 32 00:01:12,420 --> 00:01:15,080 maybe it is stuck in an infinite loop. 33 00:01:15,080 --> 00:01:17,500 Maybe it no longer processes messages, 34 00:01:17,500 --> 00:01:20,440 but the process is waiting for something, basically. 35 00:01:20,440 --> 00:01:22,800 Waiting for a resource to be freed up. 36 00:01:22,800 --> 00:01:24,950 So you have to have some way of monitoring this. 37 00:01:24,950 --> 00:01:27,870 And typically, you would monitor the process directly 38 00:01:27,870 --> 00:01:29,210 and you could just see whether or not 39 00:01:29,210 --> 00:01:30,060 the process is running. 40 00:01:30,060 --> 00:01:32,780 Or you can see whether or not it's consuming CPU. 41 00:01:32,780 --> 00:01:35,730 If it consumes over 90% or 95% CPU 42 00:01:35,730 --> 00:01:39,460 that would be indicative of an infinite loop scenario. 43 00:01:39,460 --> 00:01:41,660 And then another way to detect failures, 44 00:01:41,660 --> 00:01:43,100 which I would recommend, 45 00:01:43,100 --> 00:01:45,010 is just to send a known good message. 46 00:01:45,010 --> 00:01:47,050 Rather than sending an anomalous message 47 00:01:47,050 --> 00:01:48,450 with modifications to it, 48 00:01:48,450 --> 00:01:50,060 you could just send one that's good. 49 00:01:50,060 --> 00:01:52,370 And if the process acts the way it should 50 00:01:52,370 --> 00:01:53,930 and responds maybe the way it should, 51 00:01:53,930 --> 00:01:54,870 then you know that it's fine. 52 00:01:54,870 --> 00:01:55,703 You can move on.
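The two checks described here, "is the process still running" and "does it still answer a known good message", might be combined along these lines. This is a sketch under assumptions: `check_target` and the `probe` callable are hypothetical names, and how the known good message is actually sent (socket, file, pipe) depends on the target, so it is left to the caller. The CPU-usage check is omitted because it usually needs a platform-specific API.

```python
import subprocess
from typing import Callable


def check_target(proc: subprocess.Popen, probe: Callable[[], bool]) -> str:
    """Classify the target's state after an anomalous message was sent.

    `probe` is a hypothetical callable supplied by the caller: it sends a
    known good message and returns True if the expected response arrives
    in time.
    """
    # poll() returns None while the process is alive, else its exit code.
    if proc.poll() is not None:
        return "crashed"
    try:
        healthy = probe()
    except OSError:
        # Connection refused, reset, or timed out all count as no response.
        healthy = False
    return "ok" if healthy else "unresponsive"
```

A result of "unresponsive" covers the cases the speaker lists where there is no hard crash: the process is alive but stuck in a loop or waiting on a resource.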
53 00:01:57,352 --> 00:01:58,310 And then if you don't have a response, 54 00:01:58,310 --> 00:02:00,410 then you might have a denial-of-service condition. 55 00:02:00,410 --> 00:02:01,820 Maybe it doesn't crash or anything, 56 00:02:01,820 --> 00:02:03,840 maybe it just went off to lunch. 57 00:02:03,840 --> 00:02:05,840 And then, so in the case of the crash, 58 00:02:05,840 --> 00:02:07,680 you have to restart the process 59 00:02:07,680 --> 00:02:09,500 and then note that information. 60 00:02:09,500 --> 00:02:11,320 And then the anomalous message 61 00:02:11,320 --> 00:02:12,660 and program crash information, 62 00:02:12,660 --> 00:02:14,030 you save it, like I said before, 63 00:02:14,030 --> 00:02:16,890 and then you would use that for further analysis. 64 00:02:16,890 --> 00:02:18,570 Generation-based fuzzers, 65 00:02:18,570 --> 00:02:20,700 these are your smart fuzzers. 66 00:02:20,700 --> 00:02:23,270 And those work with a data model of the protocol, 67 00:02:23,270 --> 00:02:25,800 and that data model has to be created in advance. 68 00:02:25,800 --> 00:02:29,360 So it requires a lot of hard work first 69 00:02:29,360 --> 00:02:31,610 before you can ever do any kind of fuzzing. 70 00:02:31,610 --> 00:02:34,590 Generation-based fuzzers are typically 71 00:02:34,590 --> 00:02:36,490 something you would do much later on 72 00:02:36,490 --> 00:02:38,140 after you set up mutation for a little while. 73 00:02:38,140 --> 00:02:41,100 You can do some generation fuzzing after you have a model. 74 00:02:41,100 --> 00:02:43,720 You can actually create a message from scratch. 75 00:02:43,720 --> 00:02:45,780 In other words, you're actually creating from the ground up 76 00:02:45,780 --> 00:02:47,630 a message based on the model. 77 00:02:47,630 --> 00:02:49,130 And then, in other times, 78 00:02:49,130 --> 00:02:50,500 you can just start from the corpus.
79 00:02:50,500 --> 00:02:53,750 And then use the model to modify the corpus, 80 00:02:53,750 --> 00:02:55,630 but also do those fix-ups that you need 81 00:02:55,630 --> 00:02:57,530 to make the protocol look the way it should 82 00:02:57,530 --> 00:02:59,690 so that the parser parses the message. 83 00:02:59,690 --> 00:03:01,070 And then modifications. 84 00:03:01,070 --> 00:03:03,840 You add to the data model for each anomalous test case. 85 00:03:03,840 --> 00:03:05,500 So you can start from one field 86 00:03:05,500 --> 00:03:07,632 and you can go to another field. 87 00:03:07,632 --> 00:03:09,900 And then the data model may include fix-ups 88 00:03:09,900 --> 00:03:10,760 that you have to do. 89 00:03:10,760 --> 00:03:13,200 And those could be intelligently made 90 00:03:13,200 --> 00:03:16,200 by the fuzzer based on the data model itself. 91 00:03:16,200 --> 00:03:17,033 And those could be things 92 00:03:17,033 --> 00:03:19,423 like length fields, for instance, or checksums. 93 00:03:20,670 --> 00:03:24,090 And well, I found that generation-based fuzzers 94 00:03:24,090 --> 00:03:26,920 do provide better results for code 95 00:03:26,920 --> 00:03:28,480 that maybe is not as mature. 96 00:03:28,480 --> 00:03:31,230 If the code has been around for a long time, 97 00:03:31,230 --> 00:03:34,420 you may find that generation fuzzers may not find anything. 98 00:03:34,420 --> 00:03:35,500 It's kind of a, you know, 99 00:03:35,500 --> 00:03:38,040 you really won't know until you try. 100 00:03:38,040 --> 00:03:41,150 So to move on to the newer generation of fuzzers, 101 00:03:41,150 --> 00:03:42,360 the evolutionary fuzzers. 102 00:03:42,360 --> 00:03:45,140 This is where a lot of research is happening. 103 00:03:45,140 --> 00:03:47,860 And these actually generate test cases based 104 00:03:47,860 --> 00:03:50,360 on looking at the program itself.
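To make the length-field and checksum fix-ups from the generation-based discussion concrete, here is a small sketch. The wire format (2-byte big-endian length, payload, 4-byte CRC32) is invented for this example, and `build_message` and `parse_message` are hypothetical stand-ins for a data model and the target's parser.

```python
import struct
import zlib


def build_message(payload: bytes) -> bytes:
    """Serialize a payload as: length (2 bytes) | payload | CRC32 (4 bytes).

    Building from the model applies the fix-ups automatically, because the
    length and checksum are always recomputed from the payload.
    """
    body = struct.pack(">H", len(payload)) + payload
    return body + struct.pack(">I", zlib.crc32(body))


def parse_message(message: bytes) -> bytes:
    """Stand-in for the target's parser: reject a bad length or checksum."""
    if len(message) < 6:
        raise ValueError("message too short")
    body = message[:-4]
    (length,) = struct.unpack(">H", body[:2])
    (crc,) = struct.unpack(">I", message[-4:])
    if length != len(body) - 2:
        raise ValueError("length mismatch")
    if crc != zlib.crc32(body):
        raise ValueError("bad checksum")
    return body[2:]


good = build_message(b"hello")
anomalous_payload = b"hello" + b"A" * 10

# Naive mutation: the payload changes but the length and CRC are stale,
# so the parser rejects the message before the payload is ever examined.
naive = good[:2] + anomalous_payload + good[-4:]

# Model-based generation: same anomalous payload, fix-ups applied,
# so the message gets past the parser's sanity checks.
fixed = build_message(anomalous_payload)
```

The design point is that without fix-ups most test cases die at the first validity check, which is why a model that knows about dependent fields can reach deeper code.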
105 00:03:50,360 --> 00:03:52,900 So you have another process that runs 106 00:03:52,900 --> 00:03:54,000 alongside the program, 107 00:03:54,000 --> 00:03:57,190 and it actually watches what's happening with the program. 108 00:03:57,190 --> 00:04:00,040 So you can see what the code coverage is. 109 00:04:00,040 --> 00:04:02,040 You usually start with something very small, 110 00:04:02,040 --> 00:04:04,360 a very small seed corpus to get you into that state 111 00:04:04,360 --> 00:04:06,960 to where you can start walking through the program, 112 00:04:06,960 --> 00:04:10,210 and then figure out what you want to do from there. 113 00:04:10,210 --> 00:04:13,410 And it aims for more complete coverage. 114 00:04:13,410 --> 00:04:17,370 And you also want to eliminate redundant test cases 115 00:04:17,370 --> 00:04:19,040 that are testing the exact same code. 116 00:04:19,040 --> 00:04:21,050 There's really no point to running the same code 117 00:04:21,050 --> 00:04:22,730 over and over again 118 00:04:22,730 --> 00:04:24,720 just to look for those anomalous test cases. 119 00:04:24,720 --> 00:04:26,930 And so it's much quicker to set up 120 00:04:26,930 --> 00:04:30,480 and it may give you better results than 121 00:04:30,480 --> 00:04:32,883 either mutation or generation-based fuzzing. 122 00:04:34,330 --> 00:04:35,505 But there are some downsides to running 123 00:04:35,505 --> 00:04:37,840 with an evolutionary fuzzer. 124 00:04:37,840 --> 00:04:40,830 In some cases, the program has to be written a certain way. 125 00:04:40,830 --> 00:04:43,340 It has to be written in a certain language. 126 00:04:43,340 --> 00:04:45,070 It has to be that you have the source code available, 127 00:04:45,070 --> 00:04:47,490 so you can recompile the program 128 00:04:47,490 --> 00:04:49,660 so that you could instrument it 129 00:04:49,660 --> 00:04:52,680 and actually listen to what's going on in the program.
130 00:04:52,680 --> 00:04:55,650 And so you may need code available 131 00:04:55,650 --> 00:04:57,230 and you may not have that code available. 132 00:04:57,230 --> 00:04:59,530 So maybe that's something you can't do. 133 00:04:59,530 --> 00:05:03,110 There are evolutionary fuzzers that do work with QEMU. 134 00:05:03,110 --> 00:05:05,150 One of the examples is American fuzzy lop. 135 00:05:05,150 --> 00:05:07,870 It's an experimental feature, but you can try it. 136 00:05:07,870 --> 00:05:10,650 But this is definitely something to try, 137 00:05:10,650 --> 00:05:11,970 simply because, like I said, 138 00:05:11,970 --> 00:05:15,480 you really want to always be fuzzing when you have a target, 139 00:05:15,480 --> 00:05:17,900 and an evolutionary fuzzer gives you a quick way 140 00:05:17,900 --> 00:05:18,853 to get started.
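The evolutionary idea (watch the program's coverage, start from a tiny seed, keep only inputs that reach new code) can be sketched in a few lines. This is a toy illustration, not how American fuzzy lop is implemented: real tools instrument compiled code, whereas this sketch uses Python's `sys.settrace` to record executed lines, and `target`, `run_with_coverage`, `mutate`, and `evolve` are all hypothetical names made up for the example.

```python
import random
import sys


def target(data: bytes) -> None:
    """Toy target (hypothetical): fails only on inputs starting with b'FUZ'."""
    if len(data) > 0 and data[0] == ord("F"):
        if len(data) > 1 and data[1] == ord("U"):
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("bug reached")


def run_with_coverage(data: bytes):
    """Run the target; return (set of executed line numbers, error or None)."""
    lines = set()

    def tracer(frame, event, arg):
        # Only trace the target function's own frame.
        if frame.f_code is target.__code__:
            if event == "line":
                lines.add(frame.f_lineno)
            return tracer
        return None

    error = None
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        target(data)
    except RuntimeError as exc:
        error = exc
    finally:
        sys.settrace(previous)
    return lines, error


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Either overwrite one random byte or append one random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def evolve(seed: bytes, iterations: int, rng: random.Random):
    """Coverage-guided loop: return (crashing input or None, final corpus)."""
    corpus = [seed]
    seen, _ = run_with_coverage(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        lines, error = run_with_coverage(candidate)
        if error is not None:
            return candidate, corpus
        if not lines <= seen:
            # New code was reached: keep this input as a future parent.
            seen |= lines
            corpus.append(candidate)
        # Otherwise the input is redundant (same code) and is discarded.
    return None, corpus
```

Calling `evolve(b"", 200_000, random.Random())` starts from an empty seed; each kept corpus entry gets one comparison deeper into the target, which is why coverage feedback finds the nested condition far sooner than blind random mutation would.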