1 00:00:06,430 --> 00:00:07,840 - So let's talk a little bit about fuzzing. 2 00:00:07,840 --> 00:00:09,900 We talked about buffer overflows before 3 00:00:09,900 --> 00:00:12,460 and you know what they were and how to exploit them. 4 00:00:12,460 --> 00:00:14,720 Now we're gonna show how to find vulnerabilities 5 00:00:14,720 --> 00:00:15,990 like buffer overflows, 6 00:00:15,990 --> 00:00:18,200 and one of your primary tools, 7 00:00:18,200 --> 00:00:19,970 aside from things like code inspection, 8 00:00:19,970 --> 00:00:22,810 where you're looking at the source code, is fuzzing. 9 00:00:22,810 --> 00:00:23,820 Fuzzing is testing 10 00:00:23,820 --> 00:00:27,100 where you feed a program unexpected data, 11 00:00:27,100 --> 00:00:29,800 and that data could be malformed data. 12 00:00:29,800 --> 00:00:31,290 It could be unexpected data, you know, 13 00:00:31,290 --> 00:00:33,120 it could be expecting an IPv4 address 14 00:00:33,120 --> 00:00:36,200 and you feed it an IPv6 address, for example. 15 00:00:36,200 --> 00:00:39,480 And it'll take action based on that input 16 00:00:39,480 --> 00:00:40,840 and it'll do something with it. 17 00:00:40,840 --> 00:00:44,220 And sometimes you could cause the program to crash. 18 00:00:44,220 --> 00:00:46,730 And so fuzzing is also called negative testing. 19 00:00:46,730 --> 00:00:48,620 You also hear it called robustness testing. 20 00:00:48,620 --> 00:00:49,453 And then, in other words, 21 00:00:49,453 --> 00:00:51,830 you're testing how robust the program is 22 00:00:51,830 --> 00:00:54,120 to certain kinds of input 23 00:00:54,120 --> 00:00:57,410 and this unexpected data, like I said, it could be invalid. 24 00:00:57,410 --> 00:00:59,500 It could be just random data. 25 00:00:59,500 --> 00:01:00,700 It could just be garbage 26 00:01:00,700 --> 00:01:02,500 that comes from a random number generator 27 00:01:02,500 --> 00:01:03,520 that you feed to it.
28 00:01:03,520 --> 00:01:05,410 And you see exactly what happens 29 00:01:05,410 --> 00:01:09,023 and what'll happen is occasionally the program will fail. 30 00:01:09,023 --> 00:01:10,950 It will hang sometimes. 31 00:01:10,950 --> 00:01:12,710 In other words, it'll get into an infinite loop 32 00:01:12,710 --> 00:01:15,230 and just consume a hundred percent CPU. 33 00:01:15,230 --> 00:01:16,820 And in that case what you'll do 34 00:01:16,820 --> 00:01:18,610 is you monitor the program. 35 00:01:18,610 --> 00:01:21,650 And if the program crashes or goes into that state 36 00:01:21,650 --> 00:01:25,100 you'll see what message you sent to the program. 37 00:01:25,100 --> 00:01:26,560 And so in other words, you know the input 38 00:01:26,560 --> 00:01:27,800 that was being sent to the program 39 00:01:27,800 --> 00:01:29,110 and you say, look, something happened, 40 00:01:29,110 --> 00:01:30,680 let's figure out what happened. 41 00:01:30,680 --> 00:01:33,850 And in other cases the program could crash. 42 00:01:33,850 --> 00:01:34,760 But when you crash, 43 00:01:34,760 --> 00:01:38,360 it's important to know, am I crashing in the same location? 44 00:01:38,360 --> 00:01:40,670 Is it, you know, is the input maybe different 45 00:01:40,670 --> 00:01:43,980 but does it still make me crash in the same place? 46 00:01:43,980 --> 00:01:46,100 Are these crashes exploitable? 47 00:01:46,100 --> 00:01:48,240 I mean, if it's a null pointer dereference, 48 00:01:48,240 --> 00:01:50,610 that may not be as interesting. 49 00:01:50,610 --> 00:01:52,110 When I say null pointer dereference, 50 00:01:52,110 --> 00:01:55,110 that means I'm actually reading from address zero 51 00:01:55,110 --> 00:01:57,970 which is not mapped into a process typically. 52 00:01:57,970 --> 00:02:00,210 You wanna know whether or not a particular vulnerability 53 00:02:00,210 --> 00:02:01,600 is exploitable.
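To make the monitoring idea concrete, here is a minimal sketch of feeding random bytes to a program, recording the input that caused each failure, and grouping failures by where they happened. Everything in it is an assumption for illustration: `parse_record` is a hypothetical toy target, a Python exception stands in for a native crash, and hang detection (which would need a timeout around the target) is left out.

```python
import random
import traceback
from collections import defaultdict


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: a 1-byte length prefix followed by a payload."""
    length = data[0]                  # raises IndexError on empty input
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def crash_site(exc: BaseException) -> str:
    """Name a failure by its exception type and the innermost line that raised."""
    frame = traceback.extract_tb(exc.__traceback__)[-1]
    return f"{type(exc).__name__} at {frame.name}:{frame.lineno}"


def fuzz(target, iterations: int = 2000, seed: int = 0) -> dict:
    """Feed random inputs to target and bucket the failing inputs by crash site."""
    rng = random.Random(seed)
    buckets = defaultdict(list)       # crash site -> inputs that triggered it
    for _ in range(iterations):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception as exc:
            # Keep the exact input so the failure can be reproduced later.
            buckets[crash_site(exc)].append(data)
    return dict(buckets)


if __name__ == "__main__":
    for site, inputs in fuzz(parse_record).items():
        print(f"{site}: {len(inputs)} inputs, first = {inputs[0]!r}")
```

Many different inputs land in the same bucket here, which is the point of asking "am I crashing in the same location?": each bucket is one bug to triage, not one bug per input.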
54 00:02:01,600 --> 00:02:03,850 And if the program crashes 55 00:02:03,850 --> 00:02:07,620 and EIP is set to something you control, 56 00:02:07,620 --> 00:02:09,290 then that's a great day, right? 57 00:02:09,290 --> 00:02:11,950 You found a buffer overflow vulnerability on the stack 58 00:02:11,950 --> 00:02:15,110 and you have direct control of the instruction pointer. 59 00:02:15,110 --> 00:02:16,890 And so there are different types of fuzzers. 60 00:02:16,890 --> 00:02:20,690 We'll start with the dumb or mutation-based fuzzers. 61 00:02:20,690 --> 00:02:24,560 And what these do is these take a known good, 62 00:02:24,560 --> 00:02:26,510 well-formed message, 63 00:02:26,510 --> 00:02:29,770 and you somehow change it or massage it, 64 00:02:29,770 --> 00:02:33,180 so that it's changed a little bit and maybe corrupted. 65 00:02:33,180 --> 00:02:36,890 And that collection of known good messages 66 00:02:36,890 --> 00:02:38,500 is called a corpus. 67 00:02:38,500 --> 00:02:43,060 Now let's say you're fuzzing a JPEG file, which is an image. 68 00:02:43,060 --> 00:02:46,620 And so you would want to have a collection of images 69 00:02:46,620 --> 00:02:49,700 that exercise different parts of the code. 70 00:02:49,700 --> 00:02:52,600 You have smart or generation-based fuzzers. 71 00:02:52,600 --> 00:02:54,570 And these have a data model 72 00:02:54,570 --> 00:02:58,039 that you have to program ahead of time into the fuzzer 73 00:02:58,039 --> 00:03:00,650 so that the fuzzer can create the message 74 00:03:00,650 --> 00:03:02,610 sometimes from scratch. 75 00:03:02,610 --> 00:03:06,870 And then further along, you have evolutionary-based fuzzers, 76 00:03:06,870 --> 00:03:10,830 and those fuzzers actually monitor the program itself, 77 00:03:10,830 --> 00:03:13,880 and they glean heuristics from the program 78 00:03:13,880 --> 00:03:16,350 to figure out if you're exercising the code 79 00:03:16,350 --> 00:03:17,560 that you wanna exercise.
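The difference between the mutation-based and generation-based approaches can be sketched side by side. This is only an illustration under stated assumptions: the message format (a `b"IM"` magic, one byte each for width and height, then pixel bytes) is made up for the example, and `CORPUS`, `mutate`, and `generate` are hypothetical names rather than part of any real fuzzer.

```python
import random

# Hypothetical corpus of known-good messages in a made-up image format:
# 2-byte magic b"IM", 1-byte width, 1-byte height, then width*height pixel bytes.
CORPUS = [
    b"IM" + bytes([2, 2]) + bytes(4),
    b"IM" + bytes([1, 3]) + b"\xff\x00\x7f",
]


def mutate(seed_input: bytes, rng: random.Random, max_changes: int = 3) -> bytes:
    """Mutation-based: lightly corrupt a well-formed message without knowing its format."""
    data = bytearray(seed_input)
    for _ in range(rng.randint(1, max_changes)):
        choice = rng.choice(("flip", "replace", "insert", "delete"))
        pos = rng.randrange(len(data)) if data else 0
        if choice == "flip" and data:
            data[pos] ^= 1 << rng.randrange(8)    # flip a single bit
        elif choice == "replace" and data:
            data[pos] = rng.randrange(256)        # overwrite one byte
        elif choice == "insert":
            data.insert(pos, rng.randrange(256))  # grow the message by one byte
        elif choice == "delete" and len(data) > 1:
            del data[pos]                         # shrink the message by one byte
    return bytes(data)


def generate(rng: random.Random) -> bytes:
    """Generation-based: build a message from the data model, favoring boundary values."""
    width = rng.choice((0, 1, 16, 255))
    height = rng.choice((0, 1, 16, 255))
    # Half the time, lie about the size so the header and payload disagree.
    pixels = rng.randrange(0, 64) if rng.random() < 0.5 else width * height
    body = bytes(rng.randrange(256) for _ in range(pixels))
    return b"IM" + bytes([width, height]) + body


if __name__ == "__main__":
    rng = random.Random(0)
    print(mutate(rng.choice(CORPUS), rng))
    print(generate(rng)[:8])
```

The mutator needs only good samples, so it is cheap to set up but depends on the corpus for reach; the generator needs the format written down ahead of time, but it can aim directly at fields such as the width and height.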
80 00:03:17,560 --> 00:03:21,250 So it's constantly checking whether or not, 81 00:03:21,250 --> 00:03:22,730 you know, you're actually running the code 82 00:03:22,730 --> 00:03:23,923 that you expect to run.
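One way to picture that feedback loop is a small coverage-guided fuzzer: run the target, record which lines executed, and keep any input that reached a line not seen before so later mutations build on it. The sketch below makes several assumptions: `target` is a hypothetical toy function with nested checks, Python's `sys.settrace` line events stand in for the instrumentation a real evolutionary fuzzer would use, and the single-byte mutation is deliberately simplistic.

```python
import random
import sys


def target(data: bytes) -> None:
    """Hypothetical toy target: nested checks that purely random input rarely passes."""
    if len(data) > 0 and data[0] == ord("F"):
        if len(data) > 1 and data[1] == ord("U"):
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("bug reached")


def run_with_coverage(func, data: bytes):
    """Run func(data); return (set of executed line numbers, exception or None)."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None               # do not trace other functions
        if event == "line":
            covered.add(frame.f_lineno)
        return tracer

    error = None
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(data)
    except Exception as exc:
        error = exc
    finally:
        sys.settrace(previous)        # restore whatever tracer was active before
    return covered, error


def fuzz(iterations: int = 200_000, seed: int = 0):
    """Return (crashing input or None, corpus of inputs that found new coverage)."""
    rng = random.Random(seed)
    corpus = [b"AAAA"]
    seen, _ = run_with_coverage(target, corpus[0])
    for _ in range(iterations):
        child = bytearray(rng.choice(corpus))
        child[rng.randrange(len(child))] = rng.randrange(256)  # mutate one byte
        child = bytes(child)
        covered, error = run_with_coverage(target, child)
        if error is not None:
            return child, corpus
        if not covered <= seen:       # reached at least one new line: keep it
            seen |= covered
            corpus.append(child)
    return None, corpus


if __name__ == "__main__":
    crash, corpus = fuzz()
    print("crashing input:", crash)
    print("corpus:", corpus)
```

Without the coverage check, a single run would need all three bytes right at once; keeping each input that gets one check further lets the fuzzer solve the checks one at a time.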