1 00:00:06,493 --> 00:00:07,956 - So, let's talk a little bit about fuzzing. 2 00:00:07,956 --> 00:00:09,907 We talked about buffer overflows before 3 00:00:09,907 --> 00:00:12,503 and you know what they are and how to exploit them. 4 00:00:12,503 --> 00:00:13,931 Now, we're gonna show how to find 5 00:00:13,931 --> 00:00:16,176 vulnerabilities like buffer overflows, 6 00:00:16,176 --> 00:00:17,812 and one of your primary tools, 7 00:00:17,812 --> 00:00:19,933 aside from things like code inspection 8 00:00:19,933 --> 00:00:22,823 where you're looking at the source code, is fuzzing. 9 00:00:22,823 --> 00:00:25,548 Fuzzing is testing where you feed a program 10 00:00:25,548 --> 00:00:28,636 unexpected data, and that data could be 11 00:00:28,636 --> 00:00:31,155 malformed data, or it could be 12 00:00:31,155 --> 00:00:33,111 that the program is expecting an IPv4 address 13 00:00:33,111 --> 00:00:36,628 and you feed it an IPv6 address, for example. 14 00:00:36,628 --> 00:00:39,126 It will take action based on that input 15 00:00:39,126 --> 00:00:40,831 and it will do something with it, 16 00:00:40,831 --> 00:00:44,374 and sometimes you could cause the program to crash. 17 00:00:44,374 --> 00:00:46,739 Fuzzing is also called negative testing. 18 00:00:46,739 --> 00:00:48,686 You also hear it called robustness testing. 19 00:00:48,686 --> 00:00:50,717 In other words, you're testing how robust 20 00:00:50,717 --> 00:00:53,928 the program is to certain kinds of input. 21 00:00:53,928 --> 00:00:55,914 This unexpected data, like I say, 22 00:00:55,914 --> 00:00:59,616 it could be invalid, it could be just random data, 23 00:00:59,616 --> 00:01:01,371 it could just be garbage that comes from 24 00:01:01,371 --> 00:01:03,770 a random number generator that you feed to it, 25 00:01:03,770 --> 00:01:05,717 and you see exactly what happens. 26 00:01:05,717 --> 00:01:09,009 What will happen is, occasionally the program will fail.
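The idea of feeding a program random garbage and watching for failures can be sketched in a few lines. This is only an illustration under stated assumptions: `parse_ipv4` is a hypothetical, deliberately fragile target written for this example, and `run_one` and `random_fuzz` are made-up helper names. A clean `ValueError` counts as the robust outcome for bad input; any other exception is treated as a failure worth recording, together with the input that caused it.

```python
import random

def parse_ipv4(text):
    """Hypothetical target: parse 'a.b.c.d' into four ints (deliberately fragile)."""
    parts = text.split(".")
    # Deliberate bug: indexes four fields without checking how many there are.
    octets = [int(parts[i]) for i in range(4)]
    if any(o < 0 or o > 255 for o in octets):
        raise ValueError("octet out of range")
    return tuple(octets)

def run_one(data):
    """Feed one input; return the exception name if the target failed unexpectedly."""
    try:
        parse_ipv4(data)
    except ValueError:
        return None  # clean rejection: the robust outcome for bad input
    except Exception as exc:
        return type(exc).__name__  # anything else is an unexpected failure
    return None

def random_fuzz(iterations, seed=0):
    """Generate random strings and record every input that triggers a failure."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    alphabet = "0123456789abcdef.:"
    failures = []
    for _ in range(iterations):
        length = rng.randint(0, 15)
        data = "".join(rng.choice(alphabet) for _ in range(length))
        outcome = run_one(data)
        if outcome is not None:
            failures.append((data, outcome))
    return failures

if __name__ == "__main__":
    for data, outcome in random_fuzz(2000)[:5]:
        print(repr(data), "->", outcome)
```

In this toy, an IPv6-style string such as `"::1"` is rejected cleanly, while a short input such as `"1.2"` slips past the missing length check and fails with an `IndexError`.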
27 00:01:09,009 --> 00:01:11,402 It will hang sometimes, in other words, 28 00:01:11,402 --> 00:01:15,057 it will get into an infinite loop and just consume 100% CPU. 29 00:01:15,057 --> 00:01:18,272 In that case, what you'll do is you'll monitor the program 30 00:01:18,272 --> 00:01:21,438 and if the program crashes or goes into that state, 31 00:01:21,438 --> 00:01:25,137 you will see what message you sent to the program 32 00:01:25,137 --> 00:01:26,563 and so in other words, you know the input 33 00:01:26,563 --> 00:01:28,275 that was being sent to the program and you say, 34 00:01:28,275 --> 00:01:30,794 "Look, something happened, let's figure out what happened." 35 00:01:30,794 --> 00:01:33,518 In other cases, you could crash, 36 00:01:33,518 --> 00:01:35,916 but when you crash, it's important to know, 37 00:01:35,916 --> 00:01:38,445 "Am I crashing in the same location? 38 00:01:38,445 --> 00:01:40,844 "Is the input maybe different, 39 00:01:40,844 --> 00:01:43,729 "but does it still make me crash in the same place? 40 00:01:43,729 --> 00:01:46,253 "Are these crashes exploitable?" 41 00:01:46,253 --> 00:01:48,292 I mean, if it's a null pointer dereference, 42 00:01:48,292 --> 00:01:50,613 that may not be as interesting. 43 00:01:50,613 --> 00:01:52,319 When I say null pointer dereference, 44 00:01:52,319 --> 00:01:55,081 that means I'm actually reading from address zero, 45 00:01:55,081 --> 00:01:57,895 which is not mapped into a process typically. 46 00:01:57,895 --> 00:01:59,030 You want to know whether or not 47 00:01:59,030 --> 00:02:01,628 a particular vulnerability is exploitable, 48 00:02:01,628 --> 00:02:06,144 and if the program crashes and the instruction pointer is set to something 49 00:02:06,144 --> 00:02:09,441 you control, then that's a great day, right?
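Monitoring a target for hangs and crashes might look roughly like the sketch below. It rests on several assumptions: a POSIX system (where a negative `returncode` from `subprocess.run` means the child was killed by a signal), a hypothetical stand-in target (`TARGET`, a tiny child Python script that spins forever or kills itself with SIGSEGV on particular inputs), and made-up helper names `run_target` and `triage`. Grouping by signal is only a coarse substitute for asking whether two inputs crash in the same location; a real triage step would usually compare the crashing address or stack trace.

```python
import subprocess
import sys

# Hypothetical target: a child Python process that hangs or crashes on
# particular inputs, standing in for a real program under test.
TARGET = r"""
import os, signal, sys
data = sys.stdin.buffer.read()
if data.startswith(b"HANG"):
    while True:
        pass
if data.startswith(b"BOOM"):
    os.kill(os.getpid(), signal.SIGSEGV)
"""

def run_target(data, timeout=2.0):
    """Run the target once on `data` and classify the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", TARGET],
            input=data,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"  # no exit within the time limit: likely an infinite loop
    if proc.returncode < 0:
        # POSIX convention: -N means the child died from signal N.
        return "crash:signal %d" % -proc.returncode
    return "ok"

def triage(inputs):
    """Group the inputs that misbehaved by outcome, keeping each offending input."""
    buckets = {}
    for data in inputs:
        outcome = run_target(data)
        if outcome != "ok":
            buckets.setdefault(outcome, []).append(data)
    return buckets

if __name__ == "__main__":
    print(triage([b"hello", b"BOOM 1", b"BOOM 2", b"HANG"]))
```

Because every non-ok outcome is stored with the exact bytes that were sent, you always know which message to replay when something happens.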
50 00:02:09,441 --> 00:02:11,395 You have found a buffer overflow vulnerability 51 00:02:11,395 --> 00:02:13,307 on the stack and you have direct control 52 00:02:13,307 --> 00:02:15,184 of the instruction pointer. 53 00:02:15,184 --> 00:02:16,938 There are different types of fuzzers. 54 00:02:16,938 --> 00:02:20,563 We'll start with the dumb or mutation-based fuzzers, 55 00:02:20,563 --> 00:02:24,552 and what these do is take a known-good, 56 00:02:24,552 --> 00:02:27,641 well-formed message and somehow 57 00:02:27,641 --> 00:02:30,171 change it, or massage it, so that it's 58 00:02:30,171 --> 00:02:33,145 changed a little bit and maybe corrupted, 59 00:02:33,145 --> 00:02:36,728 and that collection of known-good messages 60 00:02:36,728 --> 00:02:39,045 is called a corpus. 61 00:02:39,045 --> 00:02:41,324 Let's say you're fuzzing with JPEG files, 62 00:02:41,324 --> 00:02:44,535 which are images, so you would want to have 63 00:02:44,535 --> 00:02:47,341 a collection of images that exercise 64 00:02:47,341 --> 00:02:49,665 different parts of the code. 65 00:02:49,665 --> 00:02:52,675 You have smart or generation-based fuzzers, 66 00:02:52,675 --> 00:02:55,038 and these have a data model that you have 67 00:02:55,038 --> 00:02:57,647 to program ahead of time into the fuzzer 68 00:02:57,647 --> 00:03:00,601 so that the fuzzer can create the message, 69 00:03:00,601 --> 00:03:03,944 sometimes from scratch, and then further along, 70 00:03:03,944 --> 00:03:08,171 you have evolutionary-based fuzzers and those fuzzers 71 00:03:08,171 --> 00:03:11,109 actually monitor the program itself and 72 00:03:11,109 --> 00:03:13,426 they glean heuristics from the program 73 00:03:13,426 --> 00:03:15,580 to figure out if you're exercising 74 00:03:15,580 --> 00:03:17,658 the code that you wanna exercise, 75 00:03:17,658 --> 00:03:21,145 so it's constantly checking whether or not 76 00:03:21,145 --> 00:03:25,312 you're actually running the code that you expect to run.
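To make the difference between a dumb mutation fuzzer and an evolutionary one concrete, here is a minimal sketch, not a real fuzzer. Everything in it is an assumption made for illustration: `target` is a hypothetical toy parser that reports which branches ran by adding labels to a set (real tools get this from compiler or binary instrumentation), and `mutate` and `evolve` are made-up names. The seed starts with the JPEG start-of-image bytes `FF D8`, but the "parser" checks nothing else a real JPEG decoder would.

```python
import random

def target(data, coverage):
    """Hypothetical instrumented target: records which branches were reached."""
    coverage.add("entry")
    if data[:2] != b"\xff\xd8":
        return
    coverage.add("header")
    if len(data) > 3 and data[2] == 0xFF and data[3] == 0xC4:
        coverage.add("table")
        if len(data) > 4 and data[4] >= 0x80:
            raise RuntimeError("simulated crash")

def mutate(seed, rng):
    """Dumb mutation: overwrite one byte of a non-empty seed with a random value."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)

def evolve(seeds, iterations, seed=0):
    """Mutate corpus entries; keep any mutant that reaches code not seen before."""
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set()
    crashes = []
    for item in corpus:
        target(item, seen)  # assumes the known-good seeds do not crash
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        cov = set()
        try:
            target(candidate, cov)
        except RuntimeError:
            crashes.append(candidate)
            continue
        if not cov <= seen:  # new branch reached: promote the mutant
            seen |= cov
            corpus.append(candidate)
    return corpus, crashes

if __name__ == "__main__":
    corpus, crashes = evolve([b"\xff\xd8\xff\xe0\x00\x10"], 100000)
    print(len(corpus), "corpus entries,", len(crashes), "crashing inputs")
```

The simulated crash needs two bytes to differ from the seed, so a single one-byte mutation of the original message can never reach it. The coverage feedback is what lets the fuzzer keep the intermediate input and build on it.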