1
00:00:06,627 --> 00:00:08,199
- So your strategy when you're fuzzing.

2
00:00:08,199 --> 00:00:10,030
If you don't know a whole lot

3
00:00:10,030 --> 00:00:11,766
about the target that you're fuzzing,

4
00:00:11,766 --> 00:00:14,571
don't wait until you have a complete suite

5
00:00:14,571 --> 00:00:16,131
of tests with the fuzzer.

6
00:00:16,131 --> 00:00:17,192
Always fuzz.

7
00:00:17,192 --> 00:00:19,020
That's my recommendation to you.

8
00:00:19,020 --> 00:00:21,187
Fuzzing takes a long time.

9
00:00:22,380 --> 00:00:25,370
And so you really don't want to be waiting

10
00:00:25,370 --> 00:00:27,864
until the very end when you
have a collection of tests.

11
00:00:27,864 --> 00:00:29,128
Just start fuzzing.

12
00:00:29,128 --> 00:00:31,404
And it could be just a
simple mutational fuzzer.

13
00:00:31,404 --> 00:00:33,791
Even in cases where you're using
an evolutionary fuzzer,

14
00:00:33,791 --> 00:00:35,721
you can set up something really quickly.

15
00:00:35,721 --> 00:00:38,484
So fuzzing can take a really long time

16
00:00:38,484 --> 00:00:40,057
if you want it to be complete.

17
00:00:40,057 --> 00:00:42,189
It can take weeks, and so if you can,

18
00:00:42,189 --> 00:00:44,483
try to fuzz in parallel.

19
00:00:44,483 --> 00:00:47,317
If you know a target,
if you know the version

20
00:00:47,317 --> 00:00:48,841
of software that's being run,

21
00:00:48,841 --> 00:00:51,496
run as many instances of that as you can.

22
00:00:51,496 --> 00:00:53,229
You can even run them in a cloud

23
00:00:53,229 --> 00:00:55,094
like in Amazon Web Services or Azure.

24
00:00:55,094 --> 00:00:57,954
Just have multiple instances of it,

25
00:00:57,954 --> 00:01:00,111
and you're fuzzing at the same time.

26
00:01:00,111 --> 00:01:02,041
Maybe starting from a different test case

27
00:01:02,041 --> 00:01:04,688
and that way you finish
things a lot faster

28
00:01:04,688 --> 00:01:06,476
and you can get better results.

29
00:01:06,476 --> 00:01:08,647
So when you're fuzzing,

30
00:01:08,647 --> 00:01:10,361
you're looking for code coverage.

31
00:01:10,361 --> 00:01:12,223
In other words, you want to exercise as much

32
00:01:12,223 --> 00:01:16,287
of the program's parser and
code that's based on the parser.

33
00:01:16,287 --> 00:01:18,092
Information that comes from the parser,

34
00:01:18,092 --> 00:01:20,602
you want to execute that
as much as possible.

35
00:01:20,602 --> 00:01:22,482
So you can get as much coverage.

36
00:01:22,482 --> 00:01:25,095
Some of the basic fuzzers
that you're dealing with,

37
00:01:25,095 --> 00:01:27,952
like mutational fuzzers,
they really only find results

38
00:01:27,952 --> 00:01:30,126
for bugs that are what's called shallow.

39
00:01:30,126 --> 00:01:32,327
In other words, you're not going very far

40
00:01:32,327 --> 00:01:33,968
into the parser itself.

41
00:01:33,968 --> 00:01:35,592
You're doing maybe just a little bit

42
00:01:35,592 --> 00:01:37,261
of things on the outside.

43
00:01:37,261 --> 00:01:39,123
And really to get any further than that,

44
00:01:39,123 --> 00:01:42,001
you either need to use
a generational fuzzer

45
00:01:42,001 --> 00:01:43,790
where you have information about

46
00:01:43,790 --> 00:01:45,555
how the protocol is supposed to work.

47
00:01:45,555 --> 00:01:47,818
Or the evolutionary fuzzers
which actually monitor

48
00:01:47,818 --> 00:01:49,979
the programs so that you can get

49
00:01:49,979 --> 00:01:51,751
deeper into the program.

50
00:01:51,751 --> 00:01:54,507
So that's what you need to have.

51
00:01:54,507 --> 00:01:57,569
A complex and sometimes even
a custom-written fuzzer.

52
00:01:57,569 --> 00:02:00,055
So in other words, you're
writing everything either

53
00:02:00,055 --> 00:02:03,089
from scratch or you're using
some kind of fuzzing framework,

54
00:02:03,089 --> 00:02:05,622
and I'll go through
some of those later on.

55
00:02:05,622 --> 00:02:08,420
And there are certain kinds
of vulnerabilities and bugs

56
00:02:08,420 --> 00:02:10,776
that you will not find with fuzzing.

57
00:02:10,776 --> 00:02:12,547
This is an example.

58
00:02:12,547 --> 00:02:14,522
There are some vulnerabilities

59
00:02:14,522 --> 00:02:16,065
that will not crash your program

60
00:02:16,065 --> 00:02:18,311
because they are basically
just a logic error,

61
00:02:18,311 --> 00:02:20,957
or they allow you some additional access

62
00:02:20,957 --> 00:02:23,013
to do things that you shouldn't be doing

63
00:02:23,013 --> 00:02:25,091
like injecting a command.

64
00:02:25,091 --> 00:02:27,344
That's a very common vulnerability

65
00:02:27,344 --> 00:02:29,324
especially on network equipment:

66
00:02:29,324 --> 00:02:31,288
a command injection vulnerability.

67
00:02:31,288 --> 00:02:33,265
And what that means is that normally

68
00:02:33,265 --> 00:02:35,680
you would issue a command on
the command line interface,

69
00:02:35,680 --> 00:02:39,558
and it would run some program
in the operating system.

70
00:02:39,558 --> 00:02:42,525
And then you may be able
to tack on something

71
00:02:42,525 --> 00:02:44,150
like if you do a semicolon,

72
00:02:44,150 --> 00:02:46,962
you could maybe tack on
a shell command to that

73
00:02:46,962 --> 00:02:50,303
and be able to do things
like executing Netcat

74
00:02:50,303 --> 00:02:53,058
and calling home to you and
then getting that shell.

75
00:02:53,058 --> 00:02:55,951
Another thing you can do is
maybe directory traversal.

76
00:02:55,951 --> 00:02:57,889
You can access part of the file system

77
00:02:57,889 --> 00:02:59,827
that you shouldn't be able to access.

78
00:02:59,827 --> 00:03:02,290
These are things that
fuzzing will not uncover

79
00:03:02,290 --> 00:03:05,432
because they do not cause
the program to crash,

80
00:03:05,432 --> 00:03:08,105
and there's really no way of knowing

81
00:03:08,105 --> 00:03:10,297
what it is that it was looking at.

82
00:03:10,297 --> 00:03:11,891
So in these cases, you
may be able to do some kind

83
00:03:11,891 --> 00:03:14,178
of instrumentation and
analysis after the fact

84
00:03:14,178 --> 00:03:16,347
to find out what the
device is responding with.

85
00:03:16,347 --> 00:03:19,136
But in other cases, you
just won't find them.

86
00:03:19,136 --> 00:03:21,168
So here's how you choose what to fuzz.

87
00:03:21,168 --> 00:03:23,392
When I approach a target, and I wanna know

88
00:03:23,392 --> 00:03:25,181
what I should work on first,

89
00:03:25,181 --> 00:03:28,749
first I wanna know what user
inputs are used by the program.

90
00:03:28,749 --> 00:03:30,734
What actual user inputs can I supply,

91
00:03:30,734 --> 00:03:33,601
and which of those are
from an untrusted source?

92
00:03:33,601 --> 00:03:35,599
In other words, are they
coming from a session

93
00:03:35,599 --> 00:03:37,675
that is already authenticated?

94
00:03:37,675 --> 00:03:39,244
If I'm already logged in with SSH,

95
00:03:39,244 --> 00:03:43,784
it may not be as important as
if I had actually been able

96
00:03:43,784 --> 00:03:47,087
to send this as a packet over
the internet, for instance.

97
00:03:47,087 --> 00:03:50,436
So if you have those kinds
of inputs that are untrusted,

98
00:03:50,436 --> 00:03:54,150
you would want to prioritize
those above all others.

99
00:03:54,150 --> 00:03:57,066
What features are mature versus immature?

100
00:03:57,066 --> 00:03:59,388
If it's a new feature, it may have

101
00:03:59,388 --> 00:04:01,269
been implemented just recently.

102
00:04:01,269 --> 00:04:04,519
And the code may not
have been tested as well.

103
00:04:04,519 --> 00:04:07,188
So if you know a part of the feature

104
00:04:07,188 --> 00:04:09,721
or you know a feature of
the code that is newer,

105
00:04:09,721 --> 00:04:11,613
you may fuzz that first because

106
00:04:11,613 --> 00:04:14,806
the code has not been as well vetted.

107
00:04:14,806 --> 00:04:17,902
And then there are other things you need

108
00:04:17,902 --> 00:04:19,914
to worry about when you're
choosing what to fuzz.

109
00:04:19,914 --> 00:04:22,231
Are there input constraints
which the parser checks first?

110
00:04:22,231 --> 00:04:25,589
In other words, does the input
have to be null terminated?

111
00:04:25,589 --> 00:04:27,367
In other words, are they doing something

112
00:04:27,367 --> 00:04:29,132
like we did earlier with a string copy?

113
00:04:29,132 --> 00:04:30,456
Are they using strcpy?

114
00:04:30,456 --> 00:04:32,232
So the input must be null terminated.

115
00:04:32,232 --> 00:04:33,857
Otherwise it just continues to copy.

116
00:04:33,857 --> 00:04:36,375
You have certain packets and payloads

117
00:04:36,375 --> 00:04:38,362
that need to have
a checksum in them,

118
00:04:38,362 --> 00:04:40,099
and that checksum has to be correct.

119
00:04:40,099 --> 00:04:43,030
Before it'll look any
further into the packet,

120
00:04:43,030 --> 00:04:45,828
you need to know that that checksum

121
00:04:45,828 --> 00:04:47,440
has to be fixed up first.

122
00:04:47,440 --> 00:04:49,308
Otherwise the parser will just see that

123
00:04:49,308 --> 00:04:51,273
that's no good and just
throw the packet away.

124
00:04:51,273 --> 00:04:53,137
And same thing if you're doing

125
00:04:53,137 --> 00:04:55,168
a secure connection using TLS.

126
00:04:55,168 --> 00:04:58,355
Maybe you have to have that set up first.

127
00:04:58,355 --> 00:05:00,544
You have to actually have
a connection set up first.

128
00:05:00,544 --> 00:05:02,639
The input may have length fields.

129
00:05:02,639 --> 00:05:04,575
It has count fields, and those have

130
00:05:04,575 --> 00:05:06,273
to match some kind of payload.

131
00:05:06,273 --> 00:05:08,177
Those are all fix ups that you would have

132
00:05:08,177 --> 00:05:10,209
to do when you're choosing what to fuzz.

133
00:05:10,209 --> 00:05:13,367
And so here is a drawing of program input.

134
00:05:13,367 --> 00:05:15,677
Some of these may be
fairly obvious to you.

135
00:05:15,677 --> 00:05:17,634
Like maybe a socket that's

136
00:05:17,634 --> 00:05:19,671
listening on the network.

137
00:05:19,671 --> 00:05:22,534
And then there's others that
may not be very obvious to you.

138
00:05:22,534 --> 00:05:24,547
You have environment variables, which are values

139
00:05:24,547 --> 00:05:27,786
that are stored with the program
when the program executes.

140
00:05:27,786 --> 00:05:30,675
Those can be like the PATH variable

141
00:05:30,675 --> 00:05:33,101
that allows you to look in the file system

142
00:05:33,101 --> 00:05:34,736
for certain programs.

143
00:05:34,736 --> 00:05:36,703
You may be able to modify that environment

144
00:05:36,703 --> 00:05:38,398
before you invoke a program.

145
00:05:38,398 --> 00:05:39,837
You have file systems and files

146
00:05:39,837 --> 00:05:41,794
that are accessible from the application.

147
00:05:41,794 --> 00:05:43,617
Maybe when you're running the application,

148
00:05:43,617 --> 00:05:45,352
it has files that it uses on the

149
00:05:45,352 --> 00:05:47,314
file system for doing sockets.

150
00:05:47,314 --> 00:05:48,960
That's called a Unix domain socket.

151
00:05:48,960 --> 00:05:50,388
You have memory.

152
00:05:50,388 --> 00:05:52,005
So memory could be mapped into an

153
00:05:52,005 --> 00:05:53,909
application from somewhere else.

154
00:05:53,909 --> 00:05:56,005
It could be shared memory based on a file.

155
00:05:56,005 --> 00:05:57,484
Also if you write into that file

156
00:05:57,484 --> 00:05:59,632
and that memory is executable,

157
00:05:59,632 --> 00:06:01,815
then you might be able to run things

158
00:06:01,815 --> 00:06:03,998
that are directly in that shared memory.

159
00:06:03,998 --> 00:06:05,344
You have POSIX message queues,

160
00:06:05,344 --> 00:06:07,272
and you have libraries that are mapped in.

161
00:06:07,272 --> 00:06:09,419
And those libraries
can either be libraries

162
00:06:09,419 --> 00:06:11,370
that are on disk like the libc,

163
00:06:11,370 --> 00:06:12,972
and there's also a library called

164
00:06:12,972 --> 00:06:15,004
the virtual dynamic shared
object (vDSO), which is one

165
00:06:15,004 --> 00:06:18,113
that the kernel itself
provides to each process.

166
00:06:18,113 --> 00:06:22,197
If you've ever heard of a recent
attack called Dirty COW,

167
00:06:22,197 --> 00:06:25,509
one of the ways to be able
to exploit a process was

168
00:06:25,509 --> 00:06:28,309
to overwrite the virtual
dynamic shared object.

169
00:06:28,309 --> 00:06:31,246
So that when something got
called in there from the kernel,

170
00:06:31,246 --> 00:06:33,232
you're actually calling
some attacker code,

171
00:06:33,232 --> 00:06:35,772
and you've actually
taken the process over.

172
00:06:35,772 --> 00:06:37,535
So this is the fuzzing process.

173
00:06:37,535 --> 00:06:39,439
We're gonna start from the
top and work our way down.

174
00:06:39,439 --> 00:06:41,667
So what you'll do first is you'll set up

175
00:06:41,667 --> 00:06:43,495
the test environment so that it's

176
00:06:43,495 --> 00:06:45,234
in the state that you want.

177
00:06:45,234 --> 00:06:46,827
You'll pick a random seed.

178
00:06:46,827 --> 00:06:48,603
Typically when you're
doing some kind of fuzzing,

179
00:06:48,603 --> 00:06:50,298
you'll have a random number generator

180
00:06:50,298 --> 00:06:52,373
that will determine what
inputs you actually fuzz

181
00:06:52,373 --> 00:06:54,536
and then the values that you might use.

182
00:06:54,536 --> 00:06:56,298
So you want to have a random seed

183
00:06:56,298 --> 00:06:58,042
that is available that you know,

184
00:06:58,042 --> 00:07:01,371
so that when you're fuzzing
and you cause a crash later on,

185
00:07:01,371 --> 00:07:04,097
you want to be able to
start from that random seed.

186
00:07:04,097 --> 00:07:06,152
So if you're not recording that anywhere,

187
00:07:06,152 --> 00:07:08,259
if you fuzz again, you
use a different seed,

188
00:07:08,259 --> 00:07:10,151
you may have different test cases.

189
00:07:10,151 --> 00:07:11,753
So then you also want to choose the input.

190
00:07:11,753 --> 00:07:13,588
You wanna choose what
value you wanna fuzz.

191
00:07:13,588 --> 00:07:16,049
And then the fuzzer
will create a test case.

192
00:07:16,049 --> 00:07:17,743
In other words, it creates an

193
00:07:17,743 --> 00:07:20,297
anomalous message based on that field.

194
00:07:20,297 --> 00:07:21,979
And then you run the target

195
00:07:21,979 --> 00:07:23,839
with that test case as the input.

196
00:07:23,839 --> 00:07:26,440
And then you instrument the
target and check for a crash.

197
00:07:26,440 --> 00:07:29,942
And if you have a crash,
then you would record

198
00:07:29,942 --> 00:07:32,733
the test case inputs and the crash data.

199
00:07:32,733 --> 00:07:35,566
And then you would move on and go back

200
00:07:35,566 --> 00:07:37,330
to choose the input again.

201
00:07:37,330 --> 00:07:39,848
And if you don't crash,
then I guess it was okay.

202
00:07:39,848 --> 00:07:41,626
There wasn't anything that happened.

203
00:07:41,626 --> 00:07:44,180
So you may kill the target
and restart it again

204
00:07:44,180 --> 00:07:47,832
just to get you back
into that original state.
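
The fuzzing process described above can be sketched roughly as follows. This is a minimal illustration, not the speaker's tooling: `toy_target`, `mutate`, and `fuzz` are hypothetical names, the "target" is a stand-in function that raises an exception instead of crashing, and the mutation strategy (overwrite one random byte of a sample) is an assumption chosen for brevity.

```python
import random


def toy_target(data: bytes) -> None:
    """Hypothetical stand-in for the real target; 'crashes' on one header pattern."""
    if len(data) >= 2 and data[0] == 0xFF and data[1] > 0x7F:
        raise RuntimeError("simulated crash in parser")


def mutate(rng: random.Random, sample: bytes) -> bytes:
    """Build an anomalous test case by overwriting one randomly chosen byte."""
    buf = bytearray(sample)
    pos = rng.randrange(len(buf))   # choose which part of the input to fuzz
    buf[pos] = rng.randrange(256)   # choose the value to put there
    return bytes(buf)


def fuzz(seed: int, sample: bytes, iterations: int) -> list:
    """Run a simple mutational fuzzing loop and return the recorded crashes."""
    rng = random.Random(seed)       # a known seed makes the run repeatable
    crashes = []
    for i in range(iterations):
        case = mutate(rng, sample)  # create the test case
        try:
            toy_target(case)        # run the target with the test case
        except Exception as exc:    # here an exception is our crash signal
            crashes.append(
                {"seed": seed, "iteration": i, "input": case, "error": repr(exc)}
            )
        # A real harness might kill and restart the target here
        # to get back to the original state.
    return crashes


if __name__ == "__main__":
    found = fuzz(seed=1234, sample=b"\xff\x00AAAA", iterations=5000)
    print(len(found), "crashing inputs recorded")
```

Because the seed is stored with every crash record, running `fuzz` again with the same seed and sample regenerates the same sequence of test cases, which is the reason for recording the seed in the first place.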