1 00:00:00,750 --> 00:00:02,840 In this and the next two videos, 2 00:00:02,840 --> 00:00:06,200 I'll continue my introduction to data science 3 00:00:06,200 --> 00:00:10,900 by presenting more on simulation and introducing 4 00:00:10,900 --> 00:00:13,130 static visualizations. 5 00:00:13,130 --> 00:00:15,870 Now, in this particular series of three videos, 6 00:00:15,870 --> 00:00:19,640 we're going to be demonstrating static bar charts 7 00:00:19,640 --> 00:00:22,880 that show the final results of a simulation 8 00:00:22,880 --> 00:00:25,940 in which we roll a six-sided die 9 00:00:25,940 --> 00:00:29,790 a specified number of times, and in the next video, 10 00:00:29,790 --> 00:00:33,000 I'll demonstrate the script that we're going to 11 00:00:33,000 --> 00:00:36,780 produce in action, and then in the subsequent video, 12 00:00:36,780 --> 00:00:39,530 we'll take a look at the actual source code 13 00:00:39,530 --> 00:00:42,210 for that simulation. 14 00:00:42,210 --> 00:00:46,120 Now, in general when you are doing data science, 15 00:00:46,120 --> 00:00:48,880 visualizations are extremely important 16 00:00:48,880 --> 00:00:53,020 and can be a great way for you to get to know your data. 17 00:00:53,020 --> 00:00:55,860 So, very quickly, with these powerful 18 00:00:55,860 --> 00:00:58,460 visualization libraries that are available to us 19 00:00:58,460 --> 00:01:02,010 with Python, we can hand off a chunk of data 20 00:01:02,010 --> 00:01:05,090 to a library and instantly get these 21 00:01:05,090 --> 00:01:08,020 really nice charts and graphs that can help us 22 00:01:08,020 --> 00:01:09,840 get a sense of the data. 23 00:01:09,840 --> 00:01:11,660 Now, let's say we're rolling 24 00:01:11,660 --> 00:01:15,020 a six-sided die six million times. 
25 00:01:15,020 --> 00:01:19,280 By summarizing how many times each die face occurs 26 00:01:19,280 --> 00:01:21,370 and then plotting that on a bar chart, 27 00:01:21,370 --> 00:01:24,400 we can get a sense of whether or not 28 00:01:24,400 --> 00:01:26,940 we get approximately the same number of 29 00:01:26,940 --> 00:01:30,620 ones, twos, threes, fours, fives, and sixes, 30 00:01:30,620 --> 00:01:32,840 which is the key thing that we'll be showing 31 00:01:32,840 --> 00:01:35,350 in this particular example. 32 00:01:35,350 --> 00:01:39,340 So, that's just one example, but as you get into big data, 33 00:01:39,340 --> 00:01:42,120 sometimes the data is so massive 34 00:01:42,120 --> 00:01:45,300 that the only ways to really get to know the data 35 00:01:45,300 --> 00:01:48,770 are to calculate things like descriptive statistics 36 00:01:48,770 --> 00:01:51,050 that we've started to present to you 37 00:01:51,050 --> 00:01:56,050 or to take a look at graphs that summarize the data for you. 38 00:01:56,360 --> 00:02:00,580 Now, we'll be looking, in this sequence of three videos, 39 00:02:00,580 --> 00:02:04,760 at the Seaborn and Matplotlib libraries. 40 00:02:04,760 --> 00:02:08,890 Matplotlib is a very popular visualization library. 41 00:02:08,890 --> 00:02:11,860 It happens to also have support built 42 00:02:11,860 --> 00:02:14,250 into Jupyter Notebooks, which makes it 43 00:02:14,250 --> 00:02:16,380 even more popular for that reason. 44 00:02:16,380 --> 00:02:18,790 And then, separately, Seaborn is built 45 00:02:18,790 --> 00:02:22,420 on top of Matplotlib, and then also, therefore, 46 00:02:22,420 --> 00:02:25,580 can work in context of Jupyter Notebooks. 47 00:02:25,580 --> 00:02:29,430 And Seaborn, basically, is going to simplify 48 00:02:29,430 --> 00:02:32,690 a number of Matplotlib operations. 49 00:02:32,690 --> 00:02:37,560 It also beautifies a few things about Matplotlib as well. 
50 00:02:37,560 --> 00:02:39,910 So in the next video, we'll take a look at 51 00:02:39,910 --> 00:02:42,300 the running simulation and talk about 52 00:02:42,300 --> 00:02:43,870 what we're trying to show, and then 53 00:02:43,870 --> 00:02:46,740 in the subsequent video, we will demonstrate 54 00:02:46,740 --> 00:02:49,253 the code for that simulation.
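
As a rough preview of the idea described in this video, here is a minimal sketch of rolling a six-sided die many times, summarizing how often each face occurs, and plotting the counts as a static bar chart. It is not the script demonstrated in the following videos. The function names `roll_frequencies` and `plot_frequencies`, the `seed` parameter, and the chart title and axis labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def roll_frequencies(num_rolls, seed=None):
    """Roll a six-sided die num_rolls times; return {face: count} for faces 1-6.

    Hypothetical helper for illustration; not taken from the course script.
    """
    rng = random.Random(seed)  # a seeded generator makes a run repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    # Report every face, including any that never came up.
    return {face: counts.get(face, 0) for face in range(1, 7)}


def plot_frequencies(frequencies):
    """Draw a static bar chart of the face counts.

    Requires Matplotlib and Seaborn; they are imported here so the
    counting code above runs even where they are not installed.
    """
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set_style("whitegrid")  # one of Seaborn's built-in styles
    faces = list(frequencies.keys())
    counts = list(frequencies.values())
    axes = sns.barplot(x=faces, y=counts)  # returns a Matplotlib Axes
    axes.set_title(f"Die frequencies for {sum(counts):,} rolls")
    axes.set(xlabel="Die face", ylabel="Frequency")
    plt.show()
```

Calling `plot_frequencies(roll_frequencies(6_000_000))` would chart six million rolls, though a pure-Python loop of that size can take several seconds. With that many rolls, the six bars should come out at roughly the same height, which is the point the example is meant to show.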